Compare PHP hash_file with Java output - java

I have the output of PHP's hash_file for a UTF-8 file, and I need to calculate and check the same hash on my Java client. Based on the hash_file manual, I'm extracting the contents of the file and creating the MD5 hex hash in Java, but I can't make them match. I tried the suggestions on [this question] without success.
Here's how I do it on Java:
public static String calculateStringHash(String text, String encoding)
throws NoSuchAlgorithmException, UnsupportedEncodingException{
MessageDigest md = MessageDigest.getInstance("MD5");
return getHex(md.digest(text.getBytes(encoding)));
}
My results match the ones from this page.
For example:
String jake: 1200cf8ad328a60559cf5e7c5f46ee6d
From my Java code: 1200CF8AD328A60559CF5E7C5F46EE6D
But when trying on files it doesn't work. Here's the code for the file function:
public static String calculateHash(File file) throws NoSuchAlgorithmException,
FileNotFoundException, IOException {
BufferedReader br = null;
StringBuilder sb = new StringBuilder();
try {
String sCurrentLine;
br = new BufferedReader(new FileReader(file));
while ((sCurrentLine = br.readLine()) != null) {
sb.append(sCurrentLine);
}
} catch (IOException ex) {
LOG.log(Level.SEVERE, null, ex);
} finally {
try {
if (br != null) {
br.close();
}
} catch (IOException ex) {
LOG.log(Level.SEVERE, null, ex);
}
}
return calculateStringHash(sb.toString(),"UTF-8");
}
I verified that on the PHP side hash_file is used and UTF-8 is the encoding. Any ideas?

Your reading method removes all the end of lines from the file. readLine() returns a line, without its line terminator. Print the contents of the StringBuilder, and you'll understand the problem.
Moreover, a hashing algorithm is a binary operation: it operates on bytes and returns bytes. Why are you transforming the bytes in the file into a String, only to transform the String back into an array of bytes in order to hash it? Just read the file as a byte array, using an InputStream, instead of reading it as a String, then hash this byte array. This will also avoid using the wrong file encoding (your code uses the platform default encoding, which might not be the encoding used to create the file).

I guess you are missing out on the new line characters from the file since you call br.readLine().
It is better to read the file into a byte array and pass that to md.digest(...).
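The advice in both answers (hash the raw bytes, never go through a String) could look roughly like the sketch below. The class name FileHashSketch and the helper names are made up for illustration, and the buffer size is an arbitrary choice. The hex output is lowercase because that is what PHP's hash_file returns by default.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not part of the original question's code.
final class FileHashSketch {

    // Feeds the file's raw bytes to the digest, so line terminators and the
    // file's own encoding are preserved exactly as they are on disk.
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        return toHex(md.digest());
    }

    // Converts bytes to lowercase hex, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }
}
```

Because the file is streamed in chunks, this also works for files that are too large to hold in memory at once.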

Related

Java XML Parsing - incorrect string version of the data with VTD-XML

I am parsing an XML document in UTF-8 encoding with Java using VTD-XML.
A small excerpt looks like:
<literal>𠀋</literal>
<literal>𠂉</literal>
<literal>𠂢</literal>
I want to iterate through each literal and print it out to the console. However, what I get is:
¢
I am correctly navigating to each element. The way that I get the text value is by calling:
private static String toNormalizedString(String name, int val, final VTDNav vn) throws NavException {
String strValue = null;
if (val != -1) {
strValue = vn.toNormalizedString(val);
}
return strValue;
}
I've also tried vn.getXPathStringVal();, however it yields the same results.
I know that each of the literals above isn't just a string of length one. Rather, they seem to be Unicode "characters" composed of two chars. I am able to correctly parse and output the kanji characters if their length is just one.
My question is - how can I correctly parse and output these characters using VTD-XML? Is there a way to get the underlying bytes of the text between the literal tags so that I can parse the bytes myself?
EDIT
Code to process each line of the XML - converting it to a byte array and then back to a String.
try (BufferedReader br = new BufferedReader(new FileReader("res/sample.xml"))) {
String line;
while ((line = br.readLine()) != null) {
byte[] myBytes = null;
try {
myBytes = line.getBytes("UTF-8");
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
System.exit(-1);
}
System.out.println(new String(myBytes));
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
You are probably trying to get a string that contains characters at or above 0x10000. That bug is known and is in the process of being addressed... I will notify you once the fix is out.
This question may be identical to this one...
Map supplementary Unicode characters to BMP (if possible)
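For background on why these literals have a length of two: Java strings are sequences of UTF-16 code units, and any code point at or above 0x10000 is stored as a surrogate pair. The sketch below is plain Java and does not use VTD-XML; the class name is invented for illustration, and U+2000B is used only as an example of a supplementary code point.

```java
// Hypothetical demo class; it only illustrates code points versus UTF-16 chars.
final class CodePointSketch {

    // Counts Unicode code points rather than UTF-16 code units.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Lists every code point in the string in U+XXXX notation.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Build a string from one supplementary code point.
        String s = new String(Character.toChars(0x2000B));
        System.out.println(s.length());     // number of UTF-16 code units
        System.out.println(codePoints(s));  // number of code points
        System.out.println(describe(s));
    }
}
```

Any code that handles such text one char at a time will split the pair, which is consistent with the truncated output described in the question.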

Remove Base64 prefix from InputStream

I have a Base64 encoded Image String residing in a File Server. The encoded String has a prefix (ex: "data:image/png;base64,") for support in popular modern browsers (it's obtained via JavaScript's Canvas.toDataURL() method). The client sends a request for the image to my server which verifies them and returns a stream of the Base64 encoded String.
If the client is a web client, the image can be displayed as is within an <img> tag by setting the src to the Base64 encoded String. However, if the client is an Android client, the String needs to be decoded into a Bitmap without the prefix. Though, this can be done fairly easily.
The Problem:
In order to simplify my code and not reinvent the wheel, I'm using an Image Library for the Android client to handle loading, displaying, and caching the images (Facebook's Fresco Library to be exact). However, no library seems to support Base64 decoding (I want my cake and to eat it too). A solution I came up with is to decode the Base64 String on the server as it is being streamed to the client.
The Attempt:
S3Object obj = s3Client.getObject(new GetObjectRequest(bucketName, keyName));
Base64.Decoder decoder = Base64.getDecoder();
//decodes the stream as it is being read
InputStream stream = decoder.wrap(obj.getObjectContent());
try{
return new StreamingOutput(){
@Override
public void write(OutputStream output) throws IOException, WebApplicationException{
int nextByte = 0;
while((nextByte = stream.read()) != -1){
output.write(nextByte);
}
output.flush();
output.close();
stream.close();
}
};
}catch(Exception e){
e.printStackTrace();
}
Unfortunately, the Fresco library still has a problem displaying the image (with no stack traces!). As there doesn't seem to be an issue on my server when decoding the stream (no stack traces either), it leads me to believe that it must be an issue with the prefix. Which leaves me with a dilemma.
The Question: How do I remove the Base64 prefix from a Stream being sent to the client without storing and editing the entire Stream on the server? Is this possible?
Fresco does support decoding data URIs, just as the web client does.
The demo app has an example of this.
How do I remove the Base64 prefix from a Stream being sent to the client without storing and editing the entire Stream on the server?
Removing the prefix while sending the stream to the client turns out to be a pretty complex task. If you don't mind storing the whole String on the server you could simply do:
BufferedReader br = null;
StringBuilder sb = new StringBuilder();
String line;
try {
br = new BufferedReader(new InputStreamReader(stream));
while ((line = br.readLine()) != null) {
sb.append(line);
}
String result = sb.toString();
//the comma is the character which separates the prefix and the Base64 String
int i = result.indexOf(",");
result = result.substring(i + 1);
//Now, that we have just the Base64 encoded String, we can decode it
Base64.Decoder decoder = Base64.getDecoder();
byte[] decoded = decoder.decode(result);
//Now, just write each byte from the byte array to the output stream
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
But being more efficient and not storing the entire Stream on the server is a much more complicated task. We could use the Base64.Decoder.wrap() method, but the problem is that the wrapped stream throws an IOException when it reaches a value that cannot be decoded (wouldn't it be nice if they provided a method that just left the bytes as is if they can't be decoded?). Unfortunately, the Base64 prefix can't be decoded because it's not Base64 encoded, so it would throw an IOException.
To get around this problem, we would have to use an InputStreamReader to read the InputStream with the specified appropriate Charset. Then we would have to cast the ints received from the InputStream's read() method call to chars. When we reach the appropriate amount of chars, we would have to compare it with the Base64 prefix's intro ("data"). If it's a match, we know the Stream contains the prefix, so continue reading until we reach the prefix end character (the comma: ","). Finally, we can begin streaming out the bytes after the prefix. Example:
S3Object obj = s3Client.getObject(new GetObjectRequest(bucketName, keyName));
Base64.Decoder decoder = Base64.getDecoder();
InputStream stream = obj.getObjectContent();
InputStreamReader reader = new InputStreamReader(stream);
try{
return new StreamingOutput(){
@Override
public void write(OutputStream output) throws IOException, WebApplicationException{
//for checking if string has base64 prefix
char[] pre = new char[4]; //"data" has at most four bytes on a UTF-8 encoding
boolean containsPre = false;
int count = 0;
int nextByte = 0;
while((nextByte = stream.read()) != -1){
if(count < pre.length){
pre[count] = (char) nextByte;
count++;
}else if(count == pre.length){
//determine whether has prefix or not and act accordingly
count++;
//new String(pre) yields "data"; Arrays.toString(pre) would yield "[d, a, t, a]" and never match
containsPre = new String(pre).toLowerCase().equals("data");
if(!containsPre){
//doesn't have Base64 prefix so write all the bytes until this point
for(int i = 0; i < pre.length; i++){
output.write((int) pre[i]);
}
output.write(nextByte);
}
}else if(containsPre && count < 25){
//the comma character (,) is considered the end of the Base64 prefix
//so look for the comma, but be realistic, if we don't find it at about 25 characters
//we can assume the String is not encoded correctly
containsPre = (Character.toString((char) nextByte).equals(",")) ? false : true;
count++;
}else{
output.write(nextByte);
}
}
output.flush();
output.close();
stream.close();
}
};
}catch(Exception e){
e.printStackTrace();
return null;
}
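A more compact variant of the same idea is sketched below, under the assumption that the stream always starts with a data URI prefix ending in a comma. It consumes bytes up to the first comma and only then hands the rest of the stream to Base64.Decoder.wrap(), so the decoder never sees the prefix. The class name, method name, and the 64-byte limit are invented for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Base64;

// Hypothetical helper; assumes the input always carries a "data:...;base64," prefix.
final class DataUriStreamSketch {

    // Arbitrary upper bound on the prefix length (an assumption, not a standard).
    private static final int MAX_PREFIX = 64;

    // Skips the prefix, then streams the decoded payload to out.
    static void decodeTo(InputStream in, OutputStream out) throws IOException {
        int b;
        int seen = 0;
        // Consume everything up to and including the first comma.
        while ((b = in.read()) != -1 && b != ',') {
            if (++seen > MAX_PREFIX) {
                throw new IOException("No prefix terminator within " + MAX_PREFIX + " bytes");
            }
        }
        if (b == -1) {
            throw new IOException("Stream ended before the prefix terminator");
        }
        // The remaining bytes are pure Base64, so the wrapping decoder can handle them.
        try (InputStream decoded = Base64.getDecoder().wrap(in)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = decoded.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
    }
}
```

Inside the StreamingOutput, this could be called as DataUriStreamSketch.decodeTo(obj.getObjectContent(), output). A stream without the prefix would be rejected here rather than passed through, which is a deliberate simplification.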
This seems like a hefty task to do on the server, so I think decoding on the client side is a better choice. Unfortunately, most Android client-side libraries don't support Base64 decoding (especially with the prefix). However, as @tyronen pointed out, Fresco does support it if the String is already obtained, though this removes one of the key reasons to use an image loading library.
Android Client Side Decoding
Decoding in the client-side application is pretty easy. First obtain the String from the InputStream:
BufferedReader br = null;
StringBuilder sb = new StringBuilder();
String line;
try {
br = new BufferedReader(new InputStreamReader(stream));
while ((line = br.readLine()) != null) {
sb.append(line);
}
return sb.toString();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Then decode the String using Android's Base64 class:
int i = result.indexOf(",");
result = result.substring(i + 1);
byte[] decodedString = Base64.decode(result, Base64.DEFAULT);
Bitmap bitMap = BitmapFactory.decodeByteArray(decodedString, 0, decodedString.length);
The Fresco library seems hard to update due to them using a lot of delegation. So, I moved on to using the Picasso image loading library and created my own fork of it with the Base64 decoding ability.

BufferedReader does not read all the lines in text file

I have a function.
public ArrayList<String> readRules(String src) {
try (BufferedReader br = new BufferedReader(new FileReader(src))) {
String sCurrentLine;
while ((sCurrentLine = br.readLine()) != null) {
System.out.println(sCurrentLine);
lines.add(sCurrentLine);
}
} catch (IOException e) {
e.printStackTrace();
}
return lines;
}
My file has 26,400 lines, but this function reads only the 3,400 lines at the end of the file.
How do I read all the lines in the file?
Thanks!
Why don't you use the utility method Files.readAllLines() (available since Java 7)?
This method ensures that the file is closed when all bytes have been read or an IOException (or another runtime exception) is thrown.
Bytes from the file are decoded into characters using the specified charset.
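public ArrayList<String> readRules(String src) throws IOException {
// readAllLines takes a Path, not a String, and returns a List<String>
return new ArrayList<>(Files.readAllLines(Paths.get(src), Charset.defaultCharset()));
}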
while ((sCurrentLine = br.readLine()) != null)
It is likely that you have an empty line or a line that is treated as null.
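Try a Scanner instead (hasNextLine() and nextLine() are Scanner methods; BufferedReader does not have them):
try (Scanner sc = new Scanner(new File(src))) {
while (sc.hasNextLine())
{
String current = sc.nextLine();
}
}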
Edit: Or, in your text file, when a line is too long, the editor automatically wraps a single line into many lines. When you don't use return key, it is treated as a single line by BufferedReader.
Notepad++ is a good tool to prevent confusing a single line with multiple lines. It numbers the lines with respect to usage of return key. Maybe you could copy/paste your input file to Notepad++ and check if the line numbers match.
You can also read the file into a List of strings using readAllLines() and then loop through it.
List<String> myfilevar = Files.readAllLines(Paths.get("/PATH/TO/MY/FILE.TXT"));
for(String x : myfilevar)
{
System.out.println(x);
}

Binary file not being read properly in Java

I am trying to read a binary file in Java using the bufferedReader. I wrote that binary-file using "UTF-8" encoding. The code for writing into a binary file:
byte[] inMsgBin=null;
try {
inMsgBin = String.valueOf(cypherText).getBytes("UTF-8");
//System.out.println("CIPHER TEXT:FULL:BINARY WRITE: "+inMsgBin);
} catch (UnsupportedEncodingException ex) {
Logger.getLogger(EncDecApp.class.getName()).log(Level.SEVERE, null, ex);
}
try (FileOutputStream out = new FileOutputStream(fileName+ String.valueOf(new SimpleDateFormat("yyyyMMddhhmm").format(new Date()))+ ".encmsg")) {
out.write(inMsgBin);
out.close();
} catch (IOException ex) {
Logger.getLogger(EncDecApp.class.getName()).log(Level.SEVERE, null, ex);
}
System.out.println("cypherText charCount="+cypherText.length());
Here 'cypherText' is a String with some content. Total no of characters written in the file is given as 19. Also after writing, when I open the binary file in Notepad++, it shows some characters. Selecting all the content of the file counts to 19 characters in total.
Now when I read the same file using BufferedReader, using the following lines of code:
try
{
DecMessage obj2= new DecMessage();
StringBuilder cipherMsg=new StringBuilder();
try (BufferedReader in = new BufferedReader(new FileReader(filePath))) {
String tempLine="";
fileSelect=true;
while ((tempLine=in.readLine()) != null) {
cipherMsg.append(tempLine);
}
}
System.out.println("FROM FILE: charCount= "+cipherMsg.length());
Here the total no of characters read (stored in 'charCount') is 17 instead of 19.
How can I read all the characters of the file correctly?
Specify the same charset while reading file.
try (final BufferedReader br = Files.newBufferedReader(new File(filePath).toPath(),
StandardCharsets.UTF_8))
UPDATE
Now I understand your problem. Thanks for the file.
Again: your file is still readable by any text editor such as Notepad++. Because the content includes extended and control characters, some of it shows up as unreadable characters, but the file is still text.
Now back to your problem. There are two problems with your code.
First, you should specify the correct charset when reading the file. Readers are character readers: bytes are converted into characters while reading. If you specify a charset, it is used; otherwise the platform default charset is used. So you should create the BufferedReader as follows:
try (final BufferedReader br = Files.newBufferedReader(new File(filePath).toPath(),
StandardCharsets.UTF_8))
Second, your content includes control characters. When reading the file line by line, readLine() treats \n, \r, and \r\n as line terminators and strips them from the returned line. That's why you are getting 17 instead of 19 (two of your characters are CR). To avoid this, read the file character by character:
int ch;
while ((ch = br.read()) > -1) {
buffer.append((char)ch);
}
Overall the below method would return proper text.
static String readCyberText() {
StringBuilder buffer = new StringBuilder();
try (final BufferedReader br = Files.newBufferedReader(new File("C:\\projects\\test2201404221017.txt").toPath(),
StandardCharsets.UTF_8)){
int ch;
while ((ch = br.read()) > -1) {
buffer.append((char)ch);
}
return buffer.toString();
}
catch (IOException e) {
e.printStackTrace();
return null;
}
}
And you can test by
String s = readCyberText();
System.out.println(s.length());
System.out.println(s);
and output as
19
ia#
m©Ù6ë<«9K()il
Note: the length of the String is 19, but only 17 characters are visible when it is printed, because the console treats the CR characters as line breaks and continues on a new line. The String still contains all 19 characters.
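The difference between the two reading styles can be reproduced without a file. The sketch below uses a StringReader and an invented 9-character sample containing two CR characters; the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical demo class comparing readLine() with read().
final class ReadLineSketch {

    // Appends each line returned by readLine(); line terminators are dropped.
    static int lengthViaReadLine(String text) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.length();
    }

    // Appends every character returned by read(); nothing is dropped.
    static int lengthViaRead(String text) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            int ch;
            while ((ch = br.read()) != -1) {
                sb.append((char) ch);
            }
        }
        return sb.length();
    }

    public static void main(String[] args) throws IOException {
        String sample = "ia#\rm\rxyz"; // 9 chars, two of them CR
        System.out.println(lengthViaReadLine(sample)); // 7
        System.out.println(lengthViaRead(sample));     // 9
    }
}
```

For data that is really binary, such as cipher output, reading and writing raw bytes avoids the charset and line-terminator questions altogether.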

printing results of base64_decode gives unexpected output

For a class, I was given a file of base64 encoded salted sha-256 hashed passwords.
the file is in the form:
username:base64 encoded sha256 password:salt
My original thought was to base64 decode the hash so I would be left with:
username:salted hashed password:salt
then run it through JTR or hashcat to crack the passwords.
My problem is in the base64 decoding process.
my code looks like:
public static byte[] decode(String string) {
try {
return new BASE64Decoder().decodeBuffer(string);
} catch (Exception e) {
throw new RuntimeException(e);
}
}
public static void splitLine(String strLine)
throws Exception {
StringTokenizer st = new StringTokenizer(strLine, ":");
if (st.hasMoreTokens())
userName = st.nextToken();
if (st.hasMoreTokens())
password = st.nextToken();
if (st.hasMoreTokens())
salt = st.nextToken();
}
public static void main(String[] argv) {
String line = null;
String pwdFile = null;
int count = 0;
try {
pwdFile = argv[0];
BufferedReader br = new BufferedReader(new FileReader(pwdFile));
line = br.readLine();
while (line != null) {
splitLine(line);
/* alternative #1: generates a lot of non-printable characters for the hash */
System.out.println(userName+":"+new String(decode(password))+":"+salt);
/* alternative #2: gives a list of the decimal values for each byte of the hash */
System.out.println(userName+":"+Arrays.toString(decode(password))+":"+salt);
count++;
line = br.readLine();
}
br.close();
System.err.println("total lines read: " + count);
} catch (Exception e) {
e.printStackTrace();
System.exit(-1);
}
}
With alternative #1, I end up with 50,000 more lines in my output file than were in the input file, so I assume some of the decoded strings contain newline characters, which I need to fix as well.
How do I get back to and print the original hash value for the password in a format that either hashcat or JTR will recognize as salted sha256?
Problem: You are trying to work with Base64 encoded password hashes, and when they are decoded, they contain unprintable characters.
Background: When a value is hashed, the bytes are all changed according to a hashing algorithm and the resulting bytes are often beyond the range of printable characters. Base64 encoding is simply an alphabet that maps ALL bytes into printable characters.
Solution: work with the bytes that Base64 decode returns instead of trying to make them into a String. Convert those raw bytes to Hex representations (Base16) before you print them or give them to Hashcat or JTR. In short, you need to do something like the following (it happens to use Guava library):
String hex = BaseEncoding.base16().encode(bytesFromEncodedString);
This is condensed from a longer answer I posted
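If Guava is not available, the same conversion can be sketched with the JDK alone (Java 8 or later). The class and method names below are invented for illustration. Note that this version emits lowercase hex, whereas Guava's base16() uses uppercase digits.

```java
import java.util.Base64;

// Hypothetical helper: Base64 text in, hex text out, with no String round trip of the raw bytes.
final class HashHexSketch {

    // Decodes the Base64 field and renders the raw hash bytes as lowercase hex.
    static String base64ToHex(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        StringBuilder hex = new StringBuilder(raw.length * 2);
        for (byte b : raw) {
            hex.append(Character.forDigit((b >> 4) & 0xF, 16));
            hex.append(Character.forDigit(b & 0xF, 16));
        }
        return hex.toString();
    }
}
```

In the question's main loop this would replace new String(decode(password)), for example userName + ":" + HashHexSketch.base64ToHex(password) + ":" + salt. The exact line format that hashcat or JTR expects for a salted SHA-256 hash should be checked against their documentation.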
