What does & 0xff do, and how does this MD5 structure work? - java

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class JavaMD5 {
    public static void main(String[] args) {
        String passwordToHash = "MyPassword123";
        String generatedPassword = null;
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(passwordToHash.getBytes());
            byte[] bytes = md.digest();
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < bytes.length; i++) {
                sb.append(Integer.toString((bytes[i] & 0xff) + 0x100, 16).substring(1));
            }
            generatedPassword = sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        System.out.println(generatedPassword);
    }
}
This line is the problem:
sb.append(Integer.toString((bytes[i] & 0xff) + 0x100, 16).substring(1));
What does each part of this expression do?
Thanks, and I'm sorry for asking because I'm new to Java.

Presumably most of the code is clear and the only mystery for you here is this expression:
(bytes[i] & 0xff) + 0x100
The first part:
bytes[i] & 0xff
widens the byte at position i to an int value with zeros in bit positions 8-31. In Java, the byte data type is a signed integer type, so plain widening would sign-extend the value: without the & 0xff, byte values greater than 0x7f would end up as negative int values. The rest is then fairly straightforward: it adds 0x100, which simply turns on the bit at index 8 (since that bit is guaranteed to be 0 in (bytes[i] & 0xff)). The result is then converted to a hex String value by the call to Integer.toString(..., 16).
The reason for first adding 0x100 and then stripping off the 1 (done by the substring(1) call, which takes the substring starting at position 1 through the end) is to guarantee two hex digits in the end result. Otherwise, byte values below 0x10 would end up as one-character strings when converted to hex.
It's debatable whether all that performs any better (it certainly isn't clearer) than:
sb.append(String.format("%02x", bytes[i]));
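As a self-contained sketch of that equivalence (the class and method names below are just illustrative), both conversions produce identical output:

```java
public class HexDemo {
    // The "+ 0x100 / substring(1)" trick discussed above.
    static String trickHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(Integer.toString((b & 0xff) + 0x100, 16).substring(1));
        }
        return sb.toString();
    }

    // The more readable String.format variant.
    static String formatHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {0x00, 0x0f, 0x7f, (byte) 0x80, (byte) 0xff};
        System.out.println(trickHex(sample));   // 000f7f80ff
        System.out.println(formatHex(sample));  // 000f7f80ff
    }
}
```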

It's a really messy way of translating to a hexadecimal string.
& 0xFF performs a bitwise AND, so the resulting int is between 0 and 255 (a Java byte is signed, so without the mask negative values would sign-extend)
+ 0x100 adds 256 to the result, ensuring the hexadecimal representation is always exactly 3 digits
Integer.toString(src, 16) converts the integer to a string with radix 16 (hexadecimal)
Finally, .substring(1) strips the first character (the 1 from step 2)
So, this is a very elaborate and obfuscated way to convert a byte to an always-2-character hexadecimal string.

Related

Replicating this Java hash in Python

I'm trying to replicate this hashing code in Python, but the two languages handle bytes differently and generate very different outputs.
Can someone guide me here?
Java Code (Original Code)
public static String hash(String filePath, String salt) {
    String finalHash = null;
    Path path = Paths.get(filePath);
    try {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] data = Files.readAllBytes(path);
        byte[] dataDigest = md.digest(data);
        byte[] hashDigest = md.digest(salt.getBytes("ISO-8859-1"));
        byte[] xorBytes = new byte[dataDigest.length];
        for (int i = 0; i < dataDigest.length && i < hashDigest.length; i++) {
            xorBytes[i] = (byte) (dataDigest[i] << 1 ^ hashDigest[i] >> 1);
        }
        finalHash = (new HexBinaryAdapter()).marshal(xorBytes);
    } catch (IOException | NoSuchAlgorithmException e) {
        e.printStackTrace();
    }
    return finalHash;
}
Python Code (Translated by me)
def generate_hash(file_path: str, salt: bytes) -> str:
    with open(file_path, 'rb') as f:
        data = f.read()
    hashed_file = sha1(data).digest()
    hashed_salt = sha1(salt).digest()
    xor_bytes = []
    for i in range(len(hashed_file)):
        xor_bytes.append(hashed_file[i] << 1 ^ hashed_salt[i] >> 1)
    return ''.join(map(chr, xor_bytes))  # This is probably not equivalent of HexBinaryAdapter
There are the following issues:
The shift operations are wrongly implemented in the Python code:
In the Python code the generated hash is stored in a bytes-like object as a sequence of unsigned integer values between 0 and 255 [1], e.g. 0xc8 = 11001000 = 200. In Java, integers are stored as signed values, whereby the two's complement is used to represent negative numbers [2][3]. The same value 0xc8 would be interpreted as -56 if stored in a byte variable.
The >>-operator produces a different result on the binary level for signed and unsigned values, because it is an arithmetic shift operator which preserves the sign [4][5][6]. Example:
signed -56 >> 1 = 1110 0100 = -28
unsigned 200 >> 1 = 0110 0100 = 100
The <<-operator, on the other hand, does not cause the above problem, but can lead to values that cannot be represented by a byte. Example:
signed -56 << 1 = 1 1001 0000 = -112
unsigned 200 << 1 = 1 1001 0000 = 400
For these reasons, in the Python code the following line
xor_bytes.append((hashed_file[i] << 1 ^ hashed_salt[i] >> 1))
has to be replaced by
xor_bytes.append((hashed_file[i] << 1 ^ tc(hashed_salt[i]) >> 1) & 0xFF)
where
def tc(val):
    if val > 127:
        val = val - 256
    return val
determines the negative value of the two's complement representation (or more sophisticated with bitwise operators see [7]).
The use of the bitwise and (&) with 0xFF ensures that only the relevant byte is taken into account in the Python code, analogous to the Java code [5].
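The signed/unsigned shift behaviour described above can be reproduced in Java itself; a minimal sketch (the values mirror the 200 / -56 example):

```java
public class ShiftDemo {
    public static void main(String[] args) {
        byte b = (byte) 200;                  // stored as -56 (two's complement)
        System.out.println(b >> 1);           // -28: arithmetic shift preserves the sign
        System.out.println((b & 0xff) >> 1);  // 100: shift of the unsigned value
        System.out.println(b << 1);           // -112 as an int
        System.out.println((b << 1) & 0xff);  // 144: truncated back to a single byte
    }
}
```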
There are several ways to convert the list/bytes-like object into a hexadecimal string (as in the Java code), e.g. with [8][9]
bytes(xor_bytes).hex()
or with [8][10] (as binary string)
binascii.b2a_hex(bytes(xor_bytes))
In the Python code the encoding of the salt must be taken into account. Since the salt is already passed as a binary string (in the Java code it is passed as a string), the encoding must be performed before the function is called:
saltStr = 'MySalt'
salt = saltStr.encode('ISO-8859-1')
For a functional consistency with the Java code, the salt would have to be passed as a string and the encoding would have to be performed within the function.

Byte to Integer and then to String conversion in Java

I got some code for MD5 hash generation in Java. It generates the hash into the byte array bytes and then converts it to an integer and then to a string as follows:
byte[] bytes = md.digest(textToHash.getBytes());
StringBuilder sb = new StringBuilder();
for (int i = 0; i < bytes.length; i++)
    sb.append(Integer.toString((bytes[i] & 0xff) + 0x100, 16).substring(1));
I understood that bytes[i] & 0xff converts byte to integer of 32 bit length copying the byte to the least significant byte of the integer:
What does value & 0xff do in Java?
However, I couldn't understand what the + 0x100 and the 16 do in the parentheses on line 4 of the above code. Your help is appreciated.
Breaking down Integer.toString((bytes[i] & 0xff) + 0x100, 16).substring(1):
Adding 0x100 (which is 256 decimal) sets bit 8 to 1, which guarantees the result is a 9-bit value between 0x100 and 0x1ff. (You could equivalently do | 0x100, since bit 8 is guaranteed to be 0 after the & 0xff.)
Because the value is now between 0x100 and 0x1ff, the result of toString(..., 16) is always exactly 3 hex characters long, starting with "1".
substring(1) strips that leading "1", leaving the lower 8 bits as exactly 2 hex characters.
So what?
This code puts leading zeroes on small values, so every byte becomes exactly 2 hex characters. There's no way to make Integer.toString() alone do this.
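A quick illustration of the padding effect (the byte value here is arbitrary):

```java
public class PaddingDemo {
    public static void main(String[] args) {
        byte small = 0x0a;
        // Without the trick: a single hex character.
        System.out.println(Integer.toString(small & 0xff, 16)); // a
        // With the trick: 0x10a -> "10a" -> "0a".
        System.out.println(Integer.toString((small & 0xff) + 0x100, 16).substring(1)); // 0a
    }
}
```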

Java integer to hex and to int

I have a problem: the method does not work as expected. In most cases it works, but there is a case where it does not.
I have a byte array containing some values, in hex e.g.: 0x04 0x42 (little-endian). If I use the method convertTwoBytesToInt, I get a really small number. It should be > 16000, not smaller than 2000.
I have two methods:
private static int convertTwoBytesToInt(byte[] a) {
    String f1 = convertByteToHex(a[0]);
    String f2 = convertByteToHex(a[1]);
    return Integer.parseInt(f2 + f1, RADIX16);
}

private static byte[] convertIntToTwoByte(int value) {
    byte[] bytes = ByteBuffer.allocate(4).putInt(value).array();
    System.out.println("Check: " + Arrays.toString(bytes));
    byte[] result = new byte[2];
    // big to little endian:
    result[0] = bytes[3];
    result[1] = bytes[2];
    return result;
}
I call them as follows:
byte[] h = convertIntToTwoByte(16000);
System.out.println("AtS: "+Arrays.toString(h));
System.out.println("tBtInt: "+convertTwoBytesToInt(h));
If I use the value 16000, there is no problem, but if I use 16900, the result of convertTwoBytesToInt is 1060.
Any Idea?
Based on the example you provided, my guess is that convertByteToHex(byte) is converting to a single-digit hex string when the byte value is less than 0x10. 16900 is 0x4204 and 1060 is 0x424.
You need to ensure that the conversion is zero-padded to two digits.
A much simpler approach is to use bit manipulation to construct the int value from the bytes:
private static int convertTwoBytesToInt(byte[] a) {
    return ((a[1] & 0xff) << 8) | (a[0] & 0xff);
}
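To check that this fixes the reported case, here is a small self-contained round-trip sketch (the class name is mine; the method bodies follow the answer and the question):

```java
import java.nio.ByteBuffer;

public class LittleEndianDemo {
    // Bit-manipulation version from the answer above.
    static int convertTwoBytesToInt(byte[] a) {
        return ((a[1] & 0xff) << 8) | (a[0] & 0xff);
    }

    // Conversion from the question, without the debug print.
    static byte[] convertIntToTwoByte(int value) {
        byte[] bytes = ByteBuffer.allocate(4).putInt(value).array();
        return new byte[] { bytes[3], bytes[2] }; // big to little endian
    }

    public static void main(String[] args) {
        System.out.println(convertTwoBytesToInt(convertIntToTwoByte(16000))); // 16000
        System.out.println(convertTwoBytesToInt(convertIntToTwoByte(16900))); // 16900, not 1060
    }
}
```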

Why Java throws a NumberFormatException

I got an exception while parsing a string to bytes:
String Str = "9B7D2C34A366BF890C730641E6CECF6F";
String[] st = Str.split("(?<=\\G.{2})");
byte[] bytes = new byte[st.length];
for (int i = 0; i < st.length; i++) {
    bytes[i] = Byte.parseByte(st[i]);
}
That's because the default parse method expects a number in decimal format; to parse a hexadecimal number, use this overload:
Byte.parseByte(st[i], 16);
where 16 is the radix for the parsing.
As for your comment, you are right: the maximum value of Byte is 0x7F, so digit pairs such as "9B" still overflow. You can instead parse as int, perform a bitwise AND with 0xff to get the LSB, and cast back to byte:
bytes[i] = (byte) (Integer.parseInt(st[i], 16) & 0xFF);
Assuming you want to parse the string as hexadecimal, try this:
bytes[i] = Byte.parseByte(st[i], 16);
The default radix is 10, and obviously B is not a base-10-digit.
Java is very picky about signedness; it will not accept values that overflow. Thus, if you parse a Byte and the value is larger than 127 (for example, 130 decimal, which is 82 hex) you will get a NumberFormatException. The same happens if you parse an 8-digit hex number as an Integer (or a 16-digit hex number as a Long) and it starts with 8-F. Such values are not interpreted as negative (two's complement) but as illegal.
If you think that this is anal retentive, I totally agree. But that's Java style.
To parse hex values as two's complement numbers, either use a large enough integer type (for example, if you are parsing a Byte, use Integer instead and type-cast it to a byte later) or -- if you need to parse a Long -- split the number in half if it is 16 digits, then combine. Here's an example:
public static long longFromHex(String s) throws IllegalArgumentException {
    if (s.length() == 16)
        return (Long.parseLong(s.substring(0, 8), 16) << 32)
                | (Long.parseLong(s.substring(8, 16), 16) & 0xffffffffL);
    return Long.parseLong(s, 16);
}
Or, to read a Byte, just use Integer instead:
public static byte byteFromHex(String s) throws IllegalArgumentException {
    int i = Integer.parseInt(s, 16);
    if (i < 0 || i > 255) throw new IllegalArgumentException("input string " + s + " does not fit into a Byte");
    return (byte) i;
}
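Putting the Integer.parseInt approach back into the question's loop, a minimal sketch (the class and method names are illustrative):

```java
public class HexParseDemo {
    static byte[] parseHex(String s) {
        String[] parts = s.split("(?<=\\G.{2})");
        byte[] bytes = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // parseInt accepts values above 0x7F; the cast keeps the low byte.
            bytes[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] b = parseHex("9B7D");
        System.out.println(b[0]); // -101 (0x9B as a signed byte)
        System.out.println(b[1]); // 125  (0x7D)
    }
}
```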

Which SHA-256 is correct? The Java SHA-256 digest or the Linux commandline tool

When I calculate in Java an SHA-256 of a string with the following method
public static void main(String[] args) throws NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("SHA-256");
    byte[] hash = md.digest("password".getBytes());
    StringBuffer sb = new StringBuffer();
    for (byte b : hash) {
        sb.append(Integer.toHexString(b & 0xff));
    }
    System.out.println(sb.toString());
}
I get :
5e884898da2847151d0e56f8dc6292773603dd6aabbdd62a11ef721d1542d8
on the command line I do the following (I need the -n so no newline is added):
echo -n "password" | sha256sum
and get
5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8
If we compare these more closely, I find 2 subtle differences:
5e884898da2847151d0e56f8dc6292773603dd6aabbdd62a11ef721d1542d8
5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8
or :
5e884898da28 47151d0e56f8dc6292773603d d6aabbdd62a11ef721d1542d8
5e884898da28 0 47151d0e56f8dc6292773603d 0 d6aabbdd62a11ef721d1542d8
Which of the 2 is correct here?
Result: both are correct, but I was wrong...
I fixed it by using:
StringBuffer sb = new StringBuffer();
for (byte b : hash) {
    sb.append(String.format("%02x", b));
}
Thanks!
I'll take a reasonable guess: both are outputting the same digest, but your Java code, which outputs the byte[] result as a hex string, is writing small byte values (less than 16) without a leading 0. So a byte with value 0x0d is being written as "d", not "0d".
The culprit is the toHexString. It appears to be outputting 6 for the value 6 whereas the sha256sum one is outputting 06. The Java docs for Integer.toHexString() state:
This value is converted to a string of ASCII digits in hexadecimal (base 16) with no extra leading 0s.
The other zeros in the string aren't being affected since they're the second half of the bytes (e.g., 30).
One way to fix it would be to change:
for (byte b : hash) {
    sb.append(Integer.toHexString(b & 0xff));
}
to:
for (byte b : hash) {
    if (b < 16) sb.append("0");
    sb.append(Integer.toHexString(b & 0xff));
}
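The underlying difference can be seen on a single low byte (a small sketch):

```java
public class LeadingZeroDemo {
    public static void main(String[] args) {
        byte b = 0x0d;
        System.out.println(Integer.toHexString(b & 0xff)); // "d" -- leading zero dropped
        System.out.println(String.format("%02x", b));      // "0d"
    }
}
```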
They're both right - it's your Java code that is at fault, because it is not printing out the leading 0 for a hex value less than 0x10.
You still need "echo -n" to prevent the trailing \n
The one generated by sha256sum seems correct. Your implementation seems to drop those two zeroes.
Using #paxdiablo's idea, I had a problem with big values, as they appear negative, so
instead of:
for (byte b : hash) {
    sb.append(Integer.toHexString(b & 0xff));
}
you could do:
for (byte b : hash) {
    if (b >= 0 && b < 16) {
        sb.append("0");
    }
    sb.append(Integer.toHexString(b & 0xff));
}
And read #Sean Owen's answer.
You can also get the right result using this:
MessageDigest md = MessageDigest.getInstance("SHA-256");
byte[] hash = md.digest("password".getBytes());
BigInteger bI = new BigInteger(1, hash);
System.out.println(bI.toString(16));
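One caveat worth noting: BigInteger.toString(16) drops leading zero bytes of the digest, so a hash that happens to start with 0x00 comes out shorter than expected. A small sketch with arbitrary sample bytes:

```java
import java.math.BigInteger;

public class BigIntHexDemo {
    public static void main(String[] args) {
        byte[] hash = { 0x00, 0x2f, (byte) 0xa1 };
        System.out.println(new BigInteger(1, hash).toString(16)); // 2fa1 -- leading 00 lost
        StringBuilder sb = new StringBuilder();
        for (byte b : hash) sb.append(String.format("%02x", b));
        System.out.println(sb); // 002fa1
    }
}
```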
