The question is about the correct way of creating a hash in Java:
Let's assume I have a positive BigInteger value that I would like to create a hash from, and that messageDigest below is a valid SHA-256 instance.
public static final BigInteger B = new BigInteger("BD0C61512C692C0CB6D041FA01BB152D4916A1E77AF46AE105393011BAF38964DC46A0670DD125B95A981652236F99D9B681CBF87837EC996C6DA04453728610D0C6DDB58B318885D7D82C7F8DEB75CE7BD4FBAA37089E6F9C6059F388838E7A00030B331EB76840910440B1B27AAEAEEB4012B7D7665238A8E3FB004B117B58", 16);
byte[] byteArrayBBigInt = B.toByteArray();
this.printArray(byteArrayBBigInt);
messageDigest.reset();
messageDigest.update(byteArrayBBigInt);
byte[] outputBBigInt = messageDigest.digest();
Now I only assume that the code above is correct, because according to my test the hashes I produce match the one produced by:
http://www.fileformat.info/tool/hash.htm?hex=BD0C61512C692C0CB6D041FA01BB152D4916A1E77AF46AE105393011BAF38964DC46A0670DD125B95A981652236F99D9B681CBF87837EC996C6DA04453728610D0C6DDB58B318885D7D82C7F8DEB75CE7BD4FBAA37089E6F9C6059F388838E7A00030B331EB76840910440B1B27AAEAEEB4012B7D7665238A8E3FB004B117B58
However, I am not sure why we perform the step below.
Because the byte array returned by the digest() call is signed, and in this case negative, I suspect that we need to convert it to a positive number, e.g. with a function like this:
public static String byteArrayToHexString(byte[] b) {
String result = "";
for (int i=0; i < b.length; i++) {
result += Integer.toString((b[i] & 0xff) + 0x100, 16).substring(1);
}
return result;
}
thus:
String hex = byteArrayToHexString(outputBBigInt);
BigInteger unsignedBigInteger = new BigInteger(hex, 16);
When I construct a BigInteger from the new hex string and convert it back to a byte array, I see that the sign bit (the most significant, i.e. leftmost, bit) is set to 0, which means that the number is positive; moreover, the whole leading byte consists of zeros (00000000).
My question is: is there any RFC that describes why we always need to convert the hash to a "positive" unsigned byte array? Even if the number produced by the digest call is negative, it is still a valid hash, right? So why do we need that additional step? Basically, I am looking for a document, a standard or an RFC, stating that we need to do so.
A hash consists of an octet string (called a byte array in Java). How you convert it to or from a large number (a BigInteger in Java) is completely out of the scope for cryptographic hash algorithms. So no, there is no RFC to describe it as there is (usually) no reason to treat a hash as a number. In that sense a cryptographic hash is rather different from Object.hashCode().
That hexadecimal strings can only be treated as unsigned is a bit of an issue, but if you really want a signed value you can first convert the hex back to a byte array and then call new BigInteger(result). That constructor does treat the encoding within result as signed. Note that in protocols it is often not necessary to convert back and forth to hexadecimal; hexadecimal is mainly for human consumption, and a computer is fine with bytes.
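To make the signed versus unsigned reading concrete, here is a minimal sketch. The class and method names (DigestAsNumber, sha256, signed, unsigned) are made up for illustration and are not from the question; only MessageDigest and the two BigInteger constructors are standard API.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DigestAsNumber {

    // Hash the input with SHA-256; the result is always 32 raw bytes.
    static byte[] sha256(byte[] input) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(input);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // Signed reading: the top bit of the first digest byte acts as a sign bit.
    static BigInteger signed(byte[] digest) {
        return new BigInteger(digest);
    }

    // Unsigned reading: signum 1 makes the bytes a non-negative magnitude.
    static BigInteger unsigned(byte[] digest) {
        return new BigInteger(1, digest);
    }

    public static void main(String[] args) {
        byte[] digest = sha256("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println("signed:   " + signed(digest).toString(16));
        System.out.println("unsigned: " + unsigned(digest).toString(16));
    }
}
```

Both numbers come from the same 32 bytes, so the hash itself is identical either way; only the numeric interpretation differs. If a protocol needs the number, it has to say which reading it means.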
Related
I have a function for hashing passwords, that returns a byte[] with entries using the full range of the byte datatype from -128 to 127. I have tried to convert the byte[] to a String using new String(byte_array, StandardCharsets.UTF_8);. This does return a String - however it can not properly encode negative numbers - hence it encodes them to a "�" character. When comparing two of those characters using: new String(new byte[]{-1}, StandardCharsets.UTF_8).equals(new String(new byte[]{-2}, StandardCharsets.UTF_8)) it turns out the String representation for all negative numbers is equal as the expression above returns true. While this doesn't fully ruin my hashing functionality as the hash of the same expression will still always yield the same result, this is obviously not what I want as it increases the chance of two different inputs yielding the same output drastically.
Is there an easy fix for this, or an alternative way to convert the byte[] to a String? For context, I want to write the String to a file and later read it back to compare it with other hashes.
Edit: After a bit of trying around with the tips from the comments my solution is to convert the byte[] to a char[] and add 128 to every value. The char array can then easily be converted to a String or be written to a file directly (byteHash is the byte[]):
char[] charHash = new char[byteHash.length];
for(int i = 0; i < byteHash.length; i++){
charHash[i] = (char) (byteHash[i]+128);
}
return new String(charHash);
I do not really like the solution but it works.
The appropriate solution to this is to use an encoding like hexadecimal (https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/HexFormat.html) or Base64 (https://docs.oracle.com/javase/8/docs/api/java/util/Base64.html) to convert an arbitrary byte sequence to a string reversibly.
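A minimal sketch of the Base64 route, using java.util.Base64 (available since Java 8). The class name HashText and its two helper methods are invented for this example.

```java
import java.util.Arrays;
import java.util.Base64;

final class HashText {

    // Encode arbitrary bytes as ASCII text that is safe to write to a file.
    static String encode(byte[] hash) {
        return Base64.getEncoder().encodeToString(hash);
    }

    // Recover exactly the original bytes from the text form.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] hash = {-1, -2, 0, 127, -128};
        String text = encode(hash);
        System.out.println(text);
        System.out.println(Arrays.equals(hash, decode(text))); // true
    }
}
```

Unlike new String(bytes, UTF_8), distinct byte arrays always produce distinct strings here, and decoding gives back exactly the bytes that were encoded.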
I've got a huge string of bits (with some \n in it too) that I pass as a parameter to a method, which should isolate the bits 8 by 8, and convert them all to bytes using parseInt().
Thing is, every time the substring of 8 bits starts with a 1, the resulting byte is a negative number. For example, the first substring is '10001101', and the resulting byte is -115. I can't seem to figure out why, can someone help? It works fine with other substrings.
Here's my code, if needed :
static String bitsToBytes(String geneString) {
String geneString_temp = "", sub;
for(int i = 0; i < geneString.length(); i = i+8) {
sub = geneString.substring(i, i+8);
if (sub.indexOf("\n") != -1) {
if (sub.indexOf("\n") != geneString.length())
sub = sub.substring(0, sub.indexOf("\n")) + sub.substring(sub.indexOf("\n")+1, sub.length()) + geneString.charAt(i+9);
}
byte octet = (byte) Integer.parseInt(sub, 2);
System.out.println(octet);
geneString_temp = geneString_temp + octet;
}
geneString = geneString_temp + "\n";
return geneString;
}
In Java, byte is a signed type, meaning that when the most significant bit is set to 1, the number is interpreted as negative.
This is precisely what happens when you print your byte here:
System.out.println(octet);
Since PrintStream does not have an overload of println that takes a single byte, the overload that takes an int gets called. Since octet's most significant bit is set to 1, the number gets sign-extended by replicating its sign bit into bits 9..32, resulting in printout of a negative number.
byte is a signed two's complement integer. So this is a normal behavior: the two's complement representation of a negative number has a 1 in the most-significant bit. You could think of it like a sign bit.
If you don't like this, you can use the following idiom:
System.out.println( octet & 0xFF );
This will pass the byte as an int while preventing sign extension. You'll get an output as if it were unsigned.
Java has no unsigned byte type (char is its only unsigned integral primitive), so the only other thing you could do is store the numbers in a wider representation, e.g. short or int.
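A small sketch of the masking idiom applied to the substring from the question. The class name UnsignedByteDemo and the helper toUnsigned are hypothetical; Byte.toUnsignedInt is the standard library equivalent (Java 8 and later).

```java
final class UnsignedByteDemo {

    // Widen a byte to int without sign extension (result is 0..255).
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        // parseInt yields 141; the cast to byte wraps it to -115.
        byte octet = (byte) Integer.parseInt("10001101", 2);

        System.out.println(octet);                     // -115
        System.out.println(toUnsigned(octet));         // 141
        System.out.println(Byte.toUnsignedInt(octet)); // 141
    }
}
```

The stored bits are the same in all three cases; only the interpretation at print time changes.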
In Java, byte, short, int and long are all signed, and the most significant bit is the sign bit.
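Integer.parseInt("10001101", 2) itself returns 141, a positive int; the value only becomes negative in the cast to byte, because byte is signed and covers -128 to 127. If you want the unsigned value, keep the result in an int instead of casting it to byte, or mask the byte with & 0xFF when you read it back.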
I am creating an encryption algorithm that needs to XOR two strings. I know how to XOR them; the problem is the length. I have two byte arrays: one for the plain text, which is of variable size, and one for the key, which is, say, 56 bytes. What is the correct method of XORing the two? Should I concatenate them into one binary string and XOR the two values, or XOR each byte array position with a concatenated binary value of the key, or something else? Any help is greatly appreciated.
Regards,
Milinda
To encode just move through the array of bytes from the plain text, repeating the key as necessary with the mod % operator. Be sure to use the same character set at both ends.
Conceptually we're repeating the key like this, ignoring encoding.
hello world, there are sheep
secretsecretsecretsecretsecr
Encrypt
String plainText = "hello world, there are sheep";
Charset charSet = Charset.forName("UTF-8");
byte[] plainBytes = plainText.getBytes(charSet);
String key = "secret";
byte[] keyBytes = key.getBytes(charSet);
byte[] cipherBytes = new byte[plainBytes.length];
for (int i = 0; i < plainBytes.length; i++) {
    cipherBytes[i] = (byte) (plainBytes[i] ^ keyBytes[i % keyBytes.length]);
}
String cipherText = new String(cipherBytes, charSet);
System.out.println(cipherText);
To decrypt just reverse the process.
// decode
for (int i = 0; i < cipherBytes.length; i++) {
    plainBytes[i] = (byte) (cipherBytes[i] ^ keyBytes[i % keyBytes.length]);
}
plainText = new String(plainBytes, charSet); // <= make sure same charset both ends
System.out.println(plainText);
(As noted in comments, you shouldn't use this for anything real. Proper cryptography is incredibly hard to do properly from scratch - don't do it yourself, use existing implementations.)
There's no such concept as "XOR" when it comes to strings, really. XOR specifies the result given two bits, and text isn't made up of bits - it's made up of characters.
Now you could just take the Unicode representation of each character (an integer) and XOR those integers together - but the result may well be a sequence of integers which is not a valid Unicode representation of any valid string.
It's not clear that you're even thinking in the right way to start with - you talk about having strings, but also having 56 bytes. You may have an encoded representation of a string (e.g. the result of converting a string to UTF-8) but that's not the same thing.
If you've got two byte arrays, you can easily XOR those together - and perhaps cycle back to the start of one of them if it's shorter than the other, so that the result is always the same length as the longer array. However, even if both inputs are (say) UTF-8 encoded text, the result often won't be valid UTF-8 encoded text. If you must have the result in text form, I'd suggest using Base64 at that point - there's a public domain base64 encoder which has a simple API.
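As a rough sketch of that suggestion, the following XORs a data array against a key that is cycled as needed, then Base64-encodes the result for transport as text. It uses java.util.Base64 from the JDK (Java 8 and later) rather than a third-party encoder, and the names XorBytes and xor are invented for this example. Note that this variant sizes the output to the data array, which assumes the data is the longer input. It is for illustration only and is not secure encryption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class XorBytes {

    // XOR each data byte with the key byte at the same position, wrapping the key.
    static byte[] xor(byte[] data, byte[] key) {
        if (key.length == 0) {
            throw new IllegalArgumentException("key must not be empty");
        }
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key[i % key.length]);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] plain = "hello world, there are sheep".getBytes(StandardCharsets.UTF_8);
        byte[] key = "secret".getBytes(StandardCharsets.UTF_8);

        // The XOR result is arbitrary bytes, so carry it as Base64 text.
        String cipherText = Base64.getEncoder().encodeToString(xor(plain, key));
        System.out.println(cipherText);

        // Applying the same XOR again restores the original bytes.
        byte[] restored = xor(Base64.getDecoder().decode(cipherText), key);
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```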
I try to compare 2 byte arrays.
Byte array 1 is an array with the last 3 bytes of a sha1 hash:
private static byte[] sha1SsidGetBytes(byte[] sha1)
{
return new byte[] {sha1[17], sha1[18], sha1[19]};
}
Byte array 2 is an array that I fill with 3 bytes coming from an hexadecimal string:
private static byte[] ssidGetBytes(String ssid)
{
BigInteger ssidBigInt = new BigInteger(ssid, 16);
return ssidBigInt.toByteArray();
}
How is it possible that this comparison:
if (Arrays.equals(ssidBytes, sha1SsidGetBytes(snSha1)))
{
}
works most of the time but sometimes does not? Is it byte order?
e.g. for "6451E6" (hex string) it works fine, for "ABED74" it does not...
The problem is pretty obvious if you try this:
BigInteger b1 = new BigInteger("6451E6", 16);
BigInteger b2 = new BigInteger("ABED74", 16);
System.out.println(b1.toByteArray().length);
System.out.println(b2.toByteArray().length);
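Specifically, ABED74 creates a BigInteger whose byte array is 4 bytes long: its top bit is set, so toByteArray() prepends a 0x00 byte to keep the two's-complement value positive. So of course it's not going to be equal to any three-byte array.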
The straightforward fix is to change the return statement in ssidGetBytes from
return ssidBigInt.toByteArray();
to
byte[] ba = ssidBigInt.toByteArray();
return new byte[] { ba[ba.length - 3], ba[ba.length - 2], ba[ba.length - 1] };
Your approach of parsing a hex string via BigInteger is flawed, basically. For example, new BigInteger("ABED74", 16).toByteArray() returns an array of 4 bytes, not three. While you could hack around this, you're fundamentally not trying to do anything involving BigInteger values... you're just trying to parse hex.
I suggest you use the Apache Codec library to do the parsing:
byte[] array = (byte[]) new Hex().decode(text);
(The API for Apache Codec leaves something to be desired, but it does work.)
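If adding a library is not an option, hex parsing is short enough to write by hand. The sketch below is one possible version; HexParse and parseHex are hypothetical names, not part of Apache Codec or the JDK.

```java
final class HexParse {

    // Convert a hex string such as "ABED74" to exactly hex.length() / 2 bytes.
    static byte[] parseHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit at pair " + i);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parseHex("6451E6").length); // 3
        System.out.println(parseHex("ABED74").length); // 3
    }
}
```

Because no number is ever constructed, there is no sign byte to strip and leading zero bytes are preserved.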
From the javadoc's (emphasis mine):
http://download.oracle.com/javase/1.5.0/docs/api/java/math/BigInteger.html#toByteArray%28%29
Returns a byte array containing the two's-complement representation of this BigInteger. The byte array will be in big-endian byte-order: the most significant byte is in the zeroth element. The array will contain the minimum number of bytes required to represent this BigInteger, including at least one sign bit, which is (ceil((this.bitLength() + 1)/8)). (This representation is compatible with the (byte[]) constructor.)
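Plugging the two values from the question into that formula shows where the fourth byte comes from. This is only a sketch; the class ToByteArrayLength and the helper expectedLength are made up for the illustration.

```java
import java.math.BigInteger;

final class ToByteArrayLength {

    // ceil((bitLength + 1) / 8), written with integer arithmetic.
    static int expectedLength(BigInteger value) {
        return value.bitLength() / 8 + 1;
    }

    public static void main(String[] args) {
        for (String hex : new String[] {"6451E6", "ABED74"}) {
            BigInteger value = new BigInteger(hex, 16);
            System.out.println(hex
                    + " bitLength=" + value.bitLength()
                    + " expected=" + expectedLength(value)
                    + " actual=" + value.toByteArray().length);
        }
    }
}
```

6451E6 needs 23 bits, so 24 bits including the sign bit fit in three bytes. ABED74 needs all 24 bits, so the sign bit spills into a fourth, leading zero byte.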
There are a lot of computations going on inside the BigInteger(String, int radix) constructor that you are using, which does not guarantee that the constructed BigInteger will produce a byte array (via its toByteArray() method) comparable to the result of a String's getBytes() encoding.
The output of toByteArray() is intended to be used (mostly) as input to the (byte[]) constructor of BigInteger. It makes no guarantee for uses other than those.
Look at it like this: the output of toByteArray() is the byte representation of the BigInteger object and everything in it including internal attributes like magnitude. Those attributes do not exist in the input String, but are computed during construction of the BigInteger object.
That will be incompatible with the byte representation of the input String, which only carries the initial numeric value with which to create a BigInteger.
I want to compute Hash of a String, but the Hash value should be a number (long or integer).
In other words I want to compute integer hash of a string.
Collision resistance is not a concern.
Is there a way to convert a SHA-256 MessageDigest result to a number?
I am using Java to accomplish this.
Try to call hashCode() method. It is already implemented and does exactly what you want.
Most obviously there is a hashCode() method on String
As for converting the MessageDigest to a number, you can either use hashCode again or take the byte array from the digest and compact this down to whatever size you want, integer, long or whatever with (say) xor.
// Needs: java.nio.ByteBuffer, java.nio.IntBuffer, java.security.MessageDigest
public int compactDigest(MessageDigest digest) {
    byte[] byteArr = digest.digest();
    // +3 because the int view rounds the length down to a multiple of four,
    // and we don't want to lose any trailing bytes.
    ByteBuffer bytes = ByteBuffer.allocate(byteArr.length + 3);
    bytes.put(byteArr);
    bytes.rewind();
    IntBuffer ints = bytes.asIntBuffer();
    // Fold all 32-bit words of the digest together with XOR.
    int compactDigest = 0;
    for (int i = 0; i < ints.limit(); ++i) {
        compactDigest ^= ints.get(i);
    }
    return compactDigest;
}
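If a long is wanted instead of an int, another option is simply to truncate the digest. The sketch below takes the first 8 bytes of the SHA-256 digest as a big-endian long; DigestToLong and hash64 are hypothetical names, and truncation (rather than XOR folding) is a choice made for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DigestToLong {

    // Interpret the first 8 digest bytes as a big-endian long (may be negative).
    static long hash64(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.wrap(digest).getLong();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(hash64("abc")));
    }
}
```

The result keeps only 64 of the 256 bits, so collisions become far more likely than with the full digest, which matches the stated requirement that collision resistance is not a concern.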
A SHA-256 hash has 256 bits, e.g.
"364b7e70a9966ef7686ab814958cd0017b7f19147a257d40603d4a1307662b42"
This will exceed the range of long and int.
You could use new BigInteger( hash, 16 ); for a decimal representation.
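public static void main(String[] args) throws NoSuchAlgorithmException {
    MessageDigest digest = MessageDigest.getInstance("SHA-256");
    // Name the charset explicitly so the hash does not depend on the platform default.
    digest.update("string".getBytes(StandardCharsets.UTF_8));
    byte[] hash = digest.digest();
    // Signum 1 reads the bytes as an unsigned magnitude, matching new BigInteger(hex, 16).
    BigInteger bi = new BigInteger(1, hash);
    System.out.println("hex:" + bi.toString(16) + "\r\ndec:" + bi.toString());
}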
The String class has a hashCode() method, like any other Java class, which transforms the string into a number. See the documentation of this method for the exact algorithm it uses.
Every object in Java has a hashCode() method. You can override it and specify your own logic. Look at the examples.
Please find it here: http://pastebin.com/j6Cffkcp
But it returns only a string.
Cryptographic hashes created using the JCE classes (MessageDigest in your case) are essentially a sequence of bytes (256 bits for SHA-256). If you wish to store and manage these as numbers, you'll need to convert them into BigInteger or BigDecimal objects (given the length of the digest).
Cryptographic hashes of String objects are not computed all that often, and when they are, it is usually for one-way protection of secrets. If you are using the hash for other purposes, especially to ensure some sort of uniqueness among the Strings (which would be important when storing these objects in a hash map), you're better off using the hash value computed by the String.hashCode method.