In some tests I am using the MessageDigest class in Groovy, and sometimes this function returns an incorrect value. Here is my code below:
import java.security.MessageDigest;
String.metaClass.toSHA256 = {
    def messageDigest = MessageDigest.getInstance("SHA-256")
    messageDigest.update(delegate.getBytes("UTF-8"))
    new BigInteger(1, messageDigest.digest()).toString(16).padLeft(40, '0')
}
For example, I tried to encode this string to SHA-256:
582015-04-23 20:47:112015-04-23 23:59:000020502015-04-23 20:47:11tests-from-api["afoot"]33facafaece3afd353bcbe88637d11b7
My method returns
cb2814380117cd5621064c1d7512b32e3cb8c8cb2b1f20016f6da763598d738
but online generators return 0cb2814380117cd5621064c1d7512b32e3cb8c8cb2b1f20016f6da763598d738.
It computes the wrong value roughly 2 times in 40 tries.
Could you help me fix it?
You're missing a '0' at the start, because you're padding left to 40 characters (presumably having copied that code from somewhere a 40-character hash was expected, such as SHA-1) instead of the 64 characters that actually make up a SHA-256 hash (in hex).
So you could just fix the padLeft call to padLeft(64, '0') - but personally I would just avoid using BigInteger for hex conversions. That's not what it's designed for - it's designed for maths operations on large integers.
Instead, use one of the many hex converters in common utility libraries - which are designed precisely to convert byte arrays to hex, with nothing about integers at all. If you don't want to use a library, there's plenty of code on Stack Overflow to convert a byte[] to hex.
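For illustration, here's a minimal sketch of such a converter in plain Java (which is also valid Groovy); the class and method names are just placeholders:

import java.security.MessageDigest;

public class Sha256Hex {
    // Convert each byte to exactly two hex characters; there is no integer
    // interpretation involved, so leading zeros are never dropped.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest("some input".getBytes("UTF-8"));
        System.out.println(toHex(digest)); // always 64 characters for SHA-256
    }
}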
Related
I have this code created using Google Guava:
String sha256hex = Hashing.sha256()
.hashString(cardNum, StandardCharsets.UTF_8)
.toString();
How can I verify that the generated value is a properly generated hash?
SHA-256 and, in general, the SHA-2 family of algorithms are wonderfully described on Wikipedia and in several RFCs: RFC 6234 and the superseded RFC 4634.
All these sources state that the output of the SHA-256 hash function is 256 bits long, i.e. 32 bytes (roughly speaking, the number that accompanies 'SHA' is the output length in bits for every algorithm in the family).
This sequence of bytes is typically encoded as hex, which is what Guava's implementation returns as well.
Then the problem reduces to identifying whether a string in Java is a valid hex encoding.
That problem has already been answered here on SO, for example in this question.
For its simplicity, consider the solution proposed by @laycat:
boolean isHex = mac_addr.matches("^[0-9a-fA-F]+$");
As every byte is encoded with two hex characters and, as mentioned, the SHA-256 algorithm produces an output of 32 bytes, you can safely check for a string of 64 characters, as suggested in the answer by @D.O. as well. Your validation code could be similar to this:
boolean canBeSha256Output = sha256Hex.matches("^[0-9a-fA-F]{64}$");
Please be aware that there is no way to tell whether a hex string of a certain length is, on its own, the result of a hash function, whichever hash function you consider.
You can only be sure that a hash output is a hash output if and only if it matches the result of applying the corresponding hash function to the original input.
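For illustration, a minimal sketch of that check using the same Guava call as in the question (the class and method names are placeholders; MessageDigest.isEqual is just an extra precaution to make the comparison timing-safe):

import com.google.common.hash.Hashing;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class Sha256Verifier {
    // Recompute the hash from the original input and compare with the claimed hex.
    static boolean matches(String input, String claimedHex) {
        String actual = Hashing.sha256()
                .hashString(input, StandardCharsets.UTF_8)
                .toString(); // lower-case hex, 64 characters
        return MessageDigest.isEqual(
                actual.getBytes(StandardCharsets.US_ASCII),
                claimedHex.toLowerCase().getBytes(StandardCharsets.US_ASCII));
    }
}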
You could use a regex to verify that it looks like a SHA-256 hash (64 hexadecimal characters), like
\b[A-Fa-f0-9]{64}\b
I have a String hash in hex form ("e6fb06210fafc02fd7479ddbed2d042cc3a5155e") and I would like to compare it to crypt.digest().
One way, which works fine, is to convert crypt.digest() to hex, but I would like to avoid multiple conversions and rather convert hash from hex form (above) to byte array.
What I tried was:
byte[] hashBytes = new BigInteger(hash, 16).toByteArray();
but it does not match crypt.digest(). When I convert hashBytes back to hex I get "00e6fb06210fafc02fd7479ddbed2d042cc3a5155e".
The leading zeros seem to be the reason why I fail to match byte arrays. Why do they occur? How can I get the same result using crypt.digest() and toByteArray?
The reason for the extra 00 is that e6 has its high (sign) bit set.
BigInteger prepends a redundant 00 byte so that the value is interpreted as non-negative.
String hash = "e6fb06210fafc02fd7479ddbed2d042cc3a5155e";
byte[] hashBytes = new BigInteger(hash, 16).toByteArray();
// Strip the sign byte that BigInteger may have prepended.
hashBytes = hashBytes.length > 1 && hashBytes[0] == 0
        ? Arrays.copyOfRange(hashBytes, 1, hashBytes.length) : hashBytes;
System.out.println(Arrays.toString(hashBytes));
The question arises: what if the hash actually starts with a 00 byte?
Then BigInteger drops that byte too, so you need to know the expected hash length, or do a lenient comparison.
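One hedged way to cover both cases is to pad the result back out to the known digest length; the helper below is a hypothetical sketch (20 bytes is the length of a SHA-1 digest like the one in the question):

import java.math.BigInteger;

class DigestBytes {
    // Copy into a fixed-size array: drops an extra sign byte at the front
    // and restores any leading zero bytes that BigInteger discarded.
    static byte[] toFixedLength(BigInteger value, int byteLength) {
        byte[] raw = value.toByteArray();
        byte[] out = new byte[byteLength];
        int copy = Math.min(raw.length, byteLength);
        System.arraycopy(raw, raw.length - copy, out, byteLength - copy, copy);
        return out;
    }
}

With that, Arrays.equals(DigestBytes.toFixedLength(new BigInteger(hash, 16), 20), crypt.digest()) should hold.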
The answer can be found in the following comment from a thread about the highly related question Convert a string representation of a hex dump to a byte array using Java?:
The issue with BigInteger is that there must be a "sign bit". If the leading byte has the high bit set then the resulting byte array has an extra 0 in the 1st position. But still +1.
– Gray Oct 28 '11 at 16:20
Since the first bit has a special meaning (indicating the sign, 0 for positive, 1 for negative), BigInteger will prefix the data with an additional 0 in case your data started with a 1 on the high bit. Otherwise it would be interpreted as negative although it was not negative to begin with.
I.e. data like
101110
is turned into
0101110
You could easily undo this manually by using Arrays.copyOfRange(data, 1, data.length) if it happens.
However, instead of fixing that code, I would suggest using one of the other solutions posted in the linked thread. They are cleaner and easier to read and maintain.
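One widely used pattern from that thread parses two hex digits per byte directly; this is a sketch that assumes an even-length, valid hex string:

// Two hex characters per byte; no BigInteger, no sign handling needed.
public static byte[] hexToBytes(String s) {
    int len = s.length();
    byte[] data = new byte[len / 2];
    for (int i = 0; i < len; i += 2) {
        data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
                + Character.digit(s.charAt(i + 1), 16));
    }
    return data;
}

Then Arrays.equals(hexToBytes(hash), crypt.digest()) compares the arrays without any extra bytes getting in the way.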
I am trying to generate an alphanumeric string using SecureRandom.
Here is my little code:
import java.math.BigInteger;
import java.security.SecureRandom;
public class GenerateSecureRandom {

    private static final SecureRandom SECURE_RANDOM = new SecureRandom();

    public static String nextSessionId() {
        return new BigInteger(64, SECURE_RANDOM).toString(16);
    }
}
It works! I get, for example, this: 7c52cfce6c479803 = 16 characters: OK!
My problem is that sometimes this code returns a string of 15 characters, and I do not understand why. (I'm a beginner...)
Here is an example: 515c38584d0a077 = 15 characters: ERROR
What am I doing wrong?
I am a beginner in Java programming, so please be indulgent if my way of proceeding is not correct :)
If this code is not correct, how can I get the expected result? I would like to use SecureRandom.
Thank you in advance for your answers
Sorry if my question is a duplicate; I searched without finding an answer...
The BigInteger constructor you're using takes a maximum bit length.
Constructs a randomly generated BigInteger, uniformly distributed over the range 0 to (2^numBits - 1), inclusive.
Parameters:
numBits - maximum bitLength of the new BigInteger.
You may, more rarely, get strings with even fewer than 15 characters.
This occurs because the most significant bit(s) may be zero, resulting in a hexadecimal number that doesn't need a full 16 characters to be represented. Notice how none of the strings you generate start with 0.
If there are fewer than 16 characters, prepend '0' characters.
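A minimal sketch of that fix (the class name is a placeholder; %016x left-pads the 64-bit value with zeros to exactly 16 hex characters):

import java.math.BigInteger;
import java.security.SecureRandom;

public class GeneratePaddedSecureRandom {

    private static final SecureRandom SECURE_RANDOM = new SecureRandom();

    public static String nextSessionId() {
        // The %016x format zero-pads, so the result is always 16 characters.
        return String.format("%016x", new BigInteger(64, SECURE_RANDOM));
    }
}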
I have written a method to convert plain text into its hash using the MD5 algorithm. Please find the code below.
public static String convertToMD5Hash(final String plainText) {
    MessageDigest messageDigest = null;
    try {
        messageDigest = MessageDigest.getInstance("MD5");
    } catch (NoSuchAlgorithmException e) {
        LOGGER.warn("For some weird reason the MD5 algorithm was not found.", e);
    }
    messageDigest.reset();
    messageDigest.update(plainText.getBytes());
    final byte[] digest = messageDigest.digest();
    // Interpret the 16 digest bytes as a positive integer and render it as hex.
    final BigInteger bigInt = new BigInteger(1, digest);
    final String hashtext = bigInt.toString(16);
    return hashtext;
}
This method works perfectly, but it returns a lengthy hash. I need to limit the hash text to 8 characters. Is there any possibility to set the length of hash codes in Java?
Yes and no. You can use a substring of the original hash if you always cut the original hash string the same way (i.e. the first or last 8 characters). What you are going to do with that "semi-hash" is another thing.
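For illustration, cutting consistently could look like this (fullHash stands for the hex string returned by your method):

// Always take the same slice, e.g. the first 8 hex characters (32 bits).
String semiHash = fullHash.substring(0, 8);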
Whatever it is you're going to do, be sure it has nothing to do with security.
Here's why: MD5 is a 128-bit hash, so there are 2^128 = ~340,000,000,000,000,000,000,000,000,000,000,000,000 possible values. That astronomical number of values is what makes brute-forcing this kind of string virtually impossible. By cutting it down to 8 hex characters, you'll end up with a 32-bit hash, because a single hex value takes 4 bits to represent (thus also 128 bits / 4 bits = 32 hex values). With a 32-bit hash there are only 2^32 = 4,294,967,296 combinations. That's about 79,228,162,514,264,337,593,543,950,336 (2^96) times less secure than the original 128-bit hash and can be broken in a matter of seconds by any old computer with the processing power of an '80s calculator.
No. MD5 is defined to return 128 bit values. You could use Base64 to encode them to ASCII and truncate it using String#substring(0, 8).
In Java 8 (not officially released yet), you can encode a byte[] to Base64 as follows:
String base64 = Base64.getEncoder().encodeToString(digest);
For earlier Java versions see Decode Base64 data in Java
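Putting both steps together, a minimal sketch for Java 8+ (the input text and class name are placeholders):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

public class ShortHash {
    public static void main(String[] args) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest("some plain text".getBytes(StandardCharsets.UTF_8));
        // Base64 carries 6 bits per character, so 8 characters keep 48 bits.
        String short8 = Base64.getEncoder().encodeToString(digest).substring(0, 8);
        System.out.println(short8);
    }
}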
All hash algorithms should change bits throughout the whole hash whenever any part of the data changes, so you can just choose 8 chars from your hash. Just don't pick them randomly - the choice must be reproducible.
Firstly, as everyone has mentioned, the 64-bit hash is not secure enough. Ultimately it depends on what exactly you plan to do with the hash.
If you still need to convert this to 8 characters, I suggest downcasting the BigInteger to a long value using BigInteger.longValue().
It will ensure that the long value it produces is consistent with the hash that was produced.
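A sketch of that suggestion, reusing the digest bytes from the question's method; note that a long prints as up to 16 hex characters (8 bytes), not 8 characters:

// longValue() keeps the least significant 64 bits of the 128-bit MD5 value.
final long truncated = new BigInteger(1, digest).longValue();
final String hashtext = Long.toHexString(truncated);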
I am not sure that taking the most significant 64 bits of the 128-bit hash is a good idea. I would rather take the least significant 64 bits. What this ensures is that
when hash(128, a) = hash(128, b) then hash(64, a) = hash(64, b) will always be true.
But we have to live with collision in case of 64 bits i.e. when hash(64, a) = hash(64, b) then hash(128, a) = hash(128, b) is not always true.
In a nutshell, we ensure that we do not have a case where 128 bit hashes of 2 texts are different, but their 64 bit hashes are same. It depends on what you really use the hash for, but I personally feel this approach is more correct.
I'm currently writing a very small Java program to implement a one-time-pad, where the pad (or key) itself is generated as a series of bytes using a SecureRandom object, which is seeded using a simple string with the SHA-512 algorithm.
Generating the one-time pad hasn't caused any problems, and if I supply the same seed string each time, as expected I get the same sequence of pseudo-random numbers, making the decryption process possible as long as the person decrypting has the seed string used to encrypt.
When I try to encrypt a file, the program reads in the data 64 chars at a time (except at the end of the file, which is generally a smaller amount) and generates 64 bytes (or a matching amount) of pseudo-random bytes. XOR is performed between the elements of both arrays, the resulting char array containing the cipher characters is written to file, and the process repeats until all text in the file has been read.
Now, because Java treats all primitives as signed numbers (the data type byte ranges from -128 to 127, not 0 to 255) this means that the XOR operation can (and does) result in some negative values (-128 to -1). It seems that Java does not recognise these values as valid ASCII, and simply writes a ? (question mark) to the file for any negative values. When it comes to reading from the file to decrypt the cipher text, the negative value that resulted in the ? to be written to file is lost, replaced with 63, the valid ASCII code for a question mark.
This means that XORing this value is useless, without the original value there is no way to produce the plaintext. Incidentally, if I reproduce the behaviour of encrypting some data and then decrypting the data immediately after, in the same program run, and printing status along the way, there are no problems. Only if the data is written to file is the information lost.
I should also mention that I did try adding 128 to each encryption XOR result, and then subtracting it before performing the decryption XOR (to put each value in a valid ASCII range), but the ? problem still showed up because there are 31 ASCII codes from 128 to 159 that I'm unable to read and appear as ?
I've been banging my head off the wall on this for a while now, any help is appreciated.
Cheers.
This is very confused. If you are processing a char array, the elements are 16 bits wide, they are unsigned, and not all values are valid. So (a) you can't possibly be having a problem with signs or bytes, and (b) you shouldn't be doing that at all. You should be reading the file into a byte array, XOR-ing, and writing out the byte array directly to the output file. No Readers or Writers, no chars, no Strings.
I guess the problem is in the way you write the file. Write the converted byte array directly to a FileOutputStream and do not try to convert it to a string first. For reading, do the same thing: read it into a byte array.
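For illustration, a minimal sketch of that byte-oriented approach; SHA1PRNG with an explicit seed stands in for the poster's SHA-512-based seeding, and the file names are placeholders:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class XorPadFile {
    public static void main(String[] args) throws Exception {
        // Deterministic pad stream: seeding SHA1PRNG before first use
        // makes the byte sequence reproducible from the seed string.
        SecureRandom pad = SecureRandom.getInstance("SHA1PRNG");
        pad.setSeed("shared seed string".getBytes(StandardCharsets.UTF_8));

        try (FileInputStream in = new FileInputStream("input.bin");
             FileOutputStream out = new FileOutputStream("output.bin")) {
            byte[] buffer = new byte[64];
            int read;
            while ((read = in.read(buffer)) != -1) {
                byte[] key = new byte[read];
                pad.nextBytes(key);
                for (int i = 0; i < read; i++) {
                    buffer[i] ^= key[i]; // XOR on raw bytes; the sign bit is irrelevant
                }
                out.write(buffer, 0, read);
            }
        }
    }
}

Running the same program again over the cipher file with the same seed recovers the original bytes, since XOR is its own inverse.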