Verify Hashing.sha256() generated hash

I have this code created using Google Guava:
String sha256hex = Hashing.sha256()
.hashString(cardNum, StandardCharsets.UTF_8)
.toString();
How can I verify that the generated value is a properly generated hash?

SHA-256 and, more generally, the SHA-2 family of algorithms are described in detail on Wikipedia and in several RFCs: RFC 6234 and the RFC 4634 it supersedes.
All these sources state that the output of the SHA-256 hash function is 256 bits long, that is, 32 bytes (roughly speaking, the number after the word SHA is the output length in bits for every algorithm in the family).
This sequence of bytes is typically encoded in hex. This is the encoding Guava provides as well.
The problem then reduces to checking whether a Java string is a valid hex encoding.
That problem has already been answered here on SO, for example in this question.
For its simplicity, consider the solution proposed by @laycat:
boolean isHex = mac_addr.matches("^[0-9a-fA-F]+$");
Since every byte is encoded as two hex characters and, as mentioned, SHA-256 produces an output of 32 bytes, you can safely check for a string exactly 64 characters long, as the answer by @D.O. also suggests. Your validation code could look like this:
boolean canBeSha256Output = sha256hex.matches("^[0-9a-fA-F]{64}$");
Please be aware that it is impossible to tell, from a hex string of a certain length alone, whether or not it is the result of a hash function, whichever hash function you consider.
You can only be sure that a value is the hash of some input by applying the corresponding hash function to the original input and checking that the result matches.
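As a minimal sketch of the two checks described in this answer, the class below first tests the format with the regex and then recomputes the hash and compares. It uses the JDK's MessageDigest instead of Guava only so that it has no dependencies; the class and method names (Sha256Check, looksLikeSha256, matchesInput) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Sha256Check {

    // Format check only: exactly 64 hex characters.
    static boolean looksLikeSha256(String candidate) {
        return candidate != null && candidate.matches("^[0-9a-fA-F]{64}$");
    }

    // Lowercase hex SHA-256 of the UTF-8 bytes of the input.
    static String sha256Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // Real verification: the candidate must equal the hash of the original input.
    static boolean matchesInput(String candidate, String originalInput) {
        return looksLikeSha256(candidate)
                && sha256Hex(originalInput).equalsIgnoreCase(candidate);
    }

    public static void main(String[] args) {
        String hash = sha256Hex("abc");
        System.out.println(hash);
        System.out.println(looksLikeSha256(hash));
        System.out.println(matchesInput(hash, "abc"));
    }
}
```

The format check can only reject strings that cannot be SHA-256 output; only the second check ties the value to a specific input.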

You could use a regex to verify that it looks like a SHA-256 hash (64 hexadecimal characters), like
\b[A-Fa-f0-9]{64}\b

Related

Different length of hashed password using argon2 in python and java

Not sure if it is the right output, but when I hash a password with Argon2 in Java, I get the output:
$argon2i$v=19$m=512,t=2,p=2$J1w6n04CBSEA8p0cCqeK7A$tb1ihqduhONYZN0+ldKkw980Y7h7ZJ2OcDTsXyIMibo
while Python gives me:
argon2$argon2i$v=19$m=512,t=2,p=2$TjZiM3ZTdGFIQUlZ$CocCpAIXQc722ndqkFZWxw
The parameters seem to be the same: argon2i, m=512, t=2, p=2.
Can any Argon2 guru tell me how to get output of the same length? I would prefer to change the Java side, since it is a simple USSD app.

After the p=2 in the string, there is a dollar sign. The string between this dollar sign and the next dollar sign is the salt of the hash. After the second dollar sign comes the actual key that has been derived by Argon2. In the Java example, the hash (after being Base64 decoded) is 32 bytes long. In Python, however, the hash is 16 bytes long (which is the default in Python). So, in Python, if you did:
import argon2
argon2.hash_password(b"Password",memory_cost=512,time_cost=2,parallelism=2,hash_len=32)
Then you will have the same length hash as in the Java example. Also, if you specify the same salt for both the Python and Java implementations, then the hashes should be identical to each other (given that the parameters are the same between the two).
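To check the lengths yourself on the Java side, here is a small sketch that Base64-decodes the last two fields of an encoded Argon2 string. It assumes the layout described above (salt and hash are the last two '$'-separated fields); the class and method names are hypothetical.

```java
import java.util.Base64;

final class Argon2Lengths {

    // Returns {saltLength, hashLength} in bytes.
    // Assumes the last two '$'-separated fields are the Base64 salt and hash.
    static int[] saltAndHashLengths(String encoded) {
        String[] parts = encoded.split("\\$");
        // java.util.Base64 does not require '=' padding, which these strings omit.
        byte[] salt = Base64.getDecoder().decode(parts[parts.length - 2]);
        byte[] hash = Base64.getDecoder().decode(parts[parts.length - 1]);
        return new int[] { salt.length, hash.length };
    }

    public static void main(String[] args) {
        String javaOutput =
            "$argon2i$v=19$m=512,t=2,p=2$J1w6n04CBSEA8p0cCqeK7A$tb1ihqduhONYZN0+ldKkw980Y7h7ZJ2OcDTsXyIMibo";
        String pythonOutput =
            "argon2$argon2i$v=19$m=512,t=2,p=2$TjZiM3ZTdGFIQUlZ$CocCpAIXQc722ndqkFZWxw";
        int[] j = saltAndHashLengths(javaOutput);
        int[] p = saltAndHashLengths(pythonOutput);
        System.out.println("Java:   salt=" + j[0] + " bytes, hash=" + j[1] + " bytes");
        System.out.println("Python: salt=" + p[0] + " bytes, hash=" + p[1] + " bytes");
    }
}
```

For the two strings from the question this reports a 32-byte hash for Java and a 16-byte hash for Python, which is the difference the answer points out.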

Wrong SHA-256 hash of a string in MessageDigest

In some tests I am using the MessageDigest class in Groovy, and sometimes this function returns an incorrect value. Here is my code:
import java.security.MessageDigest;
String.metaClass.toSHA256 = {
def messageDigest = MessageDigest.getInstance("SHA-256")
messageDigest.update(delegate.getBytes("UTF-8"))
new BigInteger(1, messageDigest.digest()).toString(16).padLeft(40, '0')
}
For example, I tried to hash this string with SHA-256:
582015-04-23 20:47:112015-04-23 23:59:000020502015-04-23 20:47:11tests-from-api["afoot"]33facafaece3afd353bcbe88637d11b7
My method returns
cb2814380117cd5621064c1d7512b32e3cb8c8cb2b1f20016f6da763598d738
But online generators return 0cb2814380117cd5621064c1d7512b32e3cb8c8cb2b1f20016f6da763598d738
It produces a wrong value about 2 times in 40 tries.
Could you help me fix it?

You're missing a '0' at the start, because you're padding left to 40 characters (presumably having copied that code from something where the hash is expected to be 40 characters) instead of the 64 characters that actually make up a SHA-256 hash (in hex).
So you could just fix the padLeft code - but personally I would just avoid using BigInteger for hex conversions. That's not what it's designed for - it's designed for maths operations on large integers.
Instead, use one of the many hex converters in common utility libraries - which are designed precisely to convert byte arrays to hex, with nothing about integers at all. If you don't want to use a library, there's plenty of code on Stack Overflow to convert a byte[] to hex.
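If you would rather not pull in a library, a byte-to-hex conversion is only a few lines. The sketch below is in plain Java (it can be called from Groovy as well); the class name HexDigest is made up for illustration. Because every byte always produces exactly two characters, leading zeros cannot get lost.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HexDigest {

    // Converts each byte to exactly two lowercase hex characters.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16)); // high nibble
            sb.append(Character.forDigit(b & 0xF, 16));        // low nibble
        }
        return sb.toString();
    }

    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return toHex(md.digest(text.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = { 0x0c, (byte) 0xb2 };
        // BigInteger drops the leading zero nibble, toHex keeps it.
        System.out.println(new java.math.BigInteger(1, sample).toString(16));
        System.out.println(toHex(sample));
        System.out.println(sha256Hex("abc").length());
    }
}
```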

Is it possible to limit the hashcode into specific number of characters in Java

I have written a method to convert plain text into its hash code using the MD5 algorithm. Please find the code I used below.
public static String convertToMD5Hash(final String plainText){
MessageDigest messageDigest = null;
try {
messageDigest = MessageDigest.getInstance("MD5");
} catch (NoSuchAlgorithmException e) {
LOGGER.warn("For some wierd reason the MD5 algorithm was not found.", e);
}
messageDigest.reset();
messageDigest.update(plainText.getBytes());
final byte[] digest = messageDigest.digest();
final BigInteger bigInt = new BigInteger(1, digest);
String hashtext = bigInt.toString(8);
return hashtext;
}
This method works perfectly, but it returns a lengthy hash. I need to limit this hash text to 8 characters. Is there any way to set the length of the hash codes in Java?

Yes and no. You can use a substring of the original hash as long as you always cut the hash string the same way (e.g. the first or last 8 characters). What you are going to do with that "semi-hash" is another matter.
Whatever it is you're going to do, be sure it has nothing to do with security.
Here's why: MD5 is a 128-bit hash, so there are 2^128 = ~340,000,000,000,000,000,000,000,000,000,000,000,000 possible values. That astronomical number is what makes brute-forcing this kind of string virtually impossible. By cutting a hex-encoded hash down to 8 characters, you end up with a 32-bit hash, because a single hex character represents 4 bits (which is also why the full hash is 128 bits / 4 bits = 32 hex characters). With a 32-bit hash there are only 2^32 = 4,294,967,296 combinations. That is a search space 2^96 = 79,228,162,514,264,337,593,543,950,336 times smaller than that of the original 128-bit hash, and it can be searched exhaustively in a matter of seconds on any old computer.
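A minimal sketch of the truncation and of the arithmetic above, assuming the hash is hex encoded (the question's code uses radix 8, so it would need toString(16) or a hex helper like the one here). The names TruncatedMd5, first8 and keyspaceRatio are made up for illustration.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class TruncatedMd5 {

    // Full MD5 as 32 lowercase hex characters.
    static String md5Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(32);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Always cut the same way so the result is reproducible: the first 8 hex characters (32 bits).
    static String first8(String text) {
        return md5Hex(text).substring(0, 8);
    }

    // How many times smaller the 32-bit space is than the 128-bit space: 2^128 / 2^32 = 2^96.
    static BigInteger keyspaceRatio() {
        BigInteger two = BigInteger.valueOf(2);
        return two.pow(128).divide(two.pow(32));
    }

    public static void main(String[] args) {
        System.out.println(md5Hex(""));
        System.out.println(first8(""));
        System.out.println(keyspaceRatio());
    }
}
```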

No. MD5 is defined to return 128 bit values. You could use Base64 to encode them to ASCII and truncate it using String#substring(0, 8).
In Java 8 (not officially released yet), you can encode a byte[] to Base64 as follows:
String base64 = Base64.getEncoder().encodeToString(digest);
For earlier Java versions see Decode Base64 data in Java

A good hash algorithm changes bits throughout the whole hash whenever any part of the data changes, so you can simply take 8 characters from your hash. Just don't pick the positions randomly; the choice must be reproducible.

Firstly, as everyone has mentioned, a 64-bit hash is not secure enough. Ultimately it depends on what exactly you plan to do with the hash.
If you still need to convert this to 8 characters (bytes), I suggest narrowing the BigInteger to a long value using BigInteger.longValue().
It will ensure that the long value it produces is consistent with the hash that was produced.
I am not sure that taking the most significant 64 bits of the 128-bit hash is a good idea; I would rather take the least significant 64 bits, which is what longValue() returns. What this ensures is that
when hash(128, a) = hash(128, b), then hash(64, a) = hash(64, b) will always be true.
But we have to live with collisions in the 64-bit case, i.e. when hash(64, a) = hash(64, b), hash(128, a) = hash(128, b) is not always true.
In a nutshell, we ensure that we never have a case where the 128-bit hashes of two texts are the same but their 64-bit hashes differ. It depends on what you really use the hash for, but I personally feel this approach is more correct.
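A short sketch of this approach, showing that BigInteger.longValue() keeps the low-order 64 bits of the digest, i.e. its last 8 bytes. The class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5Low64 {

    static byte[] md5(String text) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // The approach from the answer: narrow the 128-bit value to a long.
    static long low64(byte[] digest) {
        return new BigInteger(1, digest).longValue();
    }

    // Same result without BigInteger: read the last 8 bytes as a big-endian long.
    static long low64Direct(byte[] digest) {
        return ByteBuffer.wrap(digest, digest.length - 8, 8).getLong();
    }

    public static void main(String[] args) {
        byte[] digest = md5("hello");
        System.out.println(Long.toHexString(low64(digest)));
        System.out.println(low64(digest) == low64Direct(digest));
    }
}
```

Since the 64-bit value is a fixed slice of the 128-bit digest, equal digests always give equal long values, while different digests may still share the same slice.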

Generate Unique hash from Long id

I need to generate a unique hash from an ID value of type Long. My concern is that it must never generate the same hash from two different Long/long values.
MD5 hashing looks a nice solution but the hash String is very long. I only need characters
0-9
a-z and A-Z
And just 6-characters like: j4qwO7
What could be the simplest solution?

Your requirements cannot be met. You've got an alphabet of 62 possible characters, and 6 characters available, which means there are 62^6 possible IDs of that form.
However, there are 256^8 possible long values. By the pigeon-hole principle, it's impossible to give each of those long values a different ID of the given form.
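To put numbers on that, here is a quick sketch that computes both counts with BigInteger (the class name IdSpace is made up for illustration):

```java
import java.math.BigInteger;

final class IdSpace {

    // Number of 6-character strings over a 62-character alphabet: 62^6.
    static BigInteger sixCharIds() {
        return BigInteger.valueOf(62).pow(6);
    }

    // Number of distinct long values: 256^8 = 2^64.
    static BigInteger longValues() {
        return BigInteger.valueOf(256).pow(8);
    }

    public static void main(String[] args) {
        System.out.println("6-char IDs:  " + sixCharIds());
        System.out.println("long values: " + longValues());
        System.out.println("enough IDs?  " + (sixCharIds().compareTo(longValues()) >= 0));
    }
}
```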

You don't have to use the hex representation. Build your own hash representation by using the actual hash bytes from the function. You could truncate the hash output to simplify the hash representation, but that would make collisions more probable.
Edit:
The other answers, which state that what you ask isn't possible based on the number of possible long values, are theoretically correct if you actually need the whole range.
If your IDs are auto-incremented from zero upwards, 62^6 = 56800235584 values might be more than enough for you, depending on your needs.

Step 1. Switch to using ints instead of longs, or allow for a longer "hash". See every other answer for discussion of why 6 characters is insufficient for dealing with longs.
Step 2. Encrypt your number using an algorithm which does not use padding. Personally, I suggest skip32 encoding. I make no promises that this is strong enough for security, but if your goal is "make random-looking IDs," it works well.
Step 3. Encode your number as a base-62 number (as opposed to base 10, not as opposed to Base64 encoding).
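Step 3 can be sketched as follows. The alphabet order (digits, then lowercase, then uppercase) is an arbitrary choice made for this example, and the class name Base62 is hypothetical; step 2 (skip32) is not shown.

```java
final class Base62 {

    // Assumed alphabet order; any fixed permutation of these 62 characters works.
    private static final String ALPHABET =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Encodes a non-negative number in base 62, most significant digit first.
    static String encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append(ALPHABET.charAt((int) (value % 62)));
            value /= 62;
        }
        return sb.reverse().toString();
    }

    // Inverse of encode for strings that fit in a long.
    static long decode(String text) {
        long result = 0;
        for (char c : text.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("invalid character: " + c);
            }
            result = result * 62 + digit;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encode(Integer.MAX_VALUE));   // fits in 6 characters
        System.out.println(encode(56800235584L));        // 62^6 needs a 7th character
        System.out.println(decode(encode(123456789L)));
    }
}
```

Every non-negative int fits in 6 characters this way, which is why step 1 suggests ints; anything from 62^6 upwards needs a seventh character.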

Your question doesn't make sense.
'Unique hash' is a contradiction in terms.
A 'unique hash' value of a Java long must be 64 bits in length, like the long itself, and of course the simplest hash function for that is f(x) = x, i.e. the long value itself.
6 characters that can be 0-9, A-Z, and a-z can only yield 62^6 = 56800235584 distinct values, which isn't enough.

You can use the long value itself as the hash (for indexing/search purposes).
If you need to obfuscate/hide your long value, you can use any symmetric
encryption algorithm with a 64-bit block, for example DES in ECB mode (AES would also work, but its block is 128 bits, so the result is 16 bytes instead of 8).
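A rough sketch of the obfuscation idea with DES, whose 64-bit block matches a long exactly. This assumes the JDK's default provider still offers "DES/ECB/NoPadding"; the class name IdObfuscator and the key are made up, and DES is only suitable for hiding IDs, not for real security.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

final class IdObfuscator {

    private final SecretKeySpec key;

    // DES takes an 8-byte key (56 effective bits).
    IdObfuscator(byte[] eightByteKey) {
        this.key = new SecretKeySpec(eightByteKey, "DES");
    }

    long obfuscate(long id) {
        return apply(Cipher.ENCRYPT_MODE, id);
    }

    long reveal(long token) {
        return apply(Cipher.DECRYPT_MODE, token);
    }

    // One long is exactly one 8-byte block, so no padding is needed.
    private long apply(int mode, long value) {
        try {
            Cipher cipher = Cipher.getInstance("DES/ECB/NoPadding");
            cipher.init(mode, key);
            byte[] block = ByteBuffer.allocate(Long.BYTES).putLong(value).array();
            return ByteBuffer.wrap(cipher.doFinal(block)).getLong();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("DES is not available", e);
        }
    }

    public static void main(String[] args) {
        IdObfuscator obfuscator = new IdObfuscator(
            "8bytekey".getBytes(java.nio.charset.StandardCharsets.US_ASCII));
        long token = obfuscator.obfuscate(12345L);
        System.out.println(token);
        System.out.println(obfuscator.reveal(token));
    }
}
```

Because a block cipher is a permutation of its block space, two different long values never map to the same token under the same key.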

Update:
No need to use Hashids. Base 36 is good enough.
long id = 12345;
String hash = Integer.toString(Math.abs((int)id), 36);
Original answer, with Hashids:
You might want to use Hashids
long id = 12345;
Hashids hashids = new Hashids("this is my salt");
String hash = hashids.encrypt(id); // "ryBo"
"ryBo" is going to be unique, as it can be converted back to your long. Hashids just converts, doesn't hash further.
long[] numbers = hashids.decrypt("ryBo");
// numbers[0] == 12345
If you really have a 64-bit value, the hash string is going to be quite long (around 16 characters, depending on the alphabet), but if you don't plan to have more than 2^16 thingies, you can get away with truncating the 64-bit hash to 32-bit (an int).
long id = 12345;
String hash = hashids.encrypt(Math.abs((int)id));
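One caveat about the base-36 update above: casting to int drops the upper 32 bits, so two different long values can end up with the same string. If uniqueness matters more than length, a sketch using the JDK's own Long.toString and Long.parseLong keeps the full value and is reversible (the class name Base36Id is made up for illustration):

```java
final class Base36Id {

    // Full 64-bit value in base 36; reversible, up to 13 characters for positive values.
    static String encode(long id) {
        return Long.toString(id, 36);
    }

    static long decode(String text) {
        return Long.parseLong(text, 36);
    }

    // The truncating variant from the answer, for comparison.
    static String encodeTruncated(long id) {
        return Integer.toString(Math.abs((int) id), 36);
    }

    public static void main(String[] args) {
        System.out.println(encode(12345L));
        System.out.println(decode(encode(12345L)));
        // 1 and 2^32 + 1 collide once the upper 32 bits are dropped.
        System.out.println(encodeTruncated(1L).equals(encodeTruncated(4294967297L)));
    }
}
```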

XOR Encryption in Java: losing data after decryption

I'm currently writing a very small Java program to implement a one-time-pad, where the pad (or key) itself is generated as a series of bytes using a SecureRandom object, which is seeded using a simple string with the SHA-512 algorithm.
Generating the one-time-pad hasn't caused any problems, and if I supply the same seed string each time, as expected I get the same sequence of pseudo-random numbers, making the decryption process possible as long as the person decrypting has the seed string used to encrypt.
When I try to encrypt a file, the program reads in the data 64 chars at a time (except for the end of file, which is generally an odd number), and generates 64 bytes (or a matching amount) of pseudo-random bytes. XOR is performed between the elements of both arrays, the resulting char array containing the cipher characters is written to file, and the process repeats until all text in the file has been read.
Now, because Java treats all primitives as signed numbers (the data type byte ranges from -128 to 127, not 0 to 255) this means that the XOR operation can (and does) result in some negative values (-128 to -1). It seems that Java does not recognise these values as valid ASCII, and simply writes a ? (question mark) to the file for any negative values. When it comes to reading from the file to decrypt the cipher text, the negative value that resulted in the ? to be written to file is lost, replaced with 63, the valid ASCII code for a question mark.
This means that XORing this value is useless, without the original value there is no way to produce the plaintext. Incidentally, if I reproduce the behaviour of encrypting some data and then decrypting the data immediately after, in the same program run, and printing status along the way, there are no problems. Only if the data is written to file is the information lost.
I should also mention that I did try adding 128 to each encryption XOR result, and then subtracting it before performing the decryption XOR (to put each value in a valid ASCII range), but the ? problem still showed up because there are 31 ASCII codes from 128 to 159 that I'm unable to read and appear as ?
I've been banging my head off the wall on this for a while now, any help is appreciated.
Cheers.

This is very confused. If you are processing a char array, the elements are 16 bits wide, they are unsigned, and not all values are valid. So (a) you can't possibly be having a problem with signs or bytes, and (b) you shouldn't be doing that at all. You should be reading the file into a byte array, XOR-ing, and writing out the byte array directly to the output file. No Readers or Writers, no chars, no Strings.

I guess the problem is in the way you write the file. Write the converted byte array directly to a FileOutputStream and do not try to convert it to a string first. For reading, do the same thing: read the file into a byte array.
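A minimal sketch of the bytes-only approach both answers describe. It takes the pad as a byte array and reads the whole file at once rather than 64 bytes at a time, purely to keep the example short; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class XorFile {

    // XORs data with the pad byte by byte; the pad must be at least as long as the data.
    static byte[] xor(byte[] data, byte[] pad) {
        if (pad.length < data.length) {
            throw new IllegalArgumentException("pad is shorter than the data");
        }
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ pad[i]);
        }
        return out;
    }

    // Reads raw bytes, XORs them and writes raw bytes: no Reader, Writer, char or String involved.
    static void xorFile(Path input, Path output, byte[] pad) {
        try {
            Files.write(output, xor(Files.readAllBytes(input), pad));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] pad = { (byte) 0xFF, (byte) 0x80, 0x7F, 0x00, (byte) 0xAA };
        Path plain = Files.createTempFile("plain", ".bin");
        Path cipher = Files.createTempFile("cipher", ".bin");
        Path back = Files.createTempFile("back", ".bin");
        Files.write(plain, new byte[] { 'h', 'e', 'l', 'l', 'o' });
        xorFile(plain, cipher, pad);   // encrypt
        xorFile(cipher, back, pad);    // decrypt with the same pad
        System.out.println(java.util.Arrays.equals(
            Files.readAllBytes(plain), Files.readAllBytes(back)));
    }
}
```

Negative byte values are written to the file unchanged this way, because no character encoding ever gets a chance to replace them with '?'.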
