I have a string that I have hashed using
Hashing.sha256().hashString("abc", Charsets.UTF_8).toString()
Now I want to decode the encrypted string. How should I do so?
The library I am using right now is
com.google.common.hash.Hashing
If you want to extract the original string, sha256 is specifically designed to make that almost impossible. If it was possible to do that efficiently then sha256 would be useless and everyone would be sad.
If you want it to be reversible, you should use an encryption algorithm, which is different from a hashing algorithm. Reversing a hash, especially a cryptographic hash function, is supposed to be hard.
(In addition, more than one string may have the exact same hash code, meaning that you can't be sure a string with the same hash was actually the same as the original string you used.)
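To make the point concrete, here is a minimal sketch using only the JDK's MessageDigest rather than Guava. The class and method names (Sha256Check, sha256Hex, matches) are made up for illustration. There is nothing to call for "decoding"; the only check available is to hash a candidate and compare the result.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not part of Guava or the JDK.
class Sha256Check {
    // Lower-case hex SHA-256 digest of the UTF-8 bytes of the input.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // No inverse function exists: hash the candidate and compare the digests.
    static boolean matches(String candidate, String storedHex) {
        return sha256Hex(candidate).equals(storedHex);
    }

    public static void main(String[] args) {
        String stored = sha256Hex("abc");
        System.out.println(stored);
        System.out.println(matches("abc", stored));
        System.out.println(matches("abd", stored));
    }
}
```

If the receiving side needs the text itself, send the text (encrypted if necessary) instead of the hash.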
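import java.security.MessageDigest;
import javax.xml.bind.DatatypeConverter;

public static void main(String[] args) throws Exception {
    String s = "text";
    // MD5 digest of the UTF-8 bytes, printed as upper-case hex
    String hash = DatatypeConverter.printHexBinary(MessageDigest.getInstance("MD5").digest(s.getBytes("UTF-8")));
    System.err.println(hash);
}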
You can't. That's what hashing is about. This is not an API or library limitation but a mathematical one, which is there by design.
You need to understand that hashing and encryption/decryption are two completely different things (which are often used together).
Hashing
A hash, or to be precise, a cryptographic hash, is the result of a mathematical one-way function which means it is meant to be irreversible. Once something is hashed, it is not possible to get the text that was hashed back from it. This is at least true for good hash algorithms, which are not considered broken. MD5 as well as SHA1 are considered broken, so if you need this in a security context, use SHA-256 or even SHA-512 instead. If you just need this for a checksum, MD5 should be fine.
With MD5 you can often just type the hash value into Google and get the original back. That is not what you want from a hash function. E.g. take the hash 1cb251ec0d568de6a929b520c4aed8d1, which is the MD5 hash of your message text, and you'll find a website which shows you the original text.
Another property of a hash is that two different inputs should ideally never produce the same output. That is of course impossible, as the set of possible messages to hash is much, much bigger than the set of possible hash values, because hashes have a fixed length. But it should not be possible to generate such a hash collision artificially by following some algorithm or using clever input.
The only way to find out what was hashed is to hash many messages (brute-force) and see if the computed hash matches the hash to be cracked. See Rainbow tables for more information on this.
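The guessing approach described above can be sketched as follows. This is an illustration only: the class name Md5Guesser, its methods and the tiny candidate list are assumptions, and a real attack would use very large word lists or precomputed tables.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration of "hash many candidates and compare".
class Md5Guesser {
    // Lower-case hex MD5 digest of the UTF-8 bytes of the input.
    static String md5Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Hashes each candidate in turn and returns the first one whose digest matches.
    static Optional<String> guess(String targetHex, List<String> candidates) {
        for (String candidate : candidates) {
            if (md5Hex(candidate).equalsIgnoreCase(targetHex)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("password", "hello", "text", "secret");
        String target = md5Hex("text");
        System.out.println(guess(target, candidates).orElse("not found"));
    }
}
```

Nothing is being reversed here: the hash is only ever computed forwards, and a message that is not in the candidate list is simply not found.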
Hashes are mostly used to ensure integrity, i.e. that no modification was made to a message sent over the network. They are often used in conjunction with encryption algorithms (that's probably where your confusion originates) because you need more than integrity to guarantee secure communication: you also need authentication (e.g. by using certificates) and confidentiality (this is provided by encryption algorithms).
Encryption
An encryption function is a function which takes a message and a key and produces a result which is not readable unless you have the key. Side note: there may be one or two keys for encryption/decryption, depending on the algorithm you use (symmetric vs. asymmetric). If you have the key, it is reversible by applying the decryption function. If it were a one-way function like a hash, encryption would make no sense, as the recipient of a message would not be able to retrieve the message.
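For contrast with hashing, here is a minimal symmetric round trip using the JDK's javax.crypto classes. It is a sketch under assumptions: AES-GCM with a 128-bit key is my choice for the example, and the class and method names (AesRoundTrip, newKey, encrypt, decrypt) are made up. Key storage and exchange are left out entirely.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical example class: encrypt with a key, decrypt with the same key.
class AesRoundTrip {
    private static final int IV_BYTES = 12;   // IV length used in this sketch
    private static final int TAG_BITS = 128;  // GCM authentication tag length

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the random IV followed by the ciphertext.
    static byte[] encrypt(SecretKey key, String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] out = Arrays.copyOf(iv, iv.length + ciphertext.length);
            System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV from the front of the message and reverses the encryption.
    static String decrypt(SecretKey key, byte[] message) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, message, 0, IV_BYTES));
            byte[] plaintext = cipher.doFinal(message, IV_BYTES, message.length - IV_BYTES);
            return new String(plaintext, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With the key, decrypt(key, encrypt(key, "abc")) gives back "abc". Without the key, the output should be as useless to an attacker as a hash is.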
I am looking into MD5 hashing in Java (the MessageDigest class and helpers such as DigestUtils). Let's say I do this:
public void givenPassword_whenHashingUsingCommons_thenVerifying() {
String hash = "35454B055CC325EA1AF2126E27707052";
String password = "ILoveJava";
String md5Hex = DigestUtils
.md5Hex(password).toUpperCase();
assertThat(md5Hex.equals(hash)).isTrue();
}
So I converted my password String into an MD5 hash. Let's say my recipient now wants the String not as a hash but as a normal String (the hash is just for transmission). How can I convert the MD5 hash String back to a "normal" text String?
There is no way to convert a hash (MD5 or SHA1 or SHA2....) into the original string.
The purpose of a hashing function is a one-way translation. There is no coming back.
The MD5 hash function is an outdated cryptographic function that generates fixed-length hashes of input data. The hows and whys are beyond the scope of this answer. I urge you to read one of the detailed texts on this online.
Please also remember that if you are using MD5 for cryptographic or otherwise sensitive purposes, you should learn about the more secure, recommended hash functions such as SHA-256.
For more details, please refer to this introductory text.
The reason we hash passwords in particular is that hashing is one-way, which is good for security. Imagine a bad actor got access to your database: we wouldn't want them to be able to look up the password and log in as the user. By storing the hashed password, we can run the same hashing algorithm on the password entered at login and compare the result with the value in the database.
Even this has its issues. When bad actors gain access to a compromised database, they can use "rainbow tables" to get from the hashed value to the password. Using common hashing algorithms and software such as hashcat, they build up a database of hashes of common passwords, dictionary words etc., then match the hash value to the plain-text string. This is why, when storing passwords, we also use salting: a random value is stored with each password and hashed together with it, so a precomputed table no longer matches. Search for "salting and hashing" for details.
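As a rough sketch of salting and hashing with the JDK alone: the version below uses PBKDF2 through SecretKeyFactory, which is my choice for the example and not something the answer above prescribes. The class name, the method names and the iteration count are illustrative assumptions, not tuned recommendations.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;

// Hypothetical helper: store (salt, hash) per user, never the password itself.
class SaltedPassword {
    private static final int ITERATIONS = 100_000; // illustrative value only
    private static final int KEY_BITS = 256;

    // A fresh random salt for every stored password.
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derives the value to store from the password and its salt.
    static byte[] hash(char[] password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Login check: hash the attempt with the stored salt and compare the results.
    static boolean verify(char[] attempt, byte[] salt, byte[] expected) {
        return MessageDigest.isEqual(hash(attempt, salt), expected);
    }
}
```

Because every user gets a different salt, two users with the same password end up with different stored values, and a table precomputed without the salt does not match either of them.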
I see another answer which states that "The purpose of the hashing function is a one way translation". I think this is an oversimplification. A hashing algorithm returns a message digest: a fixed-length value which represents the input as uniquely as possible. One of the most important properties of a hashing algorithm is that the digest differs significantly even between very similar inputs. Here's an example using MD5:
String: aaaaaaaaaa
Hash: e09c80c42fda55f9d992e59ca6b3307d
String: aaaaaaaaab
Hash: ba05a43d3b98a72379fdc90a1e28ecaf
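To check this yourself, a small sketch like the following hashes two inputs and counts how many of the 128 digest bits differ. The class and method names (AvalancheDemo, md5, differingBits) are made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical demo: similar inputs, very different digests.
class AvalancheDemo {
    // Raw 16-byte MD5 digest of the UTF-8 bytes of the input.
    static byte[] md5(String input) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Number of bit positions in which two equal-length digests differ.
    static int differingBits(byte[] a, byte[] b) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            count += Integer.bitCount((a[i] ^ b[i]) & 0xff);
        }
        return count;
    }

    public static void main(String[] args) {
        int bits = differingBits(md5("aaaaaaaaaa"), md5("aaaaaaaaab"));
        System.out.println(bits + " of 128 bits differ");
    }
}
```

For a well-behaved hash function you would expect roughly half of the bits to change, even though the inputs differ in a single character.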
We encrypt with AES-256 in CBC mode with PKCS5 padding in Java, using the libraries one has to download from Oracle, and Base64-encode the resulting byte arrays. I have read that a static, common initialization vector drastically decreases security, as texts that start with the same characters will look the same when encrypted. Is this still true for short strings (12 numeric chars)?
I have encrypted a large set and I cannot find any reoccurring substrings in the resulting encrypted strings, even when they start with the same sequence.
Example (plaintext on the left and resulting encrypted string on the right)
555555555501 -> U0Mkd0PPloB5iLBy5jM6nw==
555555555502 -> NUHWaFs62LMEeyoGA0mGoQ==
555555555503 -> X3/XJNd4TzEsMv7V0bXwqg==
This is separate from the question, but to preempt some suggestions: we need to be able to do lookups based on plaintext strings and to be able to decrypt. We could do both hashing and encryption, but would prefer to avoid that if it does not improve security significantly, as it adds complexity.
I have read that static common initialization vectors are bad as one can derive the key from encrypted strings.
I'm curious: where have you read that?
With short (<=16 bytes) plaintext, a random IV effectively works as a salt, i.e. it causes the ciphertext to differ even if the plaintext is the same. This is an important feature in a lot of applications. But you write:
We need to be able to do look ups based on plaintext strings.
So you want to build some sort of pseudonymization database? If that is a requirement for you, the feature that a salt, or in your case a random IV, adds is actually one that you specifically don't want. Depending on your other requirements, you can probably get away with using a static IV here. But for pseudonymization in general, it is recommended to use a dedicated pseudonym. In your case the data seems to be atomic. But in the general case of, for example, address data, you want to hash the name, the zip code, the city and whatever else your pseudonym is made of separately, both to allow more specific queries and to keep access to, and information flow from, your data under strict control.
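The difference between a static and a random IV can be seen in a small sketch. Assumptions: a 128-bit key is used so the example runs without any extra policy files (the question uses AES-256), the static IV is all zeros, and the class and method names are made up.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical comparison of static and random IVs with AES/CBC/PKCS5Padding.
class IvComparison {
    static byte[] randomBytes(int n) {
        byte[] out = new byte[n];
        new SecureRandom().nextBytes(out);
        return out;
    }

    // Encrypts the UTF-8 bytes of the plaintext and Base64-encodes the ciphertext.
    static String encryptCbc(byte[] key, byte[] iv, String plaintext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(ciphertext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = randomBytes(16);
        byte[] staticIv = new byte[16]; // all zeros, for illustration only

        // Static IV: the same plaintext always gives the same ciphertext.
        System.out.println(encryptCbc(key, staticIv, "555555555501"));
        System.out.println(encryptCbc(key, staticIv, "555555555501"));

        // Random IV: the same plaintext gives a different ciphertext each time.
        System.out.println(encryptCbc(key, randomBytes(16), "555555555501"));
        System.out.println(encryptCbc(key, randomBytes(16), "555555555501"));
    }
}
```

This also explains why no recurring substrings show up in the question's examples: a 12-character plaintext plus padding fits into a single 16-byte AES block, so changing any character changes the whole block. With a static IV, what leaks is whether two plaintexts are exactly equal, which is the property a lookup needs, and with longer texts, whether they share leading 16-byte blocks.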
I tried to search for this but surprisingly could not find any results on converting a SHA-1-generated string back to a normal string. I hash a string with SHA-1 and then send it to another device, where the SHA-1-generated string should be unhashed and used, but I am unable to find any such method in Java.
The whole point of SHA-1 and other hashing algorithms is that there is no such thing as unhashing. There is no such method in Java or any other language.
What you are searching for is symmetric encryption.
You are confusing hashing with encryption (this is what you would probably want to use instead). Encryption is reversible given a key, while hashing is not.
Hashes are one way, you can go from text to hash but not the other way.
Have a look at this for a nice discussion:
https://security.stackexchange.com/questions/11717/why-are-hash-functions-one-way-if-i-know-the-algorithm-why-cant-i-calculate-t
String myText;
UUID.nameUUIDFromBytes((myText).getBytes()).toString();
I am using above code to generate a representative for specific texts.
For example, 'Moien' should always be represented by "e9cad067-56f3-3ea9-98d2-26e25778c48f"; nothing, such as a project rebuild, should be able to change that UUID.
The reason I'm doing this is that I don't want those specific texts to be readable (understandable) by humans.
Note: I don't need the ability to regenerate the main text (e.g. "Moien") after hashing.
I have an alternative way too :
MessageDigest digest = MessageDigest.getInstance("SHA-256");
byte[] hash = digest.digest((matcher.group(1)).getBytes("UTF-8"));
String a = Base64.encode(hash);
Which do you think is better for my problem?
UUID.nameUUIDFromBytes appears to basically just be MD5 hashing, with the result being represented as a UUID.
It feels clearer to me to use a base64-encoded hash explicitly, partly as you can then control which hash gets used - which could be relevant if collisions pose any sort of security risk. (SHA-256 is likely a better option than MD5 for exactly that reason.) The string will be longer from SHA-256 of course, but hopefully that's not a problem.
Note that in either case, I'd convert the string to bytes using a fixed encoding via StandardCharsets. Don't use the platform default (as per your first snippet), and prefer StandardCharsets over magic string values (as per your second snippet).
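Putting those suggestions together, both variants might look roughly like this. The class and method names (TextToken, sha256Base64, md5Uuid) are made up, and java.util.Base64 is used here in place of whichever Base64 class the question's snippet relies on.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.UUID;

// Hypothetical helper producing stable, non-readable tokens for a text.
class TextToken {
    // Base64-encoded SHA-256 digest of the UTF-8 bytes of the text.
    static String sha256Base64(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // Name-based (version 3, MD5) UUID of the UTF-8 bytes of the text.
    static String md5Uuid(String text) {
        return UUID.nameUUIDFromBytes(text.getBytes(StandardCharsets.UTF_8)).toString();
    }

    public static void main(String[] args) {
        System.out.println(sha256Base64("Moien"));
        System.out.println(md5Uuid("Moien"));
    }
}
```

Both values depend only on the bytes of the text, so a project rebuild or a different machine will not change them as long as the encoding stays fixed.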