How to hash String using SHA-256 with only 32 characters?

I am using org.apache.commons.codec.digest.DigestUtils for SHA-256, like below:
DigestUtils.sha256Hex("myString")
This returns a 64-character hash string, but I need a hashed string of only 32 characters.
How can I hash a String with SHA-256 and get a 32-character value?

As the name SHA-256 implies, the hash consists of 256 bits, i.e. 64 characters (256 / 4 = 64) in the hex representation.
If you want 32 characters, you'd have to use the MD5 algorithm:
DigestUtils.md5Hex("myString")
But if you really need 32 characters out of a 64-character string, you can use String#substring() (which I would definitely not recommend; rather use MD5):
String hash64 = "9F86D081884C7D659A2FEAA0C55AD015A3BF4F1B2B0B822CD15D6C15B0F00A08";
String hash32 = hash64.substring(0, 32); // 9F86D081884C7D659A2FEAA0C55AD015
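For readers without commons-codec on the classpath, the same two steps can be sketched with the JDK alone. This is a minimal sketch, not the library's implementation: the class name TruncatedSha256 and the method names are made up for illustration, and the input is assumed to be hashed as UTF-8 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class; the names are illustrative only.
final class TruncatedSha256 {

    /** Returns the full SHA-256 digest as 64 lowercase hex characters. */
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            // Assumption: the string is hashed as UTF-8 bytes.
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Two hex characters per byte: high nibble first, then low nibble.
                hex.append(Character.forDigit((b >> 4) & 0xF, 16));
                hex.append(Character.forDigit(b & 0xF, 16));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Keeps the first 32 hex characters, i.e. the first 128 bits of the digest. */
    static String sha256Hex32(String input) {
        return sha256Hex(input).substring(0, 32);
    }
}
```

Note that the truncated value is no longer a SHA-256 hash in the strict sense, so anything that verifies it has to truncate in the same way.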

Related

How to encode and shorten hash for url safety?

I use SHA-512 to hash a token, which results in a 128 character string. How can I encode it so that it is shorter but still URL safe (no need to escape like '+' or '&')?
Hex and Base32 are both suboptimal, and Base64 is unsafe. I don't want to hack up a custom Base62 codec, as that will only become a maintenance burden.

Google has you covered:
com.google.common.io.BaseEncoding#base64Url
This is an efficient Base64 codec that only uses URL-safe characters. It will produce a String with a length of 88 characters for a SHA-512 hash (including the two trailing '=' padding characters).

Java 8 added Base64 encoders, with different flavors. See java.util.Base64
For example:
Base64.getUrlEncoder().encode(...);
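As a rough illustration of the java.util.Base64 route, the sketch below hashes a token with SHA-512 and encodes the 64 digest bytes with the URL-safe alphabet. The class name UrlSafeSha512 and the padded flag are assumptions made for this example; the token is assumed to be hashed as UTF-8 bytes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper class; the names are illustrative only.
final class UrlSafeSha512 {

    /**
     * Hashes the token with SHA-512 and encodes the 64 digest bytes using the
     * URL-safe Base64 alphabet ('-' and '_' instead of '+' and '/').
     */
    static String hash(String token, boolean padded) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-512")
                    .digest(token.getBytes(StandardCharsets.UTF_8));
            Base64.Encoder encoder = Base64.getUrlEncoder();
            if (!padded) {
                // Dropping the trailing "==" shortens the result from 88 to 86 characters.
                encoder = encoder.withoutPadding();
            }
            return encoder.encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // Not expected on a standard JDK, which ships a SHA-512 implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

The unpadded form is convenient when '=' would otherwise be percent-encoded in a query string; a decoder obtained from Base64.getUrlDecoder() accepts both forms.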

hashAlgorithm and hashEncoding

I am using a built-in JBoss login module. It has to encode what the user entered as a password and compare it with the encrypted password in the DB.
<module-option name="hashAlgorithm" value="MD5"/>
<module-option name="hashEncoding" value="base64"/>
To store the password in the DB I use the following line:
newUser.setPassword(DatatypeConverter.printBase64Binary(purePassword.getBytes("UTF-8")));
When I debug my application, I see:
encrypted password from DB = MTIzNDU2Nzg=
encrypted password from user login = JdVa0oOqQAr0ZMdtcTwHrQ==
Questions:
What is happening? When does JBoss use the Base64 algorithm and when MD5?
What is the difference between hashAlgorithm and hashEncoding?

What is happening? When does JBoss use the Base64 algorithm and when MD5?
MD5 is the hash algorithm and Base64 is the output character encoding.
A character encoding is what defines which characters correspond to a byte or series of bytes.
MD5 is a cryptographic hash algorithm that produces a 16-byte output of 8-bit bytes, not characters. Not all 8-bit bytes are printable characters.
Base64 accepts an array of bytes and produces a printable character string. It is generally used when an array of bytes needs to be encoded as a printable character string.
What is the difference between hashAlgorithm and hashEncoding?
Some hash functions let you specify both the hash algorithm (hashAlgorithm), such as MD5, SHA1, or SHA-256, that is used to hash the input, and the output encoding (hashEncoding), such as hexadecimal or Base64. This lets a single call hash the input with the selected algorithm and encode the result with the selected encoding.
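The two debug values from the question can be reproduced with a short sketch, which may help show why they never match. The DB value is the Base64 encoding of the raw password bytes, while the login value is consistent with hashing first and encoding second. The class and method names below are hypothetical, and hashing the password as UTF-8 bytes is an assumption about what the login module does.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper class; the names are illustrative only.
final class HashThenEncode {

    /** Base64 of the raw password bytes: an encoding only, no hashing involved. */
    static String encodeOnly(String password) {
        return Base64.getEncoder()
                .encodeToString(password.getBytes(StandardCharsets.UTF_8));
    }

    /** Hash with the given algorithm first, then Base64-encode the digest bytes. */
    static String hashAndEncode(String password, String hashAlgorithm) {
        try {
            byte[] digest = MessageDigest.getInstance(hashAlgorithm)
                    .digest(password.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown hash algorithm: " + hashAlgorithm, e);
        }
    }
}
```

Under these assumptions, storing the result of hashAndEncode(purePassword, "MD5") instead of the Base64 of the plain password would give the DB the same kind of value that the login module computes.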

Differences between Crypt.crypt() and DigestUtils.md5() in apache.commons.Codec

I am writing a basic password cracker for the MD5 hashing scheme against a Linux /etc/shadow file. When I use commons.codec's DigestUtils or Crypt libraries, the hash lengths they produce are different (among other things).
When I use Crypt.crypt(passwordToHash, "$1$Jhe937$"), the output is a 22-character string. When I use DigestUtils.md5[Hex](passwordToHash + "Jhe937") (or the Java MessageDigest class), the output is a 32-character string (after conversion). This makes no sense to me.
Aside: is there no easy way to convert the byte[] from DigestUtils.md5(passwordToHash) to a String? I've tried all* the ways and I get non-valid output: Nz_èJÓ_µù[î¬y
*all being: new String(byte[], "UTF-8") and convert to char then to String

The executive summary is that while both are built on MD5, the output format is different between the two, so the lengths will be different. Read on for details.
MD5 is a message digest algorithm that always produces a 16-byte hash value (assuming valid input, etc.). Those bytes aren't all printable characters: each byte can take any value from 0-255, while the printable characters in ASCII are in the range 32-126.
DigestUtils.md5(String) generates the MD5 of the string and returns a 16 element byte array. DigestUtils.md5Hex(String) is a convenience wrapper (I'm assuming, I haven't looked at the source, but that's how I'd write it :-) ) around DigestUtils.md5 that takes the 16 element byte array md5 produces and base16 encodes it (also known as hex encoding). That replaces each byte with the equivalent two hex characters, which is why you get a 32 character String out of it.
Crypt.crypt uses a special format that goes back to the original Unix method of storing passwords. It's been extended over the years to use different hash/encryption algorithms, longer salts, and additional features. It also encodes its output to be printable text, which is where the length difference is coming from. By using a salt of "$1$...", you're saying to use the MD5-based scheme, so the password and the salt are hashed with MD5, resulting in 16 bytes as expected. (The scheme runs many rounds of MD5 rather than one, so the result will not equal a single MD5 of password + salt.) Because those bytes aren't necessarily printable, the hash is base64 encoded (using a slightly different alphabet than the standard base64 encoding), which replaces 3 bytes with 4 printable characters. So 16 bytes becomes 16 / 3 * 4 = 21-1/3 characters, rounded up to 22.
On your aside, DigestUtils.md5 produces 16 bytes, but those bytes can have any value from 0 to 255 and are (effectively) random. new String(byte[], "UTF-8") says the bytes in the byte array are a UTF-8 encoding, which is a very specific format. new String does its best to treat the bytes as a UTF-8 encoded string, but because they're really not, you generally get gibberish out. If you want something printable, you'll have to use something that takes arbitrary bytes, not bytes in a specific format (like UTF-8). Two popular options are base16/hex encoding, which you can get with DigestUtils.md5Hex, or base64, which you can get with Base64.encodeBase64String(DigestUtils.md5(pwd + salt)).
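The length arithmetic above can be checked with a JDK-only sketch. The class PrintableDigest and its method names are invented for this example, and it uses java.util.Base64 with the standard alphabet, so the Base64 text will not match what Crypt.crypt prints even though the character counts line up.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper class; the names are illustrative only.
final class PrintableDigest {

    /** Raw MD5 digest: always 16 bytes, most of them not printable. */
    static byte[] md5(String input) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Hex (base16): two characters per byte, so 16 bytes become 32 characters. */
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(Character.forDigit((b >> 4) & 0xF, 16));
            hex.append(Character.forDigit(b & 0xF, 16));
        }
        return hex.toString();
    }

    /** Base64: four characters per three bytes; 16 bytes become 24 padded or 22 unpadded. */
    static String toBase64(byte[] bytes, boolean padded) {
        Base64.Encoder encoder = Base64.getEncoder();
        return (padded ? encoder : encoder.withoutPadding()).encodeToString(bytes);
    }
}
```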

How do I check if a string is a valid md5 or sha1 checksum string

I don't want to calculate a file's checksum, just to know if a given string is a valid checksum

SHA1 verifier:
public boolean isValidSHA1(String s) {
    return s.matches("^[a-fA-F0-9]{40}$");
}
MD5 verifier:
public boolean isValidMD5(String s) {
    return s.matches("^[a-fA-F0-9]{32}$");
}

Any 160-bit sequence is a possible SHA1 hash. Any 128-bit sequence is a possible MD5 hash.
If you're looking at the hex string representations of them, then a sha1 will look like 40 hexadecimal digits, and an md5 will look like 32 hexadecimal digits.

There is no such thing as an MD5 or SHA-1 string, at least not one that is standardized. All you can test for is the size of the byte array: 16 for MD5 (a hash with an output size of 128 bits) or 20 bytes for SHA-1 (a hash with an output size of 160 bits) encoded using hexadecimal encoding or base 64 encoding.
If you use md5sum, the checksum is generally shown in hexadecimal encoding, using only lowercase characters (followed by the file name, or - for standard input). Hexadecimal is generally preferred, but hashes may also contain separator characters or use a different encoding such as base 64 or base 64 URL (etc. etc.).
The byte size you are testing for might, however, belong to an entirely different hash such as RIPEMD that has the same output size. Or it may be another value with the same number of bytes.

MD5 verifier:
public boolean isValidMD5(String s) {
    return s.matches("[a-fA-F0-9]{32}");
}
Remove any "-" characters from the string value before checking it.

RegExp SHA-1
public static final String SHA_1 = "^([0-9A-Fa-f]{2}[:]){19}([0-9A-Fa-f]{2})$";
public boolean isValidSHA1(String s) {
    return s.matches(SHA_1);
}
boolean isValidSHA1 = isValidSHA1("12:45:54:3A:99:24:52:EA...");
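The answers above each cover one notation (plain hex, "-" separators, ":" separators). One possible way to combine them is to strip separators first and then check the length and alphabet, as in the sketch below. The class ChecksumFormat and its methods are hypothetical, and the check is deliberately lenient: it accepts separators anywhere in the string, and it can only say that a string has the right shape, not that it really came from MD5 or SHA-1.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class ChecksumFormat {

    // Separators seen in the answers above (':' and '-'), plus whitespace.
    private static final Pattern SEPARATORS = Pattern.compile("[:\\-\\s]");
    private static final Pattern HEX = Pattern.compile("[0-9a-fA-F]+");

    /** True if s is a hex rendering of exactly digestBytes bytes, ignoring separators. */
    static boolean looksLikeHexDigest(String s, int digestBytes) {
        if (s == null) {
            return false;
        }
        String compact = SEPARATORS.matcher(s).replaceAll("");
        return compact.length() == digestBytes * 2 && HEX.matcher(compact).matches();
    }

    /** MD5 digests are 16 bytes, i.e. 32 hex characters. */
    static boolean looksLikeMd5(String s) {
        return looksLikeHexDigest(s, 16);
    }

    /** SHA-1 digests are 20 bytes, i.e. 40 hex characters. */
    static boolean looksLikeSha1(String s) {
        return looksLikeHexDigest(s, 20);
    }
}
```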

SHA-1 Hashes Mixed with Strings

I have to parse something like the following: "some text <40 byte hash>". Can I read this whole thing into a String without corrupting the 40-byte hash part?
The thing is, the hash is not going to be there, so I don't want to process it while reading.
EDIT: I forgot to mention that the 40-byte hash is 2 x 20-byte hashes, raw bytes with no encoding.

Read it from your input stream as a byte stream, and then strip the String out of the stream like this:
String s = new String(Arrays.copyOfRange(bytes, 0, bytes.length - 40));
Then get your hash bytes as:
byte[] hash = Arrays.copyOfRange(bytes, bytes.length - 40, bytes.length);

SHA-1 hashes are 20 bytes (160 bits) in length. If you are dealing with 40 character hashes, then they are probably an ASCII representation of the hash, and therefore only contain the characters 0-9 and a-f. If this is the case, then you should be able to read and manipulate the strings in Java without any trouble.

Some more details could be useful, but I think the answer is that you should be okay.
You didn't say how the SHA-1 hash was encoded (common possibilities include "none" (the raw bytes), Base64 and hex). Since SHA-1 produces a 20 byte (160 bit) hash, I am guessing that it will be encoded using hex, since that doubles the space needed to the 40 bytes you mentioned. With that encoding, 2 characters will be used to encode each byte from the hash, using the symbols 0 through 9 and A through F. Those are all ASCII characters so you are safe.
Base64 encoding would also work (though probably not what you asked about since it increases the size by about 1/3 leaving you at well less than 40 bytes) as each of the characters used in Base64 are also ASCII.
If the raw bytes were used directly, you would have a problem, as some of the values are not valid characters.

OK, now that you've clarified that these are raw bytes:
No, you cannot read this into Java as a string, you will need to read it as raw bytes.
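A minimal sketch of that byte-level split is shown here. It rests on several assumptions that the question does not confirm: the whole record is already in memory as a byte[], the two raw 20-byte hashes are the last 40 bytes, and the text part is UTF-8. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
final class TextWithRawHashes {

    /** Assumption: two raw 20-byte SHA-1 digests sit at the end of the record. */
    static final int HASH_LENGTH = 20;
    static final int HASH_BLOCK_LENGTH = 2 * HASH_LENGTH;

    private static void requireHashBlock(byte[] record) {
        if (record.length < HASH_BLOCK_LENGTH) {
            throw new IllegalArgumentException("Record is shorter than the 40-byte hash block");
        }
    }

    /** Decodes only the leading text bytes; the hash bytes are never turned into chars. */
    static String textPart(byte[] record) {
        requireHashBlock(record);
        // Assumption: the text part is UTF-8 encoded.
        return new String(record, 0, record.length - HASH_BLOCK_LENGTH, StandardCharsets.UTF_8);
    }

    /** First raw digest: bytes [length - 40, length - 20). */
    static byte[] firstHash(byte[] record) {
        requireHashBlock(record);
        int start = record.length - HASH_BLOCK_LENGTH;
        return Arrays.copyOfRange(record, start, start + HASH_LENGTH);
    }

    /** Second raw digest: bytes [length - 20, length). */
    static byte[] secondHash(byte[] record) {
        requireHashBlock(record);
        return Arrays.copyOfRange(record, record.length - HASH_LENGTH, record.length);
    }
}
```

Because the hash bytes stay in byte arrays the whole time, no character decoding can corrupt them.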

WORKING CODE:
Converts byte string inputs into hex characters which should be safe in almost all string encodings. Use the code I posted in your other question to decode the hex chars back to raw bytes.
/** Lookup table: character for a half-byte */
static final char[] CHAR_FOR_BYTE = {'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F'};

/** Encode byte data as a hex string... hex chars are UPPERCASE */
public static String encode(byte[] data) {
    if (data == null || data.length == 0) {
        return null;
    }
    char[] store = new char[data.length * 2];
    for (int i = 0; i < data.length; i++) {
        final int val = (data[i] & 0xFF);
        final int charLoc = i << 1;
        store[charLoc] = CHAR_FOR_BYTE[val >>> 4];
        store[charLoc + 1] = CHAR_FOR_BYTE[val & 0x0F];
    }
    return new String(store);
}
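The decoder from the other question is not reproduced on this page, so the following is only a sketch of what the reverse direction could look like, not the original poster's code. The class name HexDecoder and the method name decode are assumptions.

```java
// Hypothetical helper class; the names are illustrative only.
final class HexDecoder {

    /** Decodes a hex string (upper or lower case) back into raw bytes. */
    static byte[] decode(String hex) {
        if (hex == null || hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex input must have an even number of characters");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            // Character.digit returns -1 for anything that is not a base-16 digit.
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a hex character near index " + (2 * i));
            }
            // The first character of each pair is the high half-byte.
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

Unlike the encoder above, which returns null for empty input, this sketch returns an empty array for an empty string.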
