Integer Hash to a String Value - java

I want to compute the hash of a String, but the hash value should be a number (long or integer).
In other words, I want to compute an integer hash of a string.
Collision resistance is not a concern.
Is there a way to convert a SHA-256 MessageDigest to a number?
I am using Java to accomplish this.

Try to call hashCode() method. It is already implemented and does exactly what you want.

Most obviously, there is a hashCode() method on String.
As for converting the MessageDigest to a number, you can either use hashCode again, or take the byte array from the digest and compact it down to whatever size you want (int, long or whatever) with, say, XOR.
public int compactDigest(MessageDigest digest) {
    byte[] byteArr = digest.digest();
    // Allocate length + 3 bytes: the int view holds capacity / 4 ints (rounded down),
    // so the padding ensures no trailing digest bytes are lost.
    ByteBuffer bytes = ByteBuffer.allocate(byteArr.length + 3);
    bytes.put(byteArr);
    bytes.rewind();
    IntBuffer ints = bytes.asIntBuffer();
    int compactDigest = 0;
    for (int i = 0; i < ints.limit(); ++i) {
        compactDigest ^= ints.get(i);
    }
    return compactDigest;
}
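If a long is wanted rather than an int, another option (not from the answer above, just a minimal sketch) is to keep only the first 8 bytes of the digest. The class name DigestToLong and the helper sha256ToLong are made up for illustration, and UTF-8 is an assumed encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestToLong {
    // Hypothetical helper: hashes the string with SHA-256 and keeps the first 8 bytes.
    static long sha256ToLong(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            // ByteBuffer is big-endian by default; getLong() reads bytes 0..7.
            return ByteBuffer.wrap(digest).getLong();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on the Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256ToLong("string"));
    }
}
```

The result is signed, so roughly half of all inputs give a negative long. Truncation discards the other 24 bytes, which is acceptable here only because collision resistance is not a concern.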

A SHA-256 hash has 256 bits, e.g.
"364b7e70a9966ef7686ab814958cd0017b7f19147a257d40603d4a1307662b42"
This exceeds the range of long and int.
You could use new BigInteger(hash, 16), where hash is the hex string, for a decimal representation.
public static void main(String[] args) throws NoSuchAlgorithmException {
    MessageDigest digest = MessageDigest.getInstance("SHA-256");
    digest.update("string".getBytes());
    byte[] hash = digest.digest();
    // Signum 1 treats the digest bytes as an unsigned magnitude, so the value is never negative.
    BigInteger bi = new BigInteger(1, hash);
    System.out.println("hex:" + bi.toString(16) + "\r\ndec:" + bi.toString());
}

Class String has a hashCode() method, like any other Java class, which transforms the string into a number. See the documentation of this method for the exact algorithm it uses.

Every object in Java has a hashCode() method. You can override it and specify your own logic. Look at the examples.

Please find it here: http://pastebin.com/j6Cffkcp
But it returns only a string.

Cryptographic hashes created using the JCE classes (MessageDigest in your case) are essentially a sequence of bytes (256 bits for SHA-256). If you wish to store and manage these as numbers, you'll need to convert them into BigInteger or BigDecimal objects (given the length of the digest).
Cryptographic hashes of String objects are not computed routinely; when they are, it is often for the one-way protection of secrets. If you are using the hash for other purposes, especially to ensure some sort of uniqueness among the Strings (which matters when storing these objects in a hash map), you're better off using the hash value computed by the String.hashCode method.

Related

How to convert byte array to unsigned 128 bit integer in Java?

I am trying to hash strings using MD5. I need the hashed value as a 128 bit unsigned integer in Java.
MessageDigest md = MessageDigest.getInstance("MD5");
String toHash = "HashThis";
md.update(toHash.getBytes());
byte[] isHashed = md.digest();
How do I convert isHashed to an integer?
Java does not have a 128-bit int type, but it has a BigInteger class, which has a constructor that takes a signum and a magnitude expressed as a byte[], as you require.
BigInteger value=new BigInteger(1,isHashed);
Use BigInteger.
BigInteger value = new BigInteger(isHashed);
To ensure that the resulting value is positive (as the byte array is expected to be in 2's complement), make sure the most significant bit is zero. This can be easily achieved by making the array 1 element bigger, with the first byte being 0.
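The two suggestions above should give the same number. Here is a minimal sketch comparing them; the class name Md5AsUnsigned and its helper methods are invented for this example, and UTF-8 is an assumed encoding.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5AsUnsigned {
    static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Approach 1: signum 1 plus the digest bytes as the magnitude.
    static BigInteger viaSignum(byte[] digest) {
        return new BigInteger(1, digest);
    }

    // Approach 2: prepend a zero byte so the two's-complement sign bit is 0.
    static BigInteger viaPadding(byte[] digest) {
        byte[] padded = new byte[digest.length + 1];
        System.arraycopy(digest, 0, padded, 1, digest.length);
        return new BigInteger(padded);
    }

    public static void main(String[] args) {
        byte[] isHashed = md5("HashThis");
        System.out.println(viaSignum(isHashed));
        System.out.println(viaSignum(isHashed).equals(viaPadding(isHashed))); // true
    }
}
```

Without either step, new BigInteger(isHashed) is negative whenever the first digest byte has its top bit set.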

Long value of a String that does not contain numbers

I have a String variable that I want to convert to a long variable.
The problem is that the String variable will never contain any numbers, so simply calling Long.parseLong(myString); will throw a NumberFormatException.
To clarify my intentions:
I have a method that returns a long from a String in-parameter. I want the method to generate an ID based on the String variable, to later be able to group the long values.
I might solve this using a regular expression, but my question is whether there is any straightforward way to get a long value from a String.
You say you want a long value. The built in hashCode() returns an int, not a long. If you really do need a long then you need to use a hashing method that returns a long. There are a number of possibilities, though I usually suggest the FNV hash for non-cryptographic purposes. It is very easy to code and comes in a wide range of sizes, 64-bit included.
ETA: Code for the FNV hash is on the FNV website that I linked to. Things to be careful of are 1) unsigned v. signed 64-bit numbers and 2) character encodings.
long FNV64Hash(String inString) throws UnsupportedEncodingException {
    // FNV-64 constants.
    long FNVprime = 1099511628211L;
    // Needs a workaround for the unsigned 64-bit value 14695981039346656037:
    // the multiplication overflows and wraps to the same bit pattern.
    long FNVbasis = (146959810393466560L * 100L) + 37L;
    // Alternative: long FNVbasis = -3750763034362895579L;
    // Convert string to bytes.
    byte[] bytes = inString.getBytes("UTF-8"); // Specify a character encoding.
    long hash = FNVbasis;
    for (byte aByte : bytes) {
        // Mask to 0..255; a negative byte would otherwise sign-extend and flip the high bits.
        hash ^= (aByte & 0xff);
        hash *= FNVprime;
    }
    return hash;
} // end FNV64Hash()
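On the unsigned-versus-signed point: assuming Java 8 or later, the basis can also be written without an arithmetic trick by using the unsigned helpers on Long. A small sketch (the class name UnsignedBasis is made up):

```java
class UnsignedBasis {
    // Parses the unsigned decimal value into the equivalent 64-bit pattern.
    static final long FNV_BASIS = Long.parseUnsignedLong("14695981039346656037");

    public static void main(String[] args) {
        System.out.println(FNV_BASIS);                        // signed view: -3750763034362895579
        System.out.println(Long.toUnsignedString(FNV_BASIS)); // unsigned view: 14695981039346656037
        System.out.println(Long.toHexString(FNV_BASIS));      // cbf29ce484222325
    }
}
```

The bits are identical in all three views; only the interpretation differs, which is why the multiply and XOR steps work on a signed long.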
If you want a simple and easy way, you can use hashCode() in Java. Here is an example:
public class StringHashing {
    public static void main(String[] args) {
        String str = "HELLO WORLD !!";
        System.out.println("Hashcode for str: " + str.hashCode());
    }
}
Or you can implement your own hash function.

Standard way to create a hash in Java

The question is about the correct way of creating a hash in Java:
Let's assume I have a positive BigInteger value that I would like to create a hash from. Let's also assume that the messageDigest instance below is a valid SHA-256 instance.
public static final BigInteger B = new BigInteger("BD0C61512C692C0CB6D041FA01BB152D4916A1E77AF46AE105393011BAF38964DC46A0670DD125B95A981652236F99D9B681CBF87837EC996C6DA04453728610D0C6DDB58B318885D7D82C7F8DEB75CE7BD4FBAA37089E6F9C6059F388838E7A00030B331EB76840910440B1B27AAEAEEB4012B7D7665238A8E3FB004B117B58", 16);
byte[] byteArrayBBigInt = B.toByteArray();
this.printArray(byteArrayBBigInt);
messageDigest.reset();
messageDigest.update(byteArrayBBigInt);
byte[] outputBBigInt = messageDigest.digest();
I assume that the code above is correct because, according to my test, the hashes I produce match the one produced by:
http://www.fileformat.info/tool/hash.htm?hex=BD0C61512C692C0CB6D041FA01BB152D4916A1E77AF46AE105393011BAF38964DC46A0670DD125B95A981652236F99D9B681CBF87837EC996C6DA04453728610D0C6DDB58B318885D7D82C7F8DEB75CE7BD4FBAA37089E6F9C6059F388838E7A00030B331EB76840910440B1B27AAEAEEB4012B7D7665238A8E3FB004B117B58
However, I am not sure why we perform the step below.
Because the byte array returned by the digest() call is signed, and in this case negative, I suspect that we need to convert it to a positive number, i.e. we can use a function like this:
public static String byteArrayToHexString(byte[] b) {
String result = "";
for (int i=0; i < b.length; i++) {
result += Integer.toString((b[i] & 0xff) + 0x100, 16).substring(1);
}
return result;
}
thus:
String hex = byteArrayToHexString(outputBBigInt);
BigInteger unsignedBigInteger = new BigInteger(hex, 16);
When I construct a BigInteger from the new hex string and convert it back to a byte array, I see that the sign bit, that is, the most significant (leftmost) bit, is set to 0, which means that the number is positive. Moreover, the whole first byte consists of zeros (00000000).
My question is: is there any RFC that describes why we always need to convert the hash to a "positive" unsigned byte array? I mean, even if the number produced by the digest call is negative, it is still a valid hash, right? So why do we need that additional procedure? Basically, I am looking for a paper, standard or RFC that says we need to do so.
A hash consists of an octet string (called a byte array in Java). How you convert it to or from a large number (a BigInteger in Java) is completely out of the scope for cryptographic hash algorithms. So no, there is no RFC to describe it as there is (usually) no reason to treat a hash as a number. In that sense a cryptographic hash is rather different from Object.hashCode().
That you can only treat hexadecimals as unsigned is a bit of an issue, but if you really want to, you can first convert the hex string back to a byte array and then perform new BigInteger(result). That constructor does treat the encoding within result as signed. Note that in protocols it is often not necessary to convert back and forth to hexadecimals; hexadecimals are mainly for human consumption, and a computer is fine with bytes.
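To make the signed-versus-unsigned difference concrete, here is a small sketch using a two-byte stand-in for a digest whose top bit is set. The class name SignedVsUnsigned and the toHex helper are made up for illustration; toHex plays the same role as byteArrayToHexString in the question.

```java
import java.math.BigInteger;

class SignedVsUnsigned {
    // Hex encoding that masks each byte to 0..255, as in the question's helper.
    static String toHex(byte[] b) {
        StringBuilder sb = new StringBuilder();
        for (byte x : b) {
            sb.append(String.format("%02x", x & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] digestLike = {(byte) 0x80, 0x01}; // stand-in for a digest with the top bit set
        BigInteger signed = new BigInteger(digestLike);            // two's complement reading
        BigInteger viaHex = new BigInteger(toHex(digestLike), 16); // hex has no sign, so positive
        BigInteger viaSignum = new BigInteger(1, digestLike);      // explicit unsigned reading

        System.out.println(signed);                      // -32767
        System.out.println(viaHex);                      // 32769
        System.out.println(viaSignum);                   // 32769
        System.out.println(viaHex.toByteArray().length); // 3: a leading 0x00 keeps the sign bit clear
    }
}
```

The extra zero byte seen in the question comes from BigInteger's own two's-complement encoding, not from the hash function.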

Turn String to 128-bit key for AES

I want to enter my own String variable and then turn it into a key for encryption/decryption with the AES algorithm. I have tried many known ways, such as UTF-8, Base64, some methods that convert between bytes and strings, and others. All of them turn the string into bytes (some more accurately than others), but what I want is to enter something like "helloWorld" and get back a 128-bit key for AES. Anything I use fails with "Invalid key length" since the number of bytes is not right.
What do I need to do to get the correct bytes? I also want to clarify that I want a String and not an array of char, since I want to turn this into a function in my program later so that the user can change the key at will should it be compromised.
UPDATE: I edited the example, and this is what I have so far. It still throws an exception about parameters and key length, though.
public class SHAHashingExample
{
private static byte[] keyValue;
public static void main(String[] args)throws Exception
{
String password = "123456";
MessageDigest md = MessageDigest.getInstance("SHA-256");
md.update(password.getBytes());
byte byteData[] = md.digest();
keyValue = md.digest();
//convert the byte to hex format method 1
StringBuffer sb = new StringBuffer();
for (int i = 0; i < byteData.length/2; i++) {
sb.append(Integer.toString((byteData[i] & 0xff) + 0x100, 16).substring(1));
}
System.out.println("Hex format : " + sb.toString());
//convert the byte to hex format method 2
StringBuffer hexString = new StringBuffer();
for (int i=0;i<byteData.length/2;i++) {
String hex=Integer.toHexString(0xff & byteData[i]);
if(hex.length()==1) hexString.append('0');
hexString.append(hex);
}
System.out.println("Hex format : " + hexString.toString());
String k = "hello world";
String f = encrypt(k);
System.out.println(f);
String j = decrypt(f);
System.out.println(j);
}
public static String encrypt(String Data) throws Exception {
Key key = generateKey();
Cipher c = Cipher.getInstance("AES");
c.init(Cipher.ENCRYPT_MODE, key);
byte[] encVal = c.doFinal(Data.getBytes());
String encryptedValue = new BASE64Encoder().encode(encVal);
return encryptedValue;
}
public static String decrypt(String encryptedData) throws Exception {
Key key = generateKey();
Cipher c = Cipher.getInstance("AES");
c.init(Cipher.DECRYPT_MODE, key);
byte[] decordedValue = new BASE64Decoder().decodeBuffer(encryptedData);
byte[] decValue = c.doFinal(decordedValue);
String decryptedValue = new String(decValue);
return decryptedValue;
}
private static Key generateKey() throws Exception {
Key key = new SecretKeySpec(keyValue, "AES");
return key;
}
}
UPDATE 2:
It turns out that your usage of many components of the Java Cipher API is not spot on. Look at this other SO answer:
Java AES and using my own Key
UPDATE 1:
To get the 256 bit value down to 128 bits using the example below, here is what you may want to try:
// After you already have generated the digest
byte[] mdbytes = md.digest();
byte[] key = new byte[mdbytes.length / 2];
for (int i = 0; i < key.length; i++) {
    // Choice 1: use only the first 128 of the 256 generated bits
    key[i] = mdbytes[i];
    // Choice 2 (use instead of choice 1): fold ALL 256 bits into 128 with XOR
    // key[i] = (byte) (mdbytes[i] ^ mdbytes[i + key.length]);
}
// Now use key as the input key for AES
ORIGINAL:
Here is a great example of using the built-in java APIs for performing a SHA hash on some data bytes.
http://www.mkyong.com/java/java-sha-hashing-example/
Java has built-in support for several hash types, and you really should take advantage of one instead of trying to write one yourself. Perhaps the most widely used hash functions are the SHA versions. SHA-1 outputs 160 bits, and the SHA-2 family has versions that output 224, 256, 384, and 512 bits.
What you are asking for is, technically, exactly how logging into a system using your password generally works. The system never stores your actual textual password, only its hash. When you, the user, enter your password, the system hashes what you entered and compares the freshly generated hash with the stored hash. That process does not take the added step of using the hash as key material for symmetric encryption. In general, a GOOD hash can indeed generate DECENT key material for use in actual symmetric encryption / decryption.
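Putting the pieces together, a password can be hashed with SHA-256 and the first 16 bytes used as a 128-bit AES key. The following is a minimal sketch under stated assumptions: the class name PasswordToAesKey and its methods are invented, UTF-8 is an assumed encoding, and "AES/ECB/PKCS5Padding" is spelled out only because it is what Cipher.getInstance("AES") typically resolves to in the question's code.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

class PasswordToAesKey {
    // Hashes the password and keeps the first 16 bytes (128 bits) as the AES key.
    static SecretKeySpec deriveKey(String password) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(password.getBytes(StandardCharsets.UTF_8));
            return new SecretKeySpec(Arrays.copyOf(digest, 16), "AES");
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // mode is Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE.
    static byte[] crypt(int mode, SecretKeySpec key, byte[] data) {
        try {
            Cipher c = Cipher.getInstance("AES/ECB/PKCS5Padding");
            c.init(mode, key);
            return c.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKeySpec key = deriveKey("helloWorld");
        byte[] enc = crypt(Cipher.ENCRYPT_MODE, key, "hello world".getBytes(StandardCharsets.UTF_8));
        byte[] dec = crypt(Cipher.DECRYPT_MODE, key, enc);
        System.out.println(new String(dec, StandardCharsets.UTF_8)); // hello world
    }
}
```

Because the key is exactly 16 bytes, the "Invalid key length" error from the question should not occur. For real password handling, a dedicated key derivation function with a salt is generally preferred over a single plain hash.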
What you are looking for is called a hash function. You will be able to enter an input of arbitrary length, and the hash function will always output a value of fixed bit size -- 128 bits in your case.
There are many approaches to a hash function. The simplest one would be a modulo operation between an input number (an integer representation of your string, for example) and the maximum number that can be represented in n bits (in your case, 128); the result will be an n-bit number that you can convert to whatever form you want (probably hexadecimal) and use as an AES key.
That isn't necessarily efficient (which is to say, the output 128-bit keys may not be very evenly distributed between 0 and 2^128 - 1), though -- more importantly, it would be quite slow for no good reason. Some efficient 128-bit hash functions are CityHash and MurmurHash -- you can look more up (as well as several Java implementations) on Google.

What really happens (the actual process) when we hash a particular string or word

Hi, I am trying to develop a counting Bloom filter in Java. I have searched most of the sources about Bloom filters. What I understood is that when we hash a particular string or word, the hash returns one value, and we store the content at the position given by that value.
But my big question is how to do the hashing (the algorithm). Can you please explain what really happens when we hash a particular string or word, that is, how the final value is arrived at? I also read that there is a chance of collisions. Can you also explain why the resulting hash value is not unique (why it sometimes returns the same hash value for different inputs)? And do I really need to write the code to do the hashing, or are there built-in functions in Java for it?
You can simply get a hash code by calling hashCode() on any object. In particular for class String from javadoc:
public int hashCode()
Returns a hash code for this string. The hash code for a String object
is computed as
s[0]*31^(n-1) + s[ 1]*31^(n-2) + ... + s[n-1]
using int arithmetic, where s[i] is the ith character of the string, n
is the length of the string, and ^ indicates exponentiation. (The hash
value of the empty string is zero.)
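As a worked example of the formula, "abc" gives 97*31^2 + 98*31 + 99 = 93217 + 3038 + 99 = 96354. The sketch below evaluates the sum term by term and compares it with the built-in method; the class name FormulaHash is made up for this illustration.

```java
class FormulaHash {
    // Direct evaluation of s[0]*31^(n-1) + ... + s[n-1] in int arithmetic.
    static int formulaHash(String s) {
        int n = s.length();
        int sum = 0;
        for (int i = 0; i < n; i++) {
            int power = 1;
            for (int k = 0; k < n - 1 - i; k++) {
                power *= 31; // overflow wraps, which matches int arithmetic in the javadoc
            }
            sum += s.charAt(i) * power;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(formulaHash("abc")); // 96354
        System.out.println("abc".hashCode());   // 96354
    }
}
```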
"Hashing" is a function
H: I -> O
Where usually the set I is much bigger or more complex than O. In a hash table, I is the class of your elements and O is the set of positive integers. In particular, a Bloom filter uses n different hash functions. To develop a hash function, you need to extract characteristics that distinguish similar objects. For example, for character strings you can use:
the length
the first character
the number of occurrences of a specific character
the string evaluated as a polynomial h(S) = sum (s(i)*31^i) mod d
When using multiple hash functions, characteristics that are correlated should be avoided; for example, using the number of vowels and the number of non-vowels is not really helpful. There are some properties a hash function must have; see the Wikipedia entry.
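For the counting Bloom filter in the question, one possible shape is sketched below. Everything here is an assumption for illustration: the class name, the counter array size, and the choice to derive the i-th index as h1 + i*h2 from String.hashCode() and an FNV-1a-style hash over the characters. It is not the only way to obtain several hash functions.

```java
class CountingBloomFilter {
    private final int[] counters;
    private final int numHashes;

    CountingBloomFilter(int size, int numHashes) {
        this.counters = new int[size];
        this.numHashes = numHashes;
    }

    // Second, independent-looking hash: FNV-1a-style mixing over the chars.
    private static int secondHash(String s) {
        int h = 0x811c9dc5;
        for (int k = 0; k < s.length(); k++) {
            h ^= s.charAt(k);
            h *= 16777619;
        }
        return h;
    }

    // i-th counter position, combining the two base hashes.
    private int index(String item, int i) {
        return Math.floorMod(item.hashCode() + i * secondHash(item), counters.length);
    }

    void add(String item) {
        for (int i = 0; i < numHashes; i++) {
            counters[index(item, i)]++;
        }
    }

    // Only removes items that appear to be present, so counters never go negative.
    void remove(String item) {
        if (!mightContain(item)) {
            return;
        }
        for (int i = 0; i < numHashes; i++) {
            counters[index(item, i)]--;
        }
    }

    // false means definitely absent; true means possibly present.
    boolean mightContain(String item) {
        for (int i = 0; i < numHashes; i++) {
            if (counters[index(item, i)] == 0) {
                return false;
            }
        }
        return true;
    }
}
```

A usage example: after new CountingBloomFilter(1024, 3), calling add("apple") makes mightContain("apple") return true, and a following remove("apple") makes it return false again. Counters instead of single bits are what make removal possible.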
The code executed for String is this one:
public int hashCode() {
int h = hash;
int len = count;
if (h == 0 && len > 0) {
int off = offset;
char val[] = value;
for (int i = 0; i < len; i++) {
h = 31*h + val[off++];
}
hash = h;
}
return h;
}
Hash is a function (not a bijection) and therefore, different inputs can produce the same result. This is the basics of hash functions
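A concrete collision is easy to find for String.hashCode(): "Aa" gives 65*31 + 97 = 2112 and "BB" gives 66*31 + 66 = 2112. A minimal sketch (the class name HashCollision is made up):

```java
class HashCollision {
    // True when two different strings share the same hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode());     // 2112
        System.out.println("BB".hashCode());     // 2112
        System.out.println(collide("Aa", "BB")); // true
    }
}
```

Since there are far more possible strings than int values, such collisions cannot be avoided; hash-based collections fall back on equals() to tell colliding keys apart.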
Java allows you to override the hashCode() method in your classes to use your own hashing algorithm:
public class Employee {
    // A default implementation might want to use "name" as part of hashCode.
    private String name;
    private int id;
    @Override
    public int hashCode() {
        // We know that id is always unique, so don't use name in calculating
        // the hash code. hashCode() returns an int.
        return id;
    }
}
*(If you are going to override hashCode, you should also override equals.)
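A minimal sketch of overriding both methods together, using java.util.Objects from the standard library. The Person class and its fields are hypothetical and chosen only for illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class Person {
    private final int id;
    private final String name;

    Person(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        return id == other.id && Objects.equals(name, other.name);
    }

    // Uses the same fields as equals, so equal objects get equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(id, name);
    }

    public static void main(String[] args) {
        Set<Person> people = new HashSet<>();
        people.add(new Person(1, "Ann"));
        System.out.println(people.contains(new Person(1, "Ann"))); // true
    }
}
```

If only one of the two methods were overridden, the lookup in the HashSet could miss an object that is logically equal to a stored one.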
The hash code is computed per object stored in the collection.
It is computed using a standard algorithm.
You can indeed override the hashCode method on a per-class basis.
One way to implement a hashCode method is to use HashCodeBuilder.
Hope this helps. Search Stack Overflow for more on this topic; you can find more descriptive answers.
