Java Map<> with intrinsic types possible?

How is this possible:
HashMap<byte[], byte[]> and what is hash() of byte[]?

Yes, it is possible (with a big caveat, see below), but byte[] is not an "intrinsic type". First, there's no such thing, you probably mean a "primitive type". Second: byte[] is not a primitive type, byte is. An array is always a reference type.
Arrays don't have specific hashCode implementations, so they just use the hashCode of Object. This means the hashCode is the identity hash code, which is independent of the actual content.
In other words: a byte[] is a very bad Map key, because you can only retrieve the value with the exact same instance.
If you need a content-based hashCode() based on an array, you can use Arrays.hashCode(), but that won't help you (directly) with the Map. There's also Arrays.equals() to check for content equality.
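As a quick illustration of the difference between identity-based and content-based comparison, here is a minimal sketch. The class name ArrayHashDemo and its helper methods are made up for this example; only Arrays.equals() and Arrays.hashCode() come from the JDK.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
public class ArrayHashDemo {
    // equals() inherited from Object: true only for the very same instance
    static boolean identityEquals(byte[] a, byte[] b) {
        return a.equals(b);
    }

    // content-based comparison via java.util.Arrays
    static boolean contentEquals(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    // content-based hash codes agree whenever the contents agree
    static boolean sameContentHash(byte[] a, byte[] b) {
        return Arrays.hashCode(a) == Arrays.hashCode(b);
    }

    public static void main(String[] args) {
        byte[] a = { 2, 3 };
        byte[] b = { 2, 3 };
        System.out.println(identityEquals(a, b));  // false
        System.out.println(contentEquals(a, b));   // true
        System.out.println(sameContentHash(a, b)); // true
    }
}
```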
You could wrap your byte[] in a thin wrapper object that implements hashCode() and equals() (using the methods mentioned above):
import java.util.Arrays;

public final class ArrayWrapper {
    private final byte[] data;
    private final int hash;

    public ArrayWrapper(final byte[] data) {
        // strictly speaking we should make a defensive copy here,
        // but I *assume* (and should document) that the argument
        // passed in here should not be changed
        this.data = data;
        this.hash = Arrays.hashCode(data);
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ArrayWrapper)) {
            return false;
        }
        ArrayWrapper other = (ArrayWrapper) o;
        return this.hash == other.hash && Arrays.equals(this.data, other.data);
    }

    // don't add getData to prevent having to do a defensive copy of data
}
Using this class you can then use a Map<ArrayWrapper,byte[]>.
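If writing a wrapper class is not desirable, one alternative that this answer does not mention is java.nio.ByteBuffer, whose equals() and hashCode() are computed from the buffer's remaining content. The following is only a sketch under that assumption; the class ByteBufferKeyDemo and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: uses ByteBuffer as a ready-made array wrapper.
public class ByteBufferKeyDemo {
    // Wraps the array without copying it.
    static ByteBuffer key(byte[] bytes) {
        return ByteBuffer.wrap(bytes);
    }

    static String demoLookup() {
        Map<ByteBuffer, String> map = new HashMap<>();
        map.put(key("key".getBytes(StandardCharsets.UTF_8)), "value");
        // A different array instance with the same content finds the entry.
        return map.get(key("key".getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        System.out.println(demoLookup()); // value
    }
}
```

The same caveat as for ArrayWrapper applies: neither the wrapped array nor the buffer's position and limit should be changed while the buffer is used as a key.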

For arrays, hashCode() uses the default implementation from Object, typically some form of internal object address. As a result, a key in this HashMap is considered distinct if it is a different array instance, regardless of whether the array contents are equal.
byte[] a = { 2, 3 };
byte[] b = { 2, 3 };
System.out.println(a.equals(b)); // false
Map<byte[], String> map = new HashMap<byte[], String>();
map.put(a, "A");
map.put(b, "B");
System.out.println(map); // {[B@37d2068d=B, [B@7ecec0c5=A}

Related

Custom key generation and collision in a hashMap

I have a method that is expected to save an object in a HashMap (used as a cache) keyed by a String.
The objects passed to the method have fields that are either “volatile” (i.e. they can change on the next refresh from the data store) or identical across all objects, except for 3 fields.
Those fields are 2 of type double and 1 field of type String.
What I am doing now is I use:
Objects.hash(doubleField1, doubleField2, theString)
and convert the result to String.
Basically I am generating a key for that object based on the immutable state.
This seems to work but I have the following question:
If we exclude the case where two objects have exactly the same fields (which is impossible), how likely is it that I end up with a collision that cannot be detected properly?
What I mean is: in a hashmap with keys such as Strings, if there is a collision on the hashCode, the actual value of the key is compared to verify whether it is the same object or just a collision.
Would using keys the way I have described create problems for such verification?
Update:
If I have, e.g., a hashmap keyed by a Person whose hashCode is generated from fullName and ssn or dateOfBirth, then on a collision the hashmap implementation uses equals to verify whether it is the actual object being searched for.
I was wondering whether the approach I describe could have an issue in that respect, because I generate the actual key directly.

Here is a simple demo for a hashMap key implementation. When retrieving the object I construct the fields piecemeal to avoid any possibility of using cached Strings or Integers. It makes a more convincing demo.
Map<MyKey, Long> map = new HashMap<>();
map.put(new MyKey(10,"abc"), 1234556L);
map.put(new MyKey(400,"aefbc"), 548282L);
int n = 380;
long v = map.get(new MyKey(n + 20, "ae" + "fbc")); // Should get 548282
System.out.println(v);
prints
548282
The key class
class MyKey {
    private final int v;
    private final String s;
    private final int hashcode;

    public MyKey(int v, String s) {
        Objects.requireNonNull(s, "String must be provided");
        this.v = v;
        this.s = s;
        // this class is immutable so no need to keep
        // computing hashCode
        hashcode = Objects.hash(s, v);
    }

    @Override
    public int hashCode() {
        return hashcode;
    }

    @Override
    public boolean equals(Object o) {
        if (o == this) {
            return true;
        }
        if (o == null) {
            return false;
        }
        if (o instanceof MyKey) {
            MyKey mk = (MyKey) o;
            return v == mk.v && s.equals(mk.s);
        }
        return false;
    }
}
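To make the concern from the question concrete, here is a small sketch of what can go wrong when the String form of Objects.hash(...) is itself used as the map key. It relies on the well-known fact that "Aa" and "BB" have the same String hash code; the class HashAsKeyDemo and its methods are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical demo class: the hash value itself is used as the cache key.
public class HashAsKeyDemo {
    // Key generation as described in the question.
    static String hashKey(double d1, double d2, String s) {
        return String.valueOf(Objects.hash(d1, d2, s));
    }

    // Two logically different objects are stored, but only one entry survives.
    static int cacheSizeAfterTwoPuts() {
        Map<String, String> cache = new HashMap<>();
        cache.put(hashKey(1.0, 2.0, "Aa"), "first");
        cache.put(hashKey(1.0, 2.0, "BB"), "second"); // silently overwrites "first"
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println(hashKey(1.0, 2.0, "Aa").equals(hashKey(1.0, 2.0, "BB"))); // true
        System.out.println(cacheSizeAfterTwoPuts()); // 1
    }
}
```

Because the original field values are no longer part of the key, the map has nothing left to compare with equals(), so the collision cannot be detected. A key class such as MyKey keeps the fields and avoids this.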

Peculiar HashMap Behavior

I was reviewing one of Oracle’s Java Certification Practice Exams when I came across the following question:
Given:
class MyKeys {
Integer key;
MyKeys(Integer k) {
key = k;
}
public boolean equals(Object o) {
return ((MyKeys) o).key == this.key;
}
}
And this code snippet:
Map m = new HashMap();
MyKeys m1 = new MyKeys(1);
MyKeys m2 = new MyKeys(2);
MyKeys m3 = new MyKeys(1);
MyKeys m4 = new MyKeys(new Integer(2));
m.put(m1, "car");
m.put(m2, "boat");
m.put(m3, "plane");
m.put(m4, "bus");
System.out.print(m.size());
What is the result?
A) 2
B) 3
C) 4
D) Compilation fails
My guess was B because m1 and m3 are equal due to their key references being the same. To my surprise, the answer is actually C. Does put() do something that I am missing? Why wouldn’t "plane" replace "car"? Thank you!

With the given class definition, i.e.
class MyKeys {
Integer key;
MyKeys(Integer k) {
key = k;
}
public boolean equals(Object o) {
return ((MyKeys) o).key == this.key;
}
}
the answer is 4, because the class only overrides equals. If you add a definition of hashCode, the answer becomes 3:
class MyKeys {
    Integer key;

    MyKeys(Integer k) {
        this.key = k;
    }

    @Override
    public boolean equals(Object o) {
        return ((MyKeys) o).key == this.key;
    }

    @Override
    public int hashCode() {
        return key * key;
    }
}
Contract of equals and hashCode:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result. If you only override equals() and not hashCode(), your class violates this contract.
The problem you will have is with collections where the uniqueness of elements is determined by both .equals() and .hashCode(), for instance keys in a HashMap.
If you have two objects which are .equals(), but have different hash codes, you lose!
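The effect of the contract can be sketched with two variants of a key class. Note the assumptions: both classes below are hypothetical, and they compare the Integer values with Objects.equals() instead of the == used in the exam question, so the fully implemented variant yields 2 entries rather than 3.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical demo class comparing a broken and a correct key class.
public class ContractDemo {
    // Overrides equals() only, so it violates the equals/hashCode contract.
    static final class EqualsOnlyKey {
        final Integer key;
        EqualsOnlyKey(Integer key) { this.key = key; }

        @Override
        public boolean equals(Object o) {
            return o instanceof EqualsOnlyKey && Objects.equals(((EqualsOnlyKey) o).key, key);
        }
    }

    // Overrides both methods consistently.
    static final class FullKey {
        final Integer key;
        FullKey(Integer key) { this.key = key; }

        @Override
        public boolean equals(Object o) {
            return o instanceof FullKey && Objects.equals(((FullKey) o).key, key);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(key);
        }
    }

    static int sizeWithEqualsOnlyKeys() {
        Map<EqualsOnlyKey, String> m = new HashMap<>();
        m.put(new EqualsOnlyKey(1), "car");
        m.put(new EqualsOnlyKey(2), "boat");
        m.put(new EqualsOnlyKey(1), "plane");
        m.put(new EqualsOnlyKey(2), "bus");
        return m.size();
    }

    static int sizeWithFullKeys() {
        Map<FullKey, String> m = new HashMap<>();
        m.put(new FullKey(1), "car");
        m.put(new FullKey(2), "boat");
        m.put(new FullKey(1), "plane"); // replaces "car"
        m.put(new FullKey(2), "bus");   // replaces "boat"
        return m.size();
    }

    public static void main(String[] args) {
        // Typically 4, because identity hash codes almost never coincide.
        System.out.println(sizeWithEqualsOnlyKeys());
        System.out.println(sizeWithFullKeys()); // 2
    }
}
```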

Let's keep this simple, since this is for a Java certification.
Notice that MyKeys doesn't override hashCode, so you know the question will hinge on it. I usually try to remember only one thing about Object.hashCode:
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)
In short, every instance will in practice have a distinct hash code, meaning that with this code every new MyKeys adds a new pair to the map.
In reality, this is a bit more complex: the method still returns an int, so the risk of collision is still present (an int doesn't provide an infinite number of values). You can see a bit more about this here.
This explains why the map ends up with a size of 4: each key inserted is a different instance.

As others have answered, the result will be 4, because the hashCode method is not overridden.
To explain in more detail: whenever an entry is added to a hash map, the hash code of the key is computed, and it decides which bucket the entry goes into. The two objects m1 and m3 will have different hash codes because hashCode is not overridden (the default behaviour). With different hash codes the keys are never treated as the same key, so a new entry is made.
The equals method, by contrast, is called only when two keys have the same hash code.
The same holds for m2 and m4: the two objects have different hash codes, hence two different entries, and equals is never called.
Hence, for hash-based collections, it is necessary to override hashCode along with equals.

It becomes clearer when we look at the implementation of HashMap's put method.
// The hash(key) method is called to calculate the hash,
// and putVal() uses that int hash to find the right bucket in the map for the object.
public V put(K key, V value) {
return putVal(hash(key), key, value, false, true);
}
static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
In your code you override only the equals method.
class MyKeys {
Integer key;
MyKeys(Integer k) {
key = k;
}
public boolean equals(Object o) {
return ((MyKeys) o).key == this.key;
}
}
To get the expected output, you need to override both the hashCode() and equals() methods.
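How the spread hash turns into a bucket can be sketched as follows. This assumes the OpenJDK 8+ behaviour of indexing a power-of-two table with (length - 1) & hash; the class BucketDemo and the method bucketIndex are hypothetical names, and a table length of 16 is just the default initial capacity.

```java
// Hypothetical demo class mirroring the hash(Object) method quoted above.
public class BucketDemo {
    // Same bit-spreading step as HashMap.hash(Object).
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // For a table whose length is a power of two, the low bits select the bucket.
    static int bucketIndex(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        System.out.println(bucketIndex(17, 16)); // 1, because Integer 17 hashes to 17
        // Keys with equal hash codes always land in the same bucket,
        // and only then is equals() consulted.
        System.out.println(bucketIndex("Aa", 16) == bucketIndex("BB", 16)); // true
    }
}
```

Two MyKeys instances without a hashCode override produce unrelated hash values here, so equals is never reached for them.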

Get value from a HashMap using a byte[] key [duplicate]

This question already has answers here:
Using a byte array as Map key
(13 answers)
Closed 4 years ago.
I'm trying to add some values to a map as
Map<byte[], byte[]> samplerMap = new HashMap<>();
samplerMap.put(Bytes.toBytes("key"), Bytes.toBytes("value"));
To fetch the value from map,
samplerMap.get(Bytes.toBytes("key"))
When I debug this, I get a null value. Is there any special case when using a byte[] as the key of a map? How can I fix this?

The problem is that byte[] uses Object.hashCode, which is based on the identity of the instance rather than on its contents. Two array instances created by new byte[...] will yield two different hash codes, and hence different keys, so get almost always returns null.
Furthermore, equals does not compare contents either, so byte[] is not an option as a key.
You could use the string itself as the key, since you are actually doing "key".getBytes(StandardCharsets.UTF_8).
Or create wrapper class:
import java.util.Arrays;

public class ByteArray {
    public final byte[] bytes;

    public ByteArray(byte[] bytes) {
        this.bytes = bytes;
    }

    @Override
    public boolean equals(Object rhs) {
        // instanceof is already false for null
        return rhs instanceof ByteArray
                && Arrays.equals(bytes, ((ByteArray) rhs).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }
}

You can't use an array as a key of a HashMap since arrays don't override the default implementation of equals and hashCode. Therefore two different array instances that contain the exact same elements will be considered as different keys.
You can use a List<Byte> as key instead.
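A minimal sketch of the List<Byte> approach might look like this. The class ByteListKeyDemo and the helper toKey are hypothetical names; the conversion boxes every byte, so it costs more memory than a wrapper around the raw array.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class: converts byte[] to an unmodifiable List<Byte> key.
public class ByteListKeyDemo {
    static List<Byte> toKey(byte[] bytes) {
        List<Byte> list = new ArrayList<>(bytes.length);
        for (byte b : bytes) {
            list.add(b); // autoboxed to Byte
        }
        return Collections.unmodifiableList(list);
    }

    static String demoLookup() {
        Map<List<Byte>, String> samplerMap = new HashMap<>();
        samplerMap.put(toKey("key".getBytes(StandardCharsets.UTF_8)), "value");
        // A freshly built list with the same bytes is an equal key.
        return samplerMap.get(toKey("key".getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) {
        System.out.println(demoLookup()); // value
    }
}
```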

HashSet behavior when changing field value

I just did the following code:
import java.util.HashSet;
import java.util.Set;
public class MyClass {
private static class MyObject {
private int field;
public int getField() {
return field;
}
public void setField(int aField) {
field = aField;
}
@Override
public boolean equals(Object other) {
boolean result = false;
if (other != null && other instanceof MyObject) {
MyObject that = (MyObject) other;
result = (this.getField() == that.getField());
}
return result;
}
@Override
public int hashCode() {
return field;
}
}
public static void main(String[] args) {
Set<MyObject> mySet = new HashSet<MyObject>();
MyObject object = new MyObject();
object.setField(3);
mySet.add(object);
object.setField(5);
System.out.println(mySet.contains(object));
MyObject firstElement = mySet.iterator().next();
System.out.println("The set object: " + firstElement + " the object itself: " + object);
}
}
It prints:
false
The set object: MyClass$MyObject@5 the object itself: MyClass$MyObject@5
Basically, this means that the object is not considered to be in the set, while the instance itself apparently is in the set. So if I insert an object into a set and then change the value of a field that participates in the hashCode calculation, the HashSet will cease working as expected. Isn't this too big a source of possible errors? How can someone defend against such cases?

Below is the quote from Set API. It explains everything.
Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set. A special case of this prohibition is that it is not permissible for a set to contain itself as an element.
http://docs.oracle.com/javase/7/docs/api/java/util/Set.html

HashSet is implemented on top of HashMap.
HashMap caches the hash code of each key when the entry is inserted. If the key's hashCode later changes, a lookup can fail even when the new hash code happens to map to the same bucket as the original, because the cached hash is compared before equality is checked.
See the line:
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
And if the new hash code maps to a different bucket than the one where the object was originally stored, the lookup obviously can't find it either.
So even though it is the same object, the HashSet can't find it once its hashCode has changed. Hope it helps.
So a key for a HashMap, or an object that you put into a HashSet, should be immutable or effectively immutable.
@fazomisiek
public HashSet() {
    map = new HashMap<E,Object>();
}
If you check the source of HashSet, you can see that it is backed by a HashMap.

This problem is just a limitation of the implementation of java.util.HashSet and the underlying java.util.HashMap. Fundamentally, you're trading off the ability to modify elements in the set for faster insert/lookup performance - it's just part of the contract of using a hash set / map data structure.
If you can't guarantee everybody will remember they can't modify the objects in the set, the only way to absolutely guard against this happening is to only insert immutable objects into the set in the first place.
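When immutable elements are not an option, a common workaround is to remove the element, change it, and add it back. The sketch below contrasts that with mutating in place; the class MutableElementDemo and its nested Item class are hypothetical, modelled on MyObject from the question.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class: in-place mutation versus remove, mutate, re-add.
public class MutableElementDemo {
    static final class Item {
        private int field;
        Item(int field) { this.field = field; }
        void setField(int field) { this.field = field; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Item && ((Item) o).field == field;
        }

        @Override
        public int hashCode() {
            return field;
        }
    }

    // The element is changed while it sits in the set: lookup fails.
    static boolean mutateInPlace() {
        Set<Item> set = new HashSet<>();
        Item item = new Item(3);
        set.add(item);
        item.setField(5);
        return set.contains(item);
    }

    // The element is taken out before the change and re-inserted afterwards.
    static boolean removeMutateReAdd() {
        Set<Item> set = new HashSet<>();
        Item item = new Item(3);
        set.add(item);
        set.remove(item);
        item.setField(5);
        set.add(item);
        return set.contains(item);
    }

    public static void main(String[] args) {
        System.out.println(mutateInPlace());     // false
        System.out.println(removeMutateReAdd()); // true
    }
}
```

This only helps if every piece of code that modifies the element follows the same discipline, which is why immutable elements remain the safer choice.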

How to make HashMap work with Arrays as key?

I am using boolean arrays as keys for a HashMap. But the problem is HashMap fails to get the keys when a different array is passed as key, although the elements are same. (As they are different objects).
How can I make it work with arrays as keys ?
Here is the code :
public class main {
public static HashMap<boolean[], Integer> h;
public static void main(String[] args){
boolean[] a = {false, false};
h = new HashMap<boolean[], Integer>();
h.put(a, 1);
if(h.containsKey(a)) System.out.println("Found a");
boolean[] t = {false, false};
if(h.containsKey(t)) System.out.println("Found t");
else System.out.println("Couldn't find t");
}
}
Both the arrays a and t contain the same elements, but HashMap doesn't return anything for t.
How do I make it work ?

You cannot do it this way. t and a will have different hashCode() values because arrays inherit hashCode() from Object, whose default implementation is based on the object's identity (its reference) rather than on its contents. Hence the hash code for arrays is reference-dependent, which means that you will get a different hash-code value for t and a. Furthermore, equals will not work for the two arrays because that is also based on the reference.
The only way you can do this is to create a custom class that keeps the boolean array as an internal member. Then you need to override equals and hashCode in such a way that ensures that instances that contain arrays with identical values are equal and also have the same hash-code.
An easier option might be to use List<Boolean> as the key. Per the documentation the hashCode() implementation for List is defined as:
int hashCode = 1;
Iterator<E> i = list.iterator();
while (i.hasNext()) {
E obj = i.next();
hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
}
As you can see, it depends on the values inside your list and not the reference, and so this should work for you.
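A short check of this claim, written as a sketch: the loop from the documentation is reimplemented by hand and compared with the real List.hashCode(). The class ListHashDemo and the method manualHash are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: content-based hashing of lists.
public class ListHashDemo {
    // Hand-written version of the documented List.hashCode() algorithm.
    static int manualHash(List<?> list) {
        int hashCode = 1;
        for (Object obj : list) {
            hashCode = 31 * hashCode + (obj == null ? 0 : obj.hashCode());
        }
        return hashCode;
    }

    public static void main(String[] args) {
        List<Boolean> a = Arrays.asList(false, false);
        List<Boolean> t = Arrays.asList(false, false);
        System.out.println(manualHash(a) == a.hashCode()); // true
        System.out.println(a.hashCode() == t.hashCode());  // true
        System.out.println(a.equals(t));                   // true
    }
}
```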

It is not possible to do this with arrays, as any two different arrays don't compare equal, even if they have the same elements.
You need to use a container class as the key, for example ArrayList<Boolean> (or simply List<Boolean>). Perhaps BitSet would be even more appropriate.
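The BitSet idea could be sketched like this; the class BitSetKeyDemo and the helper toBitSet are hypothetical names. One caveat to keep in mind: a BitSet only records which bits are set, so trailing false values are lost and arrays of different lengths can map to equal keys.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: boolean[] converted to a BitSet key.
public class BitSetKeyDemo {
    static BitSet toBitSet(boolean[] flags) {
        BitSet bits = new BitSet(flags.length);
        for (int i = 0; i < flags.length; i++) {
            if (flags[i]) {
                bits.set(i);
            }
        }
        return bits;
    }

    static boolean demoContains() {
        Map<BitSet, Integer> h = new HashMap<>();
        h.put(toBitSet(new boolean[] { true, false }), 1);
        // A different array with the same content yields an equal key.
        return h.containsKey(toBitSet(new boolean[] { true, false }));
    }

    public static void main(String[] args) {
        System.out.println(demoContains()); // true
        // Caveat: the array length is not part of the key.
        System.out.println(toBitSet(new boolean[] { true })
                .equals(toBitSet(new boolean[] { true, false }))); // true
    }
}
```

BitSet is also mutable, so a key must not be modified after it has been put into the map.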

Map implementations rely on the key's equals and hashCode methods. Arrays in Java extend Object directly, so they use the default equals and hashCode of Object, which are based only on identity.
If I were you, I would create a class Key:
class Key {
    private final boolean flag1;
    private final boolean flag2;

    public Key(boolean flag1, boolean flag2) {
        this.flag1 = flag1;
        this.flag2 = flag2;
    }

    @Override
    public boolean equals(Object object) {
        if (!(object instanceof Key)) {
            return false;
        }
        Key otherKey = (Key) object;
        return this.flag1 == otherKey.flag1 && this.flag2 == otherKey.flag2;
    }

    @Override
    public int hashCode() {
        int result = 17; // any prime number
        result = 31 * result + Boolean.valueOf(this.flag1).hashCode();
        result = 31 * result + Boolean.valueOf(this.flag2).hashCode();
        return result;
    }
}
After that, you can use your key with Map:
Map<Key, Integer> map = new HashMap<>();
Key firstKey = new Key(false, false);
map.put(firstKey, 1);
Key secondKey = new Key(false, false); // same key, different instance
int result = map.get(secondKey); // --> result will be 1
Reference:
Java hash code from one field

Problems
As others have said, Java arrays inherit .hashCode() and .equals() from Object, which base them on the identity of the array (typically a hash of its address), completely ignoring its contents. The only way to fix this is to wrap the array in an object that implements these methods based on the contents of the array. This is one reason why Joshua Bloch wrote Item 25: "Prefer lists to arrays." Java provides several classes that do this, or you can write your own using Arrays.hashCode() and Arrays.equals(), which contain correct and efficient implementations of those methods. Too bad they aren't the default implementations!
Whenever practical, use a deeply unmodifiable (or immutable) class for the keys to any hash-based collection. If you modify an array (or other mutable object) after storing it as a key in a hashtable, it will almost certainly fail future .get() or .contains() tests in that hashtable. See also Are mutable hashmap keys a dangerous practice?
Specific Solution
// Also works with primitive: (boolean... items)
public static List<Boolean> bList(Boolean... items) {
List<Boolean> mutableList = new ArrayList<>();
for (Boolean item : items) {
mutableList.add(item);
}
return Collections.unmodifiableList(mutableList);
}
ArrayList implements .equals() and .hashCode() (correctly and efficiently) based on its contents, so that every bList(false, false) has the same hashcode as, and will be equal to every other bList(false, false).
Wrapping it in Collections.unmodifiableList() prevents modification.
Modifying your example to use bList() requires changing just a few declarations and type signatures. It is as clear as, and almost as brief as your original:
public class main {
public static HashMap<List<Boolean>, Integer> h;
public static void main(String[] args){
List<Boolean> a = bList(false, false);
h = new HashMap<>();
h.put(a, 1);
if(h.containsKey(a)) System.out.println("Found a");
List<Boolean> t = bList(false, false);
if(h.containsKey(t)) System.out.println("Found t");
else System.out.println("Couldn't find t");
}
}
Generic Solution
@SafeVarargs
public static <T> List<T> bList(T... items) {
    List<T> mutableList = new ArrayList<>();
    for (T item : items) {
        mutableList.add(item);
    }
    return Collections.unmodifiableList(mutableList);
}
The rest of the above solution is unchanged, but this will leverage Java's built-in type inference to work with any primitive or Object (though I recommend using only with immutable classes).
Library Solution
Instead of bList(), use Google Guava's ImmutableList.of(), or my own Paguro's vec(), or other libraries that provide pre-tested methods like these (plus immutable/unmodifiable collections and more).
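If adding a library is not possible and Java 9 or later is available, the JDK's own List.of() gives a similar result without a helper method. This is a sketch under that assumption, not something the answer above relies on; the class ListOfKeyDemo is a hypothetical name. Unlike bList(), List.of() rejects null elements.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class: unmodifiable JDK lists as map keys (Java 9+).
public class ListOfKeyDemo {
    static boolean demoContains() {
        Map<List<Boolean>, Integer> h = new HashMap<>();
        h.put(List.of(false, false), 1);
        // A separately created list with the same elements is an equal key.
        return h.containsKey(List.of(false, false));
    }

    // Lists from List.of() cannot be modified after creation.
    static boolean rejectsModification() {
        try {
            List.of(false, false).set(0, true);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(demoContains());        // true
        System.out.println(rejectsModification()); // true
    }
}
```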
Inferior Solution
This was my original answer in 2017. I'm leaving it here because someone found it interesting, but I think it's second-rate because Java already contains ArrayList and Collections.unmodifiableList() which work around the problem. Writing your own collection wrapper with .equals() and .hashCode() methods is more work, more error-prone, harder to verify, and therefore harder to read than using what's built-in.
This should work for arrays of any type:
class ArrayHolder<T> {
    private final T[] array;

    @SafeVarargs
    ArrayHolder(T... ts) { array = ts; }

    @Override public int hashCode() { return Arrays.hashCode(array); }

    @Override public boolean equals(Object other) {
        // identity shortcut: compare the holders, not the array with the holder
        if (this == other) { return true; }
        if (! (other instanceof ArrayHolder) ) {
            return false;
        }
        //noinspection unchecked
        return Arrays.equals(array, ((ArrayHolder) other).array);
    }
}
Here is your specific example converted to use ArrayHolder:
// boolean[] a = {false, false};
ArrayHolder<Boolean> a = new ArrayHolder<>(false, false);
// h = new HashMap<boolean[], Integer>();
Map<ArrayHolder<Boolean>, Integer> h = new HashMap<>();
h.put(a, 1);
// if(h.containsKey(a)) System.out.println("Found a");
assertTrue(h.containsKey(a));
// boolean[] t = {false, false};
ArrayHolder<Boolean> t = new ArrayHolder<>(false, false);
// if(h.containsKey(t)) System.out.println("Found t");
assertTrue(h.containsKey(t));
assertFalse(h.containsKey(new ArrayHolder<>(true, false)));
I used Java 8, but I think Java 7 has everything you need for this. I tested hashCode and equals using TestUtils.

You could create a class that contains the array, and implement the hashCode() and equals() methods for that class based on the values:
public class boolarray {
boolean array[];
public boolarray( boolean b[] ) {
array = b;
}
public int hashCode() {
int hash = 0;
for (int i = 0; i < array.length; i++)
if (array[i])
hash += Math.pow(2, i);
return hash;
}
public boolean equals( Object b ) {
if (!(b instanceof boolarray))
return false;
if ( array.length != ((boolarray)b).array.length )
return false;
for (int i = 0; i < array.length; i++ )
if (array[i] != ((boolarray)b).array[i])
return false;
return true;
}
}
You can then use:
boolarray a = new boolarray( new boolean[]{ true, true } );
boolarray b = new boolarray( new boolean[]{ true, true } );
HashMap<boolarray, Integer> map = new HashMap<boolarray, Integer>();
map.put(a, 2);
int c = map.get(b);
System.out.println(c);

This is probably because the equals() method of arrays behaves differently than you expect. You should think about implementing your own collection class and overriding equals() and hashCode().

boolean[] t;
t = a;
If you give this, instead of boolean[] t = {false, false};, then you'll get the desired output.
This is because the Map stores the reference as the key, and in your case, though t has the same values, it doesn't have the same reference as a.
Hence, when you give t=a, it'll work.
It's very similar to this:
String a = "ab";
String b = new String("ab");
System.out.println(a==b); // This will give false.
Both a & b hold the same value, but have different references. Hence, when you try to compare the reference using ==, it gives false.
But if you give, a = b; and then try to compare the reference, you'll get true.

Map uses equals() to test if your keys are the same.
The default implementation of that method in Object tests ==, i.e. reference equality. So, as your two arrays are not the same array, equals always returns false.
You need to make the map call Arrays.equals on the two arrays to check for equality.
You can create an array wrapper class that uses Arrays.equals and then this will work as expected:
public static final class ArrayHolder<T> {
private final T[] t;
public ArrayHolder(T[] t) {
this.t = t;
}
@Override
public int hashCode() {
int hash = 7;
hash = 23 * hash + Arrays.hashCode(this.t);
return hash;
}
@Override
public boolean equals(Object obj) {
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
final ArrayHolder<T> other = (ArrayHolder<T>) obj;
if (!Arrays.equals(this.t, other.t)) {
return false;
}
return true;
}
}
public static void main(String[] args) {
final Map<ArrayHolder<Boolean>, Integer> myMap = new HashMap<>();
myMap.put(new ArrayHolder<>(new Boolean[]{true, true}), 7);
System.out.println(myMap.get(new ArrayHolder<>(new Boolean[]{true, true})));
}

You could use a library that accepts an external hashing and comparing strategy (trove).
class MyHashingStrategy implements HashingStrategy<boolean[]> {
@Override
public int computeHashCode(boolean[] pTableau) {
return Arrays.hashCode(pTableau);
}
@Override
public boolean equals(boolean[] o1, boolean[] o2) {
return Arrays.equals(o1, o2);
}
}
Map<boolean[], T> map = new TCustomHashMap<boolean[],T>(new MyHashingStrategy());
