How does Java HashMap store entries internally

Say you have a key class (KeyClass) with overridden equals, hashCode and clone methods. Assume that it has 2 fields, a String (name) and an int (id).
Now you define
KeyClass keyOriginal, keyCopy, keyClone;
keyOriginal = new KeyClass("original", 1);
keyCopy = new KeyClass("original", 1);
keyClone = keyOriginal.clone();
Now
keyOriginal.hashCode() == keyCopy.hashCode() == keyClone.hashCode()
keyOriginal.equals(keyCopy) == true
keyCopy.equals(keyClone) == true
So as far as a HashMap is concerned, keyOriginal, keyCopy and keyClone are indistinguishable.
Now if you put an entry into the HashMap using keyOriginal, you can retrieve it using keyCopy or keyClone, i.e.
map.put(keyOriginal, valueOriginal);
map.get(keyCopy) will return valueOriginal
map.get(keyClone) will return valueOriginal
Additionally, if you mutate the key after you have put it into the map, you cannot retrieve the original value. For example:
keyOriginal.name = "mutated";
keyOriginal.id = 1000;
Now map.get(keyOriginal) will return null
So my question is
When you call map.keySet(), it returns all the keys in the map. How does the HashMap class know the complete list of keys, values and entries stored in the map?
EDIT
So as I understand it, I think it works by making the Entry key as a final variable.
static class Entry<K,V> implements Map.Entry<K,V> {
final K key;
(docjar.com/html/api/java/util/HashMap.java.html). So even if I mutate the key after putting it into the map, the original key is retained. Is my understanding correct? But even if the original key reference is retained, one can still mutate its contents. So if the contents are mutated, and the K,V is still stored in the original location, how does retrieval work?
EDIT
Retrieval will fail if you mutate the key after putting it into the HashMap. Hence it is not recommended to use mutable HashMap keys.

HashMap maintains a table of entries, with references to the associated keys and values, organized according to their hash code. If you mutate a key, then the hash code will change, but the entry in HashMap is still placed in the hash table according to the original hash code. That's why map.get(keyOriginal) will return null.
map.keySet() just iterates over the hash table, returning the key of each entry it has.
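The behaviour described above can be reproduced with a small sketch. MutableKey is a hypothetical, simplified stand-in for KeyClass (a single mutable int field whose hash code is the id itself), chosen so that the bucket arithmetic is easy to follow; it is not the asker's class.

```java
import java.util.HashMap;
import java.util.Map;

public class MutatedKeyDemo {
    // Hypothetical, simplified stand-in for KeyClass: one mutable int field.
    static final class MutableKey {
        int id;
        MutableKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof MutableKey && ((MutableKey) o).id == id;
        }
        @Override public int hashCode() { return Integer.hashCode(id); }
    }

    public static void main(String[] args) {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "valueOriginal");

        key.id = 1000; // mutate the key while it is inside the map

        // The lookup uses the new hash code, but the entry is still filed
        // under the old one, so nothing is found.
        System.out.println(map.get(key));                          // null
        // Iteration walks the whole table, so the entry is still reachable.
        System.out.println(map.keySet().iterator().next() == key); // true
        System.out.println(map.size());                            // 1

        key.id = 1; // restore the original state
        System.out.println(map.get(key));                          // valueOriginal
    }
}
```

Note that the entry never left the map: iteration still sees it, and restoring the key's original state makes it retrievable again.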

If you change the key in a way that does not affect its hashCode or equals result, you are safe. For this reason it is considered best practice to make all fields used in hashCode, equals and compareTo both final and immutable.

Simply put, a HashMap is an object in memory that holds keys and values. Each key is unique (as determined by hashCode and equals), and each key maps to a single value.
In your code example, the value coming out of the map is the same in each case because the keys are equal. After you mutated the key, there is no way to get a value for it, because the map holds no entry stored under the mutated key's hash code.
If you had also added an entry under a key equal to the mutated one, for example:
map.put(new KeyClass("mutated", 1000), valueMutated);
then map.get(keyOriginal) would return valueMutated rather than null after the mutation.

Related

Java map key depending on value

Is it a good practice to have a map key depending on value?
e.g.:
class MyClass {
private String key;
private Object value;
}
And then:
Map<String, MyClass> map = new LinkedHashMap<String, MyClass>();
MyClass a = new MyClass("key1", valueObject);
map.put(a.getKey(), a);
Is it ok? I am forced to have such a class as the value. I thought about using a Set, but I need to get an item based on index (getting by keySet index) and key (where the value's key == the map's key). I also need a fixed size with the possibility of removing the oldest element from this collection.
I think I should ensure that my key and the value's key field will always be the same. How can I achieve it?
Having the keys in the value entities is totally fine, this is how every database creates indices. A good practice would be to have MyClass to be immutable and wrap the whole thing in a new collection class hiding details and preventing inserting values to the wrong keys. This is the way to ensure that key == value.key
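One possible shape for such a wrapper is sketched below. The class name KeyedItems, its methods and the maxSize parameter are assumptions made for illustration; LinkedHashMap.removeEldestEntry is the standard hook for dropping the oldest entry. Access by index is not covered here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KeyedItems {
    // Immutable value type: its key cannot drift away from the map key.
    public static final class MyClass {
        private final String key;
        private final Object value;
        public MyClass(String key, Object value) {
            this.key = key;
            this.value = value;
        }
        public String getKey() { return key; }
        public Object getValue() { return value; }
    }

    private final Map<String, MyClass> items;

    public KeyedItems(final int maxSize) {
        // Insertion-ordered map that evicts its oldest entry once maxSize is exceeded.
        this.items = new LinkedHashMap<String, MyClass>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, MyClass> eldest) {
                return size() > maxSize;
            }
        };
    }

    // The only way in: the map key is always taken from the item itself.
    public void add(MyClass item) { items.put(item.getKey(), item); }

    public MyClass get(String key) { return items.get(key); }

    public int size() { return items.size(); }
}
```

Because callers never supply the map key separately, an item cannot be stored under the wrong key.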
Your map stores redundant data (the key is stored twice).
You can instead put MyClass.key into the map as the key and MyClass.value as the value.
By definition, a Map maps a key to its value.
So your approach is a bit redundant, as long as you don't need the key for further operations inside MyClass.

Infinite loop when using a key multiple times in HashMap

HashMap falls into an infinite loop.
I am not able to understand why HashMap throws stackoverflow error when the same key is used multiple times.
Code:
import java.util.HashMap;
public class Test {
public static void main(String[] args) {
HashMap hm = new HashMap();
hm.put(hm, "1");
hm.put(hm, "2");
}
}
Error:
Exception in thread "main" java.lang.StackOverflowError
It is not permitted to add a Map to itself as a key. From the javadoc:
A special case of this prohibition is that it is not permissible for a map to contain itself as a key.
The problem is that you are using as key not a standard object (any object with well defined equals and hashCode methods, that is not the map itself), but the map itself.
The problem is on how the hashCode of the HashMap is calculated:
public int hashCode() {
int h = 0;
Iterator<Entry<K,V>> i = entrySet().iterator();
while (i.hasNext())
h += i.next().hashCode();
return h;
}
As you can see, to calculate the hashCode of the map, it iterates over all the entries of the map and calculates the hashCode of each one. Because the key of the map's only entry is the map itself, this becomes a source of infinite recursion.
Replacing the map with another object that can be used as key (with well defined equals and hashCode) will work:
import java.util.HashMap;
public class Test {
public static void main(String[] args) {
HashMap hm = new HashMap();
String key = "k1";
hm.put(key, "1");
hm.put(key, "2");
}
}
The problem is not that the hash map blows the stack when the "same key" is entered twice, but your particular choice of map key: you are adding the hash map to itself.
To explain better: part of the Map contract is that keys must not change in a way that affects their equals (or hashCode, for that matter) methods.
When you added the map to itself as a key, you changed the key (the map) in a way that makes it return a different hashCode than when you first added it.
For more information, this is from the JDK docs for the Map interface:
Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map. A special case of this prohibition is that it is not permissible for a map to contain itself as a key. While it is permissible for a map to contain itself as a value, extreme caution is advised: the equals and hashCode methods are no longer well defined on such a map.
In order to locate a key in the HashMap (which is done whenever you call put or get or containsKey), hashCode method is called for the key.
For HashMap, hashCode() is a function of the hashCode() of all the entries of the Map, and each entry's hashCode is a function of the key's and value's hashCodes. Since your key is the same HashMap instance, computing the hashCode of that key causes an infinite recursion of hashCode method calls, leading to StackOverflowError.
Using a HashMap as a key to a HashMap is a bad idea.
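A small sketch makes the sequence visible: the first put succeeds because the map is still empty when its hash code is computed, and only the second put recurses. Catching StackOverflowError is done here purely for demonstration; the class name SelfKeyDemo is made up.

```java
import java.util.HashMap;
import java.util.Map;

public class SelfKeyDemo {
    public static void main(String[] args) {
        Map<Object, String> hm = new HashMap<>();

        // The map is empty here, so hm.hashCode() is 0 and the put succeeds.
        hm.put(hm, "1");
        System.out.println(hm.size()); // 1

        try {
            // Now hm.hashCode() asks its only entry for a hash code, the entry
            // asks its key (hm itself), and the calls never bottom out.
            hm.put(hm, "2");
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError on second put");
        }
    }
}
```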
You are using the same object (the map itself) as a key, so computing its hash code recurses endlessly,
which results in the stack overflow.
You use the HashMap itself as the key -> that means recursion -> no exit condition -> StackOverflowError.
Just use another key (Long, String, or any other object you want).
And yes, as Seek Addo suggests, add type parameters with <> brackets.

How do HashSet and HashMap work in Java?

I'm a bit confused about the internal implementation of HashSet and HashMap in java.
This is my understanding, so please correct me if I'm wrong:
Neither HashSet nor HashMap allows duplicate elements.
HashSet is backed by a HashMap, so in a HashSet when we call .add(element), we are calling the hashCode() method on the element and internally doing a put(k,v) to the internal HashMap, where the key is the hashCode and the value is the actual object. So if we try to add the same object to the Set, it will see that the hashCode is already there, and then replace the old value by the new one.
But then, this seems inconsistent to me when I read how a HashMap works when storing our own objects as keys in a HashMap.
In this case we must override the hashCode() and equals() methods and make them consistent between each other, because, if we find keys with the same hashCode, they will go to the same bucket, and then to distinguish between all the entries with the same hashCode we have to iterate over the list of entries to call the method equals() on each key and find a match.
So in this case, we allow to have the same hashCode and we create a bucket containing a list for all the objects with the same hashCode, however using a HashSet, if we find already a hashCode, we replace the old value by the new value.
I'm a bit confused, could someone clarify this to me please?
You are correct regarding the behavior of HashMap, but you are wrong about the implementation of HashSet.
HashSet is backed by a HashMap internally, but the element you are adding to the HashSet is used as the key in the backing HashMap. For the value, a dummy value is used. Therefore the HashSet's contains(element) simply calls the backing HashMap's containsKey(element).
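As a rough illustration of that design (not the JDK source), a set can be written as a thin wrapper around a HashMap. SimpleHashSet and its methods are hypothetical names; the elements are the map's keys and one shared dummy object is every value.

```java
import java.util.HashMap;

public class SimpleHashSet<E> {
    // Shared dummy value stored against every element.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<>();

    // Returns true only if the element was not already present.
    public boolean add(E e) { return map.put(e, PRESENT) == null; }

    public boolean contains(Object o) { return map.containsKey(o); }

    public int size() { return map.size(); }

    public static void main(String[] args) {
        SimpleHashSet<String> set = new SimpleHashSet<>();
        System.out.println(set.add("Aa")); // true
        System.out.println(set.add("Aa")); // false: an equal element is already stored
        // "Aa" and "BB" have the same hashCode (2112) but are not equal,
        // so both are kept.
        System.out.println(set.add("BB")); // true
        System.out.println(set.size());    // 2
    }
}
```

The hash code only selects the bucket; whether an element counts as a duplicate is decided by equals, which is why "Aa" and "BB" can coexist.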
The value we insert into the HashSet acts as a key in the backing map object, and for its value Java uses a constant. So in the key-value pairs, all the keys have the same value.
you can refer to this link
https://www.geeksforgeeks.org/hashset-in-java/
HashMap: A HashMap stores data as key-value pairs. When we insert data, it relies internally on three things:
1. hashCode()
2. equals()
3. ==
When we insert an entry, the key's hash code selects the bucket in which the entry is stored. If two keys land in the same bucket, a collision occurs, and the map compares the new key with each key already in the bucket. It first checks the reference with ==, which is true only when both are the same object; otherwise it falls back to equals(), which compares content. If either check succeeds, the keys are the same and the new value replaces the old one; if not, a new entry is created in the bucket alongside the existing ones.
HashSet: A HashSet stores a collection of unique objects. Internally it uses a HashMap: add calls put on that map and stores the element as a key. Because HashMap keys are unique, HashSet elements are unique too. Adding a duplicate does not throw an exception; the set is left unchanged and add returns false. The value stored against every key is the constant PRESENT.
You can observe that the internal HashMap contains the elements of the HashSet as keys and the constant PRESENT as their value.
PRESENT is a constant defined as
private static final Object PRESENT = new Object();

Does a HashMap use a HashSet to store its keys?

I'm wondering if a HashMap uses a HashSet to store its keys. I would guess it does, because a HashMap would correspond with a HashSet, while a TreeMap would correspond with a TreeSet.
I looked at the source code for the HashMap class, and the keySet() method returns an AbstractSet that's implemented by some kind of Iterator.
Additionally...when I write
HashMap map = new HashMap();
if(map.keySet() instanceof HashSet){
System.out.println("true");
}
The above if statement never runs. Now I'm unsure
Could someone explain how the HashMap stores its keys?
You're actually asking two different questions:
1. Does a HashMap use a HashSet to store its keys?
2. Does HashMap.keySet() return a HashSet?
The answer to both questions is no, and for the same reason, but there's no technical reason preventing either 1. or 2. from being true.
A HashSet is actually a wrapper around a HashMap; HashSet has the following member variable:
private transient HashMap<E,Object> map;
It populates a PRESENT sentinel value as the value of the map when an object is added to the set.
Now a HashMap stores its data in an array of Entry objects holding the key-value pairs:
transient Entry<K,V>[] table;
And its keySet() method returns an instance of the inner class KeySet:
public Set<K> keySet() {
Set<K> ks = keySet;
return (ks != null ? ks : (keySet = new KeySet()));
}
private final class KeySet extends AbstractSet<K> {
// minimal Set implementation to access the keys of the map
}
Since KeySet is a private inner class, as far as you should be concerned it is simply an arbitrary Set implementation.
Like I said, there's no reason this has to be the case. You could absolutely implement a Map class that used a HashSet internally, and then have your Map return a HashSet from .keySet(). However this would be inefficient and difficult to code; the existing implementation is both more robust and more efficient than naive Map/Set implementations.
Code snippets taken from Oracle JDK 1.7.0_17. You can view the source of your version of Java inside the src.zip file in your Java install directory.
I'm wondering if a HashMap uses a HashSet to store its keys.
That would not work too well, because a Set only keeps track of the keys. It has no way to store the associated value mapping.
The opposite (using a Map to store Set elements) is possible, though, and this approach is being used:
HashSet is implemented by using a HashMap (with a dummy value for all keys).
The set of keys returned by HashMap#keySet is implemented by a private inner class (HashMap.KeySet extends AbstractSet).
You can study the source for both classes, for example on GrepCode: HashMap and HashSet.
Could someone explain how the HashMap stores its keys?
It uses an array of buckets. Each bucket has a linked list of entries. See also
How does Java HashMap store entries internally
Hashmap and how this works behind the scene
The set that is returned by the keySet is backed by the underlying map only.
As per javadoc
Returns a Set view of the keys contained in this map. The set is backed by the map, so changes to the map are reflected in the set, and vice-versa. If the map is modified while an iteration over the set is in progress (except through the iterator's own remove operation), the results of the iteration are undefined. The set supports element removal, which removes the corresponding mapping from the map, via the Iterator.remove, Set.remove, removeAll, retainAll, and clear operations. It does not support the add or addAll operations.
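The view semantics quoted above can be checked with a short sketch (the class name KeySetViewDemo is made up for the example): the returned set is not a HashSet, it tracks later changes to the map, removal writes through to the map, and add is rejected.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class KeySetViewDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        Set<String> keys = map.keySet();
        System.out.println(keys instanceof HashSet); // false

        // Changes to the map show up in the view obtained earlier.
        map.put("c", 3);
        System.out.println(keys.contains("c"));      // true

        // Removing through the view removes the mapping from the map.
        keys.remove("a");
        System.out.println(map.containsKey("a"));    // false

        try {
            keys.add("d"); // a key without a value makes no sense for a map
        } catch (UnsupportedOperationException e) {
            System.out.println("add is not supported");
        }
    }
}
```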
HashMap stores keys in buckets. Keys that have the same hash code go into the same bucket. When retrieving the value for a key, if more than one key is found in the bucket, then the equals method is used to find the right key and hence the right value.
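The bucket scheme can be sketched as a toy hash table. TinyHashMap is a hypothetical teaching class with a fixed number of buckets and no resizing; the real java.util.HashMap is considerably more elaborate. It only shows how the hash code picks a bucket and how equals picks the entry inside it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class TinyHashMap<K, V> {
    // One node of a bucket's singly linked list.
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    // The hash code decides which bucket a key belongs to.
    private int indexFor(Object key) {
        return (Objects.hashCode(key) & 0x7fffffff) % table.length;
    }

    public V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) { // same key: replace the value
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        table[i] = new Node<>(key, value, table[i]); // new key: prepend to the bucket
        return null;
    }

    public V get(Object key) {
        // Only the one bucket selected by the hash code is searched.
        for (Node<K, V> n = table[indexFor(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) return n.value;
        }
        return null;
    }

    // Collects all keys by walking every bucket of the table.
    public List<K> keys() {
        List<K> result = new ArrayList<>();
        for (Node<K, V> head : table) {
            for (Node<K, V> n = head; n != null; n = n.next) result.add(n.key);
        }
        return result;
    }
}
```

For example, "Aa" and "BB" share the hash code 2112, so both land in the same bucket here, and get tells them apart with equals.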
Answer is: NO.
HashMap.keySet() is a VIEW of the keys contained in this map.
The data of the map is stored in Entry[] table of HashMap.

Java Modifying key object inside map

I am having a problem with a Java Map. I enter an object as a key in the map. Then I modify the key, and the map no longer considers the object to be one of its keys, even though the key stored inside the map has been modified accordingly.
I am working with the object CoreLabel from StanfordNLP but it applies to a general case I guess.
Map <CoreLabel, String> myMap = new HashMap...
CoreLabel key = someCreatedCoreLabel
myMap.put(key, someString)
myMap.get(key) != null ----> TRUE
key.setValue("someValue");
myMap.get(key) != null ----> FALSE
I hope I was clear enough. The question is why is the last statement false? I am not a very experienced programmer but I would expect it to be true. Maybe has something to do with the CoreLabel object?
I check if .equals() still holds, and it actually does
for(CoreLabel token: myMap.keySet()) {
if(key.equals(token))
System.out.println("OK");
}
This is explicitly documented in the Map Javadoc as dangerous and unlikely to work:
Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map. A special case of this prohibition is that it is not permissible for a map to contain itself as a key. While it is permissible for a map to contain itself as a value, extreme caution is advised: the equals and hashCode methods are no longer well defined on such a map.
The problem is that in modifying the value of the key, now the hash code of the key has changed as well. A HashMap will first use the hash code of the key to determine if it exists. The modified hash code didn't exist in the map, so it didn't even get to try the step of using the equals method. That's why it's a bad idea to change your key objects while they're in a HashMap.
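If a key really has to change, one common workaround is to take the entry out, modify the key, and put it back, so that it is filed under the new hash code. The sketch below uses a hypothetical Label class as a stand-in for a mutable key type such as CoreLabel; it makes no claim about how CoreLabel itself implements equals or hashCode.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class RekeyDemo {
    // Hypothetical mutable key, standing in for a class such as CoreLabel.
    static final class Label {
        private String value;
        Label(String value) { this.value = value; }
        void setValue(String value) { this.value = value; }
        @Override public boolean equals(Object o) {
            return o instanceof Label && Objects.equals(((Label) o).value, value);
        }
        @Override public int hashCode() { return Objects.hashCode(value); }
    }

    public static void main(String[] args) {
        Map<Label, String> myMap = new HashMap<>();
        Label key = new Label("before");
        myMap.put(key, "someString");

        // Take the entry out while the key still matches its stored hash code...
        String stored = myMap.remove(key);
        // ...change the key...
        key.setValue("someValue");
        // ...and insert it again so it is filed under the new hash code.
        myMap.put(key, stored);

        System.out.println(myMap.get(key)); // someString
    }
}
```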
