Implementation of containsKey HashMap<> - Java


The whole purpose of containsKey() is to check whether a given key is already in a HashMap. If the map doesn't contain that key, we add it.
But the method's parameter is of type Object, which seems to mean that containsKey() checks whether the given argument (key) has the same memory address as a key that was already entered.
Potential solution:
One solution could be to get some unique data from object1 (the old key) and compare it with object2 (the new key). If they are the same, don't put the new key in the HashMap. However, this would mean containsKey() has no purpose at all. Am I right?
Sorry, I am not ranting, though I probably sound like it. I would like to know the most efficient way to get around this problem.
I will be thankful for any kind of help.

But the method's parameter is of type Object, which seems to mean that containsKey() checks whether the given argument (key) has the same memory address as a key that was already entered.
Wrong. Their equality is checked by comparing their hashCode() values first. Only if the hash values are equal, the objects themselves may be compared (but always using equals(), not ==). So any class where these two methods are implemented properly will work correctly as a key in a HashMap.
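A minimal sketch of the point above, assuming a hypothetical Point key class that is not part of the question. A distinct but equal instance is found by containsKey(), while a key type that inherits the identity-based equals() and hashCode() from Object is found only through the very same instance.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class ContainsKeyDemo {
    // Key class overrides equals() and hashCode(): an equal instance is found.
    static boolean lookupWithEqualKey() {
        Map<Point, String> map = new HashMap<>();
        map.put(new Point(1, 2), "first");
        return map.containsKey(new Point(1, 2)); // different instance, same fields
    }

    // Plain Object keys use identity-based equals(): another instance never matches.
    static boolean lookupWithIdentityKey() {
        Map<Object, String> map = new HashMap<>();
        map.put(new Object(), "first");
        return map.containsKey(new Object());
    }

    public static void main(String[] args) {
        System.out.println(lookupWithEqualKey());    // true
        System.out.println(lookupWithIdentityKey()); // false
    }
}

// Hypothetical key class, used only for illustration.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```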

HashMap.containsKey() first locates the bucket from the key's hashCode(); it does not start with an equality comparison. It then walks the entries in that bucket and, for each entry whose stored hash matches, checks whether the key is the same reference OR is equal according to equals().
This is implemented in the HashMap.getEntry() method (as found in older JDK sources):
/**
 * Returns the entry associated with the specified key in the
 * HashMap. Returns null if the HashMap contains no mapping
 * for the key.
 */
final Entry<K,V> getEntry(Object key) {
    int hash = (key == null) ? 0 : hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}

But the method's parameter is of type Object
Yes, but hashCode() and equals() are invoked on the key's actual runtime type, not on Object itself. That's why it's so important to implement hashCode() properly.
Read: Effective Java, Item 9 (in fact, you should buy and read the whole book).

But the method's parameter is of type Object, which seems to mean that containsKey() checks whether the given argument (key) has the same memory address as a key that was already entered.
This conclusion is wrong. containsKey(Object key) calls the equals() method of the key that is passed in, so if the key's class overrides equals(Object), the lookup resolves according to that class's equivalence criteria. Of course, if the key class has not overridden equals(), it is a bad design to start with.

Related

java.util.HashMap get: does key have to be exactly the same object as what is stored in the HashMap, or can the keys just be "equal"

Suppose I have a HashMap M. I want to call the "get" function on this HashMap, and find the value associated with a given object S. But I don't have an actual reference to the object S, so I create a new object S_new whose contents are identical to the contents of S. If I call M.get(S_new), will that give me the value associated with the key S?

From the documentation for Map#get:
public V get(Object key)
Returns the value to which the specified key is mapped, or null if
this map contains no mapping for the key. More formally, if this map
contains a mapping from a key k to a value v such that (key==null ? k==null : key.equals(k)), then this method returns v; otherwise it
returns null. (There can be at most one such mapping.)
So as long as the parameter you're passing overrides equals in such a way that the map key is seen as equivalent, you can use a different instance to retrieve a value from a map.
Also, as @Eugene and others mentioned, for HashMap you must also override the hashCode method and ensure that the instance you pass returns the same hash code as the stored key. In general, best practice is to ensure that your equality implementation is symmetric (i.e. A.equals(B) <==> B.equals(A)) and that values that are equal have the same hashCode.
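As a small illustration of the question's scenario, here is a sketch that uses List keys (an assumption made for the example; the question does not say what type S is). Lists created by Arrays.asList implement equals() and hashCode() based on their contents, so a freshly built S_new retrieves the value stored under S.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GetWithEqualKeyDemo {
    static String lookup() {
        Map<List<String>, String> m = new HashMap<>();
        List<String> s = Arrays.asList("a", "b");
        m.put(s, "value for S");

        // A distinct instance whose contents are identical to S.
        List<String> sNew = Arrays.asList("a", "b");
        return m.get(sNew);
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // value for S
    }
}
```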

Yes, it will, as long as hashCode() returns the same value for both objects and equals() reports them as equal. Also notice that get does not even require the parameter to be of type K; it takes an Object, so any object that fulfills the hashCode() and equals() conditions will work.

No. Getting a value from a map by key doesn't require the key to be the same object that was used when putting.
The only requirement is that hashCode() and equals() agree for the two objects, which is why it is mandatory to override the hashCode() and equals() methods if you want to use objects of your own class as keys.

Whose .equals() method is called to resolve hash collision in HashMaps?

Every article about hash collisions in HashMap has one thing in common, and my question revolves around it.
Let me explain what I understand about HashMap's internal workings.
Saving two entries (e1, e2) with the same hash code using map.put(k, v):
1) When map.put(k, v) is called, the HashMap calls hashCode() on the key k.
2) It then passes this hash code to its internal static hashing method and gets another hash value.
3) This new hash value is mapped to the index of a bucket.
4) An Entry is added to that bucket.
In case of a hash collision:
1) Same as before: when map.put(k, v) is called, the HashMap calls hashCode() on the key k.
2) Again as before, it passes this hash code to its internal static hashing method and gets another hash value.
3) The new hash value is mapped to the index of a bucket, but now there is a problem: the bucket at this position already holds an entry.
Resolution: since each Entry is actually a node of a simple linked list, the new item with the colliding hash is stored as the next of the existing Entry.
Fetching the entry e2 with map.get(k):
1) A hash is generated from the key, and the static hash method is applied to it again.
2) The bucket is found using the value returned by the static hash method. If there is more than one entry in it, the equals() method comes to the rescue:
the linked list is traversed, calling equals() on each entry until a match is found.
Now my question is: where is this equals() method defined?
I opened the official documentation of HashMap and it doesn't override the .equals() method, so where is it overridden? Or is it the default .equals() from the Object class?

Both hashCode() and equals() methods belong to the class of the key object, not to the hash map.
The methods are defined in the Object class, but it is expected that the objects used as keys in a hash map provide their own implementation for both these methods. Therefore, it's not the default .equals() from Object class, it is the specific .equals() from the actual key class that gets called for collision resolution.
For example, if you use String objects as keys, the overrides of hashCode() and equals() provided by String would be used.
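To make this concrete, here is a short sketch with two String keys that are known to share a hash code: by the documented String.hashCode() formula, "Aa" and "BB" both hash to 2112. They land in the same bucket, and String.equals() is what tells them apart during lookup.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2); // same hashCode() as "Aa", but not equal to it
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        Map<String, Integer> map = build();
        // The key class's own equals() resolves the collision.
        System.out.println(map.get("Aa")); // 1
        System.out.println(map.get("BB")); // 2
    }
}
```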

HashMap in Java. hash.containsKey returns unexpected

I have a problem with HashMap, more specifically with containsKey.
I want to check whether an object exists in my hash. The problem appears when I call this method with two different objects containing exactly the same data, which should have the same hashCode.
Person pers1, pers2;
pers1 = new Person("EU", 22);
pers2 = new Person("EU", 22);
// From the Person class
public int hashCode() {
    return this.getName().hashCode() + age;
}
After inserting pers1 as a key in my hash, calling hash.containsKey(pers1) returns true but hash.containsKey(pers2) returns false. Why, and how could I fix this issue?
Thank you!

The cause of the issue seems to be that you did not override the equals method in the Person class. HashMap needs it to locate the key while searching.
The steps performed while searching for the key are as follows:
1) Call hashCode() on the object (key) to locate the bucket where the key would be placed.
2) Once the bucket is located, find the particular key within it using the equals() method.

containsKey() uses the .equals() method, which you don't seem to override. .hashCode() provides an (ideally) uniform distribution across the hash table; it does not perform any equality comparison (aside from the requirement that two equal objects have the same hash code).
As you can see in the source code:
if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
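A sketch of the fix, assuming Person has just the name and age fields implied by the question's constructor call (the field names and the use of Objects.hash are illustrative choices, not the asker's code). Once equals() is overridden consistently with hashCode(), pers2 is found.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class PersonDemo {
    static boolean containsEqualPerson() {
        Map<Person, String> hash = new HashMap<>();
        Person pers1 = new Person("EU", 22);
        Person pers2 = new Person("EU", 22);
        hash.put(pers1, "some value");
        return hash.containsKey(pers2);
    }

    public static void main(String[] args) {
        System.out.println(containsEqualPerson()); // true
    }
}

// Assumed shape of the asker's class, with equals() added.
class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String getName() {
        return name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        return age == other.age && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Must be computed from the same fields that equals() compares.
        return Objects.hash(name, age);
    }
}
```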

Why Hashtable does not allow null keys or values?

As specified in JDK documentation, Hashtable does not allow null keys or values. HashMap allows one null key and any number of null values. Why is this?

Hashtable is the older class, and its use is generally discouraged. Perhaps they saw the need for a null key, and more importantly - null values, and added it in the HashMap implementation.
HashMap is newer, and has more advanced capabilities, which are basically just an improvement on the Hashtable functionality. When HashMap was created, it was specifically designed to handle null values as keys and handles them as a special case.
Edit
From Hashtable JavaDoc:
To successfully store and retrieve objects from a Hashtable, the
objects used as keys must implement the hashCode method and the equals method.
Since null isn't an object, you can't call .equals() or .hashCode() on it, so the Hashtable can't compute a hash to use it as a key.

The main reason Hashtable and ConcurrentHashMap do not allow null keys or values is the expectation that they will be used in a multi-threaded environment. For a minute, let's assume that null values are allowed. In that case the table's get method has ambiguous behavior: it can return null if the key is not found in the map, or it can return null if the key is found and its value is null. Code that expects null values therefore usually checks whether the key is present, so that it can tell "key absent" apart from "key present with a null value". That pattern breaks in a multi-threaded environment. Consider the code below:
if (map.containsKey(key)) {
    return map.get(key);
} else {
    throw new KeyNotFoundException();
}
Say thread t1 calls containsKey, finds the key, and is ready to return the value whether it is null or not. Before t1 calls map.get, another thread t2 removes that key from the map. Now t1 resumes and returns null. According to the code, however, the correct outcome for t1 is a KeyNotFoundException, because the key has been removed. It returns null anyway, and the expected behavior is broken.
A regular HashMap is assumed to be accessed by a single thread, so there is no possibility of a key being removed between the containsKey check and the get. HashMap can therefore tolerate null values. For Hashtable and ConcurrentHashMap, however, the expectation is clearly that multiple threads will act on the data, so they cannot afford to allow null values and give out an incorrect answer. The same logic applies to keys. A counter-argument is that the containsKey and get steps could also fail for non-null values in Hashtable and ConcurrentHashMap, because another thread can modify the map between the two steps. That is correct, it can happen. But since Hashtable and ConcurrentHashMap do not allow null keys and values, they do not need the containsKey-then-get check in the first place. A caller can get the value directly, knowing that if get returns null, the only possible reason is that the key is not present. The containsKey-then-get check is necessary only for HashMap, because it allows null values and callers must resolve the ambiguity between "key not found" and "value is null".
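The ambiguity described above can be sketched as follows for a single-threaded HashMap. The describe helper and its return strings are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

class NullAmbiguityDemo {
    // get() alone cannot distinguish "absent" from "present with a null value";
    // a second containsKey() call is needed, which is only safe without
    // concurrent modification.
    static String describe(Map<String, String> map, String key) {
        String value = map.get(key);
        if (value != null) {
            return "present";
        }
        return map.containsKey(key) ? "present, null value" : "absent";
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("a", "1");
        map.put("b", null);
        System.out.println(describe(map, "a")); // present
        System.out.println(describe(map, "b")); // present, null value
        System.out.println(describe(map, "c")); // absent
    }
}
```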

The reason is the one given in the accepted answer: Hashtable is old.
However, the use of Hashtable IS NOT discouraged in favor of HashMap in every scenario.
Hashtable is synchronized, so it is THREAD-SAFE. HashMap is not.
Neither Hashtable nor ConcurrentHashMap support null keys or values. HashMap does.
If you want a drop-in replacement that doesn't require anything else than changing the class and works in every scenario, there is none. The most similar option would be ConcurrentHashMap (which is thread safe but doesn't support locking the whole table):
This class is fully interoperable with Hashtable in programs that rely
on its thread safety but not on its synchronization details.
HashMap is a better replacement for single threaded applications or any time synchronization is not a requirement, because of the performance impact synchronization introduces.
Sources:
Hashtable
HashMap
ConcurrentHashMap

So, to conclude:
When you put an element into a Hashtable, both the key and the value are checked or dereferenced. Basically you have something like:
public Object put(Object key, Object value) {
    if (value == null) throw new NullPointerException();
    key.hashCode();
    // some code
}
Hashtable - does not allow null keys
This is because the put(K key, V value) method calls key.hashCode(), which throws a NullPointerException when the key is null.
Hashtable - does not allow null values
This is because the put(K key, V value) method contains the check if (value == null) { throw new NullPointerException(); }
HashMap allows null values because it has no checks like Hashtable's, and it allows exactly one null key. This is done with the help of the putForNullKey method, which stores the value at index 0 of the internal array whenever the key is null.

The default Hashtable implementation has a null check that throws a NullPointerException.
Later on, the Java developers might have realized the importance of null keys (for some default value, etc.) and null values, and that is why HashMap was introduced.
HashMap has a null check for keys: if the key is null, the element is stored in a location for which no hash code is required.

Hashtable is a class that came with the first version of Java. When it was released, Java's engineers tried to discourage the use of null keys, or maybe did not realize their usefulness, so they did not allow them in Hashtable.
The put method that inserts a key-value pair into a Hashtable throws a NullPointerException if the value is null. Since Hashtable is based on a hashing mechanism, a hash is calculated for each key, which throws a NullPointerException if the key is null.
Later, Java's engineers must have realized that null keys and values have their uses, such as representing default cases. So they provided the HashMap class, introduced with the Collections Framework in Java 1.2, with the capability of storing a null key and null values.
The put method that inserts a key-value pair into a HashMap checks for a null key and stores it at the first location of the internal table array. It isn't afraid of null values and does not throw a NullPointerException like Hashtable does.
Now, there can be only one null key as keys have to be unique although we can have multiple null values associated with different keys.
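The difference in behavior is easy to check directly. The sketch below (class and helper names are made up for the example) shows Hashtable rejecting both a null key and a null value, while HashMap keeps a single null key whose value is overwritten on a second put.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    // Runs the action and reports whether it threw a NullPointerException.
    static boolean throwsNpe(Runnable action) {
        try {
            action.run();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    static boolean hashtableRejectsNullKey() {
        Map<String, String> table = new Hashtable<>();
        return throwsNpe(() -> table.put(null, "v"));
    }

    static boolean hashtableRejectsNullValue() {
        Map<String, String> table = new Hashtable<>();
        return throwsNpe(() -> table.put("k", null));
    }

    static Map<String, String> hashMapWithNulls() {
        Map<String, String> map = new HashMap<>();
        map.put(null, "default");
        map.put(null, "replaced"); // same (null) key: the value is overwritten
        map.put("k1", null);       // null values may appear under several keys
        map.put("k2", null);
        return map;
    }

    public static void main(String[] args) {
        System.out.println(hashtableRejectsNullKey());    // true
        System.out.println(hashtableRejectsNullValue());  // true
        System.out.println(hashMapWithNulls().get(null)); // replaced
        System.out.println(hashMapWithNulls().size());    // 3
    }
}
```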

Hashtable is a very old class, dating from JDK 1.0.
To understand this, we first need to look at the comments the author wrote on the class:
"This class implements a hashtable, which maps keys to values. Any non-null object can be used as a key or as a value.
To successfully store and retrieve objects from a hashtable, the objects used as keys must implement the
hashCode method and the equals method."
Hashtable is built on a hashing mechanism, which means that storing any key-value pair requires the hash code of the key object. If the key is null, no hash can be computed and a NullPointerException is thrown. Values are treated similarly:
put throws a NullPointerException if the value is null.
Later on it was realized that null keys and values have their own uses, which is
why one null key and multiple null values are allowed in later classes such as HashMap.
HashMap allows a null key by checking for it explicitly:
if the key is null, the element is stored at index zero of the Entry array. A null key can be used for some default value.
=> Hashtable's methods are synchronized on the table itself.
HashMap treats the null key as a special case in its hash function:
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
Note that the diamond operator (Java 7 and later) infers the type arguments for both classes:
private Map<String,String> hashtable = new Hashtable<>(); // Compiles
private Map<String,String> hashmap = new HashMap<>(); // Compiles

Just as @Jainendra said, Hashtable doesn't allow a null key because put() calls key.hashCode().
But it seems that no one has clearly answered why a null value isn't allowed.
public synchronized V put(K key, V value) {
    // Make sure the value is not null
    if (value == null) {
        throw new NullPointerException();
    }
    // Makes sure the key is not already in the hashtable.
    Entry<?,?> tab[] = table;
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % tab.length;
    @SuppressWarnings("unchecked")
    Entry<K,V> entry = (Entry<K,V>)tab[index];
    for(; entry != null ; entry = entry.next) {
        if ((entry.hash == hash) && entry.key.equals(key)) {
            V old = entry.value;
            entry.value = value;
            return old;
        }
    }
    addEntry(hash, key, value, index);
    return null;
}
The null check in put doesn't explain why a null value is illegal; it just enforces the non-null invariant.
The concrete reason for disallowing null values is that Hashtable calls value.equals() in methods such as contains and remove.

Hashtable does not allow null keys, but HashMap allows one null key and any number of null values. There is a simple explanation for that.
The put() method in HashMap does not call hashCode() when null is passed as the key; the null key is handled as a special case. HashMap puts the null key in bucket 0 and maps it to the passed value.
public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}
As the code shows, put() checks whether the key is null and, if so, calls putForNullKey(value) and returns. putForNullKey creates an entry in the bucket at index 0, which is where the null key always lives.
Hashtable, on the other hand, requires that objects used as keys implement the hashCode method and the equals method. Since null is not an object, it can't implement these methods.


How does Java implement hash tables?

Does anyone know how Java implements its hash tables (HashSet or HashMap)? Given the various types of objects that one may want to put in a hash table, it seems very difficult to come up with a hash function that would work well for all cases.

HashMap and HashSet are very similar. In fact, a HashSet contains a HashMap instance.
A HashMap holds its entries in an array of buckets. The array size is always a power of 2. If you don't specify another value, there are initially 16 buckets.
When you put an entry (key and value) into it, the map picks the bucket from the key's hash code (the hash code is not necessarily the object's memory address, and the bucket index is not computed with a plain modulus). Different entries can collide in the same bucket, in which case they are kept in a list.
Entries are inserted until the load factor is exceeded. This factor is 0.75 by default, and changing it is not recommended unless you are very sure of what you're doing. A load factor of 0.75 means that a HashMap with 16 buckets can hold only 12 entries (16 * 0.75). Beyond that, a new bucket array twice the size of the previous one is created, and all entries are placed into the new array. This process is known as rehashing, and it can be expensive.
Therefore, if you know how many entries will be inserted, a best practice is to construct the HashMap with its final size up front (the argument is the initial number of buckets, so to avoid rehashing it should be at least the expected entry count divided by the load factor):
new HashMap(finalSize);
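The arithmetic behind that advice can be sketched like this, assuming the default load factor of 0.75. The helper names capacityFor and thresholdFor are invented for the example and are not part of the HashMap API.

```java
import java.util.HashMap;
import java.util.Map;

class CapacityDemo {
    static final double DEFAULT_LOAD_FACTOR = 0.75;

    // Smallest initial capacity that holds expectedEntries without a resize.
    static int capacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / DEFAULT_LOAD_FACTOR);
    }

    // Number of entries a table with the given bucket count holds before resizing.
    static int thresholdFor(int buckets) {
        return (int) (buckets * DEFAULT_LOAD_FACTOR);
    }

    public static void main(String[] args) {
        System.out.println(thresholdFor(16));  // 12
        System.out.println(capacityFor(12));   // 16
        System.out.println(capacityFor(100));  // 134

        // HashMap rounds the requested capacity up to a power of two internally.
        Map<String, Integer> presized = new HashMap<>(capacityFor(100));
        for (int i = 0; i < 100; i++) {
            presized.put("key" + i, i);
        }
        System.out.println(presized.size());   // 100
    }
}
```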

You can check the source of HashMap, for example.

Java depends on each class's implementation of the hashCode() method to distribute the objects evenly. Obviously, a bad hashCode() method will result in performance problems for large hash tables. If a class does not provide a hashCode() method, the default in the current implementation is to return some function (i.e. a hash) of the object's address in memory. Quoting from the API doc:
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java(TM) programming language.)

There are two general ways to implement a HashMap. The difference is how one deals with collisions.
The first method, which is the one Java uses, makes every bucket in the HashMap hold a singly linked list. To accomplish this, each bucket contains an Entry, which caches the hashCode and has a pointer to the key, a pointer to the value, and a pointer to the next entry. When a collision occurs in Java, another entry is added to the list.
The other method for handling collisions is to simply put the item into the next empty bucket. The advantage of this method is that it requires less space; however, it complicates removals: if the bucket following the removed item is not empty, one has to check whether that item is in the right or wrong bucket, and shift it if it originally collided with the item being removed.
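The first approach (separate chaining) can be sketched in a few lines. This is a deliberately simplified, hypothetical TinyChainedMap, not the JDK's code: it has a fixed 16 buckets, never resizes, and does not accept null keys.

```java
class ChainedMapDemo {
    public static void main(String[] args) {
        TinyChainedMap<String, Integer> map = new TinyChainedMap<>();
        map.put("Aa", 1);
        map.put("BB", 2); // "Aa" and "BB" share a hash code, so they share a bucket
        System.out.println(map.get("Aa")); // 1
        System.out.println(map.get("BB")); // 2
    }
}

// Hypothetical minimal map for illustration only.
class TinyChainedMap<K, V> {
    // One node of a bucket's singly linked list; caches the key's hash.
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    // Mixes high bits into low bits, since only the low bits select the bucket.
    private static int spread(int h) {
        return h ^ (h >>> 16);
    }

    V put(K key, V value) {
        int hash = spread(key.hashCode());
        int index = hash & (table.length - 1); // works because length is a power of two
        for (Node<K, V> n = table[index]; n != null; n = n.next) {
            if (n.hash == hash && (n.key == key || key.equals(n.key))) {
                V old = n.value;
                n.value = value; // existing key: replace the value
                return old;
            }
        }
        table[index] = new Node<>(hash, key, value, table[index]); // collision: chain it
        return null;
    }

    V get(Object key) {
        int hash = spread(key.hashCode());
        for (Node<K, V> n = table[hash & (table.length - 1)]; n != null; n = n.next) {
            if (n.hash == hash && (n.key == key || key.equals(n.key))) {
                return n.value;
            }
        }
        return null;
    }
}
```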

In my own words:
An Entry object is created to hold the references to the key and the value.
The HashMap has an array of Entry objects.
The index for a given entry is derived from the hash returned by key.hashCode().
If there is a collision (two keys map to the same index), the new entry is linked through the .next attribute of the existing entry.
That's how two objects with the same hash can be stored in the collection.
From this answer we get:
public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}
Let me know if I got something wrong.
