I want to know how set store unique data not duplicate value in java?Is there any mechanism the set folow when it store data
java.util.Set#add(E) internally calls jav.util.Map#put(K,V) passing the element you try to add as key, where It checks if there is an element in the current set which has the same hashcode as the key's hashcode(where key is the element you try to insert) and it also checks for equality using equals() method.It only add's to the Set if the hashcode or the equal test fail.If both hashcode and equal test pass then it simply replaces the old value with the current value. Below is the source of both Set#add() and Map#put().
Set#add(E)
public boolean add(E e) {
217 return map.put(e, PRESENT)==null;
218 }
Map#put(K,V)
public V put(K key, V value) {
387 if (key == null)
388 return putForNullKey(value);
389 int hash = hash(key.hashCode());
390 int i = indexFor(hash, table.length);
391 for (Entry<K,V> e = table[i]; e != null; e = e.next) {
392 Object k;
393 if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
394 V oldValue = e.value;
395 e.value = value;
396 e.recordAccess(this);
397 return oldValue;
398 }
399 }
400
401 modCount++;
402 addEntry(hash, key, value, i);
403 return null;
404 }
Related
As we know, null is not allowed in Hashtable.
But when I checked the source code of Hashtable (jdk 1.8).
I only saw the check of value and couldn't find the key check.
Here is the source code below of the put method:
public synchronized V put(K key, V value) {
// Make sure the value is not null
if (value == null) {
throw new NullPointerException();
}
// Makes sure the key is not already in the hashtable.
Entry<?,?> tab[] = table;
int hash = key.hashCode();
int index = (hash & 0x7FFFFFFF) % tab.length;
#SuppressWarnings("unchecked")
Entry<K,V> entry = (Entry<K,V>)tab[index];
for(; entry != null ; entry = entry.next) {
if ((entry.hash == hash) && entry.key.equals(key)) {
V old = entry.value;
entry.value = value;
return old;
}
}
addEntry(hash, key, value, index);
return null;
}
The key check is here:
int hash = key.hashCode();
This will throw a NullPointerException if the key is null.
public V put(K key, V value) {
if (table == EMPTY_TABLE) {
inflateTable(threshold);
}
if (key == null)
return putForNullKey(value);
int hash = hash(key);
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
I am trying to understand the HashMap implemintation. I understood everything except this line - Object k;
Please, explain how this Object k appears??
In that implementation of a HashMap, the data structure was backed by an array of linked lists of entries. These entries have a key and a value.
That variable k is used to store the key for each entry while iterating over the linked list bucket. If it's found to be equal (reference and then value equality) to the key with which you're trying to insert a value, then that value replaces the old.
When I run this code why only hashCode() is called not equals method while my hashCode() implementation generate same hashCode for both entries to HashSet?
import java.util.HashSet;
public class Test1 {
public static void main(String[] args) {
Student st=new Student(89);
HashSet st1=new HashSet();
st1.add(st);
st1.add(st);
System.out.println("Ho size="+st1.size());
}
}
class Student{
private int name;
private int ID;
public Student(int iD) {
super();
this.ID = iD;
}
#Override
public int hashCode() {
System.out.println("Hello-hashcode");
return ID;
}
#Override
public boolean equals(Object obj) {
System.out.println("Hello-equals");
if(obj instanceof Student){
if(this.ID==((Student)obj).ID){
return true;
}
else{
return false;
}
}
return false;
}
}
The output for this is:
Hello-hashcode
Hello-hashcode
Ho size=1
The hash set checks reference equality first, and if that passes, it skips the .equals call. This is an optimization and works because the contract of equals specifies that if a == b then a.equals(b).
I attached the source code below, with this check highlighted.
If you instead add two equal elements that are not the same reference, you get the effect you were expecting:
HashSet st1=new HashSet();
st1.add(new Student(89));
st1.add(new Student(89));
System.out.println("Ho size="+st1.size());
results in
$ java Test1
Hello-hashcode
Hello-hashcode
Hello-equals
Ho size=1
Here's the source code from OpenJDK 7, with equality optimization indicated (from HashMap, the underlying implementation of HashSet):
public V put(K key, V value) {
if (key == null)
return putForNullKey(value);
int hash = hash(key.hashCode());
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
// v-- HERE
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
A HashSet uses a HashMap as its backing mechanism for the set. Normally, we would expect that hashCode and equals would be called to ensure that there are no duplicates. However, the put method (which calls a private putVal method to do the actual operation) makes an optimization in the source code:
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
If the hash codes match, it first checks to see if the keys are the same before calling equals. You are passing the same Student object, so they are already ==, so the || operator short-circuits, and equals is never called.
If you passed in a different Student object but with the same ID, then == would return false and equals would be called.
Looking through the source code of HashSet it is using HashMap for all of its operations and the add method performs put(element, SOME_CONSTANT_OBJECT). Here is the source code for the put method for JDK 1.6.0_17 :
public V put(K key, V value) {
if (key == null)
return putForNullKey(value);
int hash = hash(key.hashCode());
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
as you can see it performs a == comparison before using the equals method. Since you are adding the same instance of an object twice the == returns true and the equals method is never called.
Equals is always called after the hashCode method in a java hashed
collection while adding and removing elements. The reason being, if
there is an element already at the specified bucket, then JVM checks
whether it is the same element which it is trying to put.
hashcode() and equals() method
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
I found that some code is changed for null keys in class HashMap in JDK 1.6 or above version compared to the previous JDK version, like 1.5.
In JDK1.5, a static final Object named NULL_KEY is defined: static final Object NULL_KEY = new Object();
Methods, including maskNull, unmaskNull, get and put etc, will use this object.
See
static final Object NULL_KEY = new Object();
static <T> T maskNull(T key) {
return key == null ? (T)NULL_KEY : key;
}
static <T> T unmaskNull(T key) {
return (key == NULL_KEY ? null : key);
}
public V get(Object key) {
Object k = maskNull(key);
int hash = hash(k);
int i = indexFor(hash, table.length);
Entry<K,V> e = table[i];
while (true) {
if (e == null)
return null;
if (e.hash == hash && eq(k, e.key))
return e.value;
e = e.next;
}
}
public V put(K key, V value) {
K k = maskNull(key);
int hash = hash(k);
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
if (e.hash == hash && eq(k, e.key)) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, k, value, i);
return null;
}
However, such Object (NULL_KEY) is not used in JDK 1.6 or above version.
Instead, two new methods named getForNullKey() and putForNullKey(value) is added, which are applied in get and put method as well.
See the source code as follows:
public V get(Object key) {
if (key == null)
return getForNullKey();
Entry<K,V> entry = getEntry(key);
return null == entry ? null : entry.getValue();
}
private V getForNullKey() {
for (Entry<K,V> e = table[0]; e != null; e = e.next) {
if (e.key == null)
return e.value;
}
return null;
}
public V put(K key, V value) {
if (key == null)
return putForNullKey(value);
int hash = hash(key);
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
/**
* Offloaded version of put for null keys
*/
private V putForNullKey(V value) {
for (Entry<K,V> e = table[0]; e != null; e = e.next) {
if (e.key == null) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(0, null, value, 0);
return null;
}
Change always has its reason for changing, such as improving the performance etc. Please help me out with the following 2 question
Q#1 ==> Why this change is made, is there some scenario that the null keys of HashMap implemented in JDK 1.5 encouneters issue ?
Q#2 ==> What is the advantage of null keys mechanism change of HashMap in JDK 1.6 or above version?
Documentation for private V getForNullKey() says
Offloaded version of get() to look up null keys. Null keys map to
index 0. This null case is split out into separate methods for the
sake of performance in the two most commonly used operations (get and
put), but incorporated with conditionals in others.
Since HashMap and HashSet allows null objects, what would be the hash value of these 'null' objects? How it is being calculated/handled in java?
The HashMap built into openJDK just puts the null reference into the first bucket in the array that it uses to hold entries
409 private V putForNullKey(V value) {
410 for (Entry<K,V> e = table[0]; e != null; e = e.next) {
411 if (e.key == null) {
412 V oldValue = e.value;
413 e.value = value;
414 e.recordAccess(this);
415 return oldValue;
416 }
417 }
418 modCount++;
419 addEntry(0, null, value, 0);
420 return null;
421 }
For HashMap for example, it is just a special case. See the source code below that I have taken from JDK 1.6:
public V put(K key, V value) {
if (key == null)
return putForNullKey(value);
int hash = hash(key.hashCode());
...
}
/**
* Offloaded version of put for null keys
*/
private V putForNullKey(V value) {
for (Entry<K,V> e = table[0]; e != null; e = e.next) {
if (e.key == null) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(0, null, value, 0);
return null;
}