Equal Objects must have equal hashcodes? - java

Equal objects must have equal hash codes. As I understand it, this statement matters when we intend to use the object in hash-based data structures. It is part of the contract for the hashCode and equals methods in the Java docs. I explored why this is required, looked at the implementation of Hashtable, and found the code below in the put method:
if ((e.hash == hash) && e.key.equals(key))
So I got it: the contract comes from the condition e.hash == hash above. I further tried to explore why Java checks the hash code when comparing two objects for equality. Here is my understanding:
If two equal objects have equal hash codes, they are stored in the same bucket, so a lookup only has to search a single bucket.
It is better to check the hash code before actually calling the equals method, because comparing hash codes is cheaper than calling equals: we only compare two int values, whereas equals may involve comparing object fields. So the hash code provides one extra filter.
Please correct me: are both of the above reasons valid?

Correct, with one small correction: two unequal objects can also have the same hash code.
Not exactly. It's better to check it first, as a filter for non-equal objects, but if you want to be sure the objects are equal, you have to call equals().

You got it wrong. equals just returns a boolean value (two possible values), and needs another object to compare against. hashCode returns an int (2^32 possible values), and only needs the object to be called.
The HashMap tries to distribute all the objects it holds among buckets. When put is called on the map, it has to decide which bucket it will use for the given object. It thus uses hashCode (modulo the number of buckets) to decide which bucket to use. Then, once the bucket is found, it has to check whether the key is already in the map or not. To do this, it compares every object in the bucket with the object to put in the map. And to do this, it uses equals. If the object isn't found, it adds it in the bucket.
hashCode isn't used because it's faster than equals. It's used because it allows distributing keys among a set of buckets. And it's much faster to compute the hashCode once and compare the object with (hopefully) zero, one or two objects in the same bucket than to compare the object with the thousands of objects already stored in the map.
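The "hashCode modulo the number of buckets" step described above can be sketched as follows. This is only an illustration: the class name BucketIndexDemo and the helper bucketIndex are made up for this example, and the real java.util.HashMap additionally mixes the hash bits and resizes its table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the real java.util.HashMap implementation.
class BucketIndexDemo {
    // Maps a key's hash code to a bucket index in the range [0, bucketCount).
    // floorMod is used because hashCode() may be negative.
    static int bucketIndex(Object key, int bucketCount) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    public static void main(String[] args) {
        int bucketCount = 16;
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
        // Each key lands in the bucket chosen by its hash code.
        for (String key : new String[] {"apple", "banana", "cherry", "date"}) {
            buckets.get(bucketIndex(key, bucketCount)).add(key);
        }
        // Print only the non-empty buckets.
        for (int i = 0; i < bucketCount; i++) {
            if (!buckets.get(i).isEmpty()) {
                System.out.println(i + " -> " + buckets.get(i));
            }
        }
    }
}
```

Equal keys always produce the same index, which is why a later lookup only has to search one bucket.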

" I further tried to Exlpore why java is checking Hashcode when comparing two objects for equality". Put method is not just checking for equality, it is trying to first narrow down the bucket and then use the equals. That is why we need to combine HashCode with Equals in case of bucketed collections.
But if your sole intention is to just check equality between two objects, you will never need a hashcode method.
Obj1.equals(Obj2) will never use the hashcode method by default.
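A small sketch can make this visible. The class CountingPoint and its call counter are invented for this example; the point is only that a typical field-by-field equals never touches hashCode, while a hash-based collection does.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical class that counts how often hashCode() is invoked.
class CountingPoint {
    static int hashCodeCalls = 0;

    final int x;
    final int y;

    CountingPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CountingPoint)) return false;
        CountingPoint other = (CountingPoint) o;
        return x == other.x && y == other.y; // compares fields only
    }

    @Override
    public int hashCode() {
        hashCodeCalls++;
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        CountingPoint a = new CountingPoint(1, 2);
        CountingPoint b = new CountingPoint(1, 2);
        System.out.println(a.equals(b));   // equality check alone
        System.out.println(hashCodeCalls); // hashCode was not needed for it
        Set<CountingPoint> set = new HashSet<>();
        set.add(a);                        // the hash-based collection calls hashCode
        System.out.println(hashCodeCalls > 0);
    }
}
```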

It's a general contract so that when we store objects in a hash-based data structure, we can consistently put and get the same object to and from the hash table.
It's a contract we follow so that the put and get operations work smoothly.

Consequences of different hashcodes but same equals for two java objects

I understand that two Java objects should have the same hash code if equals returns true for them, but I wanted to understand the consequences if the hash codes differ while equals returns true, with respect to collections like HashMap, HashSet etc.
Would it only impact performance, or would it also impact the behavior/functionality of those collection classes?
Let's call the objects o1 and o2 where o1.equals(o2) but o1.hashCode() != o2.hashCode()
Consider the following:
Map<Object, String> map = new HashMap<>();
Set<Object> set = new HashSet<>();
map.put(o1, "foo");
set.add(o1);
The following assertions would fail
Assert.assertTrue(map.containsKey(o2));
Assert.assertTrue(set.contains(o2));
The consequences will be unexpected behavior.
If a.equals(b) == true but a.hashCode()!=b.hashCode(), set.add(a) followed by set.contains(b) will most likely return false (assuming set is a HashSet), even though according to equals it should return true. (the reason it's most likely and not a certainty is that two different hash codes still have a chance of being mapped to the same bucket of the HashSet/HashMap, in which case you can still get true).
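The failure described above can be reproduced with a deliberately broken key type. BrokenKey and its forcedHash field are hypothetical, introduced only so the two hash codes can be made to differ on purpose; this is a sketch of the problem, not something to imitate.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical key that violates the contract: equality looks only at id,
// while hashCode returns whatever value was passed to the constructor.
final class BrokenKey {
    private final String id;
    private final int forcedHash;

    BrokenKey(String id, int forcedHash) {
        this.id = id;
        this.forcedHash = forcedHash;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BrokenKey && ((BrokenKey) o).id.equals(id);
    }

    @Override
    public int hashCode() {
        return forcedHash; // deliberately NOT derived from id
    }
}

class BrokenContractDemo {
    // Stores one key in a HashSet and looks it up through another key.
    static boolean foundInSet(BrokenKey stored, BrokenKey probe) {
        Set<BrokenKey> set = new HashSet<>();
        set.add(stored);
        return set.contains(probe);
    }

    // Same experiment with a HashMap.
    static boolean foundInMap(BrokenKey stored, BrokenKey probe) {
        Map<BrokenKey, String> map = new HashMap<>();
        map.put(stored, "foo");
        return map.containsKey(probe);
    }

    public static void main(String[] args) {
        BrokenKey o1 = new BrokenKey("k", 1);
        BrokenKey o2 = new BrokenKey("k", 2);
        System.out.println(o1.equals(o2));      // equal according to equals
        System.out.println(foundInSet(o1, o2)); // lookup via the "equal" key
        System.out.println(foundInMap(o1, o2));
    }
}
```

With hash codes 1 and 2 the two keys fall into different buckets of a default-sized table, so both lookups miss even though equals reports the keys as equal.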
It would break the functionality. If you are looking for an object in a hashmap or hashset, it is using the hash code in order to find it. If the hash code is not consistent, it probably will not be able to find it.
The most basic requirement of a hash code is that two equal objects must have the same hash code. Everything else is secondary.
If two objects are equal, their hashCode methods must always return the same value.
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
It will miss the bucket during the fetch. HashMap is designed to store large amounts of data with a fetch time of O(1) in the best case. To do this it stores key/value pairs under a location derived from the key's hash code (which we generally refer to as a bucket). This technique is called hashing.
So you stored the object in the HashMap under hash code 100, say, and now you are trying to fetch it with a different hash code (say 200), i.e. looking inside a different bucket. Even though your object is inside the HashMap, the map will not be able to retrieve it, because it searches a different bucket (the one for 200).
That is the reason two Java objects should have the same hash code if equals returns true for them.
HashMap/HashSet is meant to help you find the target object in the collection. If two equal objects have different hash codes, you are unlikely to find the object in the hash bucket.

Why can't I just compare the hashCode of two objects in order to find out if they are equal or not?

Why do the equals methods implemented by Eclipse compare each value, wouldn't it be simpler to just compare the hashCodes of both objects?
From what I know:
hashCode always generates the same hash for the same input
So if two objects are equal, they should have the same hash
If objects that are equal have the same hash, I can just check the hash in order to determine if objects are equal or not
edit: Related question, why does one always implement the hashCode when equals is implemented, if the hashCode isn't actually needed for equals?
hashCode always generates the same hash for the same input
Correct.
So if two objects are equal, they should have the same hash
Correct.
If objects that are equal have the same hash, I can just check the hash in order to determine if objects are equal or not
Non sequitur. Objects that are unequal can also have the same hashcode. That is the purpose of a hashcode.
Related question, why does one always implement the hashCode when equals is implemented, if the hashCode isn't actually needed for equals?
Because it is needed for hashing, in HashMap, HashSet, and friends. If you think your object will never be so used, don't override it, and good luck with that.
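The point that unequal objects can share a hash code is easy to check with two short strings. The class and method names below are made up for the example; the collision itself follows from how String.hashCode is specified.

```java
// Hypothetical demo class; "Aa" and "BB" are a well-known String hash collision.
class HashCollisionDemo {
    // True when the two objects collide: same hash code, yet not equal.
    static boolean sameHashButUnequal(Object a, Object b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode()); // 65 * 31 + 97 = 2112
        System.out.println("BB".hashCode()); // 66 * 31 + 66 = 2112
        System.out.println(sameHashButUnequal("Aa", "BB")); // true
    }
}
```

Comparing hash codes alone would wrongly report these two strings as equal, which is why equals still has to compare the actual contents.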
To complement @EJP's answer, here is a perfectly valid, although useless, implementation of .hashCode():
@Override
public int hashCode()
{
    return 42; // The Answer
}
Putting this in very simple terms: while every squirrel is an animal, not every animal is a squirrel. The hashCode is usually used for quick lookup: it should be efficient and it should distribute data uniformly across a lookup table. But a hash function can generate collisions, which is why it shouldn't be used as a means of verifying object equality.
It's all very much dependent on the implementation of hashCode - as you can also see in fge's answer.
As to why it usually needs to be reimplemented when you override equals: they are both used when storing and retrieving objects from collections (for example a HashMap). The hashCode determines the place in the map where the object will be inserted, while equals is used to identify the object inside a collision bucket.
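To see why the constant hashCode shown above is valid but useless, here is a sketch with a hypothetical key type, ConstantHashKey, whose instances all hash to 42. A HashMap still returns correct results, because equals resolves the collisions, but every entry ends up in the same bucket and the hash no longer narrows the search.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key type: equality is by name, but every instance hashes to 42.
final class ConstantHashKey {
    private final String name;

    ConstantHashKey(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ConstantHashKey && ((ConstantHashKey) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return 42; // legal: equal objects trivially share this value
    }
}

class ConstantHashDemo {
    // Fills a map with n distinct keys that all collide on the same hash code.
    static Map<ConstantHashKey, Integer> fill(int n) {
        Map<ConstantHashKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new ConstantHashKey("key" + i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<ConstantHashKey, Integer> map = fill(100);
        System.out.println(map.size());                           // all keys are kept
        System.out.println(map.get(new ConstantHashKey("key7"))); // lookup still correct
    }
}
```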

Hashcode and equals methods contract [duplicate]

I know that when we override the equals() method we need to override hashCode() as well, and the other way around.
But I don't understand why we MUST do that.
In Joshua Bloch's book it is clearly written that we must do that, because when we deal with hash-based collections it is crucial to satisfy the hashCode contract. I accept that, but what if I am not dealing with hash-based collections?
Why is it still required?
Why override equals?
A programmer who compares references to value objects using the equals method expects to find out whether they are logically equivalent, not whether they refer to the same object.
Now, coming to hashCode:
The hash function that produces the hash code should return the same hash code each and every time it is applied to the same or equal objects. In other words, two equal objects must consistently produce the same hash code.
The implementation of hashCode provided by the Object class is not based on logical equivalence.
So if you override equals but not hashCode, then according to you two objects are equal, since they pass the equals() test, but according to their inherited hash codes they are not.
Consequences:
A Set starts allowing duplicates.
Map#get(key) will not return the correct value.
And there are many other consequences.
Data structures, such as HashMap, depend on the contract.
A HashMap achieves magical performance properties by using the hashcode to bucketize entries. Every item that is put in the map that has the same hashcode() value gets placed in the same bucket. These "collisions" are resolved by comparing within the same bucket using equals(). In other words, the hashcode is used to determine the subset of the items in the map that might be equal and in this way quickly eliminate the vast majority of the items from further consideration.
This only works if objects that are equal are placed in the same bucket, which can only be ensured if they have the same hashcode.
NOTE: In practice, the number of collisions is much higher than may be implied above, because the number of buckets used is necessarily much smaller than the number of possible hashcode values.
As per Joshua Bloch's book:
A common source of bugs is the failure to override the hashCode
method. You must override hashCode in every class that overrides
equals. Failure to do so will result in a violation of the general
contract for Object.hashCode, which will prevent your class from
functioning properly in conjunction with all hash-based collections,
including HashMap, HashSet, and Hashtable.
Failing to override hashCode while overriding equals is a violation of the contract of Object.hashCode. But this won't have any impact if you use your objects only in non-hash-based collections.
However, how do you prevent other developers from doing so? Also, if an object is eligible to be an element of a collection, it is better to support all collections; don't have half-baked objects in your project. Otherwise this will fail at some point in the future, and you will be caught for not following the contracts while implementing :)
Because that is the way it is meant to be:
Whenever a.equals(b), then a.hashCode() must be same as b.hashCode().
What issues should be considered when overriding equals and hashCode in Java?
There are use cases where you don't need hashCode(), mostly in self-written scenarios, but you can never be sure, because implementations that use equals() might also rely on hashCode().
This question has been answered many times on SO, but I will still attempt to answer it.
To understand this concept completely, we need to understand the purpose of hashCode and equals, how they are implemented, and what exactly this contract is (that hashCode should also be overridden when equals is overridden).
The equals method is used to determine the equality of objects. For primitive types, it is very easy to determine equality: int 1 is always equal to 1. But the equals method is about the equality of objects, which depends on the instance variables or any other criteria (it depends purely on the implementation, that is, on how you want to compare).
The equals method needs to be overridden if we want a customized comparison. For example, we might say that two books are the same if they have the same title and the same author, or that two books are equal if they have the same ISBN.
The hashCode method returns a hash code value for an object. As far as is reasonably practical, the default implementation in Object returns distinct integers for distinct objects. This integer is tied to the object's identity (historically it was often derived from the memory address), not to its fields.
The default implementation of equals is likewise identity-based: it simply compares references (this == obj). For the book example we need something different.
Also, equal objects must produce the same hash code as long as they are equal; however, unequal objects need not produce distinct hash codes.
If you never use a hash-based collection, you can get away with breaking the contract and not overriding hashCode, because nothing will rely on it. I would still not suggest that: override it anyway, as you may need it in the future when you put those objects in a collection.
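The book example mentioned above could look roughly like this. The class Book and its fields are assumptions made for illustration, using the "equal if same ISBN" rule; hashCode is derived from the same field that equals compares, so the contract holds.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical value class: two books are equal when their ISBNs match.
final class Book {
    private final String isbn;
    private final String title;

    Book(String isbn, String title) {
        this.isbn = isbn;
        this.title = title;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Book)) return false;
        return isbn.equals(((Book) o).isbn); // title is intentionally ignored
    }

    @Override
    public int hashCode() {
        return isbn.hashCode(); // uses only the field that equals uses
    }

    @Override
    public String toString() {
        return title + " (" + isbn + ")";
    }

    public static void main(String[] args) {
        Set<Book> shelf = new HashSet<>();
        shelf.add(new Book("978-0134685991", "Effective Java"));
        shelf.add(new Book("978-0134685991", "Effective Java, 3rd Edition"));
        System.out.println(shelf.size()); // the second add is treated as a duplicate
    }
}
```

If hashCode also mixed in the title, the two books above would be equal but could hash differently, which is exactly the contract violation discussed here.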

Why did java designers impose a mandate that if obj1.equals(obj2) then obj1.hashCode() MUST Be == obj2.hashCode()

Why did java designers impose a mandate that
if obj1.equals(obj2) then
obj1.hashCode() MUST Be == obj2.hashCode()
Because a HashMap uses the following algorithm to find keys quickly:
get the hashCode() of the key in argument
deduce the bucket from this hash code
compare every key in the bucket with the key in argument (using equals()) to find the right one
If two equal objects didn't have the same hash code, the first two steps of the algorithm wouldn't work. And it's those two first steps that make a HashMap very fast (O(1)).
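The three lookup steps above can be sketched with a toy table. ToyHashMap is a made-up, heavily simplified class (fixed bucket count, no resizing, no null keys) and is not how java.util.HashMap is actually written; it only mirrors the algorithm as described.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified hash table illustrating the three lookup steps.
class ToyHashMap<K, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry<K, V>>> buckets = new ArrayList<>();

    ToyHashMap(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    private List<Entry<K, V>> bucketFor(Object key) {
        int hash = key.hashCode();                         // step 1: hash the key
        int index = Math.floorMod(hash, buckets.size());   // step 2: deduce the bucket
        return buckets.get(index);
    }

    void put(K key, V value) {
        List<Entry<K, V>> bucket = bucketFor(key);
        for (Entry<K, V> e : bucket) {
            if (e.key.equals(key)) { // key already present: overwrite
                e.value = value;
                return;
            }
        }
        bucket.add(new Entry<>(key, value));
    }

    V get(Object key) {
        for (Entry<K, V> e : bucketFor(key)) {
            if (e.key.equals(key)) { // step 3: equals() within the bucket only
                return e.value;
            }
        }
        return null; // not found in the only bucket that could hold it
    }

    public static void main(String[] args) {
        ToyHashMap<String, Integer> map = new ToyHashMap<>(8);
        map.put("one", 1);
        map.put("two", 2);
        System.out.println(map.get("one"));
        System.out.println(map.get("three"));
    }
}
```

Note that get never looks outside the bucket chosen in step 2, so an equal key with a different hash code would simply not be found.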
There is no mandate. It is a good practice since this is a required condition if your objects are meant to be used in hash based data structures like HashMap/HashSet etc.
As far as I know that's not baked into the language - you could technically have objects whose equals() method does not check the hashcode but you'll get pretty peculiar results.
In particular if you put a bunch of these objects into a HashMap or HashSet the map/set will use the hashCode() method to determine whether the objects may be duplicates - so you can have a situation where a collection will store 2 objects you've defined as equals (which should never happen) because they're each returning different hashCodes.
Because hashcodes are used to quickly determine if two objects are not equal.
It's a sort of two-step matching to improve performance.
First step: calculate hashCode()
Second step: calculate equals()
That's because if you put your objects as keys in collections like HashMap, your keys are first compared by hashCode(); only if a matching hash code is found does it go on to call equals().
It's like indexing for better search performance.

Why both hashCode() and equals() exist

Why does the Java Object class have two methods, hashCode() and equals()? One of them looks redundant, and it is inherited all the way down to the most derived class.
Why do you think one is redundant? They say different things:
hashCode is "give me some way of efficiently seeing whether two objects are likely to be equal"
equals is "check whether this object is genuinely equal to another"
You definitely need both - although I don't believe they should really be in Object in the first place.
You absolutely need hash codes in order to perform efficient lookups with hash tables - and you absolutely need further equality checks because hashes will collide (there are far more possible strings than hash codes, for example).
First of all, when you override equals() you MUST override hashcode() as well.
Failure to do so
will result in a violation of the general contract for Object.hashCode, which will
prevent your class from functioning properly in conjunction with all hash-based
collections, including HashMap, HashSet, and Hashtable.
Here is the contract, copied from the Object specification [JavaSE6]:
Whenever it is invoked on the same object more than once during an execution of an application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
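The second and third clauses of the contract quoted above can be expressed as a one-line check. HashContractCheck and obeysContract are hypothetical names for this sketch; it tests a single pair of objects and is not a general verification tool.

```java
// Hypothetical helper expressing "equal objects must have equal hash codes".
class HashContractCheck {
    // True unless the pair is equal but reports different hash codes.
    // Unequal objects pass regardless of their hash codes (third clause).
    static boolean obeysContract(Object a, Object b) {
        return !a.equals(b) || a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // Two distinct but equal String instances: hash codes must match.
        System.out.println(obeysContract("abc", new String("abc")));
        // Unequal strings that happen to collide: still allowed.
        System.out.println(obeysContract("Aa", "BB"));
    }
}
```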
The fundamental idea is that by comparing hashcode()s it's quick to check whether two objects are probably equal. If their hashcodes are equal, then the objects probably are equal (not necessarily, but it's a good guess). Then a more profound (and more expensive) check with equals() is performed. This is important to speed up all kind of look-ups (from maps etc).
equals is used to compare objects; hashCode is used to generate a hash value from an object, which is then used by the Java map containers (Hashtable, HashMap etc.).
It's common practice to override them together (if you override hashCode, you need to override equals and vice versa).
