what is the purpose of overriding hashcode in java object? [duplicate]

what is the purpose of overriding hashcode in java object? [duplicate] - java

This question already has answers here:
What issues should be considered when overriding equals and hashCode in Java?
(11 answers)
Closed 9 years ago.
I know multiple object having same hashcode in java objects.It doesn't make any problem at all.So, What is the purpose of overriding hashcode in java...
In which situation it is advisable to do override the hashcode in java?

In which situation it is advisable to do override the hashcode in java?
When you override equals, basically. It means that collections which are hash-based (e.g. HashMap, HashSet) can very quickly find a set of candidate objects which will be equal to the one you're looking for (or trying to add, or whatever). If you have a large collection, you split it into buckets by hash code. When you're trying to look something up, you find the hash code of the object you've been passed, and look for objects with the same hash code within the relevant bucket. Then for each object with the exact same hash code, you call equals to see whether the two objects are really the same.
See the Wikipedia article on hash tables for more information.
EDIT: A quick note on the choice of hash codes...
It's always valid to just override hashCode and return some constant (the same for every call) regardless of the contents of the object. However, at that point you lose all benefit of the hash code - basically a hash-based container will think that any instance of your type might be equal to any other one, so searching for one will consist of O(N) calls to equals.
On the other end of the scale, if you don't implement hashCode correctly and return a different value for calls to equal objects (or two calls on the same object twice) you won't be able to find the object when you search for it - the different hash codes will rule out equality so equals will never even be called.
Then there's the aspect of mutability - it's generally a bad idea for equals and hashCode to use mutable aspects of an object: if you mutate an object in a hash-changing way after you've inserted it into a hash-based collection, you won't be able to find it again because the hash at the time of insertion will no longer be correct.

Related

Hashcode and equals methods contract [duplicate]

This question already has answers here:
Why do I need to override the equals and hashCode methods in Java?
(31 answers)
Closed 7 years ago.
I know that when we override equals() method then we need to override hashcode() as well and other way around.
But i don't understand why we MUST do that?
In Joshua Bloch Book it is clearly written that we must do that, because when we deal with hash based collections, it is crucial to satisfy the Hashcode contract and I admit that, but what if I am not dealing with hash-based collections?
Why is it still required ?

Why to Override Equals ?
A programmer who compares references to value objects using the equals
method expects to find out whether they are logically equivalent, not
whether they refer to the same object .
Now coming to HashCode
Hash function which is called to produce the hashCode should return the same hash code each and every time,
when function is applied on same or equal objects. In other words, two
equal objects must produce same hash code consistently.
Implementation of HashCode provided by Object Class is not based upon logical equivalency ,
So Now if you will not override hashCode but override equals, then according to you 2 Objects are equals as they will pass the equals() test but according to Java they are not .
Consequences :
Set start allowing duplicates !!!
Map#get(key) will not return the correct value !!
and so on many other consquences..................

Data structures, such as HashMap, depend on the contract.
A HashMap achieves magical performance properties by using the hashcode to bucketize entries. Every item that is put in the map that has the same hashcode() value gets placed in the same bucket. These "collisions" are resolved by comparing within the same bucket using equals(). In other words, the hashcode is used to determine the subset of the items in the map that might be equal and in this way quickly eliminate the vast majority of the items from further consideration.
This only works if objects that are equal are placed in the same bucket, which can only be ensured if they have the same hashcode.
NOTE: In practice, the number of collisions is much higher than may be implied above, because the number of buckets used is necessarily much smaller than the number of possible hashcode values.

As per Joshua Bloch book;
A common source of bugs is the failure to override the hashCode
method. You must override hashCode in every class that overrides
equals. Failure to do so will result in a violation of the general
contract for Object.hashCode, which will prevent your class from
functioning properly in conjunction with all hash-based collections,
including HashMap, HashSet, and Hashtable.
Failing to override hashcode while overriding equals is violation the contract of Object.hashCode. But this won't have impact if you are using your objects only on non hash based collection.
However, how do you prevent; the other developers doing so. Also if an object is eligible for element of collection, better provide support for all the collections, don't have half baked objects in your project. This will fail anytime in the future, and you will be caught for not following the contacts while implementing :)

Because that is the way it is meant to be:
Whenever a.equals(b), then a.hashCode() must be same as b.hashCode().
What issues should be considered when overriding equals and hashCode in Java?
There are use-cases where you don't need hashcode(), mostly self-written scenarious, but you can never be sure, because implementations can and might be also relying on hashcode() if they are using equals()

This question is answered many times in SO, but still I will attempt to answer this .
In order to understand this concept completely, we need to understand the purpose of hashcode and equals, how they are implemented, and what exactly is this contract(that hashcode also should be overridden when equals is overridden)
equals method is used to determine the equality of the object. For primitive types, its very easy to determine the equality. We can very easily say that int 1 is always equal to 1. But this equal method talks about the equality of objects. The object equality depends on the instance variables or any other parameter (depend purely on the implementation - how you want to compare).
This equal method needs to be overridden if we want some customized comparison, lets say we want to say that two books are same if they have same title and same author, or I can say two books are equal if they have same ISBN.
hashcode method returns a hash code value of an object. The default implementation of the Object hashcode returns a distinct integers for distinct objects. This integer is calculated based on the memory address of the object.
So we can say that the default implementation of the equals method just comapres the hashcodes to check the equality of the object. But for the book example - we need it differently.
Also Equal objects must produce the same hash code as long as they are equal, however unequal objects need not produce distinct hash codes.
In case of not using a hash based collection, you can break the contract and need not to override the hashcode method - because you ll not be using the default implementations anywhere but still I would not suggest that and would say to have it as you may need it in future when you put those things in collection

Java Overriding hashCode() method has any Performance issue?

If i will override hashCode() method will it degrade the performance of application. I am overriding this method in many places in my application.

Yes you can degrade the performance of a hashed collection if the hashCode method is implemented in a bad way. The best implementation of a hashCode method should generate the unique hashCode for unique objects. Unique hashCode will avoid collisions and an element can be stored and retrieved with O(1) complexity. But only hashCode method will not be able to do it, you need to override the equals method also to help the JVM.
If the hashCode method is not able to generate unique hash for unique objects then there is a chance that you will be holding more than one objects at a bucket. This will occur when you have two elements with same hash but equals method returns false for them. So each time this happens the element will be added to the list at hash bucket. This will slow down both the insertion and retreival of elements. It will lead to O(n) complexity for the get method, where n is the size of the list at a bucket.
Note: While you try to generate unique hash for unique objects in your hashCode implementation, be sure that you write simple algorithm for doing so. If your algorithm for generating the hash is too heavy then you will surely see a poor performance for operations on your hashed collection. As hashCode method is called for most of the operations on the hashed collection.

It would improve performance if the right data structure used at right place,
For example: a proper hashcode implementation in Object can nearly convert O(N) to O(1) for HashMap lookup
unless you are doing too much complicated operation in hashCode() method
It would invoke hashCode() method every time it has to deal with Hash data structure with your Object and if you have heavy hashCode() method (which shouldn't be)

It depends entirely on how you're implementing hashCode. If you're doing lots of expensive deep operations, then perhaps it might, and in that case, you should consider caching a copy of the hashCode (like String does). But a decent implementation, such as with HashCodeBuilder, won't be a big deal. Having a good hashCode value can make lookups in data structures like HashMaps and HashSets much, much faster, and if you override equals, you need to override hashCode.

Java's hashCode() is a virtual function anyway, so there is no performance loss by the sheer fact that it is overridden and the overridden method is used.
The real difference may be the implementation of the method. By default, hashCode() works like this (source):
As much as is reasonably practical, the hashCode method defined by
class Object does return distinct integers for distinct objects. (This
is typically implemented by converting the internal address of the
object into an integer, but this implementation technique is not
required by the JavaTM programming language.)
So, whenever your implementation is as simple as this, there will be no performance loss. However, if you perform complex computing operations based on many fields, calling many other functions - you will notice a performance loss but only because your hashCode() does more things.
There is also the issue of inefficient hashCode() implementations. For example, if your hashCode() simply returns value 1 then the use of HashMap or HashSet will be significantly slower than with proper implementation. There is a great question which covers the topic of implementing hashCode() and equals() on SO: What issues should be considered when overriding equals and hashCode in Java?
One more note: remember, that whenever you implement hashCode() you should also implement equals(). Moreover, you should do it with care, because if you write an invalid hashCode() you may break equality checks for various collections.

Overriding hashCode() in a class in itself does not cause any performance issues. However when an instance of such class is inserted either into a HashMap HashSet or equivalent data structure hashCode() & optionally equals() method is invoked to identify right bucket to put the element in. same applicable to Retrival Search & Deletion.
As posted by others performance totally depends on how hashCode() is implemented.
However If a particular class's equals method is not used at all then it is not mandatory to override equals() and hashCode() , but if equals() is overridden , hashcode() must be overridden as well

As all previous comments mentioned, hash-code is used for hashing in collections or it could be used as negative condition in equals. So, yes you can slow you app a lot. Obviously there is more use-cases.
First of all I would say that the approach (whether to rewrite it at all) depends on the type of objects you are talking about.
Default implementation of hash-code is fast as possible because it's unique for every object. It's possible to be enough for many cases.
This is not good when you want to use hashset and let say want to do not store two same objects in a collection. Now, the point is in "same" word.
"Same" can mean "same instance". "Same" can mean object with same (database) identifier when your object is entity or "same" can mean the object with all equal properties. It seems that it can affect performance so far.
But one of properties can be a object which could evaluate hashCode() on demand too and right now you can get evaluation of object tree's hash-code always when you call hash-code method on the root object.
So, what I would recommend? You need to define and clarify what you want to do. Do you really need to distinguish different object instances, or identifier is crucial, or is it value object?
It also depends on immutability. It's possible to calculate hashcode value once when object is constructed using all constructor properties (which has only get) and use it always when hashcode() is call. Or another option is to calculate hashcode always when any property gets change. You need to decide whether most cases read the value or write it.
The last thing I would say is to override hashCode() method only when you know that you need it and when you know what are you doing.

If you will override hashCode() method will it degrade the performance of application.It would improve performance if the right data structure used at right place,
For example: a proper hashcode() implementation in Object can nearly convert O(N) to O(1) for HashMap lookup.unless you are doing too much complicated operation in hashCode() method

The main purpose of hashCode method is to allow an object to be a key in the hash map or a member of a hash set. In this case an object should also implement equals(Object) method, which is consistent with hashCode implementation:
If a.equals(b) then a.hashCode() == b.hashCode()
If hashCode() was called twice on the same object, it should return the same result provided that the object was not changed
hashCode from the performance point of view
From the performance point of view, the main objective for your hashCode method implementation is to minimize the number of objects sharing the same hash code.
All JDK hash based collections store their values in an array.
Hash code is used to calculate an initial lookup position in this array. After that equals is used to compare given value with values stored in the internal array. So, if all values have distinct hash codes, this will minimize the possibility of hash collisions.
On the other hand, if all values will have the same hash code, hash map (or set) will degrade into a list with operations on it having O(n2) complexity.
From Java 8 onwards though collision will not impact performance as much as it does in earlier versions because after a threshold the linked list will be replaced by the binary tree, which will give you O(logN) performance in the worst case as compared to O(n) of linked list.
Never write a hashCode method which returns a constant.
String.hashCode results distribution is nearly perfect, so you can sometimes substitute Strings with their hash codes.
The next objective is to check how many identifiers with non-unique has codes you still have. Improve your hashCode method or increase a range of allowed hash code values if you have too many of non-unique hash codes. In the perfect case all your identifiers will have unique hash codes.

Relationship between hashCode and equals method in Java [duplicate]

This question already has answers here:
What issues should be considered when overriding equals and hashCode in Java?
(11 answers)
Why do I need to override the equals and hashCode methods in Java?
(31 answers)
Closed 9 years ago.
I read in many places saying while override equals method in Java, should override hashCode method too, otherwise it is "violating the contract".
But so far I haven't faced any problem if I override only equals method, but not hashCode method.
What is the contract? And why am I not facing any problem when I am violating the contract? In which case will I face a problem if I haven't overridden the hashCode method?

The problem you will have is with collections where unicity of elements is calculated according to both .equals() and .hashCode(), for instance keys in a HashMap.
As its name implies, it relies on hash tables, and hash buckets are a function of the object's .hashCode().
If you have two objects which are .equals(), but have different hash codes, you lose!
The part of the contract here which is important is: objects which are .equals() MUST have the same .hashCode().
This is all documented in the javadoc for Object. And Joshua Bloch says you must do it in Effective Java. Enough said.

According to the doc, the default implementation of hashCode will return some integer that differ for every object
As much as is reasonably practical, the hashCode method defined by class Object does
return distinct integers for distinct objects. (This is typically implemented by
converting the internal address of the object into an integer, but this implementation
technique is not required by the JavaTM programming language.)
However some time you want the hash code to be the same for different object that have the same meaning. For example
Student s1 = new Student("John", 18);
Student s2 = new Student("John", 18);
s1.hashCode() != s2.hashCode(); // With the default implementation of hashCode
This kind of problem will be occur if you use a hash data structure in the collection framework such as HashTable, HashSet. Especially with collection such as HashSet you will end up having duplicate element and violate the Set contract.

Yes, it should be overridden. If you think you need to override equals(), then you need to override hashCode() and vice versa. The general contract of hashCode() is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

The contract is that if obj1.equals(obj2) then obj1.hashCode() == obj2.hashCode() , it is mainly for performance reasons, as maps are mainly using hashCode method to compare entries keys.

Have a look at Hashtables, Hashmaps, HashSets and so forth. They all store the hashed key as their keys. When invoking get(Object key) the hash of the parameter is generated and lookup in the given hashes.
When not overwriting hashCode() and the instance of the key has been changed (for example a simple string that doesn't matter at all), the hashCode() could result in 2 different hashcodes for the same object, resulting in not finding your given key in map.get().

See JavaDoc of java.lang.Object
In hashCode() it says:
If two objects are equal according to the equals(Object) method,
then calling the hashCode method on each of the two objects must
produce the same integer result.
(Emphasis by me).
If you only override equals() and not hashCode() your class violates this contract.
This is also said in the JavaDoc of the equals() method:
Note that it is generally necessary to override the hashCode method
whenever this method is overridden, so as to maintain the general
contract for the hashCode method, which states that equal objects must
have equal hash codes.

A contract is: If two objects are equal then they should have the same hashcode and if two objects are not equal then they may or may not have same hash code.
Try using your object as key in HashMap (edited after comment from joachim-sauer), and you will start facing trouble. A contract is a guideline, not something forced upon you.

java hashing objects

I'd like to be able to determine whether I've encountered an object before - I have a graph implementation and I want to see if I've created a cycle, probably by iterating through the Node objects with a tortoise/hare floyd algorithm.
But I want to avoid a linear search through my list of "seen" nodes each time. This would be great if I had a hash table for just keys. Can I somehow hash an object? Aren't java objects just references to places in memory anyway? I wonder how much of a problem collisions would be if so..

The simple answer is to create a HashSet and add each node to the set the first time you encounter it.
The only case that this won't work is if you've overloaded hashCode() and equals(Object) for the node class to implement equality based on node contents (or whatever). Then you'll need to:
use the IdentityHashMap class which uses == and System.identityHashcode rather than equals(Object) and hashCode(), or
build a hashtable yourself using your own flavour of object identity.
Aren't java objects just references to places in memory anyway?
Yes and no. Yes, the reference is represented by a memory address (on most JVMs). The problem is that 1) you can't get hold of the address, and 2) it can change when the GC relocates the object. This means that you can't use the object address as a hashcode.
The identityHashCode method deals this by returning a value that is initially based on the memory address. If you then call identityHashCode again for the same object, you are guaranteed to get the same value as before ... even if the object has been relocated.
I wonder how much of a problem collisions would be if so..
The hash values produced by the identityHashCode method can collide. (That is, two distinct objects can have the same identity hashcode value.) Anything that uses these values has to deal with this. (The standard HashSet and IdentityHashMap classes take care of these collisions ... if you chose to use them.)

I'd like to be able to determine whether I've encountered an object
before
Use an IdentityHashMap. It is the ideal for your job since it is not an equals but a == implementation.

Take a look at HashSet. Note that in order for objects to work with HashSet, they need to provide correct implementations of hashCode and equals methods of the java.lang.Object class.

You'll need to implement a hash function for your objects. This is done by overriding hashCode() defined in java.lang.Object. This method is used by HashMap, HashSet etc to store objects. In hashCode() it's up to you to calculate a hash for the object. Don't forget to also implement the equals()-method!
Take a look at Java collection framework (http://docs.oracle.com/javase/tutorial/collections/)

Why both hashCode() and equals() exist

why java Object class has two methods hashcode() and equals()? One of them looks redundant and its percolated to the bottom most derived class?

Why do you think one is redundant? They say different things:
hashCode is "give me some way of efficiently seeing whether two objects are likely to be equal"
equals is "check whether this object is genuinely equal to another"
You definitely need both - although I don't believe they should really be in Object in the first place.
You absolutely need hash codes in order to perform efficient lookups with hash tables - and you absolutely need further equality checks because hashes will collide (there are far more possible strings than hash codes, for example).

First of all, when you override equals() you MUST override hashcode() as well.
Failure to do so
will result in a violation of the general contract for Object.hashCode, which will
prevent your class from functioning properly in conjunction with all hash-based
collections, including HashMap, HashSet, and Hashtable.
Here is the contract, copied from the Object specification [JavaSE6]:
Whenever it is invoked on the same object more than once during an execu-
tion of an application, the hashCode method must consistently return the
same integer, provided no information used in equals comparisons on the
object is modified. This integer need not remain consistent from one execu-
tion of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then call-
ing the hashCode method on each of the two objects must produce the same
integer result.
It is not required that if two objects are unequal according to the equals(Object) method, then calling the hashCode method on each of the two objects
must produce distinct integer results. However, the programmer should be
aware that producing distinct integer results for unequal objects may improve
the performance of hash tables.

The fundamental idea is that by comparing hashcode()s it's quick to check whether two objects are probably equal. If their hashcodes are equal, then the objects probably are equal (not necessarily, but it's a good guess). Then a more profound (and more expensive) check with equals() is performed. This is important to speed up all kind of look-ups (from maps etc).

equals is to compare objects, hashcode is used to generate a hash value from an object, which will then be used by the java map containers (Hashtable, Map etc).
it's common practice to override them together (if you override hashcode, you need to override equals and vice versa).

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.