Why should I override the equals and hashCode methods for the following scenario - java

Why do I need to override them for direct access to a value in a HashMap? That is, if I insert data into a HashMap whose key is an Integer and whose value is an Object, I can get the value by giving the Integer key. In this case, is it necessary to override the equals() and hashCode() methods? Please give suggestions.

No, you don't need to override anything to use an object as a value in a HashMap.
Only keys need to have a working hashCode().
However, you need to implement these two methods (technically only equals, but these two are a set, really) if you want to use things like Map#containsValue, List#indexOf or Collection#contains (and these should not just be using reference identity).
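Here is a minimal sketch of that distinction, assuming a hypothetical Payload value class that overrides nothing (the class and method names are made up for illustration). Lookup by Integer key works, while containsValue falls back to reference identity:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical value class that inherits equals() and hashCode() from Object.
class Payload {
    final String text;

    Payload(String text) {
        this.text = text;
    }
}

class ValueLookupDemo {
    // Builds a map with a single entry stored under the Integer key 1.
    static Map<Integer, Payload> build(Payload stored) {
        Map<Integer, Payload> map = new HashMap<>();
        map.put(1, stored);
        return map;
    }

    public static void main(String[] args) {
        Payload stored = new Payload("hello");
        Map<Integer, Payload> map = build(stored);

        // Lookup goes through the Integer key, so Payload needs no overrides.
        System.out.println(map.get(1) == stored);                    // true

        // containsValue() calls equals() on the values; with the inherited
        // equals() only the very same instance matches.
        System.out.println(map.containsValue(stored));               // true
        System.out.println(map.containsValue(new Payload("hello"))); // false
    }
}
```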

hashCode() is used to locate a specific element when you retrieve it from a hash table. Hash codes don't have to be distinct. In fact, you could return the same integer for all your instances, but then all elements end up in a single bucket that behaves like a list instead of a hash table, which causes a performance problem.
The default implementation of hashCode() (the one in Object, which subclasses inherit) typically returns an integer derived from the object's identity, often described as being based on its memory address. This is usually good enough, but that particular implementation is not required by the Java specification.
The default implementation of equals() (also in Object) returns true if and only if both references point to the same object, i.e. obj1 == obj2.
Keep in mind that:
equal objects must have the same hashCode()
objects that have the same hashCode() are not required to be equal to each other.
I think overriding hashCode() is not needed in most situations (when the class does not extend another class and does not override equals()), because modern JVMs do a pretty good job for you.
So the conclusion is:
if your superclass has overridden hashCode() and equals(), then you should override them too, or at least take a look at the implementation and decide whether you need to override them.
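To illustrate the point that hash codes need not be distinct, here is a small sketch with a hypothetical ConstantHashKey class whose hashCode() always returns the same number. The map still gives correct answers; it just loses its speed advantage as it grows.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key whose hashCode() is deliberately constant.
final class ConstantHashKey {
    private final String name;

    ConstantHashKey(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ConstantHashKey
                && ((ConstantHashKey) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return 42; // legal, but every key collides with every other key
    }
}

class ConstantHashDemo {
    static Map<ConstantHashKey, Integer> build() {
        Map<ConstantHashKey, Integer> map = new HashMap<>();
        map.put(new ConstantHashKey("a"), 1);
        map.put(new ConstantHashKey("b"), 2);
        return map;
    }

    public static void main(String[] args) {
        Map<ConstantHashKey, Integer> map = build();
        // Lookups still succeed because equals() tells the colliding keys apart.
        System.out.println(map.get(new ConstantHashKey("a"))); // 1
        System.out.println(map.get(new ConstantHashKey("b"))); // 2
        System.out.println(map.size());                        // 2
    }
}
```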

Related

If the hashcode() creates hashcode based on the address of the object, how can two different objects with same contents create the same hashcode?

When you read the description of hashCode() in the Object class, it says that
If two objects are equal according to the equals(Object) method, then
calling the hashCode method on each of the two objects must produce
the same integer result.
I've read an article and it says that an object's hashcode() can provide different integer values if it runs in different environments even though their content is the same.
It does not happen with String class's hashcode() because the String class's hashcode() creates integer value based on the content of the object. The same content always creates the same hash value.
However, it happens if the hash value of the object is calculated based on its address in the memory. And the author of the article says that the Object class calculates the hash value of the object based on its address in the memory.
In the example below, the objects, p1 and p2 have the same content so equals() on them returns true.
However, how come these two objects return the same hash value when their memory addresses are different?
Here is the example code in main():
Person p1 = new Person2("David", 10);
Person p2 = new Person2("David", 10);
boolean b = p1.equals(p2);
int hashCode1 = p1.hashCode();
int hashCode2 = p2.hashCode();
Here is the overridden hashCode():
@Override
public int hashCode() {
    return Objects.hash(name, age);
}
Is the article's content wrong?
If there is a hashCode() that calculates a hash value based on the instance's address what is the purpose of them?
Also, if it really exists, it violates the condition that Object class's hashCode() specifies. How should we use the hashCode() then?
I think you have misunderstood this statement:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
The word “must” means that you, the programmer, are required to write a hashCode() method which always produces the same hashCode value for two objects which are equal to each other according to their equals(Object) method.
If you don’t, you have written a broken object that will not work with any class that uses hash codes, particularly unsorted Sets and Maps.
I've read an article and it says that an object's hashcode() can provide different integer values if it runs in different environments even though their content is the same.
Yes, hashCode() can provide different values in different runtimes, for a class which does not override the hashCode() method.
… how come these two objects return the same hash value when their memory addresses are different?
Because you told them to. You overrode the hashCode() method.
If there is a hashCode() that calculates a hash value based on the instance's address what is the purpose of them?
It doesn’t have a lot of use. That’s why programmers are strongly recommended to override the method, unless they don’t care about object identity.
Also, if it really exists, it violates the condition that Object class's hashCode() specifies. How should we use the hashCode() then?
No it does not. The contract states:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
As long as the hashCode method defined in the Object class returns the same value for the duration of the Java runtime, it is compliant.
The contract also says:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Two objects whose equals method reports that they are effectively equal must return the same hashCode. Any instance of the Object class whose type is not a subclass of Object is only equal to itself and cannot be equal to any other object, so any value it returns is valid, as long as it is consistent throughout the life of the Java runtime.
The article is wrong, memory address is not involved (btw. the address can change during the lifecycle of objects, so it's abstracted away as a reference). You can look at the default hashCode as a method that returns a random integer, which is always the same for the same object.
Default equals (inherited from Object) works exactly like ==. Since there is a condition (required for classes like HashSet etc. to work) that states that when a.equals(b) then a.hashCode() == b.hashCode(), we need a default hashCode which only has to have one property: when we call it twice on the same object, it must return the same number.
The default hashCode exists exactly so that the condition you mention is upheld. The value returned is not important, it's only important that it never changes.
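A quick way to see this for yourself is the sketch below (the class name DefaultHashDemo is made up). It only checks the properties the contract talks about and deliberately makes no assumption about what the number actually is:

```java
class DefaultHashDemo {
    // True when repeated calls on the same object agree, which the contract requires.
    static boolean isStable(Object o) {
        return o.hashCode() == o.hashCode();
    }

    public static void main(String[] args) {
        Object a = new Object();
        Object b = new Object();

        System.out.println(isStable(a)); // true

        // For a class that does not override hashCode(), the value matches
        // System.identityHashCode().
        System.out.println(a.hashCode() == System.identityHashCode(a)); // true

        // Distinct objects are never equal under the default equals().
        System.out.println(a.equals(b)); // false

        // The number itself may differ from one run of the program to the next.
        System.out.println(a.hashCode());
    }
}
```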
No, p1 and p2 do not have the same content as far as Java is concerned, unless Person2 overrides equals().
With the inherited equals(), p1.equals(p2) is false, so they are not the same content.
If you want p1 and p2 to be equal, you need to override the equals method from Object in a way that compares their content. And if you override equals, then you also must override hashCode, so that whenever equals returns true, the objects have the same hashCode().
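As a minimal sketch of what that looks like, assuming a hypothetical Person class with the question's name and age fields (the real Person2 is not shown in full in the question):

```java
import java.util.Objects;

// Hypothetical Person class; the fields mirror the question's name and age.
final class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        return age == other.age && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Built from the same fields that equals() compares.
        return Objects.hash(name, age);
    }
}

class PersonDemo {
    public static void main(String[] args) {
        Person p1 = new Person("David", 10);
        Person p2 = new Person("David", 10);
        System.out.println(p1 == p2);                       // false
        System.out.println(p1.equals(p2));                  // true
        System.out.println(p1.hashCode() == p2.hashCode()); // true
    }
}
```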
Here's the design decision every programmer needs to make for objects they define of (say) MyClass:
Do you want it to be possible for two different objects to be "equal"?
If you do, then firstly you have to implement MyClass.equals() so that it gives the correct notion of equality for your purposes. That's entirely in your hands.
Then you're supposed to implement hashCode such that, if A.equals(B), then A.hashCode() == B.hashCode(). You explicitly do not want to use Object.hashCode().
If you don't want different objects to ever be equal, then don't implement equals() or hashCode(), use the implementations that Object gives you. For Object A and Object B (different Objects, and not subclasses of Object), then it is never the case that A.equals(B), and so it's perfectly ok that A.hashCode() is never the same as B.hashCode().

Should I override hashCode() of Collections?

Given that I have some class with various fields in it:
class MyClass {
    private String s;
    private MySecondClass c;
    private Collection<someInterface> coll;
    // ...
    @Override public int hashCode() {
        // ????
    }
}
and of that, I do have various objects which I'd like to store in a HashMap. For that, I need to have the hashCode() of MyClass.
I'll have to go into all fields and respective parent classes recursively to make sure they all implement hashCode() properly, because otherwise hashCode() of MyClass might not take into consideration some values. Is this right?
What do I do with that Collection? Can I always rely on its hashCode() method? Will it take into consideration all child values that might exist in my someInterface object?
I OPENED A SECOND QUESTION regarding the actual problem of uniquely IDing an object here: How do I generate an (almost) unique hash ID for objects?
Clarification:
Is there anything more or less unique in your class? The String s? Then only use that as the hash code.
The hashCode() of two MyClass objects should definitely differ if any of the values in the coll of one of the objects is changed. hashCode() should only return the same value if all fields of two objects store the same values, recursively. Basically, there is some time-consuming calculation going on on a MyClass object. I want to spare this time if the calculation has already been done with the exact same values some time ago. For this purpose, I'd like to look up in a HashMap whether the result is available already.
Would you be using MyClass in a HashMap as the key or as the value? If the key, you have to override both equals() and hashCode()
Thus, I'm using the hashCode OF MyClass as the key in a HashMap. The value (calculation result) will be something different, like an Integer (simplified).
What do you think equality should mean for multiple collections? Should it depend on element ordering? Should it only depend on the absolute elements that are present?
Wouldn't that depend on the kind of Collection that is stored in coll? Though I guess ordering is not really important, no.
The response you get from this site is gorgeous. Thank you all
@AlexWien that depends on whether that collection's items are part of the class's definition of equivalence or not.
Yes, yes they are.
I'll have to go into all fields and respective parent classes recursively to make sure they all implement hashCode() properly, because otherwise hashCode() of MyClass might not take into consideration some values. Is this right?
That's correct. It's not as onerous as it sounds because the rule of thumb is that you only need to override hashCode() if you override equals(). You don't have to worry about classes that use the default equals(); the default hashCode() will suffice for them.
Also, for your class, you only need to hash the fields that you compare in your equals() method. If one of those fields is a unique identifier, for instance, you could get away with just checking that field in equals() and hashing it in hashCode().
All of this is predicated upon you also overriding equals(). If you haven't overridden that, don't bother with hashCode() either.
What do I do with that Collection? Can I always rely on its hashCode() method? Will it take into consideration all child values that might exist in my someInterface object?
Yes, you can rely on any collection type in the Java standard library to implement hashCode() correctly. And yes, any List or Set will take into account its contents (it will mix together the items' hash codes).
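A short sketch of that behaviour using plain String elements (the class name CollectionHashDemo is made up). Keep in mind it only works this well because String overrides equals() and hashCode(); for a collection of your own someInterface objects, the element classes would have to do the same.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CollectionHashDemo {
    public static void main(String[] args) {
        List<String> first = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> second = new ArrayList<>(Arrays.asList("a", "b"));

        // Equal lists have equal hash codes because the value is derived
        // from the elements.
        System.out.println(first.equals(second));                  // true
        System.out.println(first.hashCode() == second.hashCode()); // true

        // Adding an element makes the lists unequal, so their hash codes
        // are no longer required to match.
        second.add("c");
        System.out.println(first.equals(second));                  // false

        // Sets ignore insertion order in both equals() and hashCode().
        Set<String> s1 = new HashSet<>(Arrays.asList("x", "y"));
        Set<String> s2 = new HashSet<>(Arrays.asList("y", "x"));
        System.out.println(s1.equals(s2) && s1.hashCode() == s2.hashCode()); // true
    }
}
```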
So you want to do a calculation on the contents of your object that will give you a unique key you'll be able to check in a HashMap whether the "heavy" calculation that you don't want to do twice has already been done for a given deep combination of fields.
Using hashCode alone:
I believe hashCode is not the appropriate thing to use in the scenario you are describing.
hashCode should always be used in association with equals(). It's part of its contract, and it's an important part, because hashCode() returns an integer, and although one may try to make hashCode() as well-distributed as possible, it is not going to be unique for every possible object of the same class, except for very specific cases (It's easy for Integer, Byte and Character, for example...).
If you want to see for yourself, try generating strings of up to 4 letters (lower and upper case), and see how many of them have identical hash codes.
HashMap therefore uses both the hashCode() and equals() method when it looks for things in the hash table. There will be elements that have the same hashCode() and you can only tell if it's the same element or not by testing all of them using equals() against your class.
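You don't even need to generate many strings to see a collision. The two-letter strings "Aa" and "BB" are a well-known pair, and this small sketch (the class name is made up) shows that HashMap still keeps them apart by falling back to equals():

```java
import java.util.HashMap;
import java.util.Map;

class StringCollisionDemo {
    public static void main(String[] args) {
        // "Aa" and "BB" are different strings with the same hash code.
        System.out.println("Aa".hashCode());   // 2112
        System.out.println("BB".hashCode());   // 2112
        System.out.println("Aa".equals("BB")); // false

        // HashMap distinguishes them because it also checks equals().
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        System.out.println(map.size());    // 2
        System.out.println(map.get("Aa")); // 1
        System.out.println(map.get("BB")); // 2
    }
}
```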
Using hashCode and equals together
In this approach, you use the object itself as the key in the hash map, and give it an appropriate equals method.
To implement the equals method you need to go deeply into all your fields. All of their classes must have an equals() that matches what you think of as equal for the sake of your big calculation. Special care needs to be taken when your objects implement an interface. If the calculation is based on calls to that interface, and different objects that implement the interface return the same value in those calls, then they should implement equals in a way that reflects that.
And their hashCode is supposed to match the equals - when the values are equal, the hashCode must be equal.
You then build your equals and hashCode based on all those items. You may use Objects.equals(Object, Object) and Objects.hash(Object...) to save yourself a lot of boilerplate code.
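A rough sketch of that approach, using the JDK's Objects.equals(Object, Object) and the varargs Objects.hash(Object...). CalcInput is a hypothetical stand-in for MyClass, reduced to one String and one List<String>; a real class would list every field that matters for the calculation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for MyClass: one String field and one collection field.
final class CalcInput {
    private final String s;
    private final List<String> coll;

    CalcInput(String s, List<String> coll) {
        this.s = s;
        this.coll = coll;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CalcInput)) return false;
        CalcInput other = (CalcInput) o;
        // Objects.equals is null-safe and delegates to each field's equals().
        return Objects.equals(s, other.s) && Objects.equals(coll, other.coll);
    }

    @Override
    public int hashCode() {
        // Combines the hash codes of exactly the fields used in equals().
        return Objects.hash(s, coll);
    }
}

class CalcInputDemo {
    public static void main(String[] args) {
        CalcInput a = new CalcInput("x", Arrays.asList("p", "q"));
        CalcInput b = new CalcInput("x", Arrays.asList("p", "q"));
        CalcInput c = new CalcInput("x", Arrays.asList("p", "r"));
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(c));                                 // false
    }
}
```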
But is this a good approach?
While you can cache the result of hashCode() in the object and re-use it without calculation as long as you don't mutate it, you can't do that for equals. This means that calculation of equals is going to be lengthy.
So depending on how many times the equals() method is going to be called for each object, this is going to be exacerbated.
If, for example, you are going to have 30 objects in the hashMap, but 300,000 objects are going to come along and be compared to them only to realize that they are equal to them, you'll be making 300,000 heavy comparisons.
If you're only going to have very few instances in which an object is going to have the same hashCode or fall in the same bucket in the HashMap, requiring comparison, then going the equals() way may work well.
If you decide to go this way, you'll need to remember:
If the object is a key in a HashMap, it should not be mutated as long as it's there. If you need to mutate it, you may need to make a deep copy of it and keep the copy in the hash map. Deep copying again requires consideration of all the objects and interfaces inside to see if they are copyable at all.
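To make the danger concrete, here is a small sketch that uses a mutable ArrayList directly as the key (class name made up). The Map documentation leaves this situation unspecified; the results in the comments are what java.util.HashMap does in practice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutatedKeyDemo {
    public static void main(String[] args) {
        // A mutable list used as a key; its hashCode() depends on its contents.
        List<String> key = new ArrayList<>(Arrays.asList("a", "b"));
        Map<List<String>, String> cache = new HashMap<>();
        cache.put(key, "result");
        System.out.println(cache.get(key)); // result

        // Mutating the key changes its hash code, while the entry stays
        // filed under the old one.
        key.add("c");
        System.out.println(cache.get(key));                     // null
        System.out.println(cache.get(Arrays.asList("a", "b"))); // null
        System.out.println(cache.size());                       // 1, entry is stranded
    }
}
```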
Creating a unique key for each object
Back to your original idea, we have established that hashCode is not a good candidate for a key in a hash map. A better candidate for that would be a hash function such as md5 or sha1 (or more advanced hashes, like sha256, but you don't need cryptographic strength in your case), where collisions are a lot rarer than a mere int. You could take all the values in your class, transform them into a byte array, hash it with such a hash function, and take its hexadecimal string value as your map key.
Naturally, this is not a trivial calculation. So you need to think if it's really saving you much time over the calculation you are trying to avoid. It is probably going to be faster than repeatedly calling equals() to compare objects, as you do it only once per instance, with the values it had at the time of the "big calculation".
For a given instance, you could cache the result and not calculate it again unless you mutate the object. Or you could just calculate it again only just before doing the "big calculation".
However, you'll need the "cooperation" of all the objects you have inside your class. That is, they will all need to be reasonably convertible into a byte array in such a way that two equivalent objects produce the same bytes (including the same issue with the interface objects that I mentioned above).
You should also beware of situations in which you have, for example, two strings "AB" and "CD" which will give you the same result as "A" and "BCD", and then you'll end up with the same hash for two different objects.
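One possible way to do this, sketched under the assumption that every field can be turned into a non-null String (DigestKey and its of method are made-up names). Each field is prefixed with its byte length, which is one common way to avoid the "AB" + "CD" versus "A" + "BCD" ambiguity:

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestKey {
    // Hashes the fields with SHA-256, writing each field's byte length first so
    // that ("AB", "CD") and ("A", "BCD") feed different bytes to the digest.
    // Assumes no field is null.
    static String of(String... fields) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (String field : fields) {
                byte[] bytes = field.getBytes(StandardCharsets.UTF_8);
                digest.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
                digest.update(bytes);
            }
            // Render the 32-byte digest as a 64-character hexadecimal string.
            return String.format("%064x", new BigInteger(1, digest.digest()));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(of("AB", "CD").equals(of("AB", "CD"))); // true
        System.out.println(of("AB", "CD").equals(of("A", "BCD"))); // false
    }
}
```

The resulting string can then serve as the HashMap key, for example in a Map<String, Integer> of cached results.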
For future readers.
Yes, equals and hashCode go hand in hand.
Below shows a typical implementation using a helper library, but it really shows the "hand in hand" nature. And the helper library from apache keeps things simpler IMHO:
@Override
public boolean equals(Object o) {
    if (this == o) {
        return true;
    }
    if (o == null || getClass() != o.getClass()) {
        return false;
    }
    MyCustomObject castInput = (MyCustomObject) o;
    boolean returnValue = new org.apache.commons.lang3.builder.EqualsBuilder()
        .append(this.getPropertyOne(), castInput.getPropertyOne())
        .append(this.getPropertyTwo(), castInput.getPropertyTwo())
        .append(this.getPropertyThree(), castInput.getPropertyThree())
        .append(this.getPropertyN(), castInput.getPropertyN())
        .isEquals();
    return returnValue;
}
@Override
public int hashCode() {
    return new org.apache.commons.lang3.builder.HashCodeBuilder(17, 37)
        .append(this.getPropertyOne())
        .append(this.getPropertyTwo())
        .append(this.getPropertyThree())
        .append(this.getPropertyN())
        .toHashCode();
}
17 and 37 are arbitrary; you can pick your own values (HashCodeBuilder expects odd numbers).
From your clarifications:
You want to store MyClass in a HashMap as the key.
This means the hashCode() is not allowed to change after adding the object.
So if your collections may change after object instantiation, they should not be part of the hashcode().
From http://docs.oracle.com/javase/8/docs/api/java/util/Map.html
Note: great care must be exercised if mutable objects are used as map
keys. The behavior of a map is not specified if the value of an object
is changed in a manner that affects equals comparisons while the
object is a key in the map.
For 20-100 objects it is not worth running the risk of an inconsistent hashCode() or equals() implementation.
There is no need to override hashCode() and equals() in your case.
If you don't override them, Java uses the unique object identity for equals() and hashCode() (and that works, especially because you stated that you don't need an equals() that considers the values of the object's fields).
When using the default implementation, you are on the safe side.
Making an error like using a custom hashCode() as a key in the HashMap when the hash code changes after insertion, because you used the hashCode() of the collections as part of your object's hash code, may result in an extremely hard-to-find bug.
If you need to find out whether the heavy calculation is finished, I would not abuse equals(). Just write your own method objectStateValue() and call hashCode() on the collection, too. This then does not interfere with the object's hashCode() and equals().
public int objectStateValue() {
    // TODO make sure the fields are not null;
    return 31 * s.hashCode() + coll.hashCode();
}
Another, simpler possibility: the code that does the time-consuming calculation can increment a calculationCounter by one as soon as the calculation is ready. You then just check whether or not the counter has changed. This is much cheaper and simpler.

Hashcode and equals methods contract [duplicate]

I know that when we override the equals() method we need to override hashCode() as well, and the other way around.
But I don't understand why we MUST do that.
In Joshua Bloch's book it is clearly written that we must do that, because when we deal with hash-based collections it is crucial to satisfy the hashCode contract. I accept that, but what if I am not dealing with hash-based collections?
Why is it still required?
Why override equals?
A programmer who compares references to value objects using the equals
method expects to find out whether they are logically equivalent, not
whether they refer to the same object.
Now coming to hashCode:
The hash function that produces the hash code should return the same hash code each and every time
it is applied to the same or equal objects. In other words, two
equal objects must produce the same hash code consistently.
The implementation of hashCode provided by the Object class is not based on logical equivalence.
So if you override equals but not hashCode, then according to you two objects are equal because they pass the equals() test, but according to hash-based collections they are not.
Consequences:
A Set starts allowing duplicates.
Map#get(key) will not return the correct value.
And many other consequences.
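The "Set allows duplicates" consequence can be reproduced with a sketch like the following. BrokenTag and FixedTag are hypothetical classes that differ only in whether hashCode() is overridden.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical class that overrides equals() but inherits Object.hashCode().
class BrokenTag {
    final String label;

    BrokenTag(String label) {
        this.label = label;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BrokenTag && ((BrokenTag) o).label.equals(label);
    }
}

// Same equality rule, with hashCode() derived from the same field.
class FixedTag {
    final String label;

    FixedTag(String label) {
        this.label = label;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof FixedTag && ((FixedTag) o).label.equals(label);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(label);
    }
}

class DuplicateDemo {
    public static void main(String[] args) {
        Set<BrokenTag> broken = new HashSet<>();
        broken.add(new BrokenTag("x"));
        broken.add(new BrokenTag("x"));
        // Almost certainly 2: the two equal objects nearly always get different
        // default hash codes, so the set never compares them with equals().
        System.out.println(broken.size());

        Set<FixedTag> fixed = new HashSet<>();
        fixed.add(new FixedTag("x"));
        fixed.add(new FixedTag("x"));
        System.out.println(fixed.size()); // 1
    }
}
```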
Data structures, such as HashMap, depend on the contract.
A HashMap achieves magical performance properties by using the hashcode to bucketize entries. Every item that is put in the map that has the same hashcode() value gets placed in the same bucket. These "collisions" are resolved by comparing within the same bucket using equals(). In other words, the hashcode is used to determine the subset of the items in the map that might be equal and in this way quickly eliminate the vast majority of the items from further consideration.
This only works if objects that are equal are placed in the same bucket, which can only be ensured if they have the same hashcode.
NOTE: In practice, the number of collisions is much higher than may be implied above, because the number of buckets used is necessarily much smaller than the number of possible hashcode values.
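As a simplified illustration of that note (this is not the actual HashMap code, and bucketFor is a made-up helper), assume a table that picks a bucket by reducing the hash code modulo the number of buckets. Keys with different hash codes can then still share a bucket:

```java
class BucketSketch {
    // Simplified bucket selection: reduce the hash code modulo the table size.
    // Real HashMap implementations differ in detail; this is only an illustration.
    static int bucketFor(Object key, int bucketCount) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    public static void main(String[] args) {
        int buckets = 16;
        // Integer.hashCode() is the int value itself, so 1 and 17 have different
        // hash codes but land in the same bucket of a 16-bucket table.
        System.out.println(bucketFor(1, buckets));  // 1
        System.out.println(bucketFor(17, buckets)); // 1
        System.out.println(bucketFor(2, buckets));  // 2
    }
}
```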
As per Joshua Bloch's book:
A common source of bugs is the failure to override the hashCode
method. You must override hashCode in every class that overrides
equals. Failure to do so will result in a violation of the general
contract for Object.hashCode, which will prevent your class from
functioning properly in conjunction with all hash-based collections,
including HashMap, HashSet, and Hashtable.
Failing to override hashCode while overriding equals is a violation of the contract of Object.hashCode. But this won't have any impact if you are using your objects only in non-hash-based collections.
However, how do you prevent other developers from doing so? Also, if an object is eligible to be an element of a collection, better to support all collections; don't have half-baked objects in your project. This will fail at some point in the future, and you will be caught for not following the contracts while implementing :)
Because that is the way it is meant to be:
Whenever a.equals(b), then a.hashCode() must be same as b.hashCode().
What issues should be considered when overriding equals and hashCode in Java?
There are use cases where you don't need hashCode(), mostly self-written scenarios, but you can never be sure, because implementations can and might also rely on hashCode() if they are using equals().
This question has been answered many times on SO, but I will still attempt to answer it.
In order to understand this concept completely, we need to understand the purpose of hashCode and equals, how they are implemented, and what exactly this contract is (that hashCode should also be overridden when equals is overridden).
The equals method is used to determine the equality of objects. For primitive types, it's very easy to determine equality. We can very easily say that int 1 is always equal to 1. But the equals method is about the equality of objects. Object equality depends on the instance variables or any other parameter (it depends purely on the implementation, i.e. how you want to compare).
The equals method needs to be overridden if we want some customized comparison. Let's say we want two books to be the same if they have the same title and the same author, or we can say two books are equal if they have the same ISBN.
The hashCode method returns a hash code value for an object. The default implementation in Object returns distinct integers for distinct objects as far as is reasonably practical. This integer is often described as being derived from the memory address of the object, but that is an implementation detail.
The default implementation of equals simply compares references (this == other), so two distinct objects are never equal. But for the book example we need it to behave differently.
Also, equal objects must produce the same hash code as long as they are equal; however, unequal objects need not produce distinct hash codes.
If you are not using a hash-based collection, breaking the contract by not overriding hashCode will not cause visible problems, because nothing calls hashCode. Still, I would not suggest that: override it anyway, as you may need it in the future when you put those objects in a hash-based collection.

Using HashMap with custom key

Quick Question: If I want to use HashMap with a custom class as the key, must I override the hashCode function? How will it work if I do not override that function?
If you don't override hashCode AND equals you will get the default behaviour which is that each object is different, regardless of its contents.
Technically, you don't have to override the hashCode method as long as equal objects have the same hashCode.
So, if you use the default behaviour defined by Object, where equals only returns true only for the same instance, then you don't have to override the hashCode method.
But if you don't override the equals and the hashCode methods, it means you have to make sure you're always using the same key instance.
E.g.:
MyKey key1_1 = new MyKey("key1");
myMap.put(key1_1,someValue); // OK
someValue = myMap.get(key1_1); // returns the correct value, since the same key instance has been used;
MyKey key1_2 = new MyKey("key1"); // different key instance
someValue = myMap.get(key1_2); // returns null, because key1_2 has a different hashCode than key1_1 and key1_1.equals(key1_2) == false
In practice you often have only one instance of the key, so technically you don't have to override the equals and hashCode methods.
But it's best practice to override the equals and hashCode methods for classes used as keys anyway, because sometime later you or another developer might forget that the same instance has to be used, which can lead to hard to track issues.
And note: even if you override the equals and hashCode methods, you must make sure you don't change the key object in a way that would change the result of the equals or the hashCode methods, otherwise the map won't find your value anymore. That's why it's recommended to use immutable objects as keys if possible.
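For comparison, here is a sketch of what the lookup looks like once the key class overrides both methods. This MyKey is a hypothetical, immutable version with a single non-null String field; the real class in the question is not shown.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable key class with content-based equality.
final class MyKey {
    private final String id; // assumed to be non-null

    MyKey(String id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MyKey && ((MyKey) o).id.equals(id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}

class MyKeyDemo {
    public static void main(String[] args) {
        Map<MyKey, String> myMap = new HashMap<>();
        myMap.put(new MyKey("key1"), "someValue");

        // A different instance with the same content now finds the entry.
        System.out.println(myMap.get(new MyKey("key1"))); // someValue
        System.out.println(myMap.get(new MyKey("key2"))); // null
    }
}
```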
The only time you don't have to override the hashCode() function is when you also don't override equals, so you use the default Object.equals definition of reference equality. This may or may not be what you want -- in particular, different objects will not be considered equal even if they have the same field values.
If you override equals but not hashCode, HashMap behavior will be undefined (read: it won't make any sense at all, and will be totally corrupted).
It depends on the object class you are using as a key. If it's a custom class like you propose, and it doesn't extend anything (i.e. it extends Object) then the hashCode function will be that of Object, and that will consider memory references, making two objects that look the same to you hash to different codes.
So yes, unless you are extending a class with a hashCode() function you know works for you, you need to implement your own. Also make sure to implement equals(): some classes like ArrayList will only use equals while others like HashMap will check on both hashCode() and equals().
Consider also that if your key is not immutable you may have problems. If you put an entry with a mutable key in the map and later change the key in a way that affects hashCode and equals, you may lose your entry in the map, as you won't be able to retrieve it anymore.
You should override the equals() and hashCode() methods from the Object class. The default implementations of equals() and hashCode(), which are inherited from java.lang.Object, are based on the object instance's identity rather than its contents (the default toString() shows this identity hash, e.g. MyObject@6c60f2ea). This can cause problems when two instances of an object have the same properties but the inherited equals() returns false because the two instances are different objects.
Also the toString() method can be overridden to provide a proper string representation of your object.
Primary considerations when implementing a user-defined key:
If a class overrides equals(), it must override hashCode().
If 2 objects are equal, then their hashCode values must be equal as well.
If a field is not used in equals(), then it must not be used in hashCode().
If it is accessed often, hashCode() is a candidate for caching to enhance performance.
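The caching point could look roughly like this, assuming an immutable key (CachedKey is a made-up example class). Caching is only safe because the fields can never change after construction.

```java
import java.util.Objects;

// Hypothetical immutable key that computes its hash code once and reuses it.
final class CachedKey {
    private final String first;
    private final String second;
    private final int hash; // safe to cache because the fields never change

    CachedKey(String first, String second) {
        this.first = first;
        this.second = second;
        this.hash = Objects.hash(first, second);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedKey)) return false;
        CachedKey other = (CachedKey) o;
        return Objects.equals(first, other.first)
                && Objects.equals(second, other.second);
    }

    @Override
    public int hashCode() {
        return hash; // no recomputation on repeated map lookups
    }
}

class CachedKeyDemo {
    public static void main(String[] args) {
        CachedKey a = new CachedKey("x", "y");
        CachedKey b = new CachedKey("x", "y");
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```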

Does hashCode equality imply reference-based equality?

I read that to use the equals() method in Java we also have to override the hashCode() method, and that (logically) equal objects should have equal hash codes, but doesn't that imply reference-based equality? Here is my code for the overridden equals() method; how should I override the hashCode() method for this:
@Override
public boolean equals(Object o)
{
    if (!(o instanceof dummy))
        return false;
    dummy p = (dummy) o;
    return (p.getName() == this.getName() && p.getId() == this.getId() && p.getPassword() == this.getPassword());
}
I am just trying to learn how it works, so there are only three fields, namely name, id and password, and I am just trying to compare two objects that I define in main(), that's all! I also need to know whether it is always necessary to override the hashCode() method along with the equals() method.
Hashcode equality does not imply anything. However, hashcode inequality should imply that equals will yield false, and any two items that are equal should always have the same hashcode.
For this reason, it is always wise to override hashcode with equals, because a number of data structures rely on it.
Even though failure to override hashCode() will only break usage of your class in HashSet, HashMap, and other hashCode dependent structures, you should still override hashCode() to maintain the contract described by Object.
The general strategy of most hashCode() implementations is to combine the hash codes of the fields used to determine equality. In your case, a reasonable hashCode() may look something like this:
@Override
public int hashCode() {
    return this.getName().hashCode() ^ this.getId() ^ this.getPassword().hashCode();
}
You need to override hashCode() when you override equals(). Merely using equals() is not enough to require you to override hashCode().
In your code, you aren't actually comparing your fields' values. Use equals() instead of == to make your implementation of equals() correct.
return (p.getName().equals(this.getName()) && ...
(Note that the above code can throw a NullPointerException if getName() returns null; you may want to use a null-safe utility such as java.util.Objects.equals.)
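A null-safe version might look like the sketch below. Account is a hypothetical stand-in for the question's dummy class, and the field types (String name, int id, String password) are assumptions, since the question does not show them.

```java
import java.util.Objects;

// Hypothetical version of the question's class; the field types are assumed.
final class Account {
    private final String name;
    private final int id;
    private final String password;

    Account(String name, int id, String password) {
        this.name = name;
        this.id = id;
        this.password = password;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Account)) return false;
        Account p = (Account) o;
        // Objects.equals compares by value and treats two nulls as equal.
        return id == p.id
                && Objects.equals(name, p.name)
                && Objects.equals(password, p.password);
    }

    @Override
    public int hashCode() {
        // Objects.hash tolerates null fields as well.
        return Objects.hash(name, id, password);
    }
}

class AccountDemo {
    public static void main(String[] args) {
        // new String(...) forces distinct String instances, so == would fail here.
        Account a = new Account(new String("bob"), 1, new String("pw"));
        Account b = new Account(new String("bob"), 1, new String("pw"));
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(new Account(null, 2, null).equals(new Account(null, 2, null))); // true
    }
}
```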
And yes hashCode() would be called when you use some hashing data structure like HashMap,HashSet
You must override hashCode() in every
class that overrides equals(). Failure
to do so will result in a violation of
the general contract for
Object.hashCode(), which will prevent
your class from functioning properly
in conjunction with all hash-based
collections, including HashMap,
HashSet, and Hashtable.
from Effective Java, by Joshua Bloch
The idea with hashCode() is that it is a compact numeric summary of your object that need not be unique. Data structures that hold objects use hash codes to determine where to place objects. In Java, a HashSet for example uses the hash code of an object to determine which bucket that object lies in, and then for all objects in that bucket, it uses equals() to determine whether it is a match.
If you don't override hashCode(), but do override equals(), then you will get to a point where you consider 2 objects to be equal, but Java collections don't see it the same way. This will lead to a lot of strange behaviour.
