equals and hashCode - java

I am running into a question about equals and hashCode contracts:
here it is
Given:
class SortOf {
String name;
int bal;
String code;
short rate;
public int hashCode() {
return (code.length() * bal);
}
public boolean equals(Object o) {
// insert code here
}
}
Which of the following will fulfill the equals() and hashCode() contracts for this
class? (Choose all that apply.)
Correct Answer
C:
return ((SortOf)o).code.length() * ((SortOf)o).bal == this.code.length() *
this.bal;
D:
return ((SortOf)o).code.length() * ((SortOf)o).bal * ((SortOf)o).rate ==
this.code.length() * this.bal * this.rate;
I have a question about the last choice D, say if the two objects
A: code.length=10, bal=10, rate = 100
B: code.length=10, bal=100, rate = 10
Then using the equals() method in D, we get A.equals(B) evaluating to true right? But then they get a different hashCode because they have different balances? Is it that I misunderstood the concept somewhere? Can someone clarify this for me?

You're right - D would be inappropriate because of this.
More generally, hashCode and equals should basically take the same fields into account, in the same way. This is a very strange equals implementation to start with, of course - you should normally be checking for equality between each of the fields involved. In a few cases fields may be inter-related in a way which would allow for multiplication etc, but I wouldn't expect that to involve a string length...
One important point which often confuses people is that it is valid for unequal objects to have the same hash code; it's the case you highlighted (equal objects having different hash codes) which is unacceptable.

You have to check at least all the fields used by .hashCode() so objects which are equal do have the same hash. But you can check more fields in equals, its totally fine to have different objects with the same hash. It seems your doing SCJP 1.6? This topic is well covered in the SCJP 1.6 book from Katherine Sierra and Bert Bates.
Note: thats why its legit to implement a useful .equals() while returning a constant value from .hashCode()

It's all about fulfilling the contract (as far as this question is concerned). Different implementation (of hasCode and equal) has different limitations and its own advantages - so its for developer to check that.
but then they get different hashCode because they have a different balance?
Exactly! But that's why you should choose option C. The question wants to test your grasp on fulfilling the contract concept and not which hascode will be better for the scenario.
More clarification:
The thing you need to check always is :
Your hashCode() implementation should use the same instance variables as used in equals() method.
Here these instance variables are : code.length() and bal used in hashCode() and hence you are limited to use these same variables in equals() as well. (Unless you can edit the hashCode() implementation and add rate to it)

hashCode() method is used to get a unique integer for given object. This integer is used to determined the bucket location, when this object need to be stored in some HashTable like HashMap data structure. But default, Object's hashCode() method returns an integer to represent memory address where object is stored.
equals() method, as name suggest, is used to simply verify the equality of two objects. Default implementation just simply check the object references of two object to verify their equality.
equal objects must have equal hash codes.
equals() must define an equality relation. if the objects are not modified, then it must keep returning the same value. o.equals(null) must always return false.
hashCode() must also be consistent, if the object is not modified in terms of equals(), it must keep returning the same value.
The relation between the two method is:
whenever a.equals(b) then a.hashCode() must be same as b.hashCode().
refer to: https://howtodoinjava.com/interview-questions/core-java-interview-questions-series-part-1/

In general, you should always override one if you override the other in a class. If you don't, you might find yourself getting into trouble when that class is used in hashmaps/hashtables, etc.

Related

Is == guaranteed to return correct result? [duplicate]

This question already has answers here:
Compare two objects with .equals() and == operator
(16 answers)
Closed 1 year ago.
From what I understand, the == operator in Java compares references (an int) of objects.
This value is what the default implementation of hashCode method in Object returns.
The hashCode method has an implementation note:
As far as is reasonably practical, the hashCode method defined
by class Object returns distinct integers for distinct objects.
reasonably practical: This means that, no matter how small, there is a real possibility that two distinct objects can have equal hashCode or reference value.
So, if I compare two different objects (that don't override hashCode and equals) using ==, it's a real possibility that the result can be true (?). The default implementation of equals does a == check:
public class Test {
public static void main(String[] args) {
var t1 = new Test();
var t2 = new Test();
System.out.println(t1.hashCode() + ":" + t2.hashCode()); // 2055281021:1554547125 (Could've been 1554547125:1554547125 ?)
System.out.println(t1 == t2); // false (Could've been true ?)
System.out.println(t1.equals(t2)); // false (Could've been true ?)
}
}
Why is that equals and hashCode are overridden in certain situations only and rest of the time (many library classes such as Thread) depend on default implementation for equality check when it's not guaranteed to return correct result?
And, how someone extremely risk-averse make sure the above false-positive would never occur? If the class has at least one non-static field, one can override hashCode and equals. But, what if this is not the case (like the Test class above)?
Can you please explain what am I missing here?
Edit 1:
Adding an API note for hashCode (taken form Silvio's answer):
This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language
Okay, there's a lot of questions here, so let's try to break it down.
From what I understand, the == operator in Java compares references (an int) of objects.
This value is what the default implementation of hashCode method in Object returns.
== in Java compares references, yes. Those references are not necessarily compatible with int. On many common architectures, int will probably coincide with most of the observable space that a reference can occupy, but that's not true in general.
In particular.
int is a signed type. That means half of its values are negative. Pointers are generally unsigned.
Even if we ignore the sign problems, int is a 32-bit type. Most modern computers are 64-bit, which means the address space would fit better in a 64-bit integer (i.e. a Java long). So only a small fraction of addresses can even be stored in int.
Second, hashCode is not required to have anything to do with the pointer itself. From the hashCode docs you referenced already
(This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)
A Java implementation is free to choose whatever hashCode it wants. Maybe you're running on some bizarre embedded hardware and it makes sense to use some additional flag variable in the computation. hashCode should not be assumed to be the pointer.
Why is that equals and hashCode are overridden in certain situations only and rest of the time (many library classes such as Thread) depend on default implementation for equality check when it's not guaranteed to return correct result?
What is your definition of "correct" here? The guarantees demanded by the Java specification can be summarized from the docs
The equals method implements an equivalence relation on non-null object references:
It is reflexive: for any non-null reference value x, x.equals(x) should return true.
It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
For any non-null reference value x, x.equals(null) should return false.
...
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
The default equals implementation clearly satisfies the basic requirements above, and the default hashCode is guaranteed by the standard to be the same for two equal objects.
We override equals when we have a better notion of equality. For instance, two strings should be considered equal if they have the same characters, even if they are distinct objects in memory, and two array lists should be equal if their elements are equal pointwise. But for something like Thread, what would that even mean? When should two arbitrary threads be equal? The default suffices well enough, because we'd gain nothing from overriding it anyway.
If the class has at least one non-static field, one can override hashCode and equals
What does equality have to do with the number of non-static fields? I can override the two just fine. Watch.
public final class MySimpleClass {
public boolean equals(Object other) {
return (other != null) && (other instanceof MySimpleClass);
}
public int hashCode() {
return 42;
}
}
That's a perfectly valid, conformant implementation of equality and hashing for MySimpleClass. In particular, since there's only one meaningfully distinct value of this class, I'd argue that's a good implementation of the two methods. No non-static fields required.
== always returns false if you compare 2 different objects and always returns true if you compare an object to itself.
But it is not guaranteed, that 2 different objects return different hash codes. That's because hashCode() returns int and there's only about 4 billion distinct ints. The number of objects in your code is constrained by the size of heap only.
So, because there can be more than 4 billion distinct objects, their hash codes can sometimes be the same
As for equals, it works as == by default, but can be overridden, so == can return false, when equals returns true and vice versa
equals and hashCode have an unenforceable-at-compile-time contract between them (which itself is different than the == operator).
Fundamentally speaking, an object should override hashCode such that a.equals(b) (and its inverse) is congruent to a.hashCode() == b.hashCode() (and its inverse).
The == operator is only looking to compare numeric equality, which is why the same instance of an object compared against itself (or a == a) will return true, with some caveats given to Strings and string interning.
Because the contract between equals and hashCode is unenforceable, suggesting that == will always return a "correct" result depends on your definition of "correct".
For instance:
It's correct that a square is a parallelogram; it's not correct that any given square is the same as any given parallelogram.
It's correct that a book is a dictionary; it's not correct that any given book is a dictionary.
It's correct that a car has wheels; it's not correct that any given car has any given number of wheels.
Also too - just remember that hashCode is only 32 bits, so there's always going to be the chance of a collision between two unrelated objects (which is where having equals pick up the slack is beneficial here).
In this context, you can only trust == based on the constraints and conditions the individual object has, and what business rules make sense for equality comparisons given a hash code, and nothing further. If your business rules require a deviation between how equals and hashCode behave, then you have to keep that context in mind when comparing through those methods.
Firstly, == checks the memory reference. JVM does this using pointers internally. So each object is literally different as they are stored in different memory address. As compared using memory address location 32 or 64 bit int/number.
For your second question, if you need to have a hash code implementation, but the class has no fields. Then use System.identityHashCode() to do it. It will provide zero for null object and a unique/smal hashcode for same object.
From what I understand, the == operator in Java compares references
(an int) of objects.
On a 64-bit architecture, a reference will need 8 bytes. This is a long, not an int.
This value is what the default implementation of hashCode method in Object returns.
When you cast a long to an int, you will lose information. This is why the default implementation of hashCode() can return equal hashes for different objects.

If the hashcode() creates hashcode based on the address of the object, how can two different objects with same contents create the same hashcode?

When you read the description of hashCode() in the Object class, it says that
If two objects are equal according to the equals(Object) method, then
calling the hashCode method on each of the two objects must produce
the same integer result.
I've read an article and it says that an object's hashcode() can provide different integer values if it runs in different environments even though their content is the same.
It does not happen with String class's hashcode() because the String class's hashcode() creates integer value based on the content of the object. The same content always creates the same hash value.
However, it happens if the hash value of the object is calculated based on its address in the memory. And the author of the article says that the Object class calculates the hash value of the object based on its address in the memory.
In the example below, the objects, p1 and p2 have the same content so equals() on them returns true.
However, how come these two objects return the same hash value when their memory addresses are different?
Here is the example code: main()
Person p1 = new Person2("David", 10);
Person p2 = new Person2("David", 10);
boolean b = p1.equals(p2);
int hashCode1 = p1.hashCode();
int hashCode2 = p2.hashCode();
Here is the overriden hashcode()
public int hashCode(){
return Objects.hash(name, age);
}
Is the article's content wrong?
If there is a hashCode() that calculates a hash value based on the instance's address what is the purpose of them?
Also, if it really exists, it violates the condition that Object class's hashCode() specifies. How should we use the hashCode() then?
I think you have misunderstood this statement:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
The word “must” means that you, the programmer, are required to write a hashCode() method which always produces the same hashCode value for two objects which are equal to each other according to their equals(Object) method.
If you don’t, you have written a broken object that will not work with any class that uses hash codes, particularly unsorted Sets and Maps.
I've read an article and it says that an object's hashcode() can provide different integer values if it runs in different environments even though their content is the same.
Yes, hashCode() can provide different values in different runtimes, for a class which does not override the hashCode() method.
… how come these two objects return the same hash value when their memory addresses are different?
Because you told them to. You overrode the hashCode() method.
If there is a hashCode() that calculates a hash value based on the instance's address what is the purpose of them?
It doesn’t have a lot of use. That’s why programmers are strongly recommended to override the method, unless they don’t care about object identity.
Also, if it really exists, it violates the condition that Object class's hashCode() specifies. How should we use the hashCode() then?
No it does not. The contract states:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
As long as the hashCode method defined in the Object class returns the same value for the duration of the Java runtime, it is compliant.
The contract also says:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Two objects whose equals method reports that they are effectively equal must return the same hashCode. Any instance of the Object class whose type is not a subclass of Object is only equal to itself and cannot be equal to any other object, so any value it returns is valid, as long as it is consistent throughout the life of the Java runtime.
The article is wrong, memory address is not involved (btw. the address can change during the lifecycle of objects, so it's abstracted away as a reference). You can look at the default hashCode as a method that returns a random integer, which is always the same for the same object.
Default equals (inherited from Object) works exactly like ==. Since there is a condition (required for classes like HashSet etc. to work) that states when a.equals(b) then a.hashCode == b.hashCode, then we need a default hashCode which only has to have one property: when we call it twice, it must return the same number.
The default hashCode exists exactly so that the condition you mention is upheld. The value returned is not important, it's only important that it never changes.
No p1 and p2 does not have the same content
If you do p1.equals(p2) that will be false, so not the same content.
If you want p1 and p2 to equal, you need to implement the equals methods from object, in a way that compare their content. And IF you implement the equals method, then you also MUST implement the hashCode method, so that if equals return true, then the objects have the same hashCode().
Here's the design decision every programmer needs to make for objects they define of (say) MyClass:
Do you want it to be possible for two different objects to be "equal"?
If you do, then firstly you have to implement MyClass.equals() so that it gives the correct notion of equality for your purposes. That's entirely in your hands.
Then you're supposed to implement hashCode such that, if A.equals(B), then A.hashCode() == B.hashCode(). You explicitly do not want to use Object.hashCode().
If you don't want different objects to ever be equal, then don't implement equals() or hashCode(), use the implementations that Object gives you. For Object A and Object B (different Objects, and not subclasses of Object), then it is never the case that A.equals(B), and so it's perfectly ok that A.hashCode() is never the same as B.hashCode().

What is the use of overriding hashCode in Java other than Collections API?

This question is asked by interviewer that most of answers related to hash code is used for bucketing where it checks equals to search objects.
Is there any other general use case or scenario, where hash code is beneficial and can be used in a routine program?
As recently I have used JPA where it throws exception "Composite-id class does not override hashCode()" but again it is used by implementation class of hibernate. Sly, what other places or scenario we can use hashcode other then collections especially scenario where you have used it yourself.
class a {
public int hashCode() {
}
}
class b {
public static void main(String[] str) {
//In what ways can i use hashcode here?
}
}
Let's say your class will not be used in any collection ever( which is very unlikely though), it will be used in more than one place and by other developers. Any developer using your class will expect that if two instances of that class are equal based on equals method, they should produce same hashCode value. This fundamental assumption will be broken if hashCode is not overridden to be consistent with equals and that will prevent their code to function properly.
From Effective Java , 3rd Edition :
ITEM 11: ALWAYS OVERRIDE HASHCODE WHEN YOU OVERRIDE EQUALS
You must override hashCode in every class that overrides equals. If
you fail to do so, your class will violate the general contract for
hashCode, which will prevent it from functioning properly in
collections such as HashMap and HashSet. Here is the contract, adapted
from the Object specification :
• When the hashCode method is invoked on an object repeatedly during
an execution of an application, it must consistently return the same
value, provided no information used in equals comparisons is modified.
This value need not remain consistent from one execution of an
application to another.
• If two objects are equal according to the equals(Object) method,
then calling hashCode on the two objects must produce the same integer
result.
• If two objects are unequal according to the equals(Object) method,
it is not required that calling hashCode on each of the objects must
produce distinct results. However, the programmer should be aware that
producing distinct results for unequal objects may improve the
performance of hash tables.
The key provision that is violated when you fail to override hashCode
is the second one: equal objects must have equal hash codes. Two
distinct instances may be logically equal according to a class’s
equals method, but to Object’s hashCode method, they’re just two
objects with nothing much in common. Therefore, Object’s hashCode
method returns two seemingly random numbers instead of two equal
numbers as required by the contract.
A small semantic mistake in the interview question. Hash code is not used to check equality, it's used to detect inequality. If hash codes are different the objects are guaranteed to be inequal. If the codes are equal the objects may be equal and need to be checked with the equals-method.
That said, if the hash code is cached, it could be used to speed up the equals method.
Suppose you only override equals but not hashCode
This means that hashCode is inherited from Object
Object.hashCode always tries to return different hash codes for different objects (regardless if they are equal or not)
This means that you may end up with different hash codes for two objects that you consider to be equal.
This in turn causes these two equal objects to end up in different buckets in hash based collections such as HashSet.
This causes such collections to break.
More reference: https://programming.guide/java/overriding-hashcode-and-equals.html

Hashcode and equals methods contract [duplicate]

This question already has answers here:
Why do I need to override the equals and hashCode methods in Java?
(31 answers)
Closed 7 years ago.
I know that when we override equals() method then we need to override hashcode() as well and other way around.
But i don't understand why we MUST do that?
In Joshua Bloch Book it is clearly written that we must do that, because when we deal with hash based collections, it is crucial to satisfy the Hashcode contract and I admit that, but what if I am not dealing with hash-based collections?
Why is it still required ?
Why to Override Equals ?
A programmer who compares references to value objects using the equals
method expects to find out whether they are logically equivalent, not
whether they refer to the same object .
Now coming to HashCode
Hash function which is called to produce the hashCode should return the same hash code each and every time,
when function is applied on same or equal objects. In other words, two
equal objects must produce same hash code consistently.
Implementation of HashCode provided by Object Class is not based upon logical equivalency ,
So Now if you will not override hashCode but override equals, then according to you 2 Objects are equals as they will pass the equals() test but according to Java they are not .
Consequences :
Set start allowing duplicates !!!
Map#get(key) will not return the correct value !!
and so on many other consquences..................
Data structures, such as HashMap, depend on the contract.
A HashMap achieves magical performance properties by using the hashcode to bucketize entries. Every item that is put in the map that has the same hashcode() value gets placed in the same bucket. These "collisions" are resolved by comparing within the same bucket using equals(). In other words, the hashcode is used to determine the subset of the items in the map that might be equal and in this way quickly eliminate the vast majority of the items from further consideration.
This only works if objects that are equal are placed in the same bucket, which can only be ensured if they have the same hashcode.
NOTE: In practice, the number of collisions is much higher than may be implied above, because the number of buckets used is necessarily much smaller than the number of possible hashcode values.
As per Joshua Bloch book;
A common source of bugs is the failure to override the hashCode
method. You must override hashCode in every class that overrides
equals. Failure to do so will result in a violation of the general
contract for Object.hashCode, which will prevent your class from
functioning properly in conjunction with all hash-based collections,
including HashMap, HashSet, and Hashtable.
Failing to override hashcode while overriding equals is violation the contract of Object.hashCode. But this won't have impact if you are using your objects only on non hash based collection.
However, how do you prevent; the other developers doing so. Also if an object is eligible for element of collection, better provide support for all the collections, don't have half baked objects in your project. This will fail anytime in the future, and you will be caught for not following the contacts while implementing :)
Because that is the way it is meant to be:
Whenever a.equals(b), then a.hashCode() must be same as b.hashCode().
What issues should be considered when overriding equals and hashCode in Java?
There are use-cases where you don't need hashcode(), mostly self-written scenarious, but you can never be sure, because implementations can and might be also relying on hashcode() if they are using equals()
This question is answered many times in SO, but still I will attempt to answer this .
In order to understand this concept completely, we need to understand the purpose of hashcode and equals, how they are implemented, and what exactly is this contract(that hashcode also should be overridden when equals is overridden)
equals method is used to determine the equality of the object. For primitive types, its very easy to determine the equality. We can very easily say that int 1 is always equal to 1. But this equal method talks about the equality of objects. The object equality depends on the instance variables or any other parameter (depend purely on the implementation - how you want to compare).
This equal method needs to be overridden if we want some customized comparison, lets say we want to say that two books are same if they have same title and same author, or I can say two books are equal if they have same ISBN.
hashcode method returns a hash code value of an object. The default implementation of the Object hashcode returns a distinct integers for distinct objects. This integer is calculated based on the memory address of the object.
So we can say that the default implementation of the equals method just comapres the hashcodes to check the equality of the object. But for the book example - we need it differently.
Also Equal objects must produce the same hash code as long as they are equal, however unequal objects need not produce distinct hash codes.
In case of not using a hash based collection, you can break the contract and need not to override the hashcode method - because you ll not be using the default implementations anywhere but still I would not suggest that and would say to have it as you may need it in future when you put those things in collection

java - what happens if hashCode is not overriden? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
what is an objects hashcode
Let's say that I create an object, called Employee which has id, firstName, lastName and email for instance variables and corresponding setter/getter methods. How is hashCode() calculated if I don't override hashCode() in Employee object when it is stored in collection objects?
If you don't override hashcode() then the default implementation in Object class will be used by collections. This implementation gives different values for different objects, even if they are equal according to the equals() method.
Some collections, like HashSet, HashMap or HashTable use the hash code to store its data and to retrieve it. If you don't implement hashcode() and equals() in a consistent manner, then they will not function properly.
Edit:
As per Javadoc: Object.hashcode() is ''typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java(TM) programming language''. Therefore I would advise not to rely on a specific implementation. For what the implementations really do, see this answer to a similar question.
From the documentation:
As much as is reasonably practical, the hashCode method defined by
class Object does return distinct integers for distinct objects. (This
is typically implemented by converting the internal address of the
object into an integer, but this implementation technique is not
required by the JavaTM programming language.)
So basically when you store in a Map/Set/somethingThatRequiresHashCode, the JVM will use the internal memory address of that instance to calculate the hashCode, guaranteeing (as much as hash functions guarantee anything - they don't) that each distinct instance will have a unique hashCode.
This is particularly important because of the Object contract regarding equals and hashCode, since:
The equals method for class Object implements the most discriminating
possible equivalence relation on objects; that is, for any non-null
reference values x and y, this method returns true if and only if x
and y refer to the same object (x == y has the value true).
If you don't override equals, it will compare the internal address of the two references, which matches the logic behind hashCode.
If your question is more related to: Will the JVM look at the values inside an instance to determine equality/calculate hashcode, the answer is simply no, if you do:
MyObject a = new MyObject("a", 123,"something");
MyObject b = new MyObject("a", 123,"something");
a and b will have different hashcodes.
From effective Java 2nd Edition
Item 9: Always override hashCode when you override equals
A common
source of bugs is the failure to override the hashCode method. You
must override hashCode in every class that overrides equals. Failure
to do so will result in a violation of the general contract for
Object.hashCode, which will prevent your class from functioning
properly in conjunction with all hash-based collections, including
HashMap, HashSet, and Hashtable.
I suggest you read that chapter. There are a lot of examples, that you can learn, what will happen if you don't do so.
It depends on the collection, most collections should work even if the hashCode of the elements has not been overridden, except collections like HashSet which rely on the element hashCode in order to work properly.
Beware that the hashCode of a collection usually relies on the hashCode of the elements:
Returns the hash code value for this collection. While the Collection
interface adds no stipulations to the general contract for the
Object.hashCode method, programmers should take note that any class
that overrides the Object.equals method must also override the
Object.hashCode method in order to satisfy the general contract for
the Object.hashCode method. In particular, c1.equals(c2) implies that
c1.hashCode()==c2.hashCode()
See: http://docs.oracle.com/javase/6/docs/api/java/util/Collection.html#hashCode%28%29
**type of field** **hash code formula**
boolean fieldHash = f ? 0 : 1
any integer type except long fieldHash = (int) f
long fieldHash = (int) (f ^ (f >>> 32))
float fieldHash = Float.floatToIntBits( f )
double int v=Double.doubleToLongBits(f)
fieldHash (int) (v ^ (v >>> 32))
any Object reference fieldHash = f.hashCode()
If you write a custom type you are responsible for creating a good hashCode implementation that will best represent the state of the current instance.
http://content.hccfl.edu/pollock/AJava/HashCode.htm

Categories