Java hashCode doubt - java

I have this program:
import java.util.*;
public class test {
private String s;
public test(String s) { this.s = s; }
public static void main(String[] args) {
HashSet<Object> hs = new HashSet<Object>();
test ws1 = new test("foo");
test ws2 = new test("foo");
String s1 = new String("foo");
String s2 = new String("foo");
hs.add(ws1);
hs.add(ws2);
hs.add(s1);
hs.add(s2); // removing this line also gives same output.
System.out.println(hs.size());
}
}
Note that this is not a homework. We were asked this question on our quiz earlier today. I know the answers but trying to understand why it is so.
The above program gives 3 as output.
Can anyone please explain why that is?
I think (not sure):
The java.lang.String class overrides the hashCode method from java.lang.Object. So the String objects with value "foo" will be treated as duplicates. The test class does not override the hashCode method and ends up using the java.lang.Object version and this version always returns a different hashcode for every object, so the two test objects being added are treated as different.

In this case it's not about hashCode() but is about equals() method. HashSet is still Set, which has semantic of not allowing duplicates. Duplicates are checked for using equals() method which in case of String will return true
However for your test class equals() method is not defined and it will use the default implementation from Object which will return true only when both references are to the same instance.
Method hashCode() is used not to check if objects should be treated as same but as a way to distribute them in collections based on hash functions. It's absolutely possible that for two objects this method will return same value while equals() will return false.
P.S. hashCode implementation of Object doesn't guarantee uniqueness of values. It's easy to check using simple loop.

Hashcode is used to narrow down the search result. When we try to insert any key in HashMap first it checks whether any other object present with same hashcode and if yes then it checks for the equals() method. If two objects are same then HashMap will not add that key instead it will replace the old value by new one.

In fact, it is not about overriding the hashcode(), it is about equals method. Set does not allow duplicates. A duplicate is the one where the objects are logically equal.
For verifying you can try with
System.out.println(ws1.equals(ws2));
System.out.println(s1.equals(s2));
If the objects are equal, only one will be accepted by a set.

Below are few (well quite many) bullets refarding the equals and hashcode from my preparations to SCJP.
Hope it helps:
equals(), hashCode(), and toString() are public.
Override toString() so that System.out.println() or other methods can see something useful, like your object's state.
Use == to determine if two reference variables refer to the same object.
Use equals() to determine if two objects are meaningfully equivalent.
If you don't override equals(), your objects won't be useful hashing keys.
If you don't override equals(), different objects can't be considered equal.
Strings and wrappers override equals() and make good hashing keys.
When overriding equals(), use the instanceof operator to be sure you're evaluating an appropriate class.
When overriding equals(), compare the objects' significant attributes.
Highlights of the equals() contract:
a. Reflexive: x.equals(x) is true.
b. Symmetric: If x.equals(y) is true, then y.equals(x) must be true.
c. Transitive: If x.equals(y) is true, and y.equals(z) is true, then z.equals(x) is true.
d. Consistent: Multiple calls to x.equals(y) will return the same result.
e. Null: If x is not null, then x.equals(null) is false.
f. If x.equals(y) is true, then x.hashCode() == y.hashCode() is true.
If you override equals(), override hashCode().
HashMap, HashSet, Hashtable, LinkedHashMap, & LinkedHashSet use hashing.
An appropriate hashCode() override sticks to the hashCode() contract.
An efficient hashCode() override distributes keys evenly across its buckets.
An overridden equals() must be at least as precise as its hashCode() mate.
To reiterate: if two objects are equal, their hashcodes must be equal.
It's legal for a hashCode() method to return the same value for all instances (although in practice it's very inefficient).
In addition if you implement equals and hashcode the transient fields (if any) must be treated properly.
The Commons have nice implementation for EqualsBuilder and HashcodeBuilder. They are available in Coomons Lang
http://commons.apache.org/lang/
I use them whenevr I need to implement the equals and the hashcode.

Related

Is == guaranteed to return correct result? [duplicate]

This question already has answers here:
Compare two objects with .equals() and == operator
(16 answers)
Closed 1 year ago.
From what I understand, the == operator in Java compares references (an int) of objects.
This value is what the default implementation of hashCode method in Object returns.
The hashCode method has an implementation note:
As far as is reasonably practical, the hashCode method defined
by class Object returns distinct integers for distinct objects.
reasonably practical: This means that, no matter how small, there is a real possibility that two distinct objects can have equal hashCode or reference value.
So, if I compare two different objects (that don't override hashCode and equals) using ==, it's a real possibility that the result can be true (?). The default implementation of equals does a == check:
public class Test {
public static void main(String[] args) {
var t1 = new Test();
var t2 = new Test();
System.out.println(t1.hashCode() + ":" + t2.hashCode()); // 2055281021:1554547125 (Could've been 1554547125:1554547125 ?)
System.out.println(t1 == t2); // false (Could've been true ?)
System.out.println(t1.equals(t2)); // false (Could've been true ?)
}
}
Why is that equals and hashCode are overridden in certain situations only and rest of the time (many library classes such as Thread) depend on default implementation for equality check when it's not guaranteed to return correct result?
And, how someone extremely risk-averse make sure the above false-positive would never occur? If the class has at least one non-static field, one can override hashCode and equals. But, what if this is not the case (like the Test class above)?
Can you please explain what am I missing here?
Edit 1:
Adding an API note for hashCode (taken form Silvio's answer):
This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language
Okay, there's a lot of questions here, so let's try to break it down.
From what I understand, the == operator in Java compares references (an int) of objects.
This value is what the default implementation of hashCode method in Object returns.
== in Java compares references, yes. Those references are not necessarily compatible with int. On many common architectures, int will probably coincide with most of the observable space that a reference can occupy, but that's not true in general.
In particular.
int is a signed type. That means half of its values are negative. Pointers are generally unsigned.
Even if we ignore the sign problems, int is a 32-bit type. Most modern computers are 64-bit, which means the address space would fit better in a 64-bit integer (i.e. a Java long). So only a small fraction of addresses can even be stored in int.
Second, hashCode is not required to have anything to do with the pointer itself. From the hashCode docs you referenced already
(This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)
A Java implementation is free to choose whatever hashCode it wants. Maybe you're running on some bizarre embedded hardware and it makes sense to use some additional flag variable in the computation. hashCode should not be assumed to be the pointer.
Why is that equals and hashCode are overridden in certain situations only and rest of the time (many library classes such as Thread) depend on default implementation for equality check when it's not guaranteed to return correct result?
What is your definition of "correct" here? The guarantees demanded by the Java specification can be summarized from the docs
The equals method implements an equivalence relation on non-null object references:
It is reflexive: for any non-null reference value x, x.equals(x) should return true.
It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
For any non-null reference value x, x.equals(null) should return false.
...
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
The default equals implementation clearly satisfies the basic requirements above, and the default hashCode is guaranteed by the standard to be the same for two equal objects.
We override equals when we have a better notion of equality. For instance, two strings should be considered equal if they have the same characters, even if they are distinct objects in memory, and two array lists should be equal if their elements are equal pointwise. But for something like Thread, what would that even mean? When should two arbitrary threads be equal? The default suffices well enough, because we'd gain nothing from overriding it anyway.
If the class has at least one non-static field, one can override hashCode and equals
What does equality have to do with the number of non-static fields? I can override the two just fine. Watch.
public final class MySimpleClass {
public boolean equals(Object other) {
return (other != null) && (other instanceof MySimpleClass);
}
public int hashCode() {
return 42;
}
}
That's a perfectly valid, conformant implementation of equality and hashing for MySimpleClass. In particular, since there's only one meaningfully distinct value of this class, I'd argue that's a good implementation of the two methods. No non-static fields required.
== always returns false if you compare 2 different objects and always returns true if you compare an object to itself.
But it is not guaranteed, that 2 different objects return different hash codes. That's because hashCode() returns int and there's only about 4 billion distinct ints. The number of objects in your code is constrained by the size of heap only.
So, because there can be more than 4 billion distinct objects, their hash codes can sometimes be the same
As for equals, it works as == by default, but can be overridden, so == can return false, when equals returns true and vice versa
equals and hashCode have an unenforceable-at-compile-time contract between them (which itself is different than the == operator).
Fundamentally speaking, an object should override hashCode such that a.equals(b) (and its inverse) is congruent to a.hashCode() == b.hashCode() (and its inverse).
The == operator is only looking to compare numeric equality, which is why the same instance of an object compared against itself (or a == a) will return true, with some caveats given to Strings and string interning.
Because the contract between equals and hashCode is unenforceable, suggesting that == will always return a "correct" result depends on your definition of "correct".
For instance:
It's correct that a square is a parallelogram; it's not correct that any given square is the same as any given parallelogram.
It's correct that a book is a dictionary; it's not correct that any given book is a dictionary.
It's correct that a car has wheels; it's not correct that any given car has any given number of wheels.
Also too - just remember that hashCode is only 32 bits, so there's always going to be the chance of a collision between two unrelated objects (which is where having equals pick up the slack is beneficial here).
In this context, you can only trust == based on the constraints and conditions the individual object has, and what business rules make sense for equality comparisons given a hash code, and nothing further. If your business rules require a deviation between how equals and hashCode behave, then you have to keep that context in mind when comparing through those methods.
Firstly, == checks the memory reference. JVM does this using pointers internally. So each object is literally different as they are stored in different memory address. As compared using memory address location 32 or 64 bit int/number.
For your second question, if you need to have a hash code implementation, but the class has no fields. Then use System.identityHashCode() to do it. It will provide zero for null object and a unique/smal hashcode for same object.
From what I understand, the == operator in Java compares references
(an int) of objects.
On a 64-bit architecture, a reference will need 8 bytes. This is a long, not an int.
This value is what the default implementation of hashCode method in Object returns.
When you cast a long to an int, you will lose information. This is why the default implementation of hashCode() can return equal hashes for different objects.

If the hashcode() creates hashcode based on the address of the object, how can two different objects with same contents create the same hashcode?

When you read the description of hashCode() in the Object class, it says that
If two objects are equal according to the equals(Object) method, then
calling the hashCode method on each of the two objects must produce
the same integer result.
I've read an article and it says that an object's hashcode() can provide different integer values if it runs in different environments even though their content is the same.
It does not happen with String class's hashcode() because the String class's hashcode() creates integer value based on the content of the object. The same content always creates the same hash value.
However, it happens if the hash value of the object is calculated based on its address in the memory. And the author of the article says that the Object class calculates the hash value of the object based on its address in the memory.
In the example below, the objects, p1 and p2 have the same content so equals() on them returns true.
However, how come these two objects return the same hash value when their memory addresses are different?
Here is the example code: main()
Person p1 = new Person2("David", 10);
Person p2 = new Person2("David", 10);
boolean b = p1.equals(p2);
int hashCode1 = p1.hashCode();
int hashCode2 = p2.hashCode();
Here is the overriden hashcode()
public int hashCode(){
return Objects.hash(name, age);
}
Is the article's content wrong?
If there is a hashCode() that calculates a hash value based on the instance's address what is the purpose of them?
Also, if it really exists, it violates the condition that Object class's hashCode() specifies. How should we use the hashCode() then?
I think you have misunderstood this statement:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
The word “must” means that you, the programmer, are required to write a hashCode() method which always produces the same hashCode value for two objects which are equal to each other according to their equals(Object) method.
If you don’t, you have written a broken object that will not work with any class that uses hash codes, particularly unsorted Sets and Maps.
I've read an article and it says that an object's hashcode() can provide different integer values if it runs in different environments even though their content is the same.
Yes, hashCode() can provide different values in different runtimes, for a class which does not override the hashCode() method.
… how come these two objects return the same hash value when their memory addresses are different?
Because you told them to. You overrode the hashCode() method.
If there is a hashCode() that calculates a hash value based on the instance's address what is the purpose of them?
It doesn’t have a lot of use. That’s why programmers are strongly recommended to override the method, unless they don’t care about object identity.
Also, if it really exists, it violates the condition that Object class's hashCode() specifies. How should we use the hashCode() then?
No it does not. The contract states:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
As long as the hashCode method defined in the Object class returns the same value for the duration of the Java runtime, it is compliant.
The contract also says:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Two objects whose equals method reports that they are effectively equal must return the same hashCode. Any instance of the Object class whose type is not a subclass of Object is only equal to itself and cannot be equal to any other object, so any value it returns is valid, as long as it is consistent throughout the life of the Java runtime.
The article is wrong, memory address is not involved (btw. the address can change during the lifecycle of objects, so it's abstracted away as a reference). You can look at the default hashCode as a method that returns a random integer, which is always the same for the same object.
Default equals (inherited from Object) works exactly like ==. Since there is a condition (required for classes like HashSet etc. to work) that states when a.equals(b) then a.hashCode == b.hashCode, then we need a default hashCode which only has to have one property: when we call it twice, it must return the same number.
The default hashCode exists exactly so that the condition you mention is upheld. The value returned is not important, it's only important that it never changes.
No p1 and p2 does not have the same content
If you do p1.equals(p2) that will be false, so not the same content.
If you want p1 and p2 to equal, you need to implement the equals methods from object, in a way that compare their content. And IF you implement the equals method, then you also MUST implement the hashCode method, so that if equals return true, then the objects have the same hashCode().
Here's the design decision every programmer needs to make for objects they define of (say) MyClass:
Do you want it to be possible for two different objects to be "equal"?
If you do, then firstly you have to implement MyClass.equals() so that it gives the correct notion of equality for your purposes. That's entirely in your hands.
Then you're supposed to implement hashCode such that, if A.equals(B), then A.hashCode() == B.hashCode(). You explicitly do not want to use Object.hashCode().
If you don't want different objects to ever be equal, then don't implement equals() or hashCode(), use the implementations that Object gives you. For Object A and Object B (different Objects, and not subclasses of Object), then it is never the case that A.equals(B), and so it's perfectly ok that A.hashCode() is never the same as B.hashCode().

Object equals method and hashcode Method

I'm revising for my upcoming exam and I've came across a question from a past paper that I'm unsure on.
The question is the following : Describe the equivalence relation that the Object equals method implements. What relationship must hold between the Object equals method and hasCode methods?
If this came up in the exam then I wouldn't be too sure how about answering this. I'll try and give it my best shot. Since the Object equals method is checking whether two objects are equal, the hashcode gives objects ascii values. If you have two objects that are the same it is possible that they can be given the same hascode since they're equal. The object equals method you are told whether they're equal whereas the hashcode method gives you a value which you can then use to compare with other hashcodes and find out whether they're the same or not. Also could have I of mentioned something about overriding methods ? I'm not if that could come into play in the answer.
My answer is probably completely wrong but the best I could think off :P
To the Java API Documentation! especially Object.hashCode
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
This means that if the equal method returns true, hashCode must return the same value for both objects.
x.equals(y) == true => x.hashCode() == y.hashCode()
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
But! If the objects are not the same they may have the same hashCode()! (But it is not advised)
The requirements for the equals() method are well described, so i won't repeat them here.
The question is clearly answered in the descriptions of both the equals() method and the hashCode() method. Have a look at equals method description and hashCode method description.
Here is the relevant answer you might be looking for:
The equals method for class Object implements the most discriminating
possible equivalence relation on objects; that is, for any non-null
reference values x and y, this method returns true if and only if x
and y refer to the same object (x == y has the value true).
Note that it is generally necessary to override the hashCode method
whenever this method is overridden, so as to maintain the general
contract for the hashCode method, which states that equal objects must
have equal hash codes.
This the hashCode method description:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Also, the hashCode method does not give the ASCII value of an object, merely an integer value that uniquely defines the object.

Does singleton means hashcode always return the same?

I have two objects, o1 and o2 from the same class.
If o1.hashcode() == o2.hashcode(), can I tell they are the same object?
Beside o1==o2, is there any other way to tell the singleton.
If you have a single instance of the class, the == and the equals comparison will always return true.
However, the hashcode can be equal for different objects, so an equality is not guaranteed just by having equal hashcodes.
Here is a nice explanation of the hashcode and equals contracts.
Checking the equality is not sufficient to be sure that you have a singleton, only that the instances are considered equal.
If you want to have a single instance of a java class, it may be better to make use of static members and methods.
Here, several approaches to singletons are demonstrated.
EDIT: as emory pointed out - you could in fact override equals to return something random and thus violate the required reflexivity (x.equals(x) == true). As you cannot override operators in java, == is the only reliable way to determine identical objects.
No, different objects can have the same hashCode():
"hypoplankton".hashCode()
"unheavenly" .hashCode()
both return the same 427589249 hash value, while they are clearly not equal.
Your question (from the title) seems to be "will hashCode() always return the same value for the same object"... the answer is no.
The implementation is free to return anything, although to be well behaved it should return the same for the same object. For example, this is a valid, albeit poor, implementation:
#Override
public int hashCode() {
return (int) (Math.random() * Integer.MAX_VALUE);
}
The general contract for the hashCode is described below (Effective Java, page. 92). The 3rd item means that the hashCode() results do not need to be unique when called on unequal objects.
Within the same program, the result of hashCode() must not change
If equals() returns true when called with two objects, calling hashCode() on each of those objects must return the same result
If equals() returns false when called with two objects, calling hashCode() on each of those objects does not have to return a different result

Does Hashcode equality imply refer reference based equality?

I read that to use equals() method in java we also have to override the hashcode() method and that the equal (logically) objects should have eual hashcodes, but doesn't that imply reference based equality! Here is my code for overridden equals() method, how should I override hashcode method for this:
#Override
public boolean equals(Object o)
{
if (!(o instanceof dummy))
return false;
dummy p = (dummy) o;
return (p.getName() == this.getName() && p.getId() == this.getId() && p.getPassword() == this.getPassword());
}
I just trying to learn how it works, so there are only three fields, namely name , id and password , and just trying to compare two objects that I define in the main() thats all! I also need to know if it is always necessary to override hashcode() method along with equals() method?
Hashcode equality does not imply anything. However, hashcode inequality should imply that equals will yield false, and any two items that are equal should always have the same hashcode.
For this reason, it is always wise to override hashcode with equals, because a number of data structures rely on it.
Even though failure to override hashCode() will only break usage of your class in HashSet, HashMap, and other hashCode dependent structures, you should still override hashCode() to maintain the contract described by Object.
The general strategy of most hashCode() implementations is to combine the hash codes of the fields used to determine equality. In your case, a reasonable hashCode() may look something like this:
public int hashCode(){
return this.getName().hashCode() ^ this.getId() ^ this.getPassword().hashCode();
}
You need to override hashCode() when you override equals(). Merely using equals() is not enough to require you to override hashCode().
In your code, you aren't actually comparing your fields' values. Use equals() instead of == to make your implementation of equal correct.
return (p.getName().equals(this.getName()) && ...
(Note that the above code can cause null reference exceptions if getName() returns null: you may want to use a utility class as described here)
And yes hashCode() would be called when you use some hashing data structure like HashMap,HashSet
You must override hashCode() in every
class that overrides equals(). Failure
to do so will result in a violation of
the general contract for
Object.hashCode(), which will prevent
your class from functioning properly
in conjunction with all hash-based
collections, including HashMap,
HashSet, and Hashtable.
from Effective Java, by Joshua Bloch
Also See
overriding-equals-and-hashcode-in-java
hashcode-and-equals
Nice article on equals() & hashCode()
The idea with hashCode() is that it is a unique representation of your object in a given space. Data structures that hold objects use hash codes to determine where to place objects. In Java, a HashSet for example uses the hash code of an object to determine which bucket that objects lies in, and then for all objects in that bucket, it uses equals() to determine whether it is a match.
If you don't override hashCode(), but do override equals(), then you will get to a point where you consider 2 objects to be equal, but Java collections don't see it the same way. This will lead to a lot of strange behaviour.

Categories