As far as I know, a Set relies on two methods, equals() and hashCode(), to decide whether two values are equal and to avoid duplicate entries. The program below confuses me: the wrapper classes also override both methods, so why does the set accept duplicate values?
Code:
Collection col=new LinkedHashSet();
col.add(new Long(65));
col.add(new Byte((byte) 65));
col.add(new Integer(65));
col.add("A");
System.out.println(col);
Output: [65, 65, 65, A]
But I expected [65, A]
A Long instance can never be equal to an Integer instance, which in turn can never be equal to a Byte instance, even if they all hold the same numeric value. The 3 numeric instances you put in your Set are not equal to each other.
See, for example, Integer's equals :
public boolean equals(Object obj) {
    if (obj instanceof Integer) {
        return value == ((Integer)obj).intValue();
    }
    return false;
}
Both instances must be of the same type in order to be equal to each other (a necessary condition).
Here is what the javadoc for Integer.equals(Object) says:
Compares this object to the specified object. The result is true if and only if the argument is not null and is an Integer object that contains the same int value as this object.
In other words, an Integer object cannot be equal to an object that is not also an Integer object. The same applies to all of the primitive wrapper classes.
Thus, those 4 objects in your example are not equal, and hence not duplicates, according to the semantics of HashSet.
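To see the type-sensitive equality in action, here is a minimal sketch. The class name WrapperEqualityDemo and the helper buildSet are made up for illustration, and it uses valueOf rather than the deprecated wrapper constructors from the question.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class WrapperEqualityDemo {
    // Builds the same set as in the question (hypothetical helper)
    static Set<Object> buildSet() {
        Set<Object> set = new LinkedHashSet<>();
        set.add(Long.valueOf(65));
        set.add(Byte.valueOf((byte) 65));
        set.add(Integer.valueOf(65));
        set.add("A");
        return set;
    }

    public static void main(String[] args) {
        // Same numeric value, different wrapper types: never equal
        System.out.println(Integer.valueOf(65).equals(Long.valueOf(65)));    // false
        // Same type and same value: equal
        System.out.println(Integer.valueOf(65).equals(Integer.valueOf(65))); // true
        // So the set keeps all four elements
        System.out.println(buildSet());                                      // [65, 65, 65, A]
    }
}
```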
Unfortunately, if you are using HashSet or any other of the standard Java hashtable-based classes, together with the standard wrapper classes, there is no workaround for this.
However:
If you were to use TreeSet or similar, you could work around this issue using a custom Comparator object.
There is an alternative hash table implementation in Guava that allows you to supply external equals and hashCode implementations.
You could create your own wrapper classes with different semantics for equals and hashcode to the standard classes. Unfortunately, these would not interoperate with other things; e.g. Java auto-boxing / auto-unboxing.
Set does not allow duplicate values. The question here is what is duplicate and what is not.
Duplicate values are those that are of the same type, have the same hashCode() and also return true when being compared with equals(). Objects of different types (in your case Byte and Long) are not equal in these terms.
If you want however to put values of different numeric types into Set and enjoy cross-type behavior, you can use TreeSet with your custom Comparator that compares values only without taking the type into consideration.
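A rough sketch of that TreeSet idea, assuming only integral values are stored so that comparing by longValue() is good enough. The class name and the comparator variable are invented for the example. Note that the String "A" from the question cannot go into this set, because the comparator only understands Number.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

class NumericTreeSetDemo {
    // Hypothetical helper: a set that treats numbers with the same long value as duplicates
    static Set<Number> buildSet() {
        // Assumption: values are integral; fractional values would need a different comparison
        Comparator<Number> byValue = Comparator.comparingLong(Number::longValue);
        Set<Number> set = new TreeSet<>(byValue);
        set.add(Long.valueOf(65));
        set.add(Byte.valueOf((byte) 65));    // compares as 0 against the Long, so it is rejected
        set.add(Integer.valueOf(65));        // rejected as well
        return set;
    }

    public static void main(String[] args) {
        System.out.println(buildSet()); // [65]
    }
}
```

Be aware that such a set is no longer consistent with equals(), so it does not strictly obey the general Set contract. That is usually acceptable when the cross-type behavior is exactly what you want.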
I recently ran across a problem on leetcode which I solved with a nested hashset. This is the problem, if you're interested: https://leetcode.com/problems/group-anagrams/.
My intuition was to add all of the letters of each word into a hashset, then put that hashset into another hashset. At each iteration, I would check if the hashset already existed, and if it did, add to the existing hashset.
Oddly enough, that seems to work. Why do 2 hashsets share the same hashcode if they are different objects? Would something like if(set1.hashCode() == set2.hashCode()) doStuff() be valid code?
This is expected. HashSet extends AbstractSet. The hashCode() method in AbstractSet says:
Returns the hash code value for this set. The hash code of a set is defined to be the sum of the hash codes of the elements in the set, where the hash code of a null element is defined to be zero. This ensures that s1.equals(s2) implies that s1.hashCode()==s2.hashCode() for any two sets s1 and s2, as required by the general contract of Object.hashCode.
This implementation iterates over the set, calling the hashCode method on each element in the set, and adding up the results.
Here's the code from AbstractSet:
public int hashCode() {
    int h = 0;
    Iterator<E> i = iterator();
    while (i.hasNext()) {
        E obj = i.next();
        if (obj != null)
            h += obj.hashCode();
    }
    return h;
}
Why do 2 hashsets share the same hashcode if they are different objects?
With HashSet, the hashCode is calculated using the contents of the set. Since it's just numeric addition, the order of addition doesn't matter – just add them all up. So it makes sense that you have two sets, each containing objects which are equivalent (and thus should have matching hashCode() values), and then the sum of hashCodes within each set is the same.
Would something like if(set1.hashCode() == set2.hashCode()) doStuff() be valid code?
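Sure, it compiles and runs. Keep in mind that equal hash codes do not prove that the sets are equal, because two different sets can have the same hash code.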
EDIT: The best way of comparing two sets for equality is to use equals(). In the case of AbstractSet, calling set1.equals(set2) would result in individual calls to equals() at the level of the objects within the set (as well as some other checks).
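A small sketch of the difference between ==, equals() and hashCode() on sets. The class name and the setOf helper are made up for the example. The last two lines rely on the well-known fact that the strings "Aa" and "BB" have the same hash code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetHashDemo {
    // Hypothetical helper: builds a HashSet from the given items
    static Set<String> setOf(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    public static void main(String[] args) {
        Set<String> s1 = setOf("a", "b", "c");
        Set<String> s2 = setOf("c", "b", "a");
        System.out.println(s1 == s2);                       // false: two distinct objects
        System.out.println(s1.equals(s2));                  // true: same elements
        System.out.println(s1.hashCode() == s2.hashCode()); // true: same sum of element hashes

        Set<String> x = setOf("Aa");
        Set<String> y = setOf("BB");
        System.out.println(x.hashCode() == y.hashCode());   // true: "Aa" and "BB" both hash to 2112
        System.out.println(x.equals(y));                    // false: different contents
    }
}
```

So comparing hash codes can only tell you that two sets are definitely different. To decide that they are the same, use equals().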
Why do two different HashSets with the same data have the same HashCode?
Actually this is needed to fulfill another need that is specified in Java.
The equals method of Set is overridden to take in consideration that equals returns true (example a.equals(b)) if:
a is of type Set and b is of type Set.
both a and b have exactly the same size.
a contains all elements of b.
b contains all elements of a.
Since the default equals (which compares only the memory reference to be the same) is overridden for Set, according to java guidelines the hashCode method has to be overridden as well. So, this custom implementation of hashCode is provided in order to match with the custom implementation of equals.
In order to see why it is necessary to override hashCode method when the equals method is overridden, you can take a look at this previous answer of mine.
Why do 2 hashsets share the same hashcode if they are different objects?
Because as explained above this is needed so that Set can have the custom functionality for equals that it currently has.
If you want to just check if a and b are different instances of set you can still check this with operators == and !=.
a == b -> true means a and b point to the same instance of Set in memory
a != b -> true means a and b point to different instances of Set in memory
From what I understand, the == operator in Java compares references (an int) of objects.
This value is what the default implementation of hashCode method in Object returns.
The hashCode method has an implementation note:
As far as is reasonably practical, the hashCode method defined
by class Object returns distinct integers for distinct objects.
reasonably practical: This means that, no matter how small, there is a real possibility that two distinct objects can have equal hashCode or reference value.
So, if I compare two different objects (that don't override hashCode and equals) using ==, it's a real possibility that the result can be true (?). The default implementation of equals does a == check:
public class Test {
    public static void main(String[] args) {
        var t1 = new Test();
        var t2 = new Test();
        System.out.println(t1.hashCode() + ":" + t2.hashCode()); // 2055281021:1554547125 (Could've been 1554547125:1554547125 ?)
        System.out.println(t1 == t2); // false (Could've been true ?)
        System.out.println(t1.equals(t2)); // false (Could've been true ?)
    }
}
Why is it that equals and hashCode are overridden in certain situations only, while the rest of the time (many library classes such as Thread) the default implementation is relied on for equality checks, when it's not guaranteed to return the correct result?
And how can someone extremely risk-averse make sure the above false positive never occurs? If the class has at least one non-static field, one can override hashCode and equals. But what if this is not the case (like the Test class above)?
Can you please explain what I am missing here?
Edit 1:
Adding an API note for hashCode (taken from Silvio's answer):
This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language
Okay, there's a lot of questions here, so let's try to break it down.
From what I understand, the == operator in Java compares references (an int) of objects.
This value is what the default implementation of hashCode method in Object returns.
== in Java compares references, yes. Those references are not necessarily compatible with int. On many common architectures, int will probably coincide with most of the observable space that a reference can occupy, but that's not true in general.
In particular:
int is a signed type. That means half of its values are negative. Pointers are generally unsigned.
Even if we ignore the sign problems, int is a 32-bit type. Most modern computers are 64-bit, which means the address space would fit better in a 64-bit integer (i.e. a Java long). So only a small fraction of addresses can even be stored in int.
Second, hashCode is not required to have anything to do with the pointer itself. From the hashCode docs you referenced already
(This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java™ programming language.)
A Java implementation is free to choose whatever hashCode it wants. Maybe you're running on some bizarre embedded hardware and it makes sense to use some additional flag variable in the computation. hashCode should not be assumed to be the pointer.
Why is it that equals and hashCode are overridden in certain situations only, while the rest of the time (many library classes such as Thread) the default implementation is relied on for equality checks, when it's not guaranteed to return the correct result?
What is your definition of "correct" here? The guarantees demanded by the Java specification can be summarized from the docs
The equals method implements an equivalence relation on non-null object references:
It is reflexive: for any non-null reference value x, x.equals(x) should return true.
It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
For any non-null reference value x, x.equals(null) should return false.
...
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
The default equals implementation clearly satisfies the basic requirements above, and the default hashCode is guaranteed by the standard to be the same for two equal objects.
We override equals when we have a better notion of equality. For instance, two strings should be considered equal if they have the same characters, even if they are distinct objects in memory, and two array lists should be equal if their elements are equal pointwise. But for something like Thread, what would that even mean? When should two arbitrary threads be equal? The default suffices well enough, because we'd gain nothing from overriding it anyway.
If the class has at least one non-static field, one can override hashCode and equals
What does equality have to do with the number of non-static fields? I can override the two just fine. Watch.
public final class MySimpleClass {
    @Override
    public boolean equals(Object other) {
        // instanceof is already false for null, so no separate null check is needed
        return other instanceof MySimpleClass;
    }

    @Override
    public int hashCode() {
        return 42;
    }
}
That's a perfectly valid, conformant implementation of equality and hashing for MySimpleClass. In particular, since there's only one meaningfully distinct value of this class, I'd argue that's a good implementation of the two methods. No non-static fields required.
== always returns false if you compare 2 different objects and always returns true if you compare an object to itself.
But it is not guaranteed, that 2 different objects return different hash codes. That's because hashCode() returns int and there's only about 4 billion distinct ints. The number of objects in your code is constrained by the size of heap only.
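So, because there can be more than 4 billion distinct objects, their hash codes can sometimes be the same.
As for equals, it works like == by default, but it can be overridden, so == can return false when equals returns true. The opposite (== true but equals false) can only happen if equals is implemented incorrectly, because equals is required to be reflexive.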
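A tiny sketch of the difference, using String because it overrides equals and hashCode. The class name and the sameReference helper are invented for the example.

```java
class IdentityVsEqualsDemo {
    // Hypothetical helper that makes the reference comparison explicit
    static boolean sameReference(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        // new String(...) forces two separate objects with the same characters
        String a = new String("hello");
        String b = new String("hello");

        System.out.println(sameReference(a, b));            // false: two different objects
        System.out.println(sameReference(a, a));            // true: an object compared to itself
        System.out.println(a.equals(b));                    // true: String compares characters
        System.out.println(a.hashCode() == b.hashCode());   // true: required because a.equals(b)
    }
}
```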
equals and hashCode have an unenforceable-at-compile-time contract between them (which itself is different than the == operator).
Fundamentally speaking, an object should override hashCode such that a.equals(b) implies a.hashCode() == b.hashCode(). The converse is not required: two unequal objects may share a hash code.
For object references, the == operator only compares identity, which is why the same instance of an object compared against itself (a == a) will return true, with some caveats given to Strings and string interning.
Because the contract between equals and hashCode is unenforceable, suggesting that == will always return a "correct" result depends on your definition of "correct".
For instance:
It's correct that a square is a parallelogram; it's not correct that any given square is the same as any given parallelogram.
It's correct that a dictionary is a book; it's not correct that any given book is a dictionary.
It's correct that a car has wheels; it's not correct that any given car has any given number of wheels.
Also, just remember that hashCode is only 32 bits, so there's always going to be the chance of a collision between two unrelated objects (which is where having equals pick up the slack is beneficial).
In this context, you can only trust == based on the constraints and conditions the individual object has, and what business rules make sense for equality comparisons given a hash code, and nothing further. If your business rules require a deviation between how equals and hashCode behave, then you have to keep that context in mind when comparing through those methods.
First, == compares references. The JVM implements references with pointers internally, so two distinct objects always differ under ==, because they are stored at different memory addresses. What gets compared is that address, which is a 32-bit or 64-bit number.
For your second question: if you need a hash code implementation but the class has no fields, use System.identityHashCode(). It returns zero for a null reference and, for any other object, the same value that the default Object.hashCode() would return, whether or not the class overrides hashCode(). That value stays the same for the same object, but it is not guaranteed to be unique.
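A short sketch of how System.identityHashCode() behaves. The Marker class is hypothetical and only exists to show that an overridden hashCode() does not affect the identity hash.

```java
class IdentityHashDemo {
    // Hypothetical field-less class that overrides hashCode with a constant
    static final class Marker {
        @Override
        public int hashCode() {
            return 42;
        }
    }

    public static void main(String[] args) {
        Object plain = new Object();
        // For a class that does not override hashCode, both calls agree
        System.out.println(plain.hashCode() == System.identityHashCode(plain)); // true

        // A null reference always yields zero
        System.out.println(System.identityHashCode(null)); // 0

        Marker m = new Marker();
        System.out.println(m.hashCode()); // 42
        // The identity hash ignores the override and is stable for the same object
        System.out.println(System.identityHashCode(m) == System.identityHashCode(m)); // true
    }
}
```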
From what I understand, the == operator in Java compares references (an int) of objects.
On a 64-bit architecture, a reference can need up to 8 bytes. That is the size of a long, not an int.
This value is what the default implementation of hashCode method in Object returns.
When you cast a long to an int, you will lose information. This is why the default implementation of hashCode() can return equal hashes for different objects.
When you read the description of hashCode() in the Object class, it says that
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
I've read an article and it says that an object's hashcode() can provide different integer values if it runs in different environments even though their content is the same.
It does not happen with String class's hashcode() because the String class's hashcode() creates integer value based on the content of the object. The same content always creates the same hash value.
However, it happens if the hash value of the object is calculated based on its address in the memory. And the author of the article says that the Object class calculates the hash value of the object based on its address in the memory.
In the example below, the objects, p1 and p2 have the same content so equals() on them returns true.
However, how come these two objects return the same hash value when their memory addresses are different?
Here is the example code: main()
Person p1 = new Person2("David", 10);
Person p2 = new Person2("David", 10);
boolean b = p1.equals(p2);
int hashCode1 = p1.hashCode();
int hashCode2 = p2.hashCode();
Here is the overridden hashCode():
@Override
public int hashCode() {
    return Objects.hash(name, age);
}
Is the article's content wrong?
If there is a hashCode() that calculates a hash value based on the instance's address what is the purpose of them?
Also, if it really exists, it violates the condition that Object class's hashCode() specifies. How should we use the hashCode() then?
I think you have misunderstood this statement:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
The word “must” means that you, the programmer, are required to write a hashCode() method which always produces the same hashCode value for two objects which are equal to each other according to their equals(Object) method.
If you don’t, you have written a broken object that will not work with any class that uses hash codes, particularly unsorted Sets and Maps.
I've read an article and it says that an object's hashcode() can provide different integer values if it runs in different environments even though their content is the same.
Yes, hashCode() can provide different values in different runtimes, for a class which does not override the hashCode() method.
… how come these two objects return the same hash value when their memory addresses are different?
Because you told them to. You overrode the hashCode() method.
If there is a hashCode() that calculates a hash value based on the instance's address what is the purpose of them?
It doesn’t have a lot of use. That’s why programmers are strongly recommended to override the method, unless object identity is all they care about.
Also, if it really exists, it violates the condition that Object class's hashCode() specifies. How should we use the hashCode() then?
No it does not. The contract states:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
As long as the hashCode method defined in the Object class returns the same value for the duration of the Java runtime, it is compliant.
The contract also says:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Two objects whose equals method reports that they are equal must return the same hashCode. An instance whose class is exactly Object (not a subclass) is only equal to itself and cannot be equal to any other object, so any value it returns is valid, as long as it is consistent throughout the life of the Java runtime.
The article is misleading: the hash code does not have to be the memory address (by the way, the address can change during the lifecycle of an object, so it's abstracted away as a reference). You can look at the default hashCode as a method that returns a random integer, which is always the same for the same object.
Default equals (inherited from Object) works exactly like ==. Since there is a condition (required for classes like HashSet etc. to work) stating that when a.equals(b) then a.hashCode() == b.hashCode(), we need a default hashCode which only has to have one property: when we call it twice, it must return the same number.
The default hashCode exists exactly so that the condition you mention is upheld. The value returned is not important, it's only important that it never changes.
No, p1 and p2 do not count as having the same content by default.
Unless the class overrides equals(), p1.equals(p2) is false, so they are not treated as having the same content.
If you want p1 and p2 to be equal, you need to override the equals method inherited from Object so that it compares their content. And if you override equals, you must also override hashCode, so that whenever equals returns true the objects have the same hashCode().
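As a minimal sketch of such a pair of methods: the Person class below is a hypothetical stand-in for the class in the question, assuming it has just the two fields name and age that the question's hashCode() uses.

```java
import java.util.Objects;

// Hypothetical version of the question's class, with equals and hashCode kept in sync
final class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        // Compare exactly the fields that hashCode() uses
        return age == other.age && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }

    public static void main(String[] args) {
        Person p1 = new Person("David", 10);
        Person p2 = new Person("David", 10);
        System.out.println(p1 == p2);                        // false: different objects
        System.out.println(p1.equals(p2));                   // true: same content
        System.out.println(p1.hashCode() == p2.hashCode());  // true: follows from equals
    }
}
```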
Here's the design decision every programmer needs to make for objects they define of (say) MyClass:
Do you want it to be possible for two different objects to be "equal"?
If you do, then firstly you have to implement MyClass.equals() so that it gives the correct notion of equality for your purposes. That's entirely in your hands.
Then you're supposed to implement hashCode such that, if A.equals(B), then A.hashCode() == B.hashCode(). You explicitly do not want to use Object.hashCode().
If you don't want different objects to ever be equal, then don't implement equals() or hashCode(); use the implementations that Object gives you. For two different objects A and B of such a class, it is never the case that A.equals(B), and so it's perfectly OK for A.hashCode() to differ from B.hashCode().
Given that I have some class with various fields in it:
class MyClass {
    private String s;
    private MySecondClass c;
    private Collection<someInterface> coll;
    // ...
    @Override
    public int hashCode() {
        // ????
    }
}
and of that, I do have various objects which I'd like to store in a HashMap. For that, I need to have the hashCode() of MyClass.
I'll have to go into all fields and respective parent classes recursively to make sure they all implement hashCode() properly, because otherwise hashCode() of MyClass might not take into consideration some values. Is this right?
What do I do with that Collection? Can I always rely on its hashCode() method? Will it take into consideration all child values that might exist in my someInterface object?
I OPENED A SECOND QUESTION regarding the actual problem of uniquely IDing an object here: How do I generate an (almost) unique hash ID for objects?
Clarification:
Is there anything more or less unique in your class? The String s? Then only use that as hashcode.
MyClass hashCode() of two objects should definitely differ if any of the values in the coll of one of the objects is changed. HashCode should only return the same value if all fields of two objects store the same values, recursively. Basically, there is some time-consuming calculation going on in a MyClass object. I want to save this time if the calculation has already been done with the exact same values some time ago. For this purpose, I'd like to look up in a HashMap whether the result is available already.
Would you be using MyClass in a HashMap as the key or as the value? If the key, you have to override both equals() and hashCode()
Thus, I'm using the hashCode OF MyClass as the key in a HashMap. The value (calculation result) will be something different, like an Integer (simplified).
What do you think equality should mean for multiple collections? Should it depend on element ordering? Should it only depend on the absolute elements that are present?
Wouldn't that depend on the kind of Collection that is stored in coll? Though I guess ordering not really important, no
The response you get from this site is gorgeous. Thank you all
@AlexWien that depends on whether that collection's items are part of the class's definition of equivalence or not.
Yes, yes they are.
I'll have to go into all fields and respective parent classes recursively to make sure they all implement hashCode() properly, because otherwise hashCode() of MyClass might not take into consideration some values. Is this right?
That's correct. It's not as onerous as it sounds because the rule of thumb is that you only need to override hashCode() if you override equals(). You don't have to worry about classes that use the default equals(); the default hashCode() will suffice for them.
Also, for your class, you only need to hash the fields that you compare in your equals() method. If one of those fields is a unique identifier, for instance, you could get away with just checking that field in equals() and hashing it in hashCode().
All of this is predicated upon you also overriding equals(). If you haven't overridden that, don't bother with hashCode() either.
What do I do with that Collection? Can I always rely on its hashCode() method? Will it take into consideration all child values that might exist in my someInterface object?
Yes, you can rely on any collection type in the Java standard library to implement hashCode() correctly. And yes, any List or Set will take into account its contents (it will mix together the items' hash codes).
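A quick sketch of what "takes its contents into account" means for List. The class name and the sameHash helper are made up. One caveat worth hedging: the content-based contract is defined for List and Set. A field typed as plain Collection may hold something like an ArrayDeque, which keeps the identity-based equals() and hashCode() from Object, as the last lines show.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedList;
import java.util.List;

class CollectionHashDemo {
    // Hypothetical helper: do two collections report the same hash code?
    static boolean sameHash(Collection<?> a, Collection<?> b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("x", "y"));
        List<String> b = new LinkedList<>(Arrays.asList("x", "y"));
        System.out.println(a.equals(b));     // true: same elements in the same order
        System.out.println(sameHash(a, b));  // true: hash is computed from the elements

        b.add("z");
        System.out.println(sameHash(a, b));  // false: contents changed, so did the hash

        // ArrayDeque does not override equals/hashCode, so it compares by identity
        Collection<String> d1 = new ArrayDeque<>(Arrays.asList("x"));
        Collection<String> d2 = new ArrayDeque<>(Arrays.asList("x"));
        System.out.println(d1.equals(d2));   // false
    }
}
```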
So you want to derive a unique key from the contents of your object, so that you can check in a HashMap whether the "heavy" calculation that you don't want to do twice has already been done for a given deep combination of fields.
Using hashCode alone:
I believe hashCode is not the appropriate thing to use in the scenario you are describing.
hashCode should always be used in association with equals(). It's part of its contract, and it's an important part, because hashCode() returns an integer, and although one may try to make hashCode() as well-distributed as possible, it is not going to be unique for every possible object of the same class, except for very specific cases (It's easy for Integer, Byte and Character, for example...).
If you want to see for yourself, try generating strings of up to 4 letters (lower and upper case), and see how many of them have identical hash codes.
HashMap therefore uses both the hashCode() and equals() method when it looks for things in the hash table. There will be elements that have the same hashCode() and you can only tell if it's the same element or not by testing all of them using equals() against your class.
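If you want to try the experiment, here is one possible sketch. The class and method names are invented, and main limits itself to strings of up to 3 letters to keep the run short; raising the limit to 4 works the same way but needs more time and memory.

```java
import java.util.HashSet;
import java.util.Set;

class StringHashCollisions {
    static final String LETTERS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Generates every string that extends prefix by 1..remaining letters,
    // records each hash code, and returns how many strings were generated
    static long generate(String prefix, int remaining, Set<Integer> hashes) {
        long count = 0;
        for (int i = 0; i < LETTERS.length(); i++) {
            String s = prefix + LETTERS.charAt(i);
            hashes.add(s.hashCode());
            count++;
            if (remaining > 1) {
                count += generate(s, remaining - 1, hashes);
            }
        }
        return count;
    }

    // Returns {number of strings, number of distinct hash codes} for lengths 1..maxLen
    static long[] count(int maxLen) {
        Set<Integer> hashes = new HashSet<>();
        long total = generate("", maxLen, hashes);
        return new long[] { total, hashes.size() };
    }

    public static void main(String[] args) {
        long[] result = count(3);
        System.out.println("strings: " + result[0] + ", distinct hash codes: " + result[1]);
        // A classic colliding pair
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
    }
}
```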
Using hashCode and equals together
In this approach, you use the object itself as the key in the hash map, and give it an appropriate equals method.
To implement the equals method you need to go deeply into all your fields. All of their classes must have equals() that matches what you think of as equal for the sake of your big calculation. Special care needs to be taken when your objects implement an interface. If the calculation is based on calls to that interface, and different objects that implement the interface return the same value in those calls, then they should implement equals in a way that reflects that.
And their hashCode is supposed to match the equals - when the values are equal, the hashCode must be equal.
You then build your equals and hashCode based on all those items. You may use Objects.equals(Object, Object) and Objects.hash(Object...) to save yourself a lot of boilerplate code.
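Put together, the approach might look roughly like the sketch below. Everything in it is an assumption for illustration: CalcInput stands in for MyClass with just a String and a list of strings, ResultCache and expensive are placeholders for the real cache and the real calculation, and element order is treated as significant (use a Set instead of a List if it is not).

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class MemoDemo {
    // Hypothetical immutable key holding the inputs of the expensive calculation
    static final class CalcInput {
        private final String s;
        private final List<String> coll;

        CalcInput(String s, Collection<String> coll) {
            this.s = Objects.requireNonNull(s);
            // Defensive copy, so later changes to the caller's collection cannot alter the key
            this.coll = Collections.unmodifiableList(new ArrayList<>(coll));
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof CalcInput)) return false;
            CalcInput other = (CalcInput) o;
            return Objects.equals(s, other.s) && coll.equals(other.coll);
        }

        @Override
        public int hashCode() {
            return Objects.hash(s, coll);
        }
    }

    // The object itself is the map key; HashMap uses hashCode() first, then equals()
    static final class ResultCache {
        private final Map<CalcInput, Integer> cache = new HashMap<>();
        int computations = 0; // how often the expensive path actually ran

        int resultFor(CalcInput input) {
            return cache.computeIfAbsent(input, this::expensive);
        }

        // Stand-in for the real, time-consuming calculation
        private int expensive(CalcInput input) {
            computations++;
            return input.s.length() + input.coll.size();
        }
    }

    public static void main(String[] args) {
        ResultCache cache = new ResultCache();
        List<String> values = new ArrayList<>(List.of("x", "y"));
        System.out.println(cache.resultFor(new CalcInput("abc", values))); // 5
        System.out.println(cache.resultFor(new CalcInput("abc", values))); // 5, served from the cache
        System.out.println(cache.computations);                            // 1
    }
}
```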
But is this a good approach?
While you can cache the result of hashCode() in the object and re-use it without calculation as long as you don't mutate it, you can't do that for equals. This means that calculation of equals is going to be lengthy.
So depending on how many times the equals() method is going to be called for each object, this is going to be exacerbated.
If, for example, you are going to have 30 objects in the hashMap, but 300,000 objects are going to come along and be compared to them only to realize that they are equal to them, you'll be making 300,000 heavy comparisons.
If you're only going to have very few instances in which an object is going to have the same hashCode or fall in the same bucket in the HashMap, requiring comparison, then going the equals() way may work well.
If you decide to go this way, you'll need to remember:
If the object is a key in a HashMap, it should not be mutated as long as it's there. If you need to mutate it, you may need to make a deep copy of it and keep the copy in the hash map. Deep copying again requires consideration of all the objects and interfaces inside to see if they are copyable at all.
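Here is a small sketch of what goes wrong when a key is mutated while it sits in the map. It uses an ArrayList as the key simply because its hashCode() depends on its contents; the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutableKeyDemo {
    // Puts a list into the map as a key, mutates it, then tries to look it up again
    static Integer lookupAfterMutation() {
        Map<List<String>, Integer> map = new HashMap<>();
        List<String> key = new ArrayList<>();
        key.add("a");
        map.put(key, 1);

        key.add("b"); // changes key.hashCode() while the key is in the map

        // The entry was filed under the old hash code, so the lookup misses it
        return map.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation()); // null
    }
}
```

The entry is still inside the map, it just cannot be found any more, which is why a deep copy or an immutable key is the safer choice.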
Creating a unique key for each object
Back to your original idea, we have established that hashCode is not a good candidate for a key in a hash map. A better candidate for that would be a hash function such as md5 or sha1 (or more advanced hashes, like sha256, but you don't need cryptographic strength in your case), where collisions are a lot rarer than a mere int. You could take all the values in your class, transform them into a byte array, hash it with such a hash function, and take its hexadecimal string value as your map key.
Naturally, this is not a trivial calculation. So you need to think if it's really saving you much time over the calculation you are trying to avoid. It is probably going to be faster than repeatedly calling equals() to compare objects, as you do it only once per instance, with the values it had at the time of the "big calculation".
For a given instance, you could cache the result and not calculate it again unless you mutate the object. Or you could just calculate it again only just before doing the "big calculation".
However, you'll need the "cooperation" of all the objects you have inside your class. That is, they will all need to be reasonably convertible into a byte array in such a way that two equivalent objects produce the same bytes (including the same issue with the interface objects that I mentioned above).
You should also beware of situations in which you have, for example, two strings "AB" and "CD" which will give you the same result as "A" and "BCD", and then you'll end up with the same hash for two different objects.
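The "AB"/"CD" versus "A"/"BCD" problem can be sketched like this. SHA-256 is used only as an example digest, and prefixing each field with its length is just one possible way to keep field boundaries; the class and method names are invented.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class FieldDigestDemo {
    private static MessageDigest sha256() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    // Naive: feeds the fields back to back, so field boundaries are lost
    static byte[] naiveDigest(String... fields) {
        MessageDigest md = sha256();
        for (String f : fields) {
            md.update(f.getBytes(StandardCharsets.UTF_8));
        }
        return md.digest();
    }

    // Framed: writes each field's byte length before its bytes, keeping boundaries
    static byte[] framedDigest(String... fields) {
        MessageDigest md = sha256();
        for (String f : fields) {
            byte[] bytes = f.getBytes(StandardCharsets.UTF_8);
            md.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
            md.update(bytes);
        }
        return md.digest();
    }

    public static void main(String[] args) {
        // Both inputs concatenate to "ABCD", so the naive digests are identical
        System.out.println(Arrays.equals(naiveDigest("AB", "CD"), naiveDigest("A", "BCD")));   // true
        // With length prefixes the byte streams differ, and so do the digests
        System.out.println(Arrays.equals(framedDigest("AB", "CD"), framedDigest("A", "BCD"))); // false
    }
}
```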
For future readers.
Yes, equals and hashCode go hand in hand.
Below shows a typical implementation using a helper library, but it really shows the "hand in hand" nature. And the helper library from apache keeps things simpler IMHO:
@Override
public boolean equals(Object o) {
    if (this == o) {
        return true;
    }
    if (o == null || getClass() != o.getClass()) {
        return false;
    }
    MyCustomObject castInput = (MyCustomObject) o;
    boolean returnValue = new org.apache.commons.lang3.builder.EqualsBuilder()
            .append(this.getPropertyOne(), castInput.getPropertyOne())
            .append(this.getPropertyTwo(), castInput.getPropertyTwo())
            .append(this.getPropertyThree(), castInput.getPropertyThree())
            .append(this.getPropertyN(), castInput.getPropertyN())
            .isEquals();
    return returnValue;
}

@Override
public int hashCode() {
    return new org.apache.commons.lang3.builder.HashCodeBuilder(17, 37)
            .append(this.getPropertyOne())
            .append(this.getPropertyTwo())
            .append(this.getPropertyThree())
            .append(this.getPropertyN())
            .toHashCode();
}
The 17 and 37 are arbitrary; you can pick your own values, as long as both are odd, which HashCodeBuilder requires.
From your clarifications:
You want to store MyClass in an HashMap as key.
This means the hashCode() is not allowed to change after adding the object.
So if your collections may change after object instantiation, they should not be part of the hashcode().
From http://docs.oracle.com/javase/8/docs/api/java/util/Map.html
Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.
For 20-100 objects it is not worth running the risk of an inconsistent hashCode() or equals() implementation.
There is no need to override hashCode() and equals() in your case.
If you don't override them, Java uses the unique object identity for equals() and hashCode() (and that works, especially because you stated that you don't need an equals() that considers the values of the object's fields).
When using the default implementation, you are on the safe side.
A mistake such as using an object as a HashMap key when its hashCode() changes after insertion (because the collections' hash codes are part of the object's hash code) can result in an extremely hard-to-find bug.
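A minimal sketch of that bug, assuming a hypothetical Key class whose hashCode() is taken from a mutable list (the class and field names are made up for illustration):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutableKeyDemo {
    // Hypothetical key whose hashCode() depends on a mutable collection.
    static final class Key {
        final List<String> items = new ArrayList<>();

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && items.equals(((Key) o).items);
        }

        @Override
        public int hashCode() {
            return items.hashCode();
        }
    }

    // Returns true if the key is still found after being mutated in place.
    static boolean foundAfterMutation() {
        Map<Key, String> map = new HashMap<>();
        Key key = new Key();
        map.put(key, "value");
        key.items.add("changed"); // hash code changes while the key is in the map
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // prints false
    }
}
```

The entry is still in the map, but it was stored under the old hash code, so a lookup with the very same object no longer finds it.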
If you need to find out whether the heavy calculation is finished, I would not abuse equals(). Just write your own method objectStateValue() and call hashCode() on the collection there. This does not interfere with the object's hashCode() and equals().
public int objectStateValue() {
// TODO make sure the fields are not null;
return 31 * s.hashCode() + coll.hashCode();
}
Another, simpler possibility: the code that does the time-consuming calculation can increment a calculationCounter by one as soon as the calculation is finished. You then just check whether or not the counter has changed. This is much cheaper and simpler.
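The counter idea could look roughly like this. The Worker class and its method names are hypothetical, and using AtomicInteger is an assumption made here in case the calculation runs on another thread:

```java
import java.util.concurrent.atomic.AtomicInteger;

class CalculationCounterDemo {
    // Hypothetical class that performs the expensive calculation.
    static final class Worker {
        private final AtomicInteger calculationCounter = new AtomicInteger();

        void runHeavyCalculation() {
            // ... the expensive work would go here ...
            calculationCounter.incrementAndGet(); // signal that one calculation finished
        }

        int calculationCounter() {
            return calculationCounter.get();
        }
    }

    public static void main(String[] args) {
        Worker worker = new Worker();
        int before = worker.calculationCounter();
        worker.runHeavyCalculation();
        // A changed counter means a calculation completed since the last check.
        System.out.println(worker.calculationCounter() != before); // prints true
    }
}
```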
I found this comment on the question "Can StringBuffer objects be keys in TreeSet in Java?":
"There are 2 identifying strategies used with Maps in Java (more-or-less).
Hashing: An input "Foo" is converted into a best-as-possible attempt to generate a number that uniquely accesses an index into an array. (Purists, please don't abuse me, I am intentionally simplifying). This index is where your value is stored. There is the likely possibility that "Foo" and "Bar" actually generate the same index value meaning they would both be mapped to the same array position. Obviously this can't work and so that's where the "equals()" method comes in; it is used to disambiguate.
Comparison: By using a comparative method you don't need this extra disambiguation step because comparison NEVER produces this collision in the first place. The only key that "Foo" is equal to is "Foo". A really good idea, though, if you can, is to define "equals()" as compareTo() == 0, for consistency's sake. Not a requirement."
My question is as follows:
If my class implements Comparable, does that mean I don't have to override the equals() and hashCode() methods in order to use my objects as keys in hash collections? E.g.
class Person implements Comparable<Person> {
int id;
String name;
public Person(int id, String name) {
this.id=id;
this.name=name;
}
public int compareTo(Person other) {
return this.id-other.id;
}
}
Now, can I use my Person objects in hash-based collections?
The comment you quoted is talking about TreeSet. A TreeSet is a tree in which each node's position is defined by how its value compares to the other values already in the tree.
A Hashtable stores key/value pairs in a hash table. When using a Hashtable, you specify an object that is used as a key, and the value that you want linked to that key. The key is then hashed, and the resulting hash code is used to compute the index at which the value is stored within the table.
The difference between Hashtable and TreeSet is that TreeSet doesn't need hashCode(); it only needs to know whether to place the item to the left or to the right in the tree. For that, compareTo() is enough.
In a Hashtable, compareTo() will not suffice, because it is built differently: each object gets to its cell by being hashed, not by being compared with the items already in the collection.
So the answer is no: compareTo() alone does not make Person usable as a value-based key in a Hashtable. You must override hashCode() and equals() for that.
I also suggest you read this article on hashtables.
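To make the difference concrete, here is a small sketch using a Person like the one in the question, with compareTo() only and no equals()/hashCode() override. The sample names and the helper sizeOf are made up for illustration, and Integer.compare is used instead of subtraction to avoid overflow:

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class CompareOnlyDemo {
    // Person as in the question: compareTo() only, equals()/hashCode() inherited from Object.
    static final class Person implements Comparable<Person> {
        final int id;
        final String name;

        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public int compareTo(Person other) {
            return Integer.compare(id, other.id);
        }
    }

    // Adds two distinct instances with the same id and reports the resulting size.
    static int sizeOf(Set<Person> set) {
        set.add(new Person(1, "Ann"));
        set.add(new Person(1, "Ann")); // same id, different instance
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeOf(new TreeSet<>())); // prints 1: compareTo() == 0 counts as a duplicate
        System.out.println(sizeOf(new HashSet<>())); // prints 2: identity-based equals()/hashCode()
    }
}
```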
Hashtable does use equals() and hashCode(). Every class has those methods. If you don't implement them, you inherit them.
Whether you need to implement them depends on whether the inherited version is suitable for your purposes. In particular, since Person has no specified superclass, it inherits the Object methods. That means a Person object is equal only to itself.
Do you need two distinct Person objects to be treated as being equal as Hashtable keys?
If my class implements Comparable, does that mean I don't have to override the equals() and hashCode() methods in order to use my objects as keys in hash collections? E.g.
No, you still need to implement equals() and hashCode(). The methods perform very different functions and cannot be replaced by compareTo().
equals() returns a boolean indicating whether two objects are equal. By default (as inherited from Object) this is identity equality, not field equality. The fields it considers can be very different from the ones compareTo(...) uses to order objects, although if it makes sense for the entity, equals() can delegate to compareTo():
@Override
public boolean equals(Object obj) {
if (obj == null || obj.getClass() != getClass()) {
return false;
} else {
return compareTo((Person)obj) == 0;
}
}
hashCode() returns an integer value for the instance which is used in hash tables to calculate the bucket it should be placed in. There is no equivalent way to get this value out of compareTo(...).
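Putting the two together, a Person that works as a HashMap key might be sketched like this. It assumes that id alone identifies a person, which the question does not state; hashCode() is derived from the same field that compareTo() and equals() use:

```java
import java.util.HashMap;
import java.util.Map;

class PersonKeyDemo {
    static final class Person implements Comparable<Person> {
        final int id;
        final String name;

        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public int compareTo(Person other) {
            return Integer.compare(id, other.id);
        }

        @Override
        public boolean equals(Object obj) {
            if (obj == null || obj.getClass() != getClass()) {
                return false;
            }
            return compareTo((Person) obj) == 0;
        }

        @Override
        public int hashCode() {
            // Must be based on the same field as equals(), here the id.
            return Integer.hashCode(id);
        }
    }

    // Looks up a value using a different but equal Person instance.
    static String lookup() {
        Map<Person, String> roles = new HashMap<>();
        roles.put(new Person(7, "Ann"), "admin");
        return roles.get(new Person(7, "Ann"));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // prints admin
    }
}
```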
TreeSet needs Comparable (or a Comparator) to decide whether to place values to the left or right in the tree. HashMap needs the equals() and hashCode() methods, which every class inherits from Object, but you have to override them if you want value-based equality.
If a class implements Comparable, that would suggest that instances of the class represent values of some sort; generally, when classes encapsulate values it will be possible for there to exist two distinct instances which hold the same value and should consequently be considered equivalent. Since the only way for distinct object instances to be considered equivalent is for them to override equals and hashCode, that would imply that things which implement Comparable should override equals and hashCode unless the encapsulated values upon which compareTo operates will be globally unique (implying that distinct instances should never be considered equivalent).
As a simple example, suppose a class includes a CreationRank field of type long; every time an instance is created, that member is set to a value fetched from a singleton AtomicLong, and compareTo uses that field to rank objects in the order of creation. No two distinct instances of the class will ever report the same CreationRank; consequently, the only way x.equals(y) should ever be true is if x and y refer to the same object instance--exactly the way the default equals and hashCode work.
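A rough sketch of that CreationRank idea, with a hypothetical class name Ticket and a static AtomicLong standing in for the singleton counter:

```java
import java.util.concurrent.atomic.AtomicLong;

class CreationRankDemo {
    // Hypothetical class ranked purely by creation order.
    static final class Ticket implements Comparable<Ticket> {
        private static final AtomicLong NEXT_RANK = new AtomicLong();

        // Unique per instance, so compareTo() is zero only for the same object.
        private final long creationRank = NEXT_RANK.getAndIncrement();

        @Override
        public int compareTo(Ticket other) {
            return Long.compare(creationRank, other.creationRank);
        }
        // equals() and hashCode() are deliberately not overridden.
    }

    public static void main(String[] args) {
        Ticket first = new Ticket();
        Ticket second = new Ticket();
        System.out.println(first.compareTo(second) < 0); // prints true
        System.out.println(first.equals(second));        // prints false
        System.out.println(first.compareTo(first) == 0 && first.equals(first)); // prints true
    }
}
```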
BTW, having x.compareTo(y) return zero should generally imply that x.equals(y) will return true, and vice versa, but there are some cases where x.equals(y) may be false but x.compareTo(y) should nonetheless return zero. This may be the case when an object encapsulates some properties that can be ranked and others that cannot. Consider, for example, a hypothetical FutureAction type which encapsulates a DateTime and an implementation of a DoSomething interface. Such things could be ranked based upon the encapsulated date and time, but there may be no sensible way to rank two items which have the same date and time but different actions. Having equals report false while compareTo reports zero would make more sense than pretending that the clearly-non-equivalent items should be called "equal".
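The JDK itself has a class that behaves this way: java.math.BigDecimal treats 1.0 and 1.00 as different in equals() (the scale differs) but as the same in compareTo(). A small sketch of what that means for sorted and hash-based sets (the helper distinctCount is made up for illustration):

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class CompareVsEqualsDemo {
    // Adds 1.0 and 1.00 and reports how many elements the set keeps.
    static int distinctCount(Set<BigDecimal> set) {
        set.add(new BigDecimal("1.0"));
        set.add(new BigDecimal("1.00"));
        return set.size();
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");
        System.out.println(a.compareTo(b) == 0); // prints true
        System.out.println(a.equals(b));         // prints false
        System.out.println(distinctCount(new TreeSet<>())); // prints 1
        System.out.println(distinctCount(new HashSet<>())); // prints 2
    }
}
```

So when compareTo() and equals() disagree, the same elements can count as duplicates in a TreeSet but as distinct in a HashSet.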