Does it make sense for equals and compareTo to be inconsistent?

I want to make a class usable in SortedSet | SortedMap.
class MyClass implements Comparable<MyClass> {
    // the only thing relevant to comparisons:
    private final String name;
    //...
}
The class' instances must be sorted by their name property.
However, I don't want equally named instances to be considered as equal.
So a SortedSet content would look like a, a, a, b, c.
(Normally, SortedSet would only allow a, b, c)
First of all: is this (philosophically) consistent?
If so, do I have to expect unpredictable behavior, when I don't
override equals(...) and hashCode()?
Edit:
I am sorry, my question seems inconsistent:
I want to put multiple "equal" values inside a set, which doesn't allow this
by concept.
So, please don't reply to my question anymore.
Thanks to all who already replied.

Let me ask you a question: does it make sense to have a.compareTo(b) return 0 and a.equals(b) return false?
I would use a Comparator<MyClass> instead. This is why all SortedMap/SortedSet implementations that I know of allow you to pass in a Comparator at creation.
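A minimal sketch of the Comparator approach, assuming a hypothetical Item class with a getName() accessor (neither name comes from the question). The class keeps the identity-based equals inherited from Object, and the ordering lives in a separate comparator. Note that a TreeSet built with this comparator still keeps only one element per name, so for contents like a, a, a, b, c a sorted List is probably the simpler fit.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical element class: equals() and hashCode() are inherited from Object (identity).
final class Item {
    private final String name;
    Item(String name) { this.name = name; }
    String getName() { return name; }
}

class ComparatorSketch {
    // The ordering is defined outside the class, so Item has no natural ordering at all.
    static final Comparator<Item> BY_NAME = Comparator.comparing(Item::getName);

    static List<Item> sample() {
        List<Item> items = new ArrayList<>();
        for (String n : new String[] {"b", "a", "c", "a", "a"}) {
            items.add(new Item(n));
        }
        return items;
    }

    public static void main(String[] args) {
        List<Item> sorted = sample();
        sorted.sort(BY_NAME);                    // a list keeps all five elements
        Set<Item> set = new TreeSet<>(BY_NAME);
        set.addAll(sample());                    // a sorted set keeps one element per name
        System.out.println(sorted.size() + " list elements, " + set.size() + " set elements");
    }
}
```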

From the Javadoc for Comparable
It is strongly recommended (though not
required) that natural orderings be
consistent with equals. This is so
because sorted sets (and sorted maps)
without explicit comparators behave
"strangely" when they are used with
elements (or keys) whose natural
ordering is inconsistent with equals
If you want to have compareTo inconsistent with equals(), it is recommended that you instead use an explicit comparator by providing a class that implements Comparator.
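The "strange" behavior the Javadoc mentions can be reproduced with a small sketch. The Named class below is hypothetical: it orders by name but leaves equals() as the identity comparison from Object, so the two are inconsistent.

```java
import java.util.TreeSet;

// Hypothetical class: natural ordering by name, equals() left as identity.
final class Named implements Comparable<Named> {
    private final String name;
    Named(String name) { this.name = name; }

    @Override
    public int compareTo(Named other) { return name.compareTo(other.name); }
}

class InconsistentOrderingSketch {
    public static void main(String[] args) {
        Named first = new Named("a");
        Named second = new Named("a");
        TreeSet<Named> set = new TreeSet<>();
        set.add(first);
        // compareTo returns 0, so the set treats the second instance as a duplicate.
        boolean added = set.add(second);
        System.out.println("equals: " + first.equals(second));
        System.out.println("added: " + added);
        // Membership is decided by compareTo, not by equals.
        System.out.println("contains: " + set.contains(second));
    }
}
```

The set reports that it contains an object that is not equal to any of its elements, which is exactly the mismatch with the general Set contract that the documentation warns about.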
If so, do I have to expect unpredictable behavior, when I don't override equals(...) and hashCode()?
You should still override equals() and hashCode(). Whether or not equals() and hashCode() are consistent with compareTo is a different matter.

Effective Java recommends that if you don't implement compareTo consistent with equals you should clearly indicate so:
The recommended language is "Note:
This class has a natural ordering that
is inconsistent with equals."

Just put this code in the equals method and don't ever think about it again:
@Override
public boolean equals(Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof MyClass)) return false;
    return 0 == this.compareTo((MyClass) obj);
}
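If equals is defined through compareTo like this, hashCode should be derived from the same field, otherwise equal instances can end up with different hash codes. A minimal sketch, assuming name is the only field compareTo looks at and is never null (the constructor is an assumption, not the asker's code):

```java
import java.util.Objects;

final class MyClass implements Comparable<MyClass> {
    private final String name;

    MyClass(String name) { this.name = Objects.requireNonNull(name); }

    @Override
    public int compareTo(MyClass other) { return name.compareTo(other.name); }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof MyClass)) return false;
        return compareTo((MyClass) obj) == 0;
    }

    // Uses exactly the field compareTo uses, so equal objects share a hash code.
    @Override
    public int hashCode() { return name.hashCode(); }
}
```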


Overriding hashCode() when overriding equals() [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
In Java, why must equals() and hashCode() be consistent?
I read that one should always override hashCode() when overriding equals().
Can anyone give a practical example of why it might be wrong otherwise?
I.e., what problems might arise when overriding equals() but not hashCode()?
Is it necessary to write a robust hashCode() function whenever we override equals()? Or is a trivial implementation enough?
For example, is a poor implementation such as the one below good enough to satisfy the contract between equals() and hashCode()?
public int hashCode() {
    return 91;
}
Both equals and hashCode are based on the notion of an object's uniqueness. If equals returns true, the hash codes of both objects must be the same; otherwise hash-based structures and algorithms could have undefined results.
Think of a hash-based structure such as a HashMap. hashCode is invoked first to locate the bucket where the key should be, and equals is only consulted within that bucket, so with a mismatched hashCode it is impossible in most cases to find the key. Also, a poor implementation of hashCode will create collisions (multiple objects with the same hash code, which one is the "correct" one?) that affect performance.
IMHO, overriding equals OR hashCode (instead of overriding both) should be considered a code smell or, at least, a potential source of bugs. That is, unless you're 100% sure it won't affect your code sooner or later (when are we so sure anyway?).
Note: There are various libraries that provide support for this by having equals and hashCode builders, like Apache Commons with HashCodeBuilder and EqualsBuilder.
equals() and hashCode() are used conjunctively in certain collections, such as HashSet and HashMap, so you have to make sure that if you use these collections, you override hashCode according to the contract.
If you don't override hashCode at all, then you'll have problems with HashSet and HashMap. In particular, two objects that are "equal" may be put in different hash buckets even though they should be equal.
If you do override hashCode, but do so poorly, then you'll have performance issues. All your entries for HashSet and HashMap will be put into the same bucket, and you'll lose the O(1) performance and have O(n) instead. This is because the data structure essentially becomes a linearly-checked linked list.
As for breaking programs outside of these conditions, it's not likely, but you never know when an API (especially in 3rd-party libraries) is going to depend on this contract. The contract is upheld for objects that don't implement either of them, so it's conceivable that a library may depend on this somewhere without using hash buckets.
In any case, implementing a good hashCode is easy, especially if you're using an IDE. Eclipse and NetBeans both have the ability to generate equals and hashCode for you in a way that all contracts are followed, including the symmetry rule of equals (the assertion that a.equals(b) == b.equals(a)). All you need to do is select the fields you want to be included and go.
Here's some code that illustrates a bug you can introduce by not implementing hashCode(): HashSet.contains() first uses the hashCode() of an object to pick a bucket, and only then checks .equals(). So, if you don't implement both, .contains() will not behave in an intuitive way:
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ContainsProblem {
    // define a class that implements equals, without implementing hashCode
    static class Car {
        private final String name;
        public Car(String name) { this.name = name; }
        public String getName() { return name; }
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Car)) return false;
            Car car = (Car) o;
            return name != null ? name.equals(car.name) : car.name == null;
        }
    }
    public static void main(String[] args) {
        Car ford = new Car("ford");
        Car chevy = new Car("chevy");
        Car anotherFord = new Car("ford");
        Set<Car> cars = new HashSet<>(Arrays.asList(ford, chevy));
        // if the set of cars contains a ford, and a ford is equal to another ford, shouldn't
        // the set return the same thing for both fords? without hashCode(), it almost certainly won't:
        if (cars.contains(ford) && ford.equals(anotherFord) && !cars.contains(anotherFord)) {
            System.out.println("oh noes, why don't we have a ford? isn't this a bug?");
        }
    }
}
Your trivial implementation is correct, but would kill the performance of hash-based collections.
The default implementation (provided by Object) would break the contract if two different instances of your class compared equal.
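To illustrate the first point: a constant hash code obeys the contract, because equal objects trivially share it; only lookup speed suffers. A small sketch, using a hypothetical Point class that is not part of the question:

```java
import java.util.HashSet;
import java.util.Set;

final class Point {
    private final int x;
    private final int y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    // Legal but slow: every instance lands in the same bucket.
    @Override
    public int hashCode() { return 91; }
}

class ConstantHashSketch {
    public static void main(String[] args) {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        points.add(new Point(3, 4));
        // An equal but distinct instance is still found, because equals() settles the match.
        System.out.println(points.contains(new Point(1, 2)));
        System.out.println(points.contains(new Point(5, 6)));
    }
}
```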
I suggest reading Joshua Bloch's "Effective Java", Chapter 3, "Methods Common to All Objects". Nobody can explain it better than him. He led the design and implementation of numerous Java platform features.

When to include what?

I created a class Person (as the book says) to hold the name and last name of a person entered from the keyboard and then there is another class PhoneNumber which encapsulates the country code, area code and the number of a person as a String.
Person is intended to be used as the key in a HashMap.
Class BookEntry encapsulates both Person and PhoneNumber. A lot of BookEntry objects make up a HashMap that represents a phonebook.
Person implements Comparable<Person>, so it contains a compareTo(Person) method. Later the book adds an equals(Object anotherPerson) method.
My question is: isn't the compareTo method enough for comparing two keys? Or do the internal mechanics of HashMap require me to include an equals() method to compare two keys?
compareTo()
public int compareTo(Person person) {
    int result = lastName.compareTo(person.lastName);
    return result == 0 ? firstName.compareTo(person.firstName) : result;
}
equals()
public boolean equals(Object anotherPerson) {
    return compareTo((Person) anotherPerson) == 0;
}
Some data structures will use compareTo (for example a TreeMap) and some will use equals (for example a HashMap).
More importantly, it is strongly recommended that compareTo and equals be consistent, as explained in the Comparable javadoc:
It is strongly recommended, but not strictly required that (x.compareTo(y)==0) == (x.equals(y)). Generally speaking, any class that implements the Comparable interface and violates this condition should clearly indicate this fact. The recommended language is "Note: this class has a natural ordering that is inconsistent with equals."
Another hint, found in TreeMap javadoc (emphasis mine):
Note that the ordering maintained by a tree map, like any sorted map, and whether or not an explicit comparator is provided, must be consistent with equals if this sorted map is to correctly implement the Map interface.
Finally, if you override equals you should also override hashCode to prevent unexpected behaviour when using hash-based structures.
The compareTo() method is used for sorting: its implementation determines which of two persons is greater, lesser, or the same. equals() and hashCode(), on the other hand, are used by hash-based data structures, such as the HashMap in your case.
So yes, when a user-defined class is the key of a HashMap, you need to implement hashCode() and equals() properly.
Also See
overriding-equals-and-hashcode-in-java
HashMap uses equals() and not compareTo(), so you have to implement it.
TreeMap uses compareTo().
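A sketch of a Person key that works in both kinds of map. The compareTo body follows the question; the hashCode method, the constructor, and the use of a plain String in place of the book's PhoneNumber and BookEntry classes are simplifying assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

final class Person implements Comparable<Person> {
    private final String firstName;
    private final String lastName;

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public int compareTo(Person other) {           // consulted by TreeMap
        int result = lastName.compareTo(other.lastName);
        return result == 0 ? firstName.compareTo(other.firstName) : result;
    }

    @Override
    public boolean equals(Object o) {              // consulted by HashMap within a bucket
        return o instanceof Person && compareTo((Person) o) == 0;
    }

    @Override
    public int hashCode() {                        // consulted by HashMap to pick the bucket
        return Objects.hash(lastName, firstName);
    }
}

class PhoneBookSketch {
    public static void main(String[] args) {
        Map<Person, String> hashed = new HashMap<>();
        Map<Person, String> sorted = new TreeMap<>();
        hashed.put(new Person("Ada", "Lovelace"), "555-0100");
        sorted.put(new Person("Ada", "Lovelace"), "555-0100");
        Person lookup = new Person("Ada", "Lovelace");   // equal, but a different instance
        System.out.println(hashed.get(lookup));
        System.out.println(sorted.get(lookup));
    }
}
```

Without the equals and hashCode overrides, the TreeMap lookup would still succeed, but the HashMap lookup with a fresh Person instance would not find the entry.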

Does List.retainAll() use HashMap internally?

I am purposefully violating the hashCode contract that says that if we override equals() in our class, we must override hashCode() as well, and I am making sure that no Hash related data structures (like HashMap, HashSet, etc) are using it. The problem is that I fear methods like removeAll() and containsAll() of Lists might use HashMaps internally, and in that case, since I am not overriding hashCode() in my classes, their functionality might break.
Can anyone please confirm whether my doubt is valid? The classes contain a lot of fields that are being used for equality comparison, and I will have to come up with an efficient technique to get a hashCode using all of them. I really don't require them in any hash-related operations, and as such, I am trying to avoid implementing hashCode().
From AbstractCollection.retainAll()
* <p>This implementation iterates over this collection, checking each
* element returned by the iterator in turn to see if it's contained
* in the specified collection. If it's not so contained, it's removed
* from this collection with the iterator's <tt>remove</tt> method.
public boolean retainAll(Collection<?> c) {
    boolean modified = false;
    Iterator<E> e = iterator();
    while (e.hasNext()) {
        if (!c.contains(e.next())) {
            e.remove();
            modified = true;
        }
    }
    return modified;
}
As for
I will have to come up with an efficient technique to get a hashCode using all of them
You don't need to use all of the fields used by equals in your hashCode implementation:
It is not required that if two objects are unequal according to the equals method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Therefore, your hashCode implementation could be very simple and still obey the contract:
public int hashCode() {
    return 1;
}
This will ensure that hash-based data structures still work (albeit at degraded performance). If you add logging to your hashCode implementation, then you could even check if it is ever called.
I think a simple way to test if hashCode() is being used anywhere is to override hashCode() for your class, make it print a statement to the console (or a file if you prefer) and then return some arbitrary constant value (it won't matter since you said you don't want to use any hash-based classes anyway).
However, I think the best would be to just override it; I'm sure some IDEs can even do it for you (Eclipse can, for example). If you never expect it to get called, it can't hurt.
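The instrumentation idea can be sketched as follows. The Tracked class and its counter are hypothetical; the counter stands in for a log statement. Because the retainAll implementation quoted above delegates to c.contains(), whether hashCode() gets called depends on the collection passed in: with an ArrayList argument only equals() should be involved, whereas a HashSet argument would call hashCode().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class: equals() on one field, hashCode() instrumented with a call counter.
final class Tracked {
    static int hashCodeCalls = 0;
    private final String id;
    Tracked(String id) { this.id = id; }

    String id() { return id; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Tracked && id.equals(((Tracked) o).id);
    }

    @Override
    public int hashCode() {
        hashCodeCalls++;   // stands in for a log statement
        return 1;          // constant value: valid under the contract
    }
}

class RetainAllSketch {
    static List<Tracked> retained() {
        List<Tracked> left = new ArrayList<>(
                Arrays.asList(new Tracked("a"), new Tracked("b"), new Tracked("c")));
        List<Tracked> right = new ArrayList<>(
                Arrays.asList(new Tracked("b"), new Tracked("c"), new Tracked("d")));
        left.retainAll(right);   // the argument is a list, so contains() relies on equals()
        return left;
    }

    public static void main(String[] args) {
        System.out.println(retained().size() + " retained, hashCode calls: " + Tracked.hashCodeCalls);
    }
}
```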

Does hashCode equality imply reference-based equality?

I read that to use the equals() method in Java we also have to override the hashCode() method, and that (logically) equal objects should have equal hash codes. But doesn't that imply reference-based equality? Here is my code for the overridden equals() method; how should I override the hashCode method for this:
@Override
public boolean equals(Object o)
{
    if (!(o instanceof dummy))
        return false;
    dummy p = (dummy) o;
    return (p.getName() == this.getName() && p.getId() == this.getId() && p.getPassword() == this.getPassword());
}
I am just trying to learn how it works, so there are only three fields, namely name, id and password, and I am just trying to compare two objects that I define in main(), that's all! I also need to know whether it is always necessary to override the hashCode() method along with the equals() method.
Hashcode equality does not imply anything. However, hashcode inequality should imply that equals will yield false, and any two items that are equal should always have the same hashcode.
For this reason, it is always wise to override hashcode with equals, because a number of data structures rely on it.
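A quick way to see that equal hash codes imply nothing about equality: the strings "Aa" and "BB" produce the same value from String.hashCode(), yet they are different strings. A minimal sketch:

```java
import java.util.HashSet;
import java.util.Set;

class HashCollisionSketch {
    public static void main(String[] args) {
        String first = "Aa";
        String second = "BB";
        // Both strings hash to the same value, yet they are not equal.
        System.out.println(first.hashCode() == second.hashCode());
        System.out.println(first.equals(second));

        Set<String> set = new HashSet<>();
        set.add(first);
        set.add(second);
        // The set keeps both, because equals() has the final say within a bucket.
        System.out.println(set.size());
    }
}
```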
Even though failure to override hashCode() will only break usage of your class in HashSet, HashMap, and other hashCode dependent structures, you should still override hashCode() to maintain the contract described by Object.
The general strategy of most hashCode() implementations is to combine the hash codes of the fields used to determine equality. In your case, a reasonable hashCode() may look something like this:
public int hashCode() {
    return this.getName().hashCode() ^ this.getId() ^ this.getPassword().hashCode();
}
You need to override hashCode() when you override equals(). Merely using equals() is not enough to require you to override hashCode().
In your code, you aren't actually comparing your fields' values. Use equals() instead of == to make your implementation of equal correct.
return (p.getName().equals(this.getName()) && ...
(Note that the above code can cause null reference exceptions if getName() returns null: you may want to use a utility class as described here)
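One null-safe option is java.util.Objects (available since Java 7), which may or may not be the utility the link above referred to. A sketch, assuming name and password are Strings and id is an int; the class name Account is hypothetical and stands in for the asker's dummy class.

```java
import java.util.Objects;

final class Account {
    private final String name;
    private final int id;
    private final String password;

    Account(String name, int id, String password) {
        this.name = name;
        this.id = id;
        this.password = password;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Account)) return false;
        Account other = (Account) o;
        // Objects.equals compares values and tolerates null on either side.
        return id == other.id
                && Objects.equals(name, other.name)
                && Objects.equals(password, other.password);
    }

    @Override
    public int hashCode() {
        // Combines the same three fields that equals() compares.
        return Objects.hash(name, id, password);
    }
}
```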
And yes, hashCode() will be called when you use a hashing data structure such as HashMap or HashSet.
You must override hashCode() in every
class that overrides equals(). Failure
to do so will result in a violation of
the general contract for
Object.hashCode(), which will prevent
your class from functioning properly
in conjunction with all hash-based
collections, including HashMap,
HashSet, and Hashtable.
from Effective Java, by Joshua Bloch
Also See
overriding-equals-and-hashcode-in-java
hashcode-and-equals
Nice article on equals() & hashCode()
The idea with hashCode() is that it is a compact numeric summary of your object; it need not be unique, but equal objects must share it. Data structures that hold objects use hash codes to determine where to place objects. In Java, a HashSet for example uses the hash code of an object to determine which bucket that object lies in, and then for all objects in that bucket, it uses equals() to determine whether it is a match.
If you don't override hashCode(), but do override equals(), then you will get to a point where you consider 2 objects to be equal, but Java collections don't see it the same way. This will lead to a lot of strange behaviour.

Can you explain this Java hash map key collision?

I have a HashMap that is used in the following way:
HashMap<SomeInterface, UniqueObject> m_map;
UniqueObject getUniqueObject(SomeInterface keyObject)
{
    if (m_map.containsKey(keyObject))
    {
        return m_map.get(keyObject);
    }
    else
    {
        return makeUniqueObjectFor(keyObject);
    }
}
My issue is that I'm seeing multiple objects of different classes matching the same key on m_map.containsKey(keyObject).
So here are my questions:
Is this possible? The Map interface says it uses equals() to compare if the key is not null. I haven't overridden equals() in any of my SomeInterface classes. Does this mean the equals method can be wrong?
If the above is true, how do I get HashMap to only return true on equals() if they are in fact the same object and not a copy? Is this possible by saying if (object1 == object2)? I was told early on in Java development that I should avoid doing that, but I never found out when it should be used.
Thanks in advance. :)
I strongly suspect you've misdiagnosed the issue. If you aren't overriding equals anywhere (and you're not subclassing anything else that overrides equals) then you should indeed have "identity" behaviour.
I would be shocked to hear that this was not the case, to be honest.
If you can produce a short but complete program which demonstrates the problem, that would make it easier to look into - but for the moment, I'd definitely double-check your suspicions about seeing different objects being treated as equal keys.
The default implementation of equals() is done in java.lang.Object:
public boolean equals(Object obj) {
    return (this == obj);
}
The other method, hashCode(), by default returns a value derived from the object's identity. I.e. both are identity-based by default: equals returns true only for the same object, and hashCode() is typically (though not guaranteed to be) different for every object.
This is exactly what can create some kind of multiple entries. You can create 2 instances of your class. From your point of view they are equal because they contain identical data. But they are different. So, if you are using these objects as keys of map you are producing 2 entries. If you want to avoid this implement equals and hashCode for your class.
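A small sketch of that situation, using a hypothetical PlainKey class that overrides neither equals() nor hashCode():

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key class with no equals()/hashCode() overrides.
final class PlainKey {
    final String label;
    PlainKey(String label) { this.label = label; }
}

class IdentityKeySketch {
    public static void main(String[] args) {
        Map<PlainKey, String> map = new HashMap<>();
        PlainKey original = new PlainKey("x");
        PlainKey copy = new PlainKey("x");   // same data, different object
        map.put(original, "first");
        map.put(copy, "second");
        // Two separate entries, because the keys are compared by identity.
        System.out.println(map.size());
        System.out.println(map.get(original));
        // A third instance with the same data matches neither key.
        System.out.println(map.containsKey(new PlainKey("x")));
    }
}
```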
This implementation is sometimes very verbose. HashCodeBuilder and EqualsBuilder from Apache Commons Lang (formerly part of the Jakarta project) may help you. Here is an example:
@Override
public int hashCode() {
    return HashCodeBuilder.reflectionHashCode(this);
}
@Override
public boolean equals(Object other) {
    return EqualsBuilder.reflectionEquals(this, other);
}
@Override
public String toString() {
    return ToStringBuilder.reflectionToString(this);
}
You need to ensure that your .equals() and your .hashCode() methods are implemented for all objects that you want to store in the HashMap. To not have that invites all sorts of problems.
You must implement the equals() and hashCode() methods of the objects that you use as the keys in the HashMap.
Note that HashMap not only uses equals(), it also uses hashCode(). Your hashCode() method must be implemented correctly to match the implementation of the equals() method. If the implementations of these methods don't match, you can get unpredictable problems.
See the description of equals() and hashCode() in the API documentation of class Object for the detailed requirements.
FYI, you can have IDEs such as Eclipse generate the hashCode and equals methods for you. They'll probably do a better job than if you try to hand-code them yourself.
