All decent Java courses say that if you implement the Comparable interface, you should (in most cases) also override the equals method to match its behavior.
Unfortunately, in my current organization people try to convince me to do exactly the opposite. I am looking for the most convincing code example to show them all the evil that will happen.
I think you can beat them by showing the Comparable javadoc that says:
It is strongly recommended (though not required) that natural
orderings be consistent with equals. This is so because sorted sets
(and sorted maps) without explicit comparators behave "strangely" when
they are used with elements (or keys) whose natural ordering is
inconsistent with equals. In particular, such a sorted set (or sorted
map) violates the general contract for set (or map), which is defined
in terms of the equals method.
For example, if one adds two keys a and b such that (!a.equals(b) &&
a.compareTo(b) == 0) to a sorted set that does not use an explicit
comparator, the second add operation returns false (and the size of
the sorted set does not increase) because a and b are equivalent from
the sorted set's perspective.
So, especially with SortedSet (and SortedMap), if the compareTo method returns 0, the set treats the elements as equal and does not add the element a second time, even if the equals method returns false. This causes confusion, as described in the SortedSet javadoc:
Note that the ordering maintained by a sorted set (whether or not an
explicit comparator is provided) must be consistent with equals if the
sorted set is to correctly implement the Set interface. (See the
Comparable interface or Comparator interface for a precise definition
of consistent with equals.) This is so because the Set interface is
defined in terms of the equals operation, but a sorted set performs
all element comparisons using its compareTo (or compare) method, so
two elements that are deemed equal by this method are, from the
standpoint of the sorted set, equal. The behavior of a sorted set is
well-defined even if its ordering is inconsistent with equals; it just
fails to obey the general contract of the Set interface.
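The behavior described in the two javadoc excerpts can be reproduced with a minimal sketch. Point1D and SortedSetDemo are hypothetical names made up for this illustration; the class deliberately implements compareTo without overriding equals.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class: ordered by x, but equals() is inherited from Object.
class Point1D implements Comparable<Point1D> {
    final int x;

    Point1D(int x) {
        this.x = x;
    }

    @Override
    public int compareTo(Point1D other) {
        return Integer.compare(x, other.x);
    }
}

class SortedSetDemo {
    // Adds two distinct instances with the same x and reports the set size.
    static int sizeAfterAddingTwo() {
        Set<Point1D> set = new TreeSet<>();
        set.add(new Point1D(3));
        set.add(new Point1D(3)); // returns false because compareTo() == 0
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAddingTwo()); // prints 1
    }
}
```

The two instances are not equal according to equals, yet the TreeSet keeps only one of them.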
If you don't override the equals method, it inherits its behaviour from the Object class.
This method returns true if and only if the specified object is not null and refers to the same instance.
Suppose the following class:
class VeryStupid implements Comparable<VeryStupid>
{
    public int x;

    @Override
    public int compareTo(VeryStupid o)
    {
        if (o != null)
            return Integer.compare(x, o.x); // avoids the overflow risk of x - o.x
        else
            return 1;
    }
}
We create 2 instances:
VeryStupid one = new VeryStupid();
VeryStupid two = new VeryStupid();
one.x = 3;
two.x = 3;
The call to one.compareTo(two) returns 0 indicating the instances are equal but the call to one.equals(two) returns false indicating they're not equal.
This is inconsistent.
Consistency of compareTo and equals is not required but strongly recommended.
I'll give it a shot with this example:
private static class Foo implements Comparable<Foo> {
    @Override
    public boolean equals(Object _other) {
        System.out.println("equals");
        return super.equals(_other);
    }

    @Override
    public int compareTo(Foo _other) {
        System.out.println("compareTo");
        return 0;
    }
}

public static void main (String[] args) {
    Foo a, b;
    a = new Foo();
    b = new Foo();
    a.compareTo(b); // prints 'compareTo', returns 0 => equal
    a.equals(b); // just prints 'equals', returns false => not equal
}
You can see that your (maybe very important and complicated) comparison code is ignored when you use the default equals method.
The method int compareTo(T o) tells you whether o is (in some sense) greater or less than this, which is what lets you sort a list of T.
Inside int compareTo(T o) you effectively have to answer these questions:
is o an instance of this class? => true/false ;
is o equal to this? => true/false ;
is o greater than this? => true/false ;
is o less than this? => true/false ;
So compareTo already contains an equality test, and the best way to avoid implementing equality twice is to put that test in the boolean equals(Object obj) method and reuse it.
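One way to keep a single source of truth is sketched here. Note that it goes in the opposite direction from the suggestion above: equals delegates to compareTo, which guarantees the two cannot disagree. Version and its fields are hypothetical names chosen for the illustration.

```java
import java.util.Objects;

// Hypothetical value class; the names are illustrative only.
final class Version implements Comparable<Version> {
    private final int major;
    private final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public int compareTo(Version other) {
        int byMajor = Integer.compare(major, other.major);
        return byMajor != 0 ? byMajor : Integer.compare(minor, other.minor);
    }

    // equals() reuses the ordering, so the two can never disagree.
    @Override
    public boolean equals(Object obj) {
        return obj instanceof Version && compareTo((Version) obj) == 0;
    }

    // hashCode() uses exactly the fields that compareTo() inspects.
    @Override
    public int hashCode() {
        return Objects.hash(major, minor);
    }

    public static void main(String[] args) {
        Version a = new Version(1, 2);
        Version b = new Version(1, 2);
        System.out.println(a.compareTo(b) == 0); // prints true
        System.out.println(a.equals(b));         // prints true
    }
}
```

Making the class final sidesteps the subclass equality problems that arise when instanceof is used in equals.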
I have a class DebugTo where if I have two equal instances el1, el2 a HashSet of el1 will not regard el2 as contained.
import java.util.Objects;

public class DebugTo {
    public String foo;

    public DebugTo(String foo) {
        this.foo = foo;
    }

    @Override
    public int hashCode() {
        System.out.println(super.hashCode());
        return Objects.hash(super.hashCode(), foo);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        DebugTo that = (DebugTo) o;
        return Objects.equals(foo, that.foo);
    }
}
var el1 = new DebugTo("a");
var el2 = new DebugTo("a");
System.out.println("Objects.equals(el1, el2): " + Objects.equals(el1, el2));
System.out.println("Objects.equals(el2, el1): " + Objects.equals(el2, el1));
System.out.println("el1.hashCode(): " + el1.hashCode());
System.out.println("el2.hashCode(): " + el2.hashCode());
Objects.equals(el1, el2): true
Objects.equals(el2, el1): true
1205483858
el1.hashCode(): -1284705008
1373949107
el2.hashCode(): -357249585
From my analysis I have gathered that:
HashSet::contains calls hashCode, not equals (relying on Objects.equals(a, b) => a.hashCode() == b.hashCode())
super.hashCode() gives a different value both times.
Why does super.hashCode() give different results for el1 and el2? Since they are of the same class, they have the same superclass, so I expected super.hashCode() to give the same result for both.
The hashCode method was probably autogenerated by eclipse. If not answered above, why is super.hashCode used wrong here?
Because the default implementations of the equals and hashCode methods (which go hand in hand - you always override both or neither) treat any 2 different instances as not equal to each other. If you want different behaviour, you override equals and hashCode, and do not invoke super.equals / super.hashCode, or there'd be no point.
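As a minimal sketch of that advice, here is a corrected variant of the class from the question. DebugToFixed is a hypothetical name; the only substantive change is that hashCode no longer mixes in super.hashCode().

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical corrected variant of DebugTo: hashCode() depends only on foo.
class DebugToFixed {
    final String foo;

    DebugToFixed(String foo) {
        this.foo = foo;
    }

    @Override
    public int hashCode() {
        return Objects.hash(foo); // no super.hashCode() mixed in
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        return Objects.equals(foo, ((DebugToFixed) o).foo);
    }

    // Adds one instance and looks up a different but equal instance.
    static boolean setContainsEqualInstance() {
        Set<DebugToFixed> set = new HashSet<>();
        set.add(new DebugToFixed("a"));
        return set.contains(new DebugToFixed("a"));
    }

    public static void main(String[] args) {
        System.out.println(setContainsEqualInstance()); // prints true
    }
}
```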
HashSets work as follows: They use .hashCode() to know which 'bucket' to put the object into, and if 2 objects end up in the same bucket, equals is used only on those very few objects to double check.
In other words, these are the rules:
1. If a.equals(b), then b.equals(a) must be true.
2. a.equals(a) must always be true.
3. If a.equals(b) and b.equals(c), a.equals(c) must be true.
4. If a.equals(b), a.hashCode() == b.hashCode() must be true.
5. The reverse of 4 does not hold: If a.hashCode() == b.hashCode(), that doesn't mean a.equals(b), and HashSet does not require it.
6. Therefore, return 1; is a legal implementation of hashCode.
7. If a class has really bad hashcode spread (such as the idiotic but legal option listed in rule 6), then the performance of HashSet will be very bad. E.g. set.contains(k), which ordinarily takes constant time, can take linear time instead if your objects are all not-equal but have the same hashCode. Hence, do try to ensure hashcodes are as different as they can be.
8. HashSet and HashMap require stable objects, meaning, their behaviour when calling hashCode and equals cannot change over time.
From the above it naturally follows that overriding equals and not hashCode or vice versa is necessarily broken.
Breaking any of the above rules does not, generally, result in a compiler error. It often doesn't even result in an exception. Instead it results in bizarre behaviour with hashsets and hashmaps: you put a k/v pair in the map, then immediately ask for the value back, and you get null instead of what you put in, or something completely different. Just an example.
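The stability requirement is the easiest of these rules to break by accident. The following sketch assumes a hypothetical MutableKey class whose hash code depends on a mutable field; no exception is thrown, the element simply becomes unfindable.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical key whose equals()/hashCode() depend on a mutable field.
class MutableKey {
    int id;

    MutableKey(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && ((MutableKey) o).id == id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}

class UnstableKeyDemo {
    // Returns whether the set still finds the key after the key was mutated.
    static boolean foundAfterMutation() {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey(1);
        set.add(key);
        key.id = 2; // the hash code changes while the key is inside the set
        return set.contains(key);
    }

    public static void main(String[] args) {
        // The very same instance is in the set, yet the lookup fails.
        System.out.println(foundAfterMutation()); // prints false
    }
}
```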
NB: One weird effect of all this is that you cannot add equality-affecting state to subclasses, unless you apply a caveat that most classes including all classes in the core libraries don't apply.
Imagine as an example that we invent the notion of a 'coloured' arraylist. You could have a red '["Hello", "World"]' list, and a blue one:
class ColoredList<E> extends ArrayList<E> {
    Color color;

    public ColoredList(Color c) {
        this.color = c;
    }
}
You'd probably want an empty red list to not equal an empty blue one. However, that is impossible if you intend to follow the rules. That's because the equals/hashCode impl of ArrayList itself considers any other list equal to itself if it has the same items in the same order. Therefore:
List<String> a = new ArrayList<String>();
ColoredList<String> b = new ColoredList<String>(Color.RED);
a.equals(b); // this is true, and you can't change that!
Therefore, b.equals(a) must also be true (your impl of equals has to say that an empty red list is equal to an empty plain arraylist), and given that an empty arraylist is also equal to an empty blue one, given that a.equals(b) and b.equals(c) implies that a.equals(c), a red empty list has to be equal to a blue empty list.
There is an easy solution for this that brings in new problems, and a hard solution that is objectively better.
The easy solution is to define that you can't be equal to anything except exact instances of yourself, as in, any subclass is insta-disqualified. Imagine ArrayList's equals method returns false if you call it with an instance of a subclass of ArrayList. Then you could make your colored list just fine. But, this isn't necessarily great, for example, you probably want an empty LinkedList and an empty ArrayList to be equal.
The harder solution is to introduce a second method, canEqual, and call it. You override canEqual to return 'if other is instanceof the nearest class in my hierarchy that introduces equality-relevant state'. Thus, your ColoredList should have @Override public boolean canEqual(Object other) { return other instanceof ColoredList; }.
The problem is, all classes need to have that and use it, or it's not going to work, and ArrayList does not have it. And you can't change that.
Project Lombok can generate this for you if you prefer. It's not particularly common; I'd only use it if you really know you need it.
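To make the canEqual idea concrete, here is a minimal sketch using a hypothetical Point / ColoredPoint hierarchy in which every class cooperates (unlike ArrayList, which cannot be changed). The class names and fields are assumptions made for the illustration.

```java
// Hypothetical classes illustrating the canEqual pattern.
class Point {
    final int x;

    Point(int x) {
        this.x = x;
    }

    // Subclasses that add equality-relevant state override this.
    public boolean canEqual(Object other) {
        return other instanceof Point;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point)) {
            return false;
        }
        Point that = (Point) o;
        // Ask the other side whether it is willing to be equal to us.
        return that.canEqual(this) && x == that.x;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(x);
    }
}

class ColoredPoint extends Point {
    final String color;

    ColoredPoint(int x, String color) {
        super(x);
        this.color = color;
    }

    @Override
    public boolean canEqual(Object other) {
        return other instanceof ColoredPoint;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ColoredPoint)) {
            return false;
        }
        ColoredPoint that = (ColoredPoint) o;
        return that.canEqual(this) && super.equals(that) && color.equals(that.color);
    }

    @Override
    public int hashCode() {
        return 31 * super.hashCode() + color.hashCode();
    }
}

class CanEqualDemo {
    public static void main(String[] args) {
        Point plain = new Point(1);
        Point red = new ColoredPoint(1, "red");
        Point blue = new ColoredPoint(1, "blue");
        System.out.println(plain.equals(red)); // prints false
        System.out.println(red.equals(plain)); // prints false
        System.out.println(red.equals(blue));  // prints false
    }
}
```

Because a plain Point refuses to equal a ColoredPoint in both directions, symmetry and transitivity are preserved while red and blue points stay distinct.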
I have already looked at other questions, but those usually ask about consistency or ordering in general, while I am interested in the ordering of two HashSets that contain the same elements at the same time.
I want to create a HashSet of HashSets containing integers. Over time I will put HashSets of size 3 into this bigger HashSet, and I will want to check whether a newly created HashSet is already contained in the bigger one.
My question is: will it always find duplicates, or can the ordering of two HashSets with the same elements be different?
I am unsure because they use the same hashCode() function, but does that mean they will always be the same?
HashSet<HashSet<Integer>> test = new HashSet<>();
HashSet<Integer> one = new HashSet<>();
one.add(1);
one.add(2);
one.add(5);
test.add(one);
HashSet<Integer> two = new HashSet<>();
two.add(5);
two.add(1);
two.add(2);
//Some other stuff that runs over time
System.out.println(test.contains(two));
Above code tries to illustrate what I mean, does this always return true? (Keep in mind I might initialise another HashSet with the same elements and try the contains again)
Yes, the above always returns true. Sets have no order, and when you test whether two Sets are equal to each other, you are checking that they have the same elements. Order has no meaning.
To elaborate, test.contains(two) will return true if and only if test contains an element that has the same hashCode() as two and is equal to two (according to the equals method).
Two sets s1 and s2 that have the same elements have the same hashCode() and s1.equals(s2) returns true.
This is required by the contract of equals and hashCode of the Set interface:
equals
Compares the specified object with this set for equality. Returns true if the specified object is also a set, the two sets have the same size, and every member of the specified set is contained in this set (or equivalently, every member of this set is contained in the specified set). This definition ensures that the equals method works properly across different implementations of the set interface.
hashCode
Returns the hash code value for this set. The hash code of a set is defined to be the sum of the hash codes of the elements in the set, where the hash code of a null element is defined to be zero. This ensures that s1.equals(s2) implies that s1.hashCode()==s2.hashCode() for any two sets s1 and s2, as required by the general contract of Object.hashCode.
As you can see, one and two don't even have to use the same implementation of the Set interface in order for test.contains(two) to return true. They just have to contain the same elements.
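A small sketch of that last point, using a HashSet as the stored element and a TreeSet as the probe. SetEqualityDemo is a hypothetical name for the illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class SetEqualityDemo {
    // Stores a HashSet, then looks it up with a TreeSet holding the same elements.
    static boolean containsAcrossImplementations() {
        Set<Set<Integer>> outer = new HashSet<>();
        outer.add(new HashSet<>(Arrays.asList(1, 2, 5)));
        // A different implementation, filled in a different order.
        Set<Integer> probe = new TreeSet<>(Arrays.asList(5, 1, 2));
        return outer.contains(probe);
    }

    public static void main(String[] args) {
        System.out.println(containsAcrossImplementations()); // prints true
    }
}
```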
The key property of sets is about uniqueness of keys.
By "default", insertion order doesn't matter at all.
A LinkedHashSet guarantees that when iterating, you always get the elements in the same order (the one used for inserting them). But even then, when comparing such sets, it is still only about their content, not the insertion order.
In other words: no matter which (default) implementation of the Set interface you are using, you should always see consistent behavior. Of course you are free to implement your own Set and violate that contract, but, well, violating contracts leads to violated contracts, aka bugs.
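The LinkedHashSet case can be checked with a short sketch. InsertionOrderDemo is a hypothetical name; the two sets hold the same strings, inserted in opposite orders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class InsertionOrderDemo {
    static final Set<String> FIRST = new LinkedHashSet<>(Arrays.asList("a", "b", "c"));
    static final Set<String> SECOND = new LinkedHashSet<>(Arrays.asList("c", "b", "a"));

    // Set equality looks at content only.
    static boolean sameContent() {
        return FIRST.equals(SECOND);
    }

    // Copying into lists makes the differing iteration order visible.
    static boolean sameIterationOrder() {
        return new ArrayList<>(FIRST).equals(new ArrayList<>(SECOND));
    }

    public static void main(String[] args) {
        System.out.println(FIRST);                // prints [a, b, c]
        System.out.println(SECOND);               // prints [c, b, a]
        System.out.println(sameContent());        // prints true
        System.out.println(sameIterationOrder()); // prints false
    }
}
```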
You can look for yourself, this is open source code:
public boolean equals(Object o) {
    if (o == this)
        return true;
    if (!(o instanceof Set))
        return false;
    Collection<?> c = (Collection<?>) o;
    if (c.size() != size())
        return false;
    try {
        return containsAll(c);
    } catch (ClassCastException unused) {
        return false;
    } catch (NullPointerException unused) {
        return false;
    }
}

public int hashCode() {
    int h = 0;
    Iterator<E> i = iterator();
    while (i.hasNext()) {
        E obj = i.next();
        if (obj != null)
            h += obj.hashCode();
    }
    return h;
}
You can easily see that the hash code is the sum of the hash codes of the elements, so it is not affected by order, and that equals uses containsAll(...), so order doesn't matter there either.
The code shown below does output:
[b]
[a, b]
However I would expect it to print two identical lines in the output.
import java.util.*;

public class Test {
    static void test(String... abc) {
        Set<String> s = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        s.addAll(Arrays.asList("a", "b"));
        s.removeAll(Arrays.asList(abc));
        System.out.println(s);
    }

    public static void main(String[] args) {
        test("A");
        test("A", "C");
    }
}
The spec clearly states that removeAll
"Removes all this collection's elements that are also contained in the
specified collection."
So from my understanding the current behavior is unpredictable. Please help me understand this.
You only read part of the documentation. You missed one important paragraph from TreeSet:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
The removeAll implementation is inherited from AbstractSet and can end up relying on the equals method (through the contains method of the collection you pass in). In your code "a".equals("A") is false, so those elements are not considered equal there, even though the comparator you supplied treats them as equal inside the TreeSet itself. If you try with a wrapper whose equals agrees with its ordering, the problem goes away:
import java.util.*;

class Test
{
    static class StringWrapper implements Comparable<StringWrapper>
    {
        public final String string;

        public StringWrapper(String string)
        {
            this.string = string;
        }

        @Override public boolean equals(Object o)
        {
            return o instanceof StringWrapper &&
                ((StringWrapper)o).string.compareToIgnoreCase(string) == 0;
        }

        @Override public int compareTo(StringWrapper other) {
            return string.compareToIgnoreCase(other.string);
        }

        @Override public String toString() { return string; }
    }

    static void test(StringWrapper... abc)
    {
        Set<StringWrapper> s = new TreeSet<>();
        s.addAll(Arrays.asList(new StringWrapper("a"), new StringWrapper("b")));
        s.removeAll(Arrays.asList(abc));
        System.out.println(s);
    }

    public static void main(String[] args)
    {
        test(new StringWrapper("A"));
        test(new StringWrapper("A"), new StringWrapper("C"));
    }
}
This is because you are now providing consistent implementations of equals and compareTo for your object, so there is never a mismatch between how the objects are added to the sorted set and how the abstract set operations treat them.
This is true in general, a sort of rule of three for Java code: if you implement compareTo or equals or hashCode, you should always implement all of them to avoid problems with standard collections (even if hashCode is less crucial unless you are using these objects in a hashed collection). This is stated many times throughout the Java documentation.
This is an inconsistency in the implementation of TreeSet<E>, bordering on a bug. The code ignores the custom comparator when the number of items in the collection that you pass to removeAll is greater than or equal to the number of items in the set.
The inconsistency is caused by a small optimization: if you look at the implementation of removeAll, which is inherited from AbstractSet, the optimization goes as follows:
public boolean removeAll(Collection<?> c) {
    boolean modified = false;
    if (size() > c.size()) {
        for (Iterator<?> i = c.iterator(); i.hasNext(); )
            modified |= remove(i.next());
    } else {
        for (Iterator<?> i = iterator(); i.hasNext(); ) {
            if (c.contains(i.next())) {
                i.remove();
                modified = true;
            }
        }
    }
    return modified;
}
you can see that the behavior is different when c has fewer items than this set (top branch) vs. when it has as many or more items (bottom branch).
Top branch uses the comparator associated with this set, while the bottom branch uses equals for comparison c.contains(i.next()) - all in the same method!
You can demonstrate this behavior by adding a few extra elements to the original tree set:
s.addAll(Arrays.asList("x", "z", "a", "b"));
Now the output for both test cases becomes identical, because remove(i.next()) utilizes the comparator of the set.
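If the size-dependent behavior is a concern, one possible workaround (a sketch, not the only option) is to remove the elements one at a time, so that every removal goes through TreeSet.remove and therefore through the set's comparator. RemoveEachDemo is a hypothetical name, and test returns the set's string form instead of printing it so the result is easy to inspect.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class RemoveEachDemo {
    // Removes every argument through TreeSet.remove, which consults the comparator.
    static String test(String... abc) {
        Set<String> s = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        s.addAll(Arrays.asList("a", "b"));
        for (String item : abc) {
            s.remove(item);
        }
        return s.toString();
    }

    public static void main(String[] args) {
        System.out.println(test("A"));      // prints [b]
        System.out.println(test("A", "C")); // prints [b]
    }
}
```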
The reason is because the comparator String.CASE_INSENSITIVE_ORDER you use is not consistent with equals.
As stated by TreeSet:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided)
must be consistent with equals if it is to correctly implement the Set interface.
Consistency with equals as stated by Comparable:
The natural ordering for a class C is said to be consistent with equals if and only if
e1.compareTo(e2) == 0 has the same boolean value as e1.equals(e2)
for every e1 and e2 of class C.
And as an example for the case insensitive comparator you use:
String.CASE_INSENSITIVE_ORDER.compare("a", "A") == 0 => true
while
"a".equals("A") => false
Can somebody shed some light on the consequences when compareTo() is inconsistent with equals() for a class? I have read that if obj1.compareTo(obj2) == 0, it is not mandatory that obj1.equals(obj2) == true. But what is the consequence if that happens? Thanks.
The documentation for Comparable explains this in some detail:
The natural ordering for a class C is said to be consistent with equals if and only if e1.compareTo(e2) == 0 has the same boolean value as e1.equals(e2) for every e1 and e2 of class C. Note that null is not an instance of any class, and e.compareTo(null) should throw a NullPointerException even though e.equals(null) returns false.
It is strongly recommended (though not required) that natural orderings be consistent with equals. This is so because sorted sets (and sorted maps) without explicit comparators behave "strangely" when they are used with elements (or keys) whose natural ordering is inconsistent with equals. In particular, such a sorted set (or sorted map) violates the general contract for set (or map), which is defined in terms of the equals method.
For example, if one adds two keys a and b such that (!a.equals(b) && a.compareTo(b) == 0) to a sorted set that does not use an explicit comparator, the second add operation returns false (and the size of the sorted set does not increase) because a and b are equivalent from the sorted set's perspective.
Virtually all Java core classes that implement Comparable have natural orderings that are consistent with equals. One exception is java.math.BigDecimal, whose natural ordering equates BigDecimal objects with equal values and different precisions (such as 4.0 and 4.00).
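The BigDecimal exception mentioned in the documentation is easy to observe. In this minimal sketch (BigDecimalDemo is a hypothetical name), the same two values are added to a hash-based set and to a sorted set.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class BigDecimalDemo {
    // Adds 4.0 and 4.00 to the given set and returns the resulting size.
    static int sizeAfterAdding(Set<BigDecimal> set) {
        set.add(new BigDecimal("4.0"));
        set.add(new BigDecimal("4.00"));
        return set.size();
    }

    public static void main(String[] args) {
        // HashSet relies on equals(), which also compares the scale.
        System.out.println(sizeAfterAdding(new HashSet<>())); // prints 2
        // TreeSet relies on compareTo(), which ignores the scale.
        System.out.println(sizeAfterAdding(new TreeSet<>())); // prints 1
    }
}
```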
Although the documentation says that consistency is not mandatory, it is better to always ensure this consistency, as you never know whether your object may be one day present in a TreeMap / TreeSet or the like. If compareTo() returns 0 for 2 objects that are not equal, then all Tree based collections are broken.
For example, imagine a class Query, implementing an SQL query, with 2 fields:
tableList: list of tables
references: list of programs using such a query
Let's say that 2 objects are equal if their tableList is equal, i.e. the tableList is the natural key of this object. hashCode() and equals() only consider the field tableList:
public class Query implements Comparable {
    List<String> tableList;
    List<String> references;

    Query(List<String> tableList, List<String> references) {
        this.tableList = tableList;
        this.references = references;
        Collections.sort(tableList); // normalize
    }

    @Override
    public int hashCode() {
        int hash = 5;
        hash = 53 * hash + Objects.hashCode(this.tableList);
        return hash;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == null) {
            return false;
        }
        if (getClass() != obj.getClass()) {
            return false;
        }
        final Query other = (Query) obj;
        return Objects.equals(this.tableList, other.tableList);
    }
}
Let's say that we would like the sorting to be along the number of references.
Writing the code naively yields a compareTo() method which could look like this:
public int compareTo(Object o) {
    Query other = (Query) o;
    int s1 = references.size();
    int s2 = other.references.size();
    if (s1 == s2) {
        return 0;
    }
    return s1 - s2;
}
Doing so seems OK as equality and sorting are done on two separate fields, so far so good.
However, as soon as such objects are put in a TreeSet or a TreeMap, the result is catastrophic: these classes consider two elements equal whenever compareTo returns 0. Here that would mean that all objects with the same number of references are 'equal', which is obviously not the case.
A better compareTo() method could be:
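public int compareTo(Object o) {
    Query other = (Query) o;
    // Order by number of references first.
    int bySize = Integer.compare(references.size(), other.references.size());
    if (bySize != 0) {
        return bySize;
    }
    // Tie-break on tableList so that 0 is returned only when equals() is true.
    // Returning a constant such as -1 here would break the compareTo contract,
    // because a.compareTo(b) and b.compareTo(a) would both be negative.
    int n = Math.min(tableList.size(), other.tableList.size());
    for (int i = 0; i < n; i++) {
        int c = tableList.get(i).compareTo(other.tableList.get(i));
        if (c != 0) {
            return c;
        }
    }
    return Integer.compare(tableList.size(), other.tableList.size());
}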
Some collections assume that if obj1.compareTo(obj2) == 0 then obj1.equals(obj2) is also true; TreeSet is one example.
Failing to meet this expectation results in an inconsistent collection. See:
Comparator and equals().
I have a TreeSet containing wrappers which store a Foo object at a certain position, defined like so:
class Wrapper implements Comparable<Wrapper> {
    private final Foo foo;
    private final Double position;
    ...
    @Override public boolean equals(Object o) {
        ...
        if (o instanceof Wrapper)
            return ((Wrapper) o).getFoo().equals(this.foo);
        if (o instanceof Foo)
            return o.equals(this.foo);
        return false;
    }
    @Override public int compareTo(Wrapper o) {
        return position.compareTo(o.getPosition());
    }
}
NavigableSet<Wrapper> fooWrappers = new TreeSet<Wrapper>();
because I want my TreeSet to be ordered by position but searchable by foo. But when I perform these operations:
Foo foo = new Foo(bar);
Wrapper fooWrapper = new Wrapper(foo, 1.0);
fooWrappers.add(fooWrapper);
fooWrapper.equals(new Wrapper(new Foo(bar), 1.0));
fooWrapper.equals(new Foo(bar));
fooWrappers.contains(fooWrapper);
fooWrappers.contains(new Wrapper(foo, 1.0));
fooWrappers.contains(new Wrapper(new Foo(bar), 1.0));
fooWrappers.contains(new Wrapper(foo, 2.0));
fooWrappers.contains(foo);
I get:
true
true
true
true
true
false
Exception in thread "main" java.lang.ClassCastException: org.gridqtl.Marker cannot be cast to java.lang.Comparable
at java.util.TreeMap.getEntry(TreeMap.java:325)
at java.util.TreeMap.containsKey(TreeMap.java:209)
at java.util.TreeSet.contains(TreeSet.java:217)
when I was expecting them all to return true, so it seems that TreeSet.contains is not using my equals method as the API suggests. Is there another method I need to override?
TreeSet is a Set implementation that does indeed use compareTo, as explained in the javadoc - emphasis mine:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
TreeSet is a sorted set.
equals cannot give you ordering information, so TreeSet has to use something else.
This 'something else' is the Comparable interface, or its cousin, the Comparator interface.
Both interfaces tell the collection how to order two objects of a class.
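Since a TreeSet can only search by its ordering, one possible way to get "ordered by position but searchable by foo" is to keep two structures side by side. The following is only a sketch under stated assumptions: PositionIndex is a hypothetical class, a String stands in for Foo, positions are assumed to be unique, and each element is assumed to be added once.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical container: a String stands in for Foo.
class PositionIndex {
    // Sorted view, ordered by position.
    private final TreeMap<Double, String> byPosition = new TreeMap<>();
    // Lookup view, keyed by the element's equals()/hashCode().
    private final Map<String, Double> positionOf = new HashMap<>();

    void add(String foo, double position) {
        byPosition.put(position, foo);
        positionOf.put(foo, position);
    }

    // Searches by element, independent of its position.
    boolean contains(String foo) {
        return positionOf.containsKey(foo);
    }

    // Returns the element with the smallest position, or null if empty.
    String firstInOrder() {
        return byPosition.isEmpty() ? null : byPosition.firstEntry().getValue();
    }

    public static void main(String[] args) {
        PositionIndex index = new PositionIndex();
        index.add("x", 2.0);
        index.add("y", 1.0);
        System.out.println(index.contains("x"));  // prints true
        System.out.println(index.contains("z"));  // prints false
        System.out.println(index.firstInOrder()); // prints y
    }
}
```

The ordering and the equality test each live in the structure that is designed for them, so neither compareTo nor equals has to pretend to be the other.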