I was just wondering if there is anything to take into account when storing our own objects in a TreeMap, similar to how we need to override the equals and hashCode methods when we use our own objects as keys in a HashMap so that we can retrieve them later. A TreeMap does not hash; it uses a red-black tree, but I don't know whether there is something special to do.
If so, could you tell me what needs to be taken into account?
Thanks
The javadoc says:
The map is sorted according to the natural ordering of its keys, or by a Comparator provided at map creation time
So you need to implement a natural ordering correctly, or implement a Comparator correctly.
It also says:
Note that the ordering maintained by a tree map, like any sorted map, and whether or not an explicit comparator is provided, must be consistent with equals if this sorted map is to correctly implement the Map interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Map interface is defined in terms of the equals operation, but a sorted map performs all key comparisons using its compareTo (or compare) method, so two keys that are deemed equal by this method are, from the standpoint of the sorted map, equal. The behavior of a sorted map is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Map interface.
So, if you want to obey the general contract of Map (and you generally should), the compareTo() method must be consistent with equals(). That means you need to implement equals() correctly and, by extension, hashCode(), and you must make sure that a.equals(b) if and only if a.compareTo(b) == 0.
Most of the time, people screw up because they implement a compareTo/compare method that returns 0 for two objects and still expect the map to treat those two objects as different.
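As a minimal sketch of a key that keeps compareTo() consistent with equals() and hashCode(), consider the following. The Version class and its fields are hypothetical and only serve as an illustration; all three methods look at exactly the same fields.

```java
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical immutable key: compareTo, equals and hashCode all use major and minor.
final class Version implements Comparable<Version> {
    private final int major;
    private final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public int compareTo(Version other) {
        int byMajor = Integer.compare(major, other.major);
        return byMajor != 0 ? byMajor : Integer.compare(minor, other.minor);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Version)) return false;
        Version v = (Version) o;
        return major == v.major && minor == v.minor;
    }

    @Override
    public int hashCode() {
        return Objects.hash(major, minor);
    }
}

class ConsistentKeyDemo {
    static String lookup() {
        TreeMap<Version, String> releases = new TreeMap<>();
        releases.put(new Version(1, 0), "initial");
        releases.put(new Version(1, 1), "bugfix");
        // A fresh but equal key finds the entry because compareTo returns 0 for it.
        return releases.get(new Version(1, 1));
    }

    public static void main(String[] args) {
        System.out.println(lookup()); // bugfix
    }
}
```

Because the three methods agree, the same key class should behave the same way in a HashMap and in a TreeMap.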
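The mistake described above can be reproduced with a small sketch. PersonKey is a made-up class whose compareTo looks only at the last name, so two different people with the same last name collapse into one map entry.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical key: compareTo only looks at lastName, equals is not overridden.
final class PersonKey implements Comparable<PersonKey> {
    final String firstName;
    final String lastName;

    PersonKey(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public int compareTo(PersonKey other) {
        return lastName.compareTo(other.lastName);
    }
}

class CollapsedKeysDemo {
    static int sizeAfterTwoPuts() {
        Map<PersonKey, String> map = new TreeMap<>();
        map.put(new PersonKey("Ada", "Smith"), "first");
        // compareTo returns 0 against the existing key, so this replaces the value.
        map.put(new PersonKey("Bob", "Smith"), "second");
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterTwoPuts()); // 1
    }
}
```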
Apart from equals and compareTo transitivity there's one more thing that's terribly important.
Your keys, or at least the fields you're using for comparison, should be immutable.
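To illustrate why mutable comparison fields are dangerous, here is a small sketch with a deliberately mutable, hypothetical MutableKey class. After the field changes, the entry still sits at its old position in the tree, so a lookup with the very same object walks down the wrong branch.

```java
import java.util.TreeMap;

// Hypothetical key with a mutable comparison field (do not do this in real code).
final class MutableKey implements Comparable<MutableKey> {
    int value;

    MutableKey(int value) {
        this.value = value;
    }

    @Override
    public int compareTo(MutableKey other) {
        return Integer.compare(value, other.value);
    }
}

class MutableKeyDemo {
    static boolean foundAfterMutation() {
        TreeMap<MutableKey, String> map = new TreeMap<>();
        MutableKey one = new MutableKey(1);
        map.put(one, "one");
        map.put(new MutableKey(2), "two");
        map.put(new MutableKey(3), "three");
        one.value = 10; // the tree is not re-sorted when a key changes
        return map.containsKey(one);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false
    }
}
```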
And you can actually use anything as a key for a TreeMap as long as you provide a custom Comparator in its constructor.
I had an interview today and the interviewer puzzled me by asking whether it is possible that a TreeSet equals a HashSet but the HashSet does not equal the TreeSet. I said "no" but according to him the answer is "yes".
How is it even possible?
Your interviewer is right: in some specific cases equals between the two is not an equivalence relation. It is possible for a TreeSet to be equal to a HashSet and not vice versa. Here is an example:
TreeSet<String> treeSet = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
HashSet<String> hashSet = new HashSet<>();
treeSet.addAll(List.of("A", "b"));
hashSet.addAll(List.of("A", "B"));
System.out.println(hashSet.equals(treeSet)); // false
System.out.println(treeSet.equals(hashSet)); // true
The reason for this is that a TreeSet uses its comparator to determine whether an element is a duplicate, while a HashSet uses equals (together with hashCode).
Quoting TreeSet:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface.
It’s not possible without violating the contract of either equals or Set. The definition of equals in Java requires symmetry, i.e. a.equals(b) must be the same as b.equals(a).
In fact, the very documentation of Set says
Returns true if the specified object is also a set, the two sets have the same size, and every member of the specified set is contained in this set (or equivalently, every member of this set is contained in the specified set). This definition ensures that the equals method works properly across different implementations of the set interface.
NO, this is impossible without violating the general contract of the equals method of the Object class, which requires symmetry, i.e. x.equals(y) if and only if y.equals(x).
BUT, the classes TreeSet and HashSet implement the equals contract of the Set interface differently. This contract requires, among other things, that every member of the specified set is contained in this set. To determine whether an element is in the set, the contains method is called, which for TreeSet uses the Comparator (or the natural ordering) and for HashSet uses hashCode and equals.
And finally:
YES, this is possible in some cases.
This is a quote from the book Java Generics and Collections:
In principle, all that a client should need to know is how to keep to
its side of the contract; if it fails to do that, all bets are off and
there should be no need to say exactly what the supplier will do.
So the answer is: yes, it can happen, but only when you don't keep to your side of the contract with Java. You could say that Java has violated the symmetric property of equality here, but if that happens, you can be sure that you broke the contract of some other interface first. Java has already documented this behaviour.
Generally you should read documentation of Comparator and Comparable interfaces to use them correctly in sorted collections.
This question is more or less answered in Effective Java, Third Edition, Item 14, on pages 66-68.
This is a quote from the book where it defines the contract for implementing the Comparable interface (note that this is only part of the whole contract):
• It is strongly recommended, but not required, that
(x.compareTo(y) == 0) == (x.equals(y)). Generally speaking, any class
that implements the Comparable interface and violates this condition
should clearly indicate this fact. The recommended language is “Note:
This class has a natural ordering that is inconsistent with equals.”
It says "It is strongly recommended, but not required", which means you are allowed to have classes for which x.compareTo(y) == 0 does not imply x.equals(y) == true. (But if a class is implemented that way, you should not use it as an element in sorted collections; this is exactly the case with BigDecimal.)
The paragraph of the book describing this part of the contract of Comparable interface is worth mentioning:
It is a strong suggestion rather than a true requirement, simply
states that the equality test imposed by the compareTo method should
generally return the same results as the equals method. If this
provision is obeyed, the ordering imposed by the compareTo method is
said to be consistent with equals. If it’s violated, the ordering is
said to be inconsistent with equals. A class whose compareTo method
imposes an order that is inconsistent with equals will still work, but
sorted collections containing elements of the class may not obey the
general contract of the appropriate collection interfaces
(Collection, Set, or Map). This is because the general contracts for
these interfaces are defined in terms of the equals method, but sorted
collections use the equality test imposed by compareTo in place of
equals. It is not a catastrophe if this happens, but it’s something to
be aware of.
Actually we have some classes in Java itself that did not follow this recommendation. BigDecimal is one of them and this is mentioned in the book.
For example, consider the BigDecimal class, whose compareTo method is
inconsistent with equals. If you create an empty HashSet instance and
then add new BigDecimal("1.0") and new BigDecimal("1.00"), the set
will contain two elements because the two BigDecimal instances added
to the set are unequal when compared using the equals method. If,
however, you perform the same procedure using a TreeSet instead of a
HashSet, the set will contain only one element because the two
BigDecimal instances are equal when compared using the compareTo
method. (See the BigDecimal documentation for details.)
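The behaviour described in this quote can be checked with a short sketch. The class and method names are only illustrative; the same two BigDecimal values are added to whichever Set is passed in.

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class BigDecimalSetsDemo {
    // Adds 1.0 and 1.00 to the given set and reports how many elements it keeps.
    static int sizeOf(Set<BigDecimal> set) {
        set.add(new BigDecimal("1.0"));
        set.add(new BigDecimal("1.00"));
        return set.size();
    }

    public static void main(String[] args) {
        // HashSet relies on equals, which also compares the scale: two elements.
        System.out.println(sizeOf(new HashSet<>())); // 2
        // TreeSet relies on compareTo, which ignores the scale: one element.
        System.out.println(sizeOf(new TreeSet<>())); // 1
    }
}
```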
However this behaviour is documented in BigDecimal Documentation. Let's have a look at that part of the documentation:
Note: care should be exercised if BigDecimal objects are used as keys
in a SortedMap or elements in a SortedSet since BigDecimal's natural
ordering is inconsistent with equals. See Comparable, SortedMap or
SortedSet for more information.
So although you can write code like the following, you should not, because the BigDecimal documentation explicitly warns against this usage:
Set<BigDecimal> treeSet = new TreeSet<>();
Set<BigDecimal> hashSet = new HashSet<>();
treeSet.add(new BigDecimal("1.00"));
treeSet.add(new BigDecimal("2.0"));
hashSet.add(new BigDecimal("1.00"));
hashSet.add(new BigDecimal("2.00"));
System.out.println(hashSet.equals(treeSet)); // false
System.out.println(treeSet.equals(hashSet)); // true
Note that the natural ordering defined by Comparable is used when you don't pass a comparator to TreeSet or TreeMap. The same thing can happen when you pass a Comparator to the constructors of those classes. This is mentioned in the Comparator documentation:
The ordering imposed by a comparator c on a set of elements S is said
to be consistent with equals if and only if c.compare(e1, e2)==0 has
the same boolean value as e1.equals(e2) for every e1 and e2 in S.
Caution should be exercised when using a comparator capable of
imposing an ordering inconsistent with equals to order a sorted set
(or sorted map). Suppose a sorted set (or sorted map) with an explicit
comparator c is used with elements (or keys) drawn from a set S. If
the ordering imposed by c on S is inconsistent with equals, the sorted
set (or sorted map) will behave "strangely." In particular the sorted
set (or sorted map) will violate the general contract for set (or
map), which is defined in terms of equals.
So, considering this documentation of Comparator, the following example given by @Aniket Sahrawat is not guaranteed to work:
TreeSet<String> treeSet = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
HashSet<String> hashSet = new HashSet<>();
treeSet.addAll(List.of("A", "b"));
hashSet.addAll(List.of("A", "B"));
System.out.println(hashSet.equals(treeSet)); // false
System.out.println(treeSet.equals(hashSet)); // true
In a nutshell, the answer is: yes, it can happen, but only when you break the documented contract of one of the aforementioned interfaces (SortedSet, Comparable, Comparator).
There already are good answers, but I would like to approach this from a bit more general perspective.
In mathematics, logic and, correspondingly, computer science, "is equal to" is a symmetric binary relation, which means that if A is equal to B, then B is equal to A.
So, if TreeSet X equals HashSet Y, then HashSet Y must equal TreeSet X, and that must always be true.
If, however, the symmetric property of equality is violated (i.e. equality is not implemented correctly), then x.equals(y) might not imply y.equals(x).
The documentation of Object#equals method in Java, explicitly states, that:
The equals method implements an equivalence relation on non-null object references.
hence, it implements the symmetric property, and if it does not, then it violates the Equality, in general, and violates the Object#equals method, specifically in Java.
When I'm looking at the Java Object Ordering tutorial, the last section 'Comparators' of the article confused me a little bit.
By defining a class Employee which itself is comparable by employee's name, the tutorial doesn't show if this class has overridden the equals method. Then it uses a customized Comparator in which the employees are sorted by the seniority to sort a list of employees and which I could understand.
Then the tutorial explains why this won't work for a sorted collection such as TreeSet (a SortedSet), and the reason is:
it generates an ordering that is not compatible with equals. This means that this Comparator equates objects that the equals method does not. In particular, any two employees who were hired on the same date will compare as equal. When you're sorting a List, this doesn't matter; but when you're using the Comparator to order a sorted collection, it's fatal. If you use this Comparator to insert multiple employees hired on the same date into a TreeSet, only the first one will be added to the set; the second will be seen as a duplicate element and will be ignored.
Now I'm confused, since I know List allows duplicate elements while Set doesn't based on equals method. So I wonder when the tutorial says the ordering generated by the Comparator is not compatible with equals, what does it mean? And it also says 'If you use this Comparator to insert multiple employees hired on the same date into a TreeSet, only the first one will be added to the set; the second will be seen as a duplicate element and will be ignored.' I don't understand how using a Comparator will affect the use of original equals method. I think my question is how the TreeSet will be produced and sorted in this case and when the compare and equals methods are used.
So I wonder when the tutorial says the ordering generated by the Comparator is not compatible with equals, what does it mean?
In this example, the Comparator compares two Employee objects based on their seniority alone. This comparison does not in any way use equals or hashCode. Keeping that in mind, when we pass this Comparator to a TreeSet, the set will consider any result of 0 from the Comparator as equality. Therefore, if any Employees share starting dates, only one will be added because the set thinks they are equal.
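A minimal sketch of this effect, assuming a simplified Employee class with just a name and a hire date (the tutorial's actual class is not reproduced here):

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

// Simplified, hypothetical Employee; equals is inherited from Object.
final class Employee {
    private final String name;
    private final LocalDate hireDate;

    Employee(String name, LocalDate hireDate) {
        this.name = name;
        this.hireDate = hireDate;
    }

    String name() { return name; }
    LocalDate hireDate() { return hireDate; }
}

class SeniorityDemo {
    static final Comparator<Employee> BY_HIRE_DATE =
            Comparator.comparing(Employee::hireDate);

    static int distinctBySeniority() {
        Set<Employee> staff = new TreeSet<>(BY_HIRE_DATE);
        LocalDate sameDay = LocalDate.of(2020, 1, 15);
        staff.add(new Employee("Ann", sameDay));
        // compare(...) returns 0 against Ann, so Ben is treated as a duplicate.
        staff.add(new Employee("Ben", sameDay));
        return staff.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctBySeniority()); // 1
    }
}
```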
Finally:
I think my question is how the TreeSet will be produced and sorted in this case and when the compare and equals methods are used.
For the TreeSet, if a Comparator is given, it uses the compare method to determine equality and ordering of objects. If no Comparator is given, then the set uses the compareTo method of the objects being sorted (they must implement Comparable).
The reason why the Java specification claims that the compare/compareTo method being used must be in line with equals is because the Set specification makes use of equals, even though this specific type of Set, the TreeSet, uses comparisons instead.
If you ever receive a Set from some method implementation, you can expect that there are no duplicates of the objects in that Set as defined by the equals method. Because TreeSet doesn't use this method, however, developers must be careful to ensure that the comparison method results in the same equality as equals does.
A TreeSet uses only the Comparator to determine if two elements are "equal":
https://docs.oracle.com/javase/7/docs/api/java/util/TreeSet.html
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
This means that the Comparator should return 0 if and only if the equals returns true, to get a consistent behaviour between TreeSet and other sets, like a HashSet. The HashSet indeed uses equals and the hash code to determine if two elements are "equal".
I am trying to create a method that sorts a List by making:
private List<Processor> getByPriority() {
    return processors.stream().sorted(new ProcessorComparator()).collect(Collectors.toList());
}
But I read in the Comparator javadoc that compare needs to be a total ordering relation. That is, no two processors may have the same priority unless they are equal. This might not be the case.
I was trying this simple comparator:
public class ProcessorComparator implements Comparator<Processor> {
    @Override
    public int compare(Processor processor1, Processor processor2) {
        // Integer.compare avoids the overflow that plain subtraction can cause
        return Integer.compare(processor1.getPriority(), processor2.getPriority());
    }
}
Of course I could make Processor Comparable, but I would like to avoid modifying all the Processors. Isn't there any way to sort them with streams? As alternatives I could write my own method or create a more complex comparator, but I am surprised by the lack of a more elegant solution.
According to the documentation, the elements of the original stream are preserved:
Returns a stream consisting of the elements of this stream, sorted according to the provided Comparator.
No elements will be evicted, deleted or duplicated. The same elements come out of the sort as go in, just re-ordered.
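A short sketch of this, assuming a simplified Processor class with a name and an int priority (both are stand-ins for the asker's real type). Elements with equal priority all survive the sort, and for an ordered stream such as one obtained from a List the sort is stable, so ties keep their original relative order.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the asker's Processor type.
final class Processor {
    private final String name;
    private final int priority;

    Processor(String name, int priority) {
        this.name = name;
        this.priority = priority;
    }

    String getName() { return name; }
    int getPriority() { return priority; }
}

class SortByPriorityDemo {
    static List<String> sortedNames() {
        List<Processor> processors = Arrays.asList(
                new Processor("a", 2),
                new Processor("b", 1),
                new Processor("c", 2),
                new Processor("d", 1));
        return processors.stream()
                .sorted(Comparator.comparingInt(Processor::getPriority))
                .map(Processor::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sortedNames()); // [b, d, a, c]
    }
}
```

Comparator.comparingInt is used here instead of a hand-written class; a dedicated ProcessorComparator would work the same way.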
Edit: the docs also state for Comparator.compare
It is generally the case, but not strictly required that (compare(x,
y)==0) == (x.equals(y)). Generally speaking, any comparator that
violates this condition should clearly indicate this fact. The
recommended language is "Note: this comparator imposes orderings that
are inconsistent with equals."
This may introduce confusion about equals when used in maps or sets:
Caution should be exercised when using a comparator capable of
imposing an ordering inconsistent with equals to order a sorted set
(or sorted map). Suppose a sorted set (or sorted map) with an explicit
comparator c is used with elements (or keys) drawn from a set S. If
the ordering imposed by c on S is inconsistent with equals, the sorted
set (or sorted map) will behave "strangely." In particular the sorted
set (or sorted map) will violate the general contract for set (or
map), which is defined in terms of equals.
The confusion is lifted if you think about a Comparator as an abstraction of a key-value pair: you wouldn't expect two pairs to be equal just because their keys are equal. It just means that some property of those values (i.e. their keys) is considered alike. If you want objects to be ordered in a manner consistent with equals, it is best to implement the Comparable interface.
I have an object, Foo which inherits the default equals method from Object, and I don't want to override this because reference equality is the identity relation that I would like to use.
I now have a specific situation in which I would now like to compare these objects according to a specific field. I'd like to write a comparator, FooValueComparator, to perform this comparison. However, if my FooValueComparator returns 0 whenever two objects have the same value for this particular field, then it is incompatible with the equals method inherited from Object, along with all the problems that entails.
What I would like to do would be to have FooValueComparator compare the two objects first on their field value, and then on their references. Is this possible? What pitfalls might that entail (eg. memory locations being changed causing the relative order of Foo objects to change)?
The reason I would like my comparator to be compatible with equals is because I would like to have the option of applying it to SortedSet collections of Foo objects. I don't want a SortedSet to reject a Foo that I try to add just because it already contains a different object having the same value.
This is described in the documentation of Comparator:
The ordering imposed by a comparator c on a set of elements S is said to be consistent with equals if and only if c.compare(e1, e2)==0 has the same boolean value as e1.equals(e2) for every e1 and e2 in S.
Caution should be exercised when using a comparator capable of imposing an ordering inconsistent with equals to order a sorted set (or sorted map). Suppose a sorted set (or sorted map) with an explicit comparator c is used with elements (or keys) drawn from a set S. If the ordering imposed by c on S is inconsistent with equals, the sorted set (or sorted map) will behave "strangely." In particular the sorted set (or sorted map) will violate the general contract for set (or map), which is defined in terms of equals.
In short, if the implementation of Comparator is not consistent with the equals method, then you should know what you're doing and you're responsible for the side effects of this design, but there is no obligation to make the implementation consistent with Object#equals. Still, take into account that consistency is preferable in order not to confuse future coders who will maintain the system. A similar concept applies when implementing Comparable.
An example of this in the JDK may be found in BigDecimal#compareTo, which explicitly states in javadoc that this method is not consistent with BigDecimal#equals.
If your intention is to use a SortedSet<YourClass>, then you're probably using the wrong approach. I would recommend using a SortedMap<TypeOfYourField, Collection<YourClass>> (or SortedMap<TypeOfYourField, YourClass>, in case there are no equal elements for the same key) instead. It may be more work, but it gives you more control over the data stored in and retrieved from the structure.
You may have several comparators for a given class, e.g. one per field. In that case equals cannot be reused, so the answer is: not necessarily. You should make them consistent, however, if your elements are stored in a sorted collection (a sorted set or sorted map) and the comparator is used to determine each element's position in that collection.
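The SortedMap suggestion could look roughly like the sketch below. Foo here is a hypothetical class with a single int field named value, and equals is left as reference equality, as in the question; elements sharing a field value end up in the same bucket instead of being rejected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical Foo: equals is inherited from Object (reference equality).
final class Foo {
    final int value;

    Foo(int value) {
        this.value = value;
    }
}

class GroupedByValueDemo {
    // Groups the objects by their field value; keys are kept in sorted order.
    static SortedMap<Integer, List<Foo>> group(List<Foo> foos) {
        SortedMap<Integer, List<Foo>> byValue = new TreeMap<>();
        for (Foo foo : foos) {
            byValue.computeIfAbsent(foo.value, k -> new ArrayList<>()).add(foo);
        }
        return byValue;
    }

    static int bucketSizeForFive() {
        List<Foo> foos = Arrays.asList(new Foo(5), new Foo(3), new Foo(5));
        return group(foos).get(5).size();
    }

    public static void main(String[] args) {
        System.out.println(bucketSizeForFive()); // 2
    }
}
```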
See documentation for details.
It says in the contract for the Comparator interface, that it must be consistent with equals.
Does this mean that Comparator = zero if equalsTo = true , or does it mean that Comparator = zero if and only if equalsTo = true?
I seem to remember that it is the second one, but I have come across lots of comparators which sort by non-unique sub properties.
For example, I might have objects which have a sub-property date, and I want to sort my list of objects by the date of submission. However, you can have several objects with the same date? What are the consequences of this? Surely there is a best practice solution to this problem already? How can I sort a collection by a property which is not guaranteed to be unique without violating the comparator contract? What are the consequences for this type of violation? Are they manageable?
It's not at all true that a Comparator must be consistent with equals.
The docs merely warn about this situation:
Caution should be exercised when using a comparator capable of imposing an ordering inconsistent with equals to order a sorted set (or sorted map) (http://docs.oracle.com/javase/7/docs/api/java/util/Comparator.html)
If you have one ordering based on date, and another ordering based on date+time, you should simply implement multiple comparators.
Perhaps you are confusing Comparator with Comparable? For Comparable the docs strongly advise against this situation:
It is strongly recommended (though not required) that natural orderings be consistent with equals. (http://docs.oracle.com/javase/7/docs/api/java/lang/Comparable.html)
This difference makes sense if you realize that an object can only have 1 implementation of Comparable, but multiple of Comparator. The whole idea of Comparator is to have multiple ways of comparing the same class.
Edit: you could have multiple Comparators and, as popovitsj stated, they don't necessarily have to be consistent with equals
(although I presume most of the time you have Comparator.compare(obj1, obj2) == 0 <=> obj1.equals(obj2) == true)
If you want to have specific sort results when sorting by non-unique field, you need to customize your Comparator to account for these,
for example, while implementing compare() you encounter that obj1.date == obj2.date, then you should compare other important fields (name, age, etc) to rank obj1 vs obj2 accordingly and return corresponding value.
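One way to sketch such a tie-breaking comparator is with Comparator.comparing(...).thenComparing(...). The Submission class and its fields are assumptions made for this example only.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical class with a non-unique date and a second field used as tie-breaker.
final class Submission {
    private final String title;
    private final LocalDate date;

    Submission(String title, LocalDate date) {
        this.title = title;
        this.date = date;
    }

    String getTitle() { return title; }
    LocalDate getDate() { return date; }
}

class TieBreakerDemo {
    // Compare by date first; only when the dates are equal, compare by title.
    static final Comparator<Submission> BY_DATE_THEN_TITLE =
            Comparator.comparing(Submission::getDate)
                      .thenComparing(Submission::getTitle);

    static int sizeWithTieBreaker() {
        Set<Submission> set = new TreeSet<>(BY_DATE_THEN_TITLE);
        LocalDate sameDay = LocalDate.of(2021, 6, 1);
        set.add(new Submission("Report A", sameDay));
        set.add(new Submission("Report B", sameDay));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithTieBreaker()); // 2
    }
}
```

Two submissions with the same date and the same title would still compare as 0, so the tie-breaker fields need to be chosen so that together they identify an object.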
Hope that helps.
As you suspect, in order for compareTo() to be consistent with equals(), compareTo() must always return 0 when equals() returns true. Similarly, if equals returns false, then compareTo must not return 0.
However, as popovitsj has pointed out in his answer, consistency with equals() is not a requirement. As such, the above only applies when you are attempting to make the two methods consistent.
It says in the contract for the Comparator interface, that it must be consistent with equals.
That's not entirely correct; see @popovitsj's answer.
Does this mean that Comparator = zero if equalsTo = true , or does it mean that Comparator = zero if and only if equalsTo = true?
It means the latter. However, it is not actually a hard requirement for Comparator objects.
I seem to remember that it is the second one, but I have come across lots of comparators which sort by non-unique sub properties.
Well that's reasonable, given that it is not actually a hard requirement. In fact, a Comparator that is inconsistent with equals(Object) is just fine if you are going to use it with Arrays.sort(...). The problems only arise with TreeSet and TreeMap.
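As a small illustration of the harmless case, the sketch below sorts an array with a comparator that returns 0 for strings that are not equal (it compares lengths only). Nothing is lost, and because Arrays.sort on objects is stable, strings of equal length keep their original order.

```java
import java.util.Arrays;
import java.util.Comparator;

class SortByLengthDemo {
    static String sortedByLength() {
        String[] words = {"pear", "fig", "plum", "kiwi", "date"};
        // Inconsistent with equals: "pear" and "plum" compare as 0 but are not equal.
        Arrays.sort(words, Comparator.comparingInt(String::length));
        return String.join(",", words);
    }

    public static void main(String[] args) {
        System.out.println(sortedByLength()); // fig,pear,plum,kiwi,date
    }
}
```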
For example, suppose that you have a Comparator<E> C that says e1 and e2 are not equal, but e1.equals(e2) returns true. Now suppose that you create a TreeSet<E> instance using the comparator, and then add e1 and e2 to that set. The set's tree is organized based on the comparator, and therefore e1 and e2 will slot into different places in the search tree, and will both be elements of the set. But that violates the primary invariant of a Set ... which is based on the equals method.
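The scenario above can be sketched with a hypothetical Tagged class whose equals looks only at an id, combined with a comparator that looks only at a label. The TreeSet keeps both objects even though they are equal according to equals, while a HashSet keeps one.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical element type: equality is defined by id alone.
final class Tagged {
    final int id;
    final String label;

    Tagged(int id, String label) {
        this.id = id;
        this.label = label;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Tagged && ((Tagged) o).id == id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}

class SplitEqualsDemo {
    static int sizeOf(Set<Tagged> set) {
        set.add(new Tagged(1, "x"));
        set.add(new Tagged(1, "y")); // equal by equals, different by label
        return set.size();
    }

    public static void main(String[] args) {
        Comparator<Tagged> byLabel = Comparator.comparing((Tagged t) -> t.label);
        System.out.println(sizeOf(new TreeSet<>(byLabel))); // 2
        System.out.println(sizeOf(new HashSet<>()));        // 1
    }
}
```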
As the javadoc for TreeSet says:
"Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface."
And this answers the last part of your question.
What are the consequences for this type of violation?
If you use an inconsistent Comparator in a TreeSet or TreeMap, the collection will not obey the Set or Map contract.
a and b may not be equal. But if comparator is zero when comparing a and b it should be zero when comparing b and a.
This answer says:
Typically, if 2 objects are equal from an equals perspective but not from a compareTo perspective, you can store both objects as keys in a TreeMap. This can lead to un-intuitive behaviour. It can also be done on purpose in specific situations.
But this is for specific situations; in general, nothing stops you from having equals and compareTo behave inconsistently.
One example, this morning someone asked: Move specific items to the end of a list
Most of the answers have comparators that returns 0 for elements that are not equal.
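A comparator of that kind might look like the sketch below; it is not taken from the linked question. It maps every element to a Boolean ("is this the target?"), so all non-target elements compare as 0 with each other even though they are not equal, and the stable List.sort keeps their relative order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class MoveToEndDemo {
    // Returns a copy of items with every occurrence of target moved to the end.
    static List<String> moveToEnd(List<String> items, String target) {
        List<String> copy = new ArrayList<>(items);
        // false sorts before true, so elements equal to target end up last.
        copy.sort(Comparator.comparing((String s) -> s.equals(target)));
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(moveToEnd(List.of("b", "x", "a", "x", "c"), "x"));
        // [b, a, c, x, x]
    }
}
```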