What is the importance of "same ordering" objects being equal? - java

I'm sorting an array of objects. The objects have lots of fields but I only care about one of them. So, I wrote a comparator:
Collections.sort(details, new Comparator<MyObj>() {
@Override
public int compare(MyObj d1, MyObj d2) {
if (d1.getDate() == null && d2.getDate() == null) {
return 0;
} else if (d1.getDate() == null) {
return -1;
} else if (d2.getDate() == null) {
return 1;
}
if (d1.getDate().before(d2.getDate())) return 1;
else if (d1.getDate().after(d2.getDate())) return -1;
else return 0;
}
});
From the perspective of my use case, this Comparator does all it needs to, even if I might consider this sorting non-deterministic. However, I wonder if this is bad code. Through this Comparator, two very distinct objects could be considered "the same" ordering even if they are unequal objects. I decided to use hashCode as a tiebreaker, and it came out something like this:
Collections.sort(details, new Comparator<MyObj>() {
@Override
public int compare(MyObj d1, MyObj d2) {
if (d1.getDate() == null && d2.getDate() == null) {
return d1.hashCode();
} else if (d1.getDate() == null) {
return -1;
} else if (d2.getDate() == null) {
return 1;
}
if (d1.getDate().before(d2.getDate())) return 1;
else if (d1.getDate().after(d2.getDate())) return -1;
else return d1.hashCode() - d2.hashCode();
}
});
(what I return might be backwards, but that's not important to this question)
Is this necessary?
EDIT:
To anyone else looking at this question, consider using Google's ordering API. The logic above was replaced by:
return Ordering.<Date> natural().reverse().nullsLast().compare(d1.getDate(), d2.getDate());

Through this comparator, two very distinct objects could be considered "the same" ordering even if they are unequal objects.
That really doesn't matter; it's perfectly fine for two objects to compare as equal even if they are not "equal" in any other sense. Collections.sort is a stable sort, meaning objects that compare as equal come out in the same order they came in; that's equivalent to just using "the index in the input" as a tiebreaker.
(Also, your new Comparator is actually significantly more broken than the original. return d1.hashCode() is particularly nonsensical, and return d1.hashCode() - d2.hashCode() can lead to nontransitive orderings that will break Collections.sort, because of overflow issues. Unless both integers are definitely nonnegative, which hashCodes aren't, always use Integer.compare to compare integers.)
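To make the overflow point concrete, here is a minimal sketch. The two values are hypothetical hash codes chosen to trigger the wrap-around, and the class and method names are made up for illustration:

```java
class HashTiebreakDemo {
    // Subtraction-based comparison: wraps around when the true
    // difference does not fit in an int.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Overflow-safe comparison.
    static int bySafeCompare(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        int big = Integer.MAX_VALUE; // hypothetical hash code of d1
        int small = -1;              // hypothetical hash code of d2
        // big > small, so a correct comparison must be positive.
        System.out.println(bySubtraction(big, small) > 0); // false: the result wraps to Integer.MIN_VALUE
        System.out.println(bySafeCompare(big, small) > 0); // true
    }
}
```

With the subtraction version, big compares as smaller than small, which is how a sort can end up seeing a nontransitive ordering.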

This mostly matters only if the objects implement Comparable.
It is strongly recommended (though not required) that natural orderings be consistent with equals. This is so because sorted sets (and sorted maps) without explicit comparators behave "strangely" when they are used with elements (or keys) whose natural ordering is inconsistent with equals. In particular, such a sorted set (or sorted map) violates the general contract for set (or map), which is defined in terms of the equals method.
For example, if one adds two keys a and b such that (!a.equals(b) && a.compareTo(b) == 0) to a sorted set that does not use an explicit comparator, the second add operation returns false (and the size of the sorted set does not increase) because a and b are equivalent from the sorted set's perspective.
However, you're not doing that, you're using a custom Comparator, probably for presentation reasons. Since this sorting metric isn't inherently attached to the object, it doesn't matter that much.
As an aside, why not just return 0 instead of messing with the hashCodes? Then the objects will keep their original order if the dates match, because Collections.sort is a stable sort. I agree with @LouisWasserman that using hashCode in this way can have potentially very bizarre consequences, mostly relating to integer overflow. Consider the case where d1.hashCode() is positive and d2.hashCode() is negative, and vice versa.
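For what it's worth, on Java 8 or later the same ordering can be sketched with the JDK's own Comparator helpers, with no tiebreaker at all. The MyObj below is a hypothetical stand-in with only a label and a nullable date; the real class is not shown in the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Date;
import java.util.List;

class StableDateSortDemo {
    // Hypothetical stand-in for MyObj: a label plus a nullable date.
    static final class MyObj {
        final String label;
        final Date date;
        MyObj(String label, Date date) { this.label = label; this.date = date; }
        Date getDate() { return date; }
    }

    // Nulls first, then newest date first, like the question's first comparator.
    static final Comparator<MyObj> BY_DATE_DESC =
            Comparator.comparing(MyObj::getDate,
                    Comparator.nullsFirst(Comparator.<Date>reverseOrder()));

    static List<String> sortedLabels(List<MyObj> input) {
        List<MyObj> copy = new ArrayList<>(input);
        copy.sort(BY_DATE_DESC); // stable: ties keep their input order
        List<String> labels = new ArrayList<>();
        for (MyObj o : copy) {
            labels.add(o.label);
        }
        return labels;
    }

    public static void main(String[] args) {
        Date same = new Date(1000L);
        List<MyObj> details = Arrays.asList(
                new MyObj("first", same),
                new MyObj("newer", new Date(2000L)),
                new MyObj("second", same),
                new MyObj("undated", null));
        System.out.println(sortedLabels(details)); // [undated, newer, first, second]
    }
}
```

"first" and "second" share a date and come out in the order they went in, which is the stable-sort behaviour described above.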

Related

Java - Do two HashSets change similarly over time?

I already looked into different questions, but these usually ask about consistency or ordering, while I am interested in the ordering of two HashSets containing the same elements at the same time.
I want to create a HashSet of HashSets containing integers. Over time I will put HashSets of size 3 in this bigger HashSet and I will want to see if a newly created HashSet is already contained within the bigger HashSet.
Now my question is will it always find duplicates or can the ordering of two HashSets with the same elements be different?
I am conflicted as they use the same hashcode() function but does that mean they will always be the same?
HashSet<HashSet<Integer>> test = new HashSet<>();
HashSet<Integer> one = new HashSet<>();
one.add(1);
one.add(2);
one.add(5);
test.add(one);
HashSet<Integer> two = new HashSet<>();
two.add(5);
two.add(1);
two.add(2);
//Some other stuff that runs over time
System.out.println(test.contains(two));
Above code tries to illustrate what I mean, does this always return true? (Keep in mind I might initialise another HashSet with the same elements and try the contains again)
Yes, the above always returns true. Sets have no order, and when you test whether two Sets are equal to each other, you are checking that they have the same elements. Order has no meaning.
To elaborate, test.contains(two) will return true if and only if test contains an element that has the same hashCode() as two and is equal to two (according to the equals method).
Two sets s1 and s2 that have the same elements have the same hashCode() and s1.equals(s2) returns true.
This is required by the contract of equals and hashCode of the Set interface:
equals
Compares the specified object with this set for equality. Returns true if the specified object is also a set, the two sets have the same size, and every member of the specified set is contained in this set (or equivalently, every member of this set is contained in the specified set). This definition ensures that the equals method works properly across different implementations of the set interface.
hashCode
Returns the hash code value for this set. The hash code of a set is defined to be the sum of the hash codes of the elements in the set, where the hash code of a null element is defined to be zero. This ensures that s1.equals(s2) implies that s1.hashCode()==s2.hashCode() for any two sets s1 and s2, as required by the general contract of Object.hashCode.
As you can see, one and two don't even have to use the same implementation of the Set interface in order for test.contains(two) to return true. They just have to contain the same elements.
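A small sketch of that last point, using different Set implementations and different insertion orders (the class and method names are made up for the demo):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

class SetEqualityDemo {
    // Stores a HashSet, then looks it up via a LinkedHashSet and a TreeSet
    // holding the same elements in different insertion orders.
    static boolean containsEquivalent() {
        Set<Set<Integer>> test = new HashSet<>();
        test.add(new HashSet<>(Arrays.asList(1, 2, 5)));

        Set<Integer> linked = new LinkedHashSet<>(Arrays.asList(5, 1, 2));
        Set<Integer> tree = new TreeSet<>(Arrays.asList(2, 5, 1));
        return test.contains(linked) && test.contains(tree);
    }

    public static void main(String[] args) {
        System.out.println(containsEquivalent()); // true
    }
}
```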
The key property of sets is about uniqueness of keys.
By "default", insertion order doesn't matter at all.
A LinkedHashSet guarantees that when iterating, you always get the elements in the same order (the one used for inserting them). But even then, when comparing such sets, it is still only about their content, not the insertion order.
In other words: no matter what (default) implementation of the Set interface you are using, you should always see consistent behavior. Of course you are free to implement your own Set and to violate that contract, but well, violating contracts leads to violated contracts, aka bugs.
You can look for yourself, this is open source code:
public boolean equals(Object o) {
if (o == this)
return true;
if (!(o instanceof Set))
return false;
Collection<?> c = (Collection<?>) o;
if (c.size() != size())
return false;
try {
return containsAll(c);
} catch (ClassCastException unused) {
return false;
} catch (NullPointerException unused) {
return false;
}
}
public int hashCode() {
int h = 0;
Iterator<E> i = iterator();
while (i.hasNext()) {
E obj = i.next();
if (obj != null)
h += obj.hashCode();
}
return h;
}
You can easily see that the hash code is the sum of the hash codes of the elements, so it is not affected by any order, and that equals uses containsAll(...), so the order doesn't matter there either.

My treemap breaks after sorting it because "comparator used for the treemap is inconsistent with equals"

I needed to sort my treemap based on its values. The requirements of what I'm doing are such that I have to use a sorted map. I tried the solution here: Sort a Map<Key, Value> by values (Java). However, as the comments say, this will make getting values from my map not work. So, instead I did the following:
class sorter implements Comparator<String> {
Map<String, Integer> _referenceMap;
public boolean sortDone = false;
public sorter(Map<String, Integer> referenceMap) {
_referenceMap = referenceMap;
}
public int compare(String a, String b) {
return sortDone ? a.compareTo(b) : _referenceMap.get(a) >= _referenceMap.get(b) ? -1 : 1;
}
}
So I leave sortDone to false until I'm finished sorting my map, and then I switch sortDone to true so that it compares things as normal. Problem is, I still cannot get items from my map. When I do myMap.get(/anything/) it is always null still.
I also do not understand what the comparator inconsistent with equals even means.
I also do not understand what the comparator inconsistent with equals even means.
As per the contract of the Comparable interface.
The natural ordering for a class C is said to be consistent with equals if and only if e1.compareTo(e2) == 0 has the same boolean value as e1.equals(e2) for every e1 and e2 of class C. Note that null is not an instance of any class, and e.compareTo(null) should throw a NullPointerException even though e.equals(null) returns false.
It is strongly recommended (though not required) that natural orderings be consistent with equals.
I believe you need to change the line :
_referenceMap.get(a) >= _referenceMap.get(b) ? -1 : 1;
to
_referenceMap.get(a).compareTo(_referenceMap.get(b));
If the Integer returned by _referenceMap.get(a) is equal in value to the Integer returned by _referenceMap.get(b), then you should ideally return 0, not -1. (To keep the original highest-first order, swap the operands: _referenceMap.get(b).compareTo(_referenceMap.get(a)).)
It means you must implement, i.e. override, the equals() method to compare the same field(s) you are comparing in the compareTo() method.
It is good practice to override the hashCode() method to return a hash based on the same fields too.
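Here is a rough sketch of why the lookups come back null and one way around it. The map contents and the names (scores, neverZero, byValueThenKey) are made up for the demo. It also assumes the reference map is not modified while the TreeMap is in use, since a comparator whose answers change over time breaks the tree:

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class ValueOrderedMapDemo {
    // Hypothetical data: "a" and "c" share the same value.
    static Map<String, Integer> scores() {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 3);
        m.put("b", 7);
        m.put("c", 3);
        return m;
    }

    // Like the question's comparator: it never returns 0,
    // so TreeMap.get can never conclude "this is the key".
    static Comparator<String> neverZero(Map<String, Integer> ref) {
        return (x, y) -> ref.get(x) >= ref.get(y) ? -1 : 1;
    }

    // Highest value first; ties broken by the key itself,
    // so 0 is returned only for the same key.
    static Comparator<String> byValueThenKey(Map<String, Integer> ref) {
        return Comparator.<String, Integer>comparing(ref::get).reversed()
                .thenComparing(Comparator.<String>naturalOrder());
    }

    static TreeMap<String, Integer> build(Comparator<String> cmp, Map<String, Integer> ref) {
        TreeMap<String, Integer> sorted = new TreeMap<>(cmp);
        sorted.putAll(ref);
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, Integer> ref = scores();
        System.out.println(build(neverZero(ref), ref).get("a"));      // null
        System.out.println(build(byValueThenKey(ref), ref).get("a")); // 3
        System.out.println(build(byValueThenKey(ref), ref).keySet()); // [b, a, c]
    }
}
```

Because the second comparator returns 0 exactly when the keys are equal strings, it is consistent with equals and get works again.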

Partial ordered Comparator

How to implement java.util.Comparator that orders its elements according to a partial order relation?
For example given a partial order relation a ≺ c, b ≺ c; the order of a and b is undefined.
Since Comparator requires a total ordering, the implementation orders elements for which the partial ordering is undefined arbitrarily but consistently.
Would the following work?
interface Item {
boolean before(Item other);
}
class ItemPartialOrderComperator implements Comparator<Item> {
@Override
public int compare(Item o1, Item o2) {
if(o1.equals(o2)) { // Comparator returns 0 if and only if o1 and o2 are equal;
return 0;
}
if(o1.before(o2)) {
return -1;
}
if(o2.before(o1)) {
return +1;
}
return o1.hashCode() - o2.hashCode(); // Arbitrary order on hashcode
}
}
Is this comparator's ordering transitive?
(I fear that it is not)
Are Comparators required to be transitive?
(when used in a TreeMap)
How to implement it correctly?
(if the implementation above doesn't work)
(Hashcodes can collide; for simplicity the example ignores collisions. See Damien B's answer to Impose a total ordering on all instances of *any* class in Java for a fail-safe ordering on hashcodes.)
The problem is that, when you have incomparable elements, you need to fall back to something cleverer than comparing hash codes. For example, given a partial order {a < b, c < d}, the hash codes could satisfy h(d) < h(b) < h(c) < h(a), which means that a < b < c < d < a (where b < c and d < a are the ties broken by hash code), which will cause problems with a TreeMap.
In general, there's probably nothing for you to do except topologically sort the keys beforehand, so some details about the partial orders of interest to you would be welcome.
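As a rough illustration of the "topologically sort beforehand" idea, here is a simple quadratic sketch. It assumes before() is a strict partial order with no cycles. The Named class and the method names are hypothetical and exist only for the demo:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PartialOrderSortDemo {
    // Same shape as the Item interface in the question.
    interface Item {
        boolean before(Item other);
    }

    // Hypothetical Item: precedes every item whose name is in 'successors'.
    static final class Named implements Item {
        final String name;
        final Set<String> successors;
        Named(String name, String... successors) {
            this.name = name;
            this.successors = new HashSet<>(Arrays.asList(successors));
        }
        @Override
        public boolean before(Item other) {
            return other instanceof Named && successors.contains(((Named) other).name);
        }
    }

    // Repeatedly emits the first remaining element that has no remaining predecessor.
    static <T extends Item> List<T> topologicalSort(List<T> items) {
        List<T> remaining = new ArrayList<>(items);
        List<T> result = new ArrayList<>(items.size());
        while (!remaining.isEmpty()) {
            int pick = -1;
            for (int i = 0; i < remaining.size() && pick < 0; i++) {
                boolean hasPredecessor = false;
                for (int j = 0; j < remaining.size(); j++) {
                    if (j != i && remaining.get(j).before(remaining.get(i))) {
                        hasPredecessor = true;
                        break;
                    }
                }
                if (!hasPredecessor) {
                    pick = i;
                }
            }
            if (pick < 0) {
                throw new IllegalArgumentException("before() contains a cycle");
            }
            result.add(remaining.remove(pick));
        }
        return result;
    }

    // Uses the partial order {a < b, c < d} from the example above.
    static List<String> demo() {
        List<Named> input = Arrays.asList(
                new Named("d"), new Named("b"), new Named("c", "d"), new Named("a", "b"));
        List<String> names = new ArrayList<>();
        for (Named n : topologicalSort(input)) {
            names.add(n.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // a comes before b, and c before d
    }
}
```

The position in the resulting list can then serve as a consistent total order, for example by storing each item's index and comparing indices with Integer.compare.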
It seems to be more of an answer than a comment so I'll post it
The documentation says:
It follows immediately from the contract for compare that the quotient is an equivalence relation on S, and that the imposed ordering is a total order on S.
So no, a Comparator requires a total ordering. If you implement this with a partial ordering you're breaching the interface contract.
Even if it might work in some scenario, you should not attempt to solve your problem in a way that breaches the contract of the interface.
See this question about data structures that do fit a partial ordering.
Any time I've tried using hash codes for this sort of thing I've come to regret it. You will be much happier if your ordering is deterministic - for debuggability if nothing else. The following will achieve that, by creating a fresh index for any not previously encountered Item and using those indices for the comparison if all else fails.
Note that the ordering still is not guaranteed to be transitive.
class ItemPartialOrderComperator implements Comparator<Item> {
@Override
public int compare(Item o1, Item o2) {
if(o1.equals(o2)) {
return 0;
}
if(o1.before(o2)) {
return -1;
}
if(o2.before(o1)) {
return +1;
}
return getIndex(o1) - getIndex(o2);
}
private int getIndex(Item i) {
Integer result = indexMap.get(i);
if (result == null) {
indexMap.put(i, result = indexMap.size());
}
return result;
}
private Map<Item,Integer> indexMap = new HashMap<Item, Integer>();
}
In JDK 7, sorting with such a comparator may throw a runtime exception:
Area: API: Utilities
Synopsis: Updated sort behavior for Arrays and Collections may throw an IllegalArgumentException
Description: The sorting algorithm used by java.util.Arrays.sort and (indirectly) by java.util.Collections.sort has been replaced. The new sort implementation may throw an IllegalArgumentException if it detects a Comparable that violates the Comparable contract. The previous implementation silently ignored such a situation.
If the previous behavior is desired, you can use the new system property, java.util.Arrays.useLegacyMergeSort, to restore previous mergesort behavior.
Nature of Incompatibility: behavioral
RFE: 6804124
If a < b and b < c implies a < c, then you have made a total ordering by using the hashCodes. Take a < d, d < c. The partial order says that b and d are not necessarily ordered. By introducing hashCodes you provide an ordering.
Example: is-a-descendant-of(human, human).
Adam (hash 42) < Moses (hash 17), Adam < Joe (hash 9)
Implies
Adam < Joe < Moses
A negative example would be the same relation, but when time travel allows being your own descendant.
When one item is neither "before" nor "after" another, instead of returning a comparison of the hashcode, just return 0. The result will be "total ordering" and "arbitrary" ordering of coincident items.

Consequence when compareTo() is inconsistent with equals()

Can somebody shed some light on the consequences when a class's compareTo() is inconsistent with its equals()? I have read that if Obj1.compareTo(Obj2) == 0, it is not mandatory that Obj1.equals(Obj2) be true. But what is the consequence if this happens? Thanks.
The documentation for Comparable explains this in some detail:
The natural ordering for a class C is said to be consistent with equals if and only if e1.compareTo(e2) == 0 has the same boolean value as e1.equals(e2) for every e1 and e2 of class C. Note that null is not an instance of any class, and e.compareTo(null) should throw a NullPointerException even though e.equals(null) returns false.
It is strongly recommended (though not required) that natural orderings be consistent with equals. This is so because sorted sets (and sorted maps) without explicit comparators behave "strangely" when they are used with elements (or keys) whose natural ordering is inconsistent with equals. In particular, such a sorted set (or sorted map) violates the general contract for set (or map), which is defined in terms of the equals method.
For example, if one adds two keys a and b such that (!a.equals(b) && a.compareTo(b) == 0) to a sorted set that does not use an explicit comparator, the second add operation returns false (and the size of the sorted set does not increase) because a and b are equivalent from the sorted set's perspective.
Virtually all Java core classes that implement Comparable have natural orderings that are consistent with equals. One exception is java.math.BigDecimal, whose natural ordering equates BigDecimal objects with equal values and different precisions (such as 4.0 and 4.00).
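The BigDecimal exception is easy to see for yourself. A minimal sketch (the class and method names are made up):

```java
import java.math.BigDecimal;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class BigDecimalSetDemo {
    // Adds 4.0 and 4.00 to the given set and reports how many elements it kept.
    static int sizeAfterAdding(Set<BigDecimal> set) {
        set.add(new BigDecimal("4.0"));
        set.add(new BigDecimal("4.00"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAdding(new HashSet<>())); // 2: equals() distinguishes the scales
        System.out.println(sizeAfterAdding(new TreeSet<>())); // 1: compareTo() returns 0
    }
}
```

The HashSet goes by equals and hashCode and keeps both values, while the TreeSet goes by compareTo and treats the second one as a duplicate.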
Although the documentation says that consistency is not mandatory, it is better to always ensure this consistency, as you never know whether your object may be one day present in a TreeMap / TreeSet or the like. If compareTo() returns 0 for 2 objects that are not equal, then all Tree based collections are broken.
For example, imagine a class Query, implementing an SQL query, with 2 fields:
tableList: list of tables
references: list of programs using such a query
Let's say that 2 objects are equal if their tableList is equal, i.e. the tableList is the natural key of this object. hashCode() and equals() only consider the field tableList:
public class Query implements Comparable {
List<String> tableList;
List<String> references;
Query(List<String> tableList, List<String> references) {
this.tableList = tableList;
this.references = references;
Collections.sort(tableList); // normalize
}
@Override
public int hashCode() {
int hash = 5;
hash = 53 * hash + Objects.hashCode(this.tableList);
return hash;
}
@Override
public boolean equals(Object obj) {
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
final Query other = (Query) obj;
return Objects.equals(this.tableList, other.tableList);
}
}
Let's say that we would like the sorting to be along the number of references.
Writing the code naively yields a compareTo() method which could look like this:
public int compareTo(Object o) {
Query other = (Query) o;
int s1 = references.size();
int s2 = other.references.size();
if (s1 == s2) {
return 0;
}
return s1 - s2;
}
Doing so seems OK as equality and sorting are done on two separate fields, so far so good.
However, whenever such an object is put in a TreeSet or a TreeMap, it is catastrophic: the implementations of these classes consider elements equal if compareTo returns 0. In this case, it would mean that all objects with an identical number of references are 'equal' objects, which is obviously not the case.
A better compareTo() method could be:
public int compareTo(Object o) {
Query other = (Query) o;
// important to match equals!!!
if (this.equals(other)) {
return 0;
}
int s1 = references.size();
int s2 = other.references.size();
if (s1 == s2) {
return -1; // not 0, they are NOT equal!
}
return s1 - s2;
}
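One caveat with returning -1 whenever the sizes tie: for two unequal queries with the same number of references, both a.compareTo(b) and b.compareTo(a) return -1, which breaks the sign-symmetry that the compareTo contract requires. A possible sketch that breaks ties symmetrically compares the normalized tableList element by element. The compareLists helper is hypothetical. This version is consistent with equals only under the assumption that queries with an equal tableList also have the same number of references:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.TreeSet;

class Query implements Comparable<Query> {
    private final List<String> tableList;
    private final List<String> references;

    Query(List<String> tableList, List<String> references) {
        this.tableList = new ArrayList<>(tableList);
        Collections.sort(this.tableList); // normalize, as in the original
        this.references = new ArrayList<>(references);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(tableList);
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof Query && tableList.equals(((Query) obj).tableList);
    }

    @Override
    public int compareTo(Query other) {
        // Primary key: number of references.
        int bySize = Integer.compare(references.size(), other.references.size());
        if (bySize != 0) {
            return bySize;
        }
        // Symmetric tiebreak: 0 only when the table lists are equal.
        return compareLists(tableList, other.tableList);
    }

    // Hypothetical helper: lexicographic comparison of two string lists.
    private static int compareLists(List<String> a, List<String> b) {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = a.get(i).compareTo(b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.size(), b.size());
    }

    public static void main(String[] args) {
        Query q1 = new Query(Arrays.asList("orders"), Arrays.asList("p1"));
        Query q2 = new Query(Arrays.asList("users"), Arrays.asList("p2"));
        System.out.println(q1.compareTo(q2) < 0);                       // true
        System.out.println(q2.compareTo(q1) > 0);                       // true
        System.out.println(new TreeSet<>(Arrays.asList(q1, q2)).size()); // 2
    }
}
```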
Some collections will assume that if obj1.compareTo(obj2) == 0 then obj1.equals(obj2) is also true. For example: TreeSet.
Failing to meet this assumption will result in an inconsistent collection. See:
Comparator and equals().

Java: Implement Comparable but too many conditional ifs. How can I avoid them?

I have a list of objects which implement Comparable.
I want to sort this list and that is why I used the Comparable.
Each object has a field, weight that is composed of 3 other member int variables.
The compareTo returns 1 for the object with the most weight.
Having the most weight is not simply a matter of
weightObj1.member1 > weightObj2.member1
weightObj1.member2 > weightObj2.member2
weightObj1.member3 > weightObj2.member3
It is actually a little more complicated, and I end up with code with too many conditional ifs.
If the weightObj1.member1 > weightObj2.member1 holds then I care if weightObj1.member2 > weightObj2.member2.
and vice versa.
else if weightObj1.member2 > weightObj2.member2 holds then I care if weightObj1.member3 > weightObj2.member3 and vice versa.
Finally if weightObj1.member3 > weightObj2.member3 holds AND if a specific condition is met then this weightObj1 wins and vice versa
I was wondering is there a design approach for something like this?
You can try with CompareToBuilder from Apache commons-lang:
public int compareTo(Object o) {
MyClass myClass = (MyClass) o;
return new CompareToBuilder()
.appendSuper(super.compareTo(o))
.append(this.field1, myClass.field1)
.append(this.field2, myClass.field2)
.append(this.field3, myClass.field3)
.toComparison();
}
See also
How write universal comparator which can make sorting through all necessary fields?
Group Comparator, Bean Comparator and Column Comparator
Similar to the above-mentioned Apache CompareToBuilder, but including generics support, Guava provides ComparisonChain:
public int compareTo(Foo that) {
return ComparisonChain.start()
.compare(this.aString, that.aString)
.compare(this.anInt, that.anInt)
.compare(this.anEnum, that.anEnum, Ordering.natural().nullsLast())
// you can specify comparators
.result();
}
The API for Comparable states:
It is strongly recommended (though not required) that natural
orderings be consistent with equals.
Since the values of interest are int values you should be able to come up with a single value that captures all comparisons and other transformations you need to compare two of your objects. Just update the single value when any of the member values change.
You can try using reflection, iterate over properties and compare them.
You can try something like this:
int c1 = o1.m1 - o2.m1;
if (c1 != 0) {
return c1;
}
int c2 = o1.m2 - o2.m2;
if (c2 != 0) {
return c2;
}
return o1.m3 - o2.m3;
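because compareTo does not have to return just -1, 0 or 1. It can return any integer value and only the sign is considered. Note that subtraction can overflow when the values are large in magnitude; Integer.compare avoids that.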
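On Java 8 or later, the same field-by-field chain can be sketched with the JDK's Comparator helpers, which sidestep the overflow issue. The Weight class and its fields m1, m2, m3 are hypothetical stand-ins for the asker's weight object:

```java
import java.util.Comparator;

class WeightCompareDemo {
    // Hypothetical weight made of three int members.
    static final class Weight implements Comparable<Weight> {
        final int m1;
        final int m2;
        final int m3;

        Weight(int m1, int m2, int m3) {
            this.m1 = m1;
            this.m2 = m2;
            this.m3 = m3;
        }

        // Compare m1 first, then m2, then m3, without any subtraction.
        private static final Comparator<Weight> ORDER =
                Comparator.comparingInt((Weight w) -> w.m1)
                        .thenComparingInt(w -> w.m2)
                        .thenComparingInt(w -> w.m3);

        @Override
        public int compareTo(Weight other) {
            return ORDER.compare(this, other);
        }
    }

    public static void main(String[] args) {
        System.out.println(new Weight(1, 2, 3).compareTo(new Weight(1, 2, 4)) < 0); // true
        // Integer.MIN_VALUE - 1 would overflow with plain subtraction.
        System.out.println(new Weight(Integer.MIN_VALUE, 0, 0).compareTo(new Weight(1, 0, 0)) < 0); // true
    }
}
```

Any extra condition on the last member would still need its own comparator in the chain, since the question does not say what that condition is.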
