Add additional rules to the compare method of a Comparator

I currently have a code snippet which returns strings of a list in ascending order:
Collections.sort(myList, new Comparator<MyClass>() {
@Override
public int compare(MyClass o1, MyClass o2) {
return o1.aString.compareTo(o2.aString);
}
});
While it works, I would like to add some custom "rules" to the order to put certain strings to the front. For instance:
if(aString.equals("Hi")){
// put string first in the order
}
if(aString begins with a blank character, e.g. " ") {
// put string after Hi, but before the other strings
}
// So the order could be: Hi, _string, a_string, b_string, c_string
Is it possible to customize the sorting of a list with a Comparator like this?

The answer from MC Emperor is quite nice (+1) in that it fulfills the OP's requirement of not using Java 8 APIs. It also uses a neat internal function technique (the getOrder method) of mapping conditions to small integer values in order to effect a first-level comparison.
Here's an alternative that uses Java 8 constructs. It assumes that MyClass has a getString method that does the obvious thing.
Collections.sort(myList,
Comparator.comparing((MyClass mc) -> ! mc.getString().equals("Hi"))
.thenComparing(mc -> ! mc.getString().startsWith(" "))
.thenComparing(MyClass::getString));
This is pretty opaque until you get used to this style. The key insight is that the "extractor" function that's supplied to Comparator.comparing and Comparator.thenComparing often simply extracts a field, but it can be a general mapping to any other value. If that value is Comparable then an additional Comparator for it needn't be provided. In this case the extractor function is a boolean expression. This gets boxed to a Boolean which as it turns out is Comparable. Since false orders before true we need to negate the boolean expression.
Also note that I had to provide an explicit type declaration for the lambda parameter, as type inference often doesn't work for chained comparator cases such as this one.
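As a minimal sketch of the boolean-key idea above, the following sorts plain strings rather than MyClass objects (an assumption made to keep the example self-contained; the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class BooleanKeyDemo {
    // Returns a sorted copy: "Hi" first, then strings starting with a space,
    // then everything else in natural (lexicographic) order.
    static List<String> sorted(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(
            // false sorts before true, so negate: "Hi" maps to false and comes first
            Comparator.comparing((String s) -> !s.equals("Hi"))
                // same trick for the leading-space rule
                .thenComparing(s -> !s.startsWith(" "))
                // final tiebreaker: natural String order
                .thenComparing(Comparator.<String>naturalOrder()));
        return copy;
    }
}
```

Calling sorted on the list "b_string", " string", "a_string", "Hi", "c_string" should give "Hi", " string", "a_string", "b_string", "c_string".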

That's possible.
Using Java 8 features
You could pass a function to the Comparator.comparing method to define your rules. Note that we simply return integers, the lowest integer for the elements which should come first.
Comparator<MyClass> myRules = Comparator.comparing(t -> {
if (t.aString.equals("Hi")) {
return 0;
}
else if (t.aString.startsWith(" ")) {
return 1;
}
else {
return 2;
}
});
If you want the remaining elements to be sorted alphabetically, you could use thenComparing(Comparator.naturalOrder()), if your class implements Comparable. Otherwise, you should extract the sort key first:
Collections.sort(myList, myRules.thenComparing(Comparator.comparing(t -> t.aString)));
Note that the specific numbers returned don't matter; what matters is that lower numbers sort before higher ones. So if the string "Hi" should always come first, it must map to the lowest number returned (0 in my case).
Using Java <= 7 features (Android API level 21 compatible)
If Java 8 features are not available to you, then you could implement it like this:
Comparator<MyClass> myRules = new Comparator<MyClass>() {
@Override
public int compare(MyClass o1, MyClass o2) {
int order = Integer.compare(getOrder(o1), getOrder(o2));
return (order != 0 ? order : o1.aString.compareTo(o2.aString));
}
private int getOrder(MyClass m) {
if (m.aString.equals("Hi")) {
return 0;
}
else if (m.aString.startsWith(" ")) {
return 1;
}
else {
return 2;
}
}
};
And call it like this:
Collections.sort(list, myRules);
This works as follows: first, both objects are mapped to a rank by your custom ruleset, and the two ranks are compared with Integer.compare(getOrder(o1), getOrder(o2)) (see note 1). If the ranks differ, that result determines the ordering. Otherwise, if both ranks are the same, the lexicographic order of the strings is used for comparison.
Note 1: Always use Integer.compare rather than subtracting one value from the other, because subtraction can produce erroneous results due to integer overflow.
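To make the overflow risk concrete, here is a small sketch (class and method names are hypothetical) comparing the two approaches on a pair of values whose difference does not fit in an int:

```java
class OverflowDemo {
    // Comparison by subtraction: only safe when a - b cannot overflow.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Comparison via Integer.compare: safe for all int values.
    static int byCompare(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        int a = Integer.MIN_VALUE;
        int b = 1;
        // a is smaller than b, so a correct comparison must be negative.
        System.out.println(bySubtraction(a, b)); // wraps around to a positive number
        System.out.println(byCompare(a, b));     // negative, as expected
    }
}
```

With the subtraction variant, a sort would treat Integer.MIN_VALUE as greater than 1, which is how inconsistent orderings creep in.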

Yes, that is possible; you have complete control over the compare() method. Two things:
Use String#equals instead of == to compare strings
Make sure you check both arguments to compare() for your exceptional cases.
A concrete way of implementing something where some words are always first and some words are always last, with ordering defined among the exceptions:
Map<String, Integer> exceptionMap = new HashMap<>();
exceptionMap.put("lowest", -2);
exceptionMap.put("second_lowest", -1);
exceptionMap.put("second_highest", 1);
exceptionMap.put("highest", 2);
public int compareToWithExceptionMap(String s1, String s2) {
int firstExceptional = exceptionMap.getOrDefault(s1, 0);
int secondExceptional = exceptionMap.getOrDefault(s2, 0);
if (firstExceptional == 0 && secondExceptional == 0) {
return s1.compareTo(s2);
}
return firstExceptional - secondExceptional;
}

Related

Equivalent to stream distinct using a custom comparator

If I have the following List:
List<String> list = Arrays.asList("hello", "world", "hello");
And I apply the following (Java8):
list.stream().distinct().collect(Collectors.toList());
Then I would get a list with "hello" and "world".
However, in my case, I have a list of a type (from an external api) where I want to "bypass" the equals Method, ideally with a comparator, as it doesn't cover what I need.
Assume this class looks like this:
public class Point {
float x;
float y;
//getters and setters omitted
}
In this case, I would like two points that cover a certain criteria to be defined as equal, for instance (30, 20) and (30.0001, 19.999).
A custom comparator could do the trick, but I have found no API that does what the distinct() in Java8 Stream does, but with a comparator (or similar pattern).
Any thoughts? I know I could write such a function, but I would rather like the elegant way of using existing apis... I have no restriction with external libraries (guava, apache-commons, etc. are welcome if they have a comfortable way of doing what I need).

HashingStrategy is the concept you're looking for. It's a strategy interface that allows you to define custom implementations of equals and hashcode.
public interface HashingStrategy<E>
{
int computeHashCode(E object);
boolean equals(E object1, E object2);
}
Streams don't support hashing strategies but Eclipse Collections does. It has sets and maps that support hashing strategies as well as overloads of methods like distinct() that take hashing strategies.
This would work well for Strings. For example, here's how we could get all distinct Strings ignoring case.
MutableList<String> strings = Lists.mutable.with("Hello", "world", "HELLO", "World");
assertThat(
strings.distinct(HashingStrategies.fromFunction(String::toLowerCase)),
is(equalTo(Lists.immutable.with("Hello", "world"))));
Or you can write the hashing strategy by hand to avoid garbage creation.
HashingStrategy<String> caseInsensitive = new HashingStrategy<String>()
{
@Override
public int computeHashCode(String string)
{
int hashCode = 0;
for (int i = 0; i < string.length(); i++)
{
hashCode = 31 * hashCode + Character.toLowerCase(string.charAt(i));
}
return hashCode;
}
@Override
public boolean equals(String string1, String string2)
{
return string1.equalsIgnoreCase(string2);
}
};
assertThat(
strings.distinct(caseInsensitive),
is(equalTo(Lists.immutable.with("Hello", "world"))));
This could work for Points too, but only if you can group all points within non-overlapping regions to have the same hashcode. If you're using a Comparator defined to return 0 when two Points are close enough, then you can run into transitivity problems. For example, Points A, B, and C can fall along a line with A and C both close to B but far from each other. Still, if this is a useful concept to you, we'd welcome a pull request adding ListIterable.distinct(Comparator) to the API.
Note: I am a committer for Eclipse Collections.
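The "non-overlapping regions" idea and the transitivity problem can both be sketched with the plain JDK. This is only an illustration under stated assumptions: the Point class here is a hypothetical stand-in for the external type, and the grid cell size is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GridDistinct {
    // Hypothetical stand-in for the external Point type.
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    // Maps a point to the grid cell whose centre is nearest.
    // All points in the same cell share one key, so the regions never overlap.
    static String cellKey(Point p, double cellSize) {
        return Math.round(p.x / cellSize) + ":" + Math.round(p.y / cellSize);
    }

    // Keeps the first point seen in each grid cell, in encounter order.
    static List<Point> distinctByCell(List<Point> points, double cellSize) {
        Map<String, Point> firstPerCell = new LinkedHashMap<String, Point>();
        for (Point p : points) {
            firstPerCell.putIfAbsent(cellKey(p, cellSize), p);
        }
        return new ArrayList<Point>(firstPerCell.values());
    }

    // A tolerance-based "close enough" test; this relation is not transitive.
    static boolean close(double a, double b, double tolerance) {
        return Math.abs(a - b) < tolerance;
    }
}
```

With a cell size of 0.5, the points (30, 20) and (30.0001, 19.999) land in the same cell and collapse to one. Two nearby points that straddle a cell boundary would still be kept apart, which is the price of non-overlapping regions. The close method shows the other side: with a tolerance of 1.0, the values 0.0 and 0.6 are close, 0.6 and 1.2 are close, but 0.0 and 1.2 are not.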

What is the importance of "same ordering" objects being equal?

I'm sorting an array of objects. The objects have lots of fields but I only care about one of them. So, I wrote a comparator:
Collections.sort(details, new Comparator<MyObj>() {
@Override
public int compare(MyObj d1, MyObj d2) {
if (d1.getDate() == null && d2.getDate() == null) {
return 0;
} else if (d1.getDate() == null) {
return -1;
} else if (d2.getDate() == null) {
return 1;
}
if (d1.getDate().before(d2.getDate())) return 1;
else if (d1.getDate().after(d2.getDate())) return -1;
else return 0;
}
});
From the perspective of my use case, this Comparator does all it needs to, even if I might consider this sorting non-deterministic. However, I wonder if this is bad code. Through this Comparator, two very distinct objects could be considered "the same" ordering even if they are unequal objects. I decided to use hashCode as a tiebreaker, and it came out something like this:
Collections.sort(details, new Comparator<MyObj>() {
@Override
public int compare(MyObj d1, MyObj d2) {
if (d1.getDate() == null && d2.getDate() == null) {
return d1.hashCode();
} else if (d1.getDate() == null) {
return -1;
} else if (d2.getDate() == null) {
return 1;
}
if (d1.getDate().before(d2.getDate())) return 1;
else if (d1.getDate().after(d2.getDate())) return -1;
else return d1.hashCode() - d2.hashCode();
}
});
(What I return might be backwards, but that's not important to this question.)
Is this necessary?
EDIT:
To anyone else looking at this question, consider using Google's ordering API. The logic above was replaced by:
return Ordering.<Date> natural().reverse().nullsLast().compare(d1.getDate(), d2.getDate());

Through this comparator, two very distinct objects could be considered "the same" ordering even if they are unequal objects.
That really doesn't matter; it's perfectly fine for two objects to compare as equal even if they are not "equal" in any other sense. Collections.sort is a stable sort, meaning objects that compare as equal come out in the same order they came in; that's equivalent to just using "the index in the input" as a tiebreaker.
(Also, your new Comparator is actually significantly more broken than the original. return d1.hashCode() is particularly nonsensical, and return d1.hashCode() - d2.hashCode() can lead to nontransitive orderings that will break Collections.sort, because of overflow issues. Unless both integers are definitely nonnegative, which hashCodes aren't, always use Integer.compare to compare integers.)
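A short sketch of the stability guarantee, using strings sorted by length only (the class and method names are hypothetical): elements that compare as equal keep their input order, so no tiebreaker is needed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class StableSortDemo {
    // Sorts a copy of the list by string length only.
    static List<String> sortByLength(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                // Words of equal length compare as 0 even though they are not equal.
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }
}
```

Sorting "pear", "fig", "kiwi", "plum", "yam" gives "fig", "yam", "pear", "kiwi", "plum": within each length group the original relative order is preserved.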

This is only mostly important if the objects implement Comparable.
It is strongly recommended (though not required) that natural orderings be consistent with equals. This is so because sorted sets (and sorted maps) without explicit comparators behave "strangely" when they are used with elements (or keys) whose natural ordering is inconsistent with equals. In particular, such a sorted set (or sorted map) violates the general contract for set (or map), which is defined in terms of the equals method.
For example, if one adds two keys a and b such that (!a.equals(b) && a.compareTo(b) == 0) to a sorted set that does not use an explicit comparator, the second add operation returns false (and the size of the sorted set does not increase) because a and b are equivalent from the sorted set's perspective.
However, you're not doing that, you're using a custom Comparator, probably for presentation reasons. Since this sorting metric isn't inherently attached to the object, it doesn't matter that much.
As an aside, why not just return 0 instead of messing with the hashCodes? Then they will preserve the original order if the dates match, because Collections.sort is a stable sort. I agree with @LouisWasserman that using hashCode in this way can have potentially very bizarre consequences, mostly relating to integer overflow. Consider the case where d1.hashCode() is positive and d2.hashCode() is negative, and vice versa.

Java: Implement Comparable but too many conditional ifs. How can I avoid them?

I have a list of objects which implement Comparable.
I want to sort this list and that is why I used the Comparable.
Each object has a field, weight that is composed of 3 other member int variables.
The compareTo returns 1 for the object with the most weight.
The most weight is not only if the
weightObj1.member1 > weightObj2.member1
weightObj1.member2 > weightObj2.member2
weightObj1.member3 > weightObj2.member3
but actually is a little more complicated and I end up with code with too many conditional ifs.
If the weightObj1.member1 > weightObj2.member1 holds then I care if weightObj1.member2 > weightObj2.member2.
and vice versa.
else if weightObj1.member2 > weightObj2.member2 holds then I care if weightObj1.member3 > weightObj2.member3 and vice versa.
Finally if weightObj1.member3 > weightObj2.member3 holds AND if a specific condition is met then this weightObj1 wins and vice versa
I was wondering is there a design approach for something like this?

You can try with CompareToBuilder from Apache commons-lang:
public int compareTo(Object o) {
MyClass myClass = (MyClass) o;
return new CompareToBuilder()
.appendSuper(super.compareTo(o))
.append(this.field1, myClass.field1)
.append(this.field2, myClass.field2)
.append(this.field3, myClass.field3)
.toComparison();
}
See also
How write universal comparator which can make sorting through all necessary fields?
Group Comparator, Bean Comparator and Column Comparator

Similar to the above-mentioned Apache CompareToBuilder, but including generics support, Guava provides ComparisonChain:
public int compareTo(Foo that) {
return ComparisonChain.start()
.compare(this.aString, that.aString)
.compare(this.anInt, that.anInt)
.compare(this.anEnum, that.anEnum, Ordering.natural().nullsLast())
// you can specify comparators
.result();
}

The API for Comparable states:
It is strongly recommended (though not required) that natural
orderings be consistent with equals.
Since the values of interest are int values you should be able to come up with a single value that captures all comparisons and other transformations you need to compare two of your objects. Just update the single value when any of the member values change.

You can try using reflection, iterate over properties and compare them.

You can try something like this:
int c1 = o1.m1 - o2.m1;
if (c1 != 0) {
return c1;
}
int c2 = o1.m2 - o2.m2;
if (c2 != 0) {
return c2;
}
return o1.m3 - o2.m3;
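because compareTo need not return just -1, 0 or 1. It can return any integer value and only the sign is considered. Note that subtraction is only safe when the difference cannot overflow; for arbitrary int values, Integer.compare avoids that problem.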
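The same field-by-field fallthrough can also be sketched with the Java 8 Comparator API. This is an alternative to the snippet above, not the answerer's code: the Weight class and its three fields are hypothetical stand-ins, and it covers only the plain "first differing member decides" case, not any extra condition from the question. comparingInt compares the extracted keys with Integer.compare, so it does not rely on subtraction.

```java
import java.util.Comparator;

class Weight {
    final int m1, m2, m3;

    Weight(int m1, int m2, int m3) {
        this.m1 = m1;
        this.m2 = m2;
        this.m3 = m3;
    }

    // Compare m1 first; fall through to m2, then m3, only on ties.
    static final Comparator<Weight> BY_MEMBERS =
        Comparator.comparingInt((Weight w) -> w.m1)
            .thenComparingInt(w -> w.m2)
            .thenComparingInt(w -> w.m3);
}
```

A compareTo implementation could then simply delegate to BY_MEMBERS.compare(this, other).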

using multiple comparators in a java binarySearch

How do I use multiple comparators in a binarySearch in java...
I'm trying to sort a list of contestants which are sorted by name and their starting number.
The problem is if two contestants have the same name I get an IndexOutOfBoundsException so I want to do a secondary binarySearch using the starting number (which is unique) but still keeping them in the right order with names.
This is what I've got right now:
static void add(Contestant c){
int pos = Collections.binarySearch(byName, c, new ConNameCmp());
if (pos >= 0){
pos = Collections.binarySearch(byName, c, new ConStartCmp());
}
byName.add(-pos-1, c);
}

One Comparator only
Don't use two Comparators, use a single Comparator that compares both values:
public int compare(Foo a, Foo b){
// compare bar() values first
int result = a.bar().compareTo(b.bar());
// compare baz() values only if bar() values are equal
if(result==0){
result = a.baz().compareTo(b.baz());
}
return result;
}
(In your case bar() is the name and baz() is the number).
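Applied to the question, a single combined comparator also makes the insertion logic simpler. The following is a sketch under assumptions: the Contestant class with its name and startNumber fields is a hypothetical stand-in for the asker's class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class ContestantList {
    // Hypothetical stand-in for the asker's Contestant class.
    static final class Contestant {
        final String name;
        final int startNumber;
        Contestant(String name, int startNumber) {
            this.name = name;
            this.startNumber = startNumber;
        }
    }

    // One comparator: name first, unique start number as tiebreaker.
    static final Comparator<Contestant> BY_NAME_THEN_START = new Comparator<Contestant>() {
        @Override
        public int compare(Contestant a, Contestant b) {
            int result = a.name.compareTo(b.name);
            if (result == 0) {
                result = Integer.compare(a.startNumber, b.startNumber);
            }
            return result;
        }
    };

    final List<Contestant> byName = new ArrayList<Contestant>();

    // Inserts c so that byName stays sorted by name, then start number.
    void add(Contestant c) {
        int pos = Collections.binarySearch(byName, c, BY_NAME_THEN_START);
        // A negative result encodes the insertion point as -(insertionPoint) - 1.
        if (pos < 0) {
            pos = -pos - 1;
        }
        byName.add(pos, c);
    }
}
```

Because the start number is unique, two different contestants never compare as equal, and the index passed to add is never negative.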
Use Libraries
Creating Comparators this way is a lot easier if you use either Guava or Commons / Lang
Guava Version:
@Override
public int compare(final Foo a, final Foo b){
return ComparisonChain
.start()
.compare(a.bar(), b.bar())
.compare(a.baz(), b.baz())
.result();
}
Commons / Lang Version:
@Override
public int compare(final Foo a, final Foo b){
return new CompareToBuilder()
.append(a.bar(), b.bar())
.append(a.baz(), b.baz())
.toComparison();
}
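(The Commons / Lang version won't fail if any of the values are null; my quick and dirty code above will. As far as I know, Guava's ComparisonChain.compare(Comparable, Comparable) also throws on null unless you pass a null-safe comparator.)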
Solve the Problem
I don't think you should do a Binary search in the first place, this seems very complicated.
Why don't you use a TreeSet with a custom comparator? Or Collections.sort(list, comparator)? (For both of these options you can use the comparators I showed earlier).
Also, you should think about letting your Contestant implement Comparable<Contestant>. That way you won't need to use an external Comparator. You can use the same logic as above in the compareTo() method, just replace one of the objects with this.

You might have already tried this, and this solution might not be available to you, but if you can change your Contestant class, you can make it implement the java.lang.Comparable<Contestant> interface and implement compareTo(Contestant) so that it takes both the name and starting number into account. Afterwards, you'll be able to use the Collections.binarySearch(List<Contestant>, Contestant) method for your needs.

Java: Equalator? (removing duplicates from a collection of objects)

I have a bunch of objects of a class Puzzle. I have overridden equals() and hashCode(). When it comes time to present the solutions to the user, I'd like to filter out all the Puzzles that are "similar" (by the standard I have defined), so the user only sees one of each.
Similarity is transitive.
Example:
Result of computations:
A (similar to D)
B (similar to C)
C
D
In this case, only A or D and B or C would be presented to the user - but not two similar Puzzles. Two similar puzzles are equally valid. It is only important that they are not both shown to the user.
To accomplish this, I wanted to use an ADT that prohibits duplicates. However, I don't want to change the equals() and hashCode() methods to return a value about similarity instead. Is there some Equalator, like Comparator, that I can use in this case? Or is there another way I should be doing this?
The class I'm working on is a Puzzle that maintains a grid of letters. (Like Scrabble.) If a Puzzle contains the same words, but is in a different orientation, it is considered to be similar. So the following puzzle:
(2, 2): A
(2, 1): C
(2, 0): T
Would be similar to:
(1, 2): A
(1, 1): C
(1, 0): T

Okay you have a way of measuring similarity between objects. That means they form a Metric Space.
The question is, is your space also a Euclidean space like normal three dimensional space, or integers or something like that? If it is, then you could use a binary space partition in however many dimensions you've got.
(The question is, basically: is there a homomorphism between your objects and an n-dimensional real number vector? If so, then you can use techniques for measuring closeness of points in n-dimensional space.)
Now, if it's not a Euclidean space then you've got a bigger problem. An example of a non-Euclidean space that programmers might be most familiar with would be the Levenshtein distance between two strings.
If your problem is similar to seeing how similar a string is to a list of already existing strings then I don't know of any algorithms that would do that in less than O(n^2) time. Maybe there are some out there.
But another important question is: how much time do you have? How many objects? If you have time or if your data set is small enough that an O(n^2) algorithm is practical, then you just have to compare each new object against your list of accepted objects to see if its distance to any of them is below a certain threshold. If so, reject it.
Just extend AbstractCollection and override the add method. Use an ArrayList or whatever as the backing store. Your code would look kind of like this:
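import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;

abstract class SimilarityRejector<T> extends AbstractCollection<T> {
    private final ArrayList<T> base = new ArrayList<T>();
    private final double threshold;

    public SimilarityRejector(double threshold) {
        this.threshold = threshold;
    }

    // Distance-like measure between two items; a subclass must supply it.
    protected abstract double similarityComparison(T a, T b);

    @Override
    public boolean add(T t) {
        // Reject t if it is closer than the threshold to any stored item.
        for (T existing : base) {
            if (similarityComparison(t, existing) < threshold) {
                return false;
            }
        }
        return base.add(t);
    }

    @Override
    public Iterator<T> iterator() {
        return base.iterator();
    }

    @Override
    public int size() {
        return base.size();
    }
}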
etc. Obviously T would need to be a type that you can perform a similarity comparison on. If you have a Euclidean metric, then you can use a space partition, rather than going through every other item.

I'd use a wrapper class that overrides equals and hashCode accordingly.
private static class Wrapper {
    // An instance field: a static final field could not be assigned in the constructor.
    public final Puzzle puzzle;
    public Wrapper(Puzzle puzzle) {
        this.puzzle = puzzle;
    }
    @Override
    public boolean equals(Object object) {
        // ...
    }
    @Override
    public int hashCode() {
        // ...
    }
}
and then you wrap all your puzzles, put them in a map, and get them out again…
public Collection<Collection<Puzzle>> method(Collection<Puzzle> puzzles) {
    // Group puzzles by their wrapper, i.e. by similarity.
    Map<Wrapper, Collection<Puzzle>> map = new HashMap<Wrapper, Collection<Puzzle>>();
    for (Puzzle each : puzzles) {
        Wrapper wrapper = new Wrapper(each);
        Collection<Puzzle> coll = map.get(wrapper);
        if (coll == null) map.put(wrapper, coll = new ArrayList<Puzzle>());
        coll.add(each);
    }
    return map.values();
}

Create a TreeSet using your Comparator.
Add all elements to the set.
All duplicates are stripped out.
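The three steps above can be sketched as a small generic helper (the class and method names are hypothetical). It treats two elements as duplicates whenever the comparator returns 0 for them, so the comparator needs to define a consistent total order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class TreeSetDistinct {
    // Returns the elements of items with comparator-duplicates removed.
    // The result is in comparator order, not in input order.
    static <T> List<T> distinct(List<T> items, Comparator<? super T> comparator) {
        TreeSet<T> set = new TreeSet<T>(comparator);
        // add() ignores an element if the set already holds one that compares as 0.
        set.addAll(items);
        return new ArrayList<T>(set);
    }
}
```

For example, distinct applied to "Hello", "world", "HELLO", "World" with String.CASE_INSENSITIVE_ORDER keeps the first spelling of each word: "Hello", "world".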

Normally "similarity" is not a transitive relationship. So the first step would be to think of this in terms of equivalence rather than similarity. Equivalence is reflexive, symmetric and transitive.
Easy approach here is to define a puzzle wrapper whose equals() and hashCode() methods are implemented according to the equivalence relation in question.
Once you have that, drop the wrapped objects into a java.util.Set and that filters out duplicates.

IMHO, the most elegant way was described by Gili (TreeSet with a custom Comparator).
But if you'd like to do it yourself, this seems to be the easiest and clearest solution:
/**
 * Returns the distinct values of the input list (removes duplicates).
 * @param items items to process
 * @param comparator comparator used to recognize equal items
 * @return new collection with unique values
 */
public static <T> Collection<T> distinctItems(List<T> items, Comparator<T> comparator) {
List<T> result = new ArrayList<>();
for (int i = 0; i < items.size(); i++) {
T item = items.get(i);
boolean exists = false;
for (int j = 0; j < result.size(); j++) {
if (comparator.compare(result.get(j), item) == 0) {
exists = true;
break;
}
}
if (!exists) {
result.add(item);
}
}
return result;
}
