Custom Java Comparator With Pre Defined Top Result

I want to sort a list of field names alphabetically. However, I need to include a condition in the doCompare method of the comparator so that a field named "pk" is always sorted to the top of the list. What I have is below, but I'm not sure I'm taking the right approach, particularly with the return value of -1000. Any advice on this would be much appreciated.
@Override
public int doCompare(Object firstRec, Object secondRec)
{
MyField firstField = (MyField) firstRec;
MyField secondField = (MyField ) secondRec;
if(firstField.name() == "pk")
{
return -1000;
}
return StringUtils.compareStrings(firstField.name().toLowerCase(), secondField.name().toLowerCase());
}

The requirements of a Comparator (and, by extension, methods which are supposed to act like Comparator.compare) are described in the Javadoc:
The implementor must ensure that sgn(compare(x, y)) == -sgn(compare(y, x)) for all x and y. (This implies that compare(x, y) must throw an exception if and only if compare(y, x) throws an exception.)
The implementor must also ensure that the relation is transitive: ((compare(x, y)>0) && (compare(y, z)>0)) implies compare(x, z)>0.
Finally, the implementor must ensure that compare(x, y)==0 implies that sgn(compare(x, z))==sgn(compare(y, z)) for all z.
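Assuming StringUtils.compareStrings correctly implements these requirements, the thing you've got wrong is the first requirement: you also need to handle the case where secondField is "pk". As written, comparing "pk" with "a" returns -1000, but comparing "a" with "pk" falls through to the string comparison and is presumably negative too, so the two results do not have opposite signs.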
The general pattern for writing correct Comparators is:
int firstComparison = /* compare something about firstField and secondField */;
if (firstComparison != 0) {
return firstComparison;
}
int secondComparison = /* compare something else about firstField and secondField */;
if (secondComparison != 0) {
return secondComparison;
}
// ...
return 0;
Applying that here:
int pkComparison = Boolean.compare(secondField.name().equals("pk"), firstField.name().equals("pk"));
if (pkComparison != 0) {
return pkComparison;
}
int compareStringsComparison = StringUtils.compareStrings(firstField.name().toLowerCase(), secondField.name().toLowerCase());
if (compareStringsComparison != 0) {
return compareStringsComparison;
}
return 0;
Obviously, the last if statement is redundant, because compareStringsComparison is returned whether or not it is zero; so, after the pkComparison check, you could simply write:
return StringUtils.compareStrings(firstField.name().toLowerCase(), secondField.name().toLowerCase());
I would recommend sticking to the compare/check and return/finally return 0 pattern, because it's easier to slot in additional conditions later. But it's not terrible either way.
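Put together, the pattern above might look like the following minimal sketch. It compares plain String names rather than MyField objects, and it uses String.CASE_INSENSITIVE_ORDER in place of StringUtils.compareStrings; the class name PkFirstComparator and the sorted helper are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical comparator: "pk" first, then case-insensitive alphabetical order.
class PkFirstComparator implements Comparator<String> {
    @Override
    public int compare(String first, String second) {
        // Step 1: "pk" sorts before everything else. The arguments are swapped
        // because Boolean.compare orders false before true.
        int pkComparison = Boolean.compare("pk".equals(second), "pk".equals(first));
        if (pkComparison != 0) {
            return pkComparison;
        }
        // Step 2: neither or both are "pk", so fall back to alphabetical order.
        return String.CASE_INSENSITIVE_ORDER.compare(first, second);
    }

    // Returns a sorted copy, leaving the input list untouched.
    static List<String> sorted(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(new PkFirstComparator());
        return copy;
    }
}
```

Because both argument orders go through the same pkComparison step, swapping the arguments flips the sign of the result, which is what the first requirement asks for.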

The new static methods of class Comparator available since Java 8 are very handy to create a multi-criteria Comparator like in your case.
You could try something like this:
List<String> list = ... ;
list.sort(
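Comparator.comparing((String name) -> !"pk".equals(name))
.thenComparing(String.CASE_INSENSITIVE_ORDER)
);
Comparator has no comparingBoolean method, so this uses Comparator.comparing with a Boolean key. Boolean values sort false before true, which is why the key is negated to put "pk" first. You can call .reversed() on a comparator if the order is the opposite of what you want.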
The great advantage of Comparator.comparing / Comparator.comparingXXX is that you don't have to work out by hand when to return a positive value, a negative value, or 0.
Comparator.thenComparing does proper chaining, i.e. it checks further criteria only when needed, that is, only when the previous comparisons returned 0.
If your list may contain null values, there are also methods to handle them properly. This isn't the case in this short example.
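For the null case, one option is Comparator.nullsLast (or nullsFirst), which wraps another comparator. The sketch below assumes the list holds plain String names; the class name NullSafeSort and its sorted helper are invented for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class NullSafeSort {
    // Nulls go last; among non-null names, "pk" goes first (false sorts before
    // true, hence the negation), then case-insensitive alphabetical order.
    static final Comparator<String> ORDER = Comparator.nullsLast(
            Comparator.comparing((String s) -> !"pk".equals(s))
                    .thenComparing(String.CASE_INSENSITIVE_ORDER));

    // Returns a sorted copy, leaving the input list untouched.
    static List<String> sorted(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(ORDER);
        return copy;
    }
}
```

The wrapped comparator is never called with a null argument, so the "pk" check and the alphabetical comparison do not need their own null handling.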

Related

Arrays.equals without length of arrays

I have two arrays with different length, but same elements. For example
A1 = {1,2,3,null,null}
A2 = {1,2,3}
Arrays.equals gives me false, because the arrays have different lengths. Is there any method in Java that will compare only the elements?
I don't want to use .toString
I'm trying to write a compare method for my own generic stack implementation.

No, because it's a weird request. null does not mean 'not here', null means 'unknown / unset', which is why it throws exceptions when you interact with it: you're asking "hey, thing that has not been set yet, are you X", and there is no way to answer such a question.
That doesn't mean your code is wrong; it just means you can stop looking for existing implementations. Weird requests generally aren't catered to by the core libraries (or any other). You may also want to change your mindset on null. Programming in Java is a lot less aggravating if a NullPointerException is always a good thing. In other words, avoid giving null any semantic meaning. If you ever write if (x == null || x.isEmpty()) you are doing it wrong. Instead, wherever 'x' is coming from, it should hold the empty string, or be updated to it as soon as possible. So, if you are reading in external data (e.g. you unmarshalled some JSON into an object), do a 'clean' step that replaces every null value that has semantic meaning with an object that actually represents that meaning. For methods that return something, always return an object that represents what you are returning. Only return null if you WANT to convey that there is no result (which is not the same as 'an empty result'), i.e. if any code acts as though there was a result, you want it to crash.
In other words, I doubt you are asking the right question. But in case you are, you have two broad options.
First make null-less arrays then compare those as normal
One option is to make new arrays that have nulls stripped. Something like:
@SuppressWarnings("unchecked")
<T> T[] stripNulls(T[] in) {
Class<?> componentType = in.getClass().getComponentType();
return Arrays.stream(in)
.filter(x -> x != null)
// Array.newInstance returns Object, so the generator needs the cast
.toArray(len -> (T[]) java.lang.reflect.Array.newInstance(componentType, len));
}
// which you can then use; you don't need generics for a compare,
// it wouldn't add anything at all.
boolean compare(Object[] a, Object[] b) {
return Arrays.equals(stripNulls(a), stripNulls(b));
}
Just compare in place
If it's performance sensitive that's suboptimal. A better approach would involve a little more coding:
boolean compare(Object[] a, Object[] b) {
int ai = 0, bi = 0, al = a.length, bl = b.length;
while (true) {
/* advance `ai` and `bi` to the next non-null element */
while (ai < al && a[ai] == null) ai++;
while (bi < bl && b[bi] == null) bi++;
/* Have we hit the end of both arrays? */
if (ai == al && bi == bl) return true;
/* If one is at the end, but the other isn't... */
if (ai == al || bi == bl) return false;
/* check if the 2 current elements are equal, then move past them */
if (!a[ai++].equals(b[bi++])) return false;
}
}

Not a native Java Developer, but maybe this helps you?
boolean arraysEqual = Arrays.equals(Arrays.stream(a1).filter(n -> n != null).toArray(), Arrays.stream(a2).filter(n -> n != null).toArray());

How to implement a key-value pair with variability in the key

I'm writing some code to de-duplicate data based on 2 fields:
A string of characters, we'll call this the UMI
An array of integers
I've created a POJO to hold this data and work as key for a TreeMap. The full set of data is held in the value - this way I only keep relevant data in memory.
However, the next requirement is to have variability in the UMI AND the integers. For example, the following two pieces of data would be considered duplicates based on the UMI having a variability(mismatch) of 1.
a. "AAA", [200,300]
b. "ABA", [200,300]
Similarly, the following would be considered duplicates based on the integer array, given a mismatch allowance of 2.
a. "AAA", [201,300]
b. "AAA", [203,300]
My current attempt has been to make this POJO implement the Comparable interface, and attempt to work the compareTo method to take into account the variability:
public class UMIPrimoKey implements Comparable<UMIPrimoKey> {
private final String UMI;
private final int[] ints;
private final int umiMisMatch;
private final int posMisMatch;
public UMIPrimoKey(String UMI, int[] ints, int umiMisMatch, int posMisMatch) {
this.UMI = UMI;
this.ints = ints;
this.umiMisMatch = umiMisMatch;
this.posMisMatch = posMisMatch;
}
@Override
public int compareTo(UMIPrimoKey o) {
if (!Arrays.equals(ints, o.ints)) {
if (ints.length == o.ints.length) {
for (int i = 0; i < ints.length; i++) {
if (Math.abs(ints[i] - o.ints[i]) > posMisMatch) {
return -1;
}
}
} else {
return -1;
}
}
if (XsamStringUtils.numberOfDifferences(UMI, o.UMI) <= umiMisMatch) {
return 0;
}
return 1;
}
}
XsamStringUtils.numberOfDifferences is just a simple static method to count the number of differences between the two UMIs.
I return -1 if any two integers from the array have a difference greater than the allowed mismatches (posMisMatch). 0 is returned if the integers are allowed, and the number of mismatches in the UMI is less than the allowed amount, specified by umiMisMatch.
Otherwise, 1 is returned as the UMIs don't match.
I've then used this in a TreeMap which takes into account the compareTo method.
This works in my unit tests, with small numbers of UMIPrimoKeys added to it, but I'm getting some strange results when running the completed program. It's probably due to the rules for the method outlined here: https://docs.oracle.com/javase/8/docs/api/java/lang/Comparable.html but I'm finding it hard to adapt the code to take the rules into account.
Any direction is appreciated, thanks for reading!

According to the docs of compareTo:
The implementor must ensure sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) for all x and y. (This implies that x.compareTo(y) must throw an exception iff y.compareTo(x) throws an exception.)
The implementor must also ensure that the relation is transitive: (x.compareTo(y)>0 && y.compareTo(z)>0) implies x.compareTo(z)>0.
Finally, the implementor must ensure that x.compareTo(y)==0 implies that sgn(x.compareTo(z)) == sgn(y.compareTo(z)), for all z.
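I don't think your code satisfies these rules, and that could cause problems such as the map's get method not finding your entry. For example, when two int arrays differ by more than posMisMatch, both x.compareTo(y) and y.compareTo(x) return -1, which breaks the first rule.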
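The third rule is also hard to satisfy with any "equal within a tolerance" comparison, because such matching is not transitive. A small sketch of the problem, using a made-up helper named matches and the mismatch allowance of 2 from the question:

```java
class ToleranceDemo {
    // Hypothetical helper: two positions "match" when they differ by at most tolerance.
    static boolean matches(int x, int y, int tolerance) {
        return Math.abs(x - y) <= tolerance;
    }

    public static void main(String[] args) {
        System.out.println(matches(201, 203, 2)); // 201 and 203 match
        System.out.println(matches(203, 205, 2)); // 203 and 205 match
        System.out.println(matches(201, 205, 2)); // but 201 and 205 do not
    }
}
```

If compareTo returns 0 for the first two pairs, it would also have to return 0 for the third pair to stay consistent, so a sorted map cannot rely on it to place keys reliably.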

Design pattern for multiple combinations

If I have to make a different database query depending on whether different parameters are present, which design pattern would avoid too many if-else branches for the different combinations?
Let's say I have parameters a, b, c (the number can grow in the future). I'm using repositories, so I would have to make a call something like this:
public Foo getFoo(String a, String b, String c){
Foo foo;
if(a!=null && !a.isEmpty() && b!=null && !b.isEmpty() && c!=null && !c.isEmpty())
foo = repository.findByAAndBAndC(a,b,c);
if((a==null || a.isEmpty()) && b!=null && !b.isEmpty() && c!=null && !c.isEmpty())
foo = repository.findByBAndC(b,c);
if((a!=null && !a.isEmpty()) && (b==null || b.isEmpty()) && c!=null && !c.isEmpty())
foo = repository.findByAAndC(a,c);
if((a==null || a.isEmpty()) && (b==null || b.isEmpty()) && c!=null && !c.isEmpty())
foo = repository.findByC(c);
if((a==null || a.isEmpty()) && (b==null || b.isEmpty()) && (c==null || c.isEmpty()))
foo = repository.findOne();
etc.
.
.
.
return foo;
}
How can that be better structured ?

To begin with, I would suggest the Specification design pattern, which:
is a particular software design pattern, whereby business rules can be
recombined by chaining the business rules together using boolean
logic. The pattern is frequently used in the context of domain-driven
design.
but your current code doesn't quite fit it, because you invoke a different repository method depending on the case.
So I think you have two options:
1) Refactoring your repository to provide a single common method accepting a specification parameter and able to handle the different cases.
If you use Spring, you could look at the JpaSpecificationExecutor interface that provides methods such as :
List<T> findAll(Specification<T> spec)
Even if you don't use Spring, I think examples of that interface could help you.
2) If you cannot refactor the repository, you should look for another way and provide an abstraction layer that decides which repository method to call and which parameters to pass to it.
Currently, you invoke a different method with different parameters according to the input parameters, but in every case you return the same type of object to the client of the method: Foo. So to avoid conditional statements, polymorphism is the way to go.
Each case to handle is finally a different strategy. So you could have a strategy interface and you could determine the strategy to use to return the Foo to the client.
Besides, as suggested in a comment: repeating a!=null && !a.isEmpty() multiple times is a code smell. It creates a lot of duplication and makes the code less readable. It would be better to do this check with a library such as Apache Commons Lang or with a custom method.
public class FooService {
private List<FindFooStrategy> strategies = new ArrayList<>();
public FooService(){
strategies.add(new FindFooByAAndBAndCStrategy());
strategies.add(new FindFooByBAndCStrategy());
strategies.add(new FindFooByAAndCStrategy());
strategies.add(new FindFooByCStrategy());
}
public Foo getFoo(String a, String b, String c){
for (FindFooStrategy strategy : strategies){
if (strategy.isApplicable(a, b, c)) {
return strategy.getFoo(a, b, c);
}
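}
// no strategy matched; without this the method would not compile
throw new IllegalArgumentException("No strategy applies to the given parameters");
}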
}
Where FindFooStrategy is defined as :
public interface FindFooStrategy{
boolean isApplicable(String a, String b, String c);
Foo getFoo(String a, String b, String c);
}
And where each subclass defines its rules. For example :
public class FindFooByAAndBAndCStrategy implements FindFooStrategy{
public boolean isApplicable(String a, String b, String c){
return StringUtils.isNotEmpty(a) && StringUtils.isNotEmpty(b) &&
StringUtils.isNotEmpty(c);
}
public Foo getFoo(String a, String b, String c){
return repository.findByAAndBAndC(a,b,c);
}
}
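For option 1 (a single method that accepts a specification), the idea can be sketched without Spring by using java.util.function.Predicate as the specification. This only filters an in-memory list and is not how JpaSpecificationExecutor works internally; the Foo fields and the names FooSpecs, present, specFor and findAll are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class FooSpecs {
    // Hypothetical stand-in for the question's Foo, with one field per parameter.
    static final class Foo {
        final String a, b, c;
        Foo(String a, String b, String c) { this.a = a; this.b = b; this.c = c; }
    }

    static boolean present(String s) {
        return s != null && !s.isEmpty();
    }

    // Builds one combined specification: each present parameter adds a condition,
    // absent parameters add nothing, so no per-combination branch is needed.
    static Predicate<Foo> specFor(String a, String b, String c) {
        Predicate<Foo> spec = foo -> true;
        if (present(a)) spec = spec.and(foo -> a.equals(foo.a));
        if (present(b)) spec = spec.and(foo -> b.equals(foo.b));
        if (present(c)) spec = spec.and(foo -> c.equals(foo.c));
        return spec;
    }

    // The single "repository" method: it only knows how to apply a specification.
    static List<Foo> findAll(List<Foo> all, Predicate<Foo> spec) {
        List<Foo> result = new ArrayList<>();
        for (Foo foo : all) {
            if (spec.test(foo)) result.add(foo);
        }
        return result;
    }
}
```

Adding a fourth parameter d then means adding one more line to specFor rather than doubling the number of branches.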

This is not a complete answer. I will offer several suggestions to address the problem at hand.
Dealing with Null Values
To avoid checking whether a value is null, I suggest that you use a container class for your String query parameters with some method, say getValue(), that returns the parameter's condition, e.g. parameter='value', if the value is present, or some default condition, e.g. parameter like '%', if it's null. This approach follows the so-called Null Object pattern.
Dynamic Construction of Query
After doing this, it will no longer matter what values the parameters you passed have and you can just construct your condition iteratively such as:
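// QueryParameter is a hypothetical name for the container class described above
StringBuilder condition = new StringBuilder();
for (QueryParameter parameter : parameters) {
condition.append(" AND ").append(parameter.getValue());
}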
Perhaps you can combine this with a generic method for querying that accepts arbitrary length condition such as:
repository.findBy(condition)
I am not 100% sure, since I am typing this answer off the top of my head, but I think this approach works and should address the problem mentioned in your post. Let me know what you think.

You can make use of an enum defining bitmap constants with a valueOf method:
public enum Combinations{
A_AND_B_AND_C (0b111),
B_AND_C (0b110),
A_AND_C (0b101),
C (0b100),
A_AND_B (0b011),
B (0b010),
A (0b001),
NONE (0b000),
;
private final int bitmap;
Combinations(int bitmap){
this.bitmap = bitmap;
}
public static Combinations valueOf(String... args){
final StringBuilder builder = new StringBuilder();
for(int i = args.length - 1; i >= 0; i--){
final String arg = args[i];
builder.append(arg != null && !arg.isEmpty() ? '1' : '0');
}
final int bitmap = Integer.parseInt(builder.toString(), 2);
final Combinations[] values = values();
for(int i = values.length -1; i >= 0; i--){
if(values[i].bitmap == bitmap){
return values[i];
}
}
throw new NoSuchElementException();
}
}
And another class which has a switch case statement:
public class SomeClass {
public Foo getFoo(String a, String b, String c){
switch(Combinations.valueOf(a, b, c)){
case A_AND_B_AND_C:
return repository.findByAAndBAndC(a, b, c);
case B_AND_C:
return repository.findByBAndC(b, c);
/* all other cases */
case NONE:
return repository.findOne();
default:
// type unknown
throw new UnsupportedOperationException();
}
}
}
This may be a lot of work up front, but you'll be glad once you've done it. By using bitmaps you can have a lot of combinations. The valueOf method takes care of finding out which combination applies. What should happen afterwards can't be done generically, though, so when adding another parameter d you'll get a lot more combinations, which must be added to the enum.
All in all, this solution is overkill for a small number of parameters. It is still quite easy to understand, because the logic is split up into many small parts. You still don't get around the big switch statement at the end, though.
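As a side note, the bitmap can also be computed with bit shifts instead of building and parsing a binary string. This is a minimal sketch under the same bit layout as the enum above (the first argument is the lowest bit); the class name PresenceMask is invented for the example.

```java
class PresenceMask {
    // Bit i is set when args[i] is non-null and non-empty,
    // so the first argument ends up in the lowest bit.
    static int of(String... args) {
        int mask = 0;
        for (int i = 0; i < args.length; i++) {
            if (args[i] != null && !args[i].isEmpty()) {
                mask |= 1 << i;
            }
        }
        return mask;
    }
}
```

With arguments (a, b, c), a call where only a and c are present yields 0b101, which is the value of A_AND_C in the enum.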

Best way to traverse and find an object field from a list

I have a list of custom objects and I want to find an object by a given id (a field in the custom object). While coding this I found two ways to compare the fields.
1.
private Product getProduct(String productId,List<Product> productList){
for (int i = 0; i < productList.size(); i++) {
if (productId.equals(productList.get(i).getId())) {
return productList.get(i);
}
}
return null;
}
2.
private Product getProduct(String productId,List<Product> productList){
for (int i = 0; i < productList.size(); i++) {
if (productList.get(i).getId().equals(productId)) {
return productList.get(i);
}
}
return null;
}
The difference is in the if condition. I want to know which one is better than the other and why, and when to use the first method and when to use the second.

Since equals() is required by Java to be symmetric, there is no difference between the two snippets.
Both snippets are sub-optimal, in that they iterate by numeric index, and retrieve productList.get(i) twice before returning it. Iterating by index is especially dangerous, because passing a LinkedList<Product> will slow down your search considerably.
A better approach is to use a for-each form of the loop:
for (Product p : productList) {
if (p.getId().equals(productId)) {
return p;
}
}
return null;

The concern in both of your implementations is the possibility of calling .equals on a null value.
If you can guarantee neither of them are null then they are equivalent.
If you are using Java 8, stream may be a better choice.
private Product getProduct(String productId,List<Product> productList){
return productList.stream()
.filter(p -> productId.equals(p.getId()))
.findFirst()
.orElse(null);
}

When you are sure the product id's are never null it doesn't really matter.
But in general it's always good to program in a defensive way, so for example prefer using
"SomeString".equals(aString)
instead of
aString.equals("SomeString")
since you know "SomeString" is never null.
Or use
Objects.equals(object1, object2)
when both objects might be null.
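Applied to the lookup from the question, a null-tolerant version might look like the sketch below. The nested Product class is a stand-in for the question's own class, and the name ProductLookup is made up.

```java
import java.util.List;
import java.util.Objects;

class ProductLookup {
    // Hypothetical stand-in for the question's Product class.
    static final class Product {
        private final String id;
        Product(String id) { this.id = id; }
        String getId() { return id; }
    }

    static Product find(String productId, List<Product> productList) {
        for (Product p : productList) {
            // Objects.equals never throws: it returns true for two nulls
            // and false when exactly one side is null.
            if (p != null && Objects.equals(productId, p.getId())) {
                return p;
            }
        }
        return null;
    }
}
```

Note that with this version a null productId matches a product whose id is null; whether that is desirable depends on the data.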

The first one invokes equals on the parameter productId, while the second one invokes equals on the current list element from productList. The result is the same because equals is symmetric:
for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
You can also use a stream for this, so you don't have to care about implementation details (furthermore, Objects#equals(Object, Object) is null-safe):
Product p = productList.stream().filter(e -> Objects.equals(e.getId(), productId))
.findFirst()
.orElse(null);
Have a look at this question for further information.

How to implement efficient hash cons with java HashSet

I am trying to implement a hash cons in java, comparable to what String.intern does for strings. I.e., I want a class to store all distinct values of a data type T in a set and provide an T intern(T t) method that checks whether t is already in the set. If so, the instance in the set is returned, otherwise t is added to the set and returned. The reason is that the resulting values can be compared using reference equality since two equal values returned from intern will for sure also be the same instance.
Of course, the most obvious candidate data structure for a hash cons is java.util.HashSet<T>. However, it seems that its interface is flawed and does not allow efficient insertion, because there is no method to retrieve an element that is already in the set or insert one if it is not in there.
An algorithm using HashSet would look like this:
class HashCons<T>{
HashSet<T> set = new HashSet<>();
public T intern(T t){
if(set.contains(t)) {
return ???; // <----- PROBLEM
} else {
set.add(t); // <--- Inefficient, second hash lookup
return t;
}
}
}
As you see, the problem is twofold:
This solution would be inefficient since I would access the hash table twice, once for contains and once for add. But okay, this may not be a too big performance hit since the correct bucket will be in the cache after the contains, so add will not trigger a cache miss and thus be quite fast.
I cannot retrieve an element already in the set (see line flagged PROBLEM). There is just no method to retrieve the element in the set. So it is just not possible to implement this.
Am I missing something here? Or is it really impossible to build a usual hash cons with java.util.HashSet?

I don't think it's possible using HashSet. You could use some kind of Map instead and use your value as both key and value. The java.util.concurrent.ConcurrentMap also happens to possess the quite convenient method
putIfAbsent(K key, V value)
that returns the value if it is already existent. However, I don't know about the performance of this method (compared to checking "manually" on non-concurrent implementations of Map).
Here is how you would do it using a HashMap:
class HashCons<T>{
Map<T,T> map = new HashMap<T,T>();
public T intern(T t){
if (!map.containsKey(t))
map.put(t,t);
return map.get(t);
}
}
I think the reason why it is not possible with HashSet is quite simple: to the set, if contains(t) is true, it means that the given t equals one of the elements t' in the set. From the set's point of view there is no reason to return t', as you already have an equal object.
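Since Java 8, putIfAbsent is also available on the plain Map interface, so the three map operations above can be reduced to one call. A minimal sketch follows; the class name InternPool is made up, and it assumes T has consistent equals and hashCode implementations.

```java
import java.util.HashMap;
import java.util.Map;

class InternPool<T> {
    private final Map<T, T> map = new HashMap<>();

    public T intern(T t) {
        // putIfAbsent returns the value already mapped to an equal key,
        // or null if t was just inserted.
        T existing = map.putIfAbsent(t, t);
        return existing != null ? existing : t;
    }
}
```

For example, interning two distinct but equal String objects returns the first instance both times, so the results can then be compared with ==.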

Well, HashSet is implemented as a wrapper around HashMap in OpenJDK, so you won't gain anything in memory usage compared to the solution suggested by aRestless.
10-min sketch
class HashCons<T> {
T[] table;
int size;
int sizeLimit;
HashCons(int expectedSize) {
init(Math.max(Integer.highestOneBit(expectedSize * 2) * 2, 16));
}
private void init(int capacity) {
table = (T[]) new Object[capacity];
size = 0;
sizeLimit = (int) (capacity * 2L / 3);
}
T cons(@Nonnull T key) {
int mask = table.length - 1;
int i = key.hashCode() & mask;
do {
if (table[i] == null) break;
if (key.equals(table[i])) return table[i];
i = (i + 1) & mask;
} while (true);
table[i] = key;
if (++size > sizeLimit) rehash();
return key;
}
private void rehash() {
T[] table = this.table;
if (table.length == (1 << 30))
throw new IllegalStateException("HashCons is full");
init(table.length << 1);
for (T key : table) {
if (key != null) cons(key);
}
}
}
