Java: how to partition a collection into equivalence classes? - java

I have a list(!) of items:
A
B
C
D
E
...
and I want to group them:
[A, C, D]
[B, E]
...
Groups are defined by:
all items in the group are equal according to a custom function f(a, b) -> boolean
f(a, b) = f(b, a)
Question: is there a ready-made API to do so?
<T> List<List<T>> group(Collection<T> collection, BiFunction<T, T, Boolean> eqF);
UPDATE. This question is totally not for a scenario when you can define some quality to group by! In this case Java 8 Collectors.groupingBy is the simplest answer.
I am working with multidimensional vectors and equality function looks like:
metrics(a, b) < threshold
For this case defining a hash is equal to solving the initial task :)

Your scenario sounds like a good use case for the groupingBy collector. Normally, instead of supplying an equality function, you supply a function that extracts a qualifier. The elements are then mapped to these qualifiers in lists.
i.e.
Map<Qualifier, List<T>> map = list.stream()
.collect(Collectors.groupingBy(T::getQualifier));
Collection<List<T>> result = map.values();
In the case the identity of T is your qualifier, you could use Function.identity() as an argument.
But this becomes a problem when your qualifier is more than 1 field of T. You could use a tuple type, to create an alternate identity for T but this only goes so far, as there needs to be a separate tuple class for each number of fields.
If you want to use groupingBy you really need to create a temporary alternate identity for T, so you don't have to change T's equals and hashCode methods.
To create a proper identity, you need to implement equals and hashCode (or always return 0 for a hash code, with performance downsides). There is no API class for this, that I know of, but I have made a simple implementation:
interface AlternateIdentity<T> {
    public static <T> Function<T, AlternateIdentity<T>> mapper(
            BiPredicate<? super T, Object> equality, ToIntFunction<? super T> hasher) {
        return t -> new AlternateIdentity<T>() {
            @Override
            public boolean equals(Object other) {
                return equality.test(t, other);
            }
            @Override
            public int hashCode() {
                return hasher.applyAsInt(t);
            }
        };
    }
}
Which you could use like:
Collection<List<T>> result
= list.stream()
.collect(Collectors.groupingBy(
AlternateIdentity.mapper(eqF, hashF)
))
.values();
Where eqF is your function, and hashF is a hash code function that hashes the same fields as eqF tests. (Again, you could also just return 0 in hashF, but having a proper implementation would speed things up.)

You can use hashing to do this in linear time.
To do this, you need to first implement the hashCode() function in your object, so it returns an equal hash value for equal elements (for example by XOR-ing the hash codes of its instance properties). Then you can use a hash table of sets to group your elements.
Map<Integer, Set<T>> hashMap = new HashMap<>();
for (T element : collection) {
    // Create the bucket for this hash code the first time it is seen.
    if (!hashMap.containsKey(element.hashCode())) {
        hashMap.put(element.hashCode(), new HashSet<T>());
    }
    hashMap.get(element.hashCode()).add(element);
}
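As equal elements produce the same hash, they will be inserted into the same equivalence class. Note that unequal elements whose hash codes happen to collide also end up in the same set, so the result is only exact if the hash separates all non-equal elements.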
Now, you can obtain a collection of all equivalence classes (as sets) by using hashMap.values();

I'm pretty sure there's nothing in the standard API for this. You might try a third-party collection class, like Trove's TCustomHashSet. (It's interesting that, according to a comment in this related thread, the Guava group has (for now) rejected a similar class. See the discussion here.)
The alternative is to roll your own solution. If you don't have too many items, I'd suggest a brute-force approach: keep a list of item lists and, for each new item, go through the list of lists and see if it is equal to the first element of the list. If so, add the new item to the matching list and, if not, add a new list to the list of lists with that item as the only member. The computation complexity is not very good, which is why I would only recommend this where the number of items is small or execution time performance is not an issue at all.
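The brute-force approach described above can be sketched roughly as follows. This is a minimal sketch, not a library API: the class name BruteForceGrouping and the mod-3 equality in main are made up for illustration. It also assumes the equality function is a true equivalence relation (transitive as well as symmetric); with a threshold-based metric like the one in the question, which group an item lands in can depend on the order of the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.BiPredicate;

public class BruteForceGrouping {

    // Compares each item against the first element (the representative) of
    // every existing group; opens a new group if none matches.
    // Needs up to n * k calls to eq for n items and k groups.
    public static <T> List<List<T>> group(Collection<T> items,
                                          BiPredicate<? super T, ? super T> eq) {
        List<List<T>> groups = new ArrayList<>();
        for (T item : items) {
            List<T> match = null;
            for (List<T> g : groups) {
                if (eq.test(g.get(0), item)) {
                    match = g;
                    break;
                }
            }
            if (match == null) {
                match = new ArrayList<>();
                groups.add(match);
            }
            match.add(item);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Hypothetical equality: two integers are "equal" if they share a remainder mod 3.
        List<Integer> items = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(group(items, (a, b) -> a % 3 == b % 3));
        // prints [[1, 4, 7], [2, 5], [3, 6]]
    }
}
```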
A second approach is to modify your item class to implement the custom equality function. But to use that with the hash-based collection classes, you'll need to override hashCode() as well. (If you don't use a hash-based collection, you might as well go with the brute-force approach.) If you don't want to (or can't) modify the item class (e.g., you want to use various equality tests), I'd suggest creating a wrapper class that can be parameterized with the equality (and hash code) strategy to use. (This is kind of halfway between modifying your item class and using the Trove class.)
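For illustration, a wrapper parameterized with an equality and hash strategy might look something like the sketch below. The names WrapperGrouping and Wrapped are invented here, and the case-insensitive string example is only an assumption to have something to run. The hash strategy must return equal values for items the equality strategy considers equal, otherwise the grouping breaks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.ToIntFunction;

public class WrapperGrouping {

    // Hypothetical wrapper: delegates equals/hashCode to the supplied strategies.
    static final class Wrapped<T> {
        final T value;
        private final BiPredicate<? super T, ? super T> eq;
        private final ToIntFunction<? super T> hasher;

        Wrapped(T value, BiPredicate<? super T, ? super T> eq, ToIntFunction<? super T> hasher) {
            this.value = value;
            this.eq = eq;
            this.hasher = hasher;
        }

        @Override
        @SuppressWarnings("unchecked")
        public boolean equals(Object other) {
            // Unwrap the other side before applying the custom equality.
            return other instanceof Wrapped && eq.test(value, ((Wrapped<T>) other).value);
        }

        @Override
        public int hashCode() {
            return hasher.applyAsInt(value);
        }
    }

    // Groups items in a hash map keyed by the wrapper; insertion order is kept.
    public static <T> Collection<List<T>> group(Collection<T> items,
            BiPredicate<? super T, ? super T> eq, ToIntFunction<? super T> hasher) {
        Map<Wrapped<T>, List<T>> groups = new LinkedHashMap<>();
        for (T item : items) {
            groups.computeIfAbsent(new Wrapped<>(item, eq, hasher), k -> new ArrayList<>())
                  .add(item);
        }
        return groups.values();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Apple", "apple", "Pear", "PEAR", "fig");
        System.out.println(group(words, String::equalsIgnoreCase,
                s -> s.toLowerCase().hashCode()));
        // prints [[Apple, apple], [Pear, PEAR], [fig]]
    }
}
```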

Here's a simple example that groups strings. If the objects you want to group are more complex, you'll need to supply a classifier function other than identity().
public class StreamGroupingBy
{
public static void main( String[] args )
{
List<String> items = Arrays.asList(
"a", "b", "c", "d",
"a", "b", "c",
"a", "b",
"a", "x" );
Map<String,List<String>> result = items.stream().collect(
Collectors.groupingBy( Function.identity() ) );
System.out.println( result );
}
}
Output:
{a=[a, a, a, a], b=[b, b, b], c=[c, c], d=[d], x=[x]}

I would also recommend implementing a hashing mechanism. You can do something similar with Guava's FluentIterable:
FluentIterable.from(collection)
    .index(new Function<T, K>() {
        @Override
        public K apply(T input) {
            // transform T to its K hash
        }
    }) // returns ImmutableListMultimap<K, T>
    .asMap() // returns Map<K, Collection<T>>
    .values(); // Collection<Collection<T>>

Related

Sort objects of only one class in a mixed list

Consider the following list of objects which are from two classes (A and C),
C3, A2, C1, A1, A3
I would like to sort only the objects of type A (with the objects of other types remaining in their places) so the output would look like this:
C3, A1, C1, A2, A3
Is there any straightforward way to do so in Java?
P.S. By straightforward, I mean, without implementation of custom algorithms and by using Java classes.
I doubt that you can come up with a Comparator that incorporates the logic to do this, and certainly not a straightforward comparator. So just sorting the original array without writing your own sorting algorithm seems to be out.
My approach would be to generate a second array by stripping out the C elements (so it contains only the A elements), sort it using the standard Java classes, and then re-insert the C elements in their correct positions. You will have to be a little careful about indexing, as you are changing the sorted array length as you insert elements, but it should be fairly straightforward to work out.
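A variation on this idea that avoids the index bookkeeping is to write the sorted elements back into the slots they came from. The following is only a sketch: the class PartialSort and method sortOnly are made-up names, and Integer and String stand in for the question's A and C classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;

public class PartialSort {

    // Sorts only the elements that are instances of type, leaving all other
    // elements at their original positions. The list must support set().
    public static <T> void sortOnly(List<Object> list, Class<T> type,
                                    Comparator<? super T> order) {
        List<T> selected = new ArrayList<>();
        for (Object o : list) {
            if (type.isInstance(o)) {
                selected.add(type.cast(o));
            }
        }
        selected.sort(order);

        // Write the sorted elements back into the slots the selected type occupied.
        Iterator<T> sorted = selected.iterator();
        ListIterator<Object> slots = list.listIterator();
        while (slots.hasNext()) {
            if (type.isInstance(slots.next())) {
                slots.set(sorted.next());
            }
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the question's classes: Integer plays A, String plays C.
        List<Object> mixed = new ArrayList<>(Arrays.asList("C3", 2, "C1", 1, 3));
        sortOnly(mixed, Integer.class, Comparator.naturalOrder());
        System.out.println(mixed); // prints [C3, 1, C1, 2, 3]
    }
}
```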
Probably not the simplest solution, but one (hacked together*) approach:
Add all A objects to a new List and sort them:
List<A> list2 = new ArrayList<>();
list2.add(a2);
list2.add(a1);
list2.add(a3);
Collections.sort(list2, new Comparator<A>() {
    @Override
    public int compare(A a1, A a2) {
        return a1.getName().compareTo(a2.getName());
    }
});
And then have a final list, adding from the original list if the type is C and otherwise add from the sorted A list:
List<Object> resultSet = new ArrayList<>();
int it = 0;
for (Object obj : list) {
if (obj instanceof C) {
resultSet.add(obj);
} else {
resultSet.add(list2.get(it));
it++;
}
}
*Note that this is just to get an idea of how to do it, not a great solution to copy.
You can sort it as you would normally sort a list, but use .getClass() and .getClass().getSimpleName() to tell whether an element is of class A or C and what its class is called.

Merge two ArrayLists with no duplicate subclasses

Given:
public abstract class Cars {}
...
public class Ford extends Cars {}
...
public class Dodge extends Cars {}
...
public class Volkswagen extends Cars {}
...
If I have two ArrayList objects:
List<Cars> dealer1 = new ArrayList<>();
List<Cars> dealer2 = new ArrayList<>();
dealer1.addAll(asList(new Ford("ford1"), new Dodge("dodge1")));
dealer2.addAll(asList(new Dodge("dodge2"), new Volkswagen("vw1")));
I then want to create a merged list from the two with only one instance of each subclass, such that:
dealerMerged = ["ford1", "dodge1", "vw1"]
OR
dealerMerged = ["ford1", "dodge2", "vw1"]
It doesn't matter which instance makes it into the merged list.
Is this possible? I had a search through and saw something about using Set but that seems to only ensure unique references, unless I've badly misunderstood something.
Overriding equals() will work but DON'T
You can always make your collection distinct by converting it to a Set (as @Arun states in a comment) or by using the distinct operation on a Stream of your collections. But remember that those approaches rely on the equals() method. So a quick idea would be to override equals() so that it compares the Class type. But wait! If you do so, all Dodge objects will be equal to each other even though they have different properties, such as the names dodge1 and dodge2. In the real world you rarely handle just a single use case, and the equals() method matters in many other places. So stay away from doing so.
If you are looking for a Java 8 way, a stateful filter is perfect
We can apply the filter operation to our concatenated stream. The filter operation is straightforward: it takes a predicate and decides which elements to keep and which to ignore. The following commonly used helper, which you will find all over the blogs, solves this problem.
public static <T> Predicate<T> distinctBy(Function<? super T, ?> keyExtractor) {
Map<Object, Boolean> seen = new ConcurrentHashMap<>();
return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}
Here the distinctBy function returns a predicate (that will be used in filter operation). It maintains state about what it's seen previously and returns whether the given element was seen for the first time. (You can read further explanation about this here)
You can use this Stateful Filter like
Stream.of(dealer1, dealer2)
.flatMap(Collection::stream)
.filter(distinctBy(Cars::getClass))
.collect(Collectors.toList())
.forEach(cars -> System.out.println(cars));
So what did we actually do here?
We concatenated the two ArrayLists with flatMap, which gives us a single stream of the merged elements. (If you are new to the Stream API, see this Stream Concatenation article.)
We then used the filter() operation, fed with the predicate returned by the distinctBy method.
A ConcurrentHashMap is maintained to track which keys have already been seen.
The predicate uses the getClass() method, which returns the element's runtime Class object and so distinguishes the elements by subclass.
We can then collect or iterate over the filtered list.
Try using a Map instead of a List. The following solution lets you store Cars instances keyed by their type, so you will always have only one entry per class (the most recently added one, by the way).
public class CarsCollection {
Map<Class<? extends Cars>, Cars> coll = new HashMap<>();
public <T extends Cars> void add(Class<T> cls, T obj) {
coll.put(cls, obj);
}
}
public class Temp {
public static void main(String[] args) {
CarsCollection cars = new CarsCollection();
cars.add(Ford.class, new Ford("ford1"));
cars.add(Dodge.class, new Dodge("dodge1"));
cars.add(Dodge.class, new Dodge("dodge2"));
cars.add(Volkswagen.class, new Volkswagen("vw1"));
System.out.println(cars);
}
}
You could add all the element of the first list into the result list (assuming there is no duplicate in the first list) and then loop through the second list and add the elements to the resulting list only if there is no instance of the same class in the first list.
That could look something like this :
// Copy dealer1 so that adding to the merged list does not modify it.
List<Cars> dealerMerged = new ArrayList<>(dealer1);
for (Cars car2 : dealer2) {
    boolean isAlreadyRepresented = false;
    for (Cars car1 : dealer1) {
        if (car1.getClass().equals(car2.getClass())) {
            isAlreadyRepresented = true;
            break;
        }
    }
    if (!isAlreadyRepresented) {
        dealerMerged.add(car2);
    }
}
Just use class of the object as key in the map. This example with Java stream does exactly that:
List<Cars> merged = Stream.of(dealer1, dealer2)
.flatMap(Collection::stream)
.collect( Collectors.toMap( Object::getClass, Function.identity(), (c1, c2) -> c1 ) )
.values()
.stream().collect( Collectors.toList() );

Java: Creating an Array from the Properties of Another Array

Is there a simple way in Java (that doesn't involve writing a for-loop) to create an array of objects from a property of another array of different objects?
For example, if I have an array of objects of type A, defined as:
public class A {
private String p;
public String getP() {
return p;
}
}
I want to create an array of Strings that contains the value of A[i].p for each i.
Essentially, I want to do this: Creating an array from properties of objects in another array, but in Java.
I attempted to use Arrays.copyOf(U[] original, int newLength, Class<? extends T[]> newType) along with a lambda expression, but that didn't seem to work. What I tried:
Arrays.copyOf(arrayA, arrayA.length, (A a) -> a.getP());
With Java 8, you can use the Stream API and particularly the map function:
A[] as = { new A("foo"), new A("bar"), new A("blub") };
String[] ps = Stream.of(as).map(A::getP).toArray(String[]::new);
Here, A::getP and String[]::new are method/constructor references. If you do not have a suitable method for the property you want to have, you could also use a lambda function:
String[] ps = Stream.of(as).map(a -> a.getP()).toArray(String[]::new);
This is where a powerful concept in functional programming called map is useful. Here's how map is defined:
map :: (a -> b) -> [a] -> [b]
Thus, map is a function that takes a function (that takes a and returns b) and a list and returns a list. It applies the given function to each element of the given list. Thus map is a higher order function.
In Java 8, you can use this idiom if you can convert the array into a stream. This can be done simply:
Arrays.stream(array).map(mappingFunction);
where the mappingFunction takes an element from stream (say of type A) and converts it to another (say of type B). What you now have is a stream of B's, which you can easily collect in a collector (e.g. in a list, or an array) for further processing.
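As a concrete illustration, here is a small sketch of that idiom. The helper name mapToList and the sample strings are invented for the example; String::length plays the role of mappingFunction.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class MapExample {

    // Applies fn to every element of the array and collects the results into a list.
    public static <A, B> List<B> mapToList(A[] array, Function<A, B> fn) {
        return Arrays.stream(array).map(fn).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String[] words = { "foo", "quux", "blub" };

        List<Integer> lengths = mapToList(words, String::length);
        System.out.println(lengths); // prints [3, 4, 4]

        // The same pipeline can end in an array instead of a list.
        Integer[] asArray = Arrays.stream(words).map(String::length).toArray(Integer[]::new);
        System.out.println(Arrays.toString(asArray)); // prints [3, 4, 4]
    }
}
```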

Arrays.asList(T[] array)?

So there's Arrays.asList(T... a) but this works on varargs.
What if I already have the array in a T[] a? Is there a convenience method to create a List<T> out of this, or do I have to do it manually as:
static public <T> List<T> arrayAsList(T[] a)
{
List<T> result = new ArrayList<T>(a.length);
for (T t : a)
result.add(t);
return result;
}
Just because it works with varargs doesn't mean you can't call it normally:
String[] x = { "a", "b", "c" };
List<String> list = Arrays.asList(x);
The only tricky bit is if T is Object, where you should use a cast to tell the compiler whether it should wrap the argument in an array or not:
Object[] x = ...;
List<Object> list = Arrays.asList((Object[]) x);
or
Object[] x = ...;
List<Object[]> list = Arrays.asList((Object) x);
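One thing worth knowing when passing an existing array: Arrays.asList returns a fixed-size list backed by that array rather than a copy. A short sketch of the difference follows; the class name AsListExample and the helper toMutableList are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AsListExample {

    // Copies the array into an independent, resizable list.
    public static <T> List<T> toMutableList(T[] a) {
        return new ArrayList<>(Arrays.asList(a));
    }

    public static void main(String[] args) {
        String[] array = { "a", "b", "c" };

        // Fixed-size view backed by the array: set() writes through to it.
        List<String> view = Arrays.asList(array);
        view.set(0, "z");
        System.out.println(array[0]); // prints z

        // The copy can grow without touching the array.
        List<String> copy = toMutableList(array);
        copy.add("d");
        System.out.println(copy);         // prints [z, b, c, d]
        System.out.println(array.length); // prints 3
    }
}
```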
As you probably already know, there is a utility class called java.util.Collections which has a number of useful static methods for dealing with collections, such as searching and sorting (java.util.Arrays is its counterpart for arrays).
As for your question, the Collection interface specifies methods to add, remove and toArray, amongst others. For one reason or another, the API's authors decided that the add and addAll method will be the only input functions provided to the user.
One explanation for why Java Lists cannot add arrays of objects is that Lists use an iterator and iterators are more strict in their scrolling (i.e. going to the next value) than Arrays which do not have to have all their index values i=(1, 2, 5, 9, 22, ...).
Also, Arrays are not type safe; that is, they cannot guarantee that all their elements conform to a specific super-class or interface, whereas generics (of which List is a member) can guarantee type safety. Hence, the list has the chance to validate each item using the add method.
I think that you can rest assured that your method of adding an array to a list is one of the most efficient ways (if not the most efficient) of achieving this effect in Java.

Java TreeMap (comparator) and get method ignoring the comparator

public final Comparator<String> ID_IGN_CASE_COMP = new Comparator<String>() {
public int compare(String s1, String s2) {
return s1.compareToIgnoreCase(s2);
}
};
private Map< String, Animal > _animals = new TreeMap< String, Animal >(ID_IGN_CASE_COMP);
My problem is how to use the get(id) method while ignoring the given comparator. I want the map to be ordered case-insensitively, but I want it to be case-sensitive when I fetch the values by a given key.
I think the answer is easy. Implement your own comparator that does a case insensitive sort but does NOT return 0 for "A" and "a"... sort them too.
The issue is that your comparator returns 0 for the compare( "A", "a" ) case which means it is the same key as far as the map is concerned.
Use a comparator like:
public final Comparator<String> ID_IGN_CASE_COMP = new Comparator<String>() {
public int compare(String s1, String s2) {
int result = s1.compareToIgnoreCase(s2);
if( result == 0 )
result = s1.compareTo(s2);
return result;
}
};
Then all keys will go in regardless of case and "a" and "A" will still be sorted together.
In other words, get("a") will give you a different value from get("A")... and they will both show up in keySet() iterators. They will just be sorted together.
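For what it's worth, on Java 8 the same tie-breaking comparator can be composed from existing pieces. The sketch below assumes Java 8 or later; the class name CaseTieBreak and the sample keys are made up.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

public class CaseTieBreak {

    // Case-insensitive order first; exact (case-sensitive) order breaks ties,
    // so "a" and "A" remain distinct keys.
    public static final Comparator<String> ORDER =
            String.CASE_INSENSITIVE_ORDER.thenComparing(Comparator.naturalOrder());

    public static void main(String[] args) {
        Map<String, Integer> map = new TreeMap<>(ORDER);
        map.put("b", 1);
        map.put("A", 2);
        map.put("a", 3);
        map.put("B", 4);

        System.out.println(map);          // prints {A=2, a=3, B=4, b=1}
        System.out.println(map.get("a")); // prints 3
        System.out.println(map.get("C")); // prints null
    }
}
```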
In a TreeMap, adding two keys a and b (in that order) such that compare(a, b) returns 0 means the map keeps a single entry, and the value added with b overwrites the value added with a.
In your case, this means that there will never be any use for case insensitive get(id).
quoting http://java.sun.com/javase/6/docs/api/java/util/TreeMap.html
Note that the ordering maintained by a sorted map (whether or not an explicit comparator is provided) must be consistent with equals if this sorted map is to correctly implement the Map interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Map interface is defined in terms of the equals operation, but a map performs all key comparisons using its compareTo (or compare) method, so two keys that are deemed equal by this method are, from the standpoint of the sorted map, equal. The behavior of a sorted map is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Map interface.
This is probably not what you want.
If the map is comparatively small and you don't need to fetch the sorted entries very many times, a solution is to use a HashMap (or a TreeMap without explicitly setting the comparator), and sort the entries case-insensitively when you need them ordered.
You'll have to use two separate TreeMaps for that, with the same contents but different comparators.
Maybe this will do the job:
new Comparator<String>() {
    public int compare(String s1, String s2) {
        String s1n = s1.toLowerCase();
        String s2n = s2.toLowerCase();
        // Same letters ignoring case: fall back to the case-sensitive order.
        if (s1n.equals(s2n)) {
            return s1.compareTo(s2);
        }
        return s1n.compareTo(s2n);
    }
};
You need a multimap: each entry of this multimap keeps the case-insensitive key and another map with the original keys as its value.
There are many freely usable implementations of multimaps, such as Commons Collections, Google Collections, etc.
In addition to all the other answers, and agreeing that it is impossible to have a single TreeMap structure with different comparators:
From your question I understand that you have two requirements: the data model shall be case-sensitive (you want the case-sensitive values when you use get()), and the presenter shall be case-insensitive (you want a case-insensitive ordering; presentation is just an assumption).
Let's assume we populate the Map with the mappings (aa,obj1), (aA,obj2), (Aa,obj3), (AA,obj4). The iterator will provide the values in the order: (obj4, obj3, obj2, obj1)(*). Now which order do you expect if the map was ordered case-insensitively? All four keys would be equal and the order undefined. Or are you looking for a solution that would resolve the collection {obj1, obj2, obj3, obj4} for the key 'AA'? But that's a different approach.
SO encourages the community to be honest: therefore my advice at this point is to look at your requirement again :)
(*) not tested, assumed that 'A' < 'a' = true.
Use floorEntry and then higherEntry in a loop to find the entries case-insensitively; stop when you find the exact key match.
