Combining / merging object members in a list

I have a list of objects of class A.
Some of these objects have the same id and name but a different List<B>; the B lists are ALWAYS different.
I need to merge them so that the result contains exactly one A per distinct id and name, with all the B elements of that group collected into its list.
I can use JDK 8+ utilities, so streams are fine here, although I wonder whether reflection would be more suitable.
PS: I cannot change the A or B classes; they are generated classes and I have no way to modify or extend them.
@Test
public void test() {
List.of(new A(1, "a1", List.of(new B(1, "1b"))),
new A(1, "a1", List.of(new B(2, "2b"))),
new A(2, "a2", List.of(new B(3, "3b"))));
//expected
List.of(new A(1, "a1", List.of(new B(1, "1b"), new B(2, "2b"))),
new A(2, "a2", List.of(new B(3, "3b"))));
}
class A {
public A(int id, String name, List<B> listB) {
this.id = id;
this.name = name;
this.listB = listB;
}
int id;
String name;
List<B> listB;
}
class B {
public B(int id, String name) {
this.id = id;
this.name = name;
}
int id;
String name;
}

You could use
record Key(int id, String name) {};
List<A> result = input.stream().collect(
Collectors.groupingBy(a -> new Key(a.getId(), a.getName()),
LinkedHashMap::new,
Collectors.flatMapping(a -> a.getListB().stream(), Collectors.toList())))
.entrySet().stream()
.map(e -> new A(e.getKey().id(), e.getKey().name(), e.getValue()))
.collect(Collectors.toList());
if(!result.equals(expected)) {
throw new AssertionError("expected " + expected + " but got " + result);
}
This constructs new lists with new A objects, which is suitable for immutable objects. Your use of List.of(…) suggests a preference towards immutable objects. If you have mutable objects and want to perform the operation in-place, you could do
List<A> result = new ArrayList<>(input); // only needed if input is an immutable list
record Key(int id, String name) {};
HashMap<Key,A> previous = new HashMap<>();
result.removeIf(a -> previous.merge(new Key(a.getId(), a.getName()), a, (old, newA) -> {
var l = old.getListB();
if(l.getClass() != ArrayList.class) old.setListB(l = new ArrayList<>(l));
l.addAll(newA.getListB());
return old;
}) != a);
if(!result.equals(expected)) {
throw new AssertionError("expected " + expected + " but got " + result);
}
This removes the duplicates from the list and adds their Bs to the previously encountered original. It does the minimum of changes required to get the intended list, e.g. if there are no duplicates, it does nothing.
If A objects with the same id always have the same name, in other words, there is no need for a key object checking both, you could simplify this approach to
List<A> result = new ArrayList<>(input); // only needed if input is an immutable list
HashMap<Integer,A> previous = new HashMap<>();
result.removeIf(a -> previous.merge(a.getId(), a, (old, newA) -> {
var l = old.getListB();
if(l.getClass() != ArrayList.class) old.setListB(l = new ArrayList<>(l));
l.addAll(newA.getListB());
return old;
}) != a);
if(!result.equals(expected)) {
throw new AssertionError("expected " + expected + " but got " + result);
}
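Note that the first variant above needs Collectors.flatMapping (added in Java 9) and records (Java 16), and all variants assume getters and setters that the classes in the question do not show. Below is a minimal Java 8-compatible sketch with direct field access. The classes A and B here are stand-ins for the generated ones, and MergeByIdAndName is a made-up name for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-ins for the generated classes (assumption: fields are accessible).
class B {
    final int id;
    final String name;
    B(int id, String name) { this.id = id; this.name = name; }
}

class A {
    final int id;
    final String name;
    final List<B> listB;
    A(int id, String name, List<B> listB) { this.id = id; this.name = name; this.listB = listB; }
}

class MergeByIdAndName {
    static List<A> merge(List<A> input) {
        // LinkedHashMap keeps the groups in first-encounter order.
        Map<List<Object>, List<B>> grouped = new LinkedHashMap<>();
        for (A a : input) {
            // A two-element list acts as a composite key with value-based equals/hashCode.
            List<Object> key = Arrays.<Object>asList(a.id, a.name);
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).addAll(a.listB);
        }
        // Build one new A per group; the input objects are left untouched.
        List<A> result = new ArrayList<>();
        for (Map.Entry<List<Object>, List<B>> e : grouped.entrySet()) {
            result.add(new A((Integer) e.getKey().get(0), (String) e.getKey().get(1), e.getValue()));
        }
        return result;
    }
}
```

Like the first variant, this creates new A instances instead of mutating the input, so it also works when the input lists are immutable.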

If you need to preserve an instance for each id, you can write the following (I assume the objects have getters and setters):
System.out.println(xs.stream()
.collect(groupingBy(A::getId, toList()))
.values().stream()
.peek(g -> g.get(0).setListB(
g.stream()
.flatMap(h -> h.getListB().stream())
.collect(groupingBy(B::getId, toList()))
.values().stream()
.map(i -> i.get(0))
.collect(toList())))
.map(g -> g.get(0))
.collect(toList()));
With your input, this prints:
[A(id=1, name=a1, listB=[B(id=1, name=b1), B(id=2, name=b2)]), A(id=2, name=a2, listB=[B(id=3, name=b3)])]
If you can create new instances, you can instead renormalize the lists:
System.out.println(xs.stream()
.flatMap(a -> a.getListB().stream().map(b -> List.<Object>of(a.id, a.name, b.id, b.name)))
.distinct()
.collect(groupingBy(o -> o.get(0), toList()))
.values()
.stream()
.map(zs -> new A((int) zs.get(0).get(0), (String) zs.get(0).get(1),
zs.stream().map(z -> new B((int) z.get(2), (String) z.get(3))).collect(toList())))
.collect(toList()));
(You can get rid of the ugly .get(0).get(0) by using some intermediate class called DenormalizedRow or so.)

Related

Java 8 compare objects from the same list(array index out of bound)

I have a map of type Map<String,List<Trade>>. Within each value, i.e. each List<Trade>, I have to compare the objects with each other. If any two objects have the same "TradeType" property, I have to remove those two from the list. I am trying to achieve this as shown below, but I am getting an "Array index out of bound" exception. Also, is there a better way to compare the objects inside the same list using streams?
Map<String,List<Trade>> tradeMap = new HashMap<>();
// Insert some data here
tradeMap.entrySet()
.stream()
.forEach(tradeList -> {
List<Trade> tl = tradeList.getValue();
String et = "";
// Is there a better way to refactor this?
for(int i = 0;i <= tl.size();i++){
if(i == 0) {
et = tl.get(0).getTradeType();
}
else {
if(et.equals(tl.get(i).getTradeType())){
tl.remove(i);
}
}
}
});

Your description is not completely in sync with what your code does, so I will provide a couple of solutions from which you can choose the one you're after.
First and foremost as for the IndexOutOfBoundsException you can solve it by changing the loop condition from i <= tl.size() to i < tl.size(). This is because the last item in a list is at index tl.size() - 1 as lists are 0 based.
To improve upon your current attempt you can do it as follows:
tradeMap.values()
.stream()
.filter(list -> list.size() > 1)
.forEach(T::accept);
where accept is defined as:
private static void accept(List<Trade> list) {
String et = list.get(0).getTradeType();
list.subList(1, list.size()).removeIf(e -> e.getTradeType().equals(et));
}
and T should be substituted with the class containing the accept method.
The above code snippet only removes objects after the first element that are equal to the first element by trade type, which is what your example snippet is attempting to do. If, however, you want all objects to be distinct, then one option would be to override equals and hashCode in the Trade class as follows:
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Trade trade = (Trade) o;
return Objects.equals(tradeType, trade.tradeType);
}
@Override
public int hashCode() {
return Objects.hash(tradeType);
}
Then the accept method needs to be modified to become:
private static void accept(List<Trade> list) {
List<Trade> distinct = list.stream()
.distinct()
.collect(Collectors.toList());
list.clear(); // clear list
list.addAll(distinct); // repopulate with distinct objects by tradeType
}
or if you don't want to override equals and hashcode at all then you can use the toMap collector to get the distinct objects:
private static void accept(List<Trade> list) {
Collection<Trade> distinct = list.stream()
.collect(Collectors.toMap(Trade::getTradeType,
Function.identity(), (l, r) -> l, LinkedHashMap::new))
.values();
list.clear(); // clear list
list.addAll(distinct); // repopulate with distinct objects by tradeType
}
if however, when you say:
"If any two of the object's property name "TradeType" is same then I
have to remove those two from the list."
you actually want to remove all equal Trade objects by tradeType that have 2 or more occurrences then modify the accept method to be as follows:
private static void accept(List<Trade> list) {
list.stream()
.collect(Collectors.groupingBy(Trade::getTradeType))
.values()
.stream()
.filter(l -> l.size() > 1)
.map(l -> l.get(0))
.forEach(t -> list.removeIf(trade -> trade.getTradeType().equals(t.getTradeType())));
}
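One caveat for the first variant above (subList plus removeIf): it modifies the list in place, so it assumes a mutable list such as ArrayList. A fixed-size list from Arrays.asList throws UnsupportedOperationException as soon as an element has to be removed. Here is a minimal sketch, using plain strings in place of Trade objects; the class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubListRemoveDemo {
    // Removes every element after the first one that equals the first element.
    static void dropRepeatsOfFirst(List<String> list) {
        if (list.size() < 2) {
            return; // nothing to compare against
        }
        String first = list.get(0);
        // subList is a view: removing from it removes from the backing list.
        list.subList(1, list.size()).removeIf(first::equals);
    }

    public static void main(String[] args) {
        // Copy into an ArrayList so that removal is supported.
        List<String> types = new ArrayList<>(Arrays.asList("BUY", "SELL", "BUY", "HOLD", "BUY"));
        dropRepeatsOfFirst(types);
        System.out.println(types); // [BUY, SELL, HOLD]
    }
}
```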

public void test(){
Map<String, List<Trade>> data = new HashMap<>();
List<Trade> list1 = Arrays.asList(new Trade("1"), new Trade("2"), new Trade("1"), new Trade("3"), new Trade("3"), new Trade("4"));
List<Trade> list2 = Arrays.asList(new Trade("1"), new Trade("2"), new Trade("2"), new Trade("3"), new Trade("3"), new Trade("4"));
data.put("a", list1);
data.put("b", list2);
Map<String, List<Trade>> resultMap = data.entrySet()
.stream()
.collect(Collectors.toMap(Entry::getKey, this::filterDuplicateListObjects));
resultMap.forEach((key, value) -> System.out.println(key + ": " + value));
// Don't forget to override the toString() method of Trade class.
}
public List<Trade> filterDuplicateListObjects(Entry<String, List<Trade>> entry){
return entry.getValue()
.stream()
.filter(trade -> isDuplicate(trade, entry.getValue()))
.collect(Collectors.toList());
}
public boolean isDuplicate(Trade trade, List<Trade> tradeList){
return tradeList.stream()
.filter(t -> !t.equals(trade))
.noneMatch(t -> t.getTradeType().equals(trade.getTradeType()));
}

Map<String,List<Trade>> tradeMap = new HashMap<>();
tradeMap.put("1",Arrays.asList(new Trade("A"),new Trade("B"),new Trade("A")));
tradeMap.put("2",Arrays.asList(new Trade("C"),new Trade("C"),new Trade("D")));
Map<String,Collection<Trade>> tradeMapNew = tradeMap.entrySet()
.stream()
.collect(Collectors.toMap(Entry::getKey,
e -> e.getValue().stream() //This is to remove the duplicates from the list.
.collect(Collectors.toMap(Trade::getTradeType,
t->t,
(t1,t2) -> t1,
LinkedHashMap::new))
.values()));
Output:
{1=[A, B], 2=[C, D]}

Map<String, List<Trade>> tradeMap = new HashMap<>();
tradeMap.values()
.stream()
.forEach(trades -> trades.stream()
.collect(Collectors.groupingBy(Trade::getType))
.values()
.stream()
.filter(tradesByType -> tradesByType.size() > 1)
.flatMap(Collection::stream)
.forEach(trades::remove));

What is equivalent to C#'s Select clause in JAVA's streams API

I want to filter a list of Person objects and finally map them to some anonymous class in Java using streams. I can do the same thing very easily in C#.
Person class
class Person
{
public int Id { get; set; }
public string Name { get; set; }
public string Address { get; set; }
}
Code to map the result to the desired format:
List<Person> lst = new List<Person>();
lst.Add(new Person() { Name = "Pava", Address = "India", Id = 1 });
lst.Add(new Person() { Name = "tiwari", Address = "USA", Id = 2 });
var result = lst.Select(p => new { Address = p.Address, Name = p.Name }).ToList();
Now if I want to access any property of the newly created type, I can do so with the syntax below.
Console.WriteLine( result[0].Address);
Ideally I should use a loop to iterate over the result.
I know that in Java we have collect for ToList and map for Select.
But I am unable to select only two properties of the Person class.
How can I do this in Java?

Java does not have structural types. The closest you could map the values to, are instances of anonymous classes. But there are significant drawbacks. Starting with Java 16, using record would be the better solution, even if it’s a named type and might be slightly more verbose.
E.g. assuming
class Person {
int id;
String name, address;
public Person(String name, String address, int id) {
this.id = id;
this.name = name;
this.address = address;
}
public int getId() {
return id;
}
public String getName() {
return name;
}
public String getAddress() {
return address;
}
}
you can do
List<Person> lst = List.of(
new Person("Pava", "India", 1), new Person("tiwari", "USA", 2));
var result = lst.stream()
.map(p -> {
record NameAndAddress(String name, String address){}
return new NameAndAddress(p.getName(), p.getAddress());
})
.collect(Collectors.toList());
result.forEach(x -> System.out.println(x.name() + " " + x.address()));
The anonymous inner class alternative would look like
List<Person> lst = List.of(
new Person("Pava", "India", 1), new Person("tiwari", "USA", 2));
var result = lst.stream()
.map(p -> new Object(){ String address = p.getAddress(); String name = p.getName();})
.collect(Collectors.toList());
result.forEach(x -> System.out.println(x.name + " " + x.address));
but as you might note, it’s still not as concise as a structural type. Declaring the result variable using var is the only way to refer to the type we can not refer to by name. This requires Java 10 or newer and is limited to the method’s scope.
It’s also important to keep in mind that inner classes can create memory leaks due to capturing a reference to the surrounding this. In the example, each object also captures the value of p used for its initialization. The record doesn’t have these problems and further, it automatically gets suitable equals, hashCode, and toString implementations, which implies that printing the list like System.out.println(result); or transferring it to a set like new HashSet<>(result) will have meaningful results.
Also, it’s much easier to move the record’s declaration to a broader scope.
Prior to Java 10, lambda expressions are the only Java feature that supports declaring variables of an implied type, which could be anonymous. E.g., the following would work even in Java 8:
List<String> result = lst.stream()
.map(p -> new Object(){ String address = p.getAddress(); String name = p.getName();})
.filter(anon -> anon.name.startsWith("ti"))
.map(anon -> anon.address)
.collect(Collectors.toList());

It seems that you want to transform your Person with 3 properties into a holder that has 2 properties. That is a simple map operation:
lst.stream().map(p -> new AbstractMap.SimpleEntry<>(p.address, p.name))
.collect(Collectors.toList());
This collects your values into SimpleEntry instances, each of which is just a holder for two values. If you need more than two, you are out of luck: you will need to create your own holder class.
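To read the two values back, the entries expose getKey() and getValue(). Here is a minimal sketch of the idea with explicit type arguments, assuming a hypothetical Person with accessible fields (EntryProjectionDemo and addressAndName are made-up names):

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class EntryProjectionDemo {
    // Hypothetical minimal Person, only for this sketch.
    static class Person {
        final int id;
        final String name;
        final String address;
        Person(int id, String name, String address) {
            this.id = id;
            this.name = name;
            this.address = address;
        }
    }

    // Projects each person to an (address, name) pair: key = address, value = name.
    static List<Map.Entry<String, String>> addressAndName(List<Person> persons) {
        return persons.stream()
                .<Map.Entry<String, String>>map(p -> new AbstractMap.SimpleEntry<>(p.address, p.name))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> lst = Arrays.asList(
                new Person(1, "Pava", "India"), new Person(2, "tiwari", "USA"));
        for (Map.Entry<String, String> e : addressAndName(lst)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

The downside compared with a dedicated class is that getKey() and getValue() say nothing about what the values mean.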

If you know which attributes to select and this does not change, I would recommend writing a small class with that subset of Person's attributes. You can then map every person to an instance of that class and collect them into a list:
Stream.of(new Person(1, "a", "aa"), new Person(2, "b", "bb"), new Person(3, "b", "bbb"),
new Person(4, "c", "aa"), new Person(5, "b", "bbb"))
.filter(person -> true) // your filter criteria goes here
.map(person -> new PersonSelect(person.getName(), person.getAddress()))
.collect(Collectors.toList());
// result in list of PersonSelects with your name and address
If the set of desired attributes varies, you could use an array instead. It will look more similar to your C# code, but does not provide type safety:
Stream.of(new Person(1, "a", "aa"), new Person(2, "b", "bb"), new Person(3, "b", "bbb"),
new Person(4, "c", "aa"), new Person(5, "b", "bbb"))
.filter(person -> true)
.map(person -> new Object[] {person.getName(), person.getAddress()})
.collect(Collectors.toList())
.forEach(p -> System.out.println(Arrays.asList(p)));
// output: [a, aa], [b, bb], [b, bbb], [c, aa], [b, bbb]

If you want to create a list of new Person instances you first should provide a constructor, e.g. like this:
class Person {
public int id;
public String name;
public String address;
public Person( int pId, String pName, String pAddress ) {
super();
id = pId;
name = pName;
address = pAddress;
}
}
Then you could use the stream:
List<Person> lst = new ArrayList<>();
lst.add(new Person(1, "Pava", "India" ));
lst.add(new Person( 2, "tiwari", "USA" ) );
//since id is an int we can't use null and thus I used -1 here
List<Person> result = lst.stream().map(p -> new Person(-1, p.name, p.address)).collect(Collectors.toList());
If you want to filter persons then just put a filter() in between stream() and map():
List<Person> result = lst.stream().filter(p -> p.name.startsWith( "P" )).map(p -> new Person( -1, p.name, p.address )).collect(Collectors.toList());

How to expand and do regroup a List of List using Java 8 Stream?

I have a list of the Class A, that includes a List itself.
public class A {
public double val;
public String id;
public List<String> names = new ArrayList<String>();
public A(double v, String ID, String name)
{
val = v;
id = ID;
names.add(name);
}
static public List<A> createAnExample()
{
List<A> items = new ArrayList<A>();
items.add(new A(8.0,"x1","y11"));
items.add(new A(12.0, "x2", "y21"));
items.add(new A(24.0,"x3","y31"));
items.get(0).names.add("y12");
items.get(1).names.add("y11");
items.get(1).names.add("y31");
items.get(2).names.add("y11");
items.get(2).names.add("y32");
items.get(2).names.add("y33");
return items;
}
The aim is to split each item's val evenly across its names and then sum these shares per name over the whole list. I added the code in the main function, using some Java 8 streams.
My question is how I can rewrite it in a more elegant way, without the second list and the for loop.
static public void main(String[] args) {
List<A> items = createAnExample();
List<A> items2 = new ArrayList<A>();
for (int i = 0; i < items.size(); i++) {
List<String> names = items.get(i).names;
double v = items.get(i).val / names.size();
String itemid = items.get(i).id;
for (String n : names) {
A item = new A(v, itemid, n);
items2.add(item);
}
}
Map<String, Double> x = items2.stream().collect(Collectors.groupingBy(item ->
item.names.isEmpty() ? "NULL" : item.names.get(0), Collectors.summingDouble(item -> item.val)));
for (Map.Entry entry : x.entrySet())
System.out.println(entry.getKey() + " --> " + entry.getValue());
}

You can do it with flatMap:
x = items.stream()
.flatMap(a -> a.names.stream()
.map(n -> new AbstractMap.SimpleEntry<>(n, a.val / a.names.size()))
).collect(groupingBy(
Map.Entry::getKey, summingDouble(Map.Entry::getValue)
));
If you find yourself dealing with problems like these often, consider a static method to create a Map.Entry:
static<K,V> Map.Entry<K,V> entry(K k, V v) {
return new AbstractMap.SimpleImmutableEntry<>(k,v);
}
Then you would have a less verbose .map(n -> entry(n, a.val/a.names.size()))
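For a quick check of the flatMap approach against the example data from the question, here is a self-contained sketch. Item is a hypothetical stand-in for the question's class A (the id field is left out because the grouping does not use it), and SharePerNameDemo is a made-up name:

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SharePerNameDemo {
    // Hypothetical stand-in for class A from the question.
    static class Item {
        final double val;
        final List<String> names;
        Item(double val, String... names) {
            this.val = val;
            this.names = Arrays.asList(names);
        }
    }

    static Map<String, Double> sumSharesPerName(List<Item> items) {
        return items.stream()
                // One (name, val / numberOfNames) entry per name of each item.
                .flatMap(item -> item.names.stream()
                        .map(n -> new AbstractMap.SimpleEntry<String, Double>(
                                n, item.val / item.names.size())))
                // Sum the shares that belong to the same name.
                .collect(Collectors.groupingBy(e -> e.getKey(),
                        Collectors.summingDouble(e -> e.getValue())));
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(8.0, "y11", "y12"),
                new Item(12.0, "y21", "y11", "y31"),
                new Item(24.0, "y31", "y11", "y32", "y33"));
        // TreeMap only to print the keys in a stable order.
        System.out.println(new TreeMap<>(sumSharesPerName(items)));
        // {y11=14.0, y12=4.0, y21=4.0, y31=10.0, y32=6.0, y33=6.0}
    }
}
```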

In my free StreamEx library which extends standard Stream API there are special operations which help building such complex maps. Using the StreamEx your problem can be solved like this:
Map<String, Double> x = StreamEx.of(createAnExample())
.mapToEntry(item -> item.names, item -> item.val / item.names.size())
.flatMapKeys(List::stream)
.grouping(Collectors.summingDouble(v -> v));
Here mapToEntry creates stream of map entries (so-called EntryStream) where keys are lists of names and values are averaged vals. Next we use flatMapKeys to flatten the keys leaving values as is (so we have stream of Entry<String, Double>). Finally we group them together summing the values for repeating keys.

What's the purpose of partitioningBy

For example, if I intend to partition some elements, I could do something like:
Stream.of("I", "Love", "Stack Overflow")
.collect(Collectors.partitioningBy(s -> s.length() > 3))
.forEach((k, v) -> System.out.println(k + " => " + v));
which outputs:
false => [I]
true => [Love, Stack Overflow]
But to me partitioningBy is only a special case of groupingBy. Although the former accepts a Predicate as its parameter and the latter a Function, I just see a partition as a normal grouping function.
So the same code does exactly the same thing:
Stream.of("I", "Love", "Stack Overflow")
.collect(Collectors.groupingBy(s -> s.length() > 3))
.forEach((k, v) -> System.out.println(k + " => " + v));
which also results in a Map<Boolean, List<String>>.
So is there any reason I should use partitioningBy instead of groupingBy? Thanks

partitioningBy will always return a map with two entries, one for where the predicate is true and one for where it is false.
It is possible that both entries will have empty lists, but they will exist.
That's something that groupingBy will not do, since it only creates entries when they are needed.
At the extreme case, if you send an empty stream to partitioningBy you will still get two entries in the map whereas groupingBy will return an empty map.
EDIT: As mentioned below this behavior is not mentioned in the Java docs, however changing it would take away the added value partitioningBy is currently providing. For Java 9 this is already in the specs.
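The empty-stream case is easy to try out. A minimal sketch (the class and method names are made up for illustration):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class EmptyStreamCollectors {
    static Map<Boolean, List<String>> partitionEmpty() {
        return Stream.<String>empty()
                .collect(Collectors.partitioningBy(s -> s.length() > 3));
    }

    static Map<Boolean, List<String>> groupEmpty() {
        return Stream.<String>empty()
                .collect(Collectors.groupingBy(s -> s.length() > 3));
    }

    public static void main(String[] args) {
        System.out.println(partitionEmpty()); // {false=[], true=[]}
        System.out.println(groupEmpty());     // {}
    }
}
```

With partitioningBy, map.get(true) and map.get(false) can therefore be used without a null check, whereas with groupingBy either lookup may return null.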

partitioningBy is slightly more efficient, using a special Map implementation optimized for when the key is just a boolean.
(It might also help to clarify what you mean; partitioningBy helps to effectively get across that there's a boolean condition being used to partition the data.)

partitioningBy method will return a map whose key is always a Boolean value, but in case of groupingBy method, the key can be of any Object type
//groupingBy
Map<Object, List<Person>> list2 = new HashMap<Object, List<Person>>();
list2 = list.stream().collect(Collectors.groupingBy(p->p.getAge()==22));
System.out.println("grouping by age -> " + list2);
//partitioningBy
Map<Boolean, List<Person>> list3 = new HashMap<Boolean, List<Person>>();
list3 = list.stream().collect(Collectors.partitioningBy(p->p.getAge()==22));
System.out.println("partitioning by age -> " + list3);
As you can see, the key for map in case of partitioningBy method is always a Boolean value, but in case of groupingBy method, the key is Object type
Detailed code is as follows:
class Person {
String name;
int age;
Person(String name, int age) {
this.name = name;
this.age = age;
}
public String getName() {
return name;
}
public int getAge() {
return age;
}
public String toString() {
return this.name;
}
}
public class CollectorAndCollectPrac {
public static void main(String[] args) {
Person p1 = new Person("Kosa", 21);
Person p2 = new Person("Saosa", 21);
Person p3 = new Person("Tiuosa", 22);
Person p4 = new Person("Komani", 22);
Person p5 = new Person("Kannin", 25);
Person p6 = new Person("Kannin", 25);
Person p7 = new Person("Tiuosa", 22);
ArrayList<Person> list = new ArrayList<>();
list.add(p1);
list.add(p2);
list.add(p3);
list.add(p4);
list.add(p5);
list.add(p6);
list.add(p7);
// groupingBy
Map<Object, List<Person>> list2 = new HashMap<Object, List<Person>>();
list2 = list.stream().collect(Collectors.groupingBy(p -> p.getAge() == 22));
System.out.println("grouping by age -> " + list2);
// partitioningBy
Map<Boolean, List<Person>> list3 = new HashMap<Boolean, List<Person>>();
list3 = list.stream().collect(Collectors.partitioningBy(p -> p.getAge() == 22));
System.out.println("partitioning by age -> " + list3);
}
}

Another difference between groupingBy and partitioningBy is that the former takes a Function<? super T, ? extends K> and the latter a Predicate<? super T>.
When you pass a method reference or a lambda expression, such as s -> s.length() > 3, they can be used by either of these two methods (the compiler will infer the functional interface type based on the type required by the method you choose).
However, if you have a Predicate<T> instance, you can only pass it to Collectors.partitioningBy(). It won't be accepted by Collectors.groupingBy().
And similarly, if you have a Function<T,Boolean> instance, you can only pass it to Collectors.groupingBy(). It won't be accepted by Collectors.partitioningBy().
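If you are stuck with an instance of the "wrong" interface, a method reference can adapt it. A minimal sketch of all four combinations; the names IS_LONG, IS_LONG_FN and PredicateVersusFunction are made up for illustration:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PredicateVersusFunction {
    static final Predicate<String> IS_LONG = s -> s.length() > 3;
    static final Function<String, Boolean> IS_LONG_FN = s -> s.length() > 3;

    // A Predicate instance fits partitioningBy directly.
    static Map<Boolean, List<String>> partitionWithPredicate(Stream<String> words) {
        return words.collect(Collectors.partitioningBy(IS_LONG));
    }

    // A Function instance fits groupingBy directly.
    static Map<Boolean, List<String>> groupWithFunction(Stream<String> words) {
        return words.collect(Collectors.groupingBy(IS_LONG_FN));
    }

    // IS_LONG::test is a method reference, so it can target Function.
    static Map<Boolean, List<String>> groupWithPredicate(Stream<String> words) {
        return words.collect(Collectors.groupingBy(IS_LONG::test));
    }

    // IS_LONG_FN::apply is a method reference, so it can target Predicate.
    static Map<Boolean, List<String>> partitionWithFunction(Stream<String> words) {
        return words.collect(Collectors.partitioningBy(IS_LONG_FN::apply));
    }
}
```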

As denoted by the other answers, segregating a collection into two groups is useful in some scenarios. As these two partitions would always exist, it would be easier to utilize it further. In JDK, to segregate all the class files and config files, partitioningBy is used.
private static final String SERVICES_PREFIX = "META-INF/services/";
// scan the names of the entries in the JAR file
Map<Boolean, Set<String>> map = jf.versionedStream()
.filter(e -> !e.isDirectory())
.map(JarEntry::getName)
.filter(e -> (e.endsWith(".class") ^ e.startsWith(SERVICES_PREFIX)))
.collect(Collectors.partitioningBy(e -> e.startsWith(SERVICES_PREFIX),
Collectors.toSet()));
Set<String> classFiles = map.get(Boolean.FALSE);
Set<String> configFiles = map.get(Boolean.TRUE);
Code snippet is from jdk.internal.module.ModulePath#deriveModuleDescriptor

Java 8 Distinct by property

In Java 8 how can I filter a collection using the Stream API by checking the distinctness of a property of each object?
For example I have a list of Person object and I want to remove people with the same name,
persons.stream().distinct();
Will use the default equality check for a Person object, so I need something like,
persons.stream().distinct(p -> p.getName());
Unfortunately the distinct() method has no such overload. Without modifying the equality check inside the Person class is it possible to do this succinctly?

Consider distinct to be a stateful filter. Here is a function that returns a predicate that maintains state about what it's seen previously, and that returns whether the given element was seen for the first time:
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
Set<Object> seen = ConcurrentHashMap.newKeySet();
return t -> seen.add(keyExtractor.apply(t));
}
Then you can write:
persons.stream().filter(distinctByKey(Person::getName))
Note that if the stream is ordered and is run in parallel, this will preserve an arbitrary element from among the duplicates, instead of the first one, as distinct() does.
(This is essentially the same as my answer to this question: Java Lambda Stream Distinct() on arbitrary key?)
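To see the predicate in action without a Person class, here is a small self-contained sketch that keeps the first word of each length on a sequential stream. DistinctByKeyDemo and firstWordPerLength are made-up names; distinctByKey is copied from above:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByKeyDemo {
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        // Set.add returns true only the first time a key is added.
        return t -> seen.add(keyExtractor.apply(t));
    }

    // Keeps the first word encountered for each distinct length.
    static List<String> firstWordPerLength(List<String> words) {
        return words.stream()
                // A fresh predicate per pipeline: the returned predicate carries state.
                .filter(distinctByKey(String::length))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ant", "bee", "wasp", "fly", "moth");
        System.out.println(firstWordPerLength(words)); // [ant, wasp]
    }
}
```

Because the returned predicate remembers every key it has seen, it should be created anew for each stream rather than stored and reused.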

An alternative would be to place the persons in a map using the name as a key:
persons.stream().collect(Collectors.toMap(Person::getName, p -> p, (p, q) -> p)).values();
Note that the Person that is kept, in case of a duplicate name, will be the first encountered.
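The three-argument toMap makes no promise about the iteration order of the resulting map, so values() may come back in a different order than the input. If the order matters, the four-argument overload with LinkedHashMap::new keeps first-encounter order. A minimal sketch, with a hypothetical Person class and made-up names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapDistinctDemo {
    // Hypothetical minimal Person, only for this sketch.
    static class Person {
        final int id;
        final String name;
        Person(int id, String name) { this.id = id; this.name = name; }
        String getName() { return name; }
        @Override public String toString() { return id + ":" + name; }
    }

    static List<Person> distinctByName(List<Person> persons) {
        return new ArrayList<>(persons.stream()
                .collect(Collectors.toMap(
                        Person::getName,           // key: the property to be distinct by
                        Function.identity(),       // value: the person itself
                        (first, second) -> first,  // on a duplicate key keep the earlier one
                        LinkedHashMap::new))       // preserves first-encounter order
                .values());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person(1, "Zoe"), new Person(2, "Adam"), new Person(3, "Zoe"));
        System.out.println(distinctByName(persons)); // [1:Zoe, 2:Adam]
    }
}
```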

You can wrap the person objects into another class, that only compares the names of the persons. Afterward, you unwrap the wrapped objects to get a person stream again. The stream operations might look as follows:
persons.stream()
.map(Wrapper::new)
.distinct()
.map(Wrapper::unwrap)
...;
The class Wrapper might look as follows:
class Wrapper {
private final Person person;
public Wrapper(Person person) {
this.person = person;
}
public Person unwrap() {
return person;
}
public boolean equals(Object other) {
if (other instanceof Wrapper) {
return ((Wrapper) other).person.getName().equals(person.getName());
} else {
return false;
}
}
public int hashCode() {
return person.getName().hashCode();
}
}

Another solution, using Set. May not be the ideal solution, but it works
Set<String> set = new HashSet<>(persons.size());
persons.stream().filter(p -> set.add(p.getName())).collect(Collectors.toList());
Or if you can modify the original list, you can use removeIf method
persons.removeIf(p -> !set.add(p.getName()));

There's a simpler approach using a TreeSet with a custom comparator.
persons.stream()
.collect(Collectors.toCollection(
() -> new TreeSet<Person>((p1, p2) -> p1.getName().compareTo(p2.getName()))
));
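Two things to keep in mind with the TreeSet approach: the result is sorted by name rather than kept in input order, and of several persons with the same name the first one added is retained. A minimal sketch with a hypothetical Person class (all names here are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

class TreeSetDistinctDemo {
    // Hypothetical minimal Person, only for this sketch.
    static class Person {
        final int id;
        final String name;
        Person(int id, String name) { this.id = id; this.name = name; }
        String getName() { return name; }
        @Override public String toString() { return id + ":" + name; }
    }

    static List<Person> distinctByName(List<Person> persons) {
        // The comparator decides equality: two persons with the same name count as one element.
        TreeSet<Person> byName = persons.stream()
                .collect(Collectors.toCollection(
                        () -> new TreeSet<Person>(Comparator.comparing(Person::getName))));
        return new ArrayList<>(byName);
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person(1, "Bob"), new Person(2, "Alice"), new Person(3, "Bob"));
        System.out.println(distinctByName(persons)); // [2:Alice, 1:Bob]
    }
}
```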

We can also use RxJava (very powerful reactive extension library)
Observable.from(persons).distinct(Person::getName)
or
Observable.from(persons).distinct(p -> p.getName())

You can use groupingBy collector:
persons.stream().collect(Collectors.groupingBy(p -> p.getName())).values().forEach(t -> System.out.println(t.get(0).getId()));
If you want to have another stream you can use this:
persons.stream().collect(Collectors.groupingBy(p -> p.getName())).values().stream().map(l -> (l.get(0)));

You can use the distinct(HashingStrategy) method in Eclipse Collections.
List<Person> persons = ...;
MutableList<Person> distinct =
ListIterate.distinct(persons, HashingStrategies.fromFunction(Person::getName));
If you can refactor persons to implement an Eclipse Collections interface, you can call the method directly on the list.
MutableList<Person> persons = ...;
MutableList<Person> distinct =
persons.distinct(HashingStrategies.fromFunction(Person::getName));
HashingStrategy is simply a strategy interface that allows you to define custom implementations of equals and hashcode.
public interface HashingStrategy<E>
{
int computeHashCode(E object);
boolean equals(E object1, E object2);
}
Note: I am a committer for Eclipse Collections.

A similar approach to the one Saeed Zarinfam used, but in a more Java 8 style :)
persons.stream().collect(Collectors.groupingBy(p -> p.getName())).values().stream()
.map(plans -> plans.stream().findFirst().get())
.collect(toList());

You can use StreamEx library:
StreamEx.of(persons)
.distinct(Person::getName)
.toList()

I recommend using Vavr, if you can. With this library you can do the following:
io.vavr.collection.List.ofAll(persons)
.distinctBy(Person::getName)
.toJavaSet() // or any another Java 8 Collection

Extending Stuart Marks's answer, this can be done in a shorter way and without a concurrent map (if you don't need parallel streams):
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
final Set<Object> seen = new HashSet<>();
return t -> seen.add(keyExtractor.apply(t));
}
Then call:
persons.stream().filter(distinctByKey(p -> p.getName()));

My approach to this is to group all the objects with same property together, then cut short the groups to size of 1 and then finally collect them as a List.
List<YourPersonClass> listWithDistinctPersons = persons.stream()
//operators to remove duplicates based on person name
.collect(Collectors.groupingBy(p -> p.getName()))
.values()
.stream()
//cut short the groups to size of 1
.flatMap(group -> group.stream().limit(1))
//collect distinct users as list
.collect(Collectors.toList());

Distinct objects list can be found using:
List<Person> distinctPersons = persons.stream()
.collect(Collectors.collectingAndThen(
Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(Person::getName))),
ArrayList::new));

I made a generic version:
private <T, R> Collector<T, ?, Stream<T>> distinctByKey(Function<T, R> keyExtractor) {
return Collectors.collectingAndThen(
toMap(
keyExtractor,
t -> t,
(t1, t2) -> t1
),
(Map<R, T> map) -> map.values().stream()
);
}
An example:
Stream.of(new Person("Jean"),
new Person("Jean"),
new Person("Paul")
)
.filter(...)
.collect(distinctByKey(Person::getName)) // return a stream of Person with 2 elements, jean and Paul
.map(...)
.collect(toList())

Another library that supports this is jOOλ, and its Seq.distinct(Function<T,U>) method:
Seq.seq(persons).distinct(Person::getName).toList();
Under the hood, it does practically the same thing as the accepted answer, though.

Set<YourPropertyType> set = new HashSet<>();
list
.stream()
.filter(it -> set.add(it.getYourProperty()))
.forEach(it -> ...);

While the highest-upvoted answer is arguably the best answer with respect to Java 8, it is at the same time the worst in terms of performance. If you really want a poorly performing application, then go ahead and use it. The simple requirement of extracting a unique set of person names can be met with a mere for-each loop and a Set.
Things get even worse if the list has more than 10 elements.
Consider a collection of 20 objects, like this:
public static final List<SimpleEvent> testList = Arrays.asList(
new SimpleEvent("Tom"), new SimpleEvent("Dick"),new SimpleEvent("Harry"),new SimpleEvent("Tom"),
new SimpleEvent("Dick"),new SimpleEvent("Huckle"),new SimpleEvent("Berry"),new SimpleEvent("Tom"),
new SimpleEvent("Dick"),new SimpleEvent("Moses"),new SimpleEvent("Chiku"),new SimpleEvent("Cherry"),
new SimpleEvent("Roses"),new SimpleEvent("Moses"),new SimpleEvent("Chiku"),new SimpleEvent("gotya"),
new SimpleEvent("Gotye"),new SimpleEvent("Nibble"),new SimpleEvent("Berry"),new SimpleEvent("Jibble"));
where your SimpleEvent class looks like this:
public class SimpleEvent {
private String name;
private String type;
public SimpleEvent(String name) {
this.name = name;
this.type = "type_"+name;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getType() {
return type;
}
public void setType(String type) {
this.type = type;
}
}
And to test, you have JMH code like this (please note, I'm using the same distinctByKey predicate mentioned in the accepted answer):
@Benchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public void aStreamBasedUniqueSet(Blackhole blackhole) throws Exception{
Set<String> uniqueNames = testList
.stream()
.filter(distinctByKey(SimpleEvent::getName))
.map(SimpleEvent::getName)
.collect(Collectors.toSet());
blackhole.consume(uniqueNames);
}
@Benchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public void aForEachBasedUniqueSet(Blackhole blackhole) throws Exception{
Set<String> uniqueNames = new HashSet<>();
for (SimpleEvent event : testList) {
uniqueNames.add(event.getName());
}
blackhole.consume(uniqueNames);
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(MyBenchmark.class.getSimpleName())
.forks(1)
.mode(Mode.Throughput)
.warmupBatchSize(3)
.warmupIterations(3)
.measurementIterations(3)
.build();
new Runner(opt).run();
}
Then you'll have Benchmark results like this:
Benchmark                                Mode   Samples  Score        Score error  Units
c.s.MyBenchmark.aForEachBasedUniqueSet   thrpt  3        2635199.952  1663320.718  ops/s
c.s.MyBenchmark.aStreamBasedUniqueSet    thrpt  3        729134.695   895825.697   ops/s
As you can see, the simple for-each loop achieves roughly 3.6 times the throughput of the Java 8 stream version, and its error is smaller relative to its score.
The higher the throughput, the better the performance.
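If only the set of names is needed, the stream does not need the distinctByKey filter at all, because collecting into a Set already drops duplicates. The following is a minimal sketch of both variants, assuming a cut-down SimpleEvent that only carries a name; the class name UniqueNames and the method names viaStream and viaLoop are made up for illustration, and no performance claim is attached to it:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class UniqueNames {
    // Cut-down stand-in for SimpleEvent (assumption: only the name matters here).
    static class SimpleEvent {
        private final String name;
        SimpleEvent(String name) { this.name = name; }
        String getName() { return name; }
    }

    // Stream version without a stateful filter: the Set removes duplicates itself.
    static Set<String> viaStream(List<SimpleEvent> events) {
        return events.stream()
                .map(SimpleEvent::getName)
                .collect(Collectors.toSet());
    }

    // Plain loop version, equivalent to aForEachBasedUniqueSet above.
    static Set<String> viaLoop(List<SimpleEvent> events) {
        Set<String> names = new HashSet<>();
        for (SimpleEvent event : events) {
            names.add(event.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<SimpleEvent> events = Arrays.asList(
                new SimpleEvent("Tom"), new SimpleEvent("Dick"), new SimpleEvent("Tom"));
        // Both approaches produce the same set of names.
        System.out.println(viaStream(events).equals(viaLoop(events)));
    }
}
```

Whether the stream variant closes the gap to the loop would have to be measured with the same JMH setup.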

I would like to improve Stuart Marks' answer. If the key is null, it will throw a NullPointerException. Here I ignore null keys by adding one more check, keyExtractor.apply(t) != null, so elements with a null key are filtered out.
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
Set<Object> seen = ConcurrentHashMap.newKeySet();
return t -> keyExtractor.apply(t)!=null && seen.add(keyExtractor.apply(t));
}
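A self-contained sketch of this null-skipping variant follows. It differs from the snippet above in one detail: the key extractor is evaluated only once per element. The class name, the helper distinctByFirstLetter, and the "first letter" key are assumptions chosen only to have a key that can be null:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class DistinctSkippingNullKeys {
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> {
            Object key = keyExtractor.apply(t);  // evaluate the extractor once
            return key != null && seen.add(key); // elements with a null key are dropped
        };
    }

    // Hypothetical key: the first character, or null for an empty string.
    static List<String> distinctByFirstLetter(List<String> words) {
        return words.stream()
                .filter(distinctByKey(w -> w.isEmpty() ? null : w.charAt(0)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(distinctByFirstLetter(
                Arrays.asList("apple", "", "avocado", "banana", "")));
        // prints [apple, banana]
    }
}
```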

This works like a charm:
Group the data by the unique key to form a map.
Return the first object from each value of the map (there could be multiple persons with the same name).
persons.stream()
.collect(groupingBy(Person::getName))
.values()
.stream()
.flatMap(values -> values.stream().limit(1))
.collect(toList());
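Note that groupingBy with its default map gives no guarantee about the order of the groups. If the first-seen order of names should be kept, a LinkedHashMap can be passed as the map factory. A minimal sketch, assuming a simple Person class with a name and an age (both the class and the method name firstPerName are illustrative only):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.stream.Collectors;

public class FirstPerGroup {
    // Hypothetical Person class for the example.
    static class Person {
        private final String name;
        private final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    static List<Person> firstPerName(List<Person> persons) {
        return persons.stream()
                // LinkedHashMap keeps the groups in first-seen order of their keys
                .collect(Collectors.groupingBy(Person::getName, LinkedHashMap::new, Collectors.toList()))
                .values().stream()
                // groups created by groupingBy are never empty, so get(0) is safe
                .map(group -> group.get(0))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> result = firstPerName(Arrays.asList(
                new Person("Ann", 30), new Person("Bob", 25), new Person("Ann", 41)));
        result.forEach(p -> System.out.println(p.getName() + " " + p.getAge()));
        // prints "Ann 30" then "Bob 25"
    }
}
```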

The easiest way to implement this is to build on the sort feature, as it already accepts an optional Comparator, which can be created from an element’s property. Then you have to filter out duplicates, which can be done with a stateful Predicate that relies on the fact that in a sorted stream all equal elements are adjacent:
Comparator<Person> c=Comparator.comparing(Person::getName);
stream.sorted(c).filter(new Predicate<Person>() {
Person previous;
public boolean test(Person p) {
if(previous!=null && c.compare(previous, p)==0)
return false;
previous=p;
return true;
}
})./* more stream operations here */;
Of course, a stateful Predicate is not thread-safe; if you need that, you can move this logic into a Collector and let the stream take care of thread-safety when using your Collector. This depends on what you want to do with the stream of distinct elements, which you didn’t tell us in your question.

There are a lot of approaches; this one will also help. It is simple, clean, and clear:
List<Employee> employees = new ArrayList<>();
employees.add(new Employee(11, "Ravi"));
employees.add(new Employee(12, "Stalin"));
employees.add(new Employee(23, "Anbu"));
employees.add(new Employee(24, "Yuvaraj"));
employees.add(new Employee(35, "Sena"));
employees.add(new Employee(36, "Antony"));
employees.add(new Employee(47, "Sena"));
employees.add(new Employee(48, "Ravi"));
List<Employee> empList = new ArrayList<>(employees.stream().collect(
Collectors.toMap(Employee::getName, obj -> obj,
(existingValue, newValue) -> existingValue))
.values());
empList.forEach(System.out::println);
// Collectors.toMap(
// Employee::getName, - key (the value by which you want to eliminate duplicate)
// obj -> obj, - value (entire employee object)
// (existingValue, newValue) -> existingValue) - to avoid IllegalStateException: Duplicate key
Output (toString() overridden):
Employee{id=35, name='Sena'}
Employee{id=12, name='Stalin'}
Employee{id=11, name='Ravi'}
Employee{id=24, name='Yuvaraj'}
Employee{id=36, name='Antony'}
Employee{id=23, name='Anbu'}
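The output above is not in insertion order because the three-argument toMap collects into a HashMap. If the original order should be preserved, the four-argument overload accepts a map supplier such as LinkedHashMap::new. A minimal sketch, assuming an Employee class with an id and a name as in the listing (the class name DistinctByNameOrdered is made up for this example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DistinctByNameOrdered {
    // Assumed shape of the Employee class used above.
    static class Employee {
        private final int id;
        private final String name;
        Employee(int id, String name) { this.id = id; this.name = name; }
        int getId() { return id; }
        String getName() { return name; }
        @Override
        public String toString() { return "Employee{id=" + id + ", name='" + name + "'}"; }
    }

    static List<Employee> distinctByName(List<Employee> employees) {
        return new ArrayList<>(employees.stream().collect(
                Collectors.toMap(
                        Employee::getName,                       // key to de-duplicate on
                        Function.identity(),                     // keep the whole employee
                        (existing, replacement) -> existing,     // first occurrence wins
                        LinkedHashMap::new))                     // preserves encounter order
                .values());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee(11, "Ravi"), new Employee(12, "Stalin"),
                new Employee(23, "Anbu"), new Employee(24, "Yuvaraj"),
                new Employee(35, "Sena"), new Employee(36, "Antony"),
                new Employee(47, "Sena"), new Employee(48, "Ravi"));
        // Prints the employees with ids 11, 12, 23, 24, 35, 36, in that order.
        distinctByName(employees).forEach(System.out::println);
    }
}
```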

Here is the example
public class PayRoll {
private int payRollId;
private int id;
private String name;
private String dept;
private int salary;
public PayRoll(int payRollId, int id, String name, String dept, int salary) {
super();
this.payRollId = payRollId;
this.id = id;
this.name = name;
this.dept = dept;
this.salary = salary;
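}
// Getters and toString() used by the grouping example below.
public int getPayRollId() { return payRollId; }
public int getId() { return id; }
public String getDept() { return dept; }
@Override
public String toString() {
return "PayRoll [payRollId=" + payRollId + ", id=" + id + ", name=" + name + ", dept=" + dept + ", salary=" + salary + "]";
}
}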
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collector;
import java.util.stream.Collectors;
public class Prac {
public static void main(String[] args) {
int salary=70000;
PayRoll payRoll=new PayRoll(1311, 1, "A", "HR", salary);
PayRoll payRoll2=new PayRoll(1411, 2 , "B", "Technical", salary);
PayRoll payRoll3=new PayRoll(1511, 1, "C", "HR", salary);
PayRoll payRoll4=new PayRoll(1611, 1, "D", "Technical", salary);
PayRoll payRoll5=new PayRoll(711, 3,"E", "Technical", salary);
PayRoll payRoll6=new PayRoll(1811, 3, "F", "Technical", salary);
List<PayRoll>list=new ArrayList<PayRoll>();
list.add(payRoll);
list.add(payRoll2);
list.add(payRoll3);
list.add(payRoll4);
list.add(payRoll5);
list.add(payRoll6);
Map<Object, Optional<PayRoll>> k = list.stream().collect(Collectors.groupingBy(p->p.getId()+"|"+p.getDept(),Collectors.maxBy(Comparator.comparingInt(PayRoll::getPayRollId))));
k.entrySet().forEach(p->
{
if(p.getValue().isPresent())
{
System.out.println(p.getValue().get());
}
});
}
}
Output:
PayRoll [payRollId=1611, id=1, name=D, dept=Technical, salary=70000]
PayRoll [payRollId=1811, id=3, name=F, dept=Technical, salary=70000]
PayRoll [payRollId=1411, id=2, name=B, dept=Technical, salary=70000]
PayRoll [payRollId=1511, id=1, name=C, dept=HR, salary=70000]

Late to the party but I sometimes use this one-liner as an equivalent:
((Function<Value, Key>) Value::getKey).andThen(new HashSet<>()::add)::apply
The expression is a Predicate<Value>, and since the key mapping is inline, it can be passed directly to filter. This is of course less readable, but it can sometimes be helpful for avoiding a separate helper method.
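To make the one-liner concrete, here is a small sketch that filters strings so that only the first string of each length is kept. The types String and Integer stand in for Value and Key, the HashSet is given an explicit type argument for clarity, and the class and method names are assumptions for the example. Because a plain HashSet is used, this form is only suitable for sequential streams:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class InlineDistinct {
    static List<String> distinctByLength(List<String> words) {
        return words.stream()
                // The HashSet is created once, when the method reference is evaluated;
                // add() returns false for keys that were already seen.
                .filter(((Function<String, Integer>) String::length)
                        .andThen(new HashSet<Integer>()::add)::apply)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(distinctByLength(Arrays.asList("a", "b", "bb", "aa", "ccc")));
        // prints [a, bb, ccc]
    }
}
```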

Building on @josketres's answer, I created a generic utility method:
You could make this more Java 8-friendly by creating a Collector.
public static <T> Set<T> removeDuplicates(Collection<T> input, Comparator<T> comparer) {
return input.stream()
.collect(toCollection(() -> new TreeSet<>(comparer)));
}
@Test
public void removeDuplicatesWithDuplicates() {
ArrayList<C> input = new ArrayList<>();
Collections.addAll(input, new C(7), new C(42), new C(42));
Collection<C> result = removeDuplicates(input, (c1, c2) -> Integer.compare(c1.value, c2.value));
assertEquals(2, result.size());
assertTrue(result.stream().anyMatch(c -> c.value == 7));
assertTrue(result.stream().anyMatch(c -> c.value == 42));
}
@Test
public void removeDuplicatesWithoutDuplicates() {
ArrayList<C> input = new ArrayList<>();
Collections.addAll(input, new C(1), new C(2), new C(3));
Collection<C> result = removeDuplicates(input, (t1, t2) -> Integer.compare(t1.value, t2.value));
assertEquals(3, result.size());
assertTrue(result.stream().anyMatch(c -> c.value == 1));
assertTrue(result.stream().anyMatch(c -> c.value == 2));
assertTrue(result.stream().anyMatch(c -> c.value == 3));
}
private class C {
public final int value;
private C(int value) {
this.value = value;
}
}

Maybe this will be useful for somebody. I had a slightly different requirement: given a list of objects A from a third party, remove all elements that have the same A.b field for the same A.id (there are multiple A objects with the same A.id in the list). The stream partition answer by Tagir Valeev inspired me to use a custom Collector that returns Map<A.id, List<A>>. A simple flatMap will do the rest.
public static <T, K, K2> Collector<T, ?, Map<K, List<T>>> groupingDistinctBy(Function<T, K> keyFunction, Function<T, K2> distinctFunction) {
return groupingBy(keyFunction, Collector.of((Supplier<Map<K2, T>>) HashMap::new,
(map, element) -> map.putIfAbsent(distinctFunction.apply(element), element),
(left, right) -> {
left.putAll(right);
return left;
}, map -> new ArrayList<>(map.values()),
Collector.Characteristics.UNORDERED)); }
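A possible way to use this collector, including the flatMap step mentioned above, is sketched below. The class A with the fields id and b is a hypothetical stand-in for the third-party class, and distinctBPerId is a name invented for the example. The collector is repeated so the sketch is self-contained. Since it is backed by a HashMap, the order of the resulting elements is not guaranteed:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class GroupingDistinctDemo {
    // Hypothetical stand-in for the third-party class A.
    static class A {
        private final int id;
        private final String b;
        A(int id, String b) { this.id = id; this.b = b; }
        int getId() { return id; }
        String getB() { return b; }
    }

    // Groups by keyFunction and keeps one element per distinctFunction value in each group.
    static <T, K, K2> Collector<T, ?, Map<K, List<T>>> groupingDistinctBy(
            Function<T, K> keyFunction, Function<T, K2> distinctFunction) {
        return Collectors.groupingBy(keyFunction, Collector.of(
                (Supplier<Map<K2, T>>) HashMap::new,
                (map, element) -> map.putIfAbsent(distinctFunction.apply(element), element),
                (left, right) -> {
                    left.putAll(right);
                    return left;
                },
                map -> new ArrayList<>(map.values()),
                Collector.Characteristics.UNORDERED));
    }

    // Flattens the grouped result back into a single list.
    static List<A> distinctBPerId(List<A> input) {
        return input.stream()
                .collect(groupingDistinctBy(A::getId, A::getB))
                .values().stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<A> input = Arrays.asList(new A(1, "x"), new A(1, "x"), new A(1, "y"), new A(2, "x"));
        // One of the two (1, "x") entries is dropped, so three elements remain.
        System.out.println(distinctBPerId(input).size());
    }
}
```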

I had a situation where I was supposed to get distinct elements from a list based on two keys.
If you want distinct elements based on two keys, or a composite key, try this:
class Person {
int rollno;
String name;
int getRollno() { return rollno; }
String getName() { return name; }
}
List<Person> personList;
// Composite key: a List of both properties, which has value-based equals() and hashCode().
Function<Person, List<Object>> compositeKey = person ->
Arrays.<Object>asList(person.getName(), person.getRollno());
Map<List<Object>, List<Person>> map = personList.stream().collect(Collectors.groupingBy(compositeKey, Collectors.toList()));
// Groups that contain more than one person are the duplicates.
List<Map.Entry<List<Object>, List<Person>>> duplicateEntries = map.entrySet().stream()
.filter(entry -> entry.getValue().size() > 1)
.collect(Collectors.toList());
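The snippet above collects the groups that contain duplicates. To obtain the distinct elements themselves, one option is to keep the first person seen for each composite key. The following is a minimal sketch under that assumption, using a plain loop and a LinkedHashMap so that the original order is kept; the class name and the method distinctByNameAndRollno are illustrative only:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DistinctByCompositeKey {
    // Assumed shape of Person, with getters for both key parts.
    static class Person {
        private final int rollno;
        private final String name;
        Person(int rollno, String name) { this.rollno = rollno; this.name = name; }
        int getRollno() { return rollno; }
        String getName() { return name; }
    }

    static List<Person> distinctByNameAndRollno(List<Person> persons) {
        Map<List<Object>, Person> firstPerKey = new LinkedHashMap<>();
        for (Person p : persons) {
            // Lists compare by content, so equal (name, rollno) pairs map to the same key.
            List<Object> key = Arrays.<Object>asList(p.getName(), p.getRollno());
            firstPerKey.putIfAbsent(key, p); // keep only the first person per key
        }
        return new ArrayList<>(firstPerKey.values());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person(1, "Ann"), new Person(2, "Ann"), new Person(1, "Ann"), new Person(1, "Bob"));
        for (Person p : distinctByNameAndRollno(persons)) {
            System.out.println(p.getName() + " " + p.getRollno());
        }
        // prints "Ann 1", "Ann 2", "Bob 1"
    }
}
```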

A variation of the top answer that handles null:
public static <T, K> Predicate<T> distinctBy(final Function<? super T, K> getKey) {
val seen = ConcurrentHashMap.<Optional<K>>newKeySet();
return obj -> seen.add(Optional.ofNullable(getKey.apply(obj)));
}
In my tests:
assertEquals(
asList("a", "bb"),
Stream.of("a", "b", "bb", "aa").filter(distinctBy(String::length)).collect(toList()));
assertEquals(
asList(5, null, 2, 3),
Stream.of(5, null, 2, null, 3, 3, 2).filter(distinctBy(x -> x)).collect(toList()));
val maps = asList(
hashMapWith(0, 2),
hashMapWith(1, 2),
hashMapWith(2, null),
hashMapWith(3, 1),
hashMapWith(4, null),
hashMapWith(5, 2));
assertEquals(
asList(0, 2, 3),
maps.stream()
.filter(distinctBy(m -> m.get("val")))
.map(m -> m.get("i"))
.collect(toList()));

In my case I needed to compare each element against the previous one. I created a stateful Predicate that checks whether the current element differs from the previous element and, if so, keeps it.
public List<Log> fetchLogById(Long id) {
return this.findLogById(id).stream()
.filter(new LogPredicate())
.collect(Collectors.toList());
}
public class LogPredicate implements Predicate<Log> {
private Log previous;
public boolean test(Log current) {
boolean isDifferent = previous == null || verifyIfDifferentLog(current, previous);
if (isDifferent) {
previous = current;
}
return isDifferent;
}
private boolean verifyIfDifferentLog(Log current, Log previous) {
return !current.getId().equals(previous.getId());
}
}

My solution in this listing:
List<HolderEntry> result ....
List<HolderEntry> dto3s = new ArrayList<>(result.stream().collect(toMap(
HolderEntry::getId,
holder -> holder, //or Function.identity() if you want
(holder1, holder2) -> holder1
)).values());
In my situation I wanted to find the distinct values and put them in a List.
