Java 8 Streams & lambdas maintaining strict FP

Java 8 Streams & lambdas maintaining strict FP - java

Java 8 lambdas are very useful in many situations to implement code in a FP fashion in a compact way.
But there are situations where we may have to access/mutate external state which is not a good practice as per FP practices.
(because Java 8 Functional interfaces have strict input and output signatures we can't pass extra arguments)
Eg:
class Country{
List<State> states;
}
class State{
BigInt population;
String capital;
}
class Main{
List<Country> countries;
//code to fill
}
Let's say the use case is to get list of all capitals and and the whole population of all states in all countries
Normal Implmentation:
List<String> capitals = new ArrayList<>();
BigInt population = new BigInt(0);
for(Country country:countries){
for(State state:states){
capitals.add(state.capital);
population.add(state.population)
}
}
How to implement the same with Java 8 Streams in a more optimized manner?
Stream<State> statesStream = countries.stream().flatMap(country->country.getStates());
capitals = statesStream.get().collect(toList());
population = statesStream.get().reduce((pop1,pop2) -> return pop1+pop2);
But the above Implementation is not very efficient.Any other better way to manipulate more than one collection using Java 8 Streams

If you want to collect multiple results in one pipeline you should create a result container and a custom Collector.
class MyResult {
private BigInteger population = BigInteger.ZERO;
private List<String> capitals = new ArrayList<>();
public void accumulate(State state) {
population = population.add(state.population);
capitals.add(state.capital);
}
public MyResult merge(MyResult other) {
population = population.add(other.population);
capitals.addAll(other.capitals);
return this;
}
}
MyResult result = countries.stream()
.flatMap(c -> c.getStates().stream())
.collect(Collector.of(MyResult::new, MyResult::accumulate, MyResult::merge));
BigInteger population = result.population;
List<String> capitals = result.capitals;
Or stream twice, as you did.

You can only consume a stream once, so you need to create an aggregate that can be reduced:
public class CapitalsAndPopulation {
private List<String> capitals;
private BigInt population;
// constructors and getters omitted for conciseness
public CapitalsAndPopulation merge(CapitalsAndPopulation other) {
return new CapitalsAndPopulation(
Lists.concat(this.capitals, other.capitals),
this.population + other.population);
}
}
Then you produce the pipeline:
countries.stream()
.flatMap(country->
country.getStates()
.stream())
.map(state -> new CapitalsAndPopulation(Collections.singletonList(state.getCapital()), state.population))
.reduce(CapitalsAndPopulation::merge);
The reason this looks so ugly is that you don't have nice syntax for structures like tuples or maps, so you need to create classes to make the pipelines look nice...

Try this.
class Pair<T, U> {
T first;
U second;
Pair(T first, U second) {
this.first = first;
this.second = second;
}
}
Pair<List<String>, BigInteger> result = countries.stream()
.flatMap(country -> country.states.stream())
.collect(() -> new Pair<>(
new ArrayList<>(),
BigInteger.ZERO
),
(acc, state) -> {
acc.first.add(state.capital);
acc.second = acc.second.add(state.population);
},
(a, b) -> {
a.first.addAll(b.first);
a.second = a.second.add(b.second);
});
You can use AbstractMap.Entry<K, V> instead of Pair<T, U>.
Entry<List<String>, BigInteger> result = countries.stream()
.flatMap(country -> country.states.stream())
.collect(() -> new AbstractMap.SimpleEntry<>(
new ArrayList<>(),
BigInteger.ZERO
),
(acc, state) -> {
acc.getKey().add(state.capital);
acc.setValue(acc.getValue().add(state.population));
},
(a, b) -> {
a.getKey().addAll(b.getKey());
a.setValue(a.getValue().add(b.getValue()));
});

Related

Sort list by multiple fields(not then compare) in java

Now I have an object:
public class Room{
private long roomId;
private long roomGroupId;
private String roomName;
... getter
... setter
}
I want sort list of rooms by 'roomId', but in the meantime while room objects has 'roomGroupId' greator than zero and has same value then make them close to each other.
Let me give you some example:
input:
[{"roomId":3,"roomGroupId":0},
{"roomId":6,"roomGroupId":0},
{"roomId":1,"roomGroupId":1},
{"roomId":2,"roomGroupId":0},
{"roomId":4,"roomGroupId":1}]
output:
[{"roomId":6,"roomGroupId":0},
{"roomId":4,"roomGroupId":1},
{"roomId":1,"roomGroupId":1},
{"roomId":3,"roomGroupId":0},
{"roomId":2,"roomGroupId":0}]
As shown above, the list sort by 'roomId', but 'roomId 4' and 'roomId 1' are close together, because they has the same roomGroupId.

This does not have easy nice solution (maybe I am wrong).
You can do this like this
TreeMap<Long, List<Room>> roomMap = new TreeMap<>();
rooms.stream()
.collect(Collectors.groupingBy(Room::getRoomGroupId))
.forEach((key, value) -> {
if (key.equals(0L)) {
value.forEach(room -> roomMap.put(room.getRoomId(), Arrays.asList(room)));
} else {
roomMap.put(
Collections.max(value, Comparator.comparing(Room::getRoomId))
.getRoomId(),
value
.stream()
.sorted(Comparator.comparing(Room::getRoomId)
.reversed())
.collect(Collectors.toList())
);
}
});
List<Room> result = roomMap.descendingMap()
.entrySet()
.stream()
.flatMap(entry -> entry.getValue()
.stream())
.collect(Collectors.toList());

If you're in Java 8, you can use code like this
Collections.sort(roomList, Comparator.comparing(Room::getRoomGroupId)
.thenComparing(Room::getRoomId));
If not, you should use a comparator
class SortRoom implements Comparator<Room>
{
public int compare(Room a, Room b)
{
if (a.getRoomGroupId().compareTo(b.getRoomGroupId()) == 0) {
return a.getRoomId().compareTo(b.getRoomId());
}
return a.getRoomGroupId().compareTo(b.getRoomGroupId();
}
}
and then use it like this
Collections.sort(roomList, new SortRoom());

Converting array iteration to lambda function using Java8

I am trying to convert to Lambda function
So far I am able to convert the above code to lambda function like as shown below
Stream.of(acceptedDetails, rejectedDetails)
.filter(list -> !isNull(list) && list.length > 0)
.forEach(new Consumer<Object>() {
public void accept(Object acceptedOrRejected) {
String id;
if(acceptedOrRejected instanceof EmployeeValidationAccepted) {
id = ((EmployeeValidationAccepted) acceptedOrRejected).getId();
} else {
id = ((EmployeeValidationRejected) acceptedOrRejected).getAd().getId();
}
if(acceptedOrRejected instanceof EmployeeValidationAccepted) {
dates1.add(new Integer(id.split("something")[1]));
Integer empId = Integer.valueOf(id.split("something")[2]);
empIds1.add(empId);
} else {
dates2.add(new Integer(id.split("something")[1]));
Integer empId = Integer.valueOf(id.split("something")[2]);
empIds2.add(empId);
}
}
});
But still my goal was to avoid repeating the same logic and also to convert to Lambda function, still in my converted lambda function I feel its not clean and efficient.
This is just for my learning aspect I am doing this stuff by taking one existing code snippet.
Can anyone please tell me how can I improvise the converted Lambda function

Generally, when you try to refactor code, you should only focus on the necessary changes.
Just because you’re going to use the Stream API, there is no reason to clutter the code with checks for null or empty arrays which weren’t in the loop based code. Neither should you change BigInteger to Integer.
Then, you have two different inputs and want to get distinct results from each of them, in other words, you have two entirely different operations. While it is reasonable to consider sharing common code between them, once you identified identical code, there is no sense in trying to express two entirely different operations as a single operation.
First, let’s see how we would do this for a traditional loop:
static void addToLists(String id, List<Integer> empIdList, List<BigInteger> dateList) {
String[] array = id.split("-");
dateList.add(new BigInteger(array[1]));
empIdList.add(Integer.valueOf(array[2]));
}
List<Integer> empIdAccepted = new ArrayList<>();
List<BigInteger> dateAccepted = new ArrayList<>();
for(EmployeeValidationAccepted acceptedDetail : acceptedDetails) {
addToLists(acceptedDetail.getId(), empIdAccepted, dateAccepted);
}
List<Integer> empIdRejected = new ArrayList<>();
List<BigInteger> dateRejected = new ArrayList<>();
for(EmployeeValidationRejected rejectedDetail : rejectedDetails) {
addToLists(rejectedDetail.getAd().getId(), empIdRejected, dateRejected);
}
If we want to express the same as Stream operations, there’s the obstacle of having two results per operation. It truly took until JDK 12 to get a built-in solution:
static Collector<String,?,Map.Entry<List<Integer>,List<BigInteger>>> idAndDate() {
return Collectors.mapping(s -> s.split("-"),
Collectors.teeing(
Collectors.mapping(a -> Integer.valueOf(a[2]), Collectors.toList()),
Collectors.mapping(a -> new BigInteger(a[1]), Collectors.toList()),
Map::entry));
}
Map.Entry<List<Integer>, List<BigInteger>> e;
e = Arrays.stream(acceptedDetails)
.map(EmployeeValidationAccepted::getId)
.collect(idAndDate());
List<Integer> empIdAccepted = e.getKey();
List<BigInteger> dateAccepted = e.getValue();
e = Arrays.stream(rejectedDetails)
.map(r -> r.getAd().getId())
.collect(idAndDate());
List<Integer> empIdRejected = e.getKey();
List<BigInteger> dateRejected = e.getValue();
Since a method can’t return two values, this uses a Map.Entry to hold them.
To use this solution with Java versions before JDK 12, you can use the implementation posted at the end of this answer. You’d also have to replace Map::entry with AbstractMap.SimpleImmutableEntry::new then.
Or you use a custom collector written for this specific operation:
static Collector<String,?,Map.Entry<List<Integer>,List<BigInteger>>> idAndDate() {
return Collector.of(
() -> new AbstractMap.SimpleImmutableEntry<>(new ArrayList<>(), new ArrayList<>()),
(e,id) -> {
String[] array = id.split("-");
e.getValue().add(new BigInteger(array[1]));
e.getKey().add(Integer.valueOf(array[2]));
},
(e1, e2) -> {
e1.getKey().addAll(e2.getKey());
e1.getValue().addAll(e2.getValue());
return e1;
});
}
In other words, using the Stream API does not always make the code simpler.
As a final note, we don’t need to use the Stream API to utilize lambda expressions. We can also use them to move the loop into the common code.
static <T> void addToLists(T[] elements, Function<T,String> tToId,
List<Integer> empIdList, List<BigInteger> dateList) {
for(T t: elements) {
String[] array = tToId.apply(t).split("-");
dateList.add(new BigInteger(array[1]));
empIdList.add(Integer.valueOf(array[2]));
}
}
List<Integer> empIdAccepted = new ArrayList<>();
List<BigInteger> dateAccepted = new ArrayList<>();
addToLists(acceptedDetails, EmployeeValidationAccepted::getId, empIdAccepted, dateAccepted);
List<Integer> empIdRejected = new ArrayList<>();
List<BigInteger> dateRejected = new ArrayList<>();
addToLists(rejectedDetails, r -> r.getAd().getId(), empIdRejected, dateRejected);

A similar approach as #roookeee already posted with but maybe slightly more concise would be to store the mappings using mapping functions declared as :
Function<String, Integer> extractEmployeeId = empId -> Integer.valueOf(empId.split("-")[2]);
Function<String, BigInteger> extractDate = empId -> new BigInteger(empId.split("-")[1]);
then proceed with mapping as:
Map<Integer, BigInteger> acceptedDetailMapping = Arrays.stream(acceptedDetails)
.collect(Collectors.toMap(a -> extractEmployeeId.apply(a.getId()),
a -> extractDate.apply(a.getId())));
Map<Integer, BigInteger> rejectedDetailMapping = Arrays.stream(rejectedDetails)
.collect(Collectors.toMap(a -> extractEmployeeId.apply(a.getAd().getId()),
a -> extractDate.apply(a.getAd().getId())));
Hereafter you can also access the date of acceptance or rejection corresponding to the employeeId of the employee as well.

How about this:
class EmployeeValidationResult {
//constructor + getters omitted for brevity
private final BigInteger date;
private final Integer employeeId;
}
List<EmployeeValidationResult> accepted = Stream.of(acceptedDetails)
.filter(Objects:nonNull)
.map(this::extractValidationResult)
.collect(Collectors.toList());
List<EmployeeValidationResult> rejected = Stream.of(rejectedDetails)
.filter(Objects:nonNull)
.map(this::extractValidationResult)
.collect(Collectors.toList());
EmployeeValidationResult extractValidationResult(EmployeeValidationAccepted accepted) {
return extractValidationResult(accepted.getId());
}
EmployeeValidationResult extractValidationResult(EmployeeValidationRejected rejected) {
return extractValidationResult(rejected.getAd().getId());
}
EmployeeValidationResult extractValidationResult(String id) {
String[] empIdList = id.split("-");
BigInteger date = extractDate(empIdList[1])
Integer empId = extractId(empIdList[2]);
return new EmployeeValidationResult(date, employeeId);
}
Repeating the filter or map operations is good style and explicit about what is happening. Merging the two lists of objects into one and using instanceof clutters the implementation and makes it less readable / maintainable.

Java - How to filter items from a list so that only one per element attribute is present? [duplicate]

In Java 8 how can I filter a collection using the Stream API by checking the distinctness of a property of each object?
For example I have a list of Person object and I want to remove people with the same name,
persons.stream().distinct();
Will use the default equality check for a Person object, so I need something like,
persons.stream().distinct(p -> p.getName());
Unfortunately the distinct() method has no such overload. Without modifying the equality check inside the Person class is it possible to do this succinctly?

Consider distinct to be a stateful filter. Here is a function that returns a predicate that maintains state about what it's seen previously, and that returns whether the given element was seen for the first time:
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
Set<Object> seen = ConcurrentHashMap.newKeySet();
return t -> seen.add(keyExtractor.apply(t));
}
Then you can write:
persons.stream().filter(distinctByKey(Person::getName))
Note that if the stream is ordered and is run in parallel, this will preserve an arbitrary element from among the duplicates, instead of the first one, as distinct() does.
(This is essentially the same as my answer to this question: Java Lambda Stream Distinct() on arbitrary key?)

An alternative would be to place the persons in a map using the name as a key:
persons.collect(Collectors.toMap(Person::getName, p -> p, (p, q) -> p)).values();
Note that the Person that is kept, in case of a duplicate name, will be the first encontered.

You can wrap the person objects into another class, that only compares the names of the persons. Afterward, you unwrap the wrapped objects to get a person stream again. The stream operations might look as follows:
persons.stream()
.map(Wrapper::new)
.distinct()
.map(Wrapper::unwrap)
...;
The class Wrapper might look as follows:
class Wrapper {
private final Person person;
public Wrapper(Person person) {
this.person = person;
}
public Person unwrap() {
return person;
}
public boolean equals(Object other) {
if (other instanceof Wrapper) {
return ((Wrapper) other).person.getName().equals(person.getName());
} else {
return false;
}
}
public int hashCode() {
return person.getName().hashCode();
}
}

Another solution, using Set. May not be the ideal solution, but it works
Set<String> set = new HashSet<>(persons.size());
persons.stream().filter(p -> set.add(p.getName())).collect(Collectors.toList());
Or if you can modify the original list, you can use removeIf method
persons.removeIf(p -> !set.add(p.getName()));

There's a simpler approach using a TreeSet with a custom comparator.
persons.stream()
.collect(Collectors.toCollection(
() -> new TreeSet<Person>((p1, p2) -> p1.getName().compareTo(p2.getName()))
));

We can also use RxJava (very powerful reactive extension library)
Observable.from(persons).distinct(Person::getName)
or
Observable.from(persons).distinct(p -> p.getName())

You can use groupingBy collector:
persons.collect(Collectors.groupingBy(p -> p.getName())).values().forEach(t -> System.out.println(t.get(0).getId()));
If you want to have another stream you can use this:
persons.collect(Collectors.groupingBy(p -> p.getName())).values().stream().map(l -> (l.get(0)));

You can use the distinct(HashingStrategy) method in Eclipse Collections.
List<Person> persons = ...;
MutableList<Person> distinct =
ListIterate.distinct(persons, HashingStrategies.fromFunction(Person::getName));
If you can refactor persons to implement an Eclipse Collections interface, you can call the method directly on the list.
MutableList<Person> persons = ...;
MutableList<Person> distinct =
persons.distinct(HashingStrategies.fromFunction(Person::getName));
HashingStrategy is simply a strategy interface that allows you to define custom implementations of equals and hashcode.
public interface HashingStrategy<E>
{
int computeHashCode(E object);
boolean equals(E object1, E object2);
}
Note: I am a committer for Eclipse Collections.

Similar approach which Saeed Zarinfam used but more Java 8 style:)
persons.collect(Collectors.groupingBy(p -> p.getName())).values().stream()
.map(plans -> plans.stream().findFirst().get())
.collect(toList());

You can use StreamEx library:
StreamEx.of(persons)
.distinct(Person::getName)
.toList()

I recommend using Vavr, if you can. With this library you can do the following:
io.vavr.collection.List.ofAll(persons)
.distinctBy(Person::getName)
.toJavaSet() // or any another Java 8 Collection

Extending Stuart Marks's answer, this can be done in a shorter way and without a concurrent map (if you don't need parallel streams):
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
final Set<Object> seen = new HashSet<>();
return t -> seen.add(keyExtractor.apply(t));
}
Then call:
persons.stream().filter(distinctByKey(p -> p.getName());

My approach to this is to group all the objects with same property together, then cut short the groups to size of 1 and then finally collect them as a List.
List<YourPersonClass> listWithDistinctPersons = persons.stream()
//operators to remove duplicates based on person name
.collect(Collectors.groupingBy(p -> p.getName()))
.values()
.stream()
//cut short the groups to size of 1
.flatMap(group -> group.stream().limit(1))
//collect distinct users as list
.collect(Collectors.toList());

Distinct objects list can be found using:
List distinctPersons = persons.stream()
.collect(Collectors.collectingAndThen(
Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(Person:: getName))),
ArrayList::new));

I made a generic version:
private <T, R> Collector<T, ?, Stream<T>> distinctByKey(Function<T, R> keyExtractor) {
return Collectors.collectingAndThen(
toMap(
keyExtractor,
t -> t,
(t1, t2) -> t1
),
(Map<R, T> map) -> map.values().stream()
);
}
An exemple:
Stream.of(new Person("Jean"),
new Person("Jean"),
new Person("Paul")
)
.filter(...)
.collect(distinctByKey(Person::getName)) // return a stream of Person with 2 elements, jean and Paul
.map(...)
.collect(toList())

Another library that supports this is jOOλ, and its Seq.distinct(Function<T,U>) method:
Seq.seq(persons).distinct(Person::getName).toList();
Under the hood, it does practically the same thing as the accepted answer, though.

Set<YourPropertyType> set = new HashSet<>();
list
.stream()
.filter(it -> set.add(it.getYourProperty()))
.forEach(it -> ...);

While the highest upvoted answer is absolutely best answer wrt Java 8, it is at the same time absolutely worst in terms of performance. If you really want a bad low performant application, then go ahead and use it. Simple requirement of extracting a unique set of Person Names shall be achieved by mere "For-Each" and a "Set".
Things get even worse if list is above size of 10.
Consider you have a collection of 20 Objects, like this:
public static final List<SimpleEvent> testList = Arrays.asList(
new SimpleEvent("Tom"), new SimpleEvent("Dick"),new SimpleEvent("Harry"),new SimpleEvent("Tom"),
new SimpleEvent("Dick"),new SimpleEvent("Huckle"),new SimpleEvent("Berry"),new SimpleEvent("Tom"),
new SimpleEvent("Dick"),new SimpleEvent("Moses"),new SimpleEvent("Chiku"),new SimpleEvent("Cherry"),
new SimpleEvent("Roses"),new SimpleEvent("Moses"),new SimpleEvent("Chiku"),new SimpleEvent("gotya"),
new SimpleEvent("Gotye"),new SimpleEvent("Nibble"),new SimpleEvent("Berry"),new SimpleEvent("Jibble"));
Where you object SimpleEvent looks like this:
public class SimpleEvent {
private String name;
private String type;
public SimpleEvent(String name) {
this.name = name;
this.type = "type_"+name;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getType() {
return type;
}
public void setType(String type) {
this.type = type;
}
}
And to test, you have JMH code like this,(Please note, im using the same distinctByKey Predicate mentioned in accepted answer) :
#Benchmark
#OutputTimeUnit(TimeUnit.SECONDS)
public void aStreamBasedUniqueSet(Blackhole blackhole) throws Exception{
Set<String> uniqueNames = testList
.stream()
.filter(distinctByKey(SimpleEvent::getName))
.map(SimpleEvent::getName)
.collect(Collectors.toSet());
blackhole.consume(uniqueNames);
}
#Benchmark
#OutputTimeUnit(TimeUnit.SECONDS)
public void aForEachBasedUniqueSet(Blackhole blackhole) throws Exception{
Set<String> uniqueNames = new HashSet<>();
for (SimpleEvent event : testList) {
uniqueNames.add(event.getName());
}
blackhole.consume(uniqueNames);
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(MyBenchmark.class.getSimpleName())
.forks(1)
.mode(Mode.Throughput)
.warmupBatchSize(3)
.warmupIterations(3)
.measurementIterations(3)
.build();
new Runner(opt).run();
}
Then you'll have Benchmark results like this:
Benchmark Mode Samples Score Score error Units
c.s.MyBenchmark.aForEachBasedUniqueSet thrpt 3 2635199.952 1663320.718 ops/s
c.s.MyBenchmark.aStreamBasedUniqueSet thrpt 3 729134.695 895825.697 ops/s
And as you can see, a simple For-Each is 3 times better in throughput and less in error score as compared to Java 8 Stream.
Higher the throughput, better the performance

I would like to improve Stuart Marks answer. What if the key is null, it will through NullPointerException. Here I ignore the null key by adding one more check as keyExtractor.apply(t)!=null.
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
Set<Object> seen = ConcurrentHashMap.newKeySet();
return t -> keyExtractor.apply(t)!=null && seen.add(keyExtractor.apply(t));
}

This works like a charm:
Grouping the data by unique key to form a map.
Returning the first object from every value of the map (There could be multiple person having same name).
persons.stream()
.collect(groupingBy(Person::getName))
.values()
.stream()
.flatMap(values -> values.stream().limit(1))
.collect(toList());

The easiest way to implement this is to jump on the sort feature as it already provides an optional Comparator which can be created using an element’s property. Then you have to filter duplicates out which can be done using a statefull Predicate which uses the fact that for a sorted stream all equal elements are adjacent:
Comparator<Person> c=Comparator.comparing(Person::getName);
stream.sorted(c).filter(new Predicate<Person>() {
Person previous;
public boolean test(Person p) {
if(previous!=null && c.compare(previous, p)==0)
return false;
previous=p;
return true;
}
})./* more stream operations here */;
Of course, a statefull Predicate is not thread-safe, however if that’s your need you can move this logic into a Collector and let the stream take care of the thread-safety when using your Collector. This depends on what you want to do with the stream of distinct elements which you didn’t tell us in your question.

There are lot of approaches, this one will also help - Simple, Clean and Clear
List<Employee> employees = new ArrayList<>();
employees.add(new Employee(11, "Ravi"));
employees.add(new Employee(12, "Stalin"));
employees.add(new Employee(23, "Anbu"));
employees.add(new Employee(24, "Yuvaraj"));
employees.add(new Employee(35, "Sena"));
employees.add(new Employee(36, "Antony"));
employees.add(new Employee(47, "Sena"));
employees.add(new Employee(48, "Ravi"));
List<Employee> empList = new ArrayList<>(employees.stream().collect(
Collectors.toMap(Employee::getName, obj -> obj,
(existingValue, newValue) -> existingValue))
.values());
empList.forEach(System.out::println);
// Collectors.toMap(
// Employee::getName, - key (the value by which you want to eliminate duplicate)
// obj -> obj, - value (entire employee object)
// (existingValue, newValue) -> existingValue) - to avoid illegalstateexception: duplicate key
Output - toString() overloaded
Employee{id=35, name='Sena'}
Employee{id=12, name='Stalin'}
Employee{id=11, name='Ravi'}
Employee{id=24, name='Yuvaraj'}
Employee{id=36, name='Antony'}
Employee{id=23, name='Anbu'}

Here is the example
public class PayRoll {
private int payRollId;
private int id;
private String name;
private String dept;
private int salary;
public PayRoll(int payRollId, int id, String name, String dept, int salary) {
super();
this.payRollId = payRollId;
this.id = id;
this.name = name;
this.dept = dept;
this.salary = salary;
}
}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collector;
import java.util.stream.Collectors;
public class Prac {
public static void main(String[] args) {
int salary=70000;
PayRoll payRoll=new PayRoll(1311, 1, "A", "HR", salary);
PayRoll payRoll2=new PayRoll(1411, 2 , "B", "Technical", salary);
PayRoll payRoll3=new PayRoll(1511, 1, "C", "HR", salary);
PayRoll payRoll4=new PayRoll(1611, 1, "D", "Technical", salary);
PayRoll payRoll5=new PayRoll(711, 3,"E", "Technical", salary);
PayRoll payRoll6=new PayRoll(1811, 3, "F", "Technical", salary);
List<PayRoll>list=new ArrayList<PayRoll>();
list.add(payRoll);
list.add(payRoll2);
list.add(payRoll3);
list.add(payRoll4);
list.add(payRoll5);
list.add(payRoll6);
Map<Object, Optional<PayRoll>> k = list.stream().collect(Collectors.groupingBy(p->p.getId()+"|"+p.getDept(),Collectors.maxBy(Comparator.comparingInt(PayRoll::getPayRollId))));
k.entrySet().forEach(p->
{
if(p.getValue().isPresent())
{
System.out.println(p.getValue().get());
}
});
}
}
Output:
PayRoll [payRollId=1611, id=1, name=D, dept=Technical, salary=70000]
PayRoll [payRollId=1811, id=3, name=F, dept=Technical, salary=70000]
PayRoll [payRollId=1411, id=2, name=B, dept=Technical, salary=70000]
PayRoll [payRollId=1511, id=1, name=C, dept=HR, salary=70000]

Late to the party but I sometimes use this one-liner as an equivalent:
((Function<Value, Key>) Value::getKey).andThen(new HashSet<>()::add)::apply
The expression is a Predicate<Value> but since the map is inline, it works as a filter. This is of course less readable but sometimes it can be helpful to avoid the method.

Building on #josketres's answer, I created a generic utility method:
You could make this more Java 8-friendly by creating a Collector.
public static <T> Set<T> removeDuplicates(Collection<T> input, Comparator<T> comparer) {
return input.stream()
.collect(toCollection(() -> new TreeSet<>(comparer)));
}
#Test
public void removeDuplicatesWithDuplicates() {
ArrayList<C> input = new ArrayList<>();
Collections.addAll(input, new C(7), new C(42), new C(42));
Collection<C> result = removeDuplicates(input, (c1, c2) -> Integer.compare(c1.value, c2.value));
assertEquals(2, result.size());
assertTrue(result.stream().anyMatch(c -> c.value == 7));
assertTrue(result.stream().anyMatch(c -> c.value == 42));
}
#Test
public void removeDuplicatesWithoutDuplicates() {
ArrayList<C> input = new ArrayList<>();
Collections.addAll(input, new C(1), new C(2), new C(3));
Collection<C> result = removeDuplicates(input, (t1, t2) -> Integer.compare(t1.value, t2.value));
assertEquals(3, result.size());
assertTrue(result.stream().anyMatch(c -> c.value == 1));
assertTrue(result.stream().anyMatch(c -> c.value == 2));
assertTrue(result.stream().anyMatch(c -> c.value == 3));
}
private class C {
public final int value;
private C(int value) {
this.value = value;
}
}

Maybe will be useful for somebody. I had a little bit another requirement. Having list of objects A from 3rd party remove all which have same A.b field for same A.id (multiple A object with same A.id in list). Stream partition answer by Tagir Valeev inspired me to use custom Collector which returns Map<A.id, List<A>>. Simple flatMap will do the rest.
public static <T, K, K2> Collector<T, ?, Map<K, List<T>>> groupingDistinctBy(Function<T, K> keyFunction, Function<T, K2> distinctFunction) {
return groupingBy(keyFunction, Collector.of((Supplier<Map<K2, T>>) HashMap::new,
(map, error) -> map.putIfAbsent(distinctFunction.apply(error), error),
(left, right) -> {
left.putAll(right);
return left;
}, map -> new ArrayList<>(map.values()),
Collector.Characteristics.UNORDERED)); }

I had a situation, where I was suppose to get distinct elements from list based on 2 keys.
If you want distinct based on two keys or may composite key, try this
class Person{
int rollno;
String name;
}
List<Person> personList;
Function<Person, List<Object>> compositeKey = personList->
Arrays.<Object>asList(personList.getName(), personList.getRollno());
Map<Object, List<Person>> map = personList.stream().collect(Collectors.groupingBy(compositeKey, Collectors.toList()));
List<Object> duplicateEntrys = map.entrySet().stream()`enter code here`
.filter(settingMap ->
settingMap.getValue().size() > 1)
.collect(Collectors.toList());

A variation of the top answer that handles null:
public static <T, K> Predicate<T> distinctBy(final Function<? super T, K> getKey) {
val seen = ConcurrentHashMap.<Optional<K>>newKeySet();
return obj -> seen.add(Optional.ofNullable(getKey.apply(obj)));
}
In my tests:
assertEquals(
asList("a", "bb"),
Stream.of("a", "b", "bb", "aa").filter(distinctBy(String::length)).collect(toList()));
assertEquals(
asList(5, null, 2, 3),
Stream.of(5, null, 2, null, 3, 3, 2).filter(distinctBy(x -> x)).collect(toList()));
val maps = asList(
hashMapWith(0, 2),
hashMapWith(1, 2),
hashMapWith(2, null),
hashMapWith(3, 1),
hashMapWith(4, null),
hashMapWith(5, 2));
assertEquals(
asList(0, 2, 3),
maps.stream()
.filter(distinctBy(m -> m.get("val")))
.map(m -> m.get("i"))
.collect(toList()));

In my case I needed to control what was the previous element. I then created a stateful Predicate where I controled if the previous element was different from the current element, in that case I kept it.
public List<Log> fetchLogById(Long id) {
return this.findLogById(id).stream()
.filter(new LogPredicate())
.collect(Collectors.toList());
}
public class LogPredicate implements Predicate<Log> {
private Log previous;
public boolean test(Log atual) {
boolean isDifferent = previouws == null || verifyIfDifferentLog(current, previous);
if (isDifferent) {
previous = current;
}
return isDifferent;
}
private boolean verifyIfDifferentLog(Log current, Log previous) {
return !current.getId().equals(previous.getId());
}
}

My solution in this listing:
List<HolderEntry> result ....
List<HolderEntry> dto3s = new ArrayList<>(result.stream().collect(toMap(
HolderEntry::getId,
holder -> holder, //or Function.identity() if you want
(holder1, holder2) -> holder1
)).values());
In my situation i want to find distinct values and put their in List.

Multiple aggregate functions in Java 8 Stream API

I have a class defined like
public class TimePeriodCalc {
private double occupancy;
private double efficiency;
private String atDate;
}
I would like to perform the following SQL statement using Java 8 Stream API.
SELECT atDate, AVG(occupancy), AVG(efficiency)
FROM TimePeriodCalc
GROUP BY atDate
I tried :
Collection<TimePeriodCalc> collector = result.stream().collect(groupingBy(p -> p.getAtDate(), ....
What can be put into the code to select multiple attributes ? I'm thinking of using multiple Collectors but really don't know how to do so.

To do it without a custom Collector (not streaming again on the result), you could do it like this. It's a bit dirty, since it is first collecting to Map<String, List<TimePeriodCalc>> and then streaming that list and get the average double.
Since you need two averages, they are collected to a Holder or a Pair, in this case I'm using AbstractMap.SimpleEntry
Map<String, SimpleEntry<Double, Double>> map = Stream.of(new TimePeriodCalc(12d, 10d, "A"), new TimePeriodCalc(2d, 16d, "A"))
.collect(Collectors.groupingBy(TimePeriodCalc::getAtDate,
Collectors.collectingAndThen(Collectors.toList(), list -> {
double occupancy = list.stream().collect(
Collectors.averagingDouble(TimePeriodCalc::getOccupancy));
double efficiency = list.stream().collect(
Collectors.averagingDouble(TimePeriodCalc::getEfficiency));
return new AbstractMap.SimpleEntry<>(occupancy, efficiency);
})));
System.out.println(map);

Here's a way with a custom collector. It only needs one pass, but it's not very easy, especially because of generics...
If you have this method:
#SuppressWarnings("unchecked")
#SafeVarargs
static <T, A, C extends Collector<T, A, Double>> Collector<T, ?, List<Double>>
averagingManyDoubles(ToDoubleFunction<? super T>... extractors) {
List<C> collectors = Arrays.stream(extractors)
.map(extractor -> (C) Collectors.averagingDouble(extractor))
.collect(Collectors.toList());
class Acc {
List<A> averages = collectors.stream()
.map(c -> c.supplier().get())
.collect(Collectors.toList());
void add(T elem) {
IntStream.range(0, extractors.length).forEach(i ->
collectors.get(i).accumulator().accept(averages.get(i), elem));
}
Acc merge(Acc another) {
IntStream.range(0, extractors.length).forEach(i ->
averages.set(i, collectors.get(i).combiner()
.apply(averages.get(i), another.averages.get(i))));
return this;
}
List<Double> finish() {
return IntStream.range(0, extractors.length)
.mapToObj(i -> collectors.get(i).finisher().apply(averages.get(i)))
.collect(Collectors.toList());
}
}
return Collector.of(Acc::new, Acc::add, Acc::merge, Acc::finish);
}
This receives an array of functions that will extract double values from each element of the stream. These extractors are converted to Collectors.averagingDouble collectors and then the local Acc class is created with the mutable structures that are used to accumulate the averages for each collector. Then, the accumulator function forwards to each accumulator, and so with the combiner and finisher functions.
Usage is as follows:
Map<String, List<Double>> averages = list.stream()
.collect(Collectors.groupingBy(
TimePeriodCalc::getAtDate,
averagingManyDoubles(
TimePeriodCalc::getOccupancy,
TimePeriodCalc::getEfficiency)));

Assuming that your TimePeriodCalc class has all the necessary getters, this should get you the list you want:
List<TimePeriodCalc> result = new ArrayList<>(
list.stream()
.collect(Collectors.groupingBy(TimePeriodCalc::getAtDate,
Collectors.collectingAndThen(Collectors.toList(), TimePeriodCalc::avgTimePeriodCalc)))
.values()
);
Where TimePeriodCalc.avgTimePeriodCalc is this method in the TimePeriodCalc class:
public static TimePeriodCalc avgTimePeriodCalc(List<TimePeriodCalc> list){
return new TimePeriodCalc(
list.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getOccupancy)),
list.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getEfficiency)),
list.get(0).getAtDate()
);
}
The above can be combined into this monstrosity:
List<TimePeriodCalc> result = new ArrayList<>(
list.stream()
.collect(Collectors.groupingBy(TimePeriodCalc::getAtDate,
Collectors.collectingAndThen(
Collectors.toList(), a -> {
return new TimePeriodCalc(
a.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getOccupancy)),
a.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getEfficiency)),
a.get(0).getAtDate()
);
}
)))
.values());
With input:
List<TimePeriodCalc> list = new ArrayList<>();
list.add(new TimePeriodCalc(10,10,"a"));
list.add(new TimePeriodCalc(10,10,"b"));
list.add(new TimePeriodCalc(10,10,"c"));
list.add(new TimePeriodCalc(5,5,"a"));
list.add(new TimePeriodCalc(0,0,"b"));
This would give:
TimePeriodCalc [occupancy=7.5, efficiency=7.5, atDate=a]
TimePeriodCalc [occupancy=5.0, efficiency=5.0, atDate=b]
TimePeriodCalc [occupancy=10.0, efficiency=10.0, atDate=c]

You can chain multiple attributes like this:
Collection<TimePeriodCalc> collector = result.stream().collect(Collectors.groupingBy(p -> p.getAtDate(), Collectors.averagingInt(p -> p.getOccupancy())));
If you want more, you get the idea.

Java 8 Distinct by property

In Java 8 how can I filter a collection using the Stream API by checking the distinctness of a property of each object?
For example I have a list of Person object and I want to remove people with the same name,
persons.stream().distinct();
Will use the default equality check for a Person object, so I need something like,
persons.stream().distinct(p -> p.getName());
Unfortunately the distinct() method has no such overload. Without modifying the equality check inside the Person class is it possible to do this succinctly?

Consider distinct to be a stateful filter. Here is a function that returns a predicate that maintains state about what it's seen previously, and that returns whether the given element was seen for the first time:
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
Set<Object> seen = ConcurrentHashMap.newKeySet();
return t -> seen.add(keyExtractor.apply(t));
}
Then you can write:
persons.stream().filter(distinctByKey(Person::getName))
Note that if the stream is ordered and is run in parallel, this will preserve an arbitrary element from among the duplicates, instead of the first one, as distinct() does.
(This is essentially the same as my answer to this question: Java Lambda Stream Distinct() on arbitrary key?)

An alternative would be to place the persons in a map using the name as a key:
persons.collect(Collectors.toMap(Person::getName, p -> p, (p, q) -> p)).values();
Note that the Person that is kept, in case of a duplicate name, will be the first encontered.

You can wrap the person objects into another class, that only compares the names of the persons. Afterward, you unwrap the wrapped objects to get a person stream again. The stream operations might look as follows:
persons.stream()
.map(Wrapper::new)
.distinct()
.map(Wrapper::unwrap)
...;
The class Wrapper might look as follows:
class Wrapper {
private final Person person;
public Wrapper(Person person) {
this.person = person;
}
public Person unwrap() {
return person;
}
public boolean equals(Object other) {
if (other instanceof Wrapper) {
return ((Wrapper) other).person.getName().equals(person.getName());
} else {
return false;
}
}
public int hashCode() {
return person.getName().hashCode();
}
}

Another solution, using Set. May not be the ideal solution, but it works
Set<String> set = new HashSet<>(persons.size());
persons.stream().filter(p -> set.add(p.getName())).collect(Collectors.toList());
Or if you can modify the original list, you can use removeIf method
persons.removeIf(p -> !set.add(p.getName()));

There's a simpler approach using a TreeSet with a custom comparator.
persons.stream()
.collect(Collectors.toCollection(
() -> new TreeSet<Person>((p1, p2) -> p1.getName().compareTo(p2.getName()))
));

We can also use RxJava (very powerful reactive extension library)
Observable.from(persons).distinct(Person::getName)
or
Observable.from(persons).distinct(p -> p.getName())

You can use groupingBy collector:
persons.collect(Collectors.groupingBy(p -> p.getName())).values().forEach(t -> System.out.println(t.get(0).getId()));
If you want to have another stream you can use this:
persons.collect(Collectors.groupingBy(p -> p.getName())).values().stream().map(l -> (l.get(0)));

You can use the distinct(HashingStrategy) method in Eclipse Collections.
List<Person> persons = ...;
MutableList<Person> distinct =
ListIterate.distinct(persons, HashingStrategies.fromFunction(Person::getName));
If you can refactor persons to implement an Eclipse Collections interface, you can call the method directly on the list.
MutableList<Person> persons = ...;
MutableList<Person> distinct =
persons.distinct(HashingStrategies.fromFunction(Person::getName));
HashingStrategy is simply a strategy interface that allows you to define custom implementations of equals and hashcode.
public interface HashingStrategy<E>
{
int computeHashCode(E object);
boolean equals(E object1, E object2);
}
Note: I am a committer for Eclipse Collections.

Similar approach which Saeed Zarinfam used but more Java 8 style:)
persons.collect(Collectors.groupingBy(p -> p.getName())).values().stream()
.map(plans -> plans.stream().findFirst().get())
.collect(toList());

You can use StreamEx library:
StreamEx.of(persons)
.distinct(Person::getName)
.toList()

I recommend using Vavr, if you can. With this library you can do the following:
io.vavr.collection.List.ofAll(persons)
.distinctBy(Person::getName)
.toJavaSet() // or any another Java 8 Collection

Extending Stuart Marks's answer, this can be done in a shorter way and without a concurrent map (if you don't need parallel streams):
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
final Set<Object> seen = new HashSet<>();
return t -> seen.add(keyExtractor.apply(t));
}
Then call:
persons.stream().filter(distinctByKey(p -> p.getName());

My approach to this is to group all the objects with same property together, then cut short the groups to size of 1 and then finally collect them as a List.
List<YourPersonClass> listWithDistinctPersons = persons.stream()
//operators to remove duplicates based on person name
.collect(Collectors.groupingBy(p -> p.getName()))
.values()
.stream()
//cut short the groups to size of 1
.flatMap(group -> group.stream().limit(1))
//collect distinct users as list
.collect(Collectors.toList());

Distinct objects list can be found using:
List distinctPersons = persons.stream()
.collect(Collectors.collectingAndThen(
Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(Person:: getName))),
ArrayList::new));

I made a generic version:
private <T, R> Collector<T, ?, Stream<T>> distinctByKey(Function<T, R> keyExtractor) {
return Collectors.collectingAndThen(
toMap(
keyExtractor,
t -> t,
(t1, t2) -> t1
),
(Map<R, T> map) -> map.values().stream()
);
}
An exemple:
Stream.of(new Person("Jean"),
new Person("Jean"),
new Person("Paul")
)
.filter(...)
.collect(distinctByKey(Person::getName)) // return a stream of Person with 2 elements, jean and Paul
.map(...)
.collect(toList())

Another library that supports this is jOOλ, and its Seq.distinct(Function<T,U>) method:
Seq.seq(persons).distinct(Person::getName).toList();
Under the hood, it does practically the same thing as the accepted answer, though.

Set<YourPropertyType> set = new HashSet<>();
list
.stream()
.filter(it -> set.add(it.getYourProperty()))
.forEach(it -> ...);

While the highest upvoted answer is absolutely best answer wrt Java 8, it is at the same time absolutely worst in terms of performance. If you really want a bad low performant application, then go ahead and use it. Simple requirement of extracting a unique set of Person Names shall be achieved by mere "For-Each" and a "Set".
Things get even worse if list is above size of 10.
Consider you have a collection of 20 Objects, like this:
public static final List<SimpleEvent> testList = Arrays.asList(
new SimpleEvent("Tom"), new SimpleEvent("Dick"),new SimpleEvent("Harry"),new SimpleEvent("Tom"),
new SimpleEvent("Dick"),new SimpleEvent("Huckle"),new SimpleEvent("Berry"),new SimpleEvent("Tom"),
new SimpleEvent("Dick"),new SimpleEvent("Moses"),new SimpleEvent("Chiku"),new SimpleEvent("Cherry"),
new SimpleEvent("Roses"),new SimpleEvent("Moses"),new SimpleEvent("Chiku"),new SimpleEvent("gotya"),
new SimpleEvent("Gotye"),new SimpleEvent("Nibble"),new SimpleEvent("Berry"),new SimpleEvent("Jibble"));
Where you object SimpleEvent looks like this:
public class SimpleEvent {
private String name;
private String type;
public SimpleEvent(String name) {
this.name = name;
this.type = "type_"+name;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getType() {
return type;
}
public void setType(String type) {
this.type = type;
}
}
And to test, you have JMH code like this,(Please note, im using the same distinctByKey Predicate mentioned in accepted answer) :
#Benchmark
#OutputTimeUnit(TimeUnit.SECONDS)
public void aStreamBasedUniqueSet(Blackhole blackhole) throws Exception{
Set<String> uniqueNames = testList
.stream()
.filter(distinctByKey(SimpleEvent::getName))
.map(SimpleEvent::getName)
.collect(Collectors.toSet());
blackhole.consume(uniqueNames);
}
#Benchmark
#OutputTimeUnit(TimeUnit.SECONDS)
public void aForEachBasedUniqueSet(Blackhole blackhole) throws Exception{
Set<String> uniqueNames = new HashSet<>();
for (SimpleEvent event : testList) {
uniqueNames.add(event.getName());
}
blackhole.consume(uniqueNames);
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(MyBenchmark.class.getSimpleName())
.forks(1)
.mode(Mode.Throughput)
.warmupBatchSize(3)
.warmupIterations(3)
.measurementIterations(3)
.build();
new Runner(opt).run();
}
Then you'll have Benchmark results like this:
Benchmark Mode Samples Score Score error Units
c.s.MyBenchmark.aForEachBasedUniqueSet thrpt 3 2635199.952 1663320.718 ops/s
c.s.MyBenchmark.aStreamBasedUniqueSet thrpt 3 729134.695 895825.697 ops/s
And as you can see, a simple For-Each is 3 times better in throughput and less in error score as compared to Java 8 Stream.
Higher the throughput, better the performance

I would like to improve Stuart Marks answer. What if the key is null, it will through NullPointerException. Here I ignore the null key by adding one more check as keyExtractor.apply(t)!=null.
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
Set<Object> seen = ConcurrentHashMap.newKeySet();
return t -> keyExtractor.apply(t)!=null && seen.add(keyExtractor.apply(t));
}

This works like a charm:
Grouping the data by unique key to form a map.
Returning the first object from every value of the map (There could be multiple person having same name).
persons.stream()
.collect(groupingBy(Person::getName))
.values()
.stream()
.flatMap(values -> values.stream().limit(1))
.collect(toList());

The easiest way to implement this is to jump on the sort feature as it already provides an optional Comparator which can be created using an element’s property. Then you have to filter duplicates out which can be done using a statefull Predicate which uses the fact that for a sorted stream all equal elements are adjacent:
Comparator<Person> c=Comparator.comparing(Person::getName);
stream.sorted(c).filter(new Predicate<Person>() {
Person previous;
public boolean test(Person p) {
if(previous!=null && c.compare(previous, p)==0)
return false;
previous=p;
return true;
}
})./* more stream operations here */;
Of course, a statefull Predicate is not thread-safe, however if that’s your need you can move this logic into a Collector and let the stream take care of the thread-safety when using your Collector. This depends on what you want to do with the stream of distinct elements which you didn’t tell us in your question.

There are lot of approaches, this one will also help - Simple, Clean and Clear
List<Employee> employees = new ArrayList<>();
employees.add(new Employee(11, "Ravi"));
employees.add(new Employee(12, "Stalin"));
employees.add(new Employee(23, "Anbu"));
employees.add(new Employee(24, "Yuvaraj"));
employees.add(new Employee(35, "Sena"));
employees.add(new Employee(36, "Antony"));
employees.add(new Employee(47, "Sena"));
employees.add(new Employee(48, "Ravi"));
List<Employee> empList = new ArrayList<>(employees.stream().collect(
Collectors.toMap(Employee::getName, obj -> obj,
(existingValue, newValue) -> existingValue))
.values());
empList.forEach(System.out::println);
// Collectors.toMap(
// Employee::getName, - key (the value by which you want to eliminate duplicate)
// obj -> obj, - value (entire employee object)
// (existingValue, newValue) -> existingValue) - to avoid illegalstateexception: duplicate key
Output - toString() overloaded
Employee{id=35, name='Sena'}
Employee{id=12, name='Stalin'}
Employee{id=11, name='Ravi'}
Employee{id=24, name='Yuvaraj'}
Employee{id=36, name='Antony'}
Employee{id=23, name='Anbu'}

Here is the example
public class PayRoll {
private int payRollId;
private int id;
private String name;
private String dept;
private int salary;
public PayRoll(int payRollId, int id, String name, String dept, int salary) {
super();
this.payRollId = payRollId;
this.id = id;
this.name = name;
this.dept = dept;
this.salary = salary;
}
}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collector;
import java.util.stream.Collectors;
public class Prac {
public static void main(String[] args) {
int salary=70000;
PayRoll payRoll=new PayRoll(1311, 1, "A", "HR", salary);
PayRoll payRoll2=new PayRoll(1411, 2 , "B", "Technical", salary);
PayRoll payRoll3=new PayRoll(1511, 1, "C", "HR", salary);
PayRoll payRoll4=new PayRoll(1611, 1, "D", "Technical", salary);
PayRoll payRoll5=new PayRoll(711, 3,"E", "Technical", salary);
PayRoll payRoll6=new PayRoll(1811, 3, "F", "Technical", salary);
List<PayRoll>list=new ArrayList<PayRoll>();
list.add(payRoll);
list.add(payRoll2);
list.add(payRoll3);
list.add(payRoll4);
list.add(payRoll5);
list.add(payRoll6);
Map<Object, Optional<PayRoll>> k = list.stream().collect(Collectors.groupingBy(p->p.getId()+"|"+p.getDept(),Collectors.maxBy(Comparator.comparingInt(PayRoll::getPayRollId))));
k.entrySet().forEach(p->
{
if(p.getValue().isPresent())
{
System.out.println(p.getValue().get());
}
});
}
}
Output:
PayRoll [payRollId=1611, id=1, name=D, dept=Technical, salary=70000]
PayRoll [payRollId=1811, id=3, name=F, dept=Technical, salary=70000]
PayRoll [payRollId=1411, id=2, name=B, dept=Technical, salary=70000]
PayRoll [payRollId=1511, id=1, name=C, dept=HR, salary=70000]

Late to the party but I sometimes use this one-liner as an equivalent:
((Function<Value, Key>) Value::getKey).andThen(new HashSet<>()::add)::apply
The expression is a Predicate<Value> but since the map is inline, it works as a filter. This is of course less readable but sometimes it can be helpful to avoid the method.

Building on #josketres's answer, I created a generic utility method:
You could make this more Java 8-friendly by creating a Collector.
public static <T> Set<T> removeDuplicates(Collection<T> input, Comparator<T> comparer) {
return input.stream()
.collect(toCollection(() -> new TreeSet<>(comparer)));
}
#Test
public void removeDuplicatesWithDuplicates() {
ArrayList<C> input = new ArrayList<>();
Collections.addAll(input, new C(7), new C(42), new C(42));
Collection<C> result = removeDuplicates(input, (c1, c2) -> Integer.compare(c1.value, c2.value));
assertEquals(2, result.size());
assertTrue(result.stream().anyMatch(c -> c.value == 7));
assertTrue(result.stream().anyMatch(c -> c.value == 42));
}
#Test
public void removeDuplicatesWithoutDuplicates() {
ArrayList<C> input = new ArrayList<>();
Collections.addAll(input, new C(1), new C(2), new C(3));
Collection<C> result = removeDuplicates(input, (t1, t2) -> Integer.compare(t1.value, t2.value));
assertEquals(3, result.size());
assertTrue(result.stream().anyMatch(c -> c.value == 1));
assertTrue(result.stream().anyMatch(c -> c.value == 2));
assertTrue(result.stream().anyMatch(c -> c.value == 3));
}
private class C {
public final int value;
private C(int value) {
this.value = value;
}
}

Maybe will be useful for somebody. I had a little bit another requirement. Having list of objects A from 3rd party remove all which have same A.b field for same A.id (multiple A object with same A.id in list). Stream partition answer by Tagir Valeev inspired me to use custom Collector which returns Map<A.id, List<A>>. Simple flatMap will do the rest.
public static <T, K, K2> Collector<T, ?, Map<K, List<T>>> groupingDistinctBy(Function<T, K> keyFunction, Function<T, K2> distinctFunction) {
return groupingBy(keyFunction, Collector.of((Supplier<Map<K2, T>>) HashMap::new,
(map, error) -> map.putIfAbsent(distinctFunction.apply(error), error),
(left, right) -> {
left.putAll(right);
return left;
}, map -> new ArrayList<>(map.values()),
Collector.Characteristics.UNORDERED)); }

I had a situation, where I was suppose to get distinct elements from list based on 2 keys.
If you want distinct based on two keys or may composite key, try this
class Person{
int rollno;
String name;
}
List<Person> personList;
Function<Person, List<Object>> compositeKey = personList->
Arrays.<Object>asList(personList.getName(), personList.getRollno());
Map<Object, List<Person>> map = personList.stream().collect(Collectors.groupingBy(compositeKey, Collectors.toList()));
List<Object> duplicateEntrys = map.entrySet().stream()`enter code here`
.filter(settingMap ->
settingMap.getValue().size() > 1)
.collect(Collectors.toList());

A variation of the top answer that handles null:
public static <T, K> Predicate<T> distinctBy(final Function<? super T, K> getKey) {
val seen = ConcurrentHashMap.<Optional<K>>newKeySet();
return obj -> seen.add(Optional.ofNullable(getKey.apply(obj)));
}
In my tests:
assertEquals(
asList("a", "bb"),
Stream.of("a", "b", "bb", "aa").filter(distinctBy(String::length)).collect(toList()));
assertEquals(
asList(5, null, 2, 3),
Stream.of(5, null, 2, null, 3, 3, 2).filter(distinctBy(x -> x)).collect(toList()));
val maps = asList(
hashMapWith(0, 2),
hashMapWith(1, 2),
hashMapWith(2, null),
hashMapWith(3, 1),
hashMapWith(4, null),
hashMapWith(5, 2));
assertEquals(
asList(0, 2, 3),
maps.stream()
.filter(distinctBy(m -> m.get("val")))
.map(m -> m.get("i"))
.collect(toList()));

In my case I needed to control what was the previous element. I then created a stateful Predicate where I controled if the previous element was different from the current element, in that case I kept it.
public List<Log> fetchLogById(Long id) {
return this.findLogById(id).stream()
.filter(new LogPredicate())
.collect(Collectors.toList());
}
public class LogPredicate implements Predicate<Log> {
private Log previous;
public boolean test(Log atual) {
boolean isDifferent = previouws == null || verifyIfDifferentLog(current, previous);
if (isDifferent) {
previous = current;
}
return isDifferent;
}
private boolean verifyIfDifferentLog(Log current, Log previous) {
return !current.getId().equals(previous.getId());
}
}

My solution in this listing:
List<HolderEntry> result ....
List<HolderEntry> dto3s = new ArrayList<>(result.stream().collect(toMap(
HolderEntry::getId,
holder -> holder, //or Function.identity() if you want
(holder1, holder2) -> holder1
)).values());
In my situation i want to find distinct values and put their in List.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Java 8 Streams & lambdas maintaining strict FP - java

Related

Sort list by multiple fields(not then compare) in java

Converting array iteration to lambda function using Java8

Java - How to filter items from a list so that only one per element attribute is present? [duplicate]

Multiple aggregate functions in Java 8 Stream API

Java 8 Distinct by property

Categories

Resources