Filter stream distinct

I have a stream over a simple Java data class like:
class Developer{
private Long id;
private String name;
private Integer codePost;
private Integer codeLevel;
}
I would like to apply this filter to my stream:
If two developers have the same codePost but different codeLevel values, keep the developer with codeLevel = 5.
If developers have the same codePost and the same codeLevel, keep all of them.
Example:
ID | name           | codePost | codeLevel
1  | Alan stonly    | 30       | 4
2  | Peter Zola     | 20       | 4
3  | Camilia Frim   | 30       | 5
4  | Antonio Alcant | 40       | 4
Or in Java:
Developer dev1 = new Developer(1L, "Alan stonly", 30, 4);
Developer dev2 = new Developer(2L, "Peter Zola", 20, 4);
Developer dev3 = new Developer(3L, "Camilia Frim ", 30, 5);
Developer dev4 = new Developer(4L, "Antonio Alcant", 40, 4);
Stream<Developer> developers = Stream.of(dev1, dev2, dev3, dev4);

As mentioned in the comments, Collectors.toMap should be used here with the merge function (and optionally a map supplier, e.g. LinkedHashMap::new to keep insertion order):
Stream.of(dev1, dev2, dev3, dev4)
.collect(Collectors.toMap(
Developer::getCodePost,
dev -> dev,
(d1, d2) -> Stream.of(d1, d2)
.filter(d -> d.getCodeLevel() == 5)
.findFirst()
.orElse(d1),
LinkedHashMap::new // keep insertion order
))
.values()
.forEach(System.out::println);
The merge function may be implemented with a ternary operator too:
(d1, d2) -> d1.getCodeLevel() == 5 ? d1 : d2.getCodeLevel() == 5 ? d2 : d1
Output:
Developer(id=3, name=Camilia Frim , codePost=30, codeLevel=5)
Developer(id=2, name=Peter Zola, codePost=20, codeLevel=4)
Developer(id=4, name=Antonio Alcant, codePost=40, codeLevel=4)
If the output needs to be sorted in another order, values() should be sorted as values().stream().sorted(DeveloperComparator) with a custom developer comparator, e.g. Comparator.comparingLong(Developer::getId) or Comparator.comparing(Developer::getName) etc.
Update
As the devs sharing the same codeLevel should NOT be filtered out, the following (a bit clumsy) solution is possible on the basis of Collectors.collectingAndThen and Collectors.groupingBy:
input list is grouped into a map of codePost to the list of developers
then the List<Developer> values in the map are filtered to keep the devs with max codeLevel
// added two more devs
Developer dev5 = new Developer (5L,"Donkey Hot",40,3);
Developer dev6 = new Developer (6L,"Miguel Servantes",40,4);
Stream.of(dev1, dev2, dev3, dev4, dev5, dev6)
.collect(Collectors.collectingAndThen(Collectors.groupingBy(
Developer::getCodePost
), map -> {
map.values()
.stream()
.filter(devs -> devs.size() > 1)
.forEach(devs -> {
int maxLevel = devs.stream()
.mapToInt(Developer::getCodeLevel)
.max().orElse(5);
devs.removeIf(x -> x.getCodeLevel() != maxLevel);
});
return map;
}))
.values()
.stream()
.flatMap(List::stream)
.sorted(Comparator.comparingLong(Developer::getId))
.forEach(System.out::println);
Output:
Developer(id=2, name=Peter Zola, codePost=20, codeLevel=4)
Developer(id=3, name=Camilia Frim , codePost=30, codeLevel=5)
Developer(id=4, name=Antonio Alcant, codePost=40, codeLevel=4)
Developer(id=6, name=Miguel Servantes, codePost=40, codeLevel=4)
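For comparison, the same rule can be sketched in a self-contained form that does not mutate the grouped lists. This is only one possible reading of the requirement (per codePost, keep every developer whose codeLevel equals the highest level in that group). The record Developer (Java 16+), the class name and the method name keepTopLevelPerPost are assumptions made for this example, not part of the original code.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class DeveloperFilterSketch {

    // Hypothetical stand-in for the Developer class from the question.
    record Developer(long id, String name, int codePost, int codeLevel) {}

    // Keeps, for every codePost, all developers whose codeLevel equals the
    // highest codeLevel found for that codePost. The result is sorted by id.
    static List<Developer> keepTopLevelPerPost(List<Developer> devs) {
        // First pass: highest codeLevel per codePost.
        Map<Integer, Integer> maxLevelByPost = devs.stream()
                .collect(Collectors.toMap(
                        Developer::codePost,
                        Developer::codeLevel,
                        Integer::max));
        // Second pass: keep the developers that reach that level.
        return devs.stream()
                .filter(d -> d.codeLevel() == maxLevelByPost.get(d.codePost()))
                .sorted(Comparator.comparingLong(Developer::id))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Developer> devs = List.of(
                new Developer(1, "Alan stonly", 30, 4),
                new Developer(2, "Peter Zola", 20, 4),
                new Developer(3, "Camilia Frim", 30, 5),
                new Developer(4, "Antonio Alcant", 40, 4),
                new Developer(5, "Donkey Hot", 40, 3),
                new Developer(6, "Miguel Servantes", 40, 4));
        // Keeps the developers with ids 2, 3, 4 and 6.
        keepTopLevelPerPost(devs).forEach(System.out::println);
    }
}
```

The two passes trade one extra traversal for code that leaves the input untouched, which may be easier to reason about than removing elements inside collectingAndThen.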

Related

Replace IFs and FOR LOOPs with STREAMs and LAMBDAs

I want to optimize my code. I used to use for loops and ifs, but I have heard that there are faster ways than this. I am still pretty new to lambdas and streams, so for practice I decided to replace my old code with them.
I am curious how the code below could be changed.
int counter = 0;
List<Integer> points = new ArrayList<>();
for (String name : names) {
for (Car car : cars) {
if (counter != 0) {
points.add(counter);
}
counter= 0;
for (Driver driver : car.getDriversWhoDrivesIt()) {
if (driver.getYearsInMotorsport() == 15) {
if (!(names.contains(driver.getName()))) {
filteredCars.remove(car);
counter= 0;
break;
}
}
if (driver.getYearsInMotorsport() == 7 ) {
counter+= 7;
}
if (driver.getYearsInMotorsport() == 3) {
counter+= 3;
}
}
}
}
So the task is this: there is a list (names) of drivers that the user selected earlier. I iterate through all the drivers of each car, and if a driver has exactly 15 years of experience and the user did not select them (they are not in the names list), then that car is eliminated (removed from filteredCars, with no need to continue with that car).
For example, I have 3 cars with these drivers and their experience:
car 1: Lewis(15 years), Marco(4), Sebastian(15)
car 2: Max(15), Amanda(7)
car 3: Bob(15), George(3), Lando(15)
Then the user selects the names:
Lewis, Bob, Amanda, Lando, Max
If a driver has 15 years of experience and the user did not select them, then I don't want that car in my filteredCars.
And if all the drivers with 15 years of experience were selected, I want to collect the experience of the other drivers (counter).
So in the end I want my filteredCars list to look like this:
car 2 - 7
car 3 - 3
Explanation:
The first car is eliminated because the user did not select Sebastian, who has 15 years.
The second and third cars are kept because the user selected all their drivers with 15 years of experience; the second car gets 7 points (because of Amanda) and the third gets 3 (George).
I tried to solve this problem with flatMap, but I got stuck on the ifs. My problem is that I need to use an inline if in lambdas, but my ifs don't have an else part.
names.stream()
.flatMap(name -> cars.stream()
.flatMap(car -> car.getDriversWhoDrivesIt().stream()
// .flatMap(driver -> driver.getYearsInMotorsport() == 5 ? ) //?? now what?
)
);
I hope somebody can help me with this.

Instead of a list of names, I would advise defining a Set. For each Car, filter the drivers that have exactly 15 years of experience and then check whether all of them are present in the user-defined set of names using the allMatch() operation.
Then collect all the Car objects remaining in the stream into a map using the toMap() collector:
Set<String> names = Set.of("Lewis", "Bob", "Amanda", "Lando", "Max");
List<Car> cars = List.of(
new Car("Car1", List.of(new Driver("Lewis", 15),
new Driver("Marco", 4),
new Driver("Sebastian", 15))
),
new Car("Car2", List.of(new Driver("Max", 15),
new Driver("Amanda", 7))
),
new Car("Car3", List.of(new Driver("Bob", 15),
new Driver("George", 3),
new Driver("Lando", 15))
)
);
Map<Car, Integer> pointByCar = cars.stream()
.filter(car -> car.getDrivers().stream()
.filter(driver -> driver.getYearsInMotorsport() == 15)
.map(Driver::getName)
.allMatch(names::contains)
)
.collect(Collectors.toMap(
Function.identity(),
car -> car.getDrivers().stream()
.mapToInt(Driver::getYearsInMotorsport)
.filter(i -> i == 7 || i == 3)
.sum()
));
pointByCar.forEach((car, points) -> System.out.println(car + " -> " + points));
Output:
Car{name='Car2'} -> 7
Car{name='Car3'} -> 3

"I know that there is more faster ways than this"
It is only faster to write, and some may find it more readable.
In this example I remove from the stream the cars that have a driver with 15 years of experience who isn't listed in names. Then I collect the result into a map: the key is the car, and the value is the sum of the drivers' years, excluding the drivers that have 15 years of experience.
Map<Car, Integer> filteredCars = cars.stream()
.filter(car -> car.driversWhoDrivesIt().stream().allMatch(driver -> driver.yearsInMotorsport() != 15 || names.contains(driver.name())))
.collect(Collectors.toMap(
Function.identity(),
car -> car.driversWhoDrivesIt().stream()
.mapToInt(Driver::yearsInMotorsport)
.filter(y -> y != 15)
.sum()));
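To try the filter-then-collect idea on the sample data from the question, a self-contained sketch could look like the following. The records Car and Driver (Java 16+), the class name and the method pointsByCar are assumptions for illustration; the map is keyed by car name here only to make the printed result easy to read.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class CarFilterSketch {

    // Hypothetical stand-ins for the classes from the question.
    record Driver(String name, int yearsInMotorsport) {}
    record Car(String name, List<Driver> drivers) {}

    // Keeps the cars whose 15-year drivers were all selected, and maps each
    // kept car name to the summed years of its remaining drivers.
    static Map<String, Integer> pointsByCar(List<Car> cars, Set<String> names) {
        return cars.stream()
                .filter(car -> car.drivers().stream()
                        .allMatch(d -> d.yearsInMotorsport() != 15 || names.contains(d.name())))
                .collect(Collectors.toMap(
                        Car::name,
                        car -> car.drivers().stream()
                                .mapToInt(Driver::yearsInMotorsport)
                                .filter(years -> years != 15)
                                .sum(),
                        Integer::sum,          // only used if two cars share a name
                        LinkedHashMap::new));  // keep the input order
    }

    public static void main(String[] args) {
        List<Car> cars = List.of(
                new Car("Car1", List.of(new Driver("Lewis", 15), new Driver("Marco", 4), new Driver("Sebastian", 15))),
                new Car("Car2", List.of(new Driver("Max", 15), new Driver("Amanda", 7))),
                new Car("Car3", List.of(new Driver("Bob", 15), new Driver("George", 3), new Driver("Lando", 15))));
        Set<String> names = Set.of("Lewis", "Bob", "Amanda", "Lando", "Max");
        System.out.println(pointsByCar(cars, names)); // {Car2=7, Car3=3}
    }
}
```

Car1 is dropped because Sebastian has 15 years and was not selected; the other two cars keep only the years of their non-15-year drivers.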

More efficient solution on coding task using Stream API?

I recently had a technical interview and got a small coding task on the Stream API.
Consider the following input:
public class Student {
private String name;
private List<String> subjects;
//getters and setters
}
Student stud1 = new Student("John", Arrays.asList("Math", "Chemistry"));
Student stud2 = new Student("Peter", Arrays.asList("Math", "History"));
Student stud3 = new Student("Antony", Arrays.asList("Music", "History", "English"));
Stream<Student> studentStream = Stream.of(stud1, stud2, stud3);
The task is to find the students with unique subjects using the Stream API.
So for the provided input, the expected result (ignoring order) is [John, Antony].
I presented a solution using a custom Collector:
Collector<Student, Map<String, Set<String>>, List<String>> studentsCollector = Collector.of(
HashMap::new,
(container, student) -> student.getSubjects().forEach(
subject -> container
.computeIfAbsent(subject, s -> new HashSet<>())
.add(student.getName())),
(c1, c2) -> c1,
container -> container.entrySet().stream()
.filter(e -> e.getValue().size() == 1)
.map(e -> e.getValue().iterator().next())
.distinct()
.collect(Collectors.toList())
);
List<String> studentNames = studentStream.collect(studentsCollector);
But the solution was considered not optimal/efficient.
Could you please share your ideas for a more efficient solution to this task?
UPDATE: One person told me he would use a reducer (the Stream.reduce() method).
But I cannot understand how this could increase efficiency. What do you think?

Here is another one.
// using SimpleEntry from java.util.AbstractMap
Set<Student> list = new HashSet<>(studentStream
.flatMap(student -> student.getSubjects().stream()
.map(subject -> new SimpleEntry<>(subject, student)))
.collect(Collectors.toMap(Entry::getKey, Entry::getValue, (l, r) -> Student.SENTINEL_VALUE))
.values());
list.remove(Student.SENTINEL_VALUE);
(Intentionally using a sentinel value, more about that below.)
The steps:
Set<Student> list = new HashSet<>(studentStream
We're creating a HashSet from the Collection we're going to collect. That's because we want to get rid of the duplicate students (students with multiple unique subjects, in your case Antony).
.flatMap(student -> student.getSubjects().stream()
.map(subject -> new SimpleEntry<>(subject, student)))
We are flatmapping each student's subjects into a stream, but first we map each element to a pair with as key the subject and as value the student. This is because we need to retain the association between the subject and the student. I'm using AbstractMap.SimpleEntry, but of course, you can use any implementation of a pair.
.collect(Collectors.toMap(Entry::getKey, Entry::getValue, (l, r) -> Student.SENTINEL_VALUE))
We are collecting the values into a map, setting the subject as key and the student as value for the resulting map. We pass in a third argument (a BinaryOperator) to define what should happen if a key collision takes place. We cannot pass in null, so we use a sentinel value [1].
At this point, we have inverted the relation student ↔ subject by mapping each subject to a student (or the SENTINEL_VALUE if a subject has multiple students).
.values());
We take the values of the map, yielding the list of all students with a unique subject, plus the sentinel value.
list.remove(Student.SENTINEL_VALUE);
The only thing left to do is getting rid of the sentinel value.
[1] We cannot use null in this situation. Most implementations of Map make no distinction between a key mapped to null and the absence of that key. Or, more accurately, the merge method of HashMap actively removes the entry when the remapping function returns null. If we want to avoid a sentinel value, we must implement our own merge method, which could look something like this: return (!containsKey(key) ? super.merge(key, value, remappingFunction) : put(key, null));.
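The removal behavior described in the footnote can be observed directly with HashMap.merge. The sketch below is only an illustration: the class name and the helper mergeAll are hypothetical, and student names are plain strings instead of Student objects.

```java
import java.util.HashMap;
import java.util.Map;

class MergeNullSketch {

    // Merges the given students into a single subject key, using a remapping
    // function that returns null whenever the key is already present.
    static Map<String, String> mergeAll(String subject, String... students) {
        Map<String, String> bySubject = new HashMap<>();
        for (String student : students) {
            bySubject.merge(subject, student, (left, right) -> null);
        }
        return bySubject;
    }

    public static void main(String[] args) {
        System.out.println(mergeAll("Math", "John"));                 // {Math=John}
        System.out.println(mergeAll("Math", "John", "Peter"));        // {}
        System.out.println(mergeAll("Math", "John", "Peter", "Ann")); // {Math=Ann}
    }
}
```

The third call shows why returning null is not a safe marker for "shared subject": after the entry is removed, a third student is inserted again as if the subject were unique, which a sentinel value avoids.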

Another solution. Looks kind of similar to Eugene's.
Stream.of(stud1, stud2, stud3, stud4)
.flatMap( s -> s.getSubjects().stream().map( subj -> new AbstractMap.SimpleEntry<>( subj, s ) ) )
.collect( Collectors.groupingBy(Map.Entry::getKey) )
.entrySet().stream()
.filter( e -> e.getValue().size() == 1 )
.map( e -> e.getValue().get(0).getValue().getName() )
.collect( Collectors.toSet() );

Not the most readable solution, but here you go:
studentStream.flatMap(st -> st.getSubjects().stream().map(subj -> new SimpleEntry<>(st.getName(), subj)))
.collect(Collectors.toMap(
Entry::getValue,
x -> {
List<String> list = new ArrayList<>();
list.add(x.getKey());
return list;
},
(left, right) -> {
left.addAll(right);
return left;
}
))
.entrySet()
.stream()
.filter(x -> x.getValue().size() == 1)
.map(Entry::getValue)
.flatMap(List::stream)
.distinct()
.forEachOrdered(System.out::println);

You can probably do it in a simpler way:
List<Student> students = Arrays.asList(stud1, stud2, stud3);
// collect all the unique subjects into a Set
Set<String> uniqueSubjects = students.stream()
.flatMap(st -> st.getSubjects().stream()
.map(subj -> new AbstractMap.SimpleEntry<>(st.getName(), subj)))
// subject to occurrence count map
.collect(Collectors.groupingBy(Map.Entry::getValue, Collectors.counting()))
.entrySet()
.stream()
.filter(x -> x.getValue() == 1) // occurs only once
.map(Map.Entry::getKey) // Q -> map keys are anyway unique
.collect(Collectors.toSet()); // ^^ ... any way to optimise this?(keySet)
// amongst the students, filter those which have any unique subject in their subject list
// (a second stream is opened from the list, because a Stream can be consumed only once)
List<String> studentsStudyingUniqueSubjects = students.stream()
.filter(stud -> stud.getSubjects().stream()
.anyMatch(uniqueSubjects::contains))
.map(Student::getName)
.collect(Collectors.toList());

Java 8 Streams reduce remove duplicates keeping the most recent entry

I have a Java bean, like
class EmployeeContract {
Long id;
Date date;
// getters and setters
}
If I have a long list of these, in which we have duplicates by id but with different dates, such as:
1, 2015/07/07
1, 2018/07/08
2, 2015/07/08
2, 2018/07/09
How can I reduce such a list keeping only the entries with the most recent date, such as:
1, 2018/07/08
2, 2018/07/09
?
Preferably using Java 8...
I've started with something like:
contract.stream()
.collect(Collectors.groupingBy(EmployeeContract::getId, Collectors.mapping(EmployeeContract::getId, Collectors.toList())))
.entrySet().stream().findFirst();
That gives me the mapping within individual groups, but I'm stuck as to how to collect that into a result list - my streams are not too strong I'm afraid...

Well, I am just going to put my comment here in the shape of an answer:
yourList.stream()
.collect(Collectors.toMap(
EmployeeContract::getId,
Function.identity(),
BinaryOperator.maxBy(Comparator.comparing(EmployeeContract::getDate)))
)
.values();
This will give you a Collection instead of a List, if you really care about this.
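As a runnable illustration of the toMap/maxBy approach on the sample data from the question, here is a minimal sketch. It assumes a record EmployeeContract (Java 16+) with a LocalDate field instead of the Date-based bean, and the class and method names are made up for the example.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class LatestContractSketch {

    // Hypothetical stand-in for the bean from the question (LocalDate instead of Date).
    record EmployeeContract(long id, LocalDate date) {}

    // Keeps one contract per id: the one with the most recent date.
    static List<EmployeeContract> latestPerId(List<EmployeeContract> contracts) {
        return new ArrayList<>(contracts.stream()
                .collect(Collectors.toMap(
                        EmployeeContract::id,
                        Function.identity(),
                        BinaryOperator.maxBy(Comparator.comparing(EmployeeContract::date))))
                .values());
    }

    public static void main(String[] args) {
        List<EmployeeContract> contracts = List.of(
                new EmployeeContract(1, LocalDate.of(2015, 7, 7)),
                new EmployeeContract(1, LocalDate.of(2018, 7, 8)),
                new EmployeeContract(2, LocalDate.of(2015, 7, 8)),
                new EmployeeContract(2, LocalDate.of(2018, 7, 9)));
        // Two entries remain: id 1 with 2018-07-08 and id 2 with 2018-07-09.
        latestPerId(contracts).forEach(System.out::println);
    }
}
```

The order of the resulting list follows the map's iteration order, so it should not be relied on unless a map supplier such as LinkedHashMap::new is passed to toMap.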

You can do it in two steps as follows:
List<EmployeeContract> finalContract = contract.stream() // Stream<EmployeeContract>
.collect(Collectors.toMap(EmployeeContract::getId,
EmployeeContract::getDate, (a, b) -> a.after(b) ? a : b)) // Map<Long, Date> (Step 1)
.entrySet().stream() // Stream<Entry<Long, Date>>
.map(a -> new EmployeeContract(a.getKey(), a.getValue())) // Stream<EmployeeContract>
.collect(Collectors.toList()); // Step 2
First step: ensures the comparison of dates with the most recent one mapped to an id.
Second step: maps these key, value pairs to a final List<EmployeeContract> as a result.

Just to complement the existing answers, as you're asking:
"how to collect that into a result list"
Here are some options:
Wrap the values() into an ArrayList:
List<EmployeeContract> list1 =
new ArrayList<>(list.stream()
.collect(toMap(EmployeeContract::getId,
identity(),
maxBy(comparing(EmployeeContract::getDate))))
.values());
Wrap the toMap collector into collectingAndThen:
List<EmployeeContract> list2 =
list.stream()
.collect(collectingAndThen(toMap(EmployeeContract::getId,
identity(),
maxBy(comparing(EmployeeContract::getDate))),
c -> new ArrayList<>(c.values())));
Collect the values to a new List using another stream:
List<EmployeeContract> list3 =
list.stream()
.collect(toMap(EmployeeContract::getId,
identity(),
maxBy(comparing(EmployeeContract::getDate))))
.values()
.stream()
.collect(toList());

With vavr.io you can do it like this:
var finalContract = Stream.ofAll(contract) //create io.vavr.collection.Stream
.groupBy(EmployeeContract::getId)
.map(tuple -> tuple._2.maxBy(EmployeeContract::getDate))
.collect(Collectors.toList()); //result is list from java.util package

Stream Hash Map using Lambda

Example SQL Result
dataResult
Code Amt TotalAmtPerCode
A1 4 0
A1 4 0
B1 4 0
B1 5 0
A1 6 0
With this result, I would like to ask how to compute the TotalAmtPerCode.
The expected result should be:
Code Amt TotalAmtPerCode
A1 4 14
A1 4 14
B1 4 9
B1 5 9
A1 6 14
Sample code:
for (Map<String, Object> data : dataResult) {
long total = ComputeTotalAmount(dataResult, data.get(DBColumn.Code.name()).toString());
container.setTotalAmtPerCode(total);
}
Function that computes the total amount:
private static long ComputeTotalAmount(List<Map<String, Object>> list, String code) {
long total = 0;
for (Map<String, Object> data : list) {
if (code.equals(data.get(DBColumn.Code.name()))) {
// sum the Amt column (assumes DBColumn has an Amt constant)
total = total + Long.valueOf(data.get(DBColumn.Amt.name()).toString());
}
}
return total;
}
This works fine, but I would like to ask for an optimization of this code: if I loop over 10,000 records, it checks the Code of the 1st record, then iterates over all 10k records again to find the same Code and sum up its amounts, then does the same for the 2nd record, and so on.

Welcome to StackOverflow :)
As far as I can see, you need to sum the Amt values grouped by the Code. The method Stream::collect can transform the values into the desired output using the Collectors::groupingBy collector, which groups the Stream<T> into a Map<K, V>, where V is a collection by default or, as here, the result of a downstream collector.
Map<String, Integer> map = dataResult.stream()
.collect(Collectors.groupingBy( // Group to Map
d -> d.get(DBColumn.Code.name()).toString(), // Key is the code
Collectors.summingInt(d -> Integer.parseInt(d.get(DBColumn.Amt.name()).toString())))); // Value is the sum
Note:
You might need to edit d.get(DBColumn.Code.name()) and d.get(DBColumn.Amt.name()) according to your needs to get the Code and the amount - I don't know the data model (DBColumn.Amt in particular is an assumption).
I assume the amount fits in an int. Otherwise, use Collectors.summingLong.

You could use Collectors.groupingBy():
Map<String, Long> collect = list.stream()
.collect(Collectors.groupingBy(
p -> p.getFirst(),
Collectors.summingLong(p -> p.getSecond())
)
);
This groups the input by some classifier (here it's p -> p.getFirst(), which will probably be something like data.get(DBColumn.Code.name()) in your case) and sums the values (p -> p.getSecond(), which must be changed to something like Long.valueOf(data.get(amountColumnName).toString()), where amountColumnName is a placeholder for the name of your Amt column).
Note: getFirst() and getSecond() are methods from org.springframework.data.util.Pair.
Example:
List<Pair<String, Long>> list = new ArrayList<>();
list.add(Pair.of("A1", 1L));
list.add(Pair.of("A1", 2L));
list.add(Pair.of("B1", 1L));
Map<String, Long> collect = list.stream()
.collect(Collectors.groupingBy(
p -> p.getFirst(),
Collectors.summingLong(p -> p.getSecond())
)
);
System.out.println(collect);
Output:
{A1=3, B1=1}
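Applied to rows shaped like the ones in the question, the grouping idea could be sketched as follows. This is an assumption-laden illustration: the rows are plain maps keyed by the literal column names "Code", "Amt" and "TotalAmtPerCode" instead of the DBColumn enum, and the class, the method fillTotals and the helper row are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TotalPerCodeSketch {

    // Sums "Amt" per "Code" in a single pass, then writes the matching total
    // into the "TotalAmtPerCode" column of every row.
    static void fillTotals(List<Map<String, Object>> rows) {
        Map<String, Long> totalByCode = rows.stream()
                .collect(Collectors.groupingBy(
                        row -> row.get("Code").toString(),
                        Collectors.summingLong(row -> Long.parseLong(row.get("Amt").toString()))));
        rows.forEach(row -> row.put("TotalAmtPerCode", totalByCode.get(row.get("Code").toString())));
    }

    // Hypothetical helper that builds one mutable result row.
    static Map<String, Object> row(String code, long amt) {
        Map<String, Object> row = new HashMap<>();
        row.put("Code", code);
        row.put("Amt", amt);
        row.put("TotalAmtPerCode", 0L);
        return row;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("A1", 4));
        rows.add(row("A1", 4));
        rows.add(row("B1", 4));
        rows.add(row("B1", 5));
        rows.add(row("A1", 6));
        fillTotals(rows);
        // Every A1 row now carries 14 and every B1 row carries 9.
        rows.forEach(System.out::println);
    }
}
```

Each row is visited twice (once to sum, once to write back), so the work grows linearly with the number of rows instead of quadratically as in the nested-loop version.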

Java 8 Stream with map and multiples sets

I am trying to write these lines using Java 8 streams:
for (Town town : getAllTowns(routes)) {
if (originTown.equals(town))
continue;
for (Route route : routes) {
if (route.hasOrigin(originTown) && route.hasDestine(town)) {
distances.put(town, route.getDistance());
break;
}
distances.put(town, maxDistance);
}
}
return distances; //Map<Town,Integer>
The result that I got so far is:
Map<Town, Integer> distances = getAllTowns(routes).stream()
.filter(town -> !originTown.equals(town))
.forEach(town -> routes.stream()
.filter(route -> route.hasOrigin(originTown) && route.hasDestine(town)
...)
return distances;
How can I collect after the inner filter and build the Map<Town, Integer>, where the integer is route.getDistance()?
I tried to use:
.collect(Collectors.toMap(route -> route.getDestineTown(), route -> route.getDistance()))
But it is inside the forEach call, so I can't return it to my variable distances, because it generates the map only for the inner call. I did not understand it. Any input would be really helpful. Thanks.

You can use findFirst() to find, for each town, the first route that has that town as the destination, and then call toMap() on the result. The default values for missing towns can be handled separately.
Collection<Town> towns = getAllTowns(routes);
Map<Town, Integer> distances = towns.stream()
.filter(town -> !originTown.equals(town))
.map(town -> routes.stream()
.filter(route -> route.hasOrigin(originTown) && route.hasDestine(town))
.findFirst())
.filter(Optional::isPresent)
.collect(toMap(route -> route.get().getDestineTown(), route -> route.get().getDistance()));
towns.stream()
.filter(town -> !distances.containsKey(town))
.forEach(town -> distances.put(town, maxDistance));
(Note that town is no longer available in collect(), but you can take advantage of the fact that each route got added only if its destination town was town.)
Also note that toMap() doesn't accept duplicate keys. If there can be multiple routes to any town (which I assume there might be), you should use groupingBy() instead.

I think you have two options to solve this. Either you create your resulting Map beforehand and use nested forEach calls:
Map<Town, Integer> distances = new HashMap<>();
getAllTowns(routes).stream().filter(town -> !originTown.equals(town))
.forEach(town -> routes.stream().forEach(route -> distances.merge(town,
route.hasOrigin(originTown) && route.hasDestine(town) ? route.getDistance() : maxDistance,
(a, b) -> a.equals(maxDistance) ? b : a))); // keep the first real distance found, otherwise maxDistance
The other option is to collect your stream to a Map by creating an intermediate object which is essentially a pair of Town and Integer:
Map<Town, Integer> distances = getAllTowns(routes).stream().filter(town -> !originTown.equals(town))
.flatMap(town -> routes.stream()
.map(route -> new AbstractMap.SimpleEntry<Town, Integer>(town,
route.hasOrigin(originTown) && route.hasDestine(town) ? route.getDistance()
: maxDistance)))
.collect(Collectors.toMap(entry -> entry.getKey(), entry -> entry.getValue(),
(a, b) -> a.equals(maxDistance) ? b : a)); // each town appears once per route, so duplicates must be merged
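Another way to express the loop from the question is to compute each town's value inside the toMap value function, falling back to maxDistance when no route matches. The sketch below is only an illustration under simplifying assumptions: towns are plain strings, Route is a hypothetical record (Java 16+) with origin, destination and distance, and the set of towns is passed in directly instead of coming from getAllTowns(routes).

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class DistanceSketch {

    // Hypothetical stand-in for the Route class; towns are plain strings here.
    record Route(String origin, String destination, int distance) {}

    // Maps every town except the origin to the distance of the first direct
    // route from the origin, or to maxDistance when no such route exists.
    static Map<String, Integer> distancesFrom(String originTown, Set<String> towns,
                                              List<Route> routes, int maxDistance) {
        return towns.stream()
                .filter(town -> !originTown.equals(town))
                .collect(Collectors.toMap(
                        town -> town,
                        town -> routes.stream()
                                .filter(r -> r.origin().equals(originTown) && r.destination().equals(town))
                                .findFirst()
                                .map(Route::distance)
                                .orElse(maxDistance)));
    }

    public static void main(String[] args) {
        List<Route> routes = List.of(
                new Route("A", "B", 5),
                new Route("A", "B", 9),
                new Route("B", "C", 3));
        // B maps to 5 (first matching route); C and D map to 999.
        System.out.println(distancesFrom("A", Set.of("A", "B", "C", "D"), routes, 999));
    }
}
```

Because the stream runs over a set of towns, every key reaches toMap exactly once, so no merge function is needed in this variant.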
