Remove duplicates from List<Object> based on condition - java

Starting point:
public class Employee {
private String id;
private String name;
private String age;
}
I have a list of Employee: List<Employee> employee;
Employee examples from the list:
{id="1", name="John", age=10}
{id="2", name="Ana", age=12}
{id="3", name="John", age=23}
{id="4", name="John", age=14}
Let's assume that age is unique.
How can I remove all duplicates from the list based on the name property, keeping in the output the entry with the largest age?
The output should look like:
{id="2", name="Ana", age=12}
{id="3", name="John", age=23}
The way I tried:
HashSet<Object> temp = new HashSet<>();
employee.removeIf(e->!temp.add(e.getName()));
...but this way the first match will be kept in employee:
{id="1", name="John", age=10}
{id="2", name="Ana", age=12}
...and I have no idea how to add another condition here to keep the one with the largest age.

Here's a way that groups elements by name and reduces groups by selecting the one with max age:
List<Employee> uniqueEmployees = employees.stream()
.collect(Collectors.groupingBy(Employee::getName,
Collectors.maxBy(Comparator.comparing(Employee::getAge))))
.values()
.stream()
.map(Optional::get)
.collect(Collectors.toList());
Which returns [[id=2, name=Ana, age=12], [id=3, name=John, age=23]] with your test data.
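One caveat, since the question declares age as a String: Comparator.comparing(Employee::getAge) then compares the strings lexicographically, so "9" would rank above "23". With the sample data this happens not to matter. A minimal sketch that parses the ages before comparing, assuming every age string is a valid integer (the constructor and getters below are stand-ins that the question does not show):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class MaxAgePerName {

    // Stand-in for the question's class; constructor and getters are assumed.
    static class Employee {
        private final String id;
        private final String name;
        private final String age;

        Employee(String id, String name, String age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        String getId() { return id; }
        String getName() { return name; }
        String getAge() { return age; }
    }

    // Keeps, for each name, the employee whose age is numerically largest.
    static List<Employee> oldestPerName(List<Employee> employees) {
        Comparator<Employee> byNumericAge =
                Comparator.comparingInt(e -> Integer.parseInt(e.getAge()));
        return employees.stream()
                .collect(Collectors.groupingBy(Employee::getName, Collectors.maxBy(byNumericAge)))
                .values().stream()
                .map(Optional::get) // safe: every group holds at least one element
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("1", "John", "9"),
                new Employee("2", "Ana", "12"),
                new Employee("3", "John", "23"));
        for (Employee e : oldestPerName(employees)) {
            System.out.println(e.getId() + " " + e.getName() + " " + e.getAge());
        }
    }
}
```

The order of the resulting list is unspecified here, because groupingBy collects into a HashMap by default.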

Apart from the accepted answer, here are two variants:
Collection<Employee> employeesWithMaxAge = employees.stream()
.collect(Collectors.toMap(
Employee::getName,
Function.identity(),
BinaryOperator.maxBy(Comparator.comparing(Employee::getAge))))
.values();
This one uses Collectors.toMap to group employees by name, keeping the Employee instances themselves as the values. If several employees share a name, the third argument (a binary operator) selects the one with the greater age.
The other variant does the same, but doesn't use streams:
Map<String, Employee> map = new LinkedHashMap<>(); // preserves insertion order
employees.forEach(e -> map.merge(
e.getName(),
e,
(e1, e2) -> e1.getAge() > e2.getAge() ? e1 : e2)); // assumes getAge() returns a number, not the String declared in the question
Or, with BinaryOperator.maxBy:
Map<String, Employee> map = new LinkedHashMap<>(); // preserves insertion order
employees.forEach(e -> map.merge(
e.getName(),
e,
BinaryOperator.maxBy(Comparator.comparing(Employee::getAge))));
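The stream-based toMap variant collects into a HashMap, so the order of values() is unspecified there. If the first-seen order of names matters in the stream version too, the four-argument overload of Collectors.toMap accepts a map supplier. A sketch, assuming a hypothetical Employee with a numeric age rather than the String from the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class OrderedMaxAge {

    // Hypothetical Employee with an int age (the question declares a String).
    static class Employee {
        private final String id;
        private final String name;
        private final int age;

        Employee(String id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        String getId() { return id; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    static List<Employee> oldestPerName(List<Employee> employees) {
        Map<String, Employee> byName = employees.stream()
                .collect(Collectors.toMap(
                        Employee::getName,
                        Function.identity(),
                        BinaryOperator.maxBy(Comparator.comparingInt(Employee::getAge)),
                        LinkedHashMap::new)); // keeps names in first-seen order
        return new ArrayList<>(byName.values());
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
                new Employee("1", "John", 10),
                new Employee("2", "Ana", 12),
                new Employee("3", "John", 23),
                new Employee("4", "John", 14));
        for (Employee e : oldestPerName(employees)) {
            System.out.println(e.getId() + " " + e.getName() + " " + e.getAge());
        }
    }
}
```

A name keeps the position of its first occurrence even when a later, older employee replaces the stored value.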

ernest_k's answer is great, but if you would rather avoid adding duplicates in the first place, you can use this:
public void addToEmployees(Employee e) {
Optional<Employee> alreadyAdded = employees.stream().filter(employee -> employee.getName().equals(e.getName())).findFirst();
if(alreadyAdded.isPresent()) {
updateAgeIfNeeded(alreadyAdded.get(), e);
}else {
employees.add(e);
}
}
public void updateAgeIfNeeded(Employee alreadyAdded, Employee newlyRequested) {
if(Integer.valueOf(newlyRequested.getAge()) > Integer.valueOf(alreadyAdded.getAge())) {
alreadyAdded.setAge(newlyRequested.getAge());
}
}
Just use the addToEmployees method to add each Employee to your list.
You can also create a class that extends ArrayList, override its add method with the same logic, and then use your own list :)
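The ArrayList idea is not spelled out in the answer, so here is one possible sketch of it. The class names are hypothetical, the Employee uses an int age for brevity, and unlike updateAgeIfNeeded it replaces the whole stored entry (id included) instead of only copying the age:

```java
import java.util.ArrayList;

class UniqueNameEmployees {

    // Hypothetical Employee with an int age (the question declares a String).
    static class Employee {
        private final String id;
        private final String name;
        private final int age;

        Employee(String id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        String getId() { return id; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    // Hypothetical list that keeps at most one employee per name: the oldest seen so far.
    static class EmployeeList extends ArrayList<Employee> {
        @Override
        public boolean add(Employee candidate) {
            for (int i = 0; i < size(); i++) {
                Employee existing = get(i);
                if (existing.getName().equals(candidate.getName())) {
                    if (candidate.getAge() > existing.getAge()) {
                        set(i, candidate); // replace the younger entry in place
                        return true;
                    }
                    return false; // list unchanged
                }
            }
            return super.add(candidate);
        }
    }

    public static void main(String[] args) {
        EmployeeList employees = new EmployeeList();
        employees.add(new Employee("1", "John", 10));
        employees.add(new Employee("2", "Ana", 12));
        employees.add(new Employee("3", "John", 23));
        employees.add(new Employee("4", "John", 14));
        for (Employee e : employees) {
            System.out.println(e.getId() + " " + e.getName() + " " + e.getAge());
        }
    }
}
```

Be aware that only add(E) is guarded here: ArrayList.addAll and add(int, E) do not go through it, and each add is a linear scan. A list that silently refuses elements also bends the usual List contract, so a dedicated wrapper class may be the safer design.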

You can add values to a Map<String, Employee> (where the String key is the name) only if the age is greater than or equal to the one already in the map.

Related

Using Streams, see if a list contains a property of an object from another list

Jeez, I almost need another bit of help in how to phrase this question! New Java II student here, thanks in advance for your time.
I have a list of employees that look like this:
public class Employee {
private String name;
private String department;
}
And a list of Companies that look like this:
public class Company {
private String name;
List<Department> departments;
}
Department is just:
public class Department{
private String name;
private Integer totalSalary;
}
So I'm tasked with streaming a list of employees that work for the same company. (Sorry for not saying before: the company is passed in to a function. It's the lone argument.) It seemed easy when I first read it, but because of how the classes are set up (Company with only a list of departments, and Employee with a single department, but no link between employee and company), I can stream to make a list of all Departments in a Company, but I just don't know how to bring that back and match the employee's department string with any string from the departments that belong to the Company in question...
List<Department> deptsInCompany = companies.stream()
.filter(s -> s.getName().equals(passedInCompany))
.flatMap(s -> s.getDepartments().stream())
.collect(Collectors.toList());
I'm just not sure how to use that list of departments to backtrack and find the Employees in those departments. I think my ROOKIE mind can't get past wanting there to be a list of Employees in each department object, but there's not!
Any little nudge would be greatly appreciated! I promise to pay it forward when I've got some skill!!
Collect the department names of the (single) company with the given name into a Set (which is faster for lookup than a List).
Set<String> departmentNames = companies.stream()
.filter(c -> c.getName().equals(companyName))
.findFirst().get().getDepartments().stream()
.map(Department::getName)
.collect(Collectors.toSet());
Then remove all employees that aren't in those departments from the list.
employees.removeIf(e -> !departmentNames.contains(e.getDepartment()));
If you want to preserve the list of employees, filter and collect:
List<Employee> employeesInCompany = employees.stream()
.filter(e -> departmentNames.contains(e.getDepartment()))
.collect(Collectors.toList());
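Note that findFirst().get() throws NoSuchElementException when no company has the given name. A sketch that falls back to an empty set in that case, using minimal stand-in classes (constructors and getters are assumed, and Department is reduced to its name):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class EmployeesOfCompany {

    // Minimal stand-ins for the question's classes; constructors and getters are assumed.
    static class Department {
        private final String name;
        Department(String name) { this.name = name; }
        String getName() { return name; }
    }

    static class Company {
        private final String name;
        private final List<Department> departments;
        Company(String name, List<Department> departments) {
            this.name = name;
            this.departments = departments;
        }
        String getName() { return name; }
        List<Department> getDepartments() { return departments; }
    }

    static class Employee {
        private final String name;
        private final String department;
        Employee(String name, String department) {
            this.name = name;
            this.department = department;
        }
        String getName() { return name; }
        String getDepartment() { return department; }
    }

    static List<Employee> employeesOf(String companyName, List<Company> companies, List<Employee> employees) {
        Set<String> departmentNames = companies.stream()
                .filter(c -> c.getName().equals(companyName))
                .findFirst()
                .map(c -> c.getDepartments().stream()
                        .map(Department::getName)
                        .collect(Collectors.toSet()))
                .orElse(Collections.emptySet()); // unknown company: no departments
        return employees.stream()
                .filter(e -> departmentNames.contains(e.getDepartment()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Company> companies = Arrays.asList(
                new Company("Acme", Arrays.asList(new Department("Sales"), new Department("IT"))),
                new Company("Globex", Arrays.asList(new Department("HR"))));
        List<Employee> employees = Arrays.asList(
                new Employee("Ann", "Sales"), new Employee("Bob", "HR"), new Employee("Cy", "IT"));
        for (Employee e : employeesOf("Acme", companies, employees)) {
            System.out.println(e.getName());
        }
    }
}
```

Like the answer above, this matches purely on department name, so two companies that both have a "Sales" department cannot be told apart with the given model.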
Assuming that you have a list of all employees and that all your model classes have getters for their properties, you can do the following:
public static void main(String[] args) {
List<Company> companies = // Your list of Companies
String passedInCompany = "Company";
List<String> deptsNameInCompany = companies.stream()
.filter(s -> s.getName().equals(passedInCompany))
.flatMap(s -> s.getDepartments().stream())
.map(Department::getName)
.collect(Collectors.toList());
List<Employee> employees = // All Employees
List<Employee> employeesInCompanyDepts = employees.stream()
.filter(employee -> deptsNameInCompany.contains(employee.getDepartment()))
.collect(Collectors.toList());
}
Basically, you need to collect all the Department names and then find the Employees whose department property matches one of them.

Collectors.toMap write a merge function on a different attribute of object than the one which is not used as value

I need to create Map<String, String> from List<Person> using Stream API.
persons.stream()
.collect(Collectors
.toMap(Person::getNationality, Person::getName, (name1, name2) -> name1));
But in the above case, I want to resolve the conflict in the name attribute by using the Person's age. Is there any way to pass a merge function along the lines of (age1, age2) -> // if age1 is greater than age2 return name1, else return name2 ?
To select a person based on its age, you need the Person instance to query the age. You cannot reconstitute the information after you mapped the Person to a plain name String.
So you have to collect the persons first, to be able to select the oldest, followed by mapping them to their names:
persons.stream()
.collect(Collectors.groupingBy(Person::getNationality, Collectors.collectingAndThen(
Collectors.maxBy(Comparator.comparingInt(Person::getAge)),
o -> o.get().getName())));
Order the elements of the stream by age in descending order and then just choose the first:
persons.stream()
.sorted(Comparator.comparing(Person::getAge).reversed())
.collect(Collectors.toMap(Person::getNationality, Person::getName, (n1, n2) -> n1));
If you don't want to use a helper data structure, it is possible if you first keep your Person info and perform the merge based on it and apply the mapping afterwards:
public void test() {
final List<Person> persons = new ArrayList<>();
final BinaryOperator<Person> mergeFunction =
(lhs, rhs) -> lhs.getAge() > rhs.getAge() ? lhs : rhs;
final Function<Person, String> mapFunction = Person::getName;
final Map<String, String> personNamesByNation =
persons.stream()
.collect(
Collectors.groupingBy(Person::getNationality, // KeyMapper Person.getNationality: Map<String, List<Person>>
Collectors.collectingAndThen(
Collectors.collectingAndThen(
Collectors.reducing(mergeFunction), // Merge Persons into single value via merge function: Map<String, Optional<Person>>
Optional::get), // unwrap value: Map<String, Person>
mapFunction))); // apply map function afterwards: Map<String, String>
}
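Another way to read the "keep the Person, map afterwards" idea is as two plain steps: first reduce to the oldest Person per nationality with toMap, then turn each value into a name. A sketch under the assumption that Person has the getters shown and an int age (the question does not show the class):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class OldestNamePerNationality {

    // Assumed shape of Person; the question does not show the class.
    static class Person {
        private final String nationality;
        private final String name;
        private final int age;

        Person(String nationality, String name, int age) {
            this.nationality = nationality;
            this.name = name;
            this.age = age;
        }

        String getNationality() { return nationality; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    static Map<String, String> oldestNameByNationality(List<Person> persons) {
        // Step 1: keep the oldest Person per nationality.
        Map<String, Person> oldest = persons.stream()
                .collect(Collectors.toMap(
                        Person::getNationality,
                        Function.identity(),
                        BinaryOperator.maxBy(Comparator.comparingInt(Person::getAge))));
        // Step 2: replace each Person by its name.
        return oldest.entrySet().stream()
                .collect(Collectors.toMap(Map.Entry::getKey, e -> e.getValue().getName()));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("US", "Al", 30),
                new Person("US", "Bo", 40),
                new Person("FR", "Cy", 20));
        System.out.println(oldestNameByNationality(persons));
    }
}
```

This builds an intermediate map, which the single-collector versions avoid, but each step stays easy to read and debug.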

Java 8 Streams: collector returning filtered object

Say I have a Set that I'd like to filter down to the oldest per school.
So far I have:
Map<String, Long> getOldestPerSchool(Set<Person> persons) {
return persons.stream().collect(Collectors.toMap(Person::getSchoolname, Person::getAge, Long::max));
}
Trouble is, I want the whole person instead of only the name. But if I change it to:
Map<Person, Long> getOldestPerSchool(Set<Person> persons) {
return persons.stream().collect(Collectors.toMap(p -> p, Person::getAge, Long::max));
}
I get all persons, and I do not necessarily need a Map.
"Set that I'd like to filter down to the oldest per school."
Assuming oldest per school meant oldest Person per school, you are possibly looking for an output like:
Map<String, Person> getOldestPersonPerSchool(Set<Person> persons) {
return persons.stream()
.collect(Collectors.toMap(
Person::getSchoolname, // school name
Function.identity(), // person
(a, b) -> a.getAge() > b.getAge() ? a : b)); // ensure to store oldest (no tie breaker for same age)
}
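If ties on age should be resolved predictably, the merge function can be built from a comparator with a secondary key. A sketch that prefers the alphabetically first name among equally old persons; getName and the long age are assumptions about the Person class, which the question does not show:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class OldestPerSchool {

    // Assumed shape of Person; getName() and the long age are guesses.
    static class Person {
        private final String schoolname;
        private final String name;
        private final long age;

        Person(String schoolname, String name, long age) {
            this.schoolname = schoolname;
            this.name = name;
            this.age = age;
        }

        String getSchoolname() { return schoolname; }
        String getName() { return name; }
        long getAge() { return age; }
    }

    static Map<String, Person> oldestPerSchool(Set<Person> persons) {
        // Higher age wins; on equal age the alphabetically first name wins,
        // because the name comparison is reversed before taking the maximum.
        Comparator<Person> byAgeThenName = Comparator
                .comparingLong(Person::getAge)
                .thenComparing(Person::getName, Comparator.reverseOrder());
        return persons.stream()
                .collect(Collectors.toMap(
                        Person::getSchoolname,
                        Function.identity(),
                        BinaryOperator.maxBy(byAgeThenName)));
    }

    public static void main(String[] args) {
        Set<Person> persons = new HashSet<>(Arrays.asList(
                new Person("A", "Zed", 20),
                new Person("A", "Amy", 20),
                new Person("B", "Bob", 18)));
        oldestPerSchool(persons).forEach((school, p) ->
                System.out.println(school + " -> " + p.getName()));
    }
}
```

Without such a secondary key, the winner among equally old persons depends on the iteration order of the Set.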
You can achieve this with an intermediate grouping and then stream over the values() of the resulting map, selecting the oldest person from each group:
Set<Person> oldestPerSchool = persons.stream() // Stream<Person>
.collect(Collectors.groupingBy(Person::getSchoolname)) // Map<String, List<Person>>
.values().stream() // Stream<List<Person>>
.map(list -> list.stream() // (Inner) Stream<Person>
.max(Comparator.comparingLong(Person::getAge)) // (Inner) Optional<Person>; getAge() is a Long here, as Long::max in the question implies
.get() // (Inner) Person
) // Stream<Person>
.collect(Collectors.toSet()); // Set<Person>

Finding duplicated objects by two properties

Considering that I have a list of Person objects like this :
class Person {
String fullName;
String occupation;
String hobby;
int salary;
}
Using java8 streams, how can I get list of duplicated objects only by fullName and occupation property?
By using the Java 8 Stream API and Collectors.groupingBy() on fullName and occupation:
List<Person> duplicates = list.stream()
.collect(Collectors.groupingBy(p -> p.getFullName() + "-" + p.getOccupation(), Collectors.toList()))
.values()
.stream()
.filter(i -> i.size() > 1)
.flatMap(j -> j.stream())
.collect(Collectors.toList());
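One thing to watch with a concatenated String key: different pairs can produce the same key when the delimiter occurs in the data ("a-b" + "c" and "a" + "b-c" both give "a-b-c"). A sketch that groups by a List of the two properties instead, so the pair is compared field by field (the constructor and getters are assumed):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DuplicateFinder {

    // Reduced Person with only the two relevant properties; accessors are assumed.
    static class Person {
        private final String fullName;
        private final String occupation;

        Person(String fullName, String occupation) {
            this.fullName = fullName;
            this.occupation = occupation;
        }

        String getFullName() { return fullName; }
        String getOccupation() { return occupation; }
    }

    static List<Person> duplicatesByNameAndOccupation(List<Person> people) {
        return people.stream()
                // List.equals/hashCode compare element by element, so no delimiter is needed
                .collect(Collectors.groupingBy(p -> Arrays.asList(p.getFullName(), p.getOccupation())))
                .values().stream()
                .filter(group -> group.size() > 1)
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("a-b", "c"),
                new Person("a", "b-c"),
                new Person("x", "y"),
                new Person("x", "y"));
        for (Person p : duplicatesByNameAndOccupation(people)) {
            System.out.println(p.getFullName() + " / " + p.getOccupation());
        }
    }
}
```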
I need to find if there were any duplicates in the fullName - occupation pair, which has to be unique
Based on this comment it seems that you don't really care about which Person objects were duplicated, just that there were any.
In that case you can use a stateful anyMatch:
Collection<Person> input = new ArrayList<>();
Set<List<String>> seen = new HashSet<>();
boolean hasDupes = input.stream()
.anyMatch(p -> !seen.add(List.of(p.fullName, p.occupation)));
You can use a List as a 'key' for a set which contains the fullName + occupation combinations that you've already seen. If this combination is seen again you immediately return true, otherwise you finish iterating the elements and return false.
Here is a solution with O(n) complexity: use a Map to group the given list by key (fullName + occupation) and then retrieve the duplicates.
public static List<Person> getDuplicates(List<Person> persons, Function<Person, String> classifier) {
Map<String, List<Person>> map = persons.stream()
.collect(Collectors.groupingBy(classifier, Collectors.mapping(Function.identity(), Collectors.toList())));
return map.values().stream()
.filter(personList -> personList.size() > 1)
.flatMap(List::stream)
.collect(Collectors.toList());
}
Client code:
List<Person> persons = Collections.emptyList();
List<Person> duplicates = getDuplicates(persons, person -> person.fullName + ':' + person.occupation);
First implement equals and hashCode in your Person class (based on fullName and occupation), and then use:
List<Person> personList = new ArrayList<>();
Set<Person> duplicates = personList.stream().filter(p -> Collections.frequency(personList, p) > 1)
.collect(Collectors.toSet());
The > 1 check catches objects that occur two or more times; use == 2 only if you want those that occur exactly twice. Note that Collections.frequency scans the whole list for every element, so this approach is O(n²).
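For longer lists, the same equals/hashCode-based idea can be done in a single counting pass. A sketch that also shows one plausible equals/hashCode pair over fullName and occupation (the reduced Person class is an assumption, not the asker's full class):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class FrequencyDuplicates {

    // Reduced Person whose equality is defined by fullName and occupation only.
    static class Person {
        private final String fullName;
        private final String occupation;

        Person(String fullName, String occupation) {
            this.fullName = fullName;
            this.occupation = occupation;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Person)) return false;
            Person other = (Person) o;
            return Objects.equals(fullName, other.fullName)
                    && Objects.equals(occupation, other.occupation);
        }

        @Override
        public int hashCode() {
            return Objects.hash(fullName, occupation);
        }
    }

    // Counts each distinct person once, then keeps those seen more than once.
    static Set<Person> duplicates(List<Person> people) {
        Map<Person, Long> counts = people.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return counts.entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("A", "dev"), new Person("A", "dev"),
                new Person("A", "dev"), new Person("B", "qa"));
        System.out.println(duplicates(people).size());
    }
}
```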

Combine two sets conditionally

I have a Person object which has a name attribute and some other attributes. I have two HashSet with Person objects. Note that name is not an unique attribute meaning that two Persons with same name can have different height so using HashSet does not guarantee that two Persons with same name are not in the same set.
I need to add one set to another so there are no Persons in the result with the same name. So something like this:
public void combine(HashSet<Person> set1, HashSet<Person> set2){
for (Person item2 : set2) {
boolean exists = false;
for (Person item1 : set1) {
if(item2.name.equals(item1.name)){
exists = true;
}
}
if(!exists){
set1.add(item2);
}
}
}
Is there a cleaner way of doing this in java8?
set1.addAll(set2.stream().filter(e -> set1.stream()
.noneMatch(p -> p.getName().equals(e.getName())))
.collect(Collectors.toSet()));
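The one-liner still scans set1 once per element of set2. If a small helper set is acceptable, the names can be collected once and checked in constant time. A sketch, with an assumed minimal Person (name plus height) that does not override equals or hashCode:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

class CombineByName {

    // Assumed minimal Person; identity-based equals/hashCode, as in the question.
    static class Person {
        private final String name;
        private final int height;

        Person(String name, int height) {
            this.name = name;
            this.height = height;
        }

        String getName() { return name; }
        int getHeight() { return height; }
    }

    // Adds to set1 every person from set2 whose name is not yet present in set1.
    static void combine(Set<Person> set1, Set<Person> set2) {
        Set<String> names = set1.stream()
                .map(Person::getName)
                .collect(Collectors.toCollection(HashSet::new)); // explicitly mutable
        for (Person candidate : set2) {
            if (names.add(candidate.getName())) { // false if the name was already seen
                set1.add(candidate);
            }
        }
    }

    public static void main(String[] args) {
        Set<Person> set1 = new HashSet<>(Arrays.asList(new Person("Ann", 160)));
        Set<Person> set2 = new HashSet<>(Arrays.asList(new Person("Ann", 170), new Person("Bob", 180)));
        combine(set1, set2);
        for (Person p : set1) {
            System.out.println(p.getName() + " " + p.getHeight());
        }
    }
}
```

Because names is updated while iterating, two same-named persons inside set2 also collapse to one, which matches the nested-loop version in the question.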
If it makes sense for you to override equals and hashCode you can use something like this:
Set<Person> result = Stream.concat(set1.stream(), set2.stream())
.collect(Collectors.toSet());
Without the Java 8 streams you can easily just do this:
Set<Person> result = new HashSet<>();
result.addAll(set1);
result.addAll(set2);
But remember this solution is only feasible when it makes sense to have equals and hashCode overridden.
You can use a HashMap with the name as key, which avoids the O(n²) runtime complexity of your method. If you must work on the HashSets alone, without a helper structure, there is no faster way, even with Java 8 streams; they just add more overhead.
public Map<String, Person> combine(Set<Person> set1, Set<Person> set2) {
Map<String, Person> persons = new HashMap<>();
set1.forEach(pers -> persons.computeIfAbsent(pers.getName(), key -> pers));
set2.forEach(pers -> persons.computeIfAbsent(pers.getName(), key -> pers));
return persons;
}
Alternatively, you could create your own collector. Assuming that you're certain that two persons with the same name are in fact the same person:
First you can define a collector:
static Collector<Person, ?, Map<String, Person>> groupByName() {
return Collector.of(
HashMap::new,
(a,b) -> a.putIfAbsent(b.name, b),
(a,b) -> { b.forEach(a::putIfAbsent); return a; } // keep the first-seen person per name when merging partial maps
);
}
Then you can use it to group persons by name:
Stream.concat(s1.stream(), s2.stream())
.collect(groupByName());
However, this would give you a Map<String, Person> and you just want the whole set of Persons found, right?
So, you could just do:
Set<Person> result = Stream.concat(s1.stream(), s2.stream())
.collect(Collectors.collectingAndThen(groupByName(), m -> new HashSet<>(m.values())));
