Split list into duplicate and non-duplicate lists Java 8

I have a List<String> that may or may not contain duplicated values.
In the case of a duplicated "ABC" value (only "ABC" matters here), given
List myList = {"ABC", "EFG", "IJK", "ABC", "ABC"},
I want to split the list into two lists so that I end up with
List duplicatedValues = {"ABC"};
and
List nonDuplicatedValues = {"EFG", "IJK"};
If the list doesn't contain more than one "ABC", it should return the same list.
What I did so far :
void generateList(List<String> duplicatedValues, List<String> nonDuplicatedValues) {
    List<String> myList = List.of("ABC", "EFG", "IJK", "ABC", "ABC");
    Optional<String> duplicatedValue = myList.stream()
            .filter(isDuplicated -> Collections.frequency(myList, "ABC") > 1)
            .findFirst();
    if (duplicatedValue.isPresent()) {
        duplicatedValues.addAll(List.of(duplicatedValue.get()));
        nonDuplicatedValues.addAll(myList.stream()
                .filter(string -> !string.equals("ABC"))
                .collect(Collectors.toList()));
    } else {
        nonDuplicatedValues.addAll(myList);
    }
}
Is there a more efficient way to do that using only a stream of myList ?

You can do something like this:
myList.stream().forEach((x) -> ((Collections.frequency(myList, x) > 1) ? duplicatedValues : nonDuplicatedValues).add(x));
(The duplicatedValues should be a Set to prevent duplications)
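A minimal self-contained sketch of that one-liner, with duplicatedValues as a LinkedHashSet as suggested (variable names are only for illustration):
List<String> myList = Arrays.asList("ABC", "EFG", "IJK", "ABC", "ABC");
Set<String> duplicatedValues = new LinkedHashSet<>();   // a Set, so "ABC" is only kept once
List<String> nonDuplicatedValues = new ArrayList<>();
myList.forEach(x -> (Collections.frequency(myList, x) > 1
        ? duplicatedValues : nonDuplicatedValues).add(x));
// duplicatedValues    -> [ABC]
// nonDuplicatedValues -> [EFG, IJK]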

Also it can be done by collecting to lists of duplicated and non-duplicated values:
Map<Boolean, List<String>> result = input.stream()
.collect(Collectors.collectingAndThen(
Collectors.groupingBy(s -> s, Collectors.counting()),
m -> m.entrySet().stream()
.collect(Collectors.groupingBy(e -> e.getValue() > 1,
Collectors.mapping(e -> e.getKey(), Collectors.toList()))
)
));
List<String> duplicates = result.get(true);
List<String> nonDuplicates = result.get(false);
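A variation on the same idea (just a sketch, not part of the original answer): compute the frequency map first and then partition its keys with Collectors.partitioningBy, which always produces both the true and false entries, so neither list can be null:
Map<String, Long> counts = input.stream()
        .collect(Collectors.groupingBy(s -> s, Collectors.counting()));
Map<Boolean, List<String>> partitioned = counts.keySet().stream()
        .collect(Collectors.partitioningBy(s -> counts.get(s) > 1));
List<String> dups = partitioned.get(true);        // [ABC]
List<String> nonDups = partitioned.get(false);    // [EFG, IJK]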

It is possible to use a stream to build a Map of each string to its frequency in your list; afterwards you can iterate over the map to put the elements into the duplicatedValues and nonDuplicatedValues lists, like below:
List<String> duplicatedValues = new ArrayList<String>();
List<String> nonDuplicatedValues = new ArrayList<String>();
List<String> myList=List.of("ABC","EFG","IJK","ABC","ABC");
Map<String, Long> map = myList.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
map.forEach((k, v) -> { if (v > 1) duplicatedValues.add(k); else nonDuplicatedValues.add(k); });

Here is one way to do it. It basically does a frequency count and divides accordingly.
List<String> myList = new ArrayList<>(
List.of("ABC", "EFG", "IJK", "ABC", "ABC", "RJL"));
Map<String,Long> freq = new HashMap<>();
for (String str : myList) {
freq.compute(str, (k,v)->v == null ? 1 : v + 1);
}
Map<String,List<String>> dupsAndNonDups = new HashMap<>();
for (Entry<String,Long> e : freq.entrySet()) {
dupsAndNonDups.computeIfAbsent(e.getValue() > 1 ? "dups" : "nondups",
k-> new ArrayList<>()).add(e.getKey());
}
System.out.println("dups = " + dupsAndNonDups.get("dups"));
Prints
dups = [ABC]
nondups = [RJL, EFG, IJK]

Java Stream fill map from two lists by the first list items as keys

I have two lists, and I need to check that every product (from products) has a code (from productCodes):
List<String> productCodes = List.of("X_14_AA_85", "X_14_BB_85", "X_14_ZZ_85");
List<String> products = List.of("AA", "BB", "CC", "ZZ");
// I want to achieve a collection of (product, product code),
// according to whether the product name appears in the product code:
// key - product, value - product code
/*
Map<String, String> map = Map.of(
"AA", "X_14_AA_85",
"BB", "X_14_BB_85",
"CC", null, // null if code doesn't exist
"ZZ", "X_14_ZZ_85"
);
*/
// after filtering out the entries with null values I could return a message, something like this
// List<String> nullableProducts = List.of("CC");
// return "I could prompt that there's no code for product/s: " + nullableProducts;
Is there a way with streams to filter by list item values?
You can stream the keySet and filter null values:
Java 16+:
List<String> list = map.keySet().stream()
.filter(key -> map.get(key) == null).toList();
Java 15 and older:
List<String> list = map.keySet().stream()
.filter(key -> map.get(key) == null)
.collect(Collectors.toList());
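A small variation (my own sketch, not from the answer): stream the entrySet instead, which avoids the extra map.get() lookup per key:
List<String> list = map.entrySet().stream()
        .filter(e -> e.getValue() == null)
        .map(Map.Entry::getKey)
        .collect(Collectors.toList()); // or .toList() on Java 16+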
Note: You can't instantiate an unmodifiable Map using Map.of() with null keys or values. Instead, you can do:
Map<String, String> map = new HashMap<>();
map.put("AA", "X_14_AA_85");
map.put("BB", "X_14_BB_85");
map.put("CC", null);
map.put("ZZ", "X_14_ZZ_85");
If the purpose is to get a map containing null values, this has to be implemented with a custom collector, because the built-in Collectors.toMap throws a NullPointerException when a value is null:
List<String> productCodes = List.of("X_14_AA_85", "X_14_BB_85", "X_14_ZZ_85");
List<String> products = List.of("AA", "BB", "CC", "ZZ");
Map<String, String> mapCodes = products.stream()
.distinct()
.collect(
HashMap::new,
(m, p) -> m.put(p, productCodes
.stream()
.filter(pc -> pc.contains(p))
.findFirst()
.orElse(null)
),
HashMap::putAll
);
// -> {AA=X_14_AA_85, BB=X_14_BB_85, CC=null, ZZ=X_14_ZZ_85}
Then the list of non-matched products may be retrieved as follows:
List<String> nonMatchedProducts = mapCodes.entrySet()
.stream()
.filter(e -> e.getValue() == null)
.map(Map.Entry::getKey)
.collect(Collectors.toList());
// -> [CC]
However, since findFirst returns an Optional, the Optional itself can be stored as the map value with Collectors.toMap, and the non-matched entries can then be filtered out using Optional::isEmpty:
Map<String, Optional<String>> mapCodes2 = products.stream()
.distinct()
.collect(Collectors.toMap(
p -> p,
p -> productCodes.stream().filter(pc -> pc.contains(p)).findFirst()
));
// -> {AA=Optional[X_14_AA_85], BB=Optional[X_14_BB_85], CC=Optional.empty, ZZ=Optional[X_14_ZZ_85]}
List<String> nonMatchedProducts2 = mapCodes2.entrySet()
.stream()
.filter(e -> e.getValue().isEmpty())
.map(Map.Entry::getKey)
.collect(Collectors.toList());
// -> [CC]
Alternatively, the null/empty values may not be stored at all; the non-matched products can then be found by removing all the matched ones:
Map<String, String> map3 = new HashMap<>();
for (String p : products) {
productCodes.stream()
.filter(pc -> pc.contains(p))
.findFirst()
.ifPresent(pc -> map3.put(p, pc)); // only matched pairs
}
// -> {AA=X_14_AA_85, BB=X_14_BB_85, ZZ=X_14_ZZ_85}
List<String> nonMatchedProducts3 = new ArrayList<>(products);
nonMatchedProducts3.removeAll(map3.keySet());
// -> [CC]
Given your two lists, I would do something like this. I added two products that contain non-existent codes.
List<String> products =
List.of("X_14_AA_85", "X_14_SS_88", "X_14_BB_85", "X_14_ZZ_85", "X_16_RR_85");
List<String> productCodes = List.of("AA", "BB", "CC", "ZZ");
Declare a lambda to extract the code and copy the codes to a set for efficient lookup. In fact, since duplicate codes aren't necessary, a set would be the preferred data structure from the start.
Assuming the product code is always in the same position and has the same length, you can do it like this using substring. Otherwise you may need a regular expression to parse the product string (see the sketch after the output below).
Function<String, String> extractCode = code -> code.substring(5, 7);
Set<String> productCodeSet = new HashSet<>(productCodes);
And run it like this.
List<String> missingCodes = products.stream()
.filter(product -> !productCodeSet
.contains(extractCode.apply(product)))
.toList();
System.out.println("There are no codes for the following products: " + missingCodes);
Prints
There are no codes for the following products: [X_14_SS_88, X_16_RR_85]
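Following up on the note about regular expressions above, here is a sketch of a regex-based extractor (my own illustration, using java.util.regex; the pattern assumes the format is always prefix_number_code_number):
// assumes product strings look like "X_14_AA_85": the code is the third underscore-separated field
Pattern codePattern = Pattern.compile("[^_]+_[^_]+_([^_]+)_.*");
Function<String, String> extractCode = product -> {
    Matcher m = codePattern.matcher(product);
    return m.matches() ? m.group(1) : ""; // empty string if the format doesn't match
};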

How to find duplicate values based upon first 10 digits?

I have a scenario where I have a list as below:
List<String> a1 = new ArrayList<String>();
a1.add("1070045028000");
a1.add("1070045028001");
a1.add("1070045052000");
a1.add("1070045086000");
a1.add("1070045052001");
a1.add("1070045089000");
I tried the code below to find duplicate elements, but it checks the whole string instead of a partial string (the first 10 digits).
Set<String> unique = new HashSet<>();
for (String s : a1) {
    if (!unique.add(s)) {
        System.out.println(s);
    }
}
Is there a way to identify all duplicates based upon the first 10 digits, then find the lowest string within each group of duplicates and add it to another list?
Note: there will always be only 2 duplicates for each 10-digit code!
You may group by (String s) -> s.substring(0, 10):
Map<String, List<String>> map = list.stream()
.collect(Collectors.groupingBy(s -> s.substring(0, 10)));
map.values() would give you Collection<List<String>> where each List<String> is a list of duplicates.
{
1070045028=[1070045028000, 1070045028001],
1070045089=[1070045089000],
1070045086=[1070045086000],
1070045052=[1070045052000, 1070045052001]
}
If it's a single-element list, no duplicates were found, and you can filter these entries out.
{
1070045028=[1070045028000, 1070045028001],
1070045052=[1070045052000, 1070045052001]
}
Then the problem boils down to reducing a list of values to a single value.
[1070045028000, 1070045028001] -> 1070045028000
Since the first 10 characters are the same, we may ignore them while comparing.
[1070045028000, 1070045028001] -> [000, 001]
They are still raw String values, so we may convert them to numbers.
[000, 001] -> [0, 1]
A natural Comparator<Integer> will give 0 as the minimum.
0
0 -> 000 -> 1070045028000
Repeat it for all the lists in map.values() and you are done.
The code would be
List<String> result = map
        .values()
        .stream()
        .filter(list -> list.size() > 1)
        .map(l -> l.stream()
                .min(Comparator.comparingInt(s -> Integer.valueOf(s.substring(10))))
                .get())
        .collect(Collectors.toList());
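For the sample list this should yield the minimum of each duplicate group (the order may vary, because groupingBy collects into a HashMap):
[1070045028000, 1070045052000]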
A straightforward loop solution would be
List<String> a1 = Arrays.asList("1070045028000", "1070045028001",
"1070045052000", "1070045086000", "1070045052001", "1070045089000");
Set<String> unique = new HashSet<>();
Map<String,String> map = new HashMap<>();
for(String s: a1) {
String firstTen = s.substring(0, 10);
if(!unique.add(firstTen)) map.put(firstTen, s);
}
for(String s1: a1) {
String firstTen = s1.substring(0, 10);
map.computeIfPresent(firstTen, (k, s2) -> s1.compareTo(s2) < 0? s1: s2);
}
List<String> minDup = new ArrayList<>(map.values());
First, we add all duplicates to a Map, then we iterate over the list again and select the minimum for all values present in the map.
Alternatively, we may add all elements to a map, collecting them into lists, then select the minimum out of those, which have a size bigger than one:
List<String> minDup = new ArrayList<>();
Map<String,List<String>> map = new HashMap<>();
for(String s: a1) {
map.computeIfAbsent(s.substring(0, 10), x -> new ArrayList<>()).add(s);
}
for(List<String> list: map.values()) {
if(list.size() > 1) minDup.add(Collections.min(list));
}
This logic is directly expressible with the Stream API:
List<String> minDup = a1.stream()
.collect(Collectors.groupingBy(s -> s.substring(0, 10)))
.values().stream()
.filter(list -> list.size() > 1)
.map(Collections::min)
.collect(Collectors.toList());
Since you said that there will be only 2 duplicates per key, the overhead of collecting a List before selecting the minimum is negligible.
The solutions above assume that you only want to keep values having duplicates. Otherwise, you can use
List<String> minDup = a1.stream()
.collect(Collectors.collectingAndThen(
Collectors.toMap(s -> s.substring(0, 10), Function.identity(),
BinaryOperator.minBy(Comparator.<String>naturalOrder())),
m -> new ArrayList<>(m.values())));
which is equivalent to
Map<String,String> map = new HashMap<>();
for(String s: a1) {
map.merge(s.substring(0, 10), s, BinaryOperator.minBy(Comparator.naturalOrder()));
}
List<String> minDup = new ArrayList<>(map.values());
Common to those solutions is that you don't have to identify duplicates first: when you want to keep unique values too, the task reduces to selecting the minimum whenever more than one value maps to the same key.
While I hate doing your homework for you, this was fun. :/
public static void main(String[] args) {
List<String> al=new ArrayList<>();
al.add("1070045028000");
al.add("1070045028001");
al.add("1070045052000");
al.add("1070045086000");
al.add("1070045052001");
al.add("1070045089000");
List<String> ret=new ArrayList<>();
for(String a:al) {
boolean handled = false;
for(int i=0;i<ret.size();i++){
String ri = ret.get(i);
if(ri.substring(0, 10).equals(a.substring(0,10))) {
Long iri = Long.parseLong(ri);
Long ia = Long.parseLong(a);
if(ia < iri){
//a is smaller, so replace it in the list
ret.set(i, a);
}
//it was a duplicate, we are done with it
handled = true;
break;
}
}
if(!handled) {
//wasn't a duplicate, just add it
ret.add(a);
}
}
System.out.println(ret);
}
prints
[1070045028000, 1070045052000, 1070045086000, 1070045089000]
Here's another way to do it – construct a Set and store just the 10-digit prefix:
Set<String> set = new HashSet<>();
for (String number : a1) {
String prefix = number.substring(0, 10);
if (set.contains(prefix)) {
System.out.println("found duplicate prefix [" + prefix + "], skipping " + number);
} else {
set.add(prefix);
}
}
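The snippet above only reports the duplicate prefixes. If you also want to keep the smallest string per prefix (my own extension of this answer, which keeps unique values as well):
Map<String, String> smallestByPrefix = new LinkedHashMap<>(); // keeps first-appearance order
for (String number : a1) {
    String prefix = number.substring(0, 10);
    String current = smallestByPrefix.get(prefix);
    if (current == null || number.compareTo(current) < 0) {
        smallestByPrefix.put(prefix, number); // remember the lexicographically smallest string per prefix
    }
}
System.out.println(smallestByPrefix.values());
// -> [1070045028000, 1070045052000, 1070045086000, 1070045089000]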

Convert nested loops into streams Java 8

I am trying to convert the nested loop below into Java 8 streams.
Each element in newself2 is a list of strings; for example, ["1 2","3 4"] needs to become ["1","2","3","4"].
for (List<String> list : newself2) {
// cartesian = [["1 2","3 4"],["4 5","6 8"]...] list = ["1 2","3 4"]...
List<String> clearner = new ArrayList<String>();
for (String string : list) { //string = "1 3 4 5"
for (String stringElement : string.split(" ")) {
clearner.add(stringElement);
}
}
newself.add(clearner);
//[["1","2","3","4"],["4","5","6","8"]...]
}
What I have tried so far:
newself2.streams().forEach(list -> list.foreach(y -> y.split(" ")))
Now I am not sure how to add the split arrays from the inner for loop to a new list.
Any help is greatly appreciated.
Here's how I'd do it:
List<List<String>> result = newself2.stream()
.map(list -> list.stream()
.flatMap(string -> Arrays.stream(string.split(" ")))
.collect(Collectors.toList()))
.collect(Collectors.toList());
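For reference, a quick usage sketch with the sample data from the question's comments (the input literal is assumed):
List<List<String>> newself2 = Arrays.asList(
        Arrays.asList("1 2", "3 4"),
        Arrays.asList("4 5", "6 8"));
// result -> [[1, 2, 3, 4], [4, 5, 6, 8]]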
This is another solution:
Function<List<String>,List<String>> function = list->Arrays.asList(list.stream()
.reduce("",(s, s2) -> s.concat(s2.replace(" ",",")+",")).split(","));
and use this function
List<List<String>> finalResult = lists
.stream()
.map(function::apply)
.collect(Collectors.toList());
The equivalent for loop would be similar to this:
List<List<String>> finalResult = new ArrayList<>();
for (List<String> list : lists) {
String acc = "";
for (String s : list) {
acc = acc.concat(s.replace(" ", ",") + ",");
}
finalResult.add(Arrays.asList(acc.split(",")));
}

Java stream - find unique elements

I have List<Person> persons = new ArrayList<>(); and I want to list all unique names. I mean If there are "John", "Max", "John", "Greg" then I want to list only "Max" and "Greg". Is there some way to do it with Java stream?
We can use streams and Collectors.groupingBy in order to count how many occurrences we have of each name - then filter any name that appears more than once:
List<String> res = persons.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet()
.stream()
.filter(e -> e.getValue() == 1)
.map(e -> e.getKey())
.collect(Collectors.toList());
System.out.println(res); // [Max, Greg]
List<String> persons = new ArrayList<>();
persons.add("Max");
persons.add("John");
persons.add("John");
persons.add("Greg");
List<String> unique = persons.stream()
        .filter(person -> Collections.frequency(persons, person) == 1)
        .collect(Collectors.toList());
First guess solution.
persons.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet()
.stream()
.filter(entry -> entry.getValue() == 1)
.map(Map.Entry::getKey)
.collect(Collectors.toList())
Here is my solution:
List<String> persons = new ArrayList<>();
persons.add("John");
persons.add("John");
persons.add("MAX");
persons.add("Greg");
persons.stream()
.distinct()
.sorted()
.collect(Collectors.toList());
This is an old post, but I'd like to propose yet another approach based on a custom collector:
public static <T> Collector<T, ?, List<T>> excludingDuplicates() {
return Collector.<T, Map<T, Boolean>, List<T>>of(
LinkedHashMap::new,
(map, elem) -> map.compute(elem, (k, v) -> v == null),
(left, right) -> {
right.forEach((k, v) -> left.merge(k, v, (o, n) -> false));
return left;
},
m -> m.keySet().stream().filter(m::get).collect(Collectors.toList()));
}
Here I'm using Collector.of to create a custom collector that will accumulate elements on a LinkedHashMap: if the element is not present as a key, its value will be true, otherwise it will be false. The merge function is only applied for parallel streams and it merges the right map into the left map by attempting to put each entry of the right map in the left map, changing the value of already present keys to false. Finally, the finisher function returns a list with the keys of the map whose values are true.
This method can be used as follows:
List<String> people = Arrays.asList("John", "Max", "John", "Greg");
List<String> result = people.stream().collect(excludingDuplicates());
System.out.println(result); // [Max, Greg]
And here's another approach simpler than using a custom collector:
Map<String, Boolean> duplicates = new LinkedHashMap<>();
people.forEach(elem -> duplicates.compute(elem, (k, v) -> v != null));
duplicates.values().removeIf(v -> v);
Set<String> allUnique = duplicates.keySet();
System.out.println(allUnique); // [Max, Greg]
You can try the below code.
List<Person> uniquePersons = personList.stream()
.collect(Collectors.groupingBy(person -> person.getName()))
.entrySet().stream().filter(stringListEntry -> stringListEntry.getValue().size()==1)
.map(stringListEntry -> { return stringListEntry.getValue().get(0); })
.collect(Collectors.toList());
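For completeness, a minimal Person class assumed by the snippet above (the original answer does not show it); only the getName() accessor is required:
public class Person {
    private final String name;

    public Person(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }
}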
This should remove all the duplicate elements.
List<String> persons = new ArrayList<>();
persons.add("John");
persons.add("John");
persons.add("MAX");
persons.add("Greg");
Set<String> set = new HashSet<String>();
Set<String> duplicateSet = new HashSet<String>();
for (String p : persons) {
if (!set.add(p)) {
duplicateSet.add(p);
}
}
System.out.println(duplicateSet.toString());
set.removeAll(duplicateSet);
System.out.println(set.toString());
You can simply use Collections.frequency to check each element's occurrence count in the list, as shown below, to filter out the duplicates:
List<String> listInputs = new ArrayList<>();
//add your users
List<String> listOutputs = new ArrayList<>();
for(String value : listInputs) {
if(Collections.frequency(listInputs, value) ==1) {
listOutputs.add(value);
}
}
System.out.println(listOutputs);

Counting same Strings from Array in Java

How can I count the same Strings from an array and write them out to the console?
The order of the items should correspond to the order of the first appearance of the item. If there are two or more items of a kind, add an "s" to the item name.
String[] array = {"Apple","Banana","Apple","Peanut","Banana","Orange","Apple","Peanut"};
Output:
3 Apples
2 Bananas
2 Peanuts
1 Orange
I tried this:
String[] input = new String[1000];
Scanner sIn = new Scanner(System.in);
int counter =0;
String inputString = "start";
while(inputString.equals("stop")==false){
inputString = sIn.nextLine();
input[counter]=inputString;
counter++;
}
List<String> asList = Arrays.asList(input);
Map<String, Integer> map = new HashMap<String, Integer>();
for (String s : input) {
map.put(s, Collections.frequency(asList, s));
}
System.out.println(map);
But I don't know how to get the elements out of the Map and order them the way I would like.
You can use a Map to hold your result; here is a simple example:
public static void main(String args[]){
String[] array = {"Apple","Banana","Apple","Peanut","Banana","Orange","Apple","Peanut"};
Map<String, Integer> result = new HashMap<>();
for(String s : array){
if(result.containsKey(s)){
//if the map contain this key then just increment your count
result.put(s, result.get(s)+1);
}else{
//else just create a new node with 1
result.put(s, 1);
}
}
System.out.println(result);
}
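A more compact version of the same counting loop (a sketch of my own, not part of the original answer), using Map.merge and a LinkedHashMap so the counts come out in first-appearance order as the question requires:
Map<String, Integer> counts = new LinkedHashMap<>(); // keeps first-appearance order
for (String s : array) {
    counts.merge(s, 1, Integer::sum); // insert 1, or add 1 to the existing count
}
System.out.println(counts); // {Apple=3, Banana=2, Peanut=2, Orange=1}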
Use Java streams groupingBy and collect the results into a Map<String, Long> as shown below:
String[] array = {"Apple","Banana","Apple","Peanut","Banana","Orange","Apple", "Peanut"};
Map<String, Long> map = Stream.of(array).collect(Collectors.
        groupingBy(Function.identity(),  // group by the array element itself
        Collectors.counting()));         // count the number of occurrences
System.out.println(map); // output the results of the Map
Java 8 would allow a pretty elegant way of doing this with groupingBy and counting. Using a LinkedHashMap instead of the default map should handle the ordering:
Arrays.stream(array)
.collect(Collectors.groupingBy(Function.identity(),
LinkedHashMap::new,
Collectors.counting()))
.entrySet()
.forEach(e -> System.out.println(e.getValue() +
"\t" +
e.getKey() +
(e.getValue() > 1 ? "s" : "")));
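For the sample array this should print, in first-appearance order (preserved by the LinkedHashMap):
3 Apples
2 Bananas
2 Peanuts
1 Orange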
Using Java 8:
Map<String, Long> myMap = Stream.of(array).collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
