I'm facing the challenge of coming up with an efficient way to merge two ArrayLists of Maps.
The map looks like this:
{Username=User1, Role=Admin}
So one list looks like this:
List1 = [{Username=User1, Role=Admin},{Username=User2, Role=Auditor}]
and so on.
There is another list:
List2 = [{Username=User1, Role=Integrator},{Username=User2, Role=Manager}]
Note: The users have different roles in different lists.
What I want to end up with is:
MergedList = [{Username=User1, Role=[Admin,Integrator]},{Username=User2, Role=[Auditor,Manager]}]
Another note: the actual list has 50,000 maps and each map has 20 entries! I just tried to keep it simple here.
Below is what I tried, but it failed:
Tried putAll.
Tried merge.
Tried something that I found in another post:
map2.forEach((k, v) -> map3.merge(k, v, String::concat));
With regard to performance and the massive amount of data, I recommend avoiding java-stream (although it is quite quick itself) and the Map::merge method.
Here you have to stick with the constructs closest to the JVM level, and for-loops are your friends. Here is the simplest approach I am aware of that might work:
final Map<String, Set<String>> newMap = new HashMap<>();
for (Map<String, String> map : list) { // iterate the List<Map>
    for (Entry<String, String> entry : map.entrySet()) { // iterate the entries
        newMap.computeIfAbsent(entry.getKey(), k -> new HashSet<>()) // create the Set on the key's first occurrence
              .add(entry.getValue()); // add the value in any case
    }
}
Set prevents duplicate values.
This solution assumes the following data structure. Slight variations are easy to apply to the solution above.
List<Map<String, String>> list = new ArrayList<>();
Map<String, String> map1 = new HashMap<>();
map1.put("User1", "Admin");
map1.put("User2", "Auditor");
Map<String, String> map2 = new HashMap<>();
map2.put("User1", "Integrator");
map2.put("User2", "Manager");
map2.put("User3", "Coffee machine");
list.add(map1);
list.add(map2);
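Running the nested loops above over this list yields (HashMap iteration order is not guaranteed):
{User1=[Admin, Integrator], User2=[Auditor, Manager], User3=[Coffee machine]}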
You can use Java Streams to achieve this:
Map<String, List<String>> result = Stream.concat(users1.stream(), users2.stream())
.collect(Collectors.groupingBy(m -> m.get("Username"),
        Collectors.mapping(m -> m.get("Role"), Collectors.toList())));
This groups all the users and collects their roles.
The result will be:
{User1=[Admin, Integrator], User2=[Auditor, Manager]}
I have a Set<String> set1 and a Set<String> set2, as well as two functions getSet1ElementScore(String s) and getSet2ElementScore(String s) (both returning Integers). I want to insert all elements from both sets into a HashMap as its keys, with each key's value calculated by either getSet1ElementScore or getSet2ElementScore, depending on which set the key came from.
Can I use a stream to pipeline this?
I'm not 100% sure I got your question right. This might achieve what you want:
Set<String> set1 = new HashSet<>();
Set<String> set2 = new HashSet<>();
Map<String, Integer> mapFromSet1 =
    set1.stream().collect(Collectors.toMap(Function.identity(), p -> getSet1ElementScore(p)));
Map<String, Integer> mapFromSet2 =
    set2.stream().collect(Collectors.toMap(Function.identity(), p -> getSet2ElementScore(p)));
Map<String, Integer> resultMap = new HashMap<>();
resultMap.putAll(mapFromSet1);
resultMap.putAll(mapFromSet2);
It should be possible to transform it into one single pipeline, but I think you'd need (unnecessarily) more code than that; a sketch follows.
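For illustration, a minimal single-pipeline sketch, assuming set2's score should win for elements present in both sets (matching the putAll order above):
Map<String, Integer> resultMap = Stream.concat(set1.stream(), set2.stream())
        .distinct() // avoid duplicate keys for elements present in both sets
        .collect(Collectors.toMap(
                Function.identity(),
                s -> set2.contains(s) ? getSet2ElementScore(s) : getSet1ElementScore(s)));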
You can process the elements of the two sets calling the appropriate function as:
Map<String, Integer> result = set1.stream()
        .collect(Collectors.toMap(Function.identity(), this::getSet1ElementScore,
                (existing, replacement) -> existing, // keep the existing value on a key collision
                HashMap::new));
result.putAll(
        set2.stream()
            .collect(Collectors.toMap(Function.identity(), this::getSet2ElementScore))
);
I explicitly created a HashMap in the first processing so that it is mutable and we can merge the second into it.
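(Note that without the map-factory overload, Collectors.toMap makes no guarantee about the type or mutability of the returned Map, hence the explicit HashMap::new.)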
I have a List of LinkedHashMaps, i.e. List<Map<String, String>>. Every Map has the same number of elements and every Map has the same keys.
The second input is a LinkedHashSet<String> - the set of keys.
Now I would like to order every Map from the List by keys. The sort order is given by the LinkedHashSet<String>.
My attempt is to iterate over the List<Map<String, String>>. For every Map, create a new Map and iterate over the Set. Into the new Map, put the key and the value from the old Map, where the key is taken from the Set. In code:
private List<Map<String, String>> sort(List<Map<String, String>> result, LinkedHashSet<String> keys) {
    List<Map<String, String>> sortedResult = new LinkedList<>();
    result.forEach(map -> {
        Map<String, String> sortedMap = new LinkedHashMap<>();
        keys.forEach(key -> sortedMap.put(key, map.get(key)));
        sortedResult.add(sortedMap);
    });
    return sortedResult;
}
I think it is a little bit complicated, and in my opinion there exists a better way to do it.
A LinkedHashMap maintains only the insertion order of keys, not the natural ordering of keys. One thing you can do is maintain a list of keys outside the map, sort it, and re-insert the <key, value> pairs in the order of the sorted key list. It seems you are already doing this in your code by having the order defined by the LinkedHashSet.
The other simple approach is:
If you want a map ordered by its keys, you most probably need a TreeMap; insertion into this map maintains the natural ordering of keys, and you can construct a TreeMap from an existing map.
private List<Map<String, String>> sort(List<Map<String, String>> result) {
    List<Map<String, String>> sortedResult = new LinkedList<>();
    for (Map<String, String> m : result)
        sortedResult.add(new TreeMap<>(m));
    return sortedResult;
}
BTW, local variables referenced from a lambda expression must be final or effectively final.
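For example, a counter like this would not compile, because count is reassigned inside the lambda:
int count = 0;
result.forEach(m -> count++); // compile error: count must be final or effectively final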
There are a couple of things I would change:
The argument name "result" is misleading; people going over the code quickly will think it is the returned result. I would change it to "unsortedMaps" or something similar.
The maps shouldn't affect each other, so instead of result.forEach you could use result.parallelStream().forEach to sort every map in its own thread. You will then need to make the insertion into the list itself thread-safe, either by surrounding sortedResult.add(sortedMap) with a synchronized statement or by using a thread-safe list implementation. All this doesn't guarantee an improvement in performance; it depends on many factors, such as the size of the collections and the number of cores. Test it to find out.
There are a lot of details in this function; I would extract the part dealing with each map into a separate function.
Here is the result (I didn't test the code, so I can't guarantee correctness; needless to say, unit tests are always the way to go):
private List<Map<String, String>> sort(List<Map<String, String>> unsortedMaps, LinkedHashSet<String> keys) {
    List<Map<String, String>> sortedResult = new LinkedList<>();
    unsortedMaps.parallelStream().forEach(map -> {
        Map<String, String> sortedMap = getSortedMap(keys, map);
        synchronized (sortedResult) {
            sortedResult.add(sortedMap);
        }
    });
    return sortedResult;
}

private Map<String, String> getSortedMap(LinkedHashSet<String> keys, Map<String, String> map) {
    Map<String, String> sortedMap = new LinkedHashMap<>();
    keys.forEach(key -> sortedMap.put(key, map.get(key)));
    return sortedMap;
}
To complete SomeDude's answer: if the natural order isn't enough for your needs, you can specify a Comparator for the TreeMap:
private List<Map<String, String>> sort(List<Map<String, String>> mapList, Set<String> keys) {
    List<String> keysList = new ArrayList<>(keys);
    return mapList.stream()
            .map(map -> copyAndReOrderMap(map, keysList))
            .collect(Collectors.toList());
}

private Map<String, String> copyAndReOrderMap(Map<String, String> map, List<String> keysList) {
    // order keys by their position in the reference key list
    Map<String, String> orderedMap = new TreeMap<>((key1, key2) ->
            Integer.compare(keysList.indexOf(key1), keysList.indexOf(key2)));
    orderedMap.putAll(map);
    return orderedMap;
}
NB: unless you are dealing with very large maps, I don't see why you would want to sort each map in a separate thread.
Is there a better way to transform "Map<String, Collection<String>>" to "Map<String, List<String>>"?
Map<String, Collection<String>> collectionsMap = ...
Map<String, List<String>> listsaps =
    collectionsMap.entrySet().stream()
        .collect(Collectors.<Map.Entry<String, Collection<String>>, String, List<String>>toMap(
                Map.Entry::getKey,
                e -> e.getValue().stream().collect(Collectors.toList())
        ));
For cases like this, I'd consider using Map.forEach to perform the operation using side effects. Streams over maps are somewhat cumbersome, as one needs to write extra code to stream the map entries and then extract the key and value from each entry. By contrast, Map.forEach passes each key and value to the function as a separate parameter. Here's what that looks like:
Map<String, Collection<String>> collectionsMap = ...
Map<String, List<String>> listsaps = new HashMap<>(); // pre-size if desired
collectionsMap.forEach((k, v) -> listsaps.put(k, new ArrayList<>(v)));
If your map is large, you'll probably want to pre-size the destination in order to avoid rehashing during its population. To do this properly you have to know that HashMap takes the number of buckets, not the number of elements, as its parameter. This requires dividing by the default load factor of 0.75 in order to pre-size properly given a certain number of elements:
Map<String, List<String>> listsaps = new HashMap<>((int)(collectionsMap.size() / 0.75 + 1));
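On Java 19 and later, the static factory HashMap.newHashMap performs this calculation for you:
Map<String, List<String>> listsaps = HashMap.newHashMap(collectionsMap.size()); // Java 19+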
1) In Collectors.toMap() you don't need to repeat the generic types as these are inferred.
So:
collect(Collectors.<Map.Entry<String, Collection<String>>,
String, List<String>>toMap(...)
can be replaced by:
collect(Collectors.toMap(...)
2) The way of transforming the collection into a List could also be simplified.
This:
e -> e.getValue().stream().collect(Collectors.toList())
could be written as:
e -> new ArrayList<>(e.getValue())
You could write:
Map<String, List<String>> listsaps =
collectionsMap.entrySet()
.stream()
.collect(Collectors.toMap(
Map.Entry::getKey,
e -> new ArrayList<>(e.getValue())
)
);
I think that this is easier to read:
Map<String, List<String>> listsaps = new HashMap<>();
collectionsMap.entrySet()
.stream()
.forEach(e -> listsaps.put(e.getKey(), new ArrayList<>(e.getValue())));
If you just want to convert the entries to lists but don't really care about changing the type of the collection then you can use map.replaceAll:
collectionsMap.replaceAll((k, v) -> new ArrayList<>(v));
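Note that replaceAll mutates the map in place: the values are now ArrayLists at runtime, but the map's static type stays Map<String, Collection<String>>, so treating it as a Map<String, List<String>> would still require an unchecked cast.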
The map is of type Map<String, List<MyClass>>.
I can have more than 100 objects of MyClass associated with a key.
I need to split the list if it holds more than 100 of them.
For e.g.
Input is
Map<String, List<MyClass>> myMap = new HashMap<String, List<MyClass>>();
myMap.put("ABC", [CustomObject1,CustomObject2,CustomObject3....CustomObject100...CustomObject110]);
and the output should be:
myMap.put("ABC", [CustomObject1,CustomObject2,CustomObject3....CustomObject100]);
myMap.put("ABC", [CustomObject100,CustomObject101....CustomObject110]);
I thought of checking myMap.containsKey(string) and the size of the list, and then either creating a new entry or adding to the existing one.
I tried using Guava's Multimap, but it returns a Collection when I try to get the elements, so I'm not sure how to insert it. Is there a better option for this?
Try using Guava's Lists.partition:
Map<String, List<MyClass>> myMap = new HashMap<String, List<MyClass>>();
myMap.put("ABC", [CustomObject1,CustomObject2,CustomObject3....CustomObject100...CustomObject110]);
Map<String, List<List<MyClass>>> output = myMap.entrySet().stream()
        .collect(Collectors.toMap(e -> e.getKey(), e -> Lists.partition(e.getValue(), 100)));
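Note that the result type becomes Map<String, List<List<MyClass>>>: a Map cannot hold the same key twice, so each key maps to a list of chunks of at most 100 elements. Also keep in mind that Lists.partition returns sublist views backed by the original list, so copy them if the source list may change afterwards.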
I would like to flatten a Map which associates an Integer key with a list of Strings, without losing the key mapping.
I am curious whether it is possible and useful to do so with streams and lambdas.
We start with something like this:
Map<Integer, List<String>> mapFrom = new HashMap<>();
Let's assume that mapFrom is populated somewhere, and looks like:
1: a,b,c
2: d,e,f
etc.
Let's also assume that the values in the lists are unique.
Now, I want to "unfold" it to get a second map like:
a: 1
b: 1
c: 1
d: 2
e: 2
f: 2
etc.
I could do it like this (or very similarly, using forEach):
Map<String, Integer> mapTo = new HashMap<>();
for (Map.Entry<Integer, List<String>> entry : mapFrom.entrySet()) {
    for (String s : entry.getValue()) {
        mapTo.put(s, entry.getKey());
    }
}
Now let's assume that I want to use lambdas instead of nested for loops. I would probably do something like this:
Map<String, Integer> mapTo = mapFrom.entrySet().stream().map(e -> {
e.getValue().stream().?
// Here I can iterate on each List,
// but my best try would only give me a flat map for each key,
// that I wouldn't know how to flatten.
}).collect(Collectors.toMap(/*A String value*/,/*An Integer key*/))
I also gave flatMap a try, but I don't think it is the right way to go: although it helps me get rid of the dimensionality issue, I lose the key in the process.
In a nutshell, my two questions are:
Is it possible to use streams and lambdas to achieve this?
Is it useful (performance, readability) to do so?
You need to use flatMap to flatten the values into a new stream, but since you still need the original keys for collecting into a Map, you have to map to a temporary object holding key and value, e.g.
Map<String, Integer> mapTo = mapFrom.entrySet().stream()
.flatMap(e->e.getValue().stream()
.map(v->new AbstractMap.SimpleImmutableEntry<>(e.getKey(), v)))
.collect(Collectors.toMap(Map.Entry::getValue, Map.Entry::getKey));
The Map.Entry is a stand-in for the nonexistent tuple type; any other type capable of holding two objects of different types is sufficient.
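For instance, on Java 16+ a small record could play that role; a sketch (KV is an arbitrary name, not part of the original solution):
record KV(Integer key, String value) {}

Map<String, Integer> mapTo = mapFrom.entrySet().stream()
        .flatMap(e -> e.getValue().stream().map(v -> new KV(e.getKey(), v)))
        .collect(Collectors.toMap(KV::value, KV::key));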
An alternative not requiring these temporary objects is a custom collector:
Map<String, Integer> mapTo = mapFrom.entrySet().stream().collect(
HashMap::new, (m,e)->e.getValue().forEach(v->m.put(v, e.getKey())), Map::putAll);
This differs from toMap in overwriting duplicate keys silently, whereas toMap without a merge function will throw an exception if there is a duplicate key. Basically, this custom collector is a parallel-capable variant of
Map<String, Integer> mapTo = new HashMap<>();
mapFrom.forEach((k, l) -> l.forEach(v -> mapTo.put(v, k)));
But note that this task wouldn't benefit from parallel processing, even with a very large input map. Only if there were additional computationally intensive tasks within the stream pipeline that could benefit from SMP would there be a chance of profiting from parallel streams. So perhaps the concise, sequential Collection API solution is preferable.
You should use flatMap as follows:
entrySet.stream()
        .flatMap(e -> e.getValue().stream()
                .map(s -> new SimpleImmutableEntry<>(e.getKey(), s)));
SimpleImmutableEntry is a nested class in AbstractMap.
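Being a nested class, it can be imported directly so the short name above resolves:
import java.util.AbstractMap.SimpleImmutableEntry;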
Hope this does it in the simplest way. :))
mapFrom.forEach((key, values) -> values.forEach(value -> mapTo.put(value, key)));
This should work. Note that if the same String value appears under multiple keys, the keys will collide in the result (Collectors.toMap without a merge function throws on duplicate keys).
Map<Integer, List<String>> mapFrom = new HashMap<>();
Map<String, Integer> mapTo = mapFrom.entrySet().stream()
.flatMap(integerListEntry -> integerListEntry.getValue()
.stream()
.map(listItem -> new AbstractMap.SimpleEntry<>(listItem, integerListEntry.getKey())))
.collect(Collectors.toMap(AbstractMap.SimpleEntry::getKey, AbstractMap.SimpleEntry::getValue));
The same as the previous answers, but with Java 9:
Map<String, Integer> mapTo = mapFrom.entrySet()
.stream()
.flatMap(entry -> entry.getValue()
.stream()
.map(s -> Map.entry(s, entry.getKey())))
.collect(toMap(Entry::getKey, Entry::getValue));
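Note that Map.entry rejects null keys and values, whereas AbstractMap.SimpleImmutableEntry accepts them.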