This is a follow-up to my previous question, "Group, Sum byType then get diff using Java streams".
As suggested there, I am posting it as a separate thread instead of updating the original one.
The previous question is solved; this one builds on that solution.
Background:
I have the following dataset
Sample(SampleId=1, SampleTypeId=1, SampleQuantity=5, SampleType=ADD),
Sample(SampleId=2, SampleTypeId=1, SampleQuantity=15, SampleType=ADD),
Sample(SampleId=3, SampleTypeId=1, SampleQuantity=25, SampleType=ADD),
Sample(SampleId=4, SampleTypeId=1, SampleQuantity=5, SampleType=SUBTRACT),
Sample(SampleId=5, SampleTypeId=1, SampleQuantity=25, SampleType=SUBTRACT)
Sample(SampleId=6, SampleTypeId=2, SampleQuantity=10, SampleType=ADD),
Sample(SampleId=7, SampleTypeId=2, SampleQuantity=20, SampleType=ADD),
Sample(SampleId=8, SampleTypeId=2, SampleQuantity=30, SampleType=ADD),
Sample(SampleId=9, SampleTypeId=2, SampleQuantity=15, SampleType=SUBTRACT),
Sample(SampleId=10, SampleTypeId=2, SampleQuantity=35, SampleType=SUBTRACT)
I am currently using this:
sampleList.stream()
.collect(Collectors.groupingBy(Sample::getSampleTypeId,
Collectors.summingInt(
sample -> SampleType.ADD.equalsIgnoreCase(sample.getSampleType())
? sample.getSampleQuantity() :
-sample.getSampleQuantity()
)));
And also this
sampleList.stream()
.collect(Collectors.groupingBy(Sample::getSampleTypeId,
Collectors.collectingAndThen(
Collectors.groupingBy(Sample::getSampleType,
Collectors.summingInt(Sample::getSampleQuantity)),
map -> map.getOrDefault(SampleType.ADD, 0)
- map.getOrDefault(SampleType.SUBTRACT, 0))));
as the accepted answer to get the desired output to group in a Map<Long, Integer>:
{1=15, 2=10}
With that, I was wondering whether this could be expanded further.
First, how could I have it return a Map<String, Integer> instead of the original Map<Long, Integer>? Basically, for the SampleTypeId, 1 refers to HELLO and 2 refers to WORLD.
So I would need something like .map (or some other function) to transform 1 into HELLO and 2 into WORLD by calling a function, say convertType(sampleTypeId). The expected output would then be {"HELLO"=15, "WORLD"=10}. Is that right? How should I edit the suggested solution to do this?
Lastly, I would like to know whether it is also possible to return objects instead of a Map. Say I have a class SummaryResult with (String) name and (int) result, so the code returns a List<SummaryResult> instead of the original Map<Long, Integer>. How can I use .map (or another feature) to do this? Or is there another way of doing so? The expected output would be something along these lines.
SummaryResult(name="hello", result=15),
SummaryResult(name="world", result=10),
I would really appreciate a step-by-step explanation like the one given previously by @M. Prokhorov.
Update:
After updating to
sampleList.stream()
.collect(Collectors.groupingBy(sample -> convertType(sample.getSampleTypeId()),
Collectors.collectingAndThen(
Collectors.groupingBy(Sample::getSampleType,
Collectors.summingInt(Sample::getSampleQuantity)),
map -> map.getOrDefault(SampleType.ADD, 0)
- map.getOrDefault(SampleType.SUBTRACT, 0))));
private String convertType(int id) {
return (id == 1) ? "HELLO" : "WORLD";
}
For the first part, assuming you have the following method somewhere
String convertType(int typeId)
You simply need to change the first classifier from this
groupingBy(Sample::getTypeId)
to this
groupingBy(sample -> convertType(sample.getTypeId()))
Everything else remains the same.
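For reference, here is a minimal, self-contained sketch of that change applied to the data from the question. The trimmed-down Sample class, the SampleType enum and the body of convertType are assumptions made for illustration (the real classes are not shown in the question), and it uses the single-pass signed summingInt variant rather than the nested groupingBy:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupByNameSketch {
    enum SampleType { ADD, SUBTRACT }

    // Hypothetical stand-in for the question's Sample class (fields trimmed down).
    static final class Sample {
        final int typeId;
        final int quantity;
        final SampleType type;

        Sample(int typeId, int quantity, SampleType type) {
            this.typeId = typeId;
            this.quantity = quantity;
            this.type = type;
        }
    }

    // Assumed mapping, as described in the question: 1 -> HELLO, anything else -> WORLD.
    static String convertType(int typeId) {
        return typeId == 1 ? "HELLO" : "WORLD";
    }

    // Only the classifier changes; the downstream collector still sums signed quantities.
    static Map<String, Integer> netQuantityByName(List<Sample> samples) {
        return samples.stream().collect(Collectors.groupingBy(
                s -> convertType(s.typeId),
                Collectors.summingInt(s -> s.type == SampleType.ADD ? s.quantity : -s.quantity)));
    }

    // The ten rows listed in the question.
    static List<Sample> questionData() {
        return Arrays.asList(
                new Sample(1, 5, SampleType.ADD),
                new Sample(1, 15, SampleType.ADD),
                new Sample(1, 25, SampleType.ADD),
                new Sample(1, 5, SampleType.SUBTRACT),
                new Sample(1, 25, SampleType.SUBTRACT),
                new Sample(2, 10, SampleType.ADD),
                new Sample(2, 20, SampleType.ADD),
                new Sample(2, 30, SampleType.ADD),
                new Sample(2, 15, SampleType.SUBTRACT),
                new Sample(2, 35, SampleType.SUBTRACT));
    }

    public static void main(String[] args) {
        Map<String, Integer> result = netQuantityByName(questionData());
        System.out.println(result.get("HELLO") + " " + result.get("WORLD")); // prints "15 10"
    }
}
```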
The latter part is a little trickier, and technically doesn't benefit from being a stream-based solution at all.
What you need is this:
public List<SummaryResult> toSummaryResultList(Map<String, Integer> resultMap) {
List<SummaryResult> list = new ArrayList<>(resultMap.size());
for (Map.Entry<String, Integer> entry : resultMap.entrySet()) {
String name = entry.getKey();
Integer value = entry.getValue();
// replace below with construction method you actually have
list.add(SummaryResult.withName(name).andResult(value));
}
return list;
}
You can use this as part of collector composition, where your whole collector will get wrapped into a collectingAndThen call:
collectingAndThen(
groupingBy(sample -> convertType(sample.getTypeId()),
collectingAndThen(
groupingBy(Sample::getSampleType,
summingInt(Sample::getSampleQuantity)),
map -> map.getOrDefault(SampleType.ADD, 0)
- map.getOrDefault(SampleType.SUBTRACT, 0))),
result -> toSummaryResultList(result))
However, as you can see, it is the whole collector that gets wrapped, so in my eyes the version above has no real benefit over the simpler and (at least to me) easier-to-follow version below, which uses an intermediate variable and isn't so much of a wall of code:
// do the whole collecting thing like before
// the local variable must not be named "map", or it clashes with the lambda parameter below
Map<String, Integer> resultMap = sampleList.stream()
    .collect(Collectors.groupingBy(sample -> convertType(sample.getTypeId()),
        Collectors.collectingAndThen(
            Collectors.groupingBy(Sample::getSampleType,
                Collectors.summingInt(Sample::getSampleQuantity)),
            map -> map.getOrDefault(SampleType.ADD, 0)
                - map.getOrDefault(SampleType.SUBTRACT, 0))));
// return the "beautified" result
return toSummaryResultList(resultMap);
Another point to consider: convertType will be called once for every element in sampleList, so if the call is "heavy" (for example, it uses a database or IO), it is better to call it as part of the toSummaryResultList conversion rather than in the stream's classifier. In that case you would still collect into a Map<Integer, Integer> and call convertType inside the loop. I will not add code for this, as I view the change as trivial.
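For readers who want to see that variant spelled out, here is a rough sketch. The SummaryResult class, the int[] row representation ({typeId, signedQuantity}) and the call counter are assumptions added for illustration only; the point is just that convertType runs once per distinct id instead of once per element:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class ConvertAfterGroupingSketch {
    // Hypothetical result type with the two fields named in the question.
    static final class SummaryResult {
        final String name;
        final int result;

        SummaryResult(String name, int result) {
            this.name = name;
            this.result = result;
        }
    }

    // Counts invocations so the difference is visible; a real lookup might hit a database.
    static int convertCalls = 0;

    static String convertType(int typeId) {
        convertCalls++;
        return typeId == 1 ? "HELLO" : "WORLD";
    }

    // Each row is {typeId, signedQuantity}; grouping uses the raw id, so convertType is not called here.
    static Map<Integer, Integer> netByTypeId(List<int[]> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                row -> row[0], TreeMap::new, Collectors.summingInt(row -> row[1])));
    }

    // convertType runs once per distinct id, inside the conversion loop.
    static List<SummaryResult> toSummaryResultList(Map<Integer, Integer> netByTypeId) {
        List<SummaryResult> list = new ArrayList<>(netByTypeId.size());
        for (Map.Entry<Integer, Integer> entry : netByTypeId.entrySet()) {
            list.add(new SummaryResult(convertType(entry.getKey()), entry.getValue()));
        }
        return list;
    }

    // The question's data, with SUBTRACT rows already negated.
    static List<int[]> questionRows() {
        return Arrays.asList(
                new int[]{1, 5}, new int[]{1, 15}, new int[]{1, 25}, new int[]{1, -5}, new int[]{1, -25},
                new int[]{2, 10}, new int[]{2, 20}, new int[]{2, 30}, new int[]{2, -15}, new int[]{2, -35});
    }

    public static void main(String[] args) {
        for (SummaryResult r : toSummaryResultList(netByTypeId(questionRows()))) {
            System.out.println(r.name + "=" + r.result);
        }
        System.out.println("convertType calls: " + convertCalls);
    }
}
```

With ten input rows and two distinct ids, the counter ends at 2 rather than 10, which is the saving the paragraph above describes.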
You could indeed use a map() function
sampleList.stream()
.collect(Collectors.groupingBy(Sample::getSampleTypeId,
Collectors.collectingAndThen(
Collectors.groupingBy(Sample::getSampleType,
Collectors.summingInt(Sample::getSampleQuantity)),
map -> map.getOrDefault(SampleType.ADD, 0)
- map.getOrDefault(SampleType.SUBTRACT, 0))))
.entrySet()
.stream()
.map(entry -> new SummaryResult(entry.getKey(), entry.getValue()))
.collect(Collectors.toList());
ToIntFunction<Sample> signedQuantityMapper= sample -> sample.getQuantity()
* (sample.getType() == Type.ADD ? 1 : -1);
Function<Sample, String> keyMapper = s -> Integer.toString(s.getTypeId());
Map<String, Integer> result = sampleList.stream().collect(
Collectors.groupingBy(
keyMapper,
Collectors.summingInt(signedQuantityMapper)));
Related
I have a Map<String, Integer> that has some keys and values. I want to associate every key with its own length as the value.
I have been able to solve this with both a plain Java loop and Java 8 streams, but I don't like having to append a terminal operation such as .collect(Collectors.toList()) at the end, because my code doesn't need its result.
My code: ( Java ) works fine
Map<String, Integer> nameLength = new HashMap<>();
nameLength.put("John", null);
nameLength.put("Antony", 6);
nameLength.put("Yassir", 6);
nameLength.put("Karein", 6);
nameLength.put("Smith", null);
nameLength.put("JackeyLent",null);
for(Entry<String, Integer> length: nameLength.entrySet()){
if(length.getValue() == null){
nameLength.put(length.getKey(),length.getKey().length());
}
}
The Java 8 version also works, but the terminal operation is useless. How can I avoid it without using .forEach()?
nameLength.entrySet().stream().map(s->{
if(s.getValue() == null){
nameLength.put(s.getKey(),s.getKey().length());
}
return nameLength;
}).collect(Collectors.toList());
System.out.println(nameLength);
Any other way in which I can do the above logic in Java-8 and above??
If you're going to use streams then you should avoid side effects. Functional programming is all about pure operations where the output depends only on the input and functions have no side effects. In other words, create a new map instead of modifying the existing one.
If you do that you might as well just throw away the partially-filled-out map and recompute everything from scratch. Calling String.length() is cheap and it's not really worth the effort to figure out which values are null and which aren't. Recompute all the lengths.
Map<String, Integer> newMap = nameLength.keySet().stream()
.collect(Collectors.toMap(
name -> name,
name -> name.length()
));
On the other hand if you just want to patch up your current map streams don't really buy you anything. I'd just modify it in place without involving streams.
for (Map.Entry<String, Integer> entry: nameLength.entrySet()) {
if (entry.getValue() == null) {
entry.setValue(entry.getKey().length());
}
}
Or, as discussed above, you could simplify matters by replacing all of the lengths:
nameLength.replaceAll((name, __) -> name.length());
(__ signifies a variable that isn't used and so doesn't get a meaningful name.)
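If the existing non-null values should be kept rather than recomputed, the same replaceAll call can carry the null check. This is only a sketch of that variation (the class and method names are made up for the example); it relies on HashMap permitting null values and passing them to the function:

```java
import java.util.HashMap;
import java.util.Map;

class PatchNullLengthsSketch {
    // Fills in only the missing (null) lengths and leaves existing values untouched.
    static Map<String, Integer> patch(Map<String, Integer> nameLength) {
        nameLength.replaceAll((name, len) -> len == null ? name.length() : len);
        return nameLength;
    }

    public static void main(String[] args) {
        Map<String, Integer> nameLength = new HashMap<>();
        nameLength.put("John", null);
        nameLength.put("Antony", 6);
        nameLength.put("Smith", null);
        System.out.println(patch(nameLength));
    }
}
```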
You are almost there. Just use filter to pick out the entries with null values, then use Collectors.toMap to collect them into a Map with the key's length as the value. Note that the resulting map contains only the entries that were null:
Map<String, Integer> nameLengths = nameLength.entrySet()
.stream()
.filter(entry->entry.getValue()==null)
.collect(Collectors.toMap(Map.Entry::getKey, entry->entry.getKey().length()));
Or, more simply, do that check inside Collectors.toMap, which keeps every entry:
Map<String, Integer> nameLengths = nameLength.entrySet()
.stream()
.collect(Collectors.toMap(Map.Entry::getKey, entry->entry.getValue() == null ? entry.getKey().length() : entry.getValue()));
I am new to Java 8 streams. Here is my code, in which I am trying to create a map of id to value, but I am getting the error below and am not able to fix it. Can anyone tell me what the alternative is?
public static Map<Integer, String> findIdMaxValue(){
Map<Integer, Map<String, Integer>> attrIdAttrValueCountMap = new HashMap<>();
Map<Integer, String> attrIdMaxValueMap = new HashMap<>();
attrIdAttrValueCountMap.forEach((attrId, attrValueCountMap) -> {
attrValueCountMap.entrySet().stream().sorted(this::compareAttrValueCountEntry).findFirst().ifPresent(e -> {
attrIdMaxValueMap.put(attrId, e.getKey());
});
});
}
and sorting method
public static int compareAttrValueCountEntry(Map.Entry<String, Integer> e1, Map.Entry<String, Integer> e2) {
int diff = e1.getValue() - e2.getValue();
if (diff != 0) {
return -diff;
}
return e1.getKey().compareTo(e2.getKey());
}
I am getting this error
"Cannot use this in a static context"
There are several issues with your code. While this::compareAttrValueCountEntry would be easy to
fix by changing it to ContainingClassName::compareAttrValueCountEntry, this method is unnecessary
as there are several factory methods like Map.Entry.comparingByKey, Map.Entry.comparingByValue,
Comparator.reversed and Comparator.thenComparing, which can be combined to achieve the same goal.
This guards you from the errors made within compareAttrValueCountEntry. It’s tempting to compare int
values by subtracting, but this is error prone as the difference between two int values doesn’t always
fit into the int range, so overflows can occur. Also, negating the result for reversing the order is
broken, as the value might be Integer.MIN_VALUE, which has no positive counterpart, hence, negating it
will overflow back to Integer.MIN_VALUE instead of changing the sign.
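Both pitfalls are easy to reproduce. The following small sketch (class and method names are made up for the demonstration) compares a subtraction-based comparison with Integer.compare and shows the negation problem:

```java
class SubtractionComparatorSketch {
    // The tempting but broken way to compare two ints.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    public static void main(String[] args) {
        // Integer.MAX_VALUE - (-1) does not fit into an int and wraps around to a negative number,
        // so the subtraction claims that MAX_VALUE is smaller than -1.
        System.out.println(bySubtraction(Integer.MAX_VALUE, -1) < 0);       // prints "true"
        // Integer.compare never overflows and reports the correct order.
        System.out.println(Integer.compare(Integer.MAX_VALUE, -1) > 0);     // prints "true"
        // Negating MIN_VALUE yields MIN_VALUE again, so "-diff" cannot reverse it.
        System.out.println(-Integer.MIN_VALUE == Integer.MIN_VALUE);        // prints "true"
    }
}
```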
Instead of looping via forEach to add to another map, you may use a cleaner stream operation producing
the map, and you can simplify sorted(…).findFirst() to min(…), which is not only shorter but also a
potentially cheaper operation.
Putting it together, we get
Map<Integer, String> attrIdMaxValueMap =
attrIdAttrValueCountMap.entrySet().stream()
.filter(e -> !e.getValue().isEmpty())
.collect(Collectors.toMap(Map.Entry::getKey,
e -> e.getValue().entrySet().stream()
.min(Map.Entry.<String, Integer>comparingByValue().reversed()
.thenComparing(Map.Entry.comparingByKey())).get().getKey()));
Note that I prepended a filter operation rejecting empty maps, which ensures that there will always be
a matching element, so there is no need to deal with ifPresent or such alike. Instead, Optional.get
can be called unconditionally.
Since this method is called findIdMaxValue, there might be a desire to reflect that by calling max
on the Stream instead of min, which is only a matter of which comparator to reverse:
Map<Integer, String> attrIdMaxValueMap =
attrIdAttrValueCountMap.entrySet().stream()
.filter(e -> !e.getValue().isEmpty())
.collect(Collectors.toMap(Map.Entry::getKey,
e -> e.getValue().entrySet().stream()
.max(Map.Entry.<String, Integer>comparingByValue()
.thenComparing(Map.Entry.comparingByKey(Comparator.reverseOrder())))
.get().getKey()));
Unfortunately, such constructs hit the limitations of type inference, which requires us either to
use nested constructs (like Map.Entry.comparingByKey(Comparator.reverseOrder()) instead of
Map.Entry.comparingByKey().reversed()) or to insert explicit types, as with
Map.Entry.<String, Integer>comparingByValue(). In the second variant, reversing the second comparator,
we are hitting the limitation twice…
In this specific case, there might be a point in creating the comparator only once, keeping it in a variable and reusing it within the stream operation:
Comparator<Map.Entry<String, Integer>> valueOrMinKey
= Map.Entry.<String, Integer>comparingByValue()
.thenComparing(Map.Entry.comparingByKey(Comparator.reverseOrder()));
Map<Integer, String> attrIdMaxValueMap =
attrIdAttrValueCountMap.entrySet().stream()
.filter(e -> !e.getValue().isEmpty())
.collect(Collectors.toMap(Map.Entry::getKey,
e -> e.getValue().entrySet().stream().max(valueOrMinKey).get().getKey()));
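To see the tie-breaking in action, here is the same pipeline wrapped into a small runnable class. The sample counts are invented for the example: id 1 has two values tied at the highest count, id 2 has an empty map, and id 3 has a single value.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class MaxValueKeySketch {
    // Highest count wins; on a tie, the alphabetically smallest key wins.
    static final Comparator<Map.Entry<String, Integer>> VALUE_OR_MIN_KEY =
            Map.Entry.<String, Integer>comparingByValue()
                    .thenComparing(Map.Entry.comparingByKey(Comparator.reverseOrder()));

    static Map<Integer, String> findIdMaxValue(Map<Integer, Map<String, Integer>> counts) {
        return counts.entrySet().stream()
                .filter(e -> !e.getValue().isEmpty()) // empty maps have no maximum, so skip them
                .collect(Collectors.toMap(Map.Entry::getKey,
                        e -> e.getValue().entrySet().stream().max(VALUE_OR_MIN_KEY).get().getKey()));
    }

    // Invented sample data for the demonstration.
    static Map<Integer, Map<String, Integer>> sampleCounts() {
        Map<String, Integer> colors = new HashMap<>();
        colors.put("red", 3);
        colors.put("blue", 3);
        colors.put("green", 1);
        Map<String, Integer> single = new HashMap<>();
        single.put("x", 1);

        Map<Integer, Map<String, Integer>> counts = new HashMap<>();
        counts.put(1, colors);
        counts.put(2, new HashMap<>());
        counts.put(3, single);
        return counts;
    }

    public static void main(String[] args) {
        Map<Integer, String> result = findIdMaxValue(sampleCounts());
        System.out.println(result.get(1));         // prints "blue" (tie between red and blue)
        System.out.println(result.containsKey(2)); // prints "false" (empty map was filtered out)
    }
}
```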
Since the method compareAttrValueCountEntry is declared static,
replace the method reference
this::compareAttrValueCountEntry
with
<Yourclass>::compareAttrValueCountEntry
I would like to flatten a Map which associates an Integer key to a list of String, without losing the key mapping.
I am curious whether it is possible and useful to do so with streams and lambdas.
We start with something like this:
Map<Integer, List<String>> mapFrom = new HashMap<>();
Let's assume that mapFrom is populated somewhere, and looks like:
1: a,b,c
2: d,e,f
etc.
Let's also assume that the values in the lists are unique.
Now, I want to "unfold" it to get a second map like:
a: 1
b: 1
c: 1
d: 2
e: 2
f: 2
etc.
I could do it like this (or very similarly, using foreach):
Map<String, Integer> mapTo = new HashMap<>();
for (Map.Entry<Integer, List<String>> entry: mapFrom.entrySet()) {
for (String s: entry.getValue()) {
mapTo.put(s, entry.getKey());
}
}
Now let's assume that I want to use lambda instead of nested for loops. I would probably do something like this:
Map<String, Integer> mapTo = mapFrom.entrySet().stream().map(e -> {
e.getValue().stream().?
// Here I can iterate on each List,
// but my best try would only give me a flat map for each key,
// that I wouldn't know how to flatten.
}).collect(Collectors.toMap(/*A String value*/,/*An Integer key*/))
I also gave a try to flatMap, but I don't think that it is the right way to go, because although it helps me get rid of the dimensionality issue, I lose the key in the process.
In a nutshell, my two questions are :
Is it possible to use streams and lambda to achieve this?
Is it useful (performance, readability) to do so?
You need to use flatMap to flatten the values into a new stream, but since you still need the original keys for collecting into a Map, you have to map to a temporary object holding key and value, e.g.
Map<String, Integer> mapTo = mapFrom.entrySet().stream()
.flatMap(e->e.getValue().stream()
.map(v->new AbstractMap.SimpleImmutableEntry<>(e.getKey(), v)))
.collect(Collectors.toMap(Map.Entry::getValue, Map.Entry::getKey));
The Map.Entry is a stand-in for the nonexistent tuple type, any other type capable of holding two objects of different type is sufficient.
An alternative not requiring these temporary objects, is a custom collector:
Map<String, Integer> mapTo = mapFrom.entrySet().stream().collect(
HashMap::new, (m,e)->e.getValue().forEach(v->m.put(v, e.getKey())), Map::putAll);
This differs from toMap in overwriting duplicate keys silently, whereas toMap without a merger function will throw an exception, if there is a duplicate key. Basically, this custom collector is a parallel capable variant of
Map<String, Integer> mapTo = new HashMap<>();
mapFrom.forEach((k, l) -> l.forEach(v -> mapTo.put(v, k)));
But note that this task wouldn’t benefit from parallel processing, even with a very large input map. Only if there were additional computationally intensive tasks within the stream pipeline that could benefit from SMP would there be a chance of getting a benefit from parallel streams. So perhaps the concise, sequential Collection API solution is preferable.
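The difference in duplicate handling mentioned above can be checked with a small sketch. The input here deliberately violates the question's uniqueness assumption ("b" appears under both keys), and the class and method names are made up for the example:

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class FlattenDuplicatesSketch {
    // flatMap + toMap without a merge function: throws IllegalStateException on a duplicate key.
    static Map<String, Integer> viaToMap(Map<Integer, List<String>> mapFrom) {
        return mapFrom.entrySet().stream()
                .flatMap(e -> e.getValue().stream()
                        .map(v -> new AbstractMap.SimpleImmutableEntry<>(v, e.getKey())))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    // Three-argument collect: a later put silently overwrites an earlier one.
    static Map<String, Integer> viaCollect(Map<Integer, List<String>> mapFrom) {
        return mapFrom.entrySet().stream().collect(
                HashMap::new, (m, e) -> e.getValue().forEach(v -> m.put(v, e.getKey())), Map::putAll);
    }

    // A TreeMap keeps the iteration order predictable for the demonstration.
    static Map<Integer, List<String>> inputWithDuplicate() {
        Map<Integer, List<String>> mapFrom = new TreeMap<>();
        mapFrom.put(1, Arrays.asList("a", "b"));
        mapFrom.put(2, Arrays.asList("b", "c"));
        return mapFrom;
    }

    public static void main(String[] args) {
        System.out.println(viaCollect(inputWithDuplicate()).get("b")); // prints "2"
        try {
            viaToMap(inputWithDuplicate());
        } catch (IllegalStateException ex) {
            System.out.println("toMap rejected the duplicate key");
        }
    }
}
```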
You should use flatMap as follows:
mapFrom.entrySet().stream()
    .flatMap(e -> e.getValue().stream()
        .map(s -> new SimpleImmutableEntry<>(e.getKey(), s)));
SimpleImmutableEntry is a nested class in AbstractMap.
Hope this would do it in simplest way. :))
mapFrom.forEach((key, values) -> values.forEach(value -> mapTo.put(value, key)));
This should work. Please notice that you lost some keys from List.
Map<Integer, List<String>> mapFrom = new HashMap<>();
Map<String, Integer> mapTo = mapFrom.entrySet().stream()
.flatMap(integerListEntry -> integerListEntry.getValue()
.stream()
.map(listItem -> new AbstractMap.SimpleEntry<>(listItem, integerListEntry.getKey())))
.collect(Collectors.toMap(AbstractMap.SimpleEntry::getKey, AbstractMap.SimpleEntry::getValue));
Same as the previous answers with Java 9:
Map<String, Integer> mapTo = mapFrom.entrySet()
.stream()
.flatMap(entry -> entry.getValue()
.stream()
.map(s -> Map.entry(s, entry.getKey())))
.collect(toMap(Entry::getKey, Entry::getValue));
I have Map<Integer,Doctor> docLib=new HashMap<>(); to store Doctor objects.
Class Doctor has the methods getSpecialization(), which returns a String, and
getPatients(), which returns a collection of Person.
In the main class, I wrote:
public Map<String,Set<Person>> getPatientsPerSpecialization(){
Map<String,Set<Person>> res=this.docLib.entrySet().stream().
map(d->d.getValue()).
collect(groupingBy(d->d.getSpecialization(),
d.getPatients()) //error
);
return res;
}
As you can see, I have a problem with groupingBy: I tried to pass the same value d to the second argument, but that doesn't compile.
How can I solve this?
You need a second Collector for that mapping :
public Map<String,Set<Person>> getPatientsPerSpecialization(){
return this.docLib
.values()
.stream()
.collect(Collectors.groupingBy(Doctor::getSpecialization,
Collectors.mapping(Doctor::getPatients, Collectors.toSet()))
);
}
EDIT:
I think my original answer may be wrong (it's hard to say without being able to test it). Since Doctor::getPatients returns a Collection, I think my code may return a Map<String,Set<Collection<Person>>> instead of the desired Map<String,Set<Person>>.
The easiest way to overcome that is to iterate over that Map again to produce the desired Map :
public Map<String,Set<Person>> getPatientsPerSpecialization(){
return this.docLib
.values()
.stream()
.collect(Collectors.groupingBy(Doctor::getSpecialization,
Collectors.mapping(Doctor::getPatients, Collectors.toSet()))
)
.entrySet()
.stream()
.collect (Collectors.toMap (e -> e.getKey(),
e -> e.getValue().stream().flatMap(c -> c.stream()).collect(Collectors.toSet()))
);
}
Perhaps there's a way to get the same result with a single Stream pipeline, but I can't see it right now.
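On Java 9 and later, Collectors.flatMapping makes a single pipeline possible. The sketch below assumes Java 9+ and uses a made-up Doctor class whose patients are plain Strings standing in for Person, since the real classes are not shown in the question:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class PatientsPerSpecializationSketch {
    // Hypothetical stand-in for the question's Doctor; patients are Strings instead of Person.
    static final class Doctor {
        private final String specialization;
        private final List<String> patients;

        Doctor(String specialization, List<String> patients) {
            this.specialization = specialization;
            this.patients = patients;
        }

        String getSpecialization() { return specialization; }
        Collection<String> getPatients() { return patients; }
    }

    // flatMapping (Java 9+) flattens each doctor's patients before they reach toSet().
    static Map<String, Set<String>> patientsPerSpecialization(Collection<Doctor> doctors) {
        return doctors.stream().collect(Collectors.groupingBy(
                Doctor::getSpecialization,
                Collectors.flatMapping(d -> d.getPatients().stream(), Collectors.toSet())));
    }

    // Invented sample data: two cardiologists share the patient "Bob".
    static List<Doctor> sampleDoctors() {
        return Arrays.asList(
                new Doctor("cardiology", Arrays.asList("Ann", "Bob")),
                new Doctor("cardiology", Arrays.asList("Bob", "Cy")),
                new Doctor("dermatology", Arrays.asList("Dee")));
    }

    public static void main(String[] args) {
        System.out.println(patientsPerSpecialization(sampleDoctors()).get("cardiology").size()); // prints "3"
    }
}
```

On Java 8, where flatMapping is not available, the two-step approach above or the toMap variant with a merge function remain the options.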
Instead of groupingBy, you could use toMap:
public Map<String, Set<Person>> getPatientsPerSpecialization() {
return docLib.values()
.stream()
.collect(toMap(Doctor::getSpecialization,
d -> new HashSet<>(d.getPatients()),
(p1, p2) -> Stream.concat(p1.stream(), p2.stream()).collect(toSet())));
}
What it does is group the doctors per specialization and map each one to a set of the patients it has (so a Map<String, Set<Person>>).
If, when collecting the data from the pipeline, you encounter a doctor with a specialization that is already stored as a key in the map, you use the merge function to produce a new set of values with both sets (the set that is already stored as a value for the key, and the set that you want to associate with the key).
I was presented with an interesting problem by a colleague of mine and I was unable to find a neat and pretty Java 8 solution. The problem is to stream through a list of POJOs and then collect them in a map based on multiple properties, so that each POJO occurs in the map multiple times.
Imagine the following POJO:
private static class Customer {
public String first;
public String last;
public Customer(String first, String last) {
this.first = first;
this.last = last;
}
public String toString() {
return "Customer(" + first + " " + last + ")";
}
}
Set it up as a List<Customer>:
// The list of customers
List<Customer> customers = Arrays.asList(
new Customer("Johnny", "Puma"),
new Customer("Super", "Mac"));
Alternative 1: Use a Map outside of the "stream" (or rather outside forEach).
// Alt 1: not pretty since the resulting map is "outside" of
// the stream. If parallel streams are used it must be
// ConcurrentHashMap
Map<String, Customer> res1 = new HashMap<>();
customers.stream().forEach(c -> {
res1.put(c.first, c);
res1.put(c.last, c);
});
Alternative 2: Create map entries and stream them, then flatMap them. IMO it is a bit too verbose and not so easy to read.
// Alt 2: A bit verbose and "new AbstractMap.SimpleEntry" feels as
// a "hard" dependency to AbstractMap
Map<String, Customer> res2 =
customers.stream()
.map(p -> {
Map.Entry<String, Customer> firstEntry = new AbstractMap.SimpleEntry<>(p.first, p);
Map.Entry<String, Customer> lastEntry = new AbstractMap.SimpleEntry<>(p.last, p);
return Stream.of(firstEntry, lastEntry);
})
.flatMap(Function.identity())
.collect(Collectors.toMap(
Map.Entry::getKey, Map.Entry::getValue));
Alternative 3: This is the "prettiest" code I have come up with so far, but it uses the three-arg version of reduce, and the third parameter is a bit dodgy, as discussed in this question: Purpose of third argument to 'reduce' function in Java 8 functional programming. Furthermore, reduce does not seem like a good fit for this problem since it mutates the map, and parallel streams may not work with the approach below.
// Alt 3: using reduce. Not so pretty
Map<String, Customer> res3 = customers.stream().reduce(
new HashMap<>(),
(m, p) -> {
m.put(p.first, p);
m.put(p.last, p);
return m;
}, (m1, m2) -> m2 /* <- NOT USED UNLESS PARALLEL */);
If the results are printed like this:
System.out.println(res1);
System.out.println(res2);
System.out.println(res3);
The result would be:
{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}
{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}
{Super=Customer(Super Mac), Johnny=Customer(Johnny Puma), Mac=Customer(Super Mac), Puma=Customer(Johnny Puma)}
So, now to my question: How should I, in a Java 8 orderly fashion, stream through the List<Customer> and then somehow collect it as a Map<String, Customer> where you split the whole thing as two keys (first AND last) i.e. the Customer is mapped twice. I do not want to use any 3rd party libraries, I do not want to use a map outside of the stream as in alt 1. Are there any other nice alternatives?
The full code can be found on hastebin for simple copy-paste to get the whole thing running.
I think your alternatives 2 and 3 can be re-written to be more clear:
Alternative 2:
Map<String, Customer> res2 = customers.stream()
.flatMap(
c -> Stream.of(c.first, c.last)
.map(k -> new AbstractMap.SimpleImmutableEntry<>(k, c))
).collect(toMap(Map.Entry::getKey, Map.Entry::getValue));
Alternative 3: Your code abuses reduce by mutating the HashMap. To do mutable reduction, use collect:
Map<String, Customer> res3 = customers.stream()
.collect(
HashMap::new,
(m,c) -> {m.put(c.first, c); m.put(c.last, c);},
HashMap::putAll
);
Note that these are not identical. Alternative 2 will throw an exception if there are duplicate keys while Alternative 3 will silently overwrite the entries.
If overwriting entries in case of duplicate keys is what you want, I would personally prefer Alternative 3. It is immediately clear to me what it does. It most closely resembles the iterative solution. I would expect it to be more performant as Alternative 2 has to do a bunch of allocations per customer with all that flatmapping.
However, Alternative 2 has a huge advantage over Alternative 3 by separating the production of entries from their aggregation. This gives you a great deal of flexibility. For example, if you want to change Alternative 2 to overwrite entries on duplicate keys instead of throwing an exception, you would simply add (a,b) -> b to toMap(...). If you decide you want to collect matching entries into a list, all you would have to do is replace toMap(...) with groupingBy(...), etc.
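As a rough illustration of that flexibility, the sketch below keeps the entry-producing flatMap step unchanged and swaps only the collector. The third customer is invented so that the keys "Mac" and "Puma" collide, and the class and method names are made up for the example:

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CustomerKeysSketch {
    static final class Customer {
        final String first;
        final String last;

        Customer(String first, String last) {
            this.first = first;
            this.last = last;
        }
    }

    // toMap with a merge function: on a duplicate key the later customer replaces the earlier one.
    static Map<String, Customer> lastWins(List<Customer> customers) {
        return customers.stream()
                .flatMap(c -> Stream.of(c.first, c.last)
                        .map(k -> new AbstractMap.SimpleImmutableEntry<>(k, c)))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (a, b) -> b));
    }

    // groupingBy instead of toMap: every customer matching a key is kept in a list.
    static Map<String, List<Customer>> allMatches(List<Customer> customers) {
        return customers.stream()
                .flatMap(c -> Stream.of(c.first, c.last)
                        .map(k -> new AbstractMap.SimpleImmutableEntry<>(k, c)))
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    // The question's two customers plus an invented third one that causes key collisions.
    static List<Customer> sampleCustomers() {
        return Arrays.asList(
                new Customer("Johnny", "Puma"),
                new Customer("Super", "Mac"),
                new Customer("Mac", "Puma"));
    }

    public static void main(String[] args) {
        System.out.println(lastWins(sampleCustomers()).get("Puma").first);  // prints "Mac"
        System.out.println(allMatches(sampleCustomers()).get("Puma").size()); // prints "2"
    }
}
```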