Multiple aggregate functions in Java 8 Stream API - java

I have a class defined like
public class TimePeriodCalc {
private double occupancy;
private double efficiency;
private String atDate;
}
I would like to perform the following SQL statement using Java 8 Stream API.
SELECT atDate, AVG(occupancy), AVG(efficiency)
FROM TimePeriodCalc
GROUP BY atDate
I tried :
Collection<TimePeriodCalc> collector = result.stream().collect(groupingBy(p -> p.getAtDate(), ....
What can be put into the code to select multiple attributes ? I'm thinking of using multiple Collectors but really don't know how to do so.

To do it without a custom Collector (and without streaming the resulting map again), you could do it like this. It's a bit dirty, since it first collects to a Map<String, List<TimePeriodCalc>> and then streams each list twice to get the two averages.
Since you need two averages, they are collected into a holder or pair; in this case I'm using AbstractMap.SimpleEntry:
Map<String, SimpleEntry<Double, Double>> map = Stream.of(new TimePeriodCalc(12d, 10d, "A"), new TimePeriodCalc(2d, 16d, "A"))
.collect(Collectors.groupingBy(TimePeriodCalc::getAtDate,
Collectors.collectingAndThen(Collectors.toList(), list -> {
double occupancy = list.stream().collect(
Collectors.averagingDouble(TimePeriodCalc::getOccupancy));
double efficiency = list.stream().collect(
Collectors.averagingDouble(TimePeriodCalc::getEfficiency));
return new AbstractMap.SimpleEntry<>(occupancy, efficiency);
})));
System.out.println(map);

Here's a way with a custom collector. It only needs one pass, but it's not very easy, especially because of generics...
If you have this method:
@SuppressWarnings("unchecked")
@SafeVarargs
static <T, A, C extends Collector<T, A, Double>> Collector<T, ?, List<Double>>
averagingManyDoubles(ToDoubleFunction<? super T>... extractors) {
List<C> collectors = Arrays.stream(extractors)
.map(extractor -> (C) Collectors.averagingDouble(extractor))
.collect(Collectors.toList());
class Acc {
List<A> averages = collectors.stream()
.map(c -> c.supplier().get())
.collect(Collectors.toList());
void add(T elem) {
IntStream.range(0, extractors.length).forEach(i ->
collectors.get(i).accumulator().accept(averages.get(i), elem));
}
Acc merge(Acc another) {
IntStream.range(0, extractors.length).forEach(i ->
averages.set(i, collectors.get(i).combiner()
.apply(averages.get(i), another.averages.get(i))));
return this;
}
List<Double> finish() {
return IntStream.range(0, extractors.length)
.mapToObj(i -> collectors.get(i).finisher().apply(averages.get(i)))
.collect(Collectors.toList());
}
}
return Collector.of(Acc::new, Acc::add, Acc::merge, Acc::finish);
}
This receives an array of functions that will extract double values from each element of the stream. These extractors are converted to Collectors.averagingDouble collectors and then the local Acc class is created with the mutable structures that are used to accumulate the averages for each collector. Then, the accumulator function forwards to each accumulator, and so with the combiner and finisher functions.
Usage is as follows:
Map<String, List<Double>> averages = list.stream()
.collect(Collectors.groupingBy(
TimePeriodCalc::getAtDate,
averagingManyDoubles(
TimePeriodCalc::getOccupancy,
TimePeriodCalc::getEfficiency)));

Assuming that your TimePeriodCalc class has all the necessary getters, this should get you the list you want:
List<TimePeriodCalc> result = new ArrayList<>(
list.stream()
.collect(Collectors.groupingBy(TimePeriodCalc::getAtDate,
Collectors.collectingAndThen(Collectors.toList(), TimePeriodCalc::avgTimePeriodCalc)))
.values()
);
Where TimePeriodCalc.avgTimePeriodCalc is this method in the TimePeriodCalc class:
public static TimePeriodCalc avgTimePeriodCalc(List<TimePeriodCalc> list){
return new TimePeriodCalc(
list.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getOccupancy)),
list.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getEfficiency)),
list.get(0).getAtDate()
);
}
The above can be combined into this monstrosity:
List<TimePeriodCalc> result = new ArrayList<>(
list.stream()
.collect(Collectors.groupingBy(TimePeriodCalc::getAtDate,
Collectors.collectingAndThen(
Collectors.toList(), a -> {
return new TimePeriodCalc(
a.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getOccupancy)),
a.stream().collect(Collectors.averagingDouble(TimePeriodCalc::getEfficiency)),
a.get(0).getAtDate()
);
}
)))
.values());
With input:
List<TimePeriodCalc> list = new ArrayList<>();
list.add(new TimePeriodCalc(10,10,"a"));
list.add(new TimePeriodCalc(10,10,"b"));
list.add(new TimePeriodCalc(10,10,"c"));
list.add(new TimePeriodCalc(5,5,"a"));
list.add(new TimePeriodCalc(0,0,"b"));
This would give:
TimePeriodCalc [occupancy=7.5, efficiency=7.5, atDate=a]
TimePeriodCalc [occupancy=5.0, efficiency=5.0, atDate=b]
TimePeriodCalc [occupancy=10.0, efficiency=10.0, atDate=c]

You can group and average a single attribute like this (the result is a Map from atDate to the average, not a Collection<TimePeriodCalc>):
Map<String, Double> averages = result.stream().collect(Collectors.groupingBy(p -> p.getAtDate(), Collectors.averagingDouble(p -> p.getOccupancy())));
If you want more, you get the idea.
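For both averages in a single pass on Java 8, one option is a small hand-written collector used as the downstream of groupingBy. The following is only a sketch under assumptions: GroupAveragesDemo, averagingBoth, and averagesByDate are hypothetical names, TimePeriodCalc is a minimal stand-in for the class in the question, and the two averages are returned as a double[] rather than as a new TimePeriodCalc.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of any answer above.
class GroupAveragesDemo {

    // Minimal stand-in for the question's class (constructor and getters assumed).
    static final class TimePeriodCalc {
        private final double occupancy;
        private final double efficiency;
        private final String atDate;

        TimePeriodCalc(double occupancy, double efficiency, String atDate) {
            this.occupancy = occupancy;
            this.efficiency = efficiency;
            this.atDate = atDate;
        }

        double getOccupancy() { return occupancy; }
        double getEfficiency() { return efficiency; }
        String getAtDate() { return atDate; }
    }

    // Accumulator layout: [0] = sum of occupancy, [1] = sum of efficiency, [2] = count.
    static Collector<TimePeriodCalc, double[], double[]> averagingBoth() {
        return Collector.of(
                () -> new double[3],
                (acc, t) -> {
                    acc[0] += t.getOccupancy();
                    acc[1] += t.getEfficiency();
                    acc[2]++;
                },
                (a, b) -> {
                    a[0] += b[0];
                    a[1] += b[1];
                    a[2] += b[2];
                    return a;
                },
                // groupingBy creates a group only when it sees an element, so the count is never 0 here
                acc -> new double[] {acc[0] / acc[2], acc[1] / acc[2]});
    }

    // TreeMap is used only to get a predictable key order when printing.
    static Map<String, double[]> averagesByDate(List<TimePeriodCalc> list) {
        return list.stream().collect(
                Collectors.groupingBy(TimePeriodCalc::getAtDate, TreeMap::new, averagingBoth()));
    }

    public static void main(String[] args) {
        List<TimePeriodCalc> list = Arrays.asList(
                new TimePeriodCalc(10, 10, "a"),
                new TimePeriodCalc(10, 10, "b"),
                new TimePeriodCalc(10, 10, "c"),
                new TimePeriodCalc(5, 5, "a"),
                new TimePeriodCalc(0, 0, "b"));
        averagesByDate(list).forEach((date, avg) ->
                System.out.println(date + ": occupancy=" + avg[0] + ", efficiency=" + avg[1]));
    }
}
```

The plain running sum used here is simpler than what Collectors.averagingDouble does internally, so results may differ slightly for inputs where floating-point error compensation matters.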

Related

Have Java Streams GroupingBy result Map include a key for each value of an enum, even if value is an empty List

This question is about Java Streams' groupingBy capability.
Suppose I have a class, WorldCup:
public class WorldCup {
int year;
Country champion;
// all-arg constructor, getter/setters, etc
}
and an enum, Country:
public enum Country {
Brazil, France, USA
}
and the following code snippet:
WorldCup wc94 = new WorldCup(1994, Country.Brazil);
WorldCup wc98 = new WorldCup(1998, Country.France);
List<WorldCup> wcList = new ArrayList<WorldCup>();
wcList.add(wc94);
wcList.add(wc98);
Map<Country, List<Integer>> championsMap = wcList.stream()
.collect(Collectors.groupingBy(WorldCup::getCountry, Collectors.mapping(WorldCup::getYear, Collectors.toList())));
After running this code, championsMap will contain:
Brazil: [1994]
France: [1998]
Is there a succinct way to have this list include an entry for all of the values of the enum? What I'm looking for is:
Brazil: [1994]
France: [1998]
USA: []
There are several approaches you can take.
The map used for accumulating the stream data can be prepopulated with an entry for every enum member. To access all enum members you can use the values() method or EnumSet.allOf().
This can be achieved using the three-argument version of collect() or through a custom collector created via Collector.of().
Map<Country, List<Integer>> championsMap = wcList.stream()
.collect(
() -> EnumSet.allOf(Country.class).stream() // supplier
.collect(Collectors.toMap(
Function.identity(),
c -> new ArrayList<>()
)),
(Map<Country, List<Integer>> map, WorldCup next) -> // accumulator
map.get(next.getCountry()).add(next.getYear()),
(left, right) -> // combiner
right.forEach((k, v) -> left.get(k).addAll(v))
);
Another option is to add missing entries to the map after reduction of the stream has been finished.
For that we can use built-in collector collectingAndThen().
Map<Country, List<Integer>> championsMap = wcList.stream()
.collect(Collectors.collectingAndThen(
Collectors.groupingBy(WorldCup::getCountry,
Collectors.mapping(WorldCup::getYear,
Collectors.toList())),
map -> {
EnumSet.allOf(Country.class)
.forEach(country -> map.computeIfAbsent(country, k -> new ArrayList<>())); // if you're not going to mutate these lists - use Collections.emptyList()
return map;
}
));
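The prepopulation idea can also be written without a stream at all. This is a minimal sketch, assuming a getCountry() getter as used in the snippets above; ChampionsByCountryDemo and yearsByCountry are hypothetical names, and EnumMap is chosen here only because it keeps keys in enum declaration order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; WorldCup is a minimal stand-in for the question's class.
class ChampionsByCountryDemo {

    enum Country { Brazil, France, USA }

    static final class WorldCup {
        private final int year;
        private final Country country;

        WorldCup(int year, Country country) {
            this.year = year;
            this.country = country;
        }

        int getYear() { return year; }
        Country getCountry() { return country; }
    }

    static Map<Country, List<Integer>> yearsByCountry(List<WorldCup> cups) {
        Map<Country, List<Integer>> result = new EnumMap<>(Country.class);
        // Prepopulate: every enum constant gets an (initially empty) list.
        for (Country c : Country.values()) {
            result.put(c, new ArrayList<>());
        }
        // Accumulate: the lookup cannot return null because all keys already exist.
        for (WorldCup wc : cups) {
            result.get(wc.getCountry()).add(wc.getYear());
        }
        return result;
    }

    public static void main(String[] args) {
        List<WorldCup> wcList = Arrays.asList(
                new WorldCup(1994, Country.Brazil),
                new WorldCup(1998, Country.France));
        System.out.println(yearsByCountry(wcList)); // {Brazil=[1994], France=[1998], USA=[]}
    }
}
```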

GroupingBy with List as a result

We have the following:
public List<Balance> mapToBalancesWithSumAmounts(List<MonthlyBalancedBooking> entries) {
return entries
.stream()
.collect(
groupingBy(
MonthlyBalancedBooking::getValidFor,
summingDouble(MonthlyBalancedBooking::getAmount)
)
)
.entrySet()
.stream()
.map(localDateDoubleEntry -> new Balance(localDateDoubleEntry.getValue(), localDateDoubleEntry.getKey()))
.collect(toList());
}
Is it possible to avoid the second stream() pass in the code, so that the result of groupingBy() is already a list in our case? We need a way to pass the map() function to collect or groupingBy. Is that possible in Java 8?
That isn't possible, because the value you need to build each Balance object can only be computed once all entries of the MonthlyBalancedBooking list have been iterated:
new Balance(localDateDoubleEntry.getValue(), localDateDoubleEntry.getKey())
An alternative is to move the second stream into the finisher of collectingAndThen:
public List<Balance> mapToBalancesWithSumAmounts(List<MonthlyBalancedBooking> entries) {
return entries.stream()
.collect(Collectors.collectingAndThen(
Collectors.groupingBy(MonthlyBalancedBooking::getValidFor,
Collectors.summingDouble(MonthlyBalancedBooking::getAmount)),
map -> map.entrySet().stream()
.map(entry -> new Balance(entry.getValue(), entry.getKey()))
.collect(Collectors.toList())));
}
The simple way is just using toMap() collector with merge function like this:
List<Balance> balances = new ArrayList<>(entries.stream()
.collect(toMap(MonthlyBalancedBooking::getValidFor, m -> new Balance(m.getAmount(),
m.getValidFor()),Balance::merge)).values());
I assumed the Balance class has these properties:
class Balance {
private Double value;
private Integer key;
public Balance merge(Balance b) {
this.value += b.getValue();
return this;
}
}
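For comparison, here is a self-contained sketch of the collectingAndThen variant in which the finisher builds the List<Balance> itself, so the method contains a single collect call. The class shapes are assumptions: validFor is taken to be a LocalDate and Balance is given an (amount, validFor) constructor; BalanceSumDemo is a hypothetical wrapper, and TreeMap is used only so the output order is predictable.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical demo class; the nested types are assumed shapes, not the asker's real classes.
class BalanceSumDemo {

    static final class MonthlyBalancedBooking {
        private final LocalDate validFor;
        private final double amount;

        MonthlyBalancedBooking(LocalDate validFor, double amount) {
            this.validFor = validFor;
            this.amount = amount;
        }

        LocalDate getValidFor() { return validFor; }
        double getAmount() { return amount; }
    }

    static final class Balance {
        private final double amount;
        private final LocalDate validFor;

        Balance(double amount, LocalDate validFor) {
            this.amount = amount;
            this.validFor = validFor;
        }

        double getAmount() { return amount; }
        LocalDate getValidFor() { return validFor; }
    }

    static List<Balance> mapToBalancesWithSumAmounts(List<MonthlyBalancedBooking> entries) {
        return entries.stream().collect(Collectors.collectingAndThen(
                // Step 1: sum the amounts per date (sorted by date via TreeMap).
                Collectors.groupingBy(
                        MonthlyBalancedBooking::getValidFor,
                        TreeMap::new,
                        Collectors.summingDouble(MonthlyBalancedBooking::getAmount)),
                // Step 2: the finisher runs once, after all entries have been grouped.
                (Map<LocalDate, Double> sums) -> {
                    List<Balance> balances = new ArrayList<>(sums.size());
                    sums.forEach((validFor, total) -> balances.add(new Balance(total, validFor)));
                    return balances;
                }));
    }
}
```

The intermediate map is still built; the finisher only hides the second iteration inside the collector instead of exposing it in the calling code.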

How to create a map<k,v> between first and last nested object using Java streams?

Assuming I have the following classes:
class A {
int id;
List<B> b;
}
class B {
int id;
}
I want to create a map between A.id to the list of B.id (Map<Integer, List<Integer>> , where key = A.id, and List<Integer> corresponds to the list of B.id fields for each A ). I tried various combinations of Collectors.groupingBy and Collectors.mapping, but to no effect. Can somebody help me out ?
You may use the toMap collector with a merge function to solve this problem. Here's how it looks.
Map<Integer, List<Integer>> resultMap = aList.stream()
.collect(Collectors.toMap(A::getId,
a -> a.getB().stream().map(B::getId)
.collect(Collectors.toList()), (l1, l2) -> {
l1.addAll(l2);
return l1;
}));
However, if a given set of A objects have distinct id values, then you can merely dispense with the merge function, which is the third argument to the toMap collector. Here's the simplified version.
Map<Integer, List<Integer>> resultMap = aList.stream()
.collect(Collectors.toMap(A::getId,
a -> a.getB().stream().map(B::getId)
.collect(Collectors.toList())));
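To see when the merge function matters, the sketch below runs both variants on input that contains a duplicate A id. ToMapMergeDemo and its method names are hypothetical, and A and B are minimal stand-ins with assumed constructors and getters. Collectors.toCollection(ArrayList::new) is used in the merging variant because Collectors.toList() makes no guarantee that the returned list is mutable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; A and B mirror the question's classes with assumed getters.
class ToMapMergeDemo {

    static final class B {
        private final int id;
        B(int id) { this.id = id; }
        int getId() { return id; }
    }

    static final class A {
        private final int id;
        private final List<B> b;
        A(int id, List<B> b) { this.id = id; this.b = b; }
        int getId() { return id; }
        List<B> getB() { return b; }
    }

    // With a merge function: lists of A objects sharing an id are concatenated.
    static Map<Integer, List<Integer>> bIdsByAId(List<A> aList) {
        return aList.stream().collect(Collectors.toMap(
                A::getId,
                a -> a.getB().stream().map(B::getId)
                        .collect(Collectors.toCollection(ArrayList::new)),
                (l1, l2) -> {
                    l1.addAll(l2);
                    return l1;
                }));
    }

    // Without a merge function: toMap throws IllegalStateException on a duplicate key.
    static Map<Integer, List<Integer>> bIdsByDistinctAId(List<A> aList) {
        return aList.stream().collect(Collectors.toMap(
                A::getId,
                a -> a.getB().stream().map(B::getId).collect(Collectors.toList())));
    }
}
```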
List<A> aList;
//initialize aList here...
Map<Integer, List<Integer>> aMap = aList.stream().collect(
Collectors.toMap(element -> element.id,
element -> element.b
.stream()
.map(b -> b.id)
.collect(Collectors.toList())
)
);
You can do it in Java 8 using Collectors.mapping to map each A to its List<B> while grouping, and then Collectors.collectingAndThen for the final transformation of the value from List<List<B>> to List<Integer>:
Map<Integer, List<Integer>> result = list.stream()
.collect(Collectors.groupingBy(A::getId,
Collectors.mapping(A::getB, Collectors.collectingAndThen(Collectors.toList(),
l -> l.stream().flatMap(List::stream).map(B::getId).collect(Collectors.toList())))));
Java 9 added flatMapping collector, which is perfect for this:
// using static imports of various Collectors
listOfA.stream().collect(
groupingBy(
A::getId,
flatMapping(
a -> a.getB().stream().map(B::getId),
toList()
)
)
);

Adding two lists of own type

I have a simple User class with a String and an int property.
I would like to add two Lists of users this way:
if the String equals then the numbers should be added and that would be its new value.
The new list should include all users with proper values.
Like this:
List1: { [a:2], [b:3] }
List2: { [b:4], [c:5] }
ResultList: {[a:2], [b:7], [c:5]}
User definition:
public class User {
private String name;
private int comments;
}
My method:
public List<User> addTwoList(List<User> first, List<User> sec) {
List<User> result = new ArrayList<>();
for (int i=0; i<first.size(); i++) {
Boolean bsin = false;
Boolean isin = false;
for (int j=0; j<sec.size(); j++) {
isin = false;
if (first.get(i).getName().equals(sec.get(j).getName())) {
int value= first.get(i).getComments() + sec.get(j).getComments();
result.add(new User(first.get(i).getName(), value));
isin = true;
bsin = true;
}
if (!isin) {result.add(sec.get(j));}
}
if (!bsin) {result.add(first.get(i));}
}
return result;
}
But it adds a whole lot of things to the list.
This is better done via the toMap collector:
Collection<User> result = Stream
.concat(first.stream(), second.stream())
.collect(Collectors.toMap(
User::getName,
u -> new User(u.getName(), u.getComments()),
(l, r) -> {
l.setComments(l.getComments() + r.getComments());
return l;
}))
.values();
First, concatenate both the lists into a single Stream<User> via Stream.concat.
Second, we use the toMap collector to merge users that happen to have the same Name and get back a result of Collection<User>.
If you strictly want a List<User>, pass the result into the ArrayList constructor, i.e. List<User> resultSet = new ArrayList<>(result);
Kudos to @davidxxx: you could collect to a list directly from the pipeline and avoid creating an intermediate variable with:
List<User> result = Stream
.concat(first.stream(), second.stream())
.collect(Collectors.toMap(
User::getName,
u -> new User(u.getName(), u.getComments()),
(l, r) -> {
l.setComments(l.getComments() + r.getComments());
return l;
}))
.values()
.stream()
.collect(Collectors.toList());
You have to use an intermediate map to merge users from both lists by summing their comment counts.
One way is with streams, as shown in Aomine's answer. Here's another way, without streams:
Map<String, Integer> map = new LinkedHashMap<>();
list1.forEach(u -> map.merge(u.getName(), u.getComments(), Integer::sum));
list2.forEach(u -> map.merge(u.getName(), u.getComments(), Integer::sum));
Now, you can create a list of users, as follows:
List<User> result = new ArrayList<>();
map.forEach((name, comments) -> result.add(new User(name, comments)));
This assumes User has a constructor that accepts name and comments.
EDIT: As suggested by @davidxxx, we could improve the code by factoring out the first part:
BiConsumer<List<User>, Map<String, Integer>> action = (list, map) ->
list.forEach(u -> map.merge(u.getName(), u.getComments(), Integer::sum));
Map<String, Integer> map = new LinkedHashMap<>();
action.accept(list1, map);
action.accept(list2, map);
This refactoring avoids repeating the merge logic, keeping the code DRY.
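Put together, the map-based approach can be sketched as a runnable example using the sample lists from the question. MergeUsersDemo and addTwoLists are hypothetical names, and User is a minimal immutable stand-in with an assumed (name, comments) constructor and a toString added only for printing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; User is a minimal stand-in for the question's class.
class MergeUsersDemo {

    static final class User {
        private final String name;
        private final int comments;

        User(String name, int comments) {
            this.name = name;
            this.comments = comments;
        }

        String getName() { return name; }
        int getComments() { return comments; }

        @Override
        public String toString() { return "[" + name + ":" + comments + "]"; }
    }

    static List<User> addTwoLists(List<User> list1, List<User> list2) {
        // LinkedHashMap keeps names in first-seen order.
        Map<String, Integer> totals = new LinkedHashMap<>();
        list1.forEach(u -> totals.merge(u.getName(), u.getComments(), Integer::sum));
        list2.forEach(u -> totals.merge(u.getName(), u.getComments(), Integer::sum));

        // Build fresh User objects, so neither input list is modified.
        List<User> result = new ArrayList<>();
        totals.forEach((name, comments) -> result.add(new User(name, comments)));
        return result;
    }

    public static void main(String[] args) {
        List<User> first = Arrays.asList(new User("a", 2), new User("b", 3));
        List<User> second = Arrays.asList(new User("b", 4), new User("c", 5));
        System.out.println(addTwoLists(first, second)); // [[a:2], [b:7], [c:5]]
    }
}
```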
There is a pretty direct way using Collectors.groupingBy and Collectors.reducing which doesn't require setters. That is its biggest advantage, since you can keep User immutable:
Collection<Optional<User>> d = Stream
.of(first, second) // start with Stream<List<User>>
.flatMap(List::stream) // flatting to the Stream<User>
.collect(Collectors.groupingBy( // Collecting to Map<String, Optional<User>>
User::getName, // by name (the key)
// and reducing each group into a single User
Collectors.reducing((l, r) -> new User(l.getName(), l.getComments() + r.getComments()))))
.values(); // return values from Map<String, Optional<User>>
Unfortunately, the result is a Collection<Optional<User>>, since the one-argument reducing collector returns an Optional because the result might not be present. You can stream the values and use map() to get rid of the Optional, or use Collectors.collectingAndThen*:
Collection<User> d = Stream
.of(first, second) // start with Stream<List<User>>
.flatMap(List::stream) // flatting to the Stream<User>
.collect(Collectors.groupingBy( // Collecting to Map<String, User>
User::getName, // by name (the key)
Collectors.collectingAndThen( // reduce the list into a single User
Collectors.reducing((l, r) -> new User(l.getName(), l.getComments() + r.getComments())),
Optional::get))) // and extract from the Optional
.values();
* Thanks to @Aomine
As an alternative that is fairly straightforward and efficient:
1. stream the elements
2. collect them into a Map<String, Integer> to associate each name with the sum of its comments (int)
3. stream the entries of the collected map to create the List of User.
Alternatively, for the third step you could apply a finishing transformation to the Map collector with collectingAndThen(groupingBy()..., m -> ...
but I don't always find that very readable, and here we can do without it.
This gives:
List<User> users =
Stream.concat(first.stream(), second.stream())
.collect(groupingBy(User::getName, summingInt(User::getComments)))
.entrySet()
.stream()
.map(e -> new User(e.getKey(), e.getValue()))
.collect(toList());

Change data in an immutable way with Java stream

Consider this code:
Function<BigDecimal,BigDecimal> func1 = x -> x;//This could be anything
Function<BigDecimal,BigDecimal> func2 = y -> y;//This could be anything
Map<Integer,BigDecimal> data = new HashMap<>();
Map<Integer,BigDecimal> newData =
data.entrySet().stream().
collect(Collectors.toMap(Entry::getKey,i ->
func1.apply(i.getValue())));
List<BigDecimal> list =
newData.entrySet().stream().map(i ->
func2.apply(i.getValue())).collect(Collectors.toList());
Basically, I'm updating a HashMap with func1, applying a second transformation with func2, and saving the twice-updated values in a list.
I did all of this in an immutable way, generating the new objects newData and list.
MY QUESTION:
Is it possible to do that while streaming the original HashMap (data) only once?
I tried this:
Function<BigDecimal,BigDecimal> func1 = x -> x;
Function<BigDecimal,BigDecimal> func2 = y -> y;
Map<Integer,BigDecimal> data = new HashMap<>();
List<BigDecimal> list = new ArrayList<>();
Map<Integer,BigDecimal> newData =
data.entrySet().stream().collect(Collectors.toMap(
Entry::getKey,i ->
{
BigDecimal newValue = func1.apply(i.getValue());
//SIDE EFFECT!!!!!!!
list.add(func2.apply(newValue));
return newValue;
}));
but doing so introduces a side effect when updating list, so I lose the 'immutable way' requirement.
This seems like an ideal use case for the upcoming Collectors.teeing method in JDK 12. Here's the webrev and here's the CSR. You can use it as follows:
Map.Entry<Map<Integer, BigDecimal>, List<BigDecimal>> result = data.entrySet().stream()
.collect(Collectors.teeing(
Collectors.toMap(
Map.Entry::getKey,
i -> func1.apply(i.getValue())),
Collectors.mapping(
i -> func1.andThen(func2).apply(i.getValue()),
Collectors.toList()),
Map::entry));
Collectors.teeing collects to two different collectors and then merges both partial results into the final result. For this final step I'm using JDK 9's Map.entry(K k, V v) static method, but I could have used any other container, i.e. Pair or Tuple2, etc.
For the first collector I'm using your exact code to collect to a Map, while for the second collector I'm using Collectors.mapping along with Collectors.toList, using Function.andThen to compose your func1 and func2 functions for the mapping step.
EDIT: If you cannot wait until JDK 12 is released, you could use this code meanwhile:
public static <T, A1, A2, R1, R2, R> Collector<T, ?, R> teeing(
Collector<? super T, A1, R1> downstream1,
Collector<? super T, A2, R2> downstream2,
BiFunction<? super R1, ? super R2, R> merger) {
class Acc {
A1 acc1 = downstream1.supplier().get();
A2 acc2 = downstream2.supplier().get();
void accumulate(T t) {
downstream1.accumulator().accept(acc1, t);
downstream2.accumulator().accept(acc2, t);
}
Acc combine(Acc other) {
acc1 = downstream1.combiner().apply(acc1, other.acc1);
acc2 = downstream2.combiner().apply(acc2, other.acc2);
return this;
}
R applyMerger() {
R1 r1 = downstream1.finisher().apply(acc1);
R2 r2 = downstream2.finisher().apply(acc2);
return merger.apply(r1, r2);
}
}
return Collector.of(Acc::new, Acc::accumulate, Acc::combine, Acc::applyMerger);
}
Note: The characteristics of the downstream collectors are not considered when creating the returned collector (left as an exercise).
EDIT 2: Your solution is absolutely OK, even though it uses two streams. My solution above streams the original map only once, but it applies func1 to all the values twice. If func1 is expensive, you might consider memoizing it (i.e. caching its results, so that whenever it's called again with the same input, you return the result from the cache instead of computing it again). Or you might also first apply func1 to the values of the original map, and then collect with Collectors.teeing.
Memoizing is easy. Just declare this utility method:
public <T, R> Function<T, R> memoize(Function<T, R> f) {
Map<T, R> cache = new HashMap<>(); // or ConcurrentHashMap
return k -> cache.computeIfAbsent(k, f);
}
And then use it as follows:
Function<BigDecimal, BigDecimal> func1 = memoize(x -> x); //This could be anything
Now you can use this memoized func1 and it will work exactly as before, except that it will return results from the cache when its apply method is invoked with an argument that has been previously used.
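A small sketch can make the caching behaviour visible by counting how often the wrapped function actually runs. MemoizeDemo and callsFor are hypothetical names, and the multiply-by-ten function is just a placeholder for func1. Note that the cache key relies on BigDecimal.equals, which takes scale into account, so 2.0 and 2.00 are cached as different inputs.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical demo class built around the memoize helper shown above.
class MemoizeDemo {

    static <T, R> Function<T, R> memoize(Function<T, R> f) {
        Map<T, R> cache = new HashMap<>(); // not thread-safe; see the ConcurrentHashMap remark above
        return k -> cache.computeIfAbsent(k, f);
    }

    // Applies a memoized placeholder function to each input and reports
    // how many times the underlying function was really invoked.
    static int callsFor(BigDecimal... inputs) {
        AtomicInteger calls = new AtomicInteger();
        Function<BigDecimal, BigDecimal> func1 = memoize(x -> {
            calls.incrementAndGet();
            return x.multiply(BigDecimal.TEN);
        });
        for (BigDecimal input : inputs) {
            func1.apply(input);
        }
        return calls.get();
    }

    public static void main(String[] args) {
        // Same value and scale twice: the second call is served from the cache.
        System.out.println(callsFor(new BigDecimal("2"), new BigDecimal("2"), new BigDecimal("3")));
        // Same numeric value but different scale: treated as two distinct keys.
        System.out.println(callsFor(new BigDecimal("2.0"), new BigDecimal("2.00")));
    }
}
```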
The other solution would be to apply func1 first and then collect:
Map.Entry<Map<Integer, BigDecimal>, List<BigDecimal>> result = data.entrySet().stream()
.map(i -> Map.entry(i.getKey(), func1.apply(i.getValue())))
.collect(Collectors.teeing(
Collectors.toMap(
Map.Entry::getKey,
Map.Entry::getValue),
Collectors.mapping(
i -> func2.apply(i.getValue()),
Collectors.toList()),
Map::entry));
Again, I'm using jdk9's Map.entry(K k, V v) static method.
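As a runnable illustration of this last variant, the sketch below wraps it in a method and feeds it small sample data. It assumes JDK 12 or later (for Collectors.teeing); TeeingDemo and transform are hypothetical names, and the sample functions (add 10, then subtract 20) are arbitrary placeholders. A LinkedHashMap is used for the input only so that the order of the resulting list is predictable.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class; requires JDK 12+ because of Collectors.teeing.
class TeeingDemo {

    static Map.Entry<Map<Integer, BigDecimal>, List<BigDecimal>> transform(
            Map<Integer, BigDecimal> data,
            Function<BigDecimal, BigDecimal> func1,
            Function<BigDecimal, BigDecimal> func2) {
        return data.entrySet().stream()
                // func1 is applied exactly once per value, before the stream is "split"
                .map(e -> Map.entry(e.getKey(), func1.apply(e.getValue())))
                .collect(Collectors.teeing(
                        // first downstream: the map of func1 results
                        Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue),
                        // second downstream: func2 applied on top of the func1 results
                        Collectors.mapping(e -> func2.apply(e.getValue()), Collectors.toList()),
                        Map::entry));
    }

    public static void main(String[] args) {
        Map<Integer, BigDecimal> data = new LinkedHashMap<>();
        data.put(1, BigDecimal.valueOf(30));
        data.put(2, BigDecimal.valueOf(40));
        data.put(3, BigDecimal.valueOf(50));

        Map.Entry<Map<Integer, BigDecimal>, List<BigDecimal>> result = transform(
                data,
                x -> x.add(BigDecimal.valueOf(10)),
                y -> y.add(BigDecimal.valueOf(-20)));

        System.out.println(result.getKey());
        System.out.println(result.getValue());
    }
}
```

The input map is never modified; both results are new objects produced by the collectors.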
Your code can be simplified this way:
List<BigDecimal> list = data.values().stream()
.map(func1)
.map(func2)
.collect(Collectors.toList());
Your goal is to apply these functions to all the BigDecimal values in the Map. Map::values returns a Collection<BigDecimal> view of those values, so you can stream that collection directly, as in the snippet above (assuming data already contains some entries).
I discourage you from iterating all the entries (Set<Entry<Integer, BigDecimal>>) since you only need to work with the values.
Try it this way. It returns an Object[2] array whose first element is the map and whose second element is the list:
Map<Integer, BigDecimal> data = new HashMap<>();
data.put(1, BigDecimal.valueOf(30));
data.put(2, BigDecimal.valueOf(40));
data.put(3, BigDecimal.valueOf(50));
Function<BigDecimal, BigDecimal> func1 = x -> x.add(BigDecimal.valueOf(10));//This could be anything
Function<BigDecimal, BigDecimal> func2 = y -> y.add(BigDecimal.valueOf(-20));//This could be anything
Object[] o = data.entrySet().stream()
.map(AbstractMap.SimpleEntry::new)
.map(entry -> {
entry.setValue(func1.apply(entry.getValue()));
return entry;
})
.collect(Collectors.collectingAndThen(toMap(Map.Entry::getKey, Map.Entry::getValue), a -> {
List<BigDecimal> bigDecimals = a.values().stream().map(func2).collect(Collectors.toList());
return new Object[]{a,bigDecimals};
}));
System.out.println(data);
System.out.println((Map<Integer, BigDecimal>)o[0]);
System.out.println((List<BigDecimal>)o[1]);
Output:
Original Map: {1=30, 2=40, 3=50}
func1 map: {1=40, 2=50, 3=60}
func1+func2 list: [20, 30, 40]
