I have a class like this:
class MultiDataPoint {
private DateTime timestamp;
private Map<String, Number> keyToData;
}
and I want to produce, for each MultiDataPoint:
class DataSet {
public String key;
List<DataPoint> dataPoints;
}
class DataPoint{
DateTime timeStamp;
Number data;
}
Of course, a 'key' can be the same across multiple MultiDataPoints.
So given a List<MultiDataPoint>, how do I use Java 8 streams to convert to List<DataSet>?
This is how I am currently doing the conversion without streams:
Collection<DataSet> convertMultiDataPointToDataSet(List<MultiDataPoint> multiDataPoints)
{
Map<String, DataSet> setMap = new HashMap<>();
multiDataPoints.forEach(pt -> {
Map<String, Number> data = pt.getData();
data.entrySet().forEach(e -> {
String seriesKey = e.getKey();
DataSet dataSet = setMap.get(seriesKey);
if (dataSet == null)
{
dataSet = new DataSet(seriesKey);
setMap.put(seriesKey, dataSet);
}
dataSet.dataPoints.add(new DataPoint(pt.getTimestamp(), e.getValue()));
});
});
return setMap.values();
}
It's an interesting question, because it shows that there are a lot of different approaches to achieve the same result. Below I show three different implementations.
Default methods in the Collections Framework: Java 8 added some methods to the collection classes that are not directly related to the Stream API. Using these methods, you can significantly simplify the non-stream implementation:
Collection<DataSet> convert(List<MultiDataPoint> multiDataPoints) {
Map<String, DataSet> result = new HashMap<>();
multiDataPoints.forEach(pt ->
pt.keyToData.forEach((key, value) ->
result.computeIfAbsent(
key, k -> new DataSet(k, new ArrayList<>()))
.dataPoints.add(new DataPoint(pt.timestamp, value))));
return result.values();
}
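As a minimal, self-contained sketch of the computeIfAbsent idiom used above, the following reduces the question's types to plain strings and integers. The class name ComputeIfAbsentSketch and the method regroup are hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ComputeIfAbsentSketch {
    // Regroups a list of per-row maps into one list of values per key.
    static Map<String, List<Integer>> regroup(List<Map<String, Integer>> rows) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (Map<String, Integer> row : rows) {
            row.forEach((key, value) ->
                // computeIfAbsent returns the list already stored for the key,
                // or creates, stores and returns a new one
                result.computeIfAbsent(key, k -> new ArrayList<>()).add(value));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> rows =
            List.of(Map.of("a", 1), Map.of("a", 2, "b", 3));
        System.out.println(regroup(rows));
    }
}
```

The get/null-check/put sequence from the question collapses into the single computeIfAbsent call.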
Stream API with flatten and intermediate data structure: The following implementation is almost identical to the solution provided by Stuart Marks. In contrast to his solution, it uses an anonymous inner class as the intermediate data structure.
Collection<DataSet> convert(List<MultiDataPoint> multiDataPoints) {
return multiDataPoints.stream()
.flatMap(mdp -> mdp.keyToData.entrySet().stream().map(e ->
new Object() {
String key = e.getKey();
DataPoint dataPoint = new DataPoint(mdp.timestamp, e.getValue());
}))
.collect(
collectingAndThen(
groupingBy(t -> t.key, mapping(t -> t.dataPoint, toList())),
m -> m.entrySet().stream().map(e -> new DataSet(e.getKey(), e.getValue())).collect(toList())));
}
Stream API with map merging: Instead of flattening the original data structures, you can also create a Map for each MultiDataPoint, and then merge all maps into a single map with a reduce operation. The code is a bit simpler than the above solution:
Collection<DataSet> convert(List<MultiDataPoint> multiDataPoints) {
return multiDataPoints.stream()
.map(mdp -> mdp.keyToData.entrySet().stream()
.collect(toMap(e -> e.getKey(), e -> asList(new DataPoint(mdp.timestamp, e.getValue())))))
.reduce(new HashMap<>(), mapMerger())
.entrySet().stream()
.map(e -> new DataSet(e.getKey(), e.getValue()))
.collect(toList());
}
You can find an implementation of the map merger within the Collectors class. Unfortunately, it is a bit tricky to access it from the outside. Following is an alternative implementation of the map merger:
<K, V> BinaryOperator<Map<K, List<V>>> mapMerger() {
return (lhs, rhs) -> {
Map<K, List<V>> result = new HashMap<>();
lhs.forEach((key, value) -> result.computeIfAbsent(key, k -> new ArrayList<>()).addAll(value));
rhs.forEach((key, value) -> result.computeIfAbsent(key, k -> new ArrayList<>()).addAll(value));
return result;
};
}
To do this, I had to come up with an intermediate data structure:
class KeyDataPoint {
String key;
DateTime timestamp;
Number data;
// obvious constructor and getters
}
With this in place, the approach is to "flatten" each MultiDataPoint into a list of (timestamp, key, data) triples and stream together all such triples from the list of MultiDataPoint.
Then, we apply a groupingBy operation on the string key in order to gather the data for each key together. Note that a simple groupingBy would result in a map from each string key to a list of the corresponding KeyDataPoint triples. We don't want the triples; we want DataPoint instances, which are (timestamp, data) pairs. To do this we apply a "downstream" collector of the groupingBy which is a mapping operation that constructs a new DataPoint by getting the right values from the KeyDataPoint triple. The downstream collector of the mapping operation is simply toList which collects the DataPoint objects of the same group into a list.
Now we have a Map<String, List<DataPoint>> and we want to convert it to a collection of DataSet objects. We simply stream out the map entries and construct DataSet objects, collect them into a list, and return it.
The code ends up looking like this:
Collection<DataSet> convertMultiDataPointToDataSet(List<MultiDataPoint> multiDataPoints) {
return multiDataPoints.stream()
.flatMap(mdp -> mdp.getData().entrySet().stream()
.map(e -> new KeyDataPoint(e.getKey(), mdp.getTimestamp(), e.getValue())))
.collect(groupingBy(KeyDataPoint::getKey,
mapping(kdp -> new DataPoint(kdp.getTimestamp(), kdp.getData()), toList())))
.entrySet().stream()
.map(e -> new DataSet(e.getKey(), e.getValue()))
.collect(toList());
}
I took some liberties with constructors and getters, but I think they should be obvious.
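For reference, here is a runnable sketch of the same pipeline with the omitted constructors and getters filled in. It rests on several assumptions that are not in the original: records (Java 16+) stand in for the classes, java.time.Instant stands in for DateTime, and a TreeMap is passed to groupingBy only so that the result order is deterministic.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class FlattenAndGroupSketch {
    // Assumed shapes of the question's classes, written as records
    record MultiDataPoint(Instant timestamp, Map<String, Number> data) {}
    record DataPoint(Instant timestamp, Number data) {}
    record DataSet(String key, List<DataPoint> dataPoints) {}
    // Intermediate (key, timestamp, data) triple
    record KeyDataPoint(String key, Instant timestamp, Number data) {}

    static List<DataSet> convert(List<MultiDataPoint> points) {
        return points.stream()
            // flatten every MultiDataPoint into one triple per map entry
            .flatMap(mdp -> mdp.data().entrySet().stream()
                .map(e -> new KeyDataPoint(e.getKey(), mdp.timestamp(), e.getValue())))
            // group by key; the downstream collector maps each triple to a DataPoint
            .collect(Collectors.groupingBy(KeyDataPoint::key, TreeMap::new,
                Collectors.mapping(k -> new DataPoint(k.timestamp(), k.data()),
                    Collectors.toList())))
            // turn every map entry into a DataSet
            .entrySet().stream()
            .map(e -> new DataSet(e.getKey(), e.getValue()))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant t1 = Instant.parse("2024-01-01T00:00:00Z");
        Instant t2 = Instant.parse("2024-01-01T00:01:00Z");
        List<MultiDataPoint> input = List.of(
            new MultiDataPoint(t1, Map.<String, Number>of("cpu", 1, "mem", 2)),
            new MultiDataPoint(t2, Map.<String, Number>of("cpu", 3)));
        convert(input).forEach(System.out::println);
    }
}
```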
Related
This question is about Java Streams' groupingBy capability.
Suppose I have a class, WorldCup:
public class WorldCup {
int year;
Country champion;
// all-arg constructor, getter/setters, etc
}
and an enum, Country:
public enum Country {
Brazil, France, USA
}
and the following code snippet:
WorldCup wc94 = new WorldCup(1994, Country.Brazil);
WorldCup wc98 = new WorldCup(1998, Country.France);
List<WorldCup> wcList = new ArrayList<WorldCup>();
wcList.add(wc94);
wcList.add(wc98);
Map<Country, List<Integer>> championsMap = wcList.stream()
.collect(Collectors.groupingBy(WorldCup::getCountry, Collectors.mapping(WorldCup::getYear, Collectors.toList())));
After running this code, championsMap will contain:
Brazil: [1994]
France: [1998]
Is there a succinct way to have this list include an entry for all of the values of the enum? What I'm looking for is:
Brazil: [1994]
France: [1998]
USA: []
There are several approaches you can take.
The map used for accumulating the stream data can be prepopulated with an entry for every enum member. To access all enum members you can use the values() method or EnumSet.allOf().
This can be achieved with the three-argument version of collect() or with a custom collector created via Collector.of().
Map<Country, List<Integer>> championsMap = wcList.stream()
.collect(
() -> EnumSet.allOf(Country.class).stream() // supplier
.collect(Collectors.toMap(
Function.identity(),
c -> new ArrayList<>()
)),
(Map<Country, List<Integer>> map, WorldCup next) -> // accumulator
map.get(next.getCountry()).add(next.getYear()),
(left, right) -> // combiner
right.forEach((k, v) -> left.get(k).addAll(v))
);
Another option is to add the missing entries to the map after the reduction of the stream has finished.
For that we can use the built-in collector collectingAndThen().
Map<Country, List<Integer>> championsMap = wcList.stream()
.collect(Collectors.collectingAndThen(
Collectors.groupingBy(WorldCup::getCountry,
Collectors.mapping(WorldCup::getYear,
Collectors.toList())),
map -> {
EnumSet.allOf(Country.class)
.forEach(country -> map.computeIfAbsent(country, k -> new ArrayList<>())); // if you're not going to mutate these lists - use Collections.emptyList()
return map;
}
));
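If a stream is not required, the same prepopulation idea can be sketched with an EnumMap and two plain loops. This is a variation rather than one of the approaches above, and the record WorldCup (Java 16+) with its accessors is an assumed stand-in for the question's class.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class EnumPrepopulateSketch {
    enum Country { Brazil, France, USA }

    // Assumed stand-in for the question's WorldCup class
    record WorldCup(int year, Country champion) {}

    static Map<Country, List<Integer>> yearsByChampion(List<WorldCup> cups) {
        // EnumMap iterates its keys in enum declaration order
        Map<Country, List<Integer>> result = new EnumMap<>(Country.class);
        // prepopulate an empty list for every enum constant
        for (Country country : Country.values()) {
            result.put(country, new ArrayList<>());
        }
        // every champion already has a list, so get() never returns null here
        for (WorldCup cup : cups) {
            result.get(cup.champion()).add(cup.year());
        }
        return result;
    }

    public static void main(String[] args) {
        List<WorldCup> cups = List.of(
            new WorldCup(1994, Country.Brazil),
            new WorldCup(1998, Country.France));
        // prints {Brazil=[1994], France=[1998], USA=[]}
        System.out.println(yearsByChampion(cups));
    }
}
```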
I have a fun puzzler. Say I have a list of String values:
["A", "B", "C"]
Then I have to query another system for a Map<User, Long> of users with an attribute that corresponds to those values in the list with a count:
{name="Annie", key="A"} -> 23
{name="Paul", key="C"} -> 16
I need to return a new List<UserCount> with a count of each key. So I expect:
{key="A", count=23},
{key="B", count=0},
{key="C", count=16}
But I'm having a hard time computing when one of my User objects has no corresponding count in the map.
I know that map.computeIfAbsent() does what I need, but how can I apply it based on the contents of the original list?
I think I need to stream over the original list, then apply compute? So I have:
valuesList.stream()
.map(it -> valuesMap.computeIfAbsent(it.getKey(), k -> 0L))
...
But here's where I get stuck. Can anyone provide any insight as to how I accomplish what I need?
You can create an auxiliary Map<String, Long> which will associate each string key with the count and then generate a list of UserCount based on it.
Example:
public record User(String name, String key) {}
public record UserCount(String key, long count) {}
public static void main(String[] args) {
List<String> keys = List.of("A", "B", "C");
Map<User, Long> countByUser =
Map.of(new User("Annie", "A"), 23L,
new User("Paul", "C"), 16L);
Map<String, Long> countByKey = countByUser.entrySet().stream()
.collect(Collectors.groupingBy(entry -> entry.getKey().key(),
Collectors.summingLong(Map.Entry::getValue)));
List<UserCount> userCounts = keys.stream()
.map(key -> new UserCount(key, countByKey.getOrDefault(key, 0L)))
.collect(Collectors.toList());
System.out.println(userCounts);
}
Output
[UserCount[key=A, count=23], UserCount[key=B, count=0], UserCount[key=C, count=16]]
Regarding the idea of using computeIfAbsent() inside a stream: this approach is wrong and discouraged by the documentation of the Stream API.
Sure, you can use computeIfAbsent() to solve this problem, but not in conjunction with streams. It's not a good idea to create a stream that operates via side effects (at least not without a compelling reason).
And I guess you don't even need Java 8's computeIfAbsent(); a plain and simple putIfAbsent() will be sufficient.
The following code will produce the same result:
Map<String, Long> countByKey = new HashMap<>();
countByUser.forEach((k, v) -> countByKey.merge(k.key(), v, Long::sum));
keys.forEach(k -> countByKey.putIfAbsent(k, 0L));
List<UserCount> userCounts = keys.stream()
.map(key -> new UserCount(key, countByKey.getOrDefault(key, 0L)))
.collect(Collectors.toList());
And instead of applying forEach() on a map and a list, you can write two enhanced for loops if this option looks convoluted.
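A sketch of that loop-based variant, assuming the same User and UserCount records as in the example above (the class name ForLoopCountSketch and the method count are hypothetical):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ForLoopCountSketch {
    record User(String name, String key) {}
    record UserCount(String key, long count) {}

    static List<UserCount> count(List<String> keys, Map<User, Long> countByUser) {
        // first loop: sum the counts per key
        Map<String, Long> countByKey = new HashMap<>();
        for (Map.Entry<User, Long> entry : countByUser.entrySet()) {
            countByKey.merge(entry.getKey().key(), entry.getValue(), Long::sum);
        }
        // second loop: one UserCount per requested key, 0 when the key is absent
        List<UserCount> result = new ArrayList<>();
        for (String key : keys) {
            result.add(new UserCount(key, countByKey.getOrDefault(key, 0L)));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<User, Long> countByUser =
            Map.of(new User("Annie", "A"), 23L, new User("Paul", "C"), 16L);
        System.out.println(count(List.of("A", "B", "C"), countByUser));
    }
}
```

Because getOrDefault() already supplies the zero, this version does not need the putIfAbsent() step at all.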
Another educational and parallel-friendly version is to gather the logic in one place and build your own custom accumulator and combiner for collect():
public static void main(String[] args) {
Map<User, Long> countByUser =
Map.of(new User("Alice", "A"), 23L,
new User("Bob", "C"), 16L);
List<String> keys = List.of("A", "B", "C");
UserCountAggregator userCountAggregator =
countByUser.entrySet()
.parallelStream()
.collect(UserCountAggregator::new,
UserCountAggregator::accumulator,
UserCountAggregator::combiner);
List<UserCount> userCounts = userCountAggregator.getUserCounts(keys);
System.out.println(userCounts);
}
Output
[UserCount(key=A, count=23), UserCount(key=B, count=0), UserCount(key=C, count=16)]
User and UserCount classes with Lombok's @Value
@Value
class User {
private String name;
private String key;
}
@Value
class UserCount {
private String key;
private long count;
}
And the UserCountAggregator which contains your custom accumulator and combiner
class UserCountAggregator {
private Map<String, Long> keyCounts = new HashMap<>();
public void accumulator(Map.Entry<User, Long> userLongEntry) {
keyCounts.put(userLongEntry.getKey().getKey(),
keyCounts.getOrDefault(userLongEntry.getKey().getKey(), 0L)
+ userLongEntry.getValue());
}
public void combiner(UserCountAggregator other) {
other.keyCounts
.forEach((key, value) -> keyCounts.merge(key, value, Long::sum));
}
public List<UserCount> getUserCounts(List<String> keys) {
return keys.stream()
.map(key -> new UserCount(key, keyCounts.getOrDefault(key, 0L)))
.collect(Collectors.toList());
}
}
final Map<User,Long> valuesMap = ...
// First, map keys to counts (assuming keys are unique for each user)
final Map<String,Long> keyToCountMap = valuesMap.entrySet().stream()
.collect(Collectors.toMap(e -> e.getKey().key, e -> e.getValue()));
final List<UserCount> list = valuesList.stream()
.map(key -> new UserCount (key, keyToCountMap.getOrDefault(key, 0L)))
.collect(Collectors.toList());
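The uniqueness assumption in the snippet above matters: the two-argument Collectors.toMap throws an IllegalStateException when two entries map to the same key. A minimal sketch of a variant that sums the counts instead, assuming a User record with a key() accessor (the class and method names are hypothetical):

```java
import java.util.Map;
import java.util.stream.Collectors;

class ToMapMergeSketch {
    // Assumed stand-in for the question's User class
    record User(String name, String key) {}

    static Map<String, Long> countByKey(Map<User, Long> valuesMap) {
        return valuesMap.entrySet().stream()
            .collect(Collectors.toMap(
                e -> e.getKey().key(), // map key: the user's key attribute
                Map.Entry::getValue,   // map value: the user's count
                Long::sum));           // merge function for users sharing a key
    }

    public static void main(String[] args) {
        Map<User, Long> valuesMap = Map.of(
            new User("Annie", "A"), 23L,
            new User("Ann", "A"), 2L,
            new User("Paul", "C"), 16L);
        System.out.println(countByKey(valuesMap));
    }
}
```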
We have the following:
public List<Balance> mapToBalancesWithSumAmounts(List<MonthlyBalancedBooking> entries) {
return entries
.stream()
.collect(
groupingBy(
MonthlyBalancedBooking::getValidFor,
summingDouble(MonthlyBalancedBooking::getAmount)
)
)
.entrySet()
.stream()
.map(localDateDoubleEntry -> new Balance(localDateDoubleEntry.getValue(), localDateDoubleEntry.getKey()))
.collect(toList());
}
Is it possible to avoid the second stream() pass in the code, so that the result of groupingBy() is already a list? We would need a way to pass the map() function to collect() or groupingBy(). Is that possible in Java 8?
That isn't possible, because the value you need when mapping to the Balance objects can only be evaluated once all the entries of the MonthlyBalancedBooking list have been iterated:
new Balance(localDateDoubleEntry.getValue(), localDateDoubleEntry.getKey())
An alternative that keeps everything within a single terminal operation is to use collectingAndThen:
public List<Balance> mapToBalancesWithSumAmounts(List<MonthlyBalancedBooking> entries) {
return entries.stream()
.collect(Collectors.collectingAndThen(
Collectors.groupingBy(MonthlyBalancedBooking::getValidFor,
Collectors.summingDouble(MonthlyBalancedBooking::getAmount)),
map -> map.entrySet().stream()
.map(entry -> new Balance(entry.getValue(), entry.getKey()))
.collect(Collectors.toList())));
}
A simple way is to use the toMap() collector with a merge function, like this:
List<Balance> balances = new ArrayList<>(entries.stream()
.collect(toMap(MonthlyBalancedBooking::getValidFor, m -> new Balance(m.getAmount(),
m.getValidFor()),Balance::merge)).values());
I assumed the Balance class has these properties:
class Balance {
private Double value;
private Integer key;
public Balance merge(Balance b) {
this.value += b.getValue();
return this;
}
}
I have a simple User class with a String and an int property.
I would like to add two lists of users this way:
if the names are equal, the numbers should be added together and the sum becomes the new value.
The new list should include all users with the proper values.
Like this:
List1: { [a:2], [b:3] }
List2: { [b:4], [c:5] }
ResultList: {[a:2], [b:7], [c:5]}
User definition:
public class User {
private String name;
private int comments;
}
My method:
public List<User> addTwoList(List<User> first, List<User> sec) {
List<User> result = new ArrayList<>();
for (int i=0; i<first.size(); i++) {
Boolean bsin = false;
Boolean isin = false;
for (int j=0; j<sec.size(); j++) {
isin = false;
if (first.get(i).getName().equals(sec.get(j).getName())) {
int value= first.get(i).getComments() + sec.get(j).getComments();
result.add(new User(first.get(i).getName(), value));
isin = true;
bsin = true;
}
if (!isin) {result.add(sec.get(j));}
}
if (!bsin) {result.add(first.get(i));}
}
return result;
}
But it adds a whole lot of things to the list.
This is better done via the toMap collector:
Collection<User> result = Stream
.concat(first.stream(), second.stream())
.collect(Collectors.toMap(
User::getName,
u -> new User(u.getName(), u.getComments()),
(l, r) -> {
l.setComments(l.getComments() + r.getComments());
return l;
}))
.values();
First, concatenate both the lists into a single Stream<User> via Stream.concat.
Second, we use the toMap collector to merge users that happen to have the same Name and get back a result of Collection<User>.
If you strictly want a List<User>, pass the result into the ArrayList constructor, i.e. List<User> resultSet = new ArrayList<>(result);
Kudos to @davidxxx: you could collect to a list directly from the pipeline and avoid creating an intermediate variable with:
List<User> result = Stream
.concat(first.stream(), second.stream())
.collect(Collectors.toMap(
User::getName,
u -> new User(u.getName(), u.getComments()),
(l, r) -> {
l.setComments(l.getComments() + r.getComments());
return l;
}))
.values()
.stream()
.collect(Collectors.toList());
You have to use an intermediate map to merge users from both lists by summing their comment counts.
One way is with streams, as shown in Aomine's answer. Here's another way, without streams:
Map<String, Integer> map = new LinkedHashMap<>();
list1.forEach(u -> map.merge(u.getName(), u.getComments(), Integer::sum));
list2.forEach(u -> map.merge(u.getName(), u.getComments(), Integer::sum));
Now, you can create a list of users, as follows:
List<User> result = new ArrayList<>();
map.forEach((name, comments) -> result.add(new User(name, comments)));
This assumes User has a constructor that accepts name and comments.
EDIT: As suggested by @davidxxx, we could improve the code by factoring out the first part:
BiConsumer<List<User>, Map<String, Integer>> action = (list, map) ->
list.forEach(u -> map.merge(u.getName(), u.getComments(), Integer::sum));
Map<String, Integer> map = new LinkedHashMap<>();
action.accept(list1, map);
action.accept(list2, map);
This refactoring avoids the repetition, in keeping with the DRY principle.
There is a pretty direct way using Collectors.groupingBy and Collectors.reducing that doesn't require setters, which is the biggest advantage since you can keep User immutable:
Collection<Optional<User>> d = Stream
.of(first, second) // start with Stream<List<User>>
.flatMap(List::stream) // flatten to Stream<User>
.collect(Collectors.groupingBy( // collect to Map<String, Optional<User>>
User::getName, // by name (the key)
// and reduce each group into a single User
Collectors.reducing((l, r) -> new User(l.getName(), l.getComments() + r.getComments()))))
.values(); // return the values of the Map<String, Optional<User>>
Unfortunately, the result is a Collection<Optional<User>>, because the single-argument reducing collector returns an Optional to cover the case where there is nothing to reduce. You can stream the values and use map() to get rid of the Optional, or use Collectors.collectingAndThen*:
Collection<User> d = Stream
.of(first, second) // start with Stream<List<User>>
.flatMap(List::stream) // flatten to Stream<User>
.collect(Collectors.groupingBy( // collect to Map<String, User>
User::getName, // by name (the key)
Collectors.collectingAndThen( // reduce the list into a single User
Collectors.reducing((l, r) -> new User(l.getName(), l.getComments() + r.getComments())),
Optional::get))) // and extract from the Optional
.values();
* Thanks to @Aomine
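If the Optional is the main nuisance, a different technique avoids it while still keeping User immutable: toMap with a merge function that builds a new User instead of mutating one. This is a sketch under assumptions, not the approach above: User is written as a record (Java 16+), and a LinkedHashMap is used only to keep the names in first-seen order.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ImmutableMergeSketch {
    // Immutable stand-in for the question's User class
    record User(String name, int comments) {}

    static List<User> merge(List<User> first, List<User> second) {
        return new ArrayList<>(Stream.concat(first.stream(), second.stream())
            .collect(Collectors.toMap(
                User::name,          // key: the user name
                Function.identity(), // value: the user itself
                // merge: build a new User rather than mutating an existing one
                (l, r) -> new User(l.name(), l.comments() + r.comments()),
                LinkedHashMap::new)) // keeps first-seen order of names
            .values());
    }

    public static void main(String[] args) {
        List<User> first = List.of(new User("a", 2), new User("b", 3));
        List<User> second = List.of(new User("b", 4), new User("c", 5));
        System.out.println(merge(first, second));
    }
}
```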
As an alternative, here is a fairly straightforward and efficient approach:
stream the elements,
collect them into a Map<String, Integer> that associates each name with the sum of its comments (int),
stream the entries of the collected map to create the List of User.
Alternatively, for the third step you could apply a finishing transformation to the Map collector with collectingAndThen(groupingBy()..., m -> ...
but I don't always find that very readable, and here we can do without it.
It would give:
List<User> users =
Stream.concat(first.stream(), second.stream())
.collect(groupingBy(User::getName, summingInt(User::getComments)))
.entrySet()
.stream()
.map(e -> new User(e.getKey(), e.getValue()))
.collect(toList());
My traditional code would look like this:
List<MyObject> transform(Collection<java.util.Map.Entry<String, List<String>>> input) {
List<MyObject> output = new LinkedList<>();
for (Entry<String, List<String>> pair : input) {
for (String value : pair.getValue()) {
output.add(new MyObject(pair.getKey(), value));
}
}
return output;
}
Can I do the same with lambda expressions? I've tried a few things, but I can't get it to work. The outer collection is unsorted, but each List<String> is sorted. The result objects may appear in the result list in any order, except that objects created from the same key String should follow each other and preserve the order of their values. Is this at all possible?
input.stream()
.flatMap(e -> e.getValue()
.stream()
.map(v -> new MyObject(e.getKey(), v)))
.collect(Collectors.toCollection(LinkedList::new));
You can use streams and Stream.flatMap as in this answer, however I think that the code is much clearer if you stick to loops, either traditional ones as in your question, or modern ones:
List<MyObject> output = new LinkedList<>();
input.forEach(pair -> pair.getValue()
.forEach(value -> output.add(new MyObject(pair.getKey(), value))));
By the way, I'd use an ArrayList instead of a LinkedList.
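A sketch of the stream version collecting into an ArrayList, assuming MyObject is a simple (key, value) holder, written here as a record (Java 16+):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FlatMapToListSketch {
    // Assumed shape of the question's MyObject class
    record MyObject(String key, String value) {}

    static List<MyObject> transform(Collection<Map.Entry<String, List<String>>> input) {
        return input.stream()
            // one MyObject per value; values of one key stay adjacent and in order
            .flatMap(e -> e.getValue().stream().map(v -> new MyObject(e.getKey(), v)))
            .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        System.out.println(transform(List.of(
            Map.entry("k", List.of("1", "2")),
            Map.entry("j", List.of("3")))));
    }
}
```

For a sequential stream over an ordered source, flatMap keeps the encounter order, so the values of one key stay together in their original order.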