What is the Aggregate/Reduce version of this loop using Java streams? - java

I'm trying to come up to speed on the Streams API, but I'm really used to the simplicity of the C# LINQ extension methods and the ability to use the yield keyword to create iterators. Normally I would use:
list.Aggregate(set, (acc, a) => { acc.Add(a.Id); return acc; });
Or something like that, but I'm not immediately seeing how this maps to the Streams API.
List<SomeObject> objs = ...
Set<String> ids = new HashSet<>();
for (SomeObject a : objs) {
ids.add(a.getId());
}
assertThat(ids.size(), is(objs.size()));
EDIT:
Changed SomeObject.getId() to a.getId() in the for loop.

The following statement should be equivalent to the for-loop in your example.
Set<String> ids = objs.stream()
.map(a -> a.getId())
.collect(Collectors.toSet());
You can also use a method reference instead of a lambda expression:
Set<String> ids = objs.stream()
.map(SomeObject::getId)
.collect(Collectors.toSet());
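If you want something closer in shape to C#'s Aggregate with a mutable accumulator, the three-argument Stream.collect is probably the nearest match. The following is only a minimal sketch: the SomeObject class body and the AggregateSketch wrapper are invented for illustration, and only getId() comes from the question.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AggregateSketch {

    // Hypothetical stand-in for the question's SomeObject.
    static final class SomeObject {
        private final String id;

        SomeObject(String id) {
            this.id = id;
        }

        String getId() {
            return id;
        }
    }

    // Mutable reduction: the supplier creates the accumulator, the second
    // argument adds one id to it, and the combiner merges partial sets
    // (the combiner only matters when the stream runs in parallel).
    static Set<String> collectIds(List<SomeObject> objs) {
        return objs.stream()
                .collect(HashSet::new,
                         (acc, a) -> acc.add(a.getId()),
                         Set::addAll);
    }

    public static void main(String[] args) {
        List<SomeObject> objs = Arrays.asList(
                new SomeObject("a"), new SomeObject("b"), new SomeObject("a"));
        // Duplicate ids collapse because the accumulator is a set.
        System.out.println(collectIds(objs));
    }
}
```

Collectors.toSet() does essentially this for you, so the explicit form is mainly useful when you need a specific accumulator type or a custom accumulation step.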

Related

Java 8 stream optimization for nested level

How can we optimize this stream so that it collects both the root-level and the nested-level IDs into a single set?
final Set<String> groupedUsers = new HashSet<>();
groups.stream().forEach(group -> {
groupedUsers.add(group.getTeamLeadId());
groupedUsers.addAll(group.getTeamMemberIds().stream().collect(Collectors.toSet()));
});
Don't use forEach to add elements to a collection.
Set<String> groupedUsers = groups.stream()
.flatMap(g -> Stream.concat(
Stream.of(g.getTeamLeadId()), g.getTeamMemberIds().stream()))
.collect(toSet());
Or just use a plain old (enhanced) for loop. Streams don't give you any clear advantage here.
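For anyone who wants to try the flatMap variant in isolation, here is a self-contained sketch. The Group class is hypothetical: it models only the two getters used in the question, and it assumes getTeamMemberIds() returns a List<String>.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GroupedUsersSketch {

    // Hypothetical Group type; only the getters from the question are modeled.
    static final class Group {
        private final String teamLeadId;
        private final List<String> teamMemberIds;

        Group(String teamLeadId, List<String> teamMemberIds) {
            this.teamLeadId = teamLeadId;
            this.teamMemberIds = teamMemberIds;
        }

        String getTeamLeadId() {
            return teamLeadId;
        }

        List<String> getTeamMemberIds() {
            return teamMemberIds;
        }
    }

    // Each group becomes one small stream (lead first, then members);
    // flatMap splices those streams together and toSet removes duplicates.
    static Set<String> groupedUsers(List<Group> groups) {
        return groups.stream()
                .flatMap(g -> Stream.concat(
                        Stream.of(g.getTeamLeadId()),
                        g.getTeamMemberIds().stream()))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Group> groups = Arrays.asList(
                new Group("lead1", Arrays.asList("m1", "m2")),
                new Group("lead2", Arrays.asList("m2", "m3")));
        System.out.println(groupedUsers(groups));
    }
}
```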
Well, the Stream API and lambdas don't give you any advantage here.
"optimize this stream to collect"
If I understand correctly what you mean by "optimize", I would go with an old-style for-each loop:
Set<String> groupedUsers = new HashSet<>();
for (Group g : groups) {
groupedUsers.add(g.getTeamLeadId());
groupedUsers.addAll(g.getTeamMemberIds());
}
If all you have is a hammer, everything looks like a nail.
You can skip stream creation in groups.stream().forEach() because of Iterable.forEach()
groups.forEach(group -> {
..
});
Streams generally cost more than a plain loop, because the JIT has many years of loop optimization work behind it.
Or maybe even:
Set<String> groupedUsers = new HashSet<>();
groups.forEach(x -> {
groupedUsers.add(x.getTeamLeadId());
groupedUsers.addAll(x.getTeamMemberIds());
});

How to convert a Collection<Pair<K, Collection<V>> to a List<MyObject<K,V>> using Java stream API?

My traditional code would look like this:
List<MyObject> transform(Collection<java.util.Map.Entry<String, List<String>>> input) {
List<MyObject> output = new LinkedList<>();
for (Entry<String, List<String>> pair : input) {
for (String value : pair.getValue()) {
output.add(new MyObject(pair.getKey(), value));
}
}
return output;
}
Can I do the same with lambda expressions? I’ve tried around, but I don’t get it. The outer collection is unsorted, but the List<String> is sorted. The result objects may return in the result list without any order, with the exception that objects created from the same key String should follow each other to preserve the order of the value. Is this at all possible?
input.stream()
.flatMap(e -> e.getValue()
.stream()
.map(v -> new MyObject(e.getKey(), v)))
.collect(Collectors.toCollection(LinkedList::new));
You can use streams and Stream.flatMap as in this answer, however I think that the code is much clearer if you stick to loops, either traditional ones as in your question, or modern ones:
List<MyObject> output = new LinkedList<>();
input.forEach(pair -> pair.getValue()
.forEach(value -> output.add(new MyObject(pair.getKey(), value))));
By the way, I'd use an ArrayList instead of a LinkedList.
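Putting both suggestions together, the flatMap version can collect into an ArrayList. This is a sketch under assumptions: MyObject is a hypothetical two-field class, and the input is a sequential stream over a List, where the values of one key stay adjacent and in their original order.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map.Entry;
import java.util.stream.Collectors;

class TransformSketch {

    // Hypothetical result type holding one key/value pair.
    static final class MyObject {
        private final String key;
        private final String value;

        MyObject(String key, String value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return key + ":" + value;
        }
    }

    // Each entry expands to a stream of MyObject, one per value; flatMap
    // emits the whole inner stream before moving on to the next entry.
    static List<MyObject> transform(Collection<Entry<String, List<String>>> input) {
        return input.stream()
                .flatMap(e -> e.getValue().stream()
                        .map(v -> new MyObject(e.getKey(), v)))
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        List<Entry<String, List<String>>> input = new ArrayList<>();
        input.add(new SimpleEntry<>("k1", Arrays.asList("a", "b")));
        input.add(new SimpleEntry<>("k2", Arrays.asList("c")));
        System.out.println(transform(input));
    }
}
```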

Nested collections lambda iteration

Suppose I have an object containing a collection, each element of that collection contains a collection, and each of those collections contains a collection in turn.
And I want to iterate on the deepest objects and apply the same code to it.
The imperative way is trivial, but is there a way to lambda-fy this all?
Here is how the code looks today:
MyObject o;
SecretType computedThingy = 78;
for (FirstLevelOfCollection coll : o.getList()) {
for (SecondLevelOfCollection colColl : coll.getSet()) {
for (MyCoolTinyObjects mcto : colColl.getFoo()) {
mcto.setSecretValue(computedThingy);
}
}
}
I can see how to make a lambda out of the deepest loop:
colColl.getFoo().stream().forEach(x -> x.setSecretValue(computedThingy));
But can I do more?
flatMap is available for such a purpose. What you get here is iteration over all elements of the various deepest collections as if they were a single collection:
o.getList().stream()
.flatMap(c1 -> c1.getSet().stream())
.flatMap(c2 -> c2.getFoo().stream())
.forEach(x -> x.setSecretValue(computedThingy));
flatMap to the rescue, simple example with a nested collection of String
See also:
Java 8 Streams FlatMap method example
Turn a List of Lists into a List Using Lambdas
Set<List<List<String>>> outerMostSet = new HashSet<>();
List<List<String>> middleList = new ArrayList<>();
List<String> innerMostList = new ArrayList<>();
innerMostList.add("foo");
innerMostList.add("bar");
middleList.add(innerMostList);
List<String> anotherInnerMostList = new ArrayList<>();
anotherInnerMostList.add("another foo");
middleList.add(anotherInnerMostList);
outerMostSet.add(middleList);
outerMostSet.stream()
.flatMap(mid -> mid.stream())
.flatMap(inner -> inner.stream())
.forEach(System.out::println);
Produces
foo
bar
another foo

Stream: Filter on children, return the parent

Assume a class MyClass:
public class MyClass {
private final Integer myId;
private final String myCSVListOfThings;
public MyClass(Integer myId, String myCSVListOfThings) {
this.myId = myId;
this.myCSVListOfThings = myCSVListOfThings;
}
// Getters, Setters, etc
}
And this Stream:
final Stream<MyClass> streamOfObjects = Stream.of(
new MyClass(1, "thing1;thing2;thing3"),
new MyClass(2, "thing2;thing3;thing4"),
new MyClass(3, "thingX;thingY;thingZ"));
I want to return every instance of MyClass that contains an entry "thing2" in myCSVListOfThings.
If I wanted a List<String> containing myCSVListOfThings this could be done easily:
List<String> filteredThings = streamOfObjects
.flatMap(o -> Arrays.stream(o.getMyCSVListOfThings().split(";")))
.filter("thing2"::equals)
.collect(Collectors.toList());
But what I really need is a List<MyClass>.
This is what I have right now:
List<MyClass> filteredClasses = streamOfObjects.filter(o -> {
Stream<String> things = Arrays.stream(o.getMyCSVListOfThings().split(";"));
return things.anyMatch(s -> s.equals("thing2"));
}).collect(Collectors.toList());
But somehow it does not feel right. Any cleaner solution than opening a new Stream inside of a Predicate?
First, I recommend adding an extra method to MyClass, public boolean containsThing(String str), so you can rewrite your code like this:
List<MyClass> filteredClasses = streamOfObjects
.filter(o -> o.containsThing("thing2"))
.collect(Collectors.toList());
Now you can implement this method however you like, depending on the input data: splitting into a Stream, splitting into a Set, even searching for a substring (if that is possible and makes sense), and caching the result if you need to.
You know much more about how this class is used, so you can make the right choice.
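As one possible implementation of containsThing, here is a sketch that splits on the semicolon and compares whole entries. The wrapper class ContainsThingSketch, the withThing helper, and the choice of split are assumptions made for the example, not the only reasonable option.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ContainsThingSketch {

    static final class MyClass {
        private final Integer myId;
        private final String myCSVListOfThings;

        MyClass(Integer myId, String myCSVListOfThings) {
            this.myId = myId;
            this.myCSVListOfThings = myCSVListOfThings;
        }

        Integer getMyId() {
            return myId;
        }

        // Splits on ';' and checks for an exact entry match, so "thing2"
        // does not match "thing22".
        boolean containsThing(String thing) {
            return Arrays.asList(myCSVListOfThings.split(";")).contains(thing);
        }
    }

    // Hypothetical helper: keeps only the objects that list the given thing.
    static List<MyClass> withThing(List<MyClass> objects, String thing) {
        return objects.stream()
                .filter(o -> o.containsThing(thing))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<MyClass> objects = Arrays.asList(
                new MyClass(1, "thing1;thing2;thing3"),
                new MyClass(2, "thing2;thing3;thing4"),
                new MyClass(3, "thingX;thingY;thingZ"));
        withThing(objects, "thing2").forEach(o -> System.out.println(o.getMyId()));
    }
}
```

The predicate in the stream stays a one-liner, and the parsing detail can change later without touching any caller.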
One solution is to use pattern matching, which avoids the split-and-stream operation:
Pattern p=Pattern.compile("(^|;)thing2($|;)");
List<MyClass> filteredClasses = streamOfObjects
.filter(o -> p.matcher(o.getMyCSVListOfThings()).find())
.collect(Collectors.toList());
Since the argument to String.split is defined as a regex pattern, the pattern above has the same semantics as looking for a match within the result of split: you are looking for the word thing2 between two boundaries. The first is either the beginning of the line or a semicolon; the second is either the end of the line or a semicolon.
Besides that, there is nothing wrong with using another Stream operation within a predicate. But there are some ways to improve it. First, the lambda expression gets more concise if you omit the unnecessary local variable holding the Stream. Generally, you should avoid holding Stream instances in local variables, as chaining the operations directly reduces the risk of trying to use a Stream more than once. Second, you can use the Pattern class to stream over the resulting elements of a split operation without collecting them all into an array first:
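To see how the two boundaries behave, here is a small sketch that builds the same pattern for an arbitrary entry. The helper names entryPattern and containsEntry are invented for the example; Pattern.quote is added on the assumption that the entry may contain regex metacharacters.

```java
import java.util.regex.Pattern;

class EntryPatternSketch {

    // Builds "(^|;)<entry>($|;)" with the entry quoted as a literal.
    static Pattern entryPattern(String entry) {
        return Pattern.compile("(^|;)" + Pattern.quote(entry) + "($|;)");
    }

    // True when the entry appears as a whole item of the ';'-separated list.
    static boolean containsEntry(String csv, String entry) {
        return entryPattern(entry).matcher(csv).find();
    }

    public static void main(String[] args) {
        System.out.println(containsEntry("thing1;thing2;thing3", "thing2")); // true: bounded by ';' on both sides
        System.out.println(containsEntry("thing2;thing3", "thing2"));        // true: starts at the beginning
        System.out.println(containsEntry("thing22;thing3", "thing2"));       // false: followed by '2', not a boundary
    }
}
```

In real code you would compile the pattern once outside the stream, as in the snippet above, rather than on every call.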
Pattern p=Pattern.compile(";");
List<MyClass> filteredClasses = streamOfObjects
.filter(o -> p.splitAsStream(o.getMyCSVListOfThings()).anyMatch("thing2"::equals))
.collect(Collectors.toList());
or
Pattern p=Pattern.compile(";");
List<MyClass> filteredClasses = streamOfObjects
.filter(o -> p.splitAsStream(o.getMyCSVListOfThings()).anyMatch(s->s.equals("thing2")))
.collect(Collectors.toList());
Note that you could also rewrite your original code to
List<MyClass> filteredClasses = streamOfObjects
.filter(o -> Arrays.asList(o.getMyCSVListOfThings().split(";")).contains("thing2"))
.collect(Collectors.toList());
Now, the operation within the predicate is not a Stream but a Collection operation, but this changes neither the semantics nor the correctness of the code…
As I see it you have three options.
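1) Look for the particular entry in the String without splitting it (still looks messy). Wrapping the string in delimiters lets the first and last entries match as well:
List<MyClass> filteredClasses = streamOfObjects
.filter(o -> (";" + o.getMyCSVListOfThings() + ";").contains(";thing2;"))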
.collect(Collectors.toList());
2) map twice - still messy
List<MyClass> filteredClasses = streamOfObjects
.map(o -> Pair.<MyClass, List<String>>of(o, toList(o.getMyCSVListOfThings())))
.filter(pair -> pair.getRight().contains("thing2"))
.map(pair -> pair.getLeft())
.collect(Collectors.toList());
where toList is a method that will convert String to List
3) create additional field - method I'd suggest
Extend class MyClass - add field to the class
List<String> values;
And initialize it in the constructor:
public MyClass(Integer myId, String myCSVListOfThings) {
this.myId = myId;
this.myCSVListOfThings = myCSVListOfThings;
this.values = toList(myCSVListOfThings);
}
And then in the stream simply:
List<MyClass> filteredClasses = streamOfObjects
.filter(o -> o.getValues().contains("thing2"))
.collect(Collectors.toList());
Of course, the values field can also be initialized lazily, on the first getValues call, if you prefer.
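A lazily initialized variant could look roughly like the sketch below. The wrapper class is invented for the example, the split on ";" stands in for the unspecified toList helper, and the code is not thread-safe as written.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LazyValuesSketch {

    static final class MyClass {
        private final Integer myId;
        private final String myCSVListOfThings;
        private List<String> values; // stays null until first requested

        MyClass(Integer myId, String myCSVListOfThings) {
            this.myId = myId;
            this.myCSVListOfThings = myCSVListOfThings;
        }

        Integer getMyId() {
            return myId;
        }

        // Parses the CSV string on the first call and reuses the result.
        // Not synchronized: concurrent callers may parse more than once.
        List<String> getValues() {
            if (values == null) {
                values = Collections.unmodifiableList(
                        Arrays.asList(myCSVListOfThings.split(";")));
            }
            return values;
        }
    }

    public static void main(String[] args) {
        MyClass o = new MyClass(1, "thing1;thing2;thing3");
        System.out.println(o.getValues().contains("thing2"));
    }
}
```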
This is similar to the issue, Getting only required objects from a list using Java 8 Streams, posted a year earlier. I think the solution I left there is applicable here.
There's a library called com.coopstools.cachemonads. It extends the java stream (and Optional) classes to allow caching of entities for later use.
The solution can be found with:
List<Parent> goodParents = CacheStream.of(parents)
.cache()
.map(Parent::getChildren)
.flatMap(Collection::stream)
.map(Child::getAttrib1)
.filter(att -> att > 10)
.load()
.distinct()
.collect(Collectors.toList());
where parents is an array or stream.
For clarity, the cache method is what stores the parents, and the load method is what pulls the parents back out. If a parent does not have children, a filter will be needed after the first map to remove the null lists.
More specifically, for your issue:
List<MyClass> goodParents = CacheStream.of(streamOfObjects)
.cache()
.map(o -> o.getMyCSVListOfThings().split(";"))
.flatMap(Arrays::stream)
.filter("thing2"::equals)
.load()
.collect(Collectors.toList());
This library can be used in any situation where operations need to be performed on children, including map/sort/filter/etc, but where an older entity is still needed. There may be more lines than in some of the other answers, but each line is very clean and straightforward.
Please let me know if this answer is helpful.
The code can be found at https://github.com/coopstools/cachemonads or can be downloaded from maven:
<dependency>
<groupId>com.coopstools</groupId>
<artifactId>cachemonads</artifactId>
<version>0.2.0</version>
</dependency>
(or, gradle, com.coopstools:cachemonads:0.2.0)

Java lambda expression -- mapping and then modifying a list?

Using a Java 8 lambda expression, I'm trying to do something like this.
List<NewObject> objs = ...;
for (OldObject oldObj : oldObjects) {
NewObject obj = oldObj.toNewObject();
obj.setOrange(true);
objs.add(obj);
}
I wrote this code.
oldObjects.stream()
.map(old -> old.toNewObject())
.forEach({new.setOrange("true")})
.collect(Collectors.toList());
This is invalid code because I'm then trying to do .collect() on what's returned by .forEach(), but forEach is void and does not return a list.
How should this be structured?
You can use Stream's peek method, which returns the Stream because it's an intermediate operation. It normally isn't supposed to have a side effect (it's supposed to be "non-interfering"), but in this case, I think the side effect (setOrange(true)) is intended and is fine.
List<NewObject> newObjects =
oldObjects.stream()
.map(OldObject::toNewObject)
.peek( n -> n.setOrange(true))
.collect(Collectors.toList());
It's about as verbose as your non-streams code, so you can choose which technique to use.
You can use peek.
List<NewObject> list = oldObjects.stream()
.map(OldObject::toNewObject)
.peek(o -> o.setOrange(true))
.collect(Collectors.toList());
Alternatively, you can mutate the elements after forming the list.
List<NewObject> list = oldObjects.stream()
.map(OldObject::toNewObject)
.collect(Collectors.toList());
list.forEach(o -> o.setOrange(true));
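A third option, not shown in the answers above, is to do the mutation inside the map step with a block lambda. The Stream.peek documentation says the method exists mainly to support debugging, so this keeps the setter in the mapping function itself. The OldObject and NewObject classes below are hypothetical stand-ins for the types in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MapThenMutateSketch {

    // Hypothetical target type with the flag from the question.
    static final class NewObject {
        private boolean orange;

        void setOrange(boolean orange) {
            this.orange = orange;
        }

        boolean isOrange() {
            return orange;
        }
    }

    // Hypothetical source type; toNewObject creates a fresh NewObject.
    static final class OldObject {
        NewObject toNewObject() {
            return new NewObject();
        }
    }

    // Converts and mutates in a single map step, then collects the results.
    static List<NewObject> convert(List<OldObject> oldObjects) {
        return oldObjects.stream()
                .map(old -> {
                    NewObject n = old.toNewObject();
                    n.setOrange(true);
                    return n;
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<NewObject> result = convert(Arrays.asList(new OldObject(), new OldObject()));
        result.forEach(n -> System.out.println(n.isOrange()));
    }
}
```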
