Java Streams; avoid finisher on Collectors.collectingAndThen - java

I have this code:
private Iterable<Practitioner> pickPractitioners(List<String> ids) {
    return Optional.ofNullable(ids)
            .map(List::stream)
            .orElse(Stream.of())
            .collect(
                Collectors.collectingAndThen(
                    Collectors.toList(),
                    this.practitionerRepository::findAllById
                )
            );
}
The problem is that when ids is empty, this.practitionerRepository::findAllById is still executed.
I'd like to avoid this step if the resulting collection is empty.
Any ideas?

In general, to skip that part of the finisher, you could pass a lambda instead of a method reference and check whether the input is empty:
.collect(
    Collectors.collectingAndThen(
        Collectors.toList(),
        r -> r.isEmpty() ? Collections.emptyList() : this.practitionerRepository.findAllById(r)
    )
);
If your actual code is as simple as this example, then you don't need to use streams or Optional at all. Instead you could just check whether the method's input is null or empty with a ternary operator:
return ids == null || ids.isEmpty() ? Collections.emptyList() :
        this.practitionerRepository.findAllById(ids);

Whilst the practical part of this question (how to avoid interrogating the repository with an empty list as an argument) is already addressed in other answers, I want to point out that there's a cleaner way to build a pipeline in this method.
Firstly, it's worth remembering that the main purpose of Optional.ofNullable() is to create an Optional object that is returned from a method.
According to Stuart Marks, attempts to use Optional.ofNullable() merely to enable method chaining or to avoid null checks in the middle of a method are considered an anti-pattern.
Here is the quote from his talk at Devoxx:
"it's generally a bad idea to create an Optional for the specific
purpose of chaining methods from it to get a value."
A similar idea was expressed in his answer on stackoverflow.
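To make the intended usage concrete, here is a minimal sketch (lookupById is a hypothetical method that may return null; the Practitioner type is reused from this question): the Optional is created as a method's return value, and the caller decides how to handle absence.
public Optional<Practitioner> findPractitioner(String id) {
    Practitioner p = lookupById(id);   // may be null
    return Optional.ofNullable(p);
}

// the caller chooses what "absent" means:
Practitioner result = findPractitioner("42").orElseThrow(NoSuchElementException::new);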
What are the alternatives?
Since Java 9, the Stream interface has its own ofNullable() method.
Returns a sequential Stream containing a single element, if non-null,
otherwise returns an empty Stream.
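A quick sketch of that behaviour (the values are chosen purely for illustration):
Stream.ofNullable(null).count();    // 0 -- empty stream
Stream.ofNullable("id-1").count();  // 1 -- single-element stream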
Keeping all that in mind, the method pickPractitioners() could be rewritten like this:
private Function<List<String>, Iterable<Practitioner>> getPractitioners =
        idList -> idList.isEmpty() ? Collections.emptyList() :
                this.practitionerRepository.findAllById(idList);

private Iterable<Practitioner> pickPractitioners(List<String> ids) {
    return Stream.ofNullable(ids)
            .flatMap(List::stream)
            .collect(Collectors.collectingAndThen(
                Collectors.toList(),
                getPractitioners
            ));
}

If you look at the signature of collectingAndThen, the finisher is just a Function, so you can simply write one:
public static<T,A,R,RR> Collector<T,A,RR> collectingAndThen(Collector<T,A,R> downstream, Function<R,RR> finisher) {

static interface MyRepository extends JpaRepository<Part, Long> {
}

public static void main(String[] args) {
    MyRepository myRepository = null;
    List<Long> list = null;
    Function<List<Long>, List<Part>> finisher = (ids) -> {
        return ids.isEmpty() ? Collections.emptyList() : myRepository.findAllById(ids);
    };
    Optional.ofNullable(list)
            .map(List::stream)
            .orElse(Stream.of())
            .collect(
                Collectors.collectingAndThen(
                    Collectors.toList(),
                    finisher
                )
            );
}

Related

Do I need a custom Spliterator to avoid extra .stream() call?

I have this code which works fine, but I find it ugly.
@EqualsAndHashCode
public abstract class Actions {

    @Getter
    private List<ActionsBloc> blocs;

    public Actions mergeWith(@NotNull Actions other) {
        this.blocs = Stream.of(this.blocs, other.blocs)
                .flatMap(Collection::stream)
                .collect(groupingBy(ActionsBloc::getClass, reducing(ActionsBloc::mergeWith)))
                .values()
                .stream()
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(toList());
        return this;
    }
}
ActionsBloc is a super type which contains a list of Action.
public interface ActionsBloc {

    <T extends Action> List<T> actions();

    default ActionsBloc mergeWith(ActionsBloc ab) {
        this.actions().addAll(ab.actions());
        return this;
    }
}
What I want to do is merge blocs of Actions together based on the Class type. So I'm grouping by ActionsBloc::getClass and then merge by calling ActionsBloc::mergeWith.
What I find ugly is calling the values().stream() after the first stream was ended on collect.
Is there a way to operate only on one stream and get rid of values().stream(), or do I have to write a custom Spliterator? In other words have only one collect in my code.
You could possibly sort that out by working with a reducing identity. One way could be to update the implementation of mergeWith as:
default ActionsBloc mergeWith(ActionsBloc ab) {
    this.actions().addAll(Optional.ofNullable(ab)
            .map(ActionsBloc::actions)
            .orElse(Collections.emptyList()));
    return this;
}
and then modify the grouping and reduction to:
this.blocs = new ArrayList<>(Stream.of(this.blocs, other.blocs)
        .flatMap(Collection::stream)
        .collect(groupingBy(ActionsBloc::getClass, reducing(null, ActionsBloc::mergeWith)))
        .values());
Edit: As Holger pointed out, such use cases of groupingBy combined with reducing are more appropriately implemented using toMap:
this.blocs = new ArrayList<>(Stream.concat(this.blocs.stream(), other.blocs.stream())
        .collect(Collectors.toMap(ActionsBloc::getClass, Function.identity(), ActionsBloc::mergeWith))
        .values());

Java 8 using stream, flatMap and lambda

I have this piece of code and I want to return a list of postCodes:
List<String> postcodes = new ArrayList<>();
List<Entry> entries = x.getEntry(); // getEntry() returns a list of Entry class
for (Entry entry : entries) {
    if (entry != null) {
        Properties properties = entry.getContent().getProperties();
        postcodes.addAll(Arrays.asList(properties.getPostcodes().split(",")));
    }
}
return postcodes;
Here's my attempt to use stream() method and the following chained methods:
...some other block of code
List<Entry> entries = x.getEntry.stream()
.filter(entry -> recordEntry != null)
.flatMap(entry -> {
Properties properties = recordEntry.getContent().getProperties();
postCodes.addAll(Arrays.asList(properties.getPostcodes().split(",")));
});
You've got several issues with your code, i.e.:
postCodes.addAll is a side effect and should be avoided; otherwise, when the code is executed in parallel, you'll receive non-deterministic results.
flatMap expects a function that returns a stream, not a boolean, which is what your code currently attempts to pass to it.
flatMap in this case consumes a function that itself consumes a value and returns a value; since you've decided to use a lambda statement block, you must include a return statement within that block specifying the value to return. This is not the case in your code.
Stream pipelines are driven by terminal operations, which are operations that turn a stream into a non-stream value; your code currently will not execute at all, as you've merely set up the ingredients but not actually asked for a result from the stream.
The receiver type of your query should be List<String>, not List<Entry>, as in your current code the call to Arrays.asList(properties.getPostcodes().split(",")) returns a List<String>, which you then add to an accumulator with addAll.
Thanks to Holger for pointing it out: you're inconsistent about whether the variable is named entry or recordEntry.
That said here's how I'd rewrite your code:
List<String> entries = x.getEntry().stream()
        .filter(Objects::nonNull)
        .map(Entry::getContent)
        .map(Content::getProperties)
        .map(Properties::getPostcodes)
        .flatMap(Pattern.compile(",")::splitAsStream)
        .collect(Collectors.toList());
You may want to use Collectors.toCollection to specify a particular implementation of the returned list if deemed appropriate, as shown below.
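For example (a small sketch, keeping the rest of the pipeline above unchanged), collecting into a LinkedList would only change the final line:
        .collect(Collectors.toCollection(LinkedList::new));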
Edit: with a couple of good suggestions from shmosel, we can actually use method references throughout the stream pipeline, which better conveys the intent of the code and makes it a lot easier to follow.
or you could proceed with the approach:
List<String> entries = x.getEntry().stream()
        .filter(e -> e != null)
        .flatMap(e -> Arrays.asList(
                e.getContent().getProperties().getPostcodes().split(",")).stream()
        )
        .collect(Collectors.toList());
if that's more comfortable for you.

Check instanceof in stream

I have the following expression:
scheduleIntervalContainers.stream()
        .filter(sic -> ((ScheduleIntervalContainer) sic).getStartTime() != ((ScheduleIntervalContainer) sic).getEndTime())
        .collect(Collectors.toList());
...where scheduleIntervalContainers has element type ScheduleContainer:
final List<ScheduleContainer> scheduleIntervalContainers
Is it possible to check the type before the filter?
You can apply another filter in order to keep only the ScheduleIntervalContainer instances, and adding a map will save you the later casts:
scheduleIntervalContainers.stream()
        .filter(sc -> sc instanceof ScheduleIntervalContainer)
        .map(sc -> (ScheduleIntervalContainer) sc)
        .filter(sic -> sic.getStartTime() != sic.getEndTime())
        .collect(Collectors.toList());
Or, as Holger commented, you can replace the lambda expressions with method references if you prefer that style:
scheduleIntervalContainers.stream()
        .filter(ScheduleIntervalContainer.class::isInstance)
        .map(ScheduleIntervalContainer.class::cast)
        .filter(sic -> sic.getStartTime() != sic.getEndTime())
        .collect(Collectors.toList());
A pretty elegant option is to use method references on the class:
scheduleIntervalContainers.stream()
        .filter(ScheduleIntervalContainer.class::isInstance)
        .map(ScheduleIntervalContainer.class::cast)
        .filter(sic -> sic.getStartTime() != sic.getEndTime())
        .collect(Collectors.toList());
There is a small problem with @Eran's solution: typing the class name in both filter and map is error-prone, since it is easy to forget to change the name of the class in both places. An improved solution would be something like this:
private static <T, R> Function<T, Stream<R>> select(Class<R> clazz) {
    return e -> clazz.isInstance(e) ? Stream.of(clazz.cast(e)) : null;
}

scheduleIntervalContainers.stream()
        .flatMap(select(ScheduleIntervalContainer.class))
        .filter(sic -> sic.getStartTime() != sic.getEndTime())
        .collect(Collectors.toList());
However, there might be a performance penalty in creating a Stream for every matching element, so be careful using it on huge data sets. I learned this solution from @Tagir Valeev.
Instead of a filter + map like other answers suggest, I would recommend this utility method:
public static <Super, Sub extends Super> Function<Super, Stream<Sub>> filterType(Class<Sub> clz) {
    return obj -> clz.isInstance(obj) ? Stream.of(clz.cast(obj)) : Stream.empty();
}
Use it as:
Stream.of(dog, cat, fish)
        .flatMap(filterType(Dog.class));
Compared to filter + map it has the following advantages:
If the class does not extend your class you will get a compile error
Single place, you can never forget to change a class in either filter or map
Filter by class type with StreamEx
StreamEx.of(myCollection).select(TheThing.class).toList();
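A slightly fuller sketch for this question's types (StreamEx is a third-party library; I'm assuming the usual one.util:streamex artifact), where select() combines the isInstance filter and the cast in one call:
List<ScheduleIntervalContainer> result = StreamEx.of(scheduleIntervalContainers)
        .select(ScheduleIntervalContainer.class)
        .filter(sic -> sic.getStartTime() != sic.getEndTime())
        .toList();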

Stuck with java8 lambda expression

I have Map<Integer,Doctor> docLib=new HashMap<>(); to store Doctor objects.
Class Doctor has the methods getSpecialization(), which returns a String, and
getPatients(), which returns a collection of Person objects.
In the main method, I type:
public Map<String,Set<Person>> getPatientsPerSpecialization(){
    Map<String,Set<Person>> res=this.docLib.entrySet().stream().
            map(d->d.getValue()).
            collect(groupingBy(d->d.getSpecialization(),
                    d.getPatients()) //error
            );
    return res;
}
As you can see, I have a problem with groupingBy: I try to send the same value d to the method, but it's wrong.
How to solve this?
You need a second Collector for that mapping:
public Map<String,Set<Person>> getPatientsPerSpecialization(){
    return this.docLib
            .values()
            .stream()
            .collect(Collectors.groupingBy(Doctor::getSpecialization,
                    Collectors.mapping(Doctor::getPatients, toSet()))
            );
}
EDIT:
I think my original answer may be wrong (it's hard to say without being able to test it). Since Doctor::getPatients returns a Collection, I think my code may return a Map<String,Set<Collection<Person>>> instead of the desired Map<String,Set<Person>>.
The easiest way to overcome that is to iterate over that Map again to produce the desired Map:
public Map<String,Set<Person>> getPatientsPerSpecialization(){
    return this.docLib
            .values()
            .stream()
            .collect(Collectors.groupingBy(Doctor::getSpecialization,
                    Collectors.mapping(Doctor::getPatients, toSet()))
            )
            .entrySet()
            .stream()
            .collect(Collectors.toMap(e -> e.getKey(),
                    e -> e.getValue().stream().flatMap(c -> c.stream()).collect(Collectors.toSet()))
            );
}
Perhaps there's a way to get the same result with a single Stream pipeline, but I can't see it right now.
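One option, if Java 9 or later is available, is Collectors.flatMapping, which flattens each doctor's patient collection while grouping; a sketch using the same types as above:
public Map<String, Set<Person>> getPatientsPerSpecialization() {
    return this.docLib
            .values()
            .stream()
            .collect(Collectors.groupingBy(Doctor::getSpecialization,
                    Collectors.flatMapping(d -> d.getPatients().stream(),
                            Collectors.toSet())));
}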
Instead of groupingBy, you could use toMap:
public Map<String, Set<Person>> getPatientsPerSpecialization() {
    return docLib.values()
            .stream()
            .collect(toMap(Doctor::getSpecialization,
                    d -> new HashSet<>(d.getPatients()),
                    (p1, p2) -> Stream.concat(p1.stream(), p2.stream()).collect(toSet())));
}
What it does is group the doctors by specialization and map each one to a set of the patients it has (so a Map<String, Set<Person>>).
If, when collecting the data from the pipeline, you encounter a doctor with a specialization that is already stored as a key in the map, you use the merge function to produce a new set of values with both sets (the set that is already stored as a value for the key, and the set that you want to associate with the key).
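To make the merge step concrete, here is a tiny self-contained sketch (toy data) of how toMap invokes the merge function whenever two elements map to the same key:
Map<Integer, String> byLength = Stream.of("ant", "bee", "lion")
        .collect(Collectors.toMap(String::length,       // key: word length
                Function.identity(),                    // value: the word itself
                (a, b) -> a + "," + b));                // merge on key collision
// result: {3=ant,bee, 4=lion}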

How to create a List<T> from Map<K,V> and List<K> of keys?

Using Java 8 lambdas, what's the "best" way to effectively create a new List<T> given a List<K> of possible keys and a Map<K,V>? This is the scenario where you are given a List of possible Map keys and are expected to generate a List<T> where T is some type that is constructed based on some aspect of V, the map value types.
I've explored a few and don't feel comfortable claiming one way is better than another (with maybe one exception -- see code). I'll clarify "best" as a combination of code clarity and runtime efficiency. These are what I came up with. I'm sure someone can do better, which is one aspect of this question. I don't like the filter aspect of most as it means needing to create intermediate structures and multiple passes over the names List. Right now, I'm opting for Example 6 -- a plain 'ol loop. (NOTE: Some cryptic thoughts are in the code comments, especially "need to reference externally..." This means external from the lambda.)
public class Java8Mapping {
private final Map<String,Wongo> nameToWongoMap = new HashMap<>();
public Java8Mapping(){
List<String> names = Arrays.asList("abbey","normal","hans","delbrook");
List<String> types = Arrays.asList("crazy","boring","shocking","dead");
for(int i=0; i<names.size(); i++){
nameToWongoMap.put(names.get(i),new Wongo(names.get(i),types.get(i)));
}
}
public static void main(String[] args) {
System.out.println("in main");
Java8Mapping j = new Java8Mapping();
List<String> testNames = Arrays.asList("abbey", "froderick","igor");
System.out.println(j.getBongosExample1(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample2(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample3(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample4(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample5(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample6(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
}
private static class Wongo{
String name;
String type;
public Wongo(String s, String t){name=s;type=t;}
@Override public String toString(){return "Wongo{name="+name+", type="+type+"}";}
}
private static class Bongo{
Wongo wongo;
public Bongo(Wongo w){wongo = w;}
@Override public String toString(){ return "Bongo{wongo="+wongo+"}";}
}
// 1: Create a list externally and add items inside 'forEach'.
// Needs to externally reference Map and List
public List<Bongo> getBongosExample1(List<String> names){
final List<Bongo> listOne = new ArrayList<>();
names.forEach(s -> {
Wongo w = nameToWongoMap.get(s);
if(w != null) {
listOne.add(new Bongo(nameToWongoMap.get(s)));
}
});
return listOne;
}
// 2: Use stream().map().collect()
// Needs to externally reference Map
public List<Bongo> getBongosExample2(List<String> names){
return names.stream()
.filter(s -> nameToWongoMap.get(s) != null)
.map(s -> new Bongo(nameToWongoMap.get(s)))
.collect(Collectors.toList());
}
// 3: Create custom Collector
// Needs to externally reference Map
public List<Bongo> getBongosExample3(List<String> names){
Function<List<Wongo>,List<Bongo>> finisher = list -> list.stream().map(Bongo::new).collect(Collectors.toList());
Collector<String,List<Wongo>,List<Bongo>> bongoCollector =
Collector.of(ArrayList::new,getAccumulator(),getCombiner(),finisher, Characteristics.UNORDERED);
return names.stream().collect(bongoCollector);
}
// example 3 helper code
private BiConsumer<List<Wongo>,String> getAccumulator(){
return (list,string) -> {
Wongo w = nameToWongoMap.get(string);
if(w != null){
list.add(w);
}
};
}
// example 3 helper code
private BinaryOperator<List<Wongo>> getCombiner(){
return (l1,l2) -> {
l1.addAll(l2);
return l1;
};
}
// 4: Use internal Bongo creation facility
public List<Bongo> getBongosExample4(List<String> names){
return names.stream().filter(s->nameToWongoMap.get(s) != null).map(s-> new Bongo(nameToWongoMap.get(s))).collect(Collectors.toList());
}
// 5: Stream the Map EntrySet. This avoids referring to anything outside of the stream,
// but bypasses the lookup benefit from Map.
public List<Bongo> getBongosExample5(List<String> names){
return nameToWongoMap.entrySet().stream().filter(e->names.contains(e.getKey())).map(e -> new Bongo(e.getValue())).collect(Collectors.toList());
}
// 6: Plain-ol-java loop
public List<Bongo> getBongosExample6(List<String> names){
List<Bongo> bongos = new ArrayList<>();
for(String s : names){
Wongo w = nameToWongoMap.get(s);
if(w != null){
bongos.add(new Bongo(w));
}
}
return bongos;
}
}
If namesToWongoMap is an instance variable, you can't really avoid a capturing lambda.
You can clean up the stream by splitting up the operations a little more:
return names.stream()
        .map(n -> namesToWongoMap.get(n))
        .filter(w -> w != null)
        .map(w -> new Bongo(w))
        .collect(toList());

return names.stream()
        .map(namesToWongoMap::get)
        .filter(Objects::nonNull)
        .map(Bongo::new)
        .collect(toList());
That way you don't call get twice.
This is very much like the for loop, except, for example, it could theoretically be parallelized if namesToWongoMap can't be mutated concurrently.
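For instance (just a sketch, assuming the map is effectively read-only while the stream runs), the parallel variant differs only in the source:
return names.parallelStream()
        .map(namesToWongoMap::get)
        .filter(Objects::nonNull)
        .map(Bongo::new)
        .collect(toList());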
I don't like the filter aspect of most as it means needing to create intermediate structures and multiple passes over the names List.
There are no intermediate structures and there is only one pass over the List. A stream pipeline says "for each element...do this sequence of operations". Each element is visited once and the pipeline is applied.
Here are some relevant quotes from the java.util.stream package description:
A stream is not a data structure that stores elements; instead, it conveys elements from a source such as a data structure, an array, a generator function, or an I/O channel, through a pipeline of computational operations.
Processing streams lazily allows for significant efficiencies; in a pipeline such as the filter-map-sum example above, filtering, mapping, and summing can be fused into a single pass on the data, with minimal intermediate state.
Radiodef's answer pretty much nailed it, I think. The solution given there:
return names.stream()
        .map(namesToWongoMap::get)
        .filter(Objects::nonNull)
        .map(Bongo::new)
        .collect(toList());
is probably about the best that can be done in Java 8.
I did want to mention a small wrinkle in this, though. The Map.get call returns null if the name isn't present in the map, and this is subsequently filtered out. There's nothing wrong with this per se, though it does bake null-means-not-present semantics into the pipeline structure.
In some sense we'd want a mapper pipeline operation that has a choice of returning zero or one elements. A way to do this with streams is with flatMap. The flatmapper function can return an arbitrary number of elements into the stream, but in this case we want just zero or one. Here's how to do that:
return names.stream()
        .flatMap(name -> {
            Wongo w = nameToWongoMap.get(name);
            return w == null ? Stream.empty() : Stream.of(w);
        })
        .map(Bongo::new)
        .collect(toList());
I admit this is pretty clunky and so I wouldn't recommend doing this. A slightly better but somewhat obscure approach is this:
return names.stream()
        .flatMap(name -> Optional.ofNullable(nameToWongoMap.get(name))
                .map(Stream::of).orElseGet(Stream::empty))
        .map(Bongo::new)
        .collect(toList());
but I'm still not sure I'd recommend this as it stands.
The use of flatMap does point to another approach, though. If you have a more complicated policy of how to deal with the not-present case, you could refactor this into a helper function that returns a Stream containing the result or an empty Stream if there's no result.
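A sketch of such a helper (lookupWongo is a name I'm making up for illustration); it returns a zero-or-one-element Stream so it drops straight into flatMap:
private Stream<Wongo> lookupWongo(String name) {
    Wongo w = nameToWongoMap.get(name);
    return w == null ? Stream.empty() : Stream.of(w);
}

return names.stream()
        .flatMap(this::lookupWongo)
        .map(Bongo::new)
        .collect(toList());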
Finally, JDK 9 -- still under development as of this writing -- has added Stream.ofNullable which is useful in exactly these situations:
return names.stream()
        .flatMap(name -> Stream.ofNullable(nameToWongoMap.get(name)))
        .map(Bongo::new)
        .collect(toList());
As an aside, JDK 9 has also added Optional.stream which creates a zero-or-one stream from an Optional. This is useful in cases where you want to call an Optional-returning function from within flatMap. See this answer and this answer for more discussion.
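As a sketch of that combination (findWongo is a hypothetical Optional-returning finder built on the same map):
private Optional<Wongo> findWongo(String name) {
    return Optional.ofNullable(nameToWongoMap.get(name));
}

return names.stream()
        .flatMap(name -> findWongo(name).stream())
        .map(Bongo::new)
        .collect(toList());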
One approach I didn't see is retainAll:
public List<Bongo> getBongos(List<String> names) {
    Map<String, Wongo> copy = new HashMap<>(nameToWongoMap);
    copy.keySet().retainAll(names);
    return copy.values().stream().map(Bongo::new).collect(Collectors.toList());
}
The extra Map is a minimal performance hit, since it's just copying pointers to objects, not the objects themselves.
