Java stream operation invocations

Can anyone point to official Java documentation that describes how many times a Stream will invoke each "non-interfering and stateless" intermediate operation for each element?
For example:
Arrays.asList("1", "2", "3", "4").stream()
    .filter(s -> check(s))
    .forEach(s -> System.out.println(s));

public boolean check(Object o) {
    return true;
}
The code above currently invokes the check method 4 times.
Is it possible that, in current or future JDK versions, the check method is executed more or fewer times than the number of elements in a stream created from a List or any other standard Java API?

This does not have to do with the source of the stream, but rather the terminal operation and optimization done in the stream implementation itself. For example:
Stream.of(1,2,3,4)
.map(x -> x + 1)
.count();
Since Java 9, map will not be executed even once here, because count() can take the size directly from the source.
Or:
someTreeSet.stream()
.sorted()
.findFirst();
sorted might not be executed at all, since the source is a TreeSet and getting the first element is trivial; whether the Stream API actually implements this is a different question.
So the real answer is: it depends. But I can't imagine an operation that would be executed more times than the number of elements in the source.
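To see this for yourself, here is a minimal sketch that counts invocations with an AtomicInteger. The class and method names are made up for illustration. The forEach case runs the predicate once per element; the count() case depends on the JDK you run it on, so no fixed output is claimed for it.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical demo class, not part of any library.
final class InvocationCount {

    // Counts how often the filter predicate runs when forEach is the terminal operation.
    static int filterCallsWithForEach() {
        AtomicInteger calls = new AtomicInteger();
        Arrays.asList("1", "2", "3", "4").stream()
                .filter(s -> { calls.incrementAndGet(); return true; })
                .forEach(s -> { });
        return calls.get();
    }

    // Counts how often the map function runs when count() is the terminal operation.
    // On a JDK that computes the count from the source size, this can be 0.
    static int mapCallsWithCount() {
        AtomicInteger calls = new AtomicInteger();
        Stream.of(1, 2, 3, 4)
                .map(x -> { calls.incrementAndGet(); return x + 1; })
                .count();
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("filter calls with forEach: " + filterCallsWithForEach()); // 4
        System.out.println("map calls with count():    " + mapCallsWithCount());
    }
}
```

In neither case does the counter exceed the number of source elements.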

From the documentation:
Laziness-seeking. Many stream operations, such as filtering, mapping, or duplicate removal, can be implemented lazily, exposing opportunities for optimization. For example, "find the first String with three consecutive vowels" need not examine all the input strings. Stream operations are divided into intermediate (Stream-producing) operations and terminal (value- or side-effect-producing) operations. Intermediate operations are always lazy.
Because filter is a lazy intermediate operation that produces a new Stream, it invokes the predicate at most once per element while that stream is traversed.
The only way your method could see a different number of invocations is if the stream were somehow mutated between stages. Since nothing in a stream actually runs until the terminal operation, that would realistically only happen because of a bug upstream.
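The "three consecutive vowels" example from the quoted documentation can be sketched like this. The class name, the word list and the counter parameter are assumptions for illustration only. With a sequential stream over a List, the predicate stops being called once findFirst has its match.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
final class ShortCircuitDemo {
    private static final Pattern THREE_VOWELS = Pattern.compile("[aeiou]{3}");

    // Finds the first word with three consecutive vowels; 'calls' records predicate invocations.
    static Optional<String> firstWithThreeVowels(List<String> words, AtomicInteger calls) {
        return words.stream()
                .filter(w -> {
                    calls.incrementAndGet();
                    return THREE_VOWELS.matcher(w).find();
                })
                .findFirst();
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        List<String> words = Arrays.asList("stream", "lambda", "queue", "beautiful", "java");
        System.out.println(firstWithThreeVowels(words, calls).orElse("none")); // queue
        System.out.println("predicate calls: " + calls.get());                 // 3
    }
}
```

So the predicate can run fewer times than there are elements, but still never more than once for any one of them.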

Related

Why is this IntStream.mapToObj mapper not evaluated with Stream.count? [duplicate]

I am in the process of learning Java 8 lambda expressions and would like to ask about the following piece of Java code, which uses the peek method of the Stream interface.
When I run the program in my IDE, it prints nothing. I expected it to print 2, 4, 6.
import java.util.Arrays;
import java.util.List;

public class Test_Q3 {
    public Test_Q3() {
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        values.stream()
            .map(n -> n * 2)
            .peek(System.out::print)
            .count();
    }
}
I assume you are running this under Java 9? You are not altering the SIZED property of the stream, so there is no need to execute either map or peek at all.
In other words, all you care about is count as the final result, and in the meantime you do not alter the initial size of the List in any way (via filter or distinct, for example). This is an optimization done in the Stream implementation.
By the way, if you add a dummy filter, this will show what you expect:
values.stream()
.map(n -> n*2)
.peek(System.out::print)
.filter(x -> true)
.count();
Here are some relevant quotes from the Javadoc of the Stream interface:
A stream implementation is permitted significant latitude in optimizing the computation of the result. For example, a stream implementation is free to elide operations (or entire stages) from a stream pipeline -- and therefore elide invocation of behavioral parameters -- if it can prove that it would not affect the result of the computation. This means that side-effects of behavioral parameters may not always be executed and should not be relied upon, unless otherwise specified (such as by the terminal operations forEach and forEachOrdered). (For a specific example of such an optimization, see the API note documented on the count() operation. For more detail, see the side-effects section of the stream package documentation.)
And more specifically, from the Javadoc of the count() method:
API Note:
An implementation may choose to not execute the stream pipeline (either sequentially or in parallel) if it is capable of computing the count directly from the stream source. In such cases no source elements will be traversed and no intermediate operations will be evaluated. Behavioral parameters with side-effects, which are strongly discouraged except for harmless cases such as debugging, may be affected. For example, consider the following stream:
List<String> l = Arrays.asList("A", "B", "C", "D");
long count = l.stream().peek(System.out::println).count();
The number of elements covered by the stream source, a List, is known and the intermediate operation, peek, does not inject into or remove elements from the stream (as may be the case for flatMap or filter operations). Thus the count is the size of the List and there is no need to execute the pipeline and, as a side-effect, print out the list elements.
These quotes appear only in the Java 9 Javadoc, so it must be a new optimization.
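A small sketch of the difference, using a StringBuilder instead of System.out so the effect can be inspected. The class and method names are made up for illustration. Without the filter, what peek sees depends on the JDK (nothing where the count is taken from the source, everything on Java 8); with the dummy filter the size is no longer known up front, so the pipeline has to run.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of any library.
final class PeekCountDemo {

    // map + peek + count: peek may be skipped entirely when the count comes from the source size.
    static String seenWithoutFilter() {
        List<Integer> values = Arrays.asList(1, 2, 3);
        StringBuilder seen = new StringBuilder();
        values.stream()
                .map(n -> n * 2)
                .peek(n -> seen.append(n))
                .count();
        return seen.toString();
    }

    // Same pipeline with a dummy filter: the element count is no longer known in advance.
    static String seenWithFilter() {
        List<Integer> values = Arrays.asList(1, 2, 3);
        StringBuilder seen = new StringBuilder();
        values.stream()
                .map(n -> n * 2)
                .peek(n -> seen.append(n))
                .filter(x -> true)
                .count();
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println("without filter: \"" + seenWithoutFilter() + "\"");
        System.out.println("with filter:    \"" + seenWithFilter() + "\""); // "246"
    }
}
```

If the side effect itself is what you want, forEach is the safer choice, since the quoted Javadoc names it as an operation whose side effects are specified.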

Stream on Map doesn't save .map changes

Can someone explain to me why the first code example doesn't save the changes I made with .map to the Map, but the second code example does?
First code example:
stringIntegerMap.entrySet().stream()
.map(element -> element.setValue(100));
Second code example:
stringIntegerMap.entrySet().stream()
.map(element -> element.setValue(100))
.forEach(System.out::println);
Also, why does the second code example only print the values and not the whole element (key + value)?
Your stream operations are lazily evaluated.
If you do not invoke a terminal operation such as forEach (or collect, etc.), the streaming never actually occurs, so your setValue is never executed.
Note that modifying the collection/map you are streaming is generally advised against.
Finally, see the API documentation for Map.Entry#setValue.
You'll notice the method returns:
old value corresponding to the entry
So, when you perform the map operation, the resulting stream contains the old values.
The java.util.stream package documentation may also help (search for "Stream operations and pipelines" and the part about "Non-interference").
Streams are composed of a source, intermediate operations and terminal operations.
The terminal operations start the pipeline processing by lazily gathering elements from the source, then applying intermediate operations and finally executing the terminal operation.
Stream.map is an intermediate operation, whereas Stream.forEach is terminal. So in your first snippet the pipeline processing never starts (hence the intermediate operations are never executed), because there's no terminal operation. When you use forEach in your second snippet, the whole pipeline is processed.
Please take a look at the java.util.stream package docs, which contain extensive information about streams and how to use them properly (e.g. you shouldn't modify the source of the stream from within intermediate or terminal operations, as you're doing in Stream.map).
Edit:
As to your final question:
why does the second code example only print the values and not the whole element (key + value) ?
Mena's answer explains it well: Map.Entry.setValue not only sets the given value on the entry, but also returns the old value. As you're using Map.Entry.setValue in a lambda expression within the Stream.map intermediate operation, you're actually transforming each Map.Entry element of the stream into the value it had before the new value was set. So what arrives at Stream.forEach are the old values of the map, while the map has the new values set by the side effect of Map.Entry.setValue.
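Both effects can be reproduced with a minimal sketch like the one below. The class name and sample data are assumptions for illustration; a LinkedHashMap is used only to make the order predictable. It also shows Map.replaceAll as one way to get the same update without a side effect inside the stream.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of any library.
final class SetValueDemo {

    static Map<String, Integer> sample() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        return m;
    }

    // No terminal operation: the lambda never runs and the map stays unchanged.
    static Map<String, Integer> withoutTerminalOp() {
        Map<String, Integer> m = sample();
        m.entrySet().stream().map(e -> e.setValue(100));
        return m;
    }

    // With a terminal operation: the map is updated, and the stream carries the OLD values.
    static List<Integer> oldValuesSeenDownstream(Map<String, Integer> m) {
        return m.entrySet().stream()
                .map(e -> e.setValue(100))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(withoutTerminalOp());          // {a=1, b=2}

        Map<String, Integer> m = sample();
        System.out.println(oldValuesSeenDownstream(m));   // [1, 2]
        System.out.println(m);                            // {a=100, b=100}

        // Side-effect-free alternative for the update itself:
        Map<String, Integer> other = sample();
        other.replaceAll((key, value) -> 100);
        System.out.println(other);                        // {a=100, b=100}
    }
}
```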

Swap operation according to stream's encounter order

Since the documentation defines a so-called encounter order, I think it's reasonable to ask whether we can reverse that encounter order somehow. Looking at the API that streams provide, I didn't find anything related to ordering except sorted().
If I have a stream produced from, say, a List, can I swap two elements of that stream and thereby produce another stream with a modified encounter order?
Does it even make sense to talk about "swapping" elements in a stream, or does the specification say nothing about it?
The Java Stream API has no dedicated operations to reverse the encounter order, swap elements in pairs, or anything like that. Note that the stream source can be one-off (like a network socket or a stream of generated random numbers), so in the general case you cannot traverse it backwards without storing everything in memory. That's actually how the sorting operation works: it dumps the whole stream content into an intermediate array, sorts it, then performs the downstream computation. So if a reverse operation were implemented, it would work the same way.
For particular sources, such as a random-access list, you can create a reversed stream using, for example, this construct:
List<T> list = ...;
Stream<T> stream = IntStream.rangeClosed(1, list.size())
.mapToObj(i -> list.get(list.size()-i));
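Wrapped into a small helper, the construct might look like this. The class and method names are made up for illustration, and the sketch assumes the list supports cheap get(int) calls (ArrayList or Arrays.asList, not LinkedList).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical helper class, not part of any library.
final class ReverseStream {

    // Streams the list from its last element to its first by walking indices backwards.
    static <T> Stream<T> reversed(List<T> list) {
        int size = list.size();
        return IntStream.rangeClosed(1, size)
                .mapToObj(i -> list.get(size - i));
    }

    public static void main(String[] args) {
        List<Integer> result = reversed(Arrays.asList(1, 2, 3))
                .collect(Collectors.toList());
        System.out.println(result); // [3, 2, 1]
    }
}
```

Nothing is copied here; the stream just reads the list by index, so the list should not be modified while the stream is being consumed.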

What does parallelStream().map().map() do?

I have a Collection of encoded objects (which are quite big when not encoded) and I was wondering what actually happens if I do something like:
codes.parallelStream().map(code -> decode(code)).map(obj -> do1(obj) * do2(obj));
As I couldn't find much information about this kind of construct, I suppose it first decodes all elements and only afterwards performs the real task. On the other hand, it would be more logical (and more memory-friendly with big objects) if a parallelStream executed both maps at once for every element, as if it were written:
codes.parallelStream().map(code -> { obj = decode(code); return do1(obj) * do2(obj); });
Could anybody help me understand how this works?
The map operation is evaluated lazily. Therefore the decode operation in the first map call is performed only if the encoded object is evaluated by the terminal operation of the Stream. So your assumption that "this first decodes all elements and only afterwards performs the real task" is false: the terminal operation may require only a few of the elements of the source Collection to be processed, in which case neither of the two map operations is performed for most of the encoded elements.
An intermediate Stream operation may be processed for all the elements of the Stream only if it requires all the elements (for example sorted() must iterate over all the elements), or if it precedes an intermediate operation that requires all the elements (for example in ...map().sorted()..., executing sorted() requires first executing map() on all the elements of the Stream).
Your two code snippets should behave similarly, though the first one is more readable.
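You can watch the interleaving with a small trace. This is a sketch with stand-ins: toUpperCase plays the role of decode and length() the role of do1(obj) * do2(obj), and the class name is made up. It uses a sequential stream so that the log order is predictable; in a parallel stream each element still goes through both map stages back to back, but the order across elements varies and the log would need to be thread-safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of any library.
final class FusedMapDemo {

    // Records the order in which the two map stages run for each element.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Arrays.asList("a", "b", "c").stream()
                .map(code -> { log.add("decode " + code); return code.toUpperCase(); })
                .map(obj -> { log.add("use " + obj); return obj.length(); })
                .forEach(n -> { });
        return log;
    }

    public static void main(String[] args) {
        // [decode a, use A, decode b, use B, decode c, use C]
        System.out.println(trace());
    }
}
```

Each element is decoded and used before the next one is decoded, so the decoded objects do not all have to be in memory at once.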
