Streams: find highest scoring Question (Optional<Question>) with a minimum view count - Java

Hi everyone! I am really struggling with this method. I have to find the question with the highest score and filter it by minimumViews.
public Stream<Question> stream() {
    Stream<Question> questionStream = Arrays.stream(items);
    questionStream.forEach(System.out::println);
    return questionStream;
}
public Optional<Question> findHighestScoringQuestionWith(int minimumViews) {
    return stream()
            .sorted(Comparator.comparing(Question::getScore))
            .filter(x -> x.getViewCount() >= minimumViews)
            .findFirst();
}
I would be very grateful if someone could help me with this issue. Thank you all in advance.
My exception:
Exception in thread "main" java.lang.IllegalStateException: stream has already been operated upon or closed
at java.base/java.util.stream.AbstractPipeline.<init>(AbstractPipeline.java:203)
at java.base/java.util.stream.ReferencePipeline.<init>(ReferencePipeline.java:94)
at java.base/java.util.stream.ReferencePipeline$StatefulOp.<init>(ReferencePipeline.java:725)
at java.base/java.util.stream.SortedOps$OfRef.<init>(SortedOps.java:126)
at java.base/java.util.stream.SortedOps.makeRef(SortedOps.java:63)
at java.base/java.util.stream.ReferencePipeline.sorted(ReferencePipeline.java:463)
at stackoverflow.Data.sortedStream(Data.java:156)
at stackoverflow.Main.main(Main.java:14)

Stream operations are divided into intermediate and terminal operations, and are combined to form stream pipelines. A stream pipeline consists of a source (such as a Collection, an array, a generator function, or an I/O channel); followed by zero or more intermediate operations such as Stream.filter or Stream.map; and a terminal operation such as Stream.forEach or Stream.reduce.
- Package Summary for java.util.stream
Stream.forEach is a terminal operation, meaning that it completes a stream pipeline. The whole stream pipeline is evaluated when a terminal operation is invoked; at that point the stream has been operated upon, as stated in the exception.
If you want to have multiple terminal operations, you need to set up multiple stream pipelines.
To perform some operation on the data mid-stream, you can use Stream.peek:
public Stream<Question> stream() {
    Stream<Question> questionStream = Arrays.stream(items);
    return questionStream.peek(System.out::println); // <-
}
public Optional<Question> findHighestScoringQuestionWith(int minimumViews) {
    return stream()
            .sorted(Comparator.comparing(Question::getScore))
            .filter(x -> x.getViewCount() >= minimumViews)
            .findFirst();
}
This will print out all items in the stream, but only once a terminal operation is called and the stream is evaluated. In your case, that terminal operation is Stream.findFirst in the findHighestScoringQuestionWith method.
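As a side note, sorting by score in ascending order and then taking the first element actually returns the lowest-scoring question. A minimal sketch of the same method using Stream.max instead, which avoids the full sort and reads the intent directly (assuming the same Question getters used in the question):
public Optional<Question> findHighestScoringQuestionWith(int minimumViews) {
    return stream()
            .filter(q -> q.getViewCount() >= minimumViews)   // keep only sufficiently viewed questions
            .max(Comparator.comparing(Question::getScore));   // pick the highest score among them
}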

Streams are one-shot objects; you can't use them more than once.
The problem is that you are calling questionStream.forEach in the stream() method, so the stream is already used up before you return it. If you really want to print out the contents, then you could do Arrays.asList(items).forEach(System.out::println); instead.
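A minimal sketch of stream() following that suggestion, so the returned stream is still unconsumed:
public Stream<Question> stream() {
    // Print via a separate traversal of the backing array...
    Arrays.asList(items).forEach(System.out::println);
    // ...and return a fresh, unconsumed stream.
    return Arrays.stream(items);
}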

Related

Collect both min and max in one stream

I need to print both min and max of a stream of int in one operation. I currently have 2 operations but the second is not allowed. Somehow collectors are not working for me:
Stream<Integer> stringInt = Stream.of(8,50,16,0,72);
System.out.println(stringInt.reduce(Math::min).get());
System.out.println(stringInt.reduce(Math::max).get());
The second is not allowed since a stream cannot be reused. From the Stream javadoc:
A stream should be operated on (invoking an intermediate or terminal stream operation) only once. This rules out, for example, "forked" streams, where the same source feeds two or more pipelines, or multiple traversals of the same stream. A stream implementation may throw IllegalStateException if it detects that the stream is being reused.
You could use collect with Collectors.summarizingInt:
IntSummaryStatistics collect = stringInt.collect(Collectors.summarizingInt(value -> value));
System.out.println(collect.getMax());
System.out.println(collect.getMin());
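Alternatively, a minimal sketch using an IntStream, whose summaryStatistics() computes min and max (along with count, sum, and average) in a single pass:
IntSummaryStatistics stats = IntStream.of(8, 50, 16, 0, 72).summaryStatistics();
System.out.println(stats.getMax()); // 72
System.out.println(stats.getMin()); // 0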

Why do I have to chain Stream operations in Java? [duplicate]

This question already has answers here:
When is a Java 8 Stream considered to be consumed?
(2 answers)
Closed 4 years ago.
I think all of the resources I have studied one way or another emphasize that a stream can be consumed only once, and the consumption is done by so-called terminal operations (which is very clear to me).
Just out of curiosity I tried this:
import java.util.stream.IntStream;

class App {
    public static void main(String[] args) {
        IntStream is = IntStream.of(1, 2, 3, 4);
        is.map(i -> i + 1);
        int sum = is.sum();
    }
}
which ends up throwing a Runtime Exception:
Exception in thread "main" java.lang.IllegalStateException: stream has already been operated upon or closed
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:229)
at java.util.stream.IntPipeline.reduce(IntPipeline.java:456)
at java.util.stream.IntPipeline.sum(IntPipeline.java:414)
at App.main(scratch.java:10)
As usual, I am missing something, but I still want to ask: as far as I know, map is an intermediate (and lazy) operation and does nothing on the Stream by itself. Only when the terminal operation sum (which is an eager operation) is called does the Stream get consumed and operated on.
But why do I have to chain them?
What is the difference between
is.map(i -> i + 1);
is.sum();
and
is.map(i -> i + 1).sum();
?
When you do this:
int sum = IntStream.of(1, 2, 3, 4).map(i -> i + 1).sum();
Every chained method is being invoked on the return value of the previous method in the chain.
So map is invoked on what IntStream.of(1, 2, 3, 4) returns and sum on what map(i -> i + 1) returns.
You don't have to chain stream methods, but it's more readable and less error-prone than using this equivalent code:
IntStream is = IntStream.of(1, 2, 3, 4);
is = is.map(i -> i + 1);
int sum = is.sum();
Which is not the same as the code you've shown in your question:
IntStream is = IntStream.of(1, 2, 3, 4);
is.map(i -> i + 1);
int sum = is.sum();
As you see, you're disregarding the reference returned by map. This is the cause of the error.
EDIT (as per the comments, thanks to @IanKemp for pointing this out): Actually, this is the external cause of the error. If you stop to think about it, map must be doing something internally to the stream itself, otherwise, how would the terminal operation then trigger the transformation passed to map on each element? I agree that intermediate operations are lazy, i.e. when invoked, they do nothing to the elements of the stream. But internally, they must configure some state in the stream pipeline itself, so that they can be applied later.
Although I'm not aware of the full details, what happens, conceptually, is that map is doing at least two things:
It's creating and returning a new stream that holds the function passed as an argument somewhere, so that it can be applied to elements later, when the terminal operation is invoked.
It also sets a flag on the old stream instance, i.e. the one it has been called on, indicating that this stream instance no longer represents a valid state for the pipeline. This is because the new, updated state which holds the function passed to map is now encapsulated by the instance it has returned. (I believe this decision might have been taken by the JDK team to make errors appear as early as possible, i.e. by throwing an early exception instead of letting the pipeline go on with an invalid/old state that doesn't hold the function to be applied, thus letting the terminal operation return unexpected results.)
Later on, when a terminal operation is invoked on this instance flagged as invalid, you're getting that IllegalStateException. The two items above configure the deep, internal cause of the error.
Another way to see all this is to make sure that a Stream instance is operated only once, by means of either an intermediate or a terminal operation. Here you are violating this requirement, because you are calling map and sum on the same instance.
In fact, javadocs for Stream state it clearly:
A stream should be operated on (invoking an intermediate or terminal stream operation) only once. This rules out, for example, "forked" streams, where the same source feeds two or more pipelines, or multiple traversals of the same stream. A stream implementation may throw IllegalStateException if it detects that the stream is being reused. However, since some stream operations may return their receiver rather than a new stream object, it may not be possible to detect reuse in all cases.
Imagine the IntStream as a wrapper around your data with an immutable list of operations. These operations are not executed until you need the final result (sum in your case).
Since the list is immutable, you need a new instance of IntStream with a list that contains the previous items plus the new one, which is what map returns.
This means that if you don't chain, you will operate on the old instance, which does not have that operation.
The stream library also keeps some internal tracking of what's going on; that's why it's able to throw the exception in the sum step.
If you don't want to chain, you can use a variable for each step:
IntStream is = IntStream.of(1, 2, 3, 4);
IntStream is2 = is.map(i -> i + 1);
int sum = is2.sum();
Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate.
Taken from https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html under "Stream Operations and Pipelines"
At the lowest level, all streams are driven by a spliterator.
Taken from the same link under "Low-level stream construction"
Traversal and splitting exhaust elements; each Spliterator is useful for only a single bulk computation.
Taken from https://docs.oracle.com/javase/8/docs/api/java/util/Spliterator.html
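For example, a minimal sketch (plain JDK, nothing from the question assumed) of a spliterator being exhausted by a single traversal:
Spliterator<String> sp = Arrays.asList("a", "b", "c").spliterator();
sp.forEachRemaining(System.out::println); // prints a, b, c
sp.forEachRemaining(System.out::println); // prints nothing: already exhausted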

Java: Consumer interface in a stream doesn't work as expected [duplicate]

This question already has answers here:
Java 8 Streams peek api
(4 answers)
How to use Streams api peek() function and make it work?
(2 answers)
Closed 4 years ago.
I've got two statements that I expected to print the same result:
Arrays.stream("abc".split("")).forEach(System.out::println);//first
Arrays.stream("abc".split("")).peek(new Consumer<String>() {//second
#Override
public void accept(String s) {
System.out.println(s);//breakpoint
}
});
In fact, the first statement will print
a
b
c
Ok, but the second statement prints nothing. I tried to set a breakpoint on the line marked //breakpoint in IntelliJ, but it wasn't hit.
So how should I change the second statement to use peek, so that it creates a new stream while processing every element with the Consumer?
Thanks a lot.
Stream.peek, as stated in the API javadoc as well, is meant mainly for debugging purposes; performing update operations on the stream during the peek operation is not recommended.
For example, you can verify the intermediate stream state with the following code and what it eventually results in:
Arrays.stream("acb".split(""))
.peek(System.out::println) // print a c b
.sorted()
.forEach(System.out::println); // print a b c
In general, peek is an intermediate operation and won't be executed unless a terminal operation is performed on the stream, as described in the Stream operations and pipelines section of the docs; that is exactly why your first statement prints and the second does not.
Note: as pointed out in a few other answers, the action within peek is not invoked when the implementation is able to optimize away the production of elements, e.g. for short-circuiting operations like findFirst:
In cases where the stream implementation is able to optimize away the production of some or all the elements (such as with short-circuiting operations like findFirst, or in the example described in count()), the action will not be invoked for those elements.
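A minimal sketch of that short-circuiting behaviour:
Optional<String> first = Stream.of("a", "b", "c")
        .peek(System.out::println) // only "a" is printed
        .findFirst();              // short-circuits after the first element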
peek() is not a terminal operation; you need to add a terminal operation to make peek work, e.g.
Arrays.stream("abc".split("")).peek(new Consumer<String>() { //second
#Override
public void accept(String s) {
System.out.println(s);//breakpoint
}
}).count();
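Note that on Java 9 and later, count() may be computed directly from the source without running the pipeline (the very caveat quoted in the answer above), so a terminal operation that actually consumes the elements is a safer trigger; a minimal sketch:
Arrays.stream("abc".split(""))
        .peek(System.out::println)     // side effect to observe
        .collect(Collectors.toList()); // terminal operation that always traverses the elements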
The peek() is not a terminal operation, it produces an intermediate stream. Your stream would be executed only when it finds a terminal operation.
For example, if you add the count() terminal operation to your second stream, you will get the expected output.
Note - You got an output for the first stream because forEach() is a terminal operation.
Stream operations are divided into intermediate (Stream-producing) operations and terminal (value- or side-effect-producing) operations. Intermediate operations are always lazy, so the stream only starts executing the operation pipeline once it reaches a terminal operation. In your first case forEach is the terminal operation, so the stream was executed. But in the second case the last operation in the pipeline is peek(), which is not a terminal operation.

Conditionally add an operation to a Java 8 stream

I'm wondering if I can add an operation to a stream, based off of some sort of condition set outside of the stream. For example, I want to add a limit operation to the stream if my limit variable is not equal to -1.
My code currently looks like this, but I have yet to see other examples of streams being used this way, where a Stream object is reassigned to the result of an intermediate operation applied on itself:
// Do some stream stuff
stream = stream.filter(e -> e.getTimestamp() < max);
// Limit the stream
if (limit != -1) {
    stream = stream.limit(limit);
}
// Collect stream to list
stream.collect(Collectors.toList());
As stated in this stackoverflow post, the filter isn't actually applied until a terminal operation is called. Since I'm reassigning the value of stream before a terminal operation is called, is the above code still a proper way to use Java 8 streams?
There is no semantic difference between a chained series of invocations and a series of invocations storing the intermediate return values. Thus, the following code fragments are equivalent:
a = object.foo();
b = a.bar();
c = b.baz();
and
c = object.foo().bar().baz();
In either case, each method is invoked on the result of the previous invocation. But in the latter case, the intermediate results are not stored but lost on the next invocation. In the case of the stream API, the intermediate results must not be used after you have called the next method on them, thus chaining is the natural way of using streams as it intrinsically ensures that you don't invoke more than one method on a returned reference.
Still, it is not wrong to store the reference to a stream as long as you obey the contract of not using a returned reference more than once. By using it the way you do in your question, i.e. overwriting the variable with the result of the next invocation, you also ensure that you don't invoke more than one method on a returned reference, thus it's a correct usage. Of course, this only works with intermediate results of the same type, so when you are using map or flatMap, getting a stream of a different reference type, you can't overwrite the local variable. Then you have to be careful not to use the old local variable again, but, as said, as long as you are not using it after the next invocation, there is nothing wrong with the intermediate storage.
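For instance, a minimal sketch (with a hypothetical Person type and getPeople() source, not taken from the question) of the case where the element type changes and the variable cannot simply be overwritten:
Stream<Person> people = getPeople().stream();        // hypothetical source
Stream<String> names = people.map(Person::getName);  // new variable needed: the element type changed
// 'people' must not be used again from here on
List<String> result = names.collect(Collectors.toList());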
Sometimes, you have to store it, e.g.
try (Stream<String> stream = Files.lines(Paths.get("myFile.txt"))) {
    stream.filter(s -> !s.isEmpty()).forEach(System.out::println);
}
Note that the code is equivalent to the following alternatives:
try (Stream<String> stream = Files.lines(Paths.get("myFile.txt")).filter(s -> !s.isEmpty())) {
    stream.forEach(System.out::println);
}
and
try (Stream<String> srcStream = Files.lines(Paths.get("myFile.txt"))) {
    Stream<String> tmp = srcStream.filter(s -> !s.isEmpty());
    // must not use the variable srcStream here:
    tmp.forEach(System.out::println);
}
They are equivalent because forEach is always invoked on the result of filter which is always invoked on the result of Files.lines and it doesn’t matter on which result the final close() operation is invoked as closing affects the entire stream pipeline.
To put it in one sentence, the way you use it, is correct.
I even prefer to do it that way, as not chaining a limit operation when you don't want to apply a limit is the cleanest way of expressing your intent. It's also worth noting that the suggested alternatives may work in a lot of cases, but they are not semantically equivalent:
.limit(condition? aLimit: Long.MAX_VALUE)
assumes that the maximum number of elements you can ever encounter is Long.MAX_VALUE, but streams can have more elements than that; they might even be infinite.
.limit(condition? aLimit: list.size())
when the stream source is the list, breaks the lazy evaluation of the stream. In principle, a mutable stream source might legally get arbitrarily changed up to the point when the terminal action is commenced. The result will reflect all modifications made up to that point. When you add an intermediate operation incorporating list.size(), i.e. the actual size of the list at that point, subsequent modifications applied to the collection between this point and the terminal operation may give this value a different meaning than the intended "actually no limit" semantics.
Compare with “Non Interference” section of the API documentation:
For well-behaved stream sources, the source can be modified before the terminal operation commences and those modifications will be reflected in the covered elements. For example, consider the following code:
List<String> l = new ArrayList(Arrays.asList("one", "two"));
Stream<String> sl = l.stream();
l.add("three");
String s = sl.collect(joining(" "));
First a list is created consisting of two strings: "one"; and "two". Then a stream is created from that list. Next the list is modified by adding a third string: "three". Finally the elements of the stream are collected and joined together. Since the list was modified before the terminal collect operation commenced the result will be a string of "one two three".
Of course, this is a rare corner case, as normally a programmer will formulate an entire stream pipeline without modifying the source collection in between. Still, the different semantics remain, and they might turn into a very hard-to-find bug once you do hit such a corner case.
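A minimal sketch of that corner case, assuming the list is modified between building the pipeline and invoking the terminal operation:
List<String> l = new ArrayList<>(Arrays.asList("one", "two"));
Stream<String> s = l.stream().limit(l.size()); // the limit is fixed at 2 here
l.add("three");
// The source modification is visible to the stream, but the stale limit silently drops "three":
System.out.println(s.collect(Collectors.joining(" "))); // prints "one two"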
Further, since they are not equivalent, the stream API will never recognize these values as “actually no limit”. Even specifying Long.MAX_VALUE implies that the stream implementation has to track the number of processed elements to ensure that the limit has been obeyed. Thus, not adding a limit operation can have a significant performance advantage over adding a limit with a number that the programmer expects to never be exceeded.
There are two ways you can do this:
// Do some stream stuff
List<E> results = list.stream()
        .filter(e -> e.getTimestamp() < max)
        .limit(limit > 0 ? limit : list.size())
        .collect(Collectors.toList());
OR
// Do some stream stuff
stream = stream.filter(e -> e.getTimestamp() < max);
// Limit the stream
if (limit != -1) {
    stream = stream.limit(limit);
}
// Collect stream to list
List<E> results = stream.collect(Collectors.toList());
As this is functional programming, you should always work on the result of each function. Specifically, you should avoid modifying anything in this style of programming and treat everything as immutable where possible.
Since I'm reassigning the value of stream before a terminal operation is called, is the above code still a proper way to use Java 8 streams?
It should work, however it reads as a mix of imperative and functional coding. I suggest writing it as a fixed stream as per my first answer.
I think your first line needs to be:
stream = stream.filter(e -> e.getTimestamp() < max);
so that you're using the stream returned by filter in subsequent operations rather than the original stream.
I know it is a bit late, but I had the same question myself and didn't find a satisfying answer; however, inspired by this question and its answers, I came to the following solution:
return Stream.of(                                       // wrap the target stream in another stream ;)
        stream.filter(e -> e.getTimestamp() < max)      // do the regular stream stuff
)
        .flatMap(s -> limit != -1 ? s.limit(limit) : s) // apply the limit only if necessary and unwrap the stream of streams back to a "normal" stream
        .collect(Collectors.toList());                  // do the final stuff

Java 8 Streams peek api

I tried the following snippet of Java 8 code with peek.
List<String> list = Arrays.asList("Bender", "Fry", "Leela");
list.stream().peek(System.out::println);
However there is nothing printed out on the console. If I do this instead:
list.stream().peek(System.out::println).forEach(System.out::println);
I see the following, which outputs both the peek and the forEach invocations:
Bender
Bender
Fry
Fry
Leela
Leela
Both forEach and peek take a (Consumer<? super T> action).
So why is the output different?
The Javadoc mentions the following:
Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.
Being an intermediate operation, peek does nothing on its own. On applying a terminal operation like forEach, the results do get printed out, as seen.
The documentation for peek says
Returns a stream consisting of the elements of this stream, additionally performing the provided action on each element as elements are consumed from the resulting stream.
This is an intermediate operation.
You therefore have to do something with the resulting stream for System.out.println to do anything.
From the docs on Stream for the peek method:
...additionally performing the provided action on each element as elements are consumed from the resulting stream.
Streams in Java 8 are lazy. In addition, if there are two chained operations in a stream, one after the other, the second operation begins as soon as the first one finishes processing a single element (given there is a terminal operation in the stream).
This is why you can see each name string printed twice, with the two outputs interleaved.
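A minimal sketch that makes this element-by-element interleaving visible by labelling the two actions:
Stream.of("Bender", "Fry", "Leela")
        .peek(s -> System.out.println("peek:    " + s))
        .forEach(s -> System.out.println("forEach: " + s));
// Output alternates per element:
// peek:    Bender
// forEach: Bender
// peek:    Fry
// forEach: Fry
// peek:    Leela
// forEach: Leela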
