Say I have two Monos, one of which resolves to Void/empty and the other of which produces an Integer. How can I execute both in parallel and continue on as a Mono<Integer>?
Specifically both of these Monos are results of WebClient requests. Only one of these produces a useful value, but both need to be successful to continue.
eg.
Mono<Void> a = sendSomeData();
Mono<Integer> b = getSomeNumber();
Mono<Integer> resultingStream = runConcurrentAndGetValue(a, b);
How would I write runConcurrentAndGetValue(a,b) ?
Initially I didn't need the value and was using Mono.when(a, b) and building off of the Mono<Void>. But now I need the value. I tried using Mono.zip(a, b).map(Tuple2::getT2), but then learned that zip cancels b because a completes empty (it has a lower cardinality of 0), so the result ends up with no item.
I could use Mono.when(a).then(b) but I would really prefer to be able to execute these concurrently. What is the right operator/composition to use in this case?
Edit:
One option I can think of is just a hack to emit an unused value like:
Mono.zip(a.then(Mono.just("placeholder")), b).map(Tuple2::getT2)
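A slightly cleaner spelling of the same hack (if I'm reading the Reactor docs right) is thenReturn, which does the then(Mono.just(...)) composition in one call:
// emit a placeholder once a completes, so zip has a value to pair with b's
Mono.zip(a.thenReturn("placeholder"), b).map(Tuple2::getT2)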
You could use the reactor.core.publisher.Flux#merge(Publisher<? extends I>...) method and take the last element.
Mono<Integer> a = sendSomeData().then(Mono.empty());
Mono<Integer> b = getSomeNumber();
Mono<Integer> result = Flux.merge(a, b).last();
result.map(...);
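Wrapped up as the helper method the question asks for, a minimal sketch along the same lines could be:
static Mono<Integer> runConcurrentAndGetValue(Mono<Void> a, Mono<Integer> b) {
    // merge subscribes to both sources eagerly, so they run concurrently;
    // a completes empty, so the only (and therefore last) element is b's value,
    // and an error from either source fails the resulting Mono
    return Flux.merge(a.then(Mono.<Integer>empty()), b).last();
}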
Related
I have a Flux of strings. For each string, I have to make a remote call. But the problem is that the method that makes the remote call actually returns a Mono of the response (obviously, since a single request yields a single response).
What should be the correct pattern to handle such cases? One solution I can think of is to make serial (or parallel) calls for the stream elements, reduce the responses to a single one, and return it.
Here's the code:
fluxObj.flatMap(a -> makeRemoteCall(a)) // converts the Mono of the response to a Flux
    .reduce(...)
I am unable to wrap my head around what happens inside the flatMap. The makeRemoteCall method returns a Mono, but flatMap returns a Flux of the response. First, why is this happening? Second, does it mean that the returned Flux contains a single response object (the one that was returned in the Mono)?
If the mapper Function returns a Mono, then it means that there will be (at most) one derived value for each source element in the Flux.
Having the Function return:
an empty Mono (eg. Mono.empty()) for a given value means that this source value is "ignored"
a valued Mono (like in your example) means that this source value is asynchronously mapped to another specific value
a Flux with several derived values for a given value means that this source value is asynchronously mapped to several values
For instance, given the following flatMap:
Flux.just("A", "B")
.flatMap(v -> Mono.just("value" + v))
Subscribing to the above Flux<String> and printing the emitted elements would yield:
valueA
valueB
Another fun example: with delays, one can get out-of-order results, like this:
Flux.just(300, 100)
    .flatMap(delay -> Mono.delay(Duration.ofMillis(delay))
        .thenReturn(delay + "ms"))
would result in a Flux<String> that yields:
100ms
300ms
If you look at the flatMap documentation, you can find the answer to your questions:
Transform the elements emitted by this Flux asynchronously into Publishers, then flatten these inner publishers into a single Flux through merging, which allow them to interleave.
Long story short:
@Test
public void testFlux() {
    Flux<String> oneString = Flux.just("1");
    oneString
        .flatMap(s -> testMono(s))
        .collectList()
        .subscribe(integers -> System.out.println("elements:" + integers));
}

private Mono<Integer> testMono(String s) {
    return Mono.just(Integer.valueOf(s + "0"));
}
The mapper is s -> testMono(s), where testMono(s) is a Publisher (in your case makeRemoteCall(a)); it transforms each element of my oneString into an Integer.
I collected the resulting Flux into a List and printed it. Console output:
elements:[10]
This means the resulting Flux after the flatMap operator contains just one element.
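Applied to the original question, the whole pattern would look roughly like this (a sketch: fluxObj and makeRemoteCall come from the question, while Response and combine are assumed names):
Mono<Response> combined = fluxObj
    .flatMap(a -> makeRemoteCall(a))      // Flux<Response>: at most one response per element
    .reduce((r1, r2) -> combine(r1, r2)); // Mono<Response>: fold all responses into one (combine is hypothetical)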
I want to change my code to use a single subscriber. Now I have:
auctionFlux.window(Duration.ofSeconds(120), Duration.ofSeconds(120)).subscribe(
    s -> s.groupBy(Auction::getItem).subscribe(
        longAuctionGroupedFlux -> longAuctionGroupedFlux.reduce(new ItemDumpStats(), this::calculateStats)
    ));
This code is working correctly; the reduce method is very simple. I tried changing my code to use a single subscriber:
auctionFlux.window(Duration.ofSeconds(120), Duration.ofSeconds(120))
    .flatMap(window -> window.groupBy(Auction::getItem))
    .flatMap(longAuctionGroupedFlux -> longAuctionGroupedFlux.reduce(new ItemDumpStats(), this::calculateStats))
    .subscribe(itemDumpStatsMono -> log.info(itemDumpStatsMono.toString()));
This is my code, and it is not working: no errors and no results. After debugging I found that the code gets stuck on the second flatMap when reducing the stream. I think the problem is in the flatMap merging, getting stuck while resolving the Mono. Does someone know how to fix this problem and use only a single subscriber?
To replicate, you can use another class or create one. With a small size it works, but with a bigger one it dies:
List<Auction> auctionList = new ArrayList<>();
for (int i = 0; i < 100000; i++) {
    Auction a = new Auction((long) i, "test");
    a.setItem((long) (i % 50));
    auctionList.add(a);
}
Flux.fromIterable(auctionList)
    .groupBy(Auction::getId)
    .flatMap(longAuctionGroupedFlux ->
        longAuctionGroupedFlux.reduce(new ItemDumpStats(), (itemDumpStats, auction) -> itemDumpStats))
    .collectList()
    .subscribe(itemDumpStats -> System.out.println(itemDumpStats.toString()));
With this approach the result is instant, but I am using three subscribers:
Flux.fromIterable(auctionList)
    .groupBy(Auction::getId)
    .subscribe(
        auctionIdAuctionGroupedFlux -> auctionIdAuctionGroupedFlux
            .reduce(new ItemDumpStats(), (itemDumpStats, auction) -> itemDumpStats)
            .subscribe(itemDumpStats -> System.out.println(itemDumpStats.toString()))
    );
I think the behavior you described is related to the interaction between groupBy and flatMap when they are chained.
Check the groupBy documentation. It states that:
The groups need to be drained and consumed downstream for groupBy to work correctly. Notably when the criteria produces a large amount of groups, it can lead to hanging if the groups are not suitably consumed downstream (eg. due to a flatMap with a maxConcurrency parameter that is set too low).
By default, maxConcurrency (for flatMap) is set to 256 (I checked the source code of 3.2.2). So, selecting more than 256 groups may cause the execution to hang (particularly when all execution happens on the same thread).
The following code helps in understanding what happens when you chain the operators groupBy and flatMap:
@Test
public void groupAndFlatmapTest() {
    // assumes Lombok's val and a static import of java.util.stream.IntStream.rangeClosed
    val groupCount = 257;
    val groupSize = 513;
    val list = rangeClosed(1, groupSize * groupCount).boxed().collect(Collectors.toList());
    val source = Flux.fromIterable(list)
        .groupBy(i -> i % groupCount)
        .flatMap(Flux::collectList);
    StepVerifier.create(source).expectNextCount(groupCount).expectComplete().verify();
}
The execution of this code hangs. Changing groupCount to 256 or less makes the test pass (for every value of groupSize).
So, regarding your original problem, it is very possible that you are creating a large amount of groups with your key-selector Auction::getItem.
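If the number of groups can exceed that default, one possible workaround (a sketch, assuming you can put an upper bound on the number of distinct items) is the flatMap overload that takes an explicit concurrency argument:
int expectedGroupCount = 1024; // assumed upper bound on distinct items
Flux.fromIterable(auctionList)
    .groupBy(Auction::getItem)
    .flatMap(group -> group.reduce(new ItemDumpStats(), this::calculateStats),
             expectedGroupCount) // must be >= the number of groups, or the pipeline can hang
    .subscribe(stats -> log.info(stats.toString()));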
Adding parallel() fixed the problem, but I am still looking for an answer as to why reduce dramatically slows down flatMap.
I have the following code:
final Observable<String> a = Observable.just("a1", "a2");
final Observable<String> b = Observable.just("b1");
final Observable<String> c = Observable.combineLatest(a, b, (first, second) -> first + second);
c.subscribe(res -> System.out.println(res));
What is the expected output? I would have expected:
a1b1
a2b1
But the actual output is
a2b1
Does that make sense? What is the correct operator to generate the expected sequence?
As the name of the operator should imply, it combines the latest value of each source. If the sources are synchronous or really fast, this can mean that one or more sources run to completion and the operator remembers only the very last value of each. You have to interleave the source values by some means, for example by having asynchronous sources with an ample amount of time between items, avoiding close overlap between items of the different sources.
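To see the interleaving at work, here is a sketch (assuming RxJava 2 for blockingSubscribe; the delays are arbitrary) that spaces out a's items in time:
Observable<String> a = Observable.interval(100, TimeUnit.MILLISECONDS)
        .take(2)
        .map(i -> "a" + (i + 1)); // a1 at ~100ms, a2 at ~200ms
Observable<String> b = Observable.just("b1");
Observable.combineLatest(a, b, (first, second) -> first + second)
        .blockingSubscribe(System.out::println); // prints a1b1, then a2b1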
The expected sequence can be generated a couple of ways, depending on what your original intention was. For example, if you wanted all cross combination, use flatMap:
a.flatMap(aValue -> b, (aValue, bValue) -> aValue + bValue)
.subscribe(System.out::println);
If b is something expensive to recreate, cache it:
Observable<String> cachedB = b.cache();
a.flatMap(aValue -> cachedB, (aValue, bValue) -> aValue + bValue)
.subscribe(System.out::println);
Good question! Seems like perhaps a race condition. combineLatest won't output anything until both sources have emitted, and it appears that by the time b generates its output, a has already moved on to its second item. In a "real world" application with asynchronous events that are spaced out in time, you would probably get the behavior you want.
If you can stand the wait, a solution would be to delay a's outputs a bit. With a bit more work you could delay just the first output (see the various overloads of the delay operator). Also I just noticed there's an operator delaySubscription that would probably do the trick (delay your subscription to a until b emits something). I'm sure there are other, perhaps better, solutions (I'm still learning myself).
I'm wondering if I can add an operation to a stream, based off of some sort of condition set outside of the stream. For example, I want to add a limit operation to the stream if my limit variable is not equal to -1.
My code currently looks like this, but I have yet to see other examples of streams being used this way, where a Stream object is reassigned to the result of an intermediate operation applied on itself:
// Do some stream stuff
stream = stream.filter(e -> e.getTimestamp() < max);
// Limit the stream
if (limit != -1) {
    stream = stream.limit(limit);
}
// Collect stream to list
stream.collect(Collectors.toList());
As stated in this stackoverflow post, the filter isn't actually applied until a terminal operation is called. Since I'm reassigning the value of stream before a terminal operation is called, is the above code still a proper way to use Java 8 streams?
There is no semantic difference between a chained series of invocations and a series of invocations storing the intermediate return values. Thus, the following code fragments are equivalent:
a = object.foo();
b = a.bar();
c = b.baz();
and
c = object.foo().bar().baz();
In either case, each method is invoked on the result of the previous invocation. But in the latter case, the intermediate results are not stored; they are lost after the next invocation. In the case of the stream API, the intermediate results must not be used after you have called the next method on them; thus, chaining is the natural way of using streams, as it intrinsically ensures that you don’t invoke more than one method on a returned reference.
Still, it is not wrong to store the reference to a stream as long as you obey the contract of not using a returned reference more than once. By using it the way shown in your question, i.e. overwriting the variable with the result of the next invocation, you also ensure that you don’t invoke more than one method on a returned reference, thus it’s a correct usage. Of course, this only works with intermediate results of the same type, so when you are using map or flatMap, getting a stream of a different reference type, you can’t overwrite the local variable. Then you have to be careful not to use the old local variable again, but, as said, as long as you are not using it after the next invocation, there is nothing wrong with the intermediate storage.
Sometimes, you have to store it, e.g.
try(Stream<String> stream = Files.lines(Paths.get("myFile.txt"))) {
stream.filter(s -> !s.isEmpty()).forEach(System.out::println);
}
Note that the code is equivalent to the following alternatives:
try(Stream<String> stream = Files.lines(Paths.get("myFile.txt")).filter(s->!s.isEmpty())) {
stream.forEach(System.out::println);
}
and
try(Stream<String> srcStream = Files.lines(Paths.get("myFile.txt"))) {
    Stream<String> tmp = srcStream.filter(s -> !s.isEmpty());
    // must not use the variable srcStream here:
    tmp.forEach(System.out::println);
}
They are equivalent because forEach is always invoked on the result of filter which is always invoked on the result of Files.lines and it doesn’t matter on which result the final close() operation is invoked as closing affects the entire stream pipeline.
To put it in one sentence: the way you use it is correct.
I even prefer to do it that way, as not chaining a limit operation when you don’t want to apply a limit is the cleanest way of expressing your intent. It’s also worth noting that the suggested alternatives may work in a lot of cases, but they are not semantically equivalent:
.limit(condition? aLimit: Long.MAX_VALUE)
assumes that the maximum number of elements you can ever encounter is Long.MAX_VALUE, but streams can have more elements than that; they might even be infinite.
.limit(condition? aLimit: list.size())
when the stream source is the list, breaks the lazy evaluation of the stream. In principle, a mutable stream source may legally be changed arbitrarily up to the point when the terminal action commences. The result will reflect all modifications made up to that point. When you add an intermediate operation incorporating list.size(), i.e. the actual size of the list at this point, subsequent modifications applied to the collection between this point and the terminal operation may give this value a different meaning than the intended “actually no limit” semantic.
Compare with “Non Interference” section of the API documentation:
For well-behaved stream sources, the source can be modified before the terminal operation commences and those modifications will be reflected in the covered elements. For example, consider the following code:
List<String> l = new ArrayList(Arrays.asList("one", "two"));
Stream<String> sl = l.stream();
l.add("three");
String s = sl.collect(joining(" "));
First a list is created consisting of two strings: "one"; and "two". Then a stream is created from that list. Next the list is modified by adding a third string: "three". Finally the elements of the stream are collected and joined together. Since the list was modified before the terminal collect operation commenced the result will be a string of "one two three".
Of course, this is a rare corner case, as normally a programmer will formulate an entire stream pipeline without modifying the source collection in between. Still, the different semantics remain, and they might turn into a very hard-to-find bug once you do enter such a corner case.
Further, since they are not equivalent, the stream API will never recognize these values as “actually no limit”. Even specifying Long.MAX_VALUE implies that the stream implementation has to track the number of processed elements to ensure that the limit has been obeyed. Thus, not adding a limit operation can have a significant performance advantage over adding a limit with a number that the programmer expects to never be exceeded.
There are two ways you can do this:
// Do some stream stuff
List<E> results = list.stream()
    .filter(e -> e.getTimestamp() < max)
    .limit(limit > 0 ? limit : list.size())
    .collect(Collectors.toList());
OR
// Do some stream stuff
stream = stream.filter(e -> e.getTimestamp() < max);
// Limit the stream
if (limit != -1) {
    stream = stream.limit(limit);
}
// Collect stream to list
List<E> results = stream.collect(Collectors.toList());
As this is functional programming, you should always work on the result of each function. In this style of programming you should specifically avoid modifying anything, and treat everything as if it were immutable where possible.
Since I'm reassigning the value of stream before a terminal operation is called, is the above code still a proper way to use Java 8 streams?
It should work, however it reads as a mix of imperative and functional coding. I suggest writing it as a fixed stream as per my first answer.
I think your first line needs to be:
stream = stream.filter(e -> e.getTimestamp() < max);
so that you're using the stream returned by filter in subsequent operations rather than the original stream.
I know it is a bit late, but I had the same question myself and didn't find a satisfying answer. However, inspired by this question and its answers, I came up with the following solution:
return Stream.of(                                    // wrap the target stream in another stream ;)
        /* do regular stream stuff */
        stream.filter(e -> e.getTimestamp() < max)
    )
    .flatMap(s -> limit != -1 ? s.limit(limit) : s)  // apply the limit only if necessary, and unwrap the stream of streams back to a "normal" stream
    .collect(Collectors.toList());                   // do final stuff
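The same trick can be generalized into a small helper that conditionally applies any intermediate operation (a sketch; applyIf is a made-up name, not a standard API):
static <T> Stream<T> applyIf(Stream<T> s, boolean condition, UnaryOperator<Stream<T>> op) {
    // apply the given stream transformation only when the condition holds
    return condition ? op.apply(s) : s;
}
// usage with the question's conditional limit:
List<E> results = applyIf(stream.filter(e -> e.getTimestamp() < max),
        limit != -1, s -> s.limit(limit))
    .collect(Collectors.toList());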
In our project we are migrating to Java 8 and testing its new features.
On my project I'm using Guava predicates and functions to filter and transform some collections using Collections2.transform and Collections2.filter.
In this migration I need to change, for example, Guava code to Java 8 code. So the changes I'm making are of this kind:
List<Integer> naturals = Lists.newArrayList(1,2,3,4,5,6,7,8,9,10,11,12,13);
Function<Integer, Integer> duplicate = new Function<Integer, Integer>() {
    @Override
    public Integer apply(Integer n) {
        return n * 2;
    }
};
Collection<Integer> result = Collections2.transform(naturals, duplicate);
To...
List<Integer> result2 = naturals.stream()
    .map(n -> n * 2)
    .collect(Collectors.toList());
Using Guava I was very comfortable debugging the code, since I could debug each transformation step, but my concern is how to debug, for example, .map(n -> n * 2).
Using the debugger I can see some code like:
@Hidden
@DontInline
/** Interpretively invoke this form on the given arguments. */
Object interpretWithArguments(Object... argumentValues) throws Throwable {
    if (TRACE_INTERPRETER)
        return interpretWithArgumentsTracing(argumentValues);
    checkInvocationCounter();
    assert(arityCheck(argumentValues));
    Object[] values = Arrays.copyOf(argumentValues, names.length);
    for (int i = argumentValues.length; i < values.length; i++) {
        values[i] = interpretName(names[i], values);
    }
    return (result < 0) ? null : values[result];
}
But it isn't as straightforward as Guava to debug the code; actually, I couldn't find the n * 2 transformation.
Is there a way to see this transformation or a way to easy debug this code?
EDIT: I've collected the answers from the different comments and posted answers below.
Thanks to Holger's comment, which answered my question: the approach of using a lambda block allowed me to see the transformation process and debug what happened inside the lambda body:
.map(n -> {
    Integer nr = n * 2;
    return nr;
})
Thanks to Stuart Marks, the approach of using method references also allowed me to debug the transformation process:
static int timesTwo(int n) {
    Integer result = n * 2;
    return result;
}
...
List<Integer> result2 = naturals.stream()
    .map(Java8Test::timesTwo)
    .collect(Collectors.toList());
...
Thanks to Marlon Bernardes' answer, I noticed that my Eclipse doesn't show what it should, and using peek() helped to display the results.
I usually have no problem debugging lambda expressions while using Eclipse or IntelliJ IDEA. Just set a breakpoint and be sure not to inspect the whole lambda expression (inspect only the lambda body).
Another approach is to use peek to inspect the elements of the stream:
List<Integer> naturals = Arrays.asList(1,2,3,4,5,6,7,8,9,10,11,12,13);
naturals.stream()
    .map(n -> n * 2)
    .peek(System.out::println)
    .collect(Collectors.toList());
UPDATE:
I think you're getting confused because map is an intermediate operation - in other words: it is a lazy operation which is executed only after a terminal operation has been called. So when you call stream.map(n -> n * 2), the lambda body isn't executed at that moment. You need to set a breakpoint and inspect it after a terminal operation has been called (collect, in this case).
Check Stream Operations for further explanations.
UPDATE 2:
Quoting Holger's comment:
What makes it tricky here is that the call to map and the lambda expression are in one line, so a line breakpoint will stop on two completely unrelated actions. Inserting a line break right after map( would allow you to set a breakpoint for the lambda expression only. And it’s not unusual that debuggers don’t show intermediate values of a return statement. Changing the lambda to n -> { int result = n * 2; return result; } would allow you to inspect result. Again, insert line breaks appropriately when stepping line by line…
IntelliJ has a nice plugin for this case, the Java Stream Debugger plugin. You should check it out: https://plugins.jetbrains.com/plugin/9696-java-stream-debugger
It extends the IDEA Debugger tool window by adding the Trace Current Stream Chain button, which becomes active when the debugger stops inside a chain of Stream API calls.
It has a nice interface for working with individual stream operations and gives you the opportunity to follow the values you need to debug.
You can also launch it manually from the Debug window.
Debugging lambdas also works well with NetBeans. I'm using NetBeans 8 and JDK 8u5.
If you set a breakpoint on a line where there's a lambda, you will actually hit it once when the pipeline is set up, and then once for each stream element. Using your example, the first time you hit the breakpoint will be the map() call that's setting up the stream pipeline.
You can see the call stack and the local variables and parameter values for main as you'd expect. If you continue stepping, the "same" breakpoint is hit again, except this time it's within the call to the lambda.
Note that this time the call stack is deep within the streams machinery, and the local variables are the locals of the lambda itself, not the enclosing main method. (I've changed the values in the naturals list to make this clear.)
As Marlon Bernardes pointed out (+1), you can use peek to inspect values as they go by in the pipeline. Be careful though if you're using this from a parallel stream. The values can be printed in an unpredictable order across different threads. If you're storing values in a debugging data structure from peek, that data structure will of course have to be thread-safe.
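To make that last point concrete, here is a sketch of collecting debug values from peek on a parallel stream into a thread-safe structure (the names are illustrative):
Queue<Integer> seen = new ConcurrentLinkedQueue<>(); // thread-safe sink for debug values
List<Integer> doubled = IntStream.rangeClosed(1, 100).boxed()
        .parallel()
        .peek(seen::add) // may run on many threads at once, in no particular order
        .map(n -> n * 2)
        .collect(Collectors.toList());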
Finally, if you're doing a lot of debugging of lambdas (especially multi-line statement lambdas), it might be preferable to extract the lambda into a named method and then refer to it using a method reference. For example,
static int timesTwo(int n) {
    return n * 2;
}

public static void main(String[] args) {
    List<Integer> naturals = Arrays.asList(3247, 92837, 123);
    List<Integer> result =
        naturals.stream()
            .map(DebugLambda::timesTwo)
            .collect(toList());
}
This might make it easier to see what's going on while you're debugging. In addition, extracting methods this way makes it easier to unit test. If your lambda is so complicated that you need to be single-stepping through it, you probably want to have a bunch of unit tests for it anyway.
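For example, a minimal test for the extracted method might look like this (a sketch, assuming JUnit is on the classpath):
@Test
public void timesTwoDoublesItsInput() {
    assertEquals(6, DebugLambda.timesTwo(3));
    assertEquals(0, DebugLambda.timesTwo(0));
}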
To provide more up-to-date details (Oct 2019): IntelliJ has added a pretty nice, extremely useful integration for debugging this type of code.
When we stop at a line that contains a lambda and press F7 (step into), IntelliJ highlights the snippets that can be debugged. We can switch which chunk to debug with Tab, and once we have decided, we press F7 again.
Here are some screenshots to illustrate:
1- Press the F7 (step into) key; this displays the highlights (selection mode)
2- Use Tab multiple times to select the snippet to debug
3- Press the F7 (step into) key again to step into it
IntelliJ IDEA 15 seems to make it even easier: it allows stopping at the part of the line where the lambda is. See the first feature here: http://blog.jetbrains.com/idea/2015/06/intellij-idea-15-eap-is-open/
Debugging using IDEs is always helpful, but the ideal way of inspecting each element in a stream is to use peek() before a terminal operation, since Java Streams are lazily evaluated: unless a terminal operation is invoked, the stream will not be evaluated.
List<Integer> numFromZeroToTen = Arrays.asList(1,2,3,4,5,6,7,8,9,10);
numFromZeroToTen.stream()
    .map(n -> n * 2)
    .peek(n -> System.out.println(n))
    .collect(Collectors.toList());