An answer here quotes a table of all CompletableFuture methods, but it's not quite what I'm looking for, or perhaps I'm going about it wrong.
I'm looking for the CompletableFuture equivalent of Stream's peek(), so basically a thenApply that returns its input argument, or a thenRun that doesn't return Void. There are two ways I can think of that do the job, although neither expresses my intention accurately:
(..)
.thenApply((a) -> {
    doSomething();
    return a;
})
(..)
and
(..)
.whenComplete((result, exception) -> {
    if (exception == null) {
        doSomething();
    }
})
(..)
Both take the input from the previous stage, let me perform an action, and pass the same type on to the next stage. Of the two, the second approach delays my action until everything else is done, rather than running asynchronously as soon as the necessary previous stage completes.
I've also been searching for an identity function that takes a consumer function as argument, but it seems I would have to write that myself:
public static <T> Function<T, T> identityConsumer(Consumer<T> c) {
    return a -> { c.accept(a); return a; };
}
(..)
.thenApply(identityConsumer(a -> doSomething()));
(..)
So, short of writing my own utility functions, is there an elegant way of performing an action in an intermediate stage, without having to return something, while keeping the stage's current type?
Unlike with Stream, you can make multiple function chains from a single future, so it is not necessary to do everything in a single chain of calls. I would approach your problem like so:
var f1 = CompletableFuture.supplyAsync(...);
f1.thenRun(() -> doSomething()); // this is your "peek"
var f2 = f1.thenApply(...); // continue your chain of operations here
I wanted to do this too, so I just made my own wrapper:
/** Converts a Consumer into an identity Function that passes through the input. */
public static <T> Function<T, T> peek(Consumer<T> fn) {
    return t -> {
        fn.accept(t);
        return t;
    };
}
used like:
.thenApply(Functions.peek(SomeUtil::yourConsumerMethod));
I have the following scenario.
CompletableFuture<T> result = CompletableFuture.supplyAsync(task, executor);
result.thenRun(() -> {
    ...
});
// ....
// after some more code, based on some condition, I attach thenApply() to result
if (x == 1) {
    result.thenApplyAsync(t -> {
        return null;
    });
}
The question is: what if the CompletableFuture's thread finishes execution before the main thread reaches the thenApplyAsync? Will the completed result still attach itself to thenApply, i.e. should the callback be declared at the time CompletableFuture.supplyAsync() itself is defined?
Also, what is the order of execution? Is thenRun() always executed last (after thenApply())?
Is there any drawback to using this strategy?
You seem to be missing an important point. When you chain a dependent function, you are not altering the future you’re invoking the chaining method on.
Instead, each of these methods returns a new completion stage representing the dependent action.
Since you are attaching two dependent actions to result, which represents the task passed to supplyAsync, there is no relationship between these two actions; they may run in an arbitrary order, or even at the same time in different threads.
Since you are not storing the future returned by thenApplyAsync anywhere, the result of its evaluation would be lost anyway. Assuming that your function returns a result of the same type as T, you could use
if (x == 1) {
    result = result.thenApplyAsync(t -> {
        return null;
    });
}
to replace the potentially completed future with the new future that only gets completed when the result of the specified function has been evaluated. The runnable registered at the original future via thenRun still does not depend on this new future. Note that thenApplyAsync without an executor will always use the default executor, regardless of which executor was used to complete the other future.
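If you want the function to run on your own executor rather than the default one, you can pass it explicitly; a minimal sketch of the same code with the executor added:
if (x == 1) {
    result = result.thenApplyAsync(t -> {
        return null;
    }, executor); // explicitly run the function on your executor
}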
If you want to ensure that the Runnable has been successfully executed before any other stage, you can use
CompletableFuture<T> result = CompletableFuture.supplyAsync(task, executor);
CompletableFuture<Void> thenRun = result.thenRun(() -> {
    //...
});
result = result.thenCombine(thenRun, (t, v) -> t);
An alternative would be
result = result.whenComplete((value, throwable) -> {
    //...
});
but here, the code will be always executed even in the exceptional case (which includes cancellation). You would have to check whether throwable is null, if you want to execute the code only in the successful case.
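A minimal sketch of that check:
result = result.whenComplete((value, throwable) -> {
    if (throwable == null) {
        // runs only if the stage completed normally
        //...
    }
});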
If you want to ensure that the runnable runs after both actions, the simplest strategy would be to chain it after the if statement, when the final completion stage is defined:
if (x == 1) {
    result = result.thenApplyAsync(t -> {
        return null;
    });
}
result.thenRun(() -> {
    //...
});
If that is not an option, you would need an incomplete future which you can complete on either result:
CompletableFuture<T> result = CompletableFuture.supplyAsync(task, executor);
//...
CompletableFuture<T> finalStage = new CompletableFuture<>();
finalStage.thenRun(() -> {
    //...
});
// ...
if (x == 1) {
    result = result.thenApplyAsync(t -> {
        return null;
    });
}
result.whenComplete((v, t) -> {
    if (t != null) finalStage.completeExceptionally(t);
    else finalStage.complete(v);
});
The finalStage initially has no defined way of completion, but we can still chain dependent actions. Once we know the actual future, we can chain a handler which will complete our finalStage with whatever result we have.
As a final note, the methods without …Async, like thenRun, provide the least control over the evaluation thread. They may get executed in whatever thread completes the future (one of executor's threads in your example), but also directly in the thread calling thenRun, and, even less intuitively, in your original example the runnable may even get executed during the unrelated thenApplyAsync invocation.
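If the executing thread matters to you, the …Async variants with an explicit executor give you back that control; a minimal sketch reusing your executor:
result.thenRunAsync(() -> {
    // guaranteed to run on one of executor's threads,
    // never inline in the completing or registering thread
}, executor);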
Is it possible to implement a Collector that stops processing of the stream as soon as an answer is available?
For example, if the Collector is computing an average, and one of the values is NaN, I know the answer is going to be NaN without seeing any more values, so further computation is pointless.
Thanks for the responses. The comments pointed the way to a solution, which I will describe here. It's very much inspired by StreamEx, but adapted to my particular situation.
Firstly, I define an implementation of Stream called XdmStream which, in general, delegates all methods to an underlying Stream that it wraps.
This immediately gives me the opportunity to define new methods, so for example my users can do stream.last() instead of stream.reduce((first,second)->second), which is a useful convenience.
As an example of a short-circuiting method I have implemented XdmStream.untilFirst(Predicate) as follows (base is the wrapped Stream). The idea of this method is to return a stream that delivers the same results as the original stream, except that when a predicate is satisfied, no more results are delivered.
public XdmStream<T> untilFirst(Predicate<? super XdmItem> predicate) {
    Stream<T> stoppable = base.peek(item -> {
        if (predicate.test(item)) {
            base.close();
        }
    });
    return new XdmStream<T>(stoppable);
}
When I first create the base Stream I call its onClose() method so that a call on close() triggers the supplier of data to stop supplying data.
The close() mechanism doesn't seem particularly well documented (it relies on the concept of a "stream pipeline" and it's not entirely clear when a new stream returned by some method is part of the same pipeline as the original stream) - but it's working for me. I guess I should probably ensure that this is only an optimization, so that the results will still be correct even if the flow of data isn't immediately turned off (e.g. if there is any buffering in the stream).
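For illustration only, a simplified sketch of that wiring (not my actual code; it assumes the data source checks a stop flag before producing each item, and takeWhile needs Java 9+):
AtomicBoolean stopped = new AtomicBoolean();
AtomicInteger counter = new AtomicInteger();

// Illustrative data source: delivers increasing numbers until told to stop.
Stream<Integer> base = Stream.generate(counter::incrementAndGet)
        .takeWhile(i -> !stopped.get())
        .onClose(() -> stopped.set(true)); // base.close() stops the supply of data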
In addition to Federico's comment, it is possible to emulate a short-circuiting Collector by ceasing accumulation once a certain condition has been met. Though, this method will only be beneficial if accumulation is expensive. Here's an example, but keep in mind that there are flaws with this implementation:
public class AveragingCollector implements Collector<Double, double[], Double> {
    private final AtomicBoolean hasFoundNaN = new AtomicBoolean();

    @Override
    public Supplier<double[]> supplier() {
        return () -> new double[2];
    }

    @Override
    public BiConsumer<double[], Double> accumulator() {
        return (a, b) -> {
            if (hasFoundNaN.get()) {
                return;
            }
            if (b.equals(Double.NaN)) {
                hasFoundNaN.set(true);
                return;
            }
            a[0] += b;
            a[1]++;
        };
    }

    @Override
    public BinaryOperator<double[]> combiner() {
        return (a, b) -> {
            a[0] += b[0];
            a[1] += b[1];
            return a;
        };
    }

    @Override
    public Function<double[], Double> finisher() {
        // once a NaN has been seen, the overall result is NaN regardless of what was accumulated
        return average -> hasFoundNaN.get() ? Double.NaN : average[0] / average[1];
    }

    @Override
    public Set<Characteristics> characteristics() {
        return new HashSet<>();
    }
}
The following use-case returns Double.NaN, as expected:
public static void main(String[] args) {
    Double result = DoubleStream.of(1, 2, 3, 4, 5, 6, 7, Double.NaN)
            .boxed()
            .collect(new AveragingCollector());
    System.out.println(result); // NaN
}
Instead of using a Collector, you could use Stream.allMatch(..) to terminate the Stream early and use the util classes like LongSummaryStatistics directly. If all values (and at least one) were present, you return them, e.g.:
Optional<LongSummaryStatistics> toLongStats(Stream<OptionalLong> stream) {
    LongSummaryStatistics stat = new LongSummaryStatistics();
    boolean allPresent = stream.allMatch(opt -> {
        if (opt.isEmpty()) return false;
        stat.accept(opt.getAsLong());
        return true;
    });
    return allPresent && stat.getCount() > 0 ? Optional.of(stat) : Optional.empty();
}
Instead of a Stream<OptionalLong> you might use a DoubleStream and check for your NaN case.
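A rough sketch of that variant, assuming NaN is the only condition that should stop the computation:
Optional<DoubleSummaryStatistics> toDoubleStats(DoubleStream stream) {
    DoubleSummaryStatistics stat = new DoubleSummaryStatistics();
    boolean noNaN = stream.allMatch(d -> {
        if (Double.isNaN(d)) return false; // short-circuits the stream
        stat.accept(d);
        return true;
    });
    return noNaN && stat.getCount() > 0 ? Optional.of(stat) : Optional.empty();
}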
For the case of NaN, it might be acceptable to consider this an Exceptional outcome, and so throw a custom NaNAverageException, short circuiting the collection operation. Normally using exceptions for normal control flow is a bad practice, however, it may be justified in this case.
Stream<String> s = Stream.of("1", "2", "ABC", "3");
try {
    double result = s.collect(Collectors.averagingInt(n -> Integer.parseInt(n)));
    System.err.println("Average: " + result);
} catch (NumberFormatException e) {
    // the exception is thrown when "ABC" is encountered, so the collector never processes "3"
    e.printStackTrace();
}
In my app I have 3 future calls that are done in parallel. When the response for one of them is received, I make another 3 requests, all of which should finish before proceeding with code execution, or more precisely, before completing Spring's DeferredResult.
After a while, I realized that the page is sometimes rendered before the latter 3 requests are done. Original source code (logic omitted for simplicity):
public DeferredResult<String> someControllerMethod() {
    DeferredResult<String> result = new DeferredResult<>();
    CompletableFuture.allOf(
            future1(),
            future2(),
            future3()
        )
        .whenComplete((aVoid, throwable) -> result.setResult("something"));
    return result;
}

public CompletableFuture<?> future3() {
    return someService.asyncCall()
        .thenApplyAsync(response -> {
            ....
            return CompletableFuture.allOf(
                future4(),
                future5(),
                future6()
            );
        });
}
With thenApplyAsync, the DeferredResult is sometimes completed before the actual future, while changing to thenComposeAsync seems to solve the issue. Could someone explain to me why? Or is it a bug somewhere in my code, and it should not behave this way?
thenApply[Async] accepts a function that evaluates to an arbitrary value. Once the value has been returned, the future will be completed with that value. When the function, like in your code, returns another future, this doesn't add any special meaning: the future becomes the result value, whether completed or not, just like any other object.
In fact, your
public CompletableFuture<Void> future3() {
    return someService.asyncCall()
        .thenApplyAsync(response -> {
            ....
            return CompletableFuture.allOf(
                future4(),
                future5(),
                future6()
            );
        });
}
method does not even compile, as the result is CompletableFuture<CompletableFuture<Void>>, a future whose result value is another future. The only way not to spot the error is to use a broader type, e.g. CompletableFuture<Object> or CompletableFuture<?>, as the return type of future3().
In contrast, thenCompose[Async] expects a function that evaluates to another future, producing exactly the outcome you expect. That’s the fundamental difference between “apply” and “compose”. If you keep the specific CompletableFuture<Void> return type for future3(), the compiler already guides you to use “compose”, as only that will be accepted.
public CompletableFuture<Void> future3() {
    return someService.asyncCall()
        .thenComposeAsync(response -> {
            ....
            return CompletableFuture.allOf(
                future4(),
                future5(),
                future6()
            );
        });
}
1) How can I use a Supplier (supplier) to create a sized stream of N values in parallel, while ensuring that no more than N calls are made to the supplier? I need this because I have a supplier with a costly supplier.get() operation.
2) The 'obvious' answer to my question, Stream.generate(supplier).limit(N), does not work and often results in more than N calls being made to the supplier. Why is this?
As 'proof' of the fact that Stream.generate(supplier).limit(N) results in more than N calls to supplier.get(), consider the following code:
public class MWE {
    static final int N_ELEMENTS = 100000;

    static Supplier<IntSupplier> mySupplier = () -> new IntSupplier() {
        AtomicInteger ai = new AtomicInteger(-1);

        @Override
        public int getAsInt() {
            return ai.incrementAndGet();
        }
    };

    public static void main(String[] args) {
        int[] a = IntStream.generate(mySupplier.get()).limit(N_ELEMENTS).toArray();
        int[] b = IntStream.generate(mySupplier.get()).parallel().limit(N_ELEMENTS).toArray();
    }
}
a is equal to [0, 1, ..., N_ELEMENTS-1] as expected, but contrary to what you might expect, b does not contain the same elements as a. Instead, b often contains elements that are greater than or equal to N_ELEMENTS, which indicates that more than N_ELEMENTS calls were made to the supplier.
Another illustration would be that Stream.generate(new Random(0)::nextDouble).limit(5) does not always generate the same set of numbers.
The Stream API does not guarantee that IntStream.generate() will call the generator the specified number of times; the stream it produces is also unordered.
If you actually need a parallel stream of increasing numbers, it's much better to use IntStream.range(0, N_ELEMENTS).parallel(). This not only ensures that you will actually get all the numbers from 0 to N_ELEMENTS-1, but also greatly reduces contention and guarantees order. If you need to generate something more complex, consider using a custom source by defining your own Spliterator class.
Note that the proposed IntStream.iterate solution may not parallelize well, as it is a sequential-by-nature source.
Calling .limit() is not guaranteed to result in a stream of the first N elements generated by the supplier because Stream.generate() creates an unordered stream, which leaves limit() free to decide on what 'part' of the stream to keep. Actually, it is not even semantically sound to refer to "the first N elements" or "(the first) part of the stream", because the stream is unordered. This behavior is clearly laid out in the API documentation; many thanks to everyone who pointed this out to me!
Since asking this question, I have come up with two solutions to my own question. My thanks go to Tagir who set me off in the right direction.
Solution 1: Misusing IntStream.range()
A simple and fairly efficient way of creating an unordered, sized, parallel stream backed by a supplier that makes no more calls to the supplier than is absolutely necessary is to (mis)use IntStream.range() like this:
IntStream.range(0, N_ELEMENTS).parallel().mapToObj(i -> supplier.get())
Basically, we are using IntStream.range() only to create a sized stream that can be processed in parallel.
Solution 2: Custom spliterator
Because we never actually use the integers inside of the stream created by IntStream.range(), it seems like we can do slightly better by creating a custom Spliterator:
final class SizedSuppliedSpliterator<T> implements Spliterator<T> {
    private int remaining;
    private final Supplier<T> supplier;

    private SizedSuppliedSpliterator(Supplier<T> supplier, int remaining) {
        this.remaining = remaining;
        this.supplier = supplier;
    }

    static <T> SizedSuppliedSpliterator<T> of(Supplier<T> supplier, int limit) {
        return new SizedSuppliedSpliterator<>(supplier, limit);
    }

    @Override
    public boolean tryAdvance(final Consumer<? super T> consumer) {
        Objects.requireNonNull(consumer);
        if (remaining > 0) {
            remaining--;
            final T supplied = supplier.get();
            consumer.accept(supplied);
            return true;
        }
        return false;
    }

    @Override
    public void forEachRemaining(final Consumer<? super T> consumer) {
        while (remaining > 0) {
            consumer.accept(supplier.get());
            remaining--;
        }
    }

    @Override
    public SizedSuppliedSpliterator<T> trySplit() {
        int split = remaining / 2;
        remaining -= split;
        return new SizedSuppliedSpliterator<>(supplier, split);
    }

    @Override
    public long estimateSize() {
        return remaining;
    }

    @Override
    public int characteristics() {
        return SIZED | SUBSIZED | IMMUTABLE;
    }
}
We can use this spliterator to create the stream as follows:
StreamSupport.stream(SizedSuppliedSpliterator.of(supplier, N_ELEMENTS), true)
Of course, computing a couple of integers is hardly expensive, and I have not been able to notice or even measure any improvement in performance over solution 1.
I don't see an obvious way to handle an exception with an asynchronous result.
For example, if I want to retry an async operation, I would expect something like this:
CompletionStage<String> cf = askPong("cause error").handleAsync((x, t) -> {
    if (t != null) {
        return askPong("Ping");
    } else {
        return x;
    }
});
Where askPong asks an actor:
public CompletionStage<String> askPong(String message) {
    Future sFuture = ask(actorRef, message, 1000);
    final CompletionStage<String> cs = toJava(sFuture);
    return cs;
}
However, handleAsync doesn't do what you might think it does - it runs the callbacks asynchronously on another thread. Returning a CompletionStage here is not correct.
Jeopardy question of the day: thenApply is to thenCompose as exceptionally is to what?
Is this what you are looking for?
askPong("cause error")
.handle( (pong, ex) -> ex == null
? CompletableFuture.completedFuture(pong)
: askPong("Ping")
).thenCompose(x -> x);
Also, do not use the ...Async methods unless you intend for the body of the supplied function to be executed asynchronously. So when you do something like
.handleAsync((x, t) -> {
    if (t != null) {
        return askPong("Ping");
    } else {
        return x;
    }
})
You are asking for the if-then-else to be run in a separate thread. Since askPong returns a CompletableFuture, there's probably no reason to run it asynchronously.
Jeopardy question of the day: thenApply is to thenCompose as exceptionally is to what?
I know this question was originally about Java 8, but since Java 12 the answer would be exceptionallyCompose:
exceptionallyCompose[Async](Function<Throwable,? extends CompletionStage<T>> fn [, Executor executor])
Returns a new CompletionStage that, when this stage completes exceptionally, is composed using the results of the supplied function applied to this stage's exception.
As the JavaDoc indicates, the default implementation is:
return handle((r, ex) -> (ex == null)
        ? this
        : fn.apply(ex))
    .thenCompose(Function.identity());
That is, using handle() to call the fallback, and thenCompose() to unwrap the resulting nested CompletableFuture<CompletableFuture<T>> – i.e., what you would have done in previous versions of Java (like in Misha’s answer), except you would have to replace this with completedFuture(r).
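Applied to the original question, the usage would be roughly as follows (a sketch, assuming Java 12+ and the askPong method from the question):
CompletionStage<String> cf = askPong("cause error")
        .exceptionallyCompose(ex -> askPong("Ping"));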
After a lot of frustration in trying to figure out the proper way of doing Scala's recoverWith in Java 8, I ended up just writing my own. I still don't know if this is the best approach, but I created something like:
public RecoveryChainAsync<T> recoverWith(Function<Throwable, CompletableFuture<T>> fn);
With repeated calls to recoverWith, I queue up the functions inside the recovery chain and implement the recovery flow myself with "handle". RecoveryChainAsync.getCompletableFuture() then returns a representative CompletableFuture for the entire chain. Hope this helps.
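The core idea, reduced to a single recovery step, can be sketched with handle plus thenCompose (this is only a sketch, not the actual RecoveryChainAsync code):
// Minimal sketch of a single-step recoverWith helper.
public static <T> CompletableFuture<T> recoverWith(
        CompletableFuture<T> future,
        Function<Throwable, CompletableFuture<T>> fn) {
    return future
            .handle((value, ex) -> ex == null
                    ? CompletableFuture.completedFuture(value)
                    : fn.apply(ex))
            .thenCompose(Function.identity());
}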