CompletableFuture showing inconsistent threading behavior - java

CompletableFuture's documentation specifies the following behavior for asynchronous execution:
All async methods without an explicit Executor argument are performed using the ForkJoinPool#commonPool() (unless it does not support a parallelism level of at least two, in which case, a new Thread is created to run each task). To simplify monitoring, debugging, and tracking, all generated asynchronous tasks are instances of the marker interface AsynchronousCompletionTask.
However, the behavior of synchronous (or at least the non async) methods remain unclear. In most situations, the code executes with the original thread, like so:
Thread thread = Thread.currentThread();
CompletableFuture.runAsync(() -> {
// Should run on the common pool - working as expected
assert thread != Thread.currentThread();
}).thenRun(() -> {
// Returns to running on the thread that launched.. sometimes?
assert thread == Thread.currentThread();
}).join();
However, repetition of this test yields inconsistent results, as the second block of code sometimes uses the common pool instead. What is the intended behavior in this situation?

After looking at the OpenJDK implementation of CompletableFuture here, it smells like you're encountering a sort of race condition. Suppose we're doing something along the lines of runAsync(a).thenRun(b). If a finishes execution on the common pool thread before thenRun(b) gets called on the current thread, then b is run immediately on the current thread once thenRun(b) finally gets called. On the other hand, if thenRun(b) is called before a finishes on the other thread, then b is run on the same thread as a once a finally finishes.
It's sort of like this:
class CompletableFuture:
function runAsync(a):
function c():
run a
for each b in the handler stack:
run b
run c on a different thread
return future representing c
function thenRun(b):
if c is done:
run b
else:
put b in c's handler stack
Clearly, b is run in c's thread if thenRun is called before a finishes. Otherwise, b is run in the current thread.
If you want more consistent behaviour in terms of which thread runs what, you should try using CompletableFuture#thenRunAsync, which guarantees the handler is executed in some particular executor pool.

Related

Thread vs Runnable vs CompletableFuture in Java multi threading

I am trying to implement multi threading in my Spring Boot app. I am just beginner on multi threading in Java and after making some search and reading articles on various pages, I need to be clarified about the following points. So;
As far as I see, I can use Thread, Runnable or CompletableFuture in order to implement multi threading in a Java app. CompletableFuture seems a newer and cleaner way, but Thread may have more advantages. So, should I stick to CompletableFuture or use all of them based on the scenario?
Basically I want to send 2 concurrent requests to the same service method by using CompletableFuture:
CompletableFuture<Integer> future1 = fetchAsync(1);
CompletableFuture<Integer> future2 = fetchAsync(2);
Integer result1 = future1.get();
Integer result2 = future2.get();
How can I send these request concurrently and then return result based on the following condition:
if the first result is not null, return result and stop process
if the first result is null, return the second result and stop process
How can I do this? Should I use CompletableFuture.anyOf() for that?
CompletableFuture is a tool which settles atop the Executor/ExecutorService abstraction, which has implementations dealing with Runnable and Thread. You usually have no reason to deal with Thread creation manually. If you find CompletableFuture unsuitable for a particular task you may try the other tools/abstractions first.
If you want to proceed with the first (in the sense of faster) non‑null result, you can use something like
CompletableFuture<Integer> future1 = fetchAsync(1);
CompletableFuture<Integer> future2 = fetchAsync(2);
Integer result = CompletableFuture.anyOf(future1, future2)
.thenCompose(i -> i != null?
CompletableFuture.completedFuture((Integer)i):
future1.thenCombine(future2, (a, b) -> a != null? a: b))
.join();
anyOf allows you to proceed with the first result, but regardless of its actual value. So to use the first non‑null result we need to chain another operation which will resort to thenCombine if the first result is null. This will only complete when both futures have been completed but at this point we already know that the faster result was null and the second is needed. The overall code will still result in null when both results were null.
Note that anyOf accepts arbitrarily typed futures and results in a CompletableFuture<Object>. Hence, i is of type Object and a type cast needed. An alternative with full type safety would be
CompletableFuture<Integer> future1 = fetchAsync(1);
CompletableFuture<Integer> future2 = fetchAsync(2);
Integer result = future1.applyToEither(future2, Function.identity())
.thenCompose(i -> i != null?
CompletableFuture.completedFuture(i):
future1.thenCombine(future2, (a, b) -> a != null? a: b))
.join();
which requires us to specify a function which we do not need here, so this code resorts to Function.identity(). You could also just use i -> i to denote an identity function; that’s mostly a stylistic choice.
Note that most complications stem from the design that tries to avoid blocking threads by always chaining a dependent operation to be executed when the previous stage has been completed. The examples above follow this principle as the final join() call is only for demonstration purposes; you can easily remove it and return the future, if the caller expects a future rather than being blocked.
If you are going to perform the final blocking join() anyway, because you need the result value immediately, you can also use
Integer result = future1.applyToEither(future2, Function.identity()).join();
if(result == null) {
Integer a = future1.join(), b = future2.join();
result = a != null? a: b;
}
which might be easier to read and debug. This ease of use is the motivation behind the upcoming Virtual Threads feature. When an action is running on a virtual thread, you don’t need to avoid blocking calls. So with this feature, if you still need to return a CompletableFuture without blocking the your caller thread, you can use
CompletableFuture<Integer> resultFuture = future1.applyToEitherAsync(future2, r-> {
if(r != null) return r;
Integer a = future1.join(), b = future2.join();
return a != null? a: b;
}, Executors.newVirtualThreadPerTaskExecutor());
By requesting a virtual thread for the dependent action, we can use blocking join() calls within the function without hesitation which makes the code simpler, in fact, similar to the previous non-asynchronous variant.
In all cases, the code will provide the faster result if it is non‑null, without waiting for the completion of the second future. But it does not stop the evaluation of the unnecessary future. Stopping an already ongoing evaluation is not supported by CompletableFuture at all. You can call cancel(…) on it, but this will will only set the completion state (result) of the future to “exceptionally completed with a CancellationException”
So whether you call cancel or not, the already ongoing evaluation will continue in the background and only its final result will be ignored.
This might be acceptable for some operations. If not, you would have to change the implementation of fetchAsync significantly. You could use an ExecutorService directly and submit an operation to get a Future which support cancellation with interruption.
But it also requires the operation’s code to be sensitive to interruption, to have an actual effect:
When calling blocking operations, use those methods that may abort and throw an InterruptedException and do not catch-and-continue.
When performing a long running computational intense task, poll Thread.interrupted() occasionally and bail out when true.
So, should I stick to CompletableFuture or use all of them based on the scenario?
Use the one that is most appropriate to the scenario. Obviously, we can't be more specific unless you explain the scenario.
There are various factors to take into account. For example:
Thread + Runnable doesn't have a natural way to wait for / return a result. (But it is not hard to implement.)
Repeatedly creating bare Thread objects is inefficient because thread creation is expensive. Thread pooling is better but you shouldn't implement a thread pool yourself.
Solutions that use an ExecutorService take care of thread pooling and allow you to use Callable and return a Future. But for a once-off async computation this might be over-kill.
Solutions that involve ComputableFuture allow you to compose and combine asynchronous tasks. But if you don't need to do that, using ComputableFuture may be overkill.
As you can see ... there is no single correct answer for all scenarios.
Should I use CompletableFuture.anyOf() for that?
No. The logic of your example requires that you must have the result for future1 to determine whether or not you need the result for future2. So the solution is something like this:
Integer i1 = future1.get();
if (i1 == null) {
return future2.get();
} else {
future2.cancel(true);
return i1;
}
Note that the above works with plain Future as well as CompletableFuture. If you were using CompletableFuture because you thought that anyOf was the solution, then you didn't need to do that. Calling ExecutorService.submit(Callable) will give you a Future ...
It will be more complicated if you need to deal with exceptions thrown by the tasks and/or timeouts. In the former case, you need to catch ExecutionException and the extract its cause exception to get the exception thrown by the task.
There is also the caveat that the second computation may ignore the interrupt and continue on regardless.
So, should I stick to CompletableFuture or use all of them based on the scenario?
Well, they all have different purposes and you'll probably use them all either directly or indirectly:
Thread represents a thread and while it can be subclassed in most cases you shouldn't do so. Many frameworks maintain thread pools, i.e. they spin up several threads that then can take tasks from a task pool. This is done to reduce the overhead that thread creation brings as well as to reduce the amount of contention (many threads and few cpu cores mean a lot of context switches so you'd normally try to have fewer threads that just work on one task after another).
Runnable was one of the first interfaces to represent tasks that a thread can work on. Another is Callable which has 2 major differences to Runnable: 1) it can return a value while Runnable has void and 2) it can throw checked exceptions. Depending on your case you can use either but since you want to get a result, you'll more likely use Callable.
CompletableFuture and Future are basically a way for cross-thread communication, i.e. you can use those to check whether the task is done already (non-blocking) or to wait for completion (blocking).
So in many cases it's like this:
you submit a Runnable or Callable to some executor
the executor maintains a pool of Threads to execute the tasks you submitted
the executor returns a Future (one implementation being CompletableFuture) for you to check on the status and results of the task without having to synchronize yourself.
However, there may be other cases where you directly provide a Runnable to a Thread or even subclass Thread but nowadays those are far less common.
How can I do this? Should I use CompletableFuture.anyOf() for that?
CompletableFuture.anyOf() wouldn't work since you'd not be able to determine which of the 2 you'd pass in was successful first.
Since you're interested in result1 first (which btw can't be null if the type is int) you basically want to do the following:
Integer result1 = future1.get(); //block until result 1 is ready
if( result1 != null ) {
return result1;
} else {
return future2.get(); //result1 was null so wait for result2 and return it
}
You'd not want to call future2.get() right away since that would block until both are done but instead you're first interested in future1 only so if that produces a result you wouldn't have for future2 to ever finish.
Note that the code above doesn't handle exceptional completions and there's also probably a more elegant way of composing the futures like you want but I don't remember it atm (if I do I'll add it as an edit).
Another note: you could call future2.cancel() if result1 isn't null but I'd suggest you first check whether cancelling would even work (e.g. you'd have a hard time really cancelling a webservice request) and what the results of interrupting the service would be. If it's ok to just let it complete and ignore the result that's probably the easier way to go.

CompletableFuture Chain uncompleted -> Garbage Collector?

if i have one (or more) CompletableFuture not started yet, and on that method(s) a few thenApplyAsync(), anyOf()-methods.
Will the Garbage Collector remove all of that?
If there is a join()/get() at the end of that chain -> same question: Will the Garbage Collector remove all of that?
Maybe we need more information about that context of the join().
That join is in a Thread the last command, and there are no side-effects.
So is in that case the Thread still active? - Java Thread Garbage collected or not
Anyway is that a good idea, to push a poisen-pill down the chain, if im sure (maybe in a try-catch-finally), that i will not start that Completable-chain, or is that not necessary?
The question is because of something like that? (https://bugs.openjdk.java.net/browse/JDK-8160402)
Some related question to it: When is the Thread-Executor signaled to shedule a new task? I think, when the CompletableFuture goes to the next chained CompletableFuture?. So i must only carry on memory-leaks and not thread-leaks?
Edit: What i mean with a not started CompletableFuture?
i mean a var notStartedCompletableFuture = new CompletableFuture<Object>(); instead of a CompletableFuture.supplyAsync(....);
I can start the notStartedCompletableFuture in that way:
notStartedCompletableFuture.complete(new Object); later in the program-flow or from another thread.
Edit 2: A more detailed Example:
AtomicReference<CompletableFuture<Object>> outsideReference=new AtomicReference<>();
final var myOuterThread = new Thread(() ->
{
final var A = new CompletableFuture<Object>();
final var B = new CompletableFuture<Object>();
final var C = A.thenApplyAsync((element) -> new Object());
final var D = CompletableFuture.anyOf(A, C);
A.complete(new Object());
// throw new RuntimeException();
//outsideReference.set(B);
----->B.complete(new Object());<------ Edit: this shouldn't be here, i remove it in my next iteration
D.join();
});
myOuterThread.start();
//myOutherThread variable is nowhere else referenced, it's sayed so a local variable, to point on my text on it^^
So in the normal case here in my example i don't have a outside
reference. The CompletableFutures in the thread have never a chance
getting completed. Normally the GC can safely erase both the thread
and and the content in there, the CompetableFutures. But i don't
think so, that this would happen?
If I abbord this by throwing an exception -> the join() is never
reached, then i think all would be erased by the GC?
If I give one of the CompletableFutures to the outside by that AtomicReference, there then could be an chance to unblock the join(), There should be no GC here, until the unblock happens. BUT! the waiting myOuterThread on that join() doesen't have to to there anything more after the join(). So it could be an optimization erasing that Thread, before someone from outside completes B. But I think this would be also not happen?!
One more question here, how I can proof that behavior, if threads are blocked by waiting on a join() or are returned to a Thread-Pool?, where the Thread also "blocks"?
You seem to be struggling with different ways that CompletableFuture might leak, depending on how you created it. But it doesn't matter how, where, when or why it was created. The only thing that matters is whether or not it is still reachable.
Will the Garbage Collector remove all of that?
There are two places where we would expect there to be references to a CompletableFuture:
In the Runnable (or whatever) that would complete the future.
In any other code that would (at some point) attempt to get the eventual value from the future.
If you have a call thenApplyAsync() or anyOf() then the reference Runnable is in the arguments to that call. If the call can still happen, then the reference to the Runnable must still be reachable.
In your example:
var notStartedCompletableFuture = new CompletableFuture<Object>();
if the variable notStartedCompletableFuture is still accessible by some code that is still executing, then that CompletableFuture is reachable and won't be garbage collected.
On the other hand, if notStartedCompletableFuture is no longer accessible, and if the future is no longer reachable by some other path, then it won't be reachable at all ... and will be a candidate for garbage collection.
If there is a join() / get() at the end of that chain -> same question: Will the Garbage Collector remove all of that?
That makes no difference. It is all based on reachability. (The only wrinkle is that a thread that is currently alive1 is always reachable, irrespective of any other references to its Thread object. The same applies to its Runnable, and other objects reachable from the Runnable.)
But it is worth noting that if you call join() or get() on a thread / future that never terminates / completes, you will block the current thread, potentially for ever. And that is as bad as a thread leak.
1 - A thread is "alive" from when it is started to when it terminates.
When is the Thread-Executor signaled to schedule a new task?
It depends what you mean by "schedule". If you mean, when is the task submitted, the answer is when submit is called. If you mean, when is it actually run ... well it goes into the queue, and it runs when it gets to the head of the queue and a worker thread is free to execute it.
In the case of thenApplyAsync() and all_of(), the tasks are submitted (i.e. the submit(...) call occurs) when the respective method call occurs. So for example if thenApplyAsync is being called on the result of a previous call, then that call must return first.
This is all a consequence of the basic properties of Java expression evaluation ... applied to the expression that you are using to construct the chain of stages.
In general you don't need try / finally or try with resources to clean up potential memory leaks.
All you need to do is to make sure that you don't keep references to the various futures, stages, etc in variables, data structures, etc that will remain accessible / reachable beyond the lifetime of your computation. If you do that ... those references are liable to be the source of the leaks.
Thread leaks should not be your concern. If your code is not creating threads explicitly, they are being managed by the executor service / pool.
If a thread calls join() or get() on a CompletableFuture that will never be completed, it will remain blocked forever (except if it gets interrupted), holding a reference to that future.
If that future is the root of a chain of descendant futures (+ tasks and executors), it will also keep a reference to those, which will also remain in memory (as well as all transitively referenced objects).
A future does not normally hold references to its “parent(s)” when created through the then*() methods, so they should normally be garbage collected if there are no other references – but pay attention to those, e.g. local variables in the calling thread, reference to a List<CompletableFuture<?>> used in a lambda after allOf() etc.
This Answer only addresses with the 3 followup questions in your "Edit 2".
So in the normal case here in my example i don't have a outside
reference.
I assume that you are referring to the version with the commented out statements.
The CompletableFutures in the thread have never a chance getting completed.
Incorrect. First, A is completed here:
A.complete(new Object());
Next B is completed here:
B.complete(new Object());
Then you call D.join(). Since D is an anyOf stage, this completes when either of A and C completes. A has already completed, so D.join() may not need to wait for C to complete. But since C applies the function asynchronously, it could complete immediately too.
Normally the GC can safely erase both the thread and and the content in there, the CompletableFutures. But I don't think so, that this would happen?
When D.join() returns, the thread terminates. At that point, its local local variables (A, B, C, and D) will be unreachable.
If I abort this by throwing an exception -> the join() is never reached, then i think all would be erased by the GC?
A completes as before, but B, C and D don't.
However, the exception terminates the thread, so the local variables A, B, C, and D then become unreachable.
If I give one of the CompletableFutures to the outside by that AtomicReference, there then could be an chance to unblock the join().
Three points:
The AtomicReference is assigned B so the join() on D is not affected.
As we saw above, it doesn't matter if that a hypothetical join() on outsideReference.value() happens or not for the variables A, B, C, and D. Those variables become unreachable, whichever way the thread terminates.
However, you have now assigned a reference to one of the CompletableFuture objects to a variable which has a different lifetime to the thread. That may mean that that CompletableFuture object stays reachable after the thread has terminated.

FutureTask get() method may distable LockSupport.park

i found FutureTask get() method may distable LockSupport.park in oracle jdk8
my code is :
ExecutorService service = Executors.newFixedThreadPool(1, (r) -> {
Thread thread = new Thread(r);
thread.setDaemon(true);
return thread;
});
Future<String> submit = service.submit(() -> {
TimeUnit.SECONDS.sleep(5);
System.out.println("exec future task..");
return "a";
});
System.out.println(submit.get());
LockSupport.park(new Object());
System.out.println("error,unPark");
}
i thoughtSystem.out.println("error,unPark");would not execute;but it did
exec future task..
a
error,unPark
for simulating thread schedule,I break point on FutureTask line 418
queued = UNSAFE.compareAndSwapObject(this, waitersOffset, q.next = waiters, q);
step over fastly before print exec future task..
after print exec future task.. for a while , continue to execute ..
then skip LockSupport.park(new Object()); and print error,unPark
i think
1.FutureTask add getting thread(main) in waiters;
2.execution thread(the thread pool) finish the task,and unpark all waiters;
3.getting thread(main) read state and found task hash finished, then return result and skip executing FutureTask locksupport.park()
4.because unpark method was executed in futuretask,then can skip LockSupport.park(new Object());and print error,unPark
is it a bug?
The documentation does says:
Disables the current thread for thread scheduling purposes unless the permit is available.
If the permit is available then it is consumed and the call returns immediately; otherwise the current thread becomes disabled for thread scheduling purposes and lies dormant until one of three things happens:
Some other thread invokes unpark with the current thread as the target; or
Some other thread interrupts the current thread; or
The call spuriously (that is, for no reason) returns.
This method does not report which of these caused the method to return. Callers should re-check the conditions which caused the thread to park in the first place. Callers may also determine, for example, the interrupt status of the thread upon return.
The third bullet alone would be enough to tell you that you can’t assume that returning from park implies that the condition you’re waiting for has been fulfilled.
Generally, this tool is meant for code that can not assume atomicity for the operation that checks/induces the condition and the park/unpark call.
As the documentation concludes you have to re-check the condition after returning from park. Special emphasis on “re-”; since this implies that you might find out that the return from park was not due to your condition being fulfilled, you might have consumed an unpark that was not for you. This in turn implies, that you also must test the condition before calling park and not call it when the condition has been fulfilled already. But if you do this, you might skip a park when unpark has been called, so some subsequent park call will return immediately.
In short, even without “spurious returns”, you always have to test your specific condition before calling park and re-check the condition after returning from park, in a loop if you want to wait for the condition.
Note that most of it also applies to using wait/notify in a synchronized block or await/signal when owning a Lock. Both should be used with a pre-test loop for reliable results.

Is the writer's reason correct for using thenCompose and not thenComposeAsync

This question is different from this one Difference between Java8 thenCompose and thenComposeAsync because I want to know what is the writer's reason for using thenCompose and not thenComposeAsync.
I was reading Modern Java in action and I came across this part of code on page 405:
public static List<String> findPrices(String product) {
ExecutorService executor = Executors.newFixedThreadPool(10);
List<Shop> shops = Arrays.asList(new Shop(), new Shop());
List<CompletableFuture<String>> priceFutures = shops.stream()
.map(shop -> CompletableFuture.supplyAsync(() -> shop.getPrice(product), executor))
.map(future -> future.thenApply(Quote::parse))
.map(future -> future.thenCompose(quote ->
CompletableFuture.supplyAsync(() -> Discount.applyDiscount(quote), executor)))
.collect(toList());
return priceFutures.stream()
.map(CompletableFuture::join).collect(toList());
}
Everything is Ok and I can understand this code but here is the writer's reason for why he didn't use thenComposeAsync on page 408 which I can't understand:
In general, a method without the Async suffix in its name executes
its task in the same threads the previous task, whereas a method
terminating with Async always submits the succeeding task to the
thread pool, so each of the tasks can be handled by a
different thread. In this case, the result of the second
CompletableFuture depends on the first,so it makes no difference to
the final result or to its broad-brush timing whether you compose the
two CompletableFutures with one or the other variant of this method
In my understanding with the thenCompose( and thenComposeAsync) signatures as below:
public <U> CompletableFuture<U> thenCompose(
Function<? super T, ? extends CompletionStage<U>> fn) {
return uniComposeStage(null, fn);
}
public <U> CompletableFuture<U> thenComposeAsync(
Function<? super T, ? extends CompletionStage<U>> fn) {
return uniComposeStage(asyncPool, fn);
}
The result of the second CompletableFuture can depends on the previous CompletableFuture in many situations(or rather I can say almost always), should we use thenCompose and not thenComposeAsync in those cases?
What if we have blocking code in the second CompletableFuture?
This is a similar example which was given by person who answered similar question here: Difference between Java8 thenCompose and thenComposeAsync
public CompletableFuture<String> requestData(Quote quote) {
Request request = blockingRequestForQuote(quote);
return CompletableFuture.supplyAsync(() -> sendRequest(request));
}
To my mind in this situation using thenComposeAsync can make our program faster because here blockingRequestForQuote can be run on different thread. But based on the writer's opinion we should not use thenComposeAsync because it depends on the first CompletableFuture result(that is Quote).
My question is:
Is the writer's idea correct when he said :
In this case, the result of the second
CompletableFuture depends on the first,so it makes no difference to
the final result or to its broad-brush timing whether you compose the
two CompletableFutures with one or the other variant of this method
TL;DR It is correct to use thenCompose instead of thenComposeAsync here, but not for the cited reasons. Generally, the code example should not be used as a template for your own code.
This chapter is a recurring topic on Stackoverflow for reasons we can best describe as “insufficient quality”, to stay polite.
In general, a method without the Async suffix in its name executes its task in the same threads the previous task, …
There is no such guaranty about the executing thread in the specification. The documentation says:
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
So there’s also the possibility that the task is performed “by any other caller of a completion method”. An intuitive example is
CompletableFuture<X> f = CompletableFuture.supplyAsync(() -> foo())
.thenApply(f -> f.bar());
There are two threads involved. One that invokes supplyAsync and thenApply and the other which will invoke foo(). If the second completes the invocation of foo() before the first thread enters the execution of thenApply, it is possible that the future is already completed.
A future does not remember which thread completed it. Neither does it have some magic ability to tell that thread to perform an action despite it might be busy with something else or even have terminated since then. So it should be obvious that calling thenApply on an already completed future can’t promise to use the thread that completed it. In most cases, it will perform the action immediately in the thread that calls thenApply. This is covered by the specification’s wording “any other caller of a completion method”.
But that’s not the end of the story. As this answer explains, when there are more than two threads involved, the action can also get performed by another thread calling an unrelated completion method on the future at the same time. This may happen rarely, but it’s possible in the reference implementation and permitted by the specification.
We can summarize it as: Methods without Async provides the least control over the thread that will perform the action and may even perform it right in the calling thread, leading to synchronous behavior.
So they are best when the executing thread doesn’t matter and you’re not hoping for background thread execution, i.e. for short, non-blocking operations.
whereas a method terminating with Async always submits the succeeding task to the thread pool, so each of the tasks can be handled by a different thread. In this case, the result of the second CompletableFuture depends on the first, …
When you do
future.thenCompose(quote ->
CompletableFuture.supplyAsync(() -> Discount.applyDiscount(quote), executor))
there are three futures involved, so it’s not exactly clear, which future is meant by “second”. supplyAsync is submitting an action and returning a future. The submission is contained in a function passed to thenCompose, which will return another future.
If you used thenComposeAsync here, you only mandated that the execution of supplyAsync has to be submitted to the thread pool, instead of performing it directly in the completing thread or “any other caller of a completion method”, e.g. directly in the thread calling thenCompose.
The reasoning about dependencies makes no sense here. “then” always implies a dependency. If you use thenComposeAsync here, you enforced the submission of the action to the thread pool, but this submission still won’t happen before the completion of future. And if future completed exceptionally, the submission won’t happen at all.
So, is using thenCompose reasonable here? Yes it is, but not for the reasons given is the quote. As said, using the non-async method implies giving up control over the executing thread and should only be used when the thread doesn’t matter, most notably for short, non-blocking actions. Calling supplyAsync is a cheap action that will submit the actual action to the thread pool on its own, so it’s ok to perform it in whatever thread is free to do it.
However, it’s an unnecessary complication. You can achieve the same using
future.thenApplyAsync(quote -> Discount.applyDiscount(quote), executor)
which will do exactly the same, submit applyDiscount to executor when future has been completed and produce a new future representing the result. Using a combination of thenCompose and supplyAsync is unnecessary here.
Note that this very example has been discussed in this Q&A already, which also addresses the unnecessary segregation of the future operations over multiple Stream operations as well as the wrong sequence diagram.
What a polite answer from Holger! I am really impressed he could provide such a great explanation and at the same time staying in bounds of not calling the author plain wrong. I want to provide my 0.02$ here too, a little, after reading the same book and having to scratch my head twice.
First of all, there is no "remembering" of which thread executed which stage, neither does the specification make such a statement (as already answered above). The interesting part is even in the cited above documentation:
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
Even that ...completes the current CompletableFuture part is tricky. What if there are two threads that try to call complete on a CompletableFuture, which thread will run all the dependent actions? The one that has actually completed it? Or any other? I wrote a jcstress test that is very non-intuitive when looking at the results:
#JCStressTest
#State
#Outcome(id = "1, 0", expect = Expect.ACCEPTABLE, desc = "executed in completion thread")
#Outcome(id = "0, 1", expect = Expect.ACCEPTABLE, desc = "executed in the other thread")
#Outcome(id = "0, 0", expect = Expect.FORBIDDEN)
#Outcome(id = "1, 1", expect = Expect.FORBIDDEN)
public class CompletableFutureWhichThread1 {
private final CompletableFuture<String> future = new CompletableFuture<>();
public CompletableFutureWhichThread1() {
future.thenApply(x -> action(Thread.currentThread().getName()));
}
volatile int x = -1; // different default to not mess with the expected result
volatile int y = -1; // different default to not mess with the expected result
volatile int actor1 = 0;
volatile int actor2 = 0;
private String action(String threadName) {
System.out.println(Thread.currentThread().getName());
// same thread that completed future, executed action
if ("actor1".equals(threadName) && actor1 == 1) {
x = 1;
return "action";
}
// same thread that completed future, executed action
if ("actor2".equals(threadName) && actor2 == 1) {
x = 1;
return "action";
}
y = 1;
return "action";
}
#Actor
public void actor1() {
Thread.currentThread().setName("actor1");
boolean completed = future.complete("done-actor1");
if (completed) {
actor1 = 1;
} else {
actor2 = 1;
}
}
#Actor
public void actor2() {
Thread.currentThread().setName("actor2");
boolean completed = future.complete("done-actor2");
if (completed) {
actor2 = 1;
}
}
#Arbiter
public void arbiter(II_Result result) {
if (x == 1) {
result.r1 = 1;
}
if (y == 1) {
result.r2 = 1;
}
}
}
After running this, both 0, 1 and 1, 0 are seen. You do not need to understand very much about the test itself, but it proves a rather interesting point.
You have a CompletableFuture future that has a future.thenApply(x -> action(...)); attached to it. There are two threads (actor1 and actor2) that both, at the same time, compete with each other into completing it (the specification says that only one will be successful). The results show that if actor1 called complete, but does not actually complete the CompletableFuture (actor2 did), it can still do the actual work in action. In other words, a thread that completed a CompletableFuture is not necessarily the thread that executes the dependent actions (those thenApply for example). This was rather interesting for me to find out, though it makes sense.
Your reasonings about speed are a bit off. When you dispatch your work to a different thread, you usually pay a penalty for that. thenCompose vs thenComposeAsync is about being able to predict where exactly is your work going to happen. As you have seen above you can not do that, unless you use the ...Async methods that take a thread pool. Your natural question should be : "Why do I care where it is executed?".
There is an internal class in jdk's HttpClient called SelectorManager. It has (from a high level) a rather simple task: it reads from a socket and gives "responses" back to the threads that wait for a http result. In essence, this is a thread that wakes up all interested parties that wait for some http packets. Now imagine that this particular thread does internally thenCompose. Now also imagine that your chain of calls looks like this:
httpClient.sendAsync(() -> ...)
.thenApply(x -> foo())
where foo is a method that never finishes (or takes a lot of time to finish). Since you have no idea in which thread the actual execution is going to happen, it can, very well, happen in SelectorManager thread. Which would be a disaster. Everyone other http calls would stale, because this thread is busy now. Thus thenComposeAsync: let the configured pool do the work/waiting if needed, while the SelectorManager thread is free to do its work.
So the reasons that the author gives are plain wrong.

In which thread do CompletableFuture's completion handlers execute?

I have a question about CompletableFuture method:
public <U> CompletableFuture<U> thenApply(Function<? super T, ? extends U> fn)
The thing is the JavaDoc says just this:
Returns a new CompletionStage that, when this stage completes
normally, is executed with this stage's result as the argument to the
supplied function. See the CompletionStage documentation for rules
covering exceptional completion.
What about threading? In which thread is this going to be executed? What if the future is completed by a thread pool?
As #nullpointer points out, the documentation tells you what you need to know. However, the relevant text is surprisingly vague, and some of the comments (and answers) posted here seem to rely on assumptions that aren't supported by the documentation. Thus, I think it's worthwhile to pick it apart. Specifically, we should read this paragraph very carefully:
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
Sounds straightforward enough, but it's light on details. It seemingly deliberately avoids describing when a dependent completion may be invoked on the completing thread versus during a call to a completion method like thenApply. As written, the paragraph above is practically begging us to fill in the gaps with assumptions. That's dangerous, especially when the topic concerns concurrent and asynchronous programming, where many of the expectations we've developed as programmers get turned on their head. Let's take a careful look at what the documentation doesn't say.
The documentation does not claim that dependent completions registered before a call to complete() will run on the completing thread. Moreover, while it states that a dependent completion might be invoked when calling a completion method like thenApply, it does not state that a completion will be invoked on the thread that registers it (note the words "any other").
These are potentially important points for anyone using CompletableFuture to schedule and compose tasks. Consider this sequence of events:
Thread A registers a dependent completion via f.thenApply(c1).
Some time later, Thread B calls f.complete().
Around the same time, Thread C registers another dependent completion via f.thenApply(c2).
Conceptually, complete() does two things: it publishes the result of the future, and then it attempts to invoke dependent completions. Now, what happens if Thread C runs after the result value is posted, but before Thread B gets around to invoking c1? Depending on the implementation, Thread C may see that f has completed, and it may then invoke c1 and c2. Alternatively, Thread C may invoke c2 while leaving Thread B to invoke c1. The documentation does not rule out either possibility. With that in mind, here are assumptions that are not supported by the documentation:
That a dependent completion c registered on f prior to completion will be invoked during the call to f.complete();
That c will have run to completion by the time f.complete() returns;
That dependent completions will be invoked in any particular order (e.g., order of registration);
That dependent completions registered before f completes will be invoked before completions registered after f completes.
Consider another example:
Thread A calls f.complete();
Some time later, Thread B registers a completion via f.thenApply(c1);
Around the same time, Thread C registers a separate completion via f.thenApply(c2).
If it is known that f has already run to completion, one might be tempted to assume that c1 will be invoked during f.thenApply(c1) and that c2 will be invoked during f.thenApply(c2). One might further assume that c1 will have run to completion by the time f.thenApply(c1) returns. However, the documentation does not support these assumptions. It may be possible that one of the threads calling thenApply ends up invoking both c1 and c2, while the other thread invokes neither.
A careful analysis of the JDK code could determine how the hypothetical scenarios above might play out. But even that is risky, because you may end up relying on an implementation detail that is (1) not portable, or (2) subject to change. Your best bet is not to assume anything that's not spelled out in the javadocs or the original JSR spec.
tldr: Be careful what you assume, and when you write documentation, be as clear and deliberate as possible. While brevity is a wonderful thing, be wary of the human tendency to fill in the gaps.
The policies as specified in the CompletableFuture docs could help you understand better:
Actions supplied for dependent completions of non-async methods may be
performed by the thread that completes the current CompletableFuture,
or by any other caller of a completion method.
All async methods without an explicit Executor argument are performed
using the ForkJoinPool.commonPool() (unless it does not support a
parallelism level of at least two, in which case, a new Thread is
created to run each task). To simplify monitoring, debugging, and
tracking, all generated asynchronous tasks are instances of the marker
interface CompletableFuture.AsynchronousCompletionTask.
Update: I would also advice on reading this answer by #Mike as an interesting analysis further into the details of the documentation.
From the Javadoc:
Actions supplied for dependent completions of non-async methods may be performed by the thread that completes the current CompletableFuture, or by any other caller of a completion method.
More concretely:
fn will run during the call to complete() in the context of whichever thread has called complete().
If complete() has already finished by the time thenApply() is called, fn will be run in the context of the thread calling thenApply().
When it comes to threading the API documentation is lacking. It takes a bit of inference to understand how threading and futures work. Start with one assumption: the non-Async methods of CompletableFuture do not spawn new threads on their own. Work will proceed under existing threads.
thenApply will run in the original CompletableFuture's thread. That's either the thread that calls complete(), or the one that calls thenApply() if the future is already completed. If you want control over the thread—a good idea if fn is a slow operation—then you should use thenApplyAsync.
I know this question is old, but I want to use source code to explain this question.
public CompletableFuture<Void> thenAccept(Consumer<? super T> action) {
return uniAcceptStage(null, action);
}
private CompletableFuture<Void> uniAcceptStage(Executor e,
Consumer<? super T> f) {
if (f == null) throw new NullPointerException();
Object r;
if ((r = result) != null)
return uniAcceptNow(r, e, f);
CompletableFuture<Void> d = newIncompleteFuture();
unipush(new UniAccept<T>(e, d, this, f));
return d;
}
This is the source code from java 16, and we can see, if we trigger thenAccept, we will pass a null executor service reference into our function.
From the 2nd function uniAcceptStage() 2nd if condition. If result is not null, it will trigger uniAcceptNow()
if (e != null) {
e.execute(new UniAccept<T>(null, d, this, f));
} else {
#SuppressWarnings("unchecked") T t = (T) r;
f.accept(t);
d.result = NIL;
}
if executor service is null, we will use lambda function f.accept(t) to execute it. If we are triggering this thenApply/thenAccept from main thread, it will use main thread as executing thread.
But if we cannot get previous result from last completablefuture, we will push our current UniAccept/Apply into stack by using uniPush function. And UniAccept class has tryFire() which will be triggered from our postComplete() function
final void postComplete() {
/*
* On each step, variable f holds current dependents to pop
* and run. It is extended along only one path at a time,
* pushing others to avoid unbounded recursion.
*/
CompletableFuture<?> f = this; Completion h;
while ((h = f.stack) != null ||
(f != this && (h = (f = this).stack) != null)) {
CompletableFuture<?> d; Completion t;
if (STACK.compareAndSet(f, h, t = h.next)) {
if (t != null) {
if (f != this) {
pushStack(h);
continue;
}
NEXT.compareAndSet(h, t, null); // try to detach
}
f = (d = h.tryFire(NESTED)) == null ? this : d;
}
}
}

Categories