When to use CompletableFuture#thenApply(..) over thenApplyAsync(..)? - java

In the context of CompletableFuture I understand that thenApply(..) may run the function on the current thread or on a pre-defined executor (e.g. ForkJoinPool), while thenApplyAsync(..) ensures that the pre-defined executor is always used.
As far as I can see, thenApplyAsync(..) seems more "reliable", as it never blocks the current thread, while thenApply(..) might be a surprise.
My question: Which example/scenario would be valid to use thenApply(..) rather than thenApplyAsync(..)?
Thanks, Christoph

Yes, thenApplyAsync always uses an executor. This means that a Runnable object must be created and put into the executor's queue. If the function you want to execute after the completion of this CompletableFuture is very simple, then invoking it directly may be more efficient than creating the wrapping Runnable.
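A minimal sketch of the difference (the doubling function stands in for any cheap mapping):

```java
import java.util.concurrent.CompletableFuture;

public class ThenApplyDemo {
    public static void main(String[] args) {
        // thenApply: for an already-completed future, the cheap mapping
        // function can run directly in the calling thread -- no task is
        // queued on any executor.
        CompletableFuture<Integer> done = CompletableFuture.completedFuture(21);
        int doubled = done.thenApply(n -> n * 2).join();

        // thenApplyAsync: the same mapping is wrapped in a task and handed
        // to ForkJoinPool.commonPool(), which costs a queue round-trip
        // even for trivial functions.
        int doubledAsync = done.thenApplyAsync(n -> n * 2).join();

        System.out.println(doubled + " " + doubledAsync); // 42 42
    }
}
```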

Related

How to globally set thread pool for all CompletableFuture

I am trying to mimic what single threaded async programming in Javascript in Java with the use of async / await library by EA (ea-async). This is mainly because I do not have long-lasting CPU bound computations in my program and I want to code single thread lock free code in Java.
ea-async library heavily relies on the CompletableFuture in Java and underneath Java seems to use ForkJoinPool to run the async callbacks. This puts me into multi threaded environment as my CPU is multi-core. It seems for every CompletableFuture task, I can supply async with my custom thread pool executor. I can supply Executors.newSingleThreadExecutor() for this but I need a way to set this globally so that all CompletableFuture will be using this executor within the single JVM process. How do I do this?
ea-async library heavily relies on the CompletableFuture in Java and
underneath Java seems to use ForkJoinPool to run the async callbacks.
That is the default behavior of CompletableFuture:
All async methods without an explicit Executor argument are performed
using the ForkJoinPool.commonPool() (unless it does not support a
parallelism level of at least two, in which case, a new Thread is
created to run each task). This may be overridden for non-static
methods in subclasses by defining method defaultExecutor().
That's a defined characteristic of the class, so if you're using the class CompletableFuture, not a subclass, and generating instances without specifying an Executor explicitly, then a ForkJoinPool is what you're going to get.
Of course, if you are in control of the CompletableFutures provided to ea-async, then you have the option to provide instances of a subclass that defines defaultExecutor() however you like. Alternatively, you can create your CompletableFuture objects via the static factory methods that allow you to explicitly specify the Executor to use, such as runAsync(Runnable, Executor).
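The explicit-executor variant might look like this sketch (the single-thread pool here is purely for illustration):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExplicitExecutorDemo {
    public static void main(String[] args) throws Exception {
        // An executor passed explicitly to the factory method, instead
        // of relying on ForkJoinPool.commonPool().
        ExecutorService single = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<String> f = CompletableFuture.supplyAsync(
                    () -> Thread.currentThread().getName(), single);
            // The default thread factory names its threads "pool-N-thread-M",
            // which shows the task did not run on the common pool.
            System.out.println(f.get().startsWith("pool-"));
        } finally {
            single.shutdown();
        }
    }
}
```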
But that's probably not what you really want to do.
If you use an executor with only one thread, then your tasks can be executed asynchronously with respect to the thread that submits them, yes, but they will be serialized with respect to each other. You get only one thread working on them, and at any time it will be working on one specific task, sticking with it until it finishes, regardless of the order in which the responses actually arrive. If that's satisfactory, then it's unclear why you want async operations at all.
This puts me into multi threaded environment as my CPU is multi-core.
It puts you in multiple threads regardless of how many cores your CPU has. That's what Executors do, even Executors.newSingleThreadExecutor(). That's the sense of "asynchronous" they provide.
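A tiny sketch of that point: even Executors.newSingleThreadExecutor() runs the task on a thread other than the caller's.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncThreadDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService single = Executors.newSingleThreadExecutor();
        // Ask the pool which thread actually runs the task.
        Future<String> name = single.submit(() -> Thread.currentThread().getName());
        // Even a "single thread" executor is still a second thread:
        // the worker's name differs from the submitting thread's name.
        System.out.println(name.get().equals(Thread.currentThread().getName()));
        single.shutdown();
    }
}
```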
If I understand correctly, you are instead looking to use one thread to multiplex I/O to multiple remote web applications. That is what java.nio.channels.Selector is for, but using that generally requires either managing the I/O operations yourself or using interfaces designed to interoperate with selectors. If you are locked in to third-party interfaces that do not afford use of a Selector, then multithreading and multiprocessing are your only viable alternatives.
In comments you wrote:
I'm starting to think maybe BlockingQueue might do the job in
consolidating all API responses into one queue as tasks where a single
thread will work on them.
Again, I don't think that you want everything that comes with that, and if in fact you do, then I don't see why it wouldn't be even better and easier to work synchronously instead of asynchronously.

Behavior of ThreadPoolExecutor with keepAliveTime = 0 and corePoolSize = 0

Does setting ThreadPoolExecutor's keepAliveTime and corePoolSize to 0 make it create a new Thread for every task? Is it guaranteed no Thread will ever be reused for any task?
BTW I want to set the maximumPoolSize to 100 or so. I cannot afford an unlimited number of threads. In case I reach the limit of threads (e.g. 100), I want the server to fall back to 'synchronous' mode (no parallelism). See ThreadPoolExecutor.CallerRunsPolicy.
Background (read only in case you are interested in my motivation):
We have a project which relies on the usage of ThreadLocals (e.g. we use Spring and its SecurityContextHolder). We would like to make 10 calls to backend systems in parallel. We like ThreadPoolExecutor.CallerRunsPolicy, which runs the callable in the caller thread in case the thread pool and its task queue are full. That's why we would like to use ThreadPoolExecutor. I am not able to change the project not to use ThreadLocals, so please do not suggest doing so.
I was thinking about how to do it with the least amount of work. SecurityContextHolder can be switched to use InheritableThreadLocal instead of ThreadLocal. The thread-local variables are then passed on to child threads when the child threads are created. The only problem is how to make ThreadPoolExecutor create a new Thread for every task. Will setting its keepAliveTime and corePoolSize to 0 work? Can I be sure none of the threads will be reused for a next task? I can live with the performance hit of creating new Threads, because each parallel task takes much more time.
Other possible solutions considered:
Extend ThreadPoolExecutor's execute method and wrap the Runnable command parameter into a different Runnable which remembers the thread-locals in its final fields and then initializes them in its run method before calling the target command. I think this might work, but it is slightly more code to maintain than the solution my original question is about.
Pass thread-locals in a parameter of asynchronous methods. This is more verbose to use. Also people can forget to do it, and the security context would be shared between asynchronous tasks :-(
Extend ThreadPoolExecutor's beforeExecute and afterExecute methods and copy thread-locals using reflection. This requires ~50 lines of ugly reflection code and I am not sure how safe it is.
Nope, this does not work! ThreadPoolExecutor wraps your Callable/Runnable into an internal Worker object and executes it in its runWorker() method. The Javadoc of this method says:
Main worker run loop. Repeatedly gets tasks from queue and executes them, while coping with a number of issues: ...
You can also take a look at the code of this method and see that it does not exit until the task queue is empty (or something bad happens which causes the thread to exit).
So setting keepAliveTime to 0 will not necessarily cause a new thread on each submitted task.
You should probably go with your solution 3 as the beforeExecute() and afterExecute() methods are exactly meant for dealing with ThreadLocals.
Alternatively, if you insist on having a new thread for each task, you may take a look at Spring's SimpleAsyncTaskExecutor. It guarantees a new thread for each task and allows you to set a concurrency limit, i.e. the equivalent of ThreadPoolExecutor#maximumPoolSize.
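For comparison, the wrapping approach (option 1 from the question) can be sketched as follows; the ThreadLocal here is a made-up stand-in for something like Spring's SecurityContextHolder:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ContextPropagationDemo {
    // Hypothetical application-level context, standing in for
    // SecurityContextHolder's ThreadLocal.
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Wraps a task so the submitter's context value is installed on the
    // pool thread for the duration of the run, then cleared again.
    static Runnable withContext(Runnable task) {
        String captured = CONTEXT.get(); // captured in the caller thread
        return () -> {
            CONTEXT.set(captured);
            try {
                task.run();
            } finally {
                CONTEXT.remove(); // avoid leaking into reused pool threads
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CONTEXT.set("user-42");
        Future<?> f = pool.submit(withContext(
                () -> System.out.println("seen: " + CONTEXT.get())));
        f.get();
        pool.shutdown();
    }
}
```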

ExecutorService-like class where user controls when Callables are called

I was using an ExecutorService to schedule tasks to be executed in the future. After seeing some "odd" behavior where my Callable was executed before I called get() on the Future object returned by submitting my Callable to the ExecutorService pool, I read some documentation and found that a submitted task may be executed at any time between its submission and, at the latest, when get() is called on the Future object.
My question - is there any class that would allow Callables to be submitted to it and ONLY executed when get() is called on it? At this point, it seems like just managing the Callables myself and calling call() on them myself when I am ready for them to be executed seems like it'd accomplish what I want, but I wanted to make sure there was no service already implemented that accomplished this.
In short, is there any alternative to ExecutorService that lets me control when Callables submitted to it are called? Note - the time in the future that I want them called is variable and not determined as I may decide not to call them so a ScheduledExecutorService pool won't work here.
Thanks much!
Sounds like you really want to use a Queue<Callable> instead and just poll the queue for tasks.
That way you can submit as many tasks as you like and execute them at your will - one by one.
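A minimal sketch of that approach; nothing runs until the caller decides to poll the queue:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Callable;

public class DeferredTasks {
    public static void main(String[] args) throws Exception {
        // Tasks are only stored here; submitting does not execute them.
        Queue<Callable<Integer>> tasks = new ArrayDeque<>();
        tasks.add(() -> 1 + 1);
        tasks.add(() -> 6 * 7);

        // Execute on demand, one by one, in the current thread,
        // whenever (and only if) we choose to.
        int sum = 0;
        Callable<Integer> next;
        while ((next = tasks.poll()) != null) {
            sum += next.call();
        }
        System.out.println(sum); // 2 + 42
    }
}
```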

Future<V> and Exception

How do I ensure that exceptions thrown by @Asynchronous methods (EJB 3.1) are not silently eaten up by the Future?
I know one can use the Future.get method to retrieve the exception, but it blocks until the computation is done, which is a problem in case no exception occurs and you have to wait till the computation is over.
(Update)
The scenario is fairly simple. A stateless EJB exposes a method with the @Asynchronous annotation, primarily intended for @Local use. The AS is JBoss. During the computation, it's possible that a RuntimeException occurs. Clients may or may not want to poll whether the job is finished, but in all cases they should know if an exception has occurred.
A workaround is possible to use some sort of callback, but I am interested if there is any out of box solution available.
Did you consider invoking Future#get(timeout, timeUnit) to return control after the given time if no results are available (i.e. the computation is not finished)?
You can also invoke Future#isDone() prior to Future#get() to know if the processing is complete.
Either way, you still need to invoke Future#get(..) eventually to learn what has happened and to be sure that the exception is not swallowed.
HTH.
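That polling approach can be sketched like this (the timings and the dummy task are arbitrary, for illustration only):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PollFutureDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> f = pool.submit(() -> {
            Thread.sleep(200); // simulated slow computation
            return "done";
        });

        // Poll without blocking forever: give up after 50 ms. A failed
        // task would surface here as an ExecutionException instead.
        try {
            f.get(50, TimeUnit.MILLISECONDS);
            System.out.println("finished early");
        } catch (TimeoutException e) {
            System.out.println("still running");
        }

        // Eventually get() must still be called to see the real outcome.
        String result = f.get();
        System.out.println(f.isDone() + " " + result);
        pool.shutdown();
    }
}
```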
There is a solution to your problem. A CountDownLatch is designed to notify the calling thread when a given number of future tasks are done. So implement a small synchronization helper around it and pass an instance to all Callable objects when you submit them; counting the latch down on completion will then work as a callback.
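A minimal sketch of that latch-as-callback idea, using plain JDK classes (the deliberately failing task is made up for illustration):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class LatchCallbackDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<Throwable> error = new AtomicReference<>();

        // The task records any exception and counts the latch down in a
        // finally block, so the caller learns about a failure as soon as
        // it happens instead of blocking on Future.get().
        pool.submit(() -> {
            try {
                throw new IllegalStateException("boom");
            } catch (Throwable t) {
                error.set(t);
            } finally {
                done.countDown();
            }
        });

        done.await(); // wakes as soon as the task finishes, success or failure
        System.out.println(error.get().getMessage());
        pool.shutdown();
    }
}
```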
If you have access to the configuration of your EJB container and you can set the executor, then you could use Guava's Futures.addCallback method. This method requires a com.google.common.util.concurrent.ListenableFuture instead of an ordinary one. You will get this kind of future by setting the executor of your instance to a ListeningExecutorService. Guava provides a factory method (MoreExecutors.listeningDecorator) for decorating any ExecutorService as a ListeningExecutorService, so you are free to use whatever ExecutorService you had beforehand.

Get Runnable objects I scheduled using ScheduledThreadPoolExecutor when using shutdownNow() method

I'm using ScheduledThreadPoolExecutor.schedule(Runnable,int,TimeUnit) to schedule some objects of a class that implements Runnable.
At some point in time, my application is then shutting down, and I use ScheduledThreadPoolExecutor.shutdownNow(). According to the documentation it returns a list of ScheduledFuture's.
What I really want to do is get a hold of the object that I originally scheduled, and get a bit of data from it which I will then output saying that it was unable to execute. This would then, later, be used by the application to attempt to execute it when the application then starts back up.
The usual way to get info about objects submitted to an executor is to maintain references to the original objects (be they implementations of Callable, Runnable, etc.). After you call shutdownNow(), and take into account the Runnables it returns that were still awaiting execution, you can use that to prune your original list of objects and interrogate just the ones that were actually run.
If you just want to present the information to the user, the simplest approach might be to implement a meaningful toString() method for the Runnables you're scheduling. Then you can simply iterate over the list the Executor gives you and log what you get.
The sad truth, though, is that your original objects get wrapped by the Executor. So you would need to manually keep a list of what you pass to the Executor and let the Runnables remove themselves from this list when they get executed. Obviously, you would need to use a thread-safe list for this purpose.
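A sketch of that bookkeeping (class and task names are made up): keep your own thread-safe list, let executed tasks remove themselves, and whatever remains after shutdownNow() is exactly what never ran.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class TrackedTasks {
    public static void main(String[] args) throws Exception {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        // Thread-safe list of the original tasks we handed to the executor.
        List<Runnable> pending = new CopyOnWriteArrayList<>();

        Runnable task = new Runnable() {
            @Override public void run() {
                pending.remove(this); // executed -> no longer pending
            }
            @Override public String toString() {
                return "job #1"; // meaningful info for logging
            }
        };
        pending.add(task);
        pool.schedule(task, 1, TimeUnit.HOURS); // far enough away to never run

        pool.shutdownNow(); // cancels the scheduled task before it starts
        System.out.println(pending); // our original object, still pending
    }
}
```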