I'm working on a Java server application with the following general architecture:
Clients make RPC requests to the server
The RPC server (gRPC) I believe has its own thread pool for handling requests
Requests are immediately inserted into Thread Pool 1 for more processing
A specific request type, which we'll call Request R, needs to run a few asynchronous tasks in parallel and judge the results to form a consensus that it returns to the client. These tasks are a bit more long running, so I use a separate Thread Pool 2 to handle them. Importantly, each Request R needs to run the same 2-3 asynchronous tasks. Thread Pool 2 therefore services ALL currently executing Request R's. However, a Request R should only be able to see and retrieve the asynchronous tasks that belong to it.
To achieve this, every incoming Request R, while it's in Thread Pool 1, creates a new CompletionService for the request, backed by Thread Pool 2. It submits 2-3 async tasks and retrieves the results. These should be strictly isolated from anything else that might be running in Thread Pool 2 on behalf of other requests.
My questions:
Firstly, is Java's CompletionService isolated? I couldn't find good documentation on this after checking the JavaDocs. In other words, if two or more CompletionService's are backed by the same thread pool, are any of them at risk of pulling a future belonging to another CompletionService?
Secondly, is this bad practice to be creating this many CompletionService's for each request? Is there a better way to handle this? Of course it would be a bad idea to create a new thread pool for each request, so is there a more canonical/correct way to isolate futures within a CompletionService or is what I'm doing okay?
Thanks in advance for the help. Any pointers to helpful documentation or examples would be greatly appreciated.
Code, for reference, although trivial:
public static final ExecutorService THREAD_POOL_2 =
    new ThreadPoolExecutor(16, 64, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
// Gets created to handle a RequestR, RequestRHandler is run in Thread Pool 1
public class RequestRHandler {
    CompletionService<String> cs;
    RequestRHandler() {
        cs = new ExecutorCompletionService<>(THREAD_POOL_2);
    }
    String execute() {
        cs.submit(asyncTask1);
        cs.submit(asyncTask2);
        cs.submit(asyncTask3);
        // Lets say asyncTask3 completes first
        Future<String> asyncTask3Result = cs.take();
        // asyncTask3 result indicates asyncTask1 & asyncTask2 results don't matter, cancel them
        // without checking result
        // Cancels all futures, I track all futures submitted within this request and cancel them,
        // so it shouldn't affect any other requests in the TP 2 pool
        cancelAllFutures(cs);
        return asyncTask3Result.get();
    }
}
Firstly, is Java's CompletionService isolated?
That's not guaranteed, since CompletionService is an interface and the implementation decides. But as the only implementation in the JDK is ExecutorCompletionService, I'd just say the answer is: yes. Every instance of ExecutorCompletionService has its own internal BlockingQueue in which the finished tasks are queued. In fact, when you call take on the service, it simply delegates to take on that queue. Every submitted task is wrapped by another object, which puts the task in the queue when it's finished. So each instance manages its submitted tasks in isolation from other instances.
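To see the isolation in practice, here is a minimal sketch. The class name, method name, and the "A-"/"B-" result prefixes are made up for illustration; only the java.util.concurrent calls are real. Two ExecutorCompletionService instances share one pool, and each one only ever returns results of tasks submitted through it.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionServiceIsolationDemo {

    // Submits `count` tasks through each of two completion services that share
    // one pool, then checks that each service only hands back its own results.
    static boolean isIsolated(int count) throws Exception {
        ExecutorService sharedPool = Executors.newFixedThreadPool(4);
        try {
            CompletionService<String> csA = new ExecutorCompletionService<>(sharedPool);
            CompletionService<String> csB = new ExecutorCompletionService<>(sharedPool);
            for (int i = 0; i < count; i++) {
                final int n = i;
                csA.submit(() -> "A-" + n);
                csB.submit(() -> "B-" + n);
            }
            for (int i = 0; i < count; i++) {
                // take() blocks until a task submitted through *this* service completes
                if (!csA.take().get().startsWith("A-")) return false;
                if (!csB.take().get().startsWith("B-")) return false;
            }
            return true;
        } finally {
            sharedPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isIsolated(100));
    }
}
```

The threads are shared, but the completion queue is per instance, which is why a per-request CompletionService works for this use case.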
Secondly, is this bad practice to be creating this many CompletionServices for each request?
I'd say it's okay. A CompletionService is nothing but a rather thin wrapper around an executor. You have to live with the "overhead" (internal BlockingQueue and wrapper instances for the tasks) but it's small and you are probably gaining way more from it than it costs. One could ask if you need one for just 2 to 3 tasks but it kinda depends on the tasks. At this point it's a question about if a CompletionService is worth it in general, so that's up to you to decide as it's out of scope of your question.
Related
I am building a Spring Boot REST API application deployed on WebLogic 12c.
One of my requirements is to run some long running tasks on every incoming request.
An incoming REST request could result in multiple asynchronous task executions.
Since I don't care about the response, nor about any exceptions that result from these tasks, I chose to use a plain ExecutorService with Runnable rather than Callable or CompletableFuture.
ExecutorService executorService =
Executors.newFixedThreadPool(2, new CustomizableThreadFactory("-abc-"));
Then, for each incoming request that I receive in the controller, I run two nested for loops and assign the tasks to the ExecutorService:
for (final String orderId : orderIds) {
    for (final String itemId : itemIds) {
        executorService.execute(new Runnable() {
            @Override
            public void run() {
                try {
                    // call database operation
                } catch (Throwable t) {
                    logger.error("EXCEPTION with {} , {}", orderId, itemId);
                }
            }
        });
    } // for
} // for
My question is about shutting down the ExecutorService.
I am aware of the graceful shutdown (shutdown), a hybrid shutdown (awaitTermination) and an abrupt shutdown (shutdownNow).
What would be the preferred approach among the three for a REST API application?
Also, is there any limit on how many thread pools can be created? I ask because the number of ExecutorService thread pools created will be driven by the number of incoming requests.
We currently have similar requirements. This is a difficult problem to solve, as you want to use the right hammer, if you will. There are very heavyweight solutions for orchestrating long running processes, for example Spring Batch.
Firstly though, don't bother stopping and starting the ExecutorService. The whole point of that class is to take the burden of Thread management off your hands, so you don't need to create and stop Threads yourself. So you don't need to manage the manager.
But be careful with your approach. Without queues or another load balancing technique to smartly balance the long running processes across instances of your app, and without managing what happens when a Thread dies, you may get into a world of trouble. In general I would say that nowadays it doesn't make much sense to interact directly with Threads or ThreadPools; use higher level solutions for this type of problem.
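Calling shutdown followed by awaitTermination is usually a bit safer, while shutdownNow is more forceful. Keep in mind that awaitTermination does not initiate a shutdown by itself; it only blocks until the executor has terminated after a shutdown request, the timeout elapses, or the waiting thread is interrupted. The combination is a good fit in a functional method, or even a runnable, if you would like the executor to shut down as soon as possible, but only after it has completed everything that it was created to do. In other words, when there are no active tasks left that the executor is executing.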
For example (assuming RxJava 2 or later, and that items is an Iterable):
ExecutorService executor = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
Observable.fromIterable(items).subscribeOn(Schedulers.from(executor)).flatMap(item -> {
    ... // this block represents a task that the executor will execute in a worker thread
}).subscribe(onNext ->
    logItem(onNext), throwable ->
    throwable.printStackTrace(), /* onComplete */ () ->
    executor.shutdown() // stop accepting new tasks once the stream has completed
);
// wait on the calling thread, for up to 60 seconds, until the executor has terminated
executor.awaitTermination(60, TimeUnit.SECONDS);
... // you need to shutdown asap because these other methods below are also doing some computation/io-intensive stuff
Now, when this method is finished, it will call awaitTermination, which returns immediately if the pool has already terminated after shutdown(), or waits up to 60 seconds if tasks are still being executed.
Idle threads, or workers, are terminated after 60 seconds of inactivity in a cached thread pool, since that is its default keep-alive time. The core threads of a fixed thread pool stay alive until the executor is shut down.
On the other hand, if you want tasks to stop executing as soon as (to give some examples) an exception is thrown, there was a breach in security, or another module/service has failed, you might want to use shutdownNow(), which attempts to stop the running tasks (typically by interrupting them) and returns the queued ones without running them.
My advice for choosing between the two would be to use shutdownNow in your catch block if you do not want tasks to continue executing after an exception, i.e., when there is no longer a reason to return the list of items to the client because one of the items did not get added to the list.
Otherwise, I'd recommend calling shutdown and then awaitTermination after your try-catch, with a timeout of one minute, to safely shut down the thread pool as soon as it has executed all the tasks you have given it. But only do that if you know that the executor will not be responsible for executing any more tasks down the line.
The simple shutdown, if that is an option for you, is also a good method. According to the Oracle docs, shutdown rejects all new tasks but lets previously submitted tasks finish executing; the call itself returns without waiting for them.
If you're not sure when you need to close the executor, it might be a good idea to use a @PreDestroy method so that the executor is shut down just before your bean is destroyed:
@PreDestroy
private void cleanup() {
    executor.shutdown();
}
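For completeness, here is a sketch of how the three calls are commonly combined, along the lines of the usage example in the ExecutorService Javadoc: shutdown first, wait with awaitTermination, and fall back to shutdownNow only if the timeout expires. The class name, the helper names and the timeout values are illustrative assumptions, not something from the question.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class GracefulShutdownDemo {

    // Returns true if the pool terminated within the given timeout(s).
    static boolean shutdownGracefully(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // reject new tasks, let queued and running tasks finish
        try {
            if (!pool.awaitTermination(timeout, unit)) {
                pool.shutdownNow(); // interrupt whatever is still running
                return pool.awaitTermination(timeout, unit);
            }
            return true;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    // Runs `tasks` trivial tasks, shuts down, and reports how many completed
    // (or -1 if the pool did not terminate in time).
    static int runAndShutdown(int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(done::incrementAndGet);
        }
        boolean terminated = shutdownGracefully(pool, 10, TimeUnit.SECONDS);
        return terminated ? done.get() : -1;
    }

    public static void main(String[] args) {
        System.out.println(runAndShutdown(20));
    }
}
```

In a Spring application, a helper like shutdownGracefully could be called from the @PreDestroy method instead of a bare shutdown() if you want in-flight tasks to get a bounded amount of time to finish.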
I am building a long running application, which is modeled as a service based on service oriented architecture. Call this as 'serviceA'. It has an activity to perform, call 'activityA', whenever an API call is made to it.
activityA has an activity handler that has to perform 'n' tasks in parallel after which it consolidates and returns result to the client who called the serviceA API.
I am planning to use the ExecutorService to achieve this parallelism.
There are 2 ways to go ahead with this:
Create ExecutorService in a singleton scope, and have it as an attribute of the activity handler. Thus this same ExecutorService object is available throughout the lifetime of the service. When a new request comes, handler uses this ExecutorService object to submit parallel tasks. Then wait on the Future objects for certain timeout time. After all the parallel tasks complete, consolidate and return the activityA response.
Create a new ExecutorService object every time a request to activityA is received, in the activity handler. Submit the parallel tasks to this object, wait for the Future results for a certain timeout, consolidate the results, call shutdown on the ExecutorService object, and return the activityA API response.
Thus,
Which of the 2 above approaches should be followed? Major difference b/w the 2 is the lifetime of the ExecutorService object.
The service is supposed to be called with a volume of ~15k transactions per second, if this data helps with the decision making b/w the 2 approaches?
The advantage of the 1st approach is that we will not have the overhead of creating and shutting down new ExecutorService objects and threads. But what happens when there is no Future result by the time the timeout expires? Does the thread automatically shut down? Is it available for any new request coming to the ExecutorService thread pool? Or will it be in some waiting state and eat up memory, in which case we need to do something manually (and what)?
Also, is the timeout we pass to future.get() measured from the time we make the get call, or from the time we submitted the task to the executor service?
Please also let me know if either of the 2 ways is the obvious approach to this problem.
Thanks.
The first way looks like the obvious and correct way to solve this problem, especially with the given amount of transactions. You certainly don't want to restart threads.
The Future.get timeout doesn't affect the executing thread. The thread will continue to run the task until it either completes or throws an exception. Until then, that thread won't accept new tasks (but other threads in the same executor will). In this case you may want to cancel the task explicitly by invoking Future.cancel(true) to free the thread for new tasks. This requires the task itself to respond properly to interruption (instead of looping forever, for example, or waiting blocked on I/O). However, this would be the same for any threading approach, since interruption is the only safe way to terminate a thread anyway. To mitigate this issue you could use a dynamic pool whose maximum number of threads is greater than n. This allows new tasks to be processed while the stuck tasks are in the process of terminating.
The timeout is measured from the time you call Future.get, not from the time the task was submitted.
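A small sketch of the timeout-then-cancel pattern described above. The class and method names and the "done"/"timed-out" strings are placeholders of mine; the task is simulated with Thread.sleep, which responds to interruption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureTimeoutDemo {

    // Runs a task that takes roughly taskMillis and waits at most waitMillis
    // for it, counted from the moment get(...) is called.
    static String fetch(long taskMillis, long waitMillis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = pool.submit(() -> {
                Thread.sleep(taskMillis); // interruptible, so cancel(true) can stop it
                return "done";
            });
            try {
                return future.get(waitMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The worker is still busy at this point; interrupt it to free the thread.
                future.cancel(true);
                return "timed-out";
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fetch(10, 2000));
        System.out.println(fetch(5000, 50));
    }
}
```

Without the cancel(true) call, the worker thread would stay occupied for the full five seconds in the second call, even though the caller gave up after 50 ms.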
As we create a Thread pool using Java's Executor service and submit threads to this thread pool, what is the order in which those threads get executed?
I want to ensure that threads submitted first, execute first.
For example, in the code below, I want first 5 threads to get executed first, followed by the next 5 threads and so on...
// Create a thread pool of 5 threads.
ScheduledExecutorService exService = Executors.newScheduledThreadPool(5, new ModifiedThreadFactory("ReadThreadPool"));
// Create 100 threads.
MyThread[] threads = createMyThreads(100);
// Submit these 100 threads to thread pool for execution.
for(MyThread thread : threads) {
exService.submit(thread);
}
Does Java's thread pool provide any API for this purpose, or do we need to implement a FIFO queue on our end to achieve this?
If Java's thread pool does not provide any such functionality, I am really interested to understand the reason behind the non-existence of this functionality as it appears like a very common use-case to me.
Is it technically not possible (which I think is quite unlikely), or is it just a miss?
That's the default behavior. ScheduledThreadPoolExecutor (which you're using although you're not scheduling anything) extends ThreadPoolExecutor. Tasks submitted to a ThreadPoolExecutor are stored in a BlockingQueue until a thread is available to take and execute them, and the queues used here are FIFO: tasks are started in the order in which they were submitted.
This is described in detail in the Javadoc.
Threads do not get executed. Threads are the entities that run tasks such as Runnable and Callable. Submitting such a task to an executor service puts it in its internal BlockingQueue until it gets picked up by a thread from its thread pool. With more than one thread in the pool, this still tells you nothing about the order in which tasks finish, as they run concurrently and different classes can do different things while implementing Runnable.
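A quick way to convince yourself of the FIFO start order is to use a pool with a single thread, where start order and completion order coincide. This is only a sketch; the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class FifoOrderDemo {

    // Submits tasks 0..tasks-1 to a single-threaded executor and records
    // the order in which they actually ran.
    static List<Integer> runOrder(int tasks) throws InterruptedException {
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newSingleThreadExecutor();
        for (int i = 0; i < tasks; i++) {
            final int id = i;
            pool.execute(() -> order.add(id));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return order;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOrder(5)); // prints [0, 1, 2, 3, 4]
    }
}
```

With a pool of 5 threads, the tasks are still dequeued in submission order, but up to 5 of them run at the same time, so the recorded order could differ from run to run.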
I'm working on a project where execution time is critical. In one of the algorithms I have, I need to save some data into a database.
What I did is call a method that does that. It fires a new thread every time it's called. I ran into an out-of-memory problem, since more than 20,000 threads were being created ...
My question now is, I want to start only one thread, when the method is called, it adds the job into a queue and notifies the thread, it sleeps when no jobs are available and so on. Any design patterns available or examples available online ?
Run, do not walk to your friendly Javadocs and look up ExecutorService, especially Executors.newSingleThreadExecutor().
ExecutorService myXS = Executors.newSingleThreadExecutor();
// then, as needed...
myXS.submit(myRunnable);
And it will handle the rest.
Yes, you want a worker thread or thread pool pattern.
http://en.wikipedia.org/wiki/Thread_pool_pattern
See http://www.ibm.com/developerworks/library/j-jtp0730/index.html for Java examples
I believe the pattern you're looking for is called producer-consumer. In Java, you can use the blocking methods on a BlockingQueue to pass tasks from the producers (that create the jobs) to the consumer (the single worker thread). This will make the worker thread automatically sleep when no jobs are available in the queue, and wake up when one is added. The concurrent collections should also handle using multiple worker threads.
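Here is a minimal producer-consumer sketch along those lines. Everything named here (the class, the STOP marker, the "saved:" prefix standing in for the database write, the capacity of 1000) is an assumption for illustration. The queue is bounded on purpose, so producers block in put() when the worker falls behind instead of piling up jobs in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class SingleWorkerQueueDemo {

    // Hypothetical marker job that tells the worker to stop.
    private static final String STOP = "__STOP__";

    static List<String> process(List<String> jobs) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1000);
        // Only the worker writes to this list; it is read after join().
        List<String> saved = new ArrayList<>();

        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String job = queue.take(); // sleeps while the queue is empty
                    if (job.equals(STOP)) {
                        break;
                    }
                    saved.add("saved:" + job); // stand-in for the database write
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        for (String job : jobs) {
            queue.put(job); // blocks if the queue is full
        }
        queue.put(STOP);
        worker.join();
        return saved;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(process(Arrays.asList("a", "b", "c")));
    }
}
```

Executors.newSingleThreadExecutor() gives you much the same thing with less code; writing the queue by hand mainly makes sense when you want control over its capacity or over what happens when it is full.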
Are you looking for java.util.concurrent.Executor?
That said, if you have 20000 concurrent inserts into the database, using a thread pool will probably not save you: if the database can't keep up, the queue will get longer and longer, until you run out of memory again. Also, note that an executor's queue lives only in memory, i.e. if the server crashes, the data in it will be gone.
I created multiple ExecutorService instances in my code, usually each UI page has one ExecutorService instance. Each ExecutorService instance will execute some http get request threads.
private ExecutorService m_threadPool = Executors.newCachedThreadPool();
Is it OK to do that?
The problem I met is that sometimes the http get requests got response code -1 from HttpURLConnection getResponseCode() call. I don't know whether it is caused by multiple threadpool instances.
Thanks.
ExecutorService per se is just another object, so there's no big overhead. But a thread pool may keep a number of idle threads alive, and those are a major waste of resources. I would suggest keeping the core pool size (the number of threads kept alive even when idle) of each pool small (1 or 0 if you are not sure whether any requests will be sent) in order to reduce that cost. Threads will then be created on demand and you'll be able to keep your code clean.
Another solution is to use a single thread pool but to maintain a separate list of tasks for each UI window. In this case, when a window gets closed you'll have to iterate over all its tasks and cancel the running ones manually (this can also be done in a separate thread). A task may be represented by a Future<?> (it has handy isDone() and cancel() methods).
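The single-pool variant could look roughly like this. WindowTasks is a hypothetical helper, not an existing class: each UI page owns one instance, all instances share the same ExecutorService, and closing a page cancels only that page's unfinished tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class WindowTasks {
    private final ExecutorService sharedPool;
    private final List<Future<?>> tasks = new ArrayList<>();

    WindowTasks(ExecutorService sharedPool) {
        this.sharedPool = sharedPool;
    }

    // Submits a task to the shared pool and remembers its Future for this window.
    synchronized Future<?> submit(Runnable task) {
        Future<?> future = sharedPool.submit(task);
        tasks.add(future);
        return future;
    }

    // Cancels this window's unfinished tasks and returns how many were cancelled.
    synchronized int close() {
        int cancelled = 0;
        for (Future<?> future : tasks) {
            if (!future.isDone() && future.cancel(true)) {
                cancelled++;
            }
        }
        tasks.clear();
        return cancelled;
    }

    // Small self-check: closing window A must not touch window B's task.
    static boolean demo() throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            CountDownLatch release = new CountDownLatch(1);
            Runnable waitForRelease = () -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            };
            WindowTasks windowA = new WindowTasks(pool);
            WindowTasks windowB = new WindowTasks(pool);
            windowA.submit(waitForRelease);
            windowA.submit(waitForRelease);
            Future<?> other = windowB.submit(waitForRelease);

            int cancelled = windowA.close();
            release.countDown();
            other.get(5, TimeUnit.SECONDS); // window B's task still completes normally
            return cancelled == 2 && !other.isCancelled();
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Note that cancel(true) only interrupts the worker thread; a task that is blocked in a non-interruptible call may keep running for a while after the window has been closed.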
It shouldn't be caused by your thread pool instances. However, I'd say that having more than one thread pool is questionable. Why would you need it? It could lead to a lot of unnecessary threads, and thereby unnecessary memory use.