How to make asynchronous tasks have lower priority in Java 8?

I have a service that handles the main entity, retrieves the first sub-entity associated with the main entity, then returns both. It also sets off a set of CompletableFuture chains to go out and retrieve any additional entities. Currently, I just take a prebuilt set of retrieval tasks, wrap each in an asynchronous CompletableFuture, then set it off on a cached thread pool (Executors.newCachedThreadPool). This is fine, but when 50+ users hit the server, the primary task (retrieving the main entity and the first sub-entity) is dramatically slowed by all of the async threads running.
I want to know if there is a way to make the asynchronous calls run at a lower priority, to make sure the primary call is handled quickly.
public CompletableFuture buildFutureTasks(P primaryEntity, List<List<S>> entityGroups)
{
    ExecutorService pool = Executors.newCachedThreadPool();
    CompletableFuture<Void> future = null;
    for (List<S> entityGroup : entityGroups)
    {
        if (future == null || future.isDone())
        {
            future = CompletableFuture.runAsync(() ->
                retrieveSubEntitiesForEntity(primaryEntity, entityGroup), pool);
        }
        else
        {
            future.thenRunAsync(() ->
                retrieveSubEntitiesForEntities(primaryEntity, entityGroup), pool);
        }
    }
    return future;
}
This is the fastest I've been able to make this run with 50+ users but it still dramatically slows down the more users I add.

As you most likely know already, there is a method Thread::setPriority. According to the Javadoc:
Every thread has a priority. Threads with higher priority are executed
in preference to threads with lower priority.
So you can provide a ThreadFactory that lowers the priority of its threads when creating your cached ExecutorService.
However, the actual thread scheduling details are specific to the VM implementation and the operating system, so you cannot really rely on this. I would consider using a fixed thread pool instead.
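A minimal sketch of a ThreadFactory that lowers thread priority, for illustration only. The class name, the thread name prefix, and the choice to use daemon threads are assumptions, not part of the question's code, and the priority remains only a hint to the scheduler:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: creates daemon threads at the lowest priority.
final class LowPriorityThreadFactory implements ThreadFactory {
    private final AtomicInteger counter = new AtomicInteger();

    @Override
    public Thread newThread(Runnable task) {
        Thread thread = new Thread(task, "background-" + counter.incrementAndGet());
        thread.setPriority(Thread.MIN_PRIORITY); // a hint only; the OS scheduler may ignore it
        thread.setDaemon(true);                  // assumption: background work must not block JVM exit
        return thread;
    }

    // A cached pool whose workers are all created by this factory.
    static ExecutorService newLowPriorityCachedPool() {
        return Executors.newCachedThreadPool(new LowPriorityThreadFactory());
    }
}
```

The pool returned by newLowPriorityCachedPool() could then be passed as the executor argument of CompletableFuture.runAsync.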
But I'm not sure that the actual problem is about thread scheduling and priority. First of all, according to the documentation, a cached thread pool:
Creates a thread pool that creates new threads as needed, but will
reuse previously constructed threads when they are available.
With 50+ users calling buildFutureTasks, you cannot really control the number of threads created.
I would consider using a fixed thread pool so you can bound the number of threads, unless you really need the SynchronousQueue hand-off that underlies cached thread pools.
Also consider using the same thread pool for all the tasks instead of creating a new one inside buildFutureTasks every time it is called.
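As a rough sketch of that advice, the pool can be created once and shared by every call. The class name, the pool size of 4, and the runInSequence helper are assumptions for illustration; the right size depends on the workload:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical holder for one application-wide background pool.
final class BackgroundTasks {
    // Created once; the size (4) is an arbitrary assumption to tune.
    static final ExecutorService BACKGROUND_POOL = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r, "background-worker");
        t.setDaemon(true);
        return t;
    });

    // Runs the steps one after another on the shared pool.
    // Each thenRunAsync result is kept, so the returned future completes after the last step.
    static CompletableFuture<Void> runInSequence(List<Runnable> steps) {
        CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);
        for (Runnable step : steps) {
            chain = chain.thenRunAsync(step, BACKGROUND_POOL);
        }
        return chain;
    }
}
```

With a bounded pool, 50 concurrent requests queue their background work instead of starting 50 or more extra threads.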

Related

Multiple CompletionService for one thread pool Java

I'm working on a Java server application with the general following architecture:
Clients make RPC requests to the server
The RPC server (gRPC) I believe has its own thread pool for handling requests
Requests are immediately inserted into Thread Pool 1 for more processing
A specific request type, which we'll call Request R, needs to run a few asynchronous tasks in parallel, judging the results to form a consensus that it will return to the client. These tasks are a bit more long-running, so I use a separate Thread Pool 2 to handle them. Importantly, each Request R will need to run the same 2-3 asynchronous tasks. Thread Pool 2 therefore services ALL currently executing Request Rs. However, a Request R should only be able to see and retrieve the asynchronous tasks that belong to it.
To achieve this, upon every incoming Request R, while it's in Thread Pool 1, it will create a new CompletionService for the request, backed by Thread Pool 2. It will submit 2-3 async tasks and retrieve the results. These should be strictly isolated from anything else that might be running in Thread Pool 2 belonging to other requests.
My questions:
Firstly, is Java's CompletionService isolated? I couldn't find good documentation on this after checking the Javadocs. In other words, if two or more CompletionServices are backed by the same thread pool, are any of them at risk of pulling a future belonging to another CompletionService?
Secondly, is it bad practice to create this many CompletionServices, one for each request? Is there a better way to handle this? Of course it would be a bad idea to create a new thread pool for each request, so is there a more canonical/correct way to isolate futures within a CompletionService, or is what I'm doing okay?
Thanks in advance for the help. Any pointers to helpful documentation or examples would be greatly appreciated.
Code, for reference, although trivial:
public static final ExecutorService THREAD_POOL_2 =
    new ThreadPoolExecutor(16, 64, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
// Gets created to handle a RequestR, RequestRHandler is run in Thread Pool 1
public class RequestRHandler {
    CompletionService<String> cs;
    RequestRHandler() {
        cs = new ExecutorCompletionService<>(THREAD_POOL_2);
    }
    String execute() {
        cs.submit(asyncTask1);
        cs.submit(asyncTask2);
        cs.submit(asyncTask3);
        // Let's say asyncTask3 completes first
        Future<String> asyncTask3Result = cs.take();
        // asyncTask3 result indicates asyncTask1 & asyncTask2 results don't matter, cancel them
        // without checking result
        // Cancels all futures, I track all futures submitted within this request and cancel them,
        // so it shouldn't affect any other requests in the TP 2 pool
        cancelAllFutures(cs);
        return asyncTask3Result.get();
    }
}

Firstly, is Java's CompletionService isolated?
That's not guaranteed, as it's an interface, so the implementation decides that. But as the only implementation is ExecutorCompletionService, I'd just say the answer is: yes. Every instance of ExecutorCompletionService internally has a BlockingQueue where the finished tasks are queued. When you call take on the service, it just passes the call on to the queue by calling take on it. Every submitted task is wrapped by another object, which puts the task in the queue when it's finished. So each instance manages its submitted tasks in isolation from other instances.
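The isolation can be checked with a small experiment. This is only a sketch; the class name and the task results are made up for illustration:

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CompletionServiceIsolationDemo {
    // Submits one task to each of two completion services that share one pool,
    // then reads each service's own completion queue.
    static String[] run() throws InterruptedException, ExecutionException {
        ExecutorService sharedPool = Executors.newFixedThreadPool(4);
        try {
            CompletionService<String> first = new ExecutorCompletionService<>(sharedPool);
            CompletionService<String> second = new ExecutorCompletionService<>(sharedPool);
            first.submit(() -> "from first");
            second.submit(() -> "from second");
            // take() only returns futures submitted through the same instance,
            // regardless of which task happens to finish first.
            String fromSecond = second.take().get();
            String fromFirst = first.take().get();
            return new String[] {fromFirst, fromSecond};
        } finally {
            sharedPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        String[] results = run();
        System.out.println(results[0] + ", " + results[1]); // prints "from first, from second"
    }
}
```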
Secondly, is this bad practice to be creating this many CompletionServices for each request?
I'd say it's okay. A CompletionService is nothing but a rather thin wrapper around an executor. You have to live with the "overhead" (internal BlockingQueue and wrapper instances for the tasks) but it's small and you are probably gaining way more from it than it costs. One could ask if you need one for just 2 to 3 tasks but it kinda depends on the tasks. At this point it's a question about if a CompletionService is worth it in general, so that's up to you to decide as it's out of scope of your question.

shutting down ExecutorService in a spring boot Rest API

I am building a Spring Boot REST API application deployed on WebLogic 12c.
One of my requirements is to run some long-running tasks on every incoming request.
An incoming REST request could result in multiple asynchronous task executions.
Since I don't care about the response or any exceptions that result from these tasks, I chose to use a plain ExecutorService rather than Callable or CompletableFuture.
ExecutorService executorService =
    Executors.newFixedThreadPool(2, new CustomizableThreadFactory("-abc-"));
Then, for each incoming request that I receive in the controller, I run two for loops and assign the tasks to the ExecutorService:
for (final String orderId : orderIds) {
    for (final String itemId : itemIds) {
        executorService.execute(new Runnable() {
            public void run() {
                try {
                    // call database operation
                } catch (Throwable t) {
                    logger.error("EXCEPTION with {} , {}", orderId, itemId);
                }
            }
        });
    } // for
} // for
My question is about shutting down the ExecutorService.
I am aware of the graceful shutdown (shutdown), a hybrid shutdown (awaitTermination), and an abrupt shutdown (shutdownNow).
Which of the three would be the preferred approach for a REST API application?
Also, is there any limit on how many thread pools can be created, given that the number of ExecutorService thread pools will be driven by the number of incoming requests?

We currently have similar requirements. This is a difficult problem to solve, as you want to use the right hammer, if you will. There are very heavyweight solutions for orchestrating long-running processes, for example Spring Batch.
First, though, don't bother stopping and starting the ExecutorService. The whole point of that class is to take the burden of thread management off your hands, so you don't need to create and stop threads yourself. You don't need to manage the manager.
But be careful with your approach. Without queues or another load-balancing technique to spread the long-running processes across the instances of your app, and without handling what happens when a thread dies, you may get into a world of trouble. In general, I would say that nowadays it doesn't make much sense to interact directly with threads or thread pools; higher-level solutions are a better fit for this type of problem.

awaitTermination is usually a bit safer, while shutdownNow is more forceful. Keep in mind that awaitTermination does not shut the executor down by itself: it only blocks until all tasks have completed after a shutdown request, the timeout elapses, or the waiting thread is interrupted. Pair it with shutdown() when you would like the executor to stop as soon as possible, but only after it has completed everything that it was created to do.
Ex.)
ExecutorService executor = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
// RxJava 2+ API names are assumed here
Observable.fromIterable(items)
    .subscribeOn(Schedulers.from(executor))
    .flatMap(item -> {
        ... // this block represents a task that the executor will execute in a worker thread
    })
    .subscribe(
        onNext -> logItem(onNext),
        throwable -> throwable.printStackTrace(),
        /* onComplete */ () -> executor.shutdown());
executor.awaitTermination(60, TimeUnit.SECONDS); // wait on the calling thread, not on a pool thread
... // you need to shut down asap because these other methods below are also doing some computation/io-intensive stuff
When the stream completes, shutdown() stops the pool from accepting new tasks. awaitTermination then returns right away if no tasks are running, or waits up to 60 seconds for the running tasks to finish.
Idle worker threads of a cached thread pool are reclaimed after 60 seconds of inactivity by default; the threads of a fixed thread pool stay alive until the pool is shut down.
On the other hand, if you want tasks to stop executing as soon as (to give some examples) an exception is thrown, there is a breach in security, or another module/service has failed, you might want to use shutdownNow() to stop all tasks immediately without the option of waiting.
My advice for choosing between the two: use shutdownNow in your catch block if you do not want tasks to continue after an exception, i.e. when there is no longer a reason to return the list of items to the client because one of the items could not be added to the list.
Otherwise, I'd recommend calling shutdown followed by awaitTermination after your try-catch, with a timeout of one minute, to safely shut down the thread pool as soon as it has executed all the tasks you have given it. But only do that if you know that the executor will not be responsible for executing any more tasks down the line.
A plain shutdown, if that is an option for you, is also a good method. According to the Oracle docs, shutdown rejects all new tasks but lets previously submitted tasks run to completion; it does not itself wait for them.
If you're not sure when you need to close the executor, it might be a good idea to use a @PreDestroy method so that the executor is shut down just before your bean is destroyed:
@PreDestroy
private void cleanup() {
    executor.shutdown();
}
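The three calls can also be combined into one helper, following the two-phase pattern shown in the ExecutorService Javadoc. This is a sketch; the class and method names are assumptions, and a method like this could be called from a @PreDestroy hook:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper combining shutdown, awaitTermination and shutdownNow.
final class ExecutorShutdown {
    // Returns true if the pool terminated within the given timeout(s).
    static boolean shutdownGracefully(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // stop accepting new tasks; queued and running tasks continue
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // interrupt running tasks and discard queued ones
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

Note that shutdownNow can only interrupt tasks; a task that ignores interruption keeps running until it finishes on its own.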

How can I ensure an ExecutorService pool has completed, without shutting it down?

Currently, I'm making sure my tasks have finished before moving on like so:
ExecutorService pool = Executors.newFixedThreadPool(5);
public Set<Future> EnqueueWork(StreamWrapper stream) {
    Set<Future> futureObjs = new HashSet<>();
    util.setData(stream);
    Callable callable = util;
    Future future = pool.submit(callable);
    futureObjs.add(future);
    pool.shutdown();
    try {
        pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    Node.sendTCP(Node.getNodeByHostname(StorageTopology.getNextPeer()), Coordinator.prepareForTransport(stream));
    return futureObjs;
}
However, because of some other threading on my socket, it's possible that multiple calls are made to EnqueueWork - I'd like to make sure the calls to .submit have completed in the current thread, without shutting down the pool for subsequent threads coming in.
Is this possible?

You can check by invoking the isDone() method on all the Future objects in futureObjs; you need to call isDone() in a loop until it returns true for every one of them. Calling get() on each Future is another option: since get() is a blocking call, it returns only after the task is completed and the result is ready. But do you really want to keep the pool open after all the tasks are done?

I agree with one of the comments; it seems odd that your executor can be used by different threads. Usually an executor is private to an instance of some class, but anyhow.
What you can do, from the docs, is to check:
getActiveCount() - Returns the approximate number of threads that are actively executing tasks.
NOTE: This is a blocking method; it takes a lock on the workers of your thread pool and blocks until it has counted everything.
And also check:
getQueue() - Returns the task queue used by this executor. Access to the
task queue is intended primarily for debugging and monitoring.
This queue may be in active use. Retrieving the task queue
does not prevent queued tasks from executing.
If your queue is empty and the activeCount is 0, all your tasks should have finished. I say should because getActiveCount says "approximate". Looking at the impl, this is most likely because the worker internally has a flag indicating that it is locked (in use). There is in theory a slight race between executing and the worker being done and marking itself so.
A better approach would in fact be to track the futures. You would have to check the queue and that all futures are done.
However I think what you really need is to reverse your logic. Instead of the current thread trying to work out if another thread has submitted work in the meantime, you should have the other thread call isShutdown() and simply not submit a new task in that case.

You are approaching this issue from the wrong direction. If you need to know whether or not your tasks are finished, that means you have a dependency of A->B. The executor is the wrong place to enforce that dependency, just as you don't ask the engine of your car "are we there yet?".
Java offers several features to ensure that a certain state has been reached before starting a new execution path. One of them is the invokeAll method of the ExecutorService, that returns only when all tasks that have been submitted are completed.
pool.invokeAll(listOfAllMyCallables);
// if you reach this point all callables are completed
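A slightly fuller sketch of the invokeAll approach, showing that the pool stays usable afterwards. The class name, the pool size, and the trivial tasks are placeholders:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class InvokeAllDemo {
    // Runs every task on the pool and blocks until all have completed; the pool stays open.
    static List<Integer> runAll(ExecutorService pool, List<Callable<Integer>> tasks)
            throws InterruptedException, ExecutionException {
        List<Integer> results = new ArrayList<>();
        for (Future<Integer> future : pool.invokeAll(tasks)) {
            results.add(future.get()); // already completed, so get() returns immediately
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(5);
        List<Callable<Integer>> tasks = Arrays.asList(() -> 1, () -> 2, () -> 3);
        System.out.println(runAll(pool, tasks)); // prints [1, 2, 3]
        System.out.println(runAll(pool, tasks)); // the same pool can be reused
        pool.shutdown();
    }
}
```

invokeAll returns the futures in the same order as the tasks in the given list, so results line up with their callables.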

You have already added the Future to the set. Just add the code block below to get the status of each Future task by calling get() with a timeout.
In my example, the timeout is 60 seconds. You can change it as per your requirement.
Sample code:
try {
    for (Future future : futureObjs) {
        System.out.println("future.status = " + future.get(60000, TimeUnit.MILLISECONDS));
    }
} catch (Exception err) {
    err.printStackTrace();
}
Other useful posts:
How to forcefully shutdown java ExecutorService
How to wait for completion of multiple tasks in Java?

Thread pool that binds tasks for a given ID to the same thread

Are there any implementations of a thread pool (in Java) that ensures all tasks for the same logical ID are executed on the same thread?
The logic I'm after is if there is already a task being executed on a specific thread for a given logical ID, then new tasks with the same ID are scheduled on the same thread. If there are no threads executing a task for the same ID then any thread can be used.
This would allow tasks for unrelated IDs to be executed in parallel, but tasks for the same ID to be executed in serial and in the order submitted.
If not, are there any suggestions on how I might extend ThreadPoolExecutor to get this behaviour (if that's even possible)?
UPDATE
Having spent longer thinking about this, I don't actually require that tasks for the same logical ID get executed on the same thread, just that they don't get executed at the same time.
An example for this would be a system that processed orders for customers, where it was OK to process multiple orders at the same time, but not for the same customer (and all orders for the same customer had to be processed in order).
The approach I'm taking at the moment is to use a standard ThreadPoolExecutor, with a customised BlockingQueue and also wrapping the Runnable with a custom wrapper. The Runnable wrapper logic is:
Atomically attempt to add the ID to a concurrent 'running' set (backed by ConcurrentHashMap) to see if a task for the same ID is currently running
    if the add fails, push the task back on to the front of the queue and return immediately
    if it succeeds, carry on
Run the task
Remove the task's associated ID from the 'running' set
The queue's poll() methods then only return tasks that have an ID that is not currently in the 'running' set.
The trouble with this is that I'm sure there are going to be a lot of corner cases that I haven't thought about, so it's going to require a lot of testing.

Create an array of executor services running one thread each, and assign your queue entries to them by the hash code of your item ID. The array can be of any size, depending on the maximum number of threads you want to use.
This restricts what we can use from the executor service, but it still lets us use its ability to stop the only thread when it is no longer needed (with allowCoreThreadTimeOut(true)) and restart it as required. Also, all the queuing will work without rewriting it.
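A sketch of this idea, under a few assumptions: the class and method names are invented, the 30-second keep-alive is arbitrary, and each lane is built directly as a ThreadPoolExecutor because allowCoreThreadTimeOut is defined on that class:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical router: one single-threaded executor ("lane") per hash bucket.
final class HashPartitionedExecutor {
    private final ThreadPoolExecutor[] lanes;

    HashPartitionedExecutor(int laneCount) {
        lanes = new ThreadPoolExecutor[laneCount];
        for (int i = 0; i < laneCount; i++) {
            // Exactly one worker per lane, so tasks in a lane run serially in FIFO order.
            ThreadPoolExecutor lane = new ThreadPoolExecutor(
                    1, 1, 30, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
            lane.allowCoreThreadTimeOut(true); // an idle lane releases its thread
            lanes[i] = lane;
        }
    }

    // floorMod keeps the index non-negative even for negative hash codes.
    int laneFor(Object id) {
        return Math.floorMod(id.hashCode(), lanes.length);
    }

    void execute(Object id, Runnable task) {
        lanes[laneFor(id)].execute(task);
    }

    void shutdown() {
        for (ThreadPoolExecutor lane : lanes) {
            lane.shutdown();
        }
    }
}
```

The trade-off is that unrelated IDs that hash to the same lane also wait for each other.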

The simplest idea could be this:
Have a fixed map of BlockingQueues. Use a hash mechanism to pick a queue based on the task ID. The hash algorithm should pick the same queue for the same IDs. Start a single thread for every queue. Every thread will pick one task at a time from its own dedicated queue and execute it.
P.S. The appropriate solution strongly depends on the type of work you assign to the threads.
UPDATE
Ok, how about this crazy idea, please bear with me :)
Say, we have a ConcurrentHashMap which holds references id -> OrderQueue
ID1->Q1, ID2->Q2, ID3->Q3, ...
This means that every id is now associated with its own queue. OrderQueue is a custom blocking queue with an additional boolean flag, isAssociatedWithWorkingThread.
There is also a regular BlockingQueue, which we will call amortizationQueue for now; you'll see its use later.
Next, we have N working threads. Every working thread has its own working queue, which is a BlockingQueue containing the ids associated with this thread.
When a new id comes, we do the following:
create a new OrderQueue(isAssociatedWithWorkingThread=false)
put the task to the queue
put id->OrderQueue to the map
put this OrderQueue to amortizationQueue
When an update for existing id comes we do the following:
pick OrderQueue from the map
put the task to the queue
if isAssociatedWithWorkingThread == false
put this OrderQueue to amortizationQueue
Every working thread does the following:
take next id from the working queue
take the OrderQueue associated with this id from the map
take all tasks from this queue
execute them
mark isAssociatedWithWorkingThread=false for this OrderQueue
put this OrderQueue to amortizationQueue
Pretty straightforward. Now to the fun part - work stealing :)
If at some point of time some working thread finds itself with empty working queue, then it does the following:
go to the pool of all working threads
pick one (say, one with the longest working queue)
steal id from *the tail* of that thread's working queue
put this id to it's own working queue
continue with regular execution
And there is also one additional thread, which does the amortization work:
while (true)
    take next OrderQueue from amortizationQueue
    if queue is not empty and isAssociatedWithWorkingThread == false
        set isAssociatedWithWorkingThread=true
        pick any working thread and add the id to its working queue
I will have to spend more time thinking about whether you can get away with an AtomicBoolean for the isAssociatedWithWorkingThread flag, or whether checking and changing this flag needs to be a blocking operation.

I had to deal with a similar situation recently.
I ended up with a design similar to yours. The only difference was that the "current" was a map rather than a set: a map from ID to a queue of Runnables. When the wrapper around task's runnable sees that its ID is present in the map it adds the task's runnable to the ID's queue and returns immediately. Otherwise the ID is added to the map with empty queue and the task is executed.
When the task is done, the wrapper checks the ID's queue again. If the queue is not empty, the runnable is picked. Otherwise it's removed from the map and we're done.
I'll leave shutdown and cancelation as an exercise to the reader :)
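A possible shape for that design, as a sketch rather than the answerer's actual code. It differs in one detail: the map is checked when the task is submitted, not when the wrapper starts running on the pool. The class name is invented, and shutdown and cancellation are still left out:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.Executor;

// Hypothetical wrapper: tasks with the same key never overlap and run in submission order.
final class KeyedSerialExecutor<K> {
    private final Executor delegate;
    // Guarded by "this": key -> tasks waiting behind the one currently running for that key.
    private final Map<K, Queue<Runnable>> pending = new HashMap<>();

    KeyedSerialExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    void execute(K key, Runnable task) {
        synchronized (this) {
            Queue<Runnable> waiting = pending.get(key);
            if (waiting != null) {
                waiting.add(task); // a task for this key is active: queue behind it
                return;
            }
            pending.put(key, new ArrayDeque<>()); // mark the key as active
        }
        delegate.execute(() -> runAndDrain(key, task));
    }

    private void runAndDrain(K key, Runnable first) {
        Runnable current = first;
        while (current != null) {
            try {
                current.run();
            } catch (RuntimeException e) {
                e.printStackTrace(); // keep draining even if one task fails
            }
            synchronized (this) {
                current = pending.get(key).poll();
                if (current == null) {
                    pending.remove(key); // nothing left: the key is idle again
                }
            }
        }
    }
}
```

Because the queue check and the removal of the key happen under the same lock, a task submitted while the last one is finishing is either picked up by the draining loop or starts a fresh run, never lost.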

Our approach is similar to what is in the update of the original question. We have a wrapper class that is a runnable that contains a queue (LinkedTransferQueue) which we call a RunnableQueue. The runnable queue has the basic API of:
public class RunnableQueue implements Runnable
{
    public RunnableQueue(String name, Executor executor);
    public void run();
    public void execute(Runnable runnable);
}
When the user submits the first Runnable via the execute call, the RunnableQueue enqueues itself on the executor. Subsequent calls to execute get queued up on the queue inside the RunnableQueue. When the runnable queue gets executed by the thread pool (via its run method), it starts to "drain" the internal queue by serially executing the runnables one by one. If execute is called on the RunnableQueue while it is executing, the new runnables simply get appended to the internal queue. Once the queue is drained, the run method of the runnable queue completes and it "leaves" the executor pool. Rinse and repeat.
We have other optimizations that do things like only let some number of runnables run (e.g. four) before the RunnableQueue re-posts itself to the executor pool.
The only really tricky bit inside (and it isn't that hard) is to synchronize around whether it is currently posted to the executor, so that it doesn't post itself twice or miss when it should post.
Overall we find this to work pretty well. The "ID" (semantic context) for us is the runnable queue. The component that needs it (e.g. a plugin) has a reference to the RunnableQueue and not to the executor pool, so it is forced to work exclusively through the RunnableQueue. This not only guarantees that all accesses are serially sequenced (thread confinement) but also lets the RunnableQueue "moderate" the plugin's job loading. Additionally, it requires no centralized management structure or other points of contention.
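A bare-bones sketch of such a RunnableQueue, written from the description above and not the answerer's real class. It swaps the LinkedTransferQueue for an ArrayDeque guarded by a lock, uses a hypothetical "scheduled" flag for the posting logic, and omits the re-posting optimization:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;

final class RunnableQueue implements Runnable {
    private final String name;
    private final Executor executor;
    private final Queue<Runnable> tasks = new ArrayDeque<>(); // guarded by "this"
    private boolean scheduled;                                 // guarded by "this"

    RunnableQueue(String name, Executor executor) {
        this.name = name;
        this.executor = executor;
    }

    public void execute(Runnable runnable) {
        boolean post;
        synchronized (this) {
            tasks.add(runnable);
            post = !scheduled; // post only if we are not already on the executor
            scheduled = true;
        }
        if (post) {
            executor.execute(this);
        }
    }

    @Override
    public void run() {
        while (true) {
            Runnable next;
            synchronized (this) {
                next = tasks.poll();
                if (next == null) {
                    scheduled = false; // drained: leave the pool until the next execute()
                    return;
                }
            }
            try {
                next.run();
            } catch (RuntimeException e) {
                e.printStackTrace(); // one failing task must not stop the queue
            }
        }
    }

    @Override
    public String toString() {
        return "RunnableQueue(" + name + ")";
    }
}
```

Adding to the queue and deciding whether to post happen in one synchronized block, which is what prevents both double posting and missed posts.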

I have to implement a similar solution, and the suggestion by h22 of creating an array of executor services seems the best approach to me, with one caveat: I will take the modulus (%) of the ID (either the raw ID, assuming it is a long/int, or its hash code) relative to some desired maximum size and use the result as the new ID. That way I can strike a balance between ending up with far too many executor service objects and still getting a good amount of concurrency in the processing.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class ExecutorServiceRouter {
    private List<ExecutorService> services;
    private int size;
    public ExecutorServiceRouter(int size) {
        services = new ArrayList<ExecutorService>(size);
        this.size = size;
        for (int i = 0; i < size; i++) {
            services.add(Executors.newSingleThreadExecutor());
        }
    }
    public void route(long id, Runnable r) {
        // floorMod keeps the index non-negative even when id is negative
        services.get((int) Math.floorMod(id, (long) size)).execute(r);
    }
    public void shutdown() {
        for (ExecutorService service : services) {
            service.shutdown();
        }
    }
}

Extending ThreadPoolExecutor would be quite difficult. I would suggest you go for a producer-consumer system. Here is what I am suggesting.
You can create typical producer-consumer systems. Check out the code mentioned in this question.
Each of these systems has a queue and a single consumer thread, which processes the tasks in the queue serially.
Now, create a pool of such individual systems.
When you submit a task for a related ID, see if there is already a system marked for that related ID which is currently processing tasks. If yes, submit the task to it.
If a system is not processing any tasks, mark that system with the new related ID and submit the task.
This way a single system caters for only one logical related ID at a time.
Here I am assuming that a related ID is a logical bunch of individual IDs, and that the producer-consumer systems will be created for related IDs and NOT for individual IDs.

a "simple" thread pool in java

I'm looking for a simple object that will hold my work threads and I need it to not limit the number of threads, and not keep them alive longer than needed.
But I do need it to have a method similar to an ExecutorService.shutdown();
(Waiting for all the active threads to finish but not accepting any new ones)
so maybe a threadpool isn't what I need, so I would love a push in the right direction.
(as they are meant to keep the threads alive)
Further clarification of intent:
Each thread is an upload of a file, and I have another process that modifies files, but it waits until the file has no uploads in progress by joining each of the threads. So when the threads are kept alive, that process is blocked. (Each thread adds itself to a list for a specific file on creation, so I only join() threads that upload a specific file.)

One way to do what you want is to use a Callable with a Future that returns the File object of a completed upload. Then pass the Future into another Callable that checks Future.isDone() and spins until it returns true, and then does whatever you need to do to the file. Your use case is not unique and fits very neatly into the capabilities of the java.util.concurrent package.
One interesting class is ExecutorCompletionService, which does exactly what you want: waiting for results, then proceeding with an additional calculation.
A CompletionService that uses a
supplied Executor to execute tasks.
This class arranges that submitted
tasks are, upon completion, placed on
a queue accessible using take. The
class is lightweight enough to be
suitable for transient use when
processing groups of tasks.
Usage Examples: Suppose you have a set of solvers for a certain problem,
each returning a value of some type
Result, and would like to run them
concurrently, processing the results
of each of them that return a non-null
value, in some method use(Result r).
You could write this as:
void solve(Executor e, Collection<Callable<Result>> solvers)
    throws InterruptedException, ExecutionException
{
    CompletionService<Result> ecs = new ExecutorCompletionService<Result>(e);
    for (Callable<Result> s : solvers) { ecs.submit(s); }
    int n = solvers.size();
    for (int i = 0; i < n; ++i)
    {
        Result r = ecs.take().get();
        if (r != null) { use(r); }
    }
}
You don't want an unbounded ExecutorService
You almost never want to allow unbounded thread pools, as they can actually limit the performance of your application if the number of threads gets out of hand.
Your domain is limited by disk or network I/O, or both, so a small thread pool would be sufficient. You are not going to want to read from hundreds or thousands of incoming connections with a thread per connection.
If you are receiving more than a handful of concurrent uploads, part of your solution is to investigate the java.nio package and read about non-blocking I/O as well.

Is there a reason that you don't want to reuse threads? Seems to me that the simplest thing would be to use ExecutorService anyway and let it reuse threads.
