While setting up thread pool configuration, how do you choose the correct RejectedExecutionHandler?
I have a legacy application that publishes events (the events may be consumed locally or by a remote process). At the moment the policy is AbortPolicy, which causes lots of exceptions and missed events. We pass a SynchronousQueue to the ThreadPoolExecutor.
I was thinking of changing the RejectedExecutionHandler to CallerRunsPolicy. That would mean the caller spends time running the task itself whenever the thread bound and queue capacity are reached. I don't see any problem with that.
What has been your experience so far? Also, does using an unbounded queue mean there is no utility for the RejectedExecutionHandler?
I think you are already familiar with the different RejectedExecutionHandlers of ThreadPoolExecutor.
In ThreadPoolExecutor.CallerRunsPolicy, the thread that invokes execute itself runs the task. This provides a simple feedback control mechanism that will slow down the rate that new tasks are submitted.
It will impact the overall performance of your application. If your application can afford this delay (non-interactive, offline, batch processing rather than real-time work), you can use this policy. If you can't afford the delay and are fine with discarding the task, you can go for ThreadPoolExecutor.DiscardPolicy.
Is using unbounded queue means no utility for RejectedExecutionHandler?
Yes. An unbounded queue means there is no utility for the RejectedExecutionHandler, because tasks are never rejected (tasks are only rejected after shutdown). When you use an unbounded queue, make sure your application's throughput stays under control with respect to memory and CPU utilization. If you are submitting short-duration tasks whose data has a small memory footprint, an unbounded queue can work.
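For illustration, here is a minimal sketch of the setup described in the question: a SynchronousQueue handoff with CallerRunsPolicy instead of the default AbortPolicy (the pool sizes are hypothetical and would need tuning to your workload).

import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public ThreadPoolExecutor eventPublisherExecutor() {
    return new ThreadPoolExecutor(
            4,                        // core pool size (hypothetical)
            16,                       // max pool size (hypothetical)
            60L, TimeUnit.SECONDS,    // keep-alive for threads beyond the core
            new SynchronousQueue<>(), // direct handoff: tasks are never buffered
            new ThreadPoolExecutor.CallerRunsPolicy()); // on saturation, the
                                      // submitting thread runs the task itself
}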
Related
I am working on an application that needs to continuously test thousands of proxy servers. The application is built on Spring Boot.
The current approach I am using is an @Async-annotated method that takes a proxy server and returns the result.
I often get an OutOfMemoryError, and the processing is very slow. I assume that is because each async method executes in a separate thread that blocks on I/O?
Everywhere I read about async in Java, people conflate parallel execution on threads with non-blocking I/O. In the Python world there is the asyncio library, which executes I/O requests in a single thread: while one method is waiting for a response from a server, it starts executing another method.
I think in my case I need something like that, because Spring's @Async is not suitable for me. Can someone please clear up my confusion and suggest how I should go about this challenge?
I want to check hundreds of proxies simultaneously without creating excessive load.
I have read about the Apache Async HTTP Client, but I don't know if it is suitable.
This is the thread pool configuration I am using:
public Executor proxyTaskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(Runtime.getRuntime().availableProcessors() * 2 - 1);
    executor.setMaxPoolSize(100);
    executor.setDaemon(true);
    return executor;
}
I am often getting OutOfMemory error and the processing is very slow. I assume that is because each async method is executed in a separate thread which blocks on I/O?
For the OOME, see the second point below.
About the slowness: it is indeed related to the I/O performed while processing the requests and responses.
The problem comes from the number of threads effectively running in parallel.
With your current configuration, the max pool size is never reached (I explain why below).
Suppose corePoolSize == 10 in your case. That means 10 threads run in parallel, and suppose each thread takes about 3 seconds to test a site.
So on average you test one site every 0.3 seconds, and testing 1000 sites takes about 300 seconds.
That is rather slow, and a large part of that time is waiting time: the I/O to send the request to, and receive the response from, the site currently being tested.
To increase the overall speed, you should initially run many more threads in parallel than your core capacity. That way, I/O waiting time becomes less of a problem: while some threads are blocked waiting on I/O, the scheduler can run the others.
That should handle the OOME issue and probably improve the execution time considerably, though there is no guarantee you will get a very short time.
To achieve that, you would probably need to design the multi-threading logic more finely and rely on APIs/libraries with non-blocking I/O.
Here is some information from the official Spring documentation that should be helpful.
This part explains the overall logic when a task is submitted:
The configuration of the thread pool should also be considered in light of the executor’s queue capacity. For the full description of the relationship between pool size and queue capacity, see the documentation for ThreadPoolExecutor. The main idea is that, when a task is submitted, the executor first tries to use a free thread if the number of active threads is currently less than the core size. If the core size has been reached, the task is added to the queue, as long as its capacity has not yet been reached. Only then, if the queue’s capacity has been reached, does the executor create a new thread beyond the core size. If the max size has also been reached, then the executor rejects the task.
And this explains the consequences of the queue size:
By default, the queue is unbounded, but this is rarely the desired configuration, because it can lead to OutOfMemoryErrors if enough tasks are added to that queue while all pool threads are busy. Furthermore, if the queue is unbounded, the max size has no effect at all. Since the executor always tries the queue before creating a new thread beyond the core size, a queue must have a finite capacity for the thread pool to grow beyond the core size (this is why a fixed-size pool is the only sensible case when using an unbounded queue).
Long story short: you didn't set the queue capacity, which by default is unbounded (Integer.MAX_VALUE). So you fill the queue with several hundred tasks that will only be popped much later. Those queued tasks hold a lot of memory, hence the OOME.
Besides, as explained in the documentation, this setting has no effect with an unbounded queue, because a new thread is created only when the queue is full:
executor.setMaxPoolSize(100);
Setting both values explicitly makes more sense:
public Executor proxyTaskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(Runtime.getRuntime().availableProcessors() * 2 - 1);
    executor.setMaxPoolSize(100);
    executor.setQueueCapacity(100);
    executor.setDaemon(true);
    return executor;
}
Or, as an alternative, use a fixed-size pool with the same value for the core and max pool sizes:
Rather than only a single size, an executor’s thread pool can have different values for the core and the max size. If you provide a single value, the executor has a fixed-size thread pool (the core and max sizes are the same).
public Executor proxyTaskExecutor() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(100);
    executor.setMaxPoolSize(100);
    executor.setDaemon(true);
    return executor;
}
Note also that invoking the async service 1000 times without any pause looks harmful memory-wise, since the pool cannot handle the submissions right away. You should probably split those invocations into smaller batches (2, 3 or more) with a Thread.sleep() between them.
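A rough sketch of that batching idea (the batch size, the pause, and the Proxy type and checkAsync() service method are hypothetical placeholders):

import java.util.List;

// Hypothetical: submit the proxies in batches, pausing between batches
// so the queue never holds the full workload at once.
void submitInBatches(List<Proxy> proxies, ProxyCheckService service)
        throws InterruptedException {
    int batchSize = 250;
    for (int i = 0; i < proxies.size(); i += batchSize) {
        int end = Math.min(i + batchSize, proxies.size());
        for (Proxy proxy : proxies.subList(i, end)) {
            service.checkAsync(proxy); // the @Async method from the question
        }
        Thread.sleep(5_000); // let the pool drain before the next batch
    }
}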
In the Oracle documentation for the ThreadPoolExecutor class it is written:
There are three general strategies for queuing:
Direct handoffs. A good default choice for a work queue is a SynchronousQueue that hands off tasks to threads without otherwise holding them. Here, an attempt to queue a task will fail if no threads are immediately available to run it, so a new thread will be constructed. This policy avoids lockups when handling sets of requests that might have internal dependencies. Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted tasks. This in turn admits the possibility of unbounded thread growth when commands continue to arrive on average faster than they can be processed.
Unbounded queues. Using an unbounded queue (for example a LinkedBlockingQueue without a predefined capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize therefore doesn't have any effect.) This may be appropriate when each task is completely independent of others, so tasks cannot affect each others execution; for example, in a web page server. While this style of queuing can be useful in smoothing out transient bursts of requests, it admits the possibility of unbounded work queue growth when commands continue to arrive on average faster than they can be processed.
...
Why is the direct handoff strategy better at avoiding lockups than the unbounded-queue strategy? Or do I understand it incorrectly?
Let's say you have corePoolSize = 1 with an unbounded queue. If the first task submits another task to the same pool and waits for its result, it will lock up indefinitely: the second task sits in the queue, and the only thread is blocked waiting for it. With a direct handoff, the queue insert fails instead, so the pool constructs a new thread to run the second task (see the sketch below).
However, if a task is completely independent, there is no reason to prefer direct handoff as far as preventing lockups is concerned.
This is just one example; "internal dependency" can mean a lot of different things.
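Here is a minimal sketch of that lockup: one worker thread, unbounded queue, and an outer task blocking on an inner task that is stuck behind it in the queue.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LockupDemo {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(1); // one thread, unbounded queue
        pool.submit(() -> {
            // The inner task waits in the queue forever: the pool's only
            // thread is busy right here, blocked on the inner task's result.
            return pool.submit(() -> 42).get(); // deadlock
        });
        // With a SynchronousQueue and an unbounded max pool size, the inner
        // submit would fail the handoff and a new thread would run the task.
    }
}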
I want to use an ExecutorService that uses a single thread, and I am inserting requests (via submit) at a higher rate than that thread can deal with them. What happens?
I am specifically wondering:
Are there any guarantees on ordering - will tasks be executed in the exact same order?
Is there a (theoretical) limit at which the ExecutorService will start throwing away incoming requests?
Out of curiosity: what changes when the service is using a pool of threads?
(sure, I can assume that some queue might be used; and that the Oracle implementation just "does the right thing"; but I am actually wondering if there is a real "spec" somewhere that nails down the expected behavior)
If you created a fixed thread-pool ExecutorService with Executors.newFixedThreadPool(1) (or newSingleThreadExecutor()), then the Javadoc clearly specifies what happens.
Are there any guarantees on ordering - will tasks be executed in the exact same order?
A fixed thread pool uses a LinkedBlockingQueue to hold pending tasks. That queue is FIFO (first-in, first-out), so the order of execution is guaranteed.
Is there a (theoretical) limit at which the ExecutorService will start throwing away incoming requests?
Quoting the Javadoc:
If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available.
Every incoming request is added to an unbounded queue, so there is no practical limit and no request will be rejected (the theoretical limit is Integer.MAX_VALUE pending tasks).
Out of curiosity: what changes when the service is using a pool of threads?
If you mean "what changes if there is more than one thread in the fixed thread pool", then nothing: the queue still has a FIFO nature and there is still no limit on it. Otherwise, it depends on how you create the thread pool.
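As a toy illustration of both points in the single-thread case (the printed order follows from the FIFO queue):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FifoDemo {
    public static void main(String[] args) {
        ExecutorService single = Executors.newFixedThreadPool(1);
        for (int i = 0; i < 5; i++) {
            final int n = i;
            // Tasks pile up in the LinkedBlockingQueue and run strictly
            // in submission order: prints 0, 1, 2, 3, 4.
            single.submit(() -> System.out.println("task " + n));
        }
        single.shutdown(); // previously submitted tasks still execute
    }
}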
I take it you are getting your ExecutorService via Executors.newSingleThreadExecutor()?
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.) Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
So:
Are there any guarantees on ordering - will tasks be executed in the exact same order?
Tasks are guaranteed to execute sequentially.
Is there a (theoretical) limit at which the ExecutorService will start throwing away incoming requests?
It operates off an unbounded queue, so the limit is as large as memory (or the backing store of the queue) will allow; commonly Integer.MAX_VALUE.
Out of curiosity: what changes when the service is using a pool of threads?
Depends on how you create the ExecutorService. You could create one with a bounded queue if you wished, or with a queue that does not use FIFO ordering (such as a PriorityBlockingQueue). The documentation for ThreadPoolExecutor gives a good overview of the different options.
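If you did want a bound, you would drop down to the ThreadPoolExecutor constructor directly; a sketch (the capacity of 1000 is arbitrary):

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Same shape as newFixedThreadPool(1), but with a bounded queue: once
// 1000 tasks are pending, further submissions are rejected with
// RejectedExecutionException (the default AbortPolicy) instead of
// growing the queue without limit.
ThreadPoolExecutor bounded = new ThreadPoolExecutor(
        1, 1,                       // single worker thread
        0L, TimeUnit.MILLISECONDS,  // no keep-alive needed for a fixed pool
        new LinkedBlockingQueue<>(1000));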
I'm trying to set up a job that will run every x minutes/seconds/milliseconds/whatever and poll an Amazon SQS queue for messages to process. My question is what the best approach would be. Should I create a ScheduledThreadPoolExecutor with x threads and schedule a single task with scheduleAtFixedRate, running it very often (say every 10 ms) so that multiple threads get used when needed? Or, as I am proposing to colleagues, should I create a ScheduledThreadPoolExecutor with x threads and then schedule multiple tasks at slightly offset intervals, each running less often? The latter sounds to me like how the STPE was meant to be used.
Typically I use Spring/Quartz for this type of thing but that's out of at this point.
So what are your thoughts?
I recommend that you use long polling on SQS, which makes your ReceiveMessage calls behave more like take() on a BlockingQueue. That means you won't need a scheduled task to poll the queue; you just need a single thread that polls in an infinite loop, retrying if the connection times out.
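For example, with the AWS SDK for Java (v1 API; the queue URL and the process() handler are hypothetical), the polling thread could look roughly like this:

import com.amazonaws.services.sqs.AmazonSQS;
import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
import com.amazonaws.services.sqs.model.Message;
import com.amazonaws.services.sqs.model.ReceiveMessageRequest;

void pollForever(String queueUrl) {
    AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
    while (!Thread.currentThread().isInterrupted()) {
        // waitTimeSeconds(20) makes this a long poll: the call blocks for
        // up to 20 seconds until a message arrives, like BlockingQueue.take()
        ReceiveMessageRequest request = new ReceiveMessageRequest(queueUrl)
                .withWaitTimeSeconds(20)
                .withMaxNumberOfMessages(10);
        for (Message message : sqs.receiveMessage(request).getMessages()) {
            process(message); // hypothetical handler
            sqs.deleteMessage(queueUrl, message.getReceiptHandle());
        }
    }
}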
Well, it depends on the frequency of the tasks. If you just have to poll at a regular interval and the interval is not very small, then a ScheduledThreadPoolExecutor with scheduleAtFixedRate is a good choice.
Otherwise, I recommend Netty's HashedWheelTimer. Under heavy task loads it gives the best performance; Akka and Play use it for scheduling. This is because adding a task to an STPE costs O(log n), whereas the HWT costs O(1).
If you have to use an STPE, I recommend a single task at a fixed rate; multiple scheduled tasks just consume extra resources.
Long polling behaves like a blocking queue, but only for a maximum of 20 seconds, after which the call returns. If that is an acceptable maximum delay between poll cycles, long polling is sufficient. Beyond that you will need a ScheduledExecutorService.
The number of threads really depends on how fast you can process the received messages. If you can process a message really fast, you only need a single thread. I have a setup as follows:
A single-threaded scheduled executor with scheduleWithFixedDelay, which executes 5 minutes after the previous completion.
In each execution, messages are retrieved from SQS in batches until there are no more messages to process (remember each batch receives at most 10 messages).
The messages are processed and then deleted from the queue.
For my scenario a single thread is sufficient. If the backlog keeps growing (for example, because each message requires a network operation that may involve waits), you might want to use multiple threads. If one processing node becomes resource-constrained, you can always start another instance (EC2, perhaps) to add capacity.
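The scheduling side of that setup might look like the sketch below, where drainQueue() is a hypothetical method containing the batch receive/process/delete loop described above:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
// Fixed *delay*, not fixed rate: the next run starts 5 minutes after the
// previous run completes, so executions never overlap even if a drain is slow.
scheduler.scheduleWithFixedDelay(this::drainQueue, 0, 5, TimeUnit.MINUTES);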
We need to do some asynchronous task processing wherein around 30-40 requests will arrive at the same moment, and each request will initiate an async task that takes roughly 7-8 seconds to complete.
If the Java ExecutorService has been identified as the tool for this, what would be the ideal type of executor for the purpose?
I thought of using a CachedThreadPool, but my worry is: if too many threads are created, would it have any performance impact on the application?
Another option would be a FixedThreadPool, but I am struggling to think of the ideal number of threads to instantiate it with...
What is the recommended Executor for such a scenario, and how do we go about finding the right one?
I think you are limiting your research to just the Executors.* factory methods. You should review the full range of ThreadPoolExecutor constructors; you'll find a maximum pool size parameter, among other things.
I thought of using CachedThreadPool but my worry is if too many threads are created would it have any performance impact on the application?
You need to test the application to measure the performance impact.
If none of the factory methods fits your application, or you run into issues with them, you can use a customized thread pool via java.util.concurrent.ThreadPoolExecutor.
You can tailor it to your needs by configuring the core pool size and the blocking queue; tasks are queued in the blocking queue once the core pool size is reached.
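For the scenario above (bursts of 30-40 requests, each task taking 7-8 seconds), a sketch of such a customized executor might be as follows; all sizes are guesses that you would tune under load:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Fixed at roughly one thread per request in a typical burst, with a
// bounded queue for overflow and CallerRunsPolicy as back-pressure.
ThreadPoolExecutor executor = new ThreadPoolExecutor(
        40, 40,                        // core == max: fixed-size pool
        60L, TimeUnit.SECONDS,         // idle timeout (see next line)
        new ArrayBlockingQueue<>(100),
        new ThreadPoolExecutor.CallerRunsPolicy());
executor.allowCoreThreadTimeOut(true); // unlike a plain fixed pool, idle
                                       // threads die after 60s between bursts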