In the Oracle documentation for the ThreadPoolExecutor class, it is written:
There are three general strategies for queuing:
Direct handoffs. A good default choice for a work queue is a SynchronousQueue that hands off tasks to threads without otherwise holding them. Here, an attempt to queue a task will fail if no threads are immediately available to run it, so a new thread will be constructed. This policy avoids lockups when handling sets of requests that might have internal dependencies. Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted tasks. This in turn admits the possibility of unbounded thread growth when commands continue to arrive on average faster than they can be processed.
Unbounded queues. Using an unbounded queue (for example a LinkedBlockingQueue without a predefined capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize therefore doesn't have any effect.) This may be appropriate when each task is completely independent of others, so tasks cannot affect each others execution; for example, in a web page server. While this style of queuing can be useful in smoothing out transient bursts of requests, it admits the possibility of unbounded work queue growth when commands continue to arrive on average faster than they can be processed.
...
Why is the direct handoff strategy better at avoiding lockups than the unbounded queue strategy? Or do I understand it incorrectly?
Let's say you have corePoolSize = 1. If the first task submits another task to the same pool and waits for the result, it will lock up indefinitely.
However, if a task is completely independent, there would be no reason to use direct handoff as far as preventing lockups is concerned.
This is just an example; internal dependency can mean a lot of different things.
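A minimal sketch of that lockup (the pool setup mirrors the example above; the task bodies are illustrative):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class QueueLockup {
    public static void main(String[] args) throws Exception {
        // corePoolSize = 1 backed by an unbounded queue (what newFixedThreadPool(1) uses)
        ExecutorService pool = Executors.newFixedThreadPool(1);

        Future<String> outer = pool.submit(() -> {
            // The inner task lands in the queue behind us...
            Future<String> inner = pool.submit(() -> "inner result");
            // ...while we hold the pool's only thread waiting for it:
            return inner.get(); // locks up indefinitely
        });

        System.out.println(outer.get()); // never reached
    }
}

With a direct handoff (a SynchronousQueue plus an effectively unbounded maximumPoolSize), the inner task would instead be handed to a freshly created second thread, and no lockup would occur.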
Related
I see from the Java docs -
ThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
BlockingQueue<Runnable> workQueue,
RejectedExecutionHandler handler)
Where -
workQueue – the queue to use for holding tasks before they are executed. This queue will hold only the Runnable tasks submitted by the execute method.
Now, Java provides various types of blocking queues, and the Javadoc clearly says when to use which type of queue with a ThreadPoolExecutor -
Queuing
Any BlockingQueue may be used to transfer and hold submitted tasks. The use of this queue interacts with pool sizing:
If fewer than corePoolSize threads are running, the Executor always prefers adding a new thread rather than queuing.
If corePoolSize or more threads are running, the Executor always prefers queuing a request rather than adding a new thread.
If a request cannot be queued, a new thread is created unless this would exceed maximumPoolSize, in which case, the task will be rejected.
There are three general strategies for queuing:
Direct handoffs. A good default choice for a work queue is a SynchronousQueue that hands off tasks to threads without otherwise holding them. Here, an attempt to queue a task will fail if no threads are immediately available to run it, so a new thread will be constructed. This policy avoids lockups when handling sets of requests that might have internal dependencies. Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted tasks. This in turn admits the possibility of unbounded thread growth when commands continue to arrive on average faster than they can be processed.
Unbounded queues. Using an unbounded queue (for example a LinkedBlockingQueue without a predefined capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize therefore doesn't have any effect.) This may be appropriate when each task is completely independent of others, so tasks cannot affect each others execution; for example, in a web page server. While this style of queuing can be useful in smoothing out transient bursts of requests, it admits the possibility of unbounded work queue growth when commands continue to arrive on average faster than they can be processed.
Bounded queues. A bounded queue (for example, an ArrayBlockingQueue) helps prevent resource exhaustion when used with finite maximumPoolSizes, but can be more difficult to tune and control. Queue sizes and maximum pool sizes may be traded off for each other: Using large queues and small pools minimizes CPU usage, OS resources, and context-switching overhead, but can lead to artificially low throughput. If tasks frequently block (for example if they are I/O bound), a system may be able to schedule time for more threads than you otherwise allow. Use of small queues generally requires larger pool sizes, which keeps CPUs busier but may encounter unacceptable scheduling overhead, which also decreases throughput.
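For reference, the three strategies correspond to these work-queue choices (a minimal illustration; the capacity of 100 is arbitrary):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

public class QueueStrategies {
    // Direct handoff: holds no elements; an offer fails unless a thread is waiting.
    static BlockingQueue<Runnable> directHandoff = new SynchronousQueue<>();
    // Unbounded: defaults to Integer.MAX_VALUE capacity; maximumPoolSize is never consulted.
    static BlockingQueue<Runnable> unbounded = new LinkedBlockingQueue<>();
    // Bounded: fixed capacity; trades queue length against pool size.
    static BlockingQueue<Runnable> bounded = new ArrayBlockingQueue<>(100);
}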
Below is my question -
I have seen code usage like the below -
BlockingQueue<Runnable> workQueue = new LinkedBlockingDeque<>(90); // bounded deque, capacity 90
ExecutorService executorService = new ThreadPoolExecutor(
        1, 10,                        // corePoolSize, maximumPoolSize
        30, TimeUnit.SECONDS,         // keepAliveTime
        workQueue,
        new ThreadPoolExecutor.CallerRunsPolicy());
So, given that the deque in the above code has a fixed capacity anyway, what advantage am I getting with LinkedBlockingDeque<>(90) compared to the below -
LinkedBlockingQueue<>(90); - I just want to know about the deque's advantage over a queue in this case, not in general. How will the executor benefit from a deque over a queue?
ArrayBlockingQueue<>(90); - (I see one can also mention fairness etc., but that is not of my current interest.) So why not just use an array over a deque (i.e., when using a deque of fixed capacity)?
LinkedBlockingQueue is an optionally-bounded blocking queue based on linked nodes. If you give it no capacity, it defaults to Integer.MAX_VALUE (effectively unbounded); with a capacity of 90, as in your example, it is bounded.
ArrayBlockingQueue is a bounded blocking queue in which a fixed-size array holds the elements.
In your case, there's no benefit anywhere. ArrayBlockingQueue may prove to be more efficient, as it stores its elements in a single pre-allocated array.
The difference between Queue and Deque is just the access mechanism. A Queue is FIFO (first-in, first-out), while a Deque is double-ended: it supports insertion and removal at both head and tail, so it can be used either FIFO or LIFO (last-in, first-out).
In FIFO the first task inserted is the first one to be executed.
In LIFO the last task inserted is the first one to be executed.
Consider the following: you want your tasks to be executed in the order they come in? Use FIFO. You want the most recently submitted tasks to run first? Use LIFO. Note, however, that ThreadPoolExecutor only ever calls the Queue-side methods (offer, poll, take) on its work queue, so a LinkedBlockingDeque behaves exactly like a FIFO queue inside the executor.
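To make that concrete, here is a small sketch (the capacity and values are illustrative):

import java.util.concurrent.LinkedBlockingDeque;

public class DequeDemo {
    public static void main(String[] args) throws InterruptedException {
        LinkedBlockingDeque<String> deque = new LinkedBlockingDeque<>(90);
        deque.offer("a");                  // tail insert - what the executor does
        deque.offer("b");
        System.out.println(deque.take());  // "a" - head removal, i.e. FIFO
        deque.offerFirst("urgent");        // head insert - only a deque allows this
        System.out.println(deque.take());  // "urgent" - LIFO-style behavior
    }
}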
The main benefit is when you're using the thread pool to execute some kind of pipeline. As a rule of thumb, at each stage in a pipeline, the queue either is almost always empty (the producers tend to be slower than the consumers) or is almost always full (the producers tend to be faster).
If the producers are faster, and the application is meant to continue running indefinitely, then you need a fixed-size, blocking queue to put "back pressure" on the producers. If there were no back pressure, the queue would continue to grow until eventually something bad happened (e.g., the process runs out of memory, or the system breaks down because "tasks" spend too much time delayed in the queues).
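A minimal sketch of back pressure with a bounded queue (the sizes and delays are illustrative):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BackPressureDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(10);

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    Integer task = queue.take();   // blocks while the queue is empty
                    Thread.sleep(100);             // deliberately slow consumer
                    System.out.println("consumed " + task);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.setDaemon(true);
        consumer.start();

        for (int i = 0; i < 20; i++) {
            queue.put(i); // blocks once 10 items are pending: the producer is throttled
        }
        // main exits once all items are handed off; the daemon consumer dies with it
    }
}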
While setting up thread pool configuration, how do you choose the correct RejectedExecutionHandler?
I have a legacy application which publishes events (those events could be consumed locally or by a remote process). At the moment, the policy is to abort, which causes lots of exceptions and missed events. We pass a SynchronousQueue to the thread pool executor.
I was thinking of changing the RejectedExecutionHandler to the caller-runs policy. This would mean the caller spends time running a task whenever the thread bound and queue capacity are reached. I don't see any problem with that.
What has been your experience so far? Also, does using an unbounded queue mean there is no utility for a RejectedExecutionHandler?
I think you are already familiar with the different RejectedExecutionHandlers of ThreadPoolExecutor.
In ThreadPoolExecutor.CallerRunsPolicy, the thread that invokes execute itself runs the task. This provides a simple feedback control mechanism that will slow down the rate that new tasks are submitted.
It will impact the overall performance of your application. If your application can afford this delay (non-real-time, batch-processing, non-interactive, and offline workloads), you can use this policy. If you can't afford the delay and are fine with discarding tasks, you can go for ThreadPoolExecutor.DiscardPolicy.
Does using an unbounded queue mean there is no utility for a RejectedExecutionHandler?
Yes. With an unbounded queue, tasks are never rejected, so the RejectedExecutionHandler is never invoked. When you are using an unbounded queue, make sure that your application throughput is under control with respect to memory and CPU utilization. If you are submitting short-duration tasks with a small memory footprint, you can use an unbounded queue.
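A quick sketch of CallerRunsPolicy's feedback effect (the pool and queue sizes are deliberately tiny to force rejection; the sleep just keeps the worker busy):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CallerRunsDemo {
    public static void main(String[] args) {
        // One worker thread and a one-slot queue force rejection quickly.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());

        for (int i = 0; i < 3; i++) {
            final int id = i;
            pool.execute(() -> {
                try { Thread.sleep(100); } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                System.out.println("task " + id + " ran on "
                        + Thread.currentThread().getName());
            });
        }
        pool.shutdown();
        // Task 0 occupies the worker, task 1 fills the queue, so task 2 is
        // rejected and runs on the main thread - throttling the submitter.
    }
}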
I want to use an ExecutorService that uses a single thread. And now I am inserting requests (via submit) at a higher rate than that thread can deal with them. What happens?
I am specifically wondering:
Are there any guarantees on ordering - will tasks be executed in the exact same order?
Is there a (theoretical) limit at which the ExecutorService will start throwing away incoming requests?
Out of curiosity: what changes when the service is using a pool of threads?
(sure, I can assume that some queue might be used; and that the Oracle implementation just "does the right thing"; but I am actually wondering if there is a real "spec" somewhere that nails down the expected behavior)
If you created a fixed thread pool ExecutorService with Executors.newFixedThreadPool(1) (or newSingleThreadExecutor()), then the Javadoc clearly specifies what happens.
Are there any guarantees on ordering - will tasks be executed in the exact same order?
A fixed thread pool uses a LinkedBlockingQueue to hold pending tasks. Such a queue implements a FIFO (first-in-first-out) strategy, so the order of execution is guaranteed.
Is there a (theoretical) limit at which the ExecutorService will start throwing away incoming requests?
Quoting the Javadoc:
If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available.
Every incoming request will be added to an unbounded queue, so there is no practical limit and no request will be rejected (the theoretical limit is Integer.MAX_VALUE, LinkedBlockingQueue's default capacity).
Out of curiosity: what changes when the service is using a pool of threads?
If you mean "what changes if there is more than one thread in the fixed thread pool", then nothing. The queue will still have a FIFO nature and there will be no limit on this queue. Otherwise, it depends on how you create the thread pool.
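For reference, per the Javadoc (and the OpenJDK source), the fixed pool is constructed like this:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class FixedPool {
    // Equivalent of Executors.newFixedThreadPool(nThreads):
    static ExecutorService newFixedThreadPool(int nThreads) {
        return new ThreadPoolExecutor(nThreads, nThreads,
                                      0L, TimeUnit.MILLISECONDS,
                                      new LinkedBlockingQueue<Runnable>()); // unbounded, FIFO
    }
}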
I take it you are getting your ExecutorService via Executors.newSingleThreadExecutor()?
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.) Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
So:
Are there any guarantees on ordering - will tasks be executed in the exact same order?
Tasks are guaranteed to execute sequentially.
Is there a (theoretical) limit at which the ExecutorService will start throwing away incoming requests?
It operates off an unbounded queue, so the limit is as large as memory / the backing store of the queue will allow - commonly Integer.MAX_VALUE.
Out of curiosity: what changes when the service is using a pool of threads?
Depends on how you create the ExecutorService. You could create one with a bounded queue if you wished, or with a queue that does not use FIFO ordering (such as a PriorityBlockingQueue). The documentation for ThreadPoolExecutor gives a good overview of your different options.
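For instance, a sketch of a non-FIFO work queue (the class names are mine; note the caveat in the comments):

import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PriorityPoolDemo {
    // Tasks must be mutually comparable for a PriorityBlockingQueue, and you must
    // use execute() rather than submit(): submit() wraps tasks in FutureTasks,
    // which are not Comparable and would throw ClassCastException when queued.
    static class PriorityTask implements Runnable, Comparable<PriorityTask> {
        final int priority;
        final String name;
        PriorityTask(int priority, String name) { this.priority = priority; this.name = name; }
        @Override public void run() {
            System.out.println("running " + name);
            try { Thread.sleep(50); } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        @Override public int compareTo(PriorityTask o) {
            return Integer.compare(o.priority, priority); // higher priority dequeues first
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<>());
        pool.execute(new PriorityTask(0, "first (occupies the single worker)"));
        pool.execute(new PriorityTask(1, "low"));
        pool.execute(new PriorityTask(10, "high")); // dequeued before "low"
        pool.shutdown();
    }
}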
When we talk about processing asynchronous events using an Executors service, why does creating a new fixed thread pool involve the use of a LinkedBlockingQueue? The events which are arriving are not dependent on each other at all, so why use a queue, given that the consumer threads would still contend for the take lock? Why doesn't the Executors class have some hybrid data structure (such as a concurrent Map implementation) where there is no need for a take lock in most cases?
There is a very good reason why a thread pool executor works with a BlockingQueue (by the way, you are not obliged to use LinkedBlockingQueue; you can use any implementation of BlockingQueue). The queue should be blocking in order to suspend worker threads when there are no tasks to execute. This blocking is done by waiting on condition variables, so waiting worker threads do not consume any CPU resources while the queue is empty.
If you used a non-blocking queue in the thread pool, then how would worker threads poll for tasks to execute? They would have to implement some kind of polling, which is an unnecessary waste of CPU resources ("busy waiting").
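A minimal sketch of such a worker loop (a simplification of what ThreadPoolExecutor's workers do; the names are illustrative):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class WorkerLoop {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Runnable> workQueue = new LinkedBlockingQueue<>();

        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    // take() parks this thread on a condition variable until a task
                    // arrives - no CPU is burned while the queue is empty.
                    Runnable task = workQueue.take();
                    task.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // treat interruption as shutdown
            }
        });
        worker.setDaemon(true);
        worker.start();

        workQueue.offer(() -> System.out.println("task executed"));
        Thread.sleep(100); // give the daemon worker time to run before main exits
    }
}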
UPDATE:
OK, now I fully understand the use case. Still, you need a blocking collection anyway. The reason is basically the same: since you are implementing producer-consumer, you need a means for worker threads to wait for messages to arrive, and you simply can't do that without a mutex + condition variable (or simply a BlockingQueue).
Regarding the map - yes, I understand how you want to use it, but unfortunately there is no such implementation provided. Recently I solved a similar problem: I needed to group incoming tasks by some criteria and execute tasks from each group serially. As a result, I implemented my own GroupThreadPoolExecutor that does this grouping. The idea is simple: group incoming tasks into a map, and add each one to the executor queue when the previous task from its group completes.
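The GroupThreadPoolExecutor itself isn't shown, but the grouping idea might look roughly like this (a hypothetical sketch; the class and method names are mine):

import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.Executor;

public class GroupingExecutor {
    private final Executor delegate;
    private final Map<Object, Queue<Runnable>> pending = new HashMap<>();

    public GroupingExecutor(Executor delegate) { this.delegate = delegate; }

    // Tasks with the same groupKey run serially; different groups run in parallel.
    public synchronized void execute(Object groupKey, Runnable task) {
        Queue<Runnable> q = pending.get(groupKey);
        if (q == null) {
            pending.put(groupKey, new ArrayDeque<>()); // mark the group as active
            delegate.execute(() -> runAndScheduleNext(groupKey, task));
        } else {
            q.add(task); // a task from this group is already running: wait behind it
        }
    }

    private void runAndScheduleNext(Object groupKey, Runnable task) {
        try {
            task.run();
        } finally {
            Runnable next;
            synchronized (this) {
                next = pending.get(groupKey).poll();
                if (next == null) pending.remove(groupKey); // group drained
            }
            if (next != null) {
                Runnable n = next;
                delegate.execute(() -> runAndScheduleNext(groupKey, n));
            }
        }
    }
}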
There is a big discussion here - I think it's relevant to your question.
At http://download.oracle.com/javase/6/docs/api/java/util/concurrent/ThreadPoolExecutor.html you can read the description of the parameters to the constructor.
Specifically in the "Core and maximum pool sizes" paragraph, it's written:
If there are more than corePoolSize but less than maximumPoolSize threads running, a new thread will be created only if the queue is full.
...
By setting maximumPoolSize to an essentially unbounded value such as Integer.MAX_VALUE, you allow the pool to accommodate an arbitrary number of concurrent tasks.
Now I can't understand what "only if the queue is full" in the first part refers to...
Will the ThreadPoolExecutor wait until the queue is full, or will it simply create a new worker?
And suppose now that we have tasks that are not independent of each other: could using a ThreadPoolExecutor cause a deadlock? Supposing that my first 10 tasks are producers and that corePoolSize is 10, will the succeeding consumer tasks go to the queue and not run until the queue is full? If so, this behavior may cause a deadlock, because the first 10 producers could block waiting, suspending all 10 core threads.
And when exactly is the queue full?
I'm not sure I understood the documentation well, because Executors.newCachedThreadPool() seems to create a new worker until maxPoolSize is reached and THEN send tasks to the queue.
I'm a little confused.
Thank you
When you construct the ThreadPoolExecutor, you pass in an instance of BlockingQueue<Runnable> called workQueue, to hold the tasks, and it is this queue that is being referred to.
In fact, the section of the docs called "Queuing" goes into more detail about the phrase you're confused about:
Any BlockingQueue may be used to transfer and hold submitted tasks. The use of this queue interacts with pool sizing:
If fewer than corePoolSize threads are running, the Executor always prefers adding a new thread rather than queuing.
If corePoolSize or more threads are running, the Executor always prefers queuing a request rather than adding a new thread.
If a request cannot be queued, a new thread is created unless this would exceed maximumPoolSize, in which case, the task will be rejected.
As for your second part, about inter-task dependencies - in this case I don't think it's a good idea to put them into an ExecutorService at all. The ExecutorService is good for running a self-contained bit of code at some point in the future, but by design it's not meant to be strongly deterministic about when this happens, other than "at some convenient point in the (hopefully near) future, after tasks that were previously queued have started."
Combine this lack of timing precision with the hard ordering requirements that concurrent operation imposes, and you can see that putting a producer and a consumer that need to talk to each other into a general-purpose ExecutorService is a recipe for very annoying and confusing bugs.
Yes, I'm sure you could get it to work with sufficient tweaking of parameters. However, it wouldn't be clear why it worked, it wouldn't be clear what the dependencies were, and when (not if) it broke, it would be very hard to diagnose. (Harder than normal concurrency problems, I suspect). The bottom line is that an ExecutorService isn't designed to run Runnables with hard timing restrictions, so this could even be broken by a new release of Java, because it doesn't have to work like this.
I think you're asking the wrong question, by looking at the details when perhaps your concepts are a little shaky. Perhaps if you explained what you wanted to achieve there would be a better way to go about it.
To clarify your first quote - if the executor already has corePoolSize threads running, and all of those threads are busy when a new task is submitted, it will not create any more threads, but will enqueue the task until one of the threads becomes free. It will only create a new thread (up to the maxPoolSize limit) when the queue becomes full. If the queue is huge (i.e. bounded only by memory constraints) then no more than corePoolSize threads will ever be created.
Executors.newCachedThreadPool() uses a SynchronousQueue, which is effectively a zero-sized queue, so the queue is always "full". This means that it will create a new thread (or re-use an idle one) as soon as you submit a task. As such, it's not a good demonstration of what the core/max/queue parameters are for.
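For comparison, its Javadoc-documented construction makes the always-full queue explicit:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CachedPool {
    // Equivalent of Executors.newCachedThreadPool(), per its Javadoc:
    static ExecutorService newCachedThreadPool() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE,          // no core threads, unbounded max
                                      60L, TimeUnit.SECONDS,         // idle threads die after 60s
                                      new SynchronousQueue<Runnable>()); // holds no elements
    }
}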