I need to make a queue guarded by n semaphores, so that processes that could not enter because of the size limit wait in a pool until the queue has room. When a process holds a semaphore, a ThreadPool runs its function in another thread. I also need a concurrent list of the semaphore-holding processes' IDs that is updated along with the semaphore queue. How can I do this using modern Java 8 patterns?
It strikes me that there is a much simpler solution that doesn't involve explicit semaphores and custom code (and bugs).
Just use a bounded BlockingQueue (javadoc) and have the threads use put(...) to add items to the queue. When the queue is full, put will block the calling thread until queue space is available. If you don't want the thread to block indefinitely, use offer with a suitable timeout.
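For illustration, here is a minimal sketch of that approach (class and variable names are mine, not from the question): put(...) blocks producers as soon as the bounded queue is full, which is exactly the "waiting pool" behavior, and the worker threads play the role of the ThreadPool running each task:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        // Capacity 4 plays the role of the "n semaphores": at most 4 tasks wait here.
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(4);

        // Two worker threads take tasks off the queue and run them.
        for (int w = 0; w < 2; w++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        queue.take().run(); // take() blocks while the queue is empty
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.setDaemon(true);
            worker.start();
        }

        // Producers: put(...) blocks as soon as 4 tasks are pending.
        for (int i = 0; i < 20; i++) {
            final int id = i;
            queue.put(() -> System.out.println("running task " + id));
        }
    }
}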
I see from the Java docs -
ThreadPoolExecutor(int corePoolSize,
                   int maximumPoolSize,
                   long keepAliveTime,
                   TimeUnit unit,
                   BlockingQueue<Runnable> workQueue,
                   RejectedExecutionHandler handler)
Where -
workQueue – the queue to use for holding tasks before they are executed. This queue will hold only the Runnable tasks submitted by the execute method.
Now, Java provides various types of blocking queues, and the Javadoc clearly says when to use which type of queue with ThreadPoolExecutor -
Queuing
Any BlockingQueue may be used to transfer and hold submitted tasks. The use of this queue interacts with pool sizing:
If fewer than corePoolSize threads are running, the Executor always prefers adding a new thread rather than queuing.
If corePoolSize or more threads are running, the Executor always prefers queuing a request rather than adding a new thread.
If a request cannot be queued, a new thread is created unless this would exceed maximumPoolSize, in which case, the task will be rejected.
There are three general strategies for queuing:
Direct handoffs. A good default choice for a work queue is a SynchronousQueue that hands off tasks to threads without otherwise holding them. Here, an attempt to queue a task will fail if no threads are immediately available to run it, so a new thread will be constructed. This policy avoids lockups when handling sets of requests that might have internal dependencies. Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted tasks. This in turn admits the possibility of unbounded thread growth when commands continue to arrive on average faster than they can be processed.
Unbounded queues. Using an unbounded queue (for example a LinkedBlockingQueue without a predefined capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize therefore doesn't have any effect.) This may be appropriate when each task is completely independent of others, so tasks cannot affect each others execution; for example, in a web page server. While this style of queuing can be useful in smoothing out transient bursts of requests, it admits the possibility of unbounded work queue growth when commands continue to arrive on average faster than they can be processed.
Bounded queues. A bounded queue (for example, an ArrayBlockingQueue) helps prevent resource exhaustion when used with finite maximumPoolSizes, but can be more difficult to tune and control. Queue sizes and maximum pool sizes may be traded off for each other: Using large queues and small pools minimizes CPU usage, OS resources, and context-switching overhead, but can lead to artificially low throughput. If tasks frequently block (for example if they are I/O bound), a system may be able to schedule time for more threads than you otherwise allow. Use of small queues generally requires larger pool sizes, which keeps CPUs busier but may encounter unacceptable scheduling overhead, which also decreases throughput.
Below is my question -
I have seen code usage like the below -
BlockingQueue<Runnable> workQueue = new LinkedBlockingDeque<>(90);
ExecutorService executorService = new ThreadPoolExecutor(1, 10, 30,
        TimeUnit.SECONDS, workQueue,
        new ThreadPoolExecutor.CallerRunsPolicy());
So, since the Deque in the above code is of fixed capacity anyway, what advantage am I getting with LinkedBlockingDeque<>(90) when compared to the below -
LinkedBlockingQueue<>(90) - I just want to know about the deque's advantage over a queue in this case, not in general. How will the Executor benefit from a deque over a queue?
ArrayBlockingQueue<>(90) - (I see one can also mention fairness etc., but that is not of my current interest.) So why not just use an array over a deque (i.e., when using a deque of fixed capacity)?
LinkedBlockingQueue is an optionally-bounded blocking queue based on linked nodes. If you don't give it a capacity, it is effectively unlimited (Integer.MAX_VALUE); here it is bounded at 90.
ArrayBlockingQueue is a bounded blocking queue in which a fixed-size array holds the elements.
In your case, there's no benefit anywhere. ArrayBlockingQueue may prove to be more efficient, as it uses a fixed-size array in a single memory span.
The difference between Queue and Deque is the access mechanism. A Queue is FIFO only, while a Deque allows insertion and removal at both ends, so it can serve as either FIFO or LIFO.
In FIFO, the first task inserted is the first one to be executed.
In LIFO, the last task inserted is the first one to be executed.
Consider the following: you want your tasks executed in the order they come in? Use FIFO. You want the most recently submitted task executed first? Use LIFO. Note, however, that ThreadPoolExecutor only calls the BlockingQueue methods (offer, poll, take), so it uses a LinkedBlockingDeque exactly like a plain FIFO queue; the deque's second end never comes into play.
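To make that concrete, here is a minimal sketch (class name is illustrative) showing that the pool is wired up identically whichever bounded implementation backs the work queue:

import java.util.concurrent.*;

public class QueueChoiceDemo {
    public static void main(String[] args) {
        // Any bounded BlockingQueue can back the pool; the executor only
        // calls offer/poll/take, so all three behave as FIFO work queues.
        BlockingQueue<Runnable> asDeque = new LinkedBlockingDeque<>(90);
        BlockingQueue<Runnable> asLinked = new LinkedBlockingQueue<>(90);
        BlockingQueue<Runnable> asArray = new ArrayBlockingQueue<>(90);

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 10, 30, TimeUnit.SECONDS,
                asArray, // swap in asDeque or asLinked; behavior is identical
                new ThreadPoolExecutor.CallerRunsPolicy());

        for (int i = 0; i < 5; i++) {
            final int n = i;
            pool.execute(() -> System.out.println("task " + n));
        }
        pool.shutdown();
    }
}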
The main benefit comes when you use the thread pool to execute some kind of pipeline. As a rule of thumb, at each stage in a pipeline the queue is either almost always empty (the producers tend to be slower than the consumers) or almost always full (the producers tend to be faster).
If the producers are faster, and the application is meant to run indefinitely, then you need a fixed-size, blocking queue to put "back pressure" on the producers. If there were no back pressure, the queue would keep growing until eventually something bad happened (e.g., the process runs out of memory, or the system breaks down because "tasks" spend too much time delayed in the queues).
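As a sketch of that back-pressure idea (the capacity and pool sizes are illustrative, mirroring the code from the question above): with a bounded queue and CallerRunsPolicy, a producer that finds the queue full ends up running the task itself, which naturally throttles submission:

import java.util.concurrent.*;

public class BackPressureDemo {
    public static void main(String[] args) throws InterruptedException {
        // Bounded queue: at most 90 pending tasks before producers feel back pressure.
        BlockingQueue<Runnable> workQueue = new ArrayBlockingQueue<>(90);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 10, 30, TimeUnit.SECONDS, workQueue,
                // CallerRunsPolicy makes the submitting thread run the task itself
                // when the queue is full, slowing the producer down.
                new ThreadPoolExecutor.CallerRunsPolicy());

        for (int i = 0; i < 10_000; i++) {
            final int n = i;
            pool.execute(() -> process(n)); // never rejected; throttles the caller instead
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }

    static void process(int n) { /* simulate work */ }
}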
In the Oracle documentation for the ThreadPoolExecutor class it is written:
There are three general strategies for queuing:
Direct handoffs. A good default choice for a work queue is a SynchronousQueue that hands off tasks to threads without otherwise holding them. Here, an attempt to queue a task will fail if no threads are immediately available to run it, so a new thread will be constructed. This policy avoids lockups when handling sets of requests that might have internal dependencies. Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of new submitted tasks. This in turn admits the possibility of unbounded thread growth when commands continue to arrive on average faster than they can be processed.
Unbounded queues. Using an unbounded queue (for example a LinkedBlockingQueue without a predefined capacity) will cause new tasks to wait in the queue when all corePoolSize threads are busy. Thus, no more than corePoolSize threads will ever be created. (And the value of the maximumPoolSize therefore doesn't have any effect.) This may be appropriate when each task is completely independent of others, so tasks cannot affect each others execution; for example, in a web page server. While this style of queuing can be useful in smoothing out transient bursts of requests, it admits the possibility of unbounded work queue growth when commands continue to arrive on average faster than they can be processed.
...
Why is the direct handoff strategy better at avoiding lockups than the unbounded queue strategy? Or do I understand it incorrectly?
Let's say you have corePoolSize = 1. If the first task submits another task to the same pool and waits for its result, it will lock up indefinitely: with an unbounded queue the child task just sits in the queue, because the only thread is blocked waiting for it. With a direct handoff (and a large maximumPoolSize), the pool instead constructs a new thread to run the child task, so the dependency can resolve.
However, if a task is completely independent, there would be no reason to use direct handoff as far as preventing lockups is concerned.
This is just an example; internal dependency can mean a lot of different things.
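For concreteness, here is a minimal sketch of the lockup (class name is illustrative); the single worker blocks in inner.get() while the child task waits behind it in the unbounded queue:

import java.util.concurrent.*;

public class LockupDemo {
    public static void main(String[] args) throws Exception {
        // corePoolSize = maximumPoolSize = 1 with an unbounded queue.
        ExecutorService pool = Executors.newFixedThreadPool(1);

        Future<String> outer = pool.submit(() -> {
            // The child task goes into the queue, but the only thread is busy here.
            Future<String> inner = pool.submit(() -> "child result");
            return inner.get(); // blocks forever: the classic lockup
        });

        try {
            outer.get(2, TimeUnit.SECONDS);
        } catch (TimeoutException expected) {
            System.out.println("locked up, as predicted");
        }
        pool.shutdownNow();
        // With a SynchronousQueue and a large maximumPoolSize (direct handoff),
        // a second thread would have been created and the child would run.
    }
}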
I want to use an ExecutorService that uses a single thread, and I am inserting requests (via submit) at a higher rate than that thread can deal with them. What happens?
I am specifically wondering:
Are there any guarantees on ordering - will tasks be executed in the exact same order?
Is there a (theoretical) limit on which the ExecutorService will start throwing away incoming requests?
Out of curiosity: what changes when the service is using a pool of threads?
(sure, I can assume that some queue might be used; and that the Oracle implementation just "does the right thing"; but I am actually wondering if there is a real "spec" somewhere that nails down the expected behavior)
If you created a fixed thread-pool ExecutorService with Executors.newFixedThreadPool(1) (or newSingleThreadExecutor()), then the Javadoc clearly specifies what happens.
Are there any guarantees on ordering - will tasks be executed in the exact same order?
A fixed thread pool uses a LinkedBlockingQueue to hold pending tasks. That queue implements a FIFO (first-in, first-out) strategy, so the order of execution is guaranteed.
Is there a (theoretical) limit on which the ExecutorService will start throwing away incoming requests?
Quoting the Javadoc:
If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available.
Every incoming request will be added to an unbounded queue, so no requests will be rejected; the only theoretical limit is Integer.MAX_VALUE.
Out of curiosity: what changes when the service is using a pool of threads?
If you mean "what changes if there is more than one thread in the fixed thread pool": the queue still has a FIFO nature and still has no limit, but tasks dequeued in order now run concurrently, so they may complete in any order. Otherwise, it depends on how you create the thread pool.
I take it you are getting your ExecutorService via Executors.newSingleThreadExecutor()?
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.) Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
So:
Are there any guarantees on ordering - will tasks be executed in the exact same order?
Tasks are guaranteed to execute sequentially.
Is there a (theoretical) limit on which the ExecutorService will start throwing away incoming requests?
Operating off an unbounded queue. So as large as memory/the backing store of the queue will allow. Commonly Integer.MAX_VALUE.
Out of curiosity: what changes when the service is using a pool of threads?
Depends on how you create the ExecutorService. You could create it with a bounded queue if you wished, or with a queue that does not use FIFO ordering (such as a PriorityBlockingQueue). The documentation for ThreadPoolExecutor gives a good overview of your different options.
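For instance, here is a minimal sketch (the sizes are illustrative) of a single-threaded executor built directly on ThreadPoolExecutor with a small bounded queue, so that excess submissions are rejected instead of queuing without limit:

import java.util.concurrent.*;

public class BoundedSingleThreadDemo {
    public static void main(String[] args) {
        // One worker thread, at most 10 queued tasks; excess submissions
        // throw RejectedExecutionException (the default AbortPolicy).
        ExecutorService pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(10));

        for (int i = 0; i < 100; i++) {
            final int n = i;
            try {
                pool.submit(() -> System.out.println("task " + n));
            } catch (RejectedExecutionException e) {
                System.out.println("rejected task " + n);
            }
        }
        pool.shutdown();
    }
}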
When we talk about processing asynchronous events using an Executors service, why does creating a new fixed thread pool involve the use of a LinkedBlockingQueue? The arriving events are not dependent on each other at all, so why use a queue at all, given that the consumer thread still contends for the take lock? Why doesn't the Executors class have some hybrid data structure (such as a concurrent Map implementation) where there is no need for a take lock in most cases?
There is a very good reason why the thread pool executor works with a BlockingQueue (by the way, you are not obliged to use the LinkedBlockingQueue implementation; you can use other implementations of BlockingQueue). The queue should be blocking in order to suspend the worker threads when there are no tasks to execute. This blocking is done by waiting on condition variables, so waiting worker threads consume no CPU resources while the queue is empty.
If you used a non-blocking queue in the thread pool, then how would the worker threads poll for tasks to execute? They would have to implement some kind of polling, which is an unnecessary waste of CPU resources ("busy waiting").
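To illustrate the two worker-loop styles, here is a minimal sketch (the method names are mine): the blocking version parks the thread inside take() at no CPU cost, while the non-blocking version spins:

import java.util.Queue;
import java.util.concurrent.BlockingQueue;

public class WorkerLoops {
    // Blocking: the thread sleeps inside take() until a task arrives. No CPU used.
    static void blockingWorker(BlockingQueue<Runnable> queue) throws InterruptedException {
        while (true) {
            Runnable task = queue.take(); // suspends on an internal condition variable
            task.run();
        }
    }

    // Non-blocking: poll() returns null immediately, so the loop spins ("busy waiting").
    static void spinningWorker(Queue<Runnable> queue) {
        while (true) {
            Runnable task = queue.poll();
            if (task != null) {
                task.run();
            }
            // else: burn CPU re-checking, or sleep and add wake-up latency
        }
    }
}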
UPDATE:
Ok, now I fully understand the use case. Still, you need a blocking collection anyway. The reason is basically the same: since you are implementing producer-consumer, you need a means for worker threads to wait for messages to arrive, and you simply can't do that without a mutex + condition variable (or simply a BlockingQueue).
Regarding the map: yes, I understand how you want to use it, but unfortunately there is no such implementation provided. Recently I solved a similar problem: I needed to group incoming tasks by some criterion and execute the tasks from each group serially. As a result I implemented my own GroupThreadPoolExecutor that does this grouping. The idea is simple: group incoming tasks into a map, and submit a group's next task to the executor queue when the previous task from that group completes.
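The author's GroupThreadPoolExecutor isn't shown, but a minimal sketch of that grouping idea might look like this (all names and the fixed pool size are illustrative, not the original code):

import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: runs tasks with the same key serially, different keys concurrently.
public class GroupingExecutor {
    private final ExecutorService delegate = Executors.newFixedThreadPool(4);
    private final Map<Object, Queue<Runnable>> groups = new ConcurrentHashMap<>();

    public void submit(Object key, Runnable task) {
        Runnable wrapped = () -> {
            try {
                task.run();
            } finally {
                scheduleNext(key); // when done, release the next task in this group
            }
        };
        boolean runNow;
        synchronized (groups) {
            Queue<Runnable> q = groups.computeIfAbsent(key, k -> new ArrayDeque<>());
            runNow = q.isEmpty(); // empty queue => no task from this group in flight
            q.add(wrapped);
        }
        if (runNow) {
            delegate.execute(wrapped);
        }
    }

    private void scheduleNext(Object key) {
        Runnable next;
        synchronized (groups) {
            Queue<Runnable> q = groups.get(key);
            q.remove();      // drop the task that just finished (the queue head)
            next = q.peek(); // next pending task in the same group, if any
            if (next == null) {
                groups.remove(key);
            }
        }
        if (next != null) {
            delegate.execute(next);
        }
    }
}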
There is a big discussion here - I think it's relevant to your question.
I want to use a ConcurrentLinkedQueue in an atomic lock-free manner:
Several concurrent threads push events into the queue and some other thread processes them. The queue is not bounded, and I don't want any thread to wait or get locked. The reading side, however, may notice that the queue has become empty. In a lock-free implementation the reading thread must not block; it just ends its task and proceeds to execute other tasks (i.e., as an ExecutorService). Thus the writer pushing the first new event into an empty queue must become aware of this and should restart the reader (i.e., by submitting a new Runnable to the ExecutorService) to process the queue. Any further threads submitting a second or third event won't care, as they may assume a reader has already been prepared/submitted.
Unfortunately the add() method of ConcurrentLinkedQueue always returns true. Asking the queue whether it isEmpty() before or after adding the event won't help, as the two operations together are not atomic.
Should I use an additional AtomicInteger to monitor the queue size(), or is there some smarter solution for this?
Dieter.
I don't quite understand why you wouldn't just use an ExecutorService directly for this. It uses a BlockingQueue internally and takes care of all of the signaling itself.
// single-threaded pool; its internal BlockingQueue holds the pending jobs
ExecutorService threadPool = Executors.newFixedThreadPool(1);
for (Job job : jobsToDo) {
    threadPool.submit(new MyJobProcessor(job));
}
Unless you have good reasons, I would not rewrite the same logic myself.
If you are trying to make use of dormant threads somehow, I would strongly recommend not bothering. Threads are relatively cheap, so dedicating a thread to processing your queued tasks is fine. Re-using threads is unnecessary and seems like premature optimization to me.
Using an AtomicInteger to resolve submit contention is more efficient than locks or synchronized blocks.
Here is an example of how it can be implemented in Java.
Also, there are more efficient structures for a multi-producer / single-consumer queue than ConcurrentLinkedQueue.
An example of using it for actor implementations.
Another example.
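The linked examples aren't reproduced here, but a minimal sketch of the AtomicInteger pattern might look like this (names are illustrative): the counter tracks queued events, and only the producer that raises it from 0 schedules a drain task:

import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: lock-free "wake the reader on first event" using an AtomicInteger.
public class EventPump {
    private final ConcurrentLinkedQueue<Runnable> events = new ConcurrentLinkedQueue<>();
    private final AtomicInteger pending = new AtomicInteger();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    public void publish(Runnable event) {
        events.add(event);
        // Only the thread that raises the count from 0 schedules the reader;
        // later producers see a positive count and do nothing.
        if (pending.getAndIncrement() == 0) {
            executor.submit(this::drain);
        }
    }

    private void drain() {
        do {
            Runnable event = events.poll(); // never null: pending > 0 guarantees an element
            event.run();
            // Stop when the count drops to 0; a concurrent publish that makes it
            // non-zero again will submit a fresh drain task.
        } while (pending.decrementAndGet() > 0);
    }
}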