Is adding tasks to BlockingQueue of ThreadPoolExecutor advisable?


The JavaDoc for ThreadPoolExecutor is unclear on whether it is acceptable to add tasks directly to the BlockingQueue backing the executor. The docs say calling executor.getQueue() is "intended primarily for debugging and monitoring".
I'm constructing a ThreadPoolExecutor with my own BlockingQueue. I retain a reference to the queue so I can add tasks to it directly. The same queue is returned by getQueue(), so I assume the admonition in getQueue() also applies to a reference to the backing queue acquired by other means.
Example
General pattern of the code is:
int n = ...; // number of threads
queue = new ArrayBlockingQueue<Runnable>(queueSize);
executor = new ThreadPoolExecutor(n, n, 1, TimeUnit.HOURS, queue);
executor.prestartAllCoreThreads();
// ...
while (...) {
    Runnable job = ...;
    queue.offer(job, 1, TimeUnit.HOURS);
}
while (jobsOutstanding.get() != 0) {
    try {
        Thread.sleep(...);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}
executor.shutdownNow();
queue.offer() vs executor.execute()
As I understand it, the typical use is to add tasks via executor.execute(). The approach in my example above has the benefit of blocking on the queue whereas execute() fails immediately if the queue is full and rejects my task. I also like that submitting jobs interacts with a blocking queue; this feels more "pure" producer-consumer to me.
An implication of adding tasks to the queue directly: I must call prestartAllCoreThreads() otherwise no worker threads are running. Assuming no other interactions with the executor, nothing will be monitoring the queue (examination of ThreadPoolExecutor source confirms this). This also implies for direct enqueuing that the ThreadPoolExecutor must additionally be configured for > 0 core threads and mustn't be configured to allow core threads to timeout.
tl;dr
Given a ThreadPoolExecutor configured as follows:
core threads > 0
core threads aren't allowed to timeout
core threads are prestarted
hold a reference to the BlockingQueue backing the executor
Is it acceptable to add tasks directly to the queue instead of calling executor.execute()?
Related
This question ( producer/consumer work queues ) is similar, but doesn't specifically cover adding to the queue directly.

One trick is to implement a custom subclass of ArrayBlockingQueue and override the offer() method to call your blocking version; then you can still use the normal code path.
queue = new ArrayBlockingQueue<Runnable>(queueSize) {
    @Override public boolean offer(Runnable runnable) {
        try {
            return offer(runnable, 1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            // return interrupt status to caller
            Thread.currentThread().interrupt();
        }
        return false;
    }
};
(As you can probably guess, I think calling offer directly on the queue as your normal code path is probably a bad idea.)

If it were me, I would prefer using Executor#execute() over Queue#offer(), simply because I'm using everything else from java.util.concurrent already.
Your question is a good one, and it piqued my interest, so I took a look at the source for ThreadPoolExecutor#execute():
public void execute(Runnable command) {
if (command == null)
throw new NullPointerException();
if (poolSize >= corePoolSize || !addIfUnderCorePoolSize(command)) {
if (runState == RUNNING && workQueue.offer(command)) {
if (runState != RUNNING || poolSize == 0)
ensureQueuedTaskHandled(command);
}
else if (!addIfUnderMaximumPoolSize(command))
reject(command); // is shutdown or saturated
}
}
We can see that execute itself calls offer() on the work queue, but not before doing some nice, tasty pool manipulations if necessary. For that reason, I'd think that it'd be advisable to use execute(); not using it may (although I don't know for certain) cause the pool to operate in a non-optimal way. However, I don't think that using offer() will break the executor - it looks like tasks are pulled off the queue using the following (also from ThreadPoolExecutor):
Runnable getTask() {
for (;;) {
try {
int state = runState;
if (state > SHUTDOWN)
return null;
Runnable r;
if (state == SHUTDOWN) // Help drain queue
r = workQueue.poll();
else if (poolSize > corePoolSize || allowCoreThreadTimeOut)
r = workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS);
else
r = workQueue.take();
if (r != null)
return r;
if (workerCanExit()) {
if (runState >= SHUTDOWN) // Wake up others
interruptIdleWorkers();
return null;
}
// Else retry
} catch (InterruptedException ie) {
// On interruption, re-check runState
}
}
}
This getTask() method is just called from within a loop, so if the executor is not shutting down, it blocks until a new task is added to the queue (regardless of where it came from).
Note: Even though I've posted code snippets from source here, we can't rely on them for a definitive answer - we should only be coding to the API. We don't know how the implementation of execute() will change over time.

One can actually configure the behavior of the pool when the queue is full by specifying a RejectedExecutionHandler at instantiation. ThreadPoolExecutor defines four policies as nested classes: AbortPolicy, DiscardOldestPolicy, DiscardPolicy, and my personal favorite, CallerRunsPolicy, which runs the new job in the thread that called execute().
For example:
ThreadPoolExecutor threadPool = new ThreadPoolExecutor(
nproc, // core size
nproc, // max size
60, // idle timeout
TimeUnit.SECONDS,
new ArrayBlockingQueue<Runnable>(4096, true), // fair = true: threads blocked on insertion or removal are served in FIFO order
new ThreadPoolExecutor.CallerRunsPolicy() ); // If we have to reject a task, run it in the calling thread.
The behavior desired in the question can be obtained using something like:
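public class BlockingPolicy implements RejectedExecutionHandler {
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        try {
            executor.getQueue().put(r); // Self contained, no queue reference needed.
        } catch (InterruptedException e) { Thread.currentThread().interrupt(); } // task is not queued
    }
}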
At some point the queue must be accessed. The best place to do so is in a self-contained RejectedExecutionHandler, which avoids code duplication and potential bugs arising from direct manipulation of the queue at the scope of the pool object. Note that DiscardOldestPolicy, one of the handlers included in ThreadPoolExecutor, itself uses getQueue().
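To see a blocking handler like the one above at work, here is a minimal sketch. The class name BlockingPolicyDemo, the runAll helper, and the pool and queue sizes are assumptions made for illustration only. Unlike the short version above, this handler refuses tasks once the executor has been shut down and reports an interrupted put() as a rejection; both are design choices of this sketch, not requirements of the API.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BlockingPolicyDemo {
    /** Blocks the submitting thread until the queue has room (illustrative handler). */
    static class BlockingPolicy implements RejectedExecutionHandler {
        @Override
        public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
            if (executor.isShutdown()) {
                throw new RejectedExecutionException("executor is shut down");
            }
            try {
                executor.getQueue().put(r);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException("interrupted while enqueuing", e);
            }
        }
    }

    /** Submits taskCount short tasks through execute() and returns how many ran. */
    static int runAll(int taskCount) {
        final AtomicInteger completed = new AtomicInteger();
        // Deliberately tiny pool and queue so that the handler is exercised.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(2),
                new BlockingPolicy());
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(new Runnable() {
                    @Override public void run() {
                        try {
                            Thread.sleep(5); // simulate a little work
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                        completed.incrementAndGet();
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(30, TimeUnit.SECONDS);
            return completed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Calling BlockingPolicyDemo.runAll(50) submits far more tasks than the pool and queue can hold at once; the submitting thread simply waits inside execute() whenever the queue is full, which is the behavior asked for in the question.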

It's a very important question if the queue you're using is a completely different implementation from the standard in-memory LinkedBlockingQueue or ArrayBlockingQueue.
For instance if you're implementing the producer-consumer pattern using several producers on different machines, and use a queuing mechanism based on a separate persistence subsystem (like Redis), then the question becomes relevant on its own, even if you don't want a blocking offer() like the OP.
So the point made above, that prestartAllCoreThreads() has to be called (or prestartCoreThread() called enough times) for the worker threads to be available and running, is important enough to be stressed.
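The effect of prestarting can be observed with a small experiment. This is only a sketch: the class name PrestartDemo, the offerDirectly helper, and the timeouts are made up for the illustration. It offers one task straight to the backing queue, waits a moment, and only then prestarts the core thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PrestartDemo {
    /**
     * Offers one task directly to the backing queue.
     * Returns {ranBeforePrestart, ranAfterPrestart}.
     */
    static boolean[] offerDirectly() {
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<Runnable>(4);
        ThreadPoolExecutor executor =
                new ThreadPoolExecutor(1, 1, 1, TimeUnit.HOURS, queue);
        final CountDownLatch ran = new CountDownLatch(1);
        try {
            queue.offer(new Runnable() {
                @Override public void run() { ran.countDown(); }
            });
            // No worker thread exists yet, so nothing takes the task from the queue.
            boolean before = ran.await(300, TimeUnit.MILLISECONDS);
            // Starting the core thread creates a worker that polls the queue.
            executor.prestartAllCoreThreads();
            boolean after = ran.await(5, TimeUnit.SECONDS);
            return new boolean[] { before, after };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

The task sits in the queue untouched until a worker thread exists, which is why a direct offer() without prestarted threads appears to do nothing.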

If required, we can also use a parking lot which separates main processing from rejected tasks -
final CountDownLatch taskCounter = new CountDownLatch(TASKCOUNT);
final List<Runnable> taskParking = new LinkedList<Runnable>();
BlockingQueue<Runnable> taskPool = new ArrayBlockingQueue<Runnable>(1);
RejectedExecutionHandler rejectionHandler = new RejectedExecutionHandler() {
@Override
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
System.err.println(Thread.currentThread().getName() + " -->rejection reported - adding to parking lot " + r);
taskCounter.countDown();
taskParking.add(r);
}
};
ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(5, 10, 1000, TimeUnit.SECONDS, taskPool, rejectionHandler);
for(int i=0 ; i<TASKCOUNT; i++){
//main
threadPoolExecutor.submit(getRandomTask());
}
taskCounter.await(TASKCOUNT * 5 , TimeUnit.SECONDS);
System.out.println("Checking the parking lot..." + taskParking);
while(taskParking.size() > 0){
Runnable r = taskParking.remove(0);
System.out.println("Running from parking lot..." + r);
if(taskParking.size() > LIMIT){
waitForSometime(...);
}
threadPoolExecutor.submit(r);
}
threadPoolExecutor.shutdown();

Related

ThreadPool getActiveCount() vs getPoolSize()

Although this topic has been discussed broadly in other posts, I want to present my use case and get clarification, so apologies if I am wasting anyone's time. I have the following Runnable implementation: basically a thread that runs forever unless a java.lang.Error is thrown by the business logic.
public void run (){
while(true){
try{
//business logic
}catch(Exception ex){
}
}
}
I have about 30 of the above threads started from ExecutorService.
private final ExecutorService normalPriorityExecutorService = Executors.newFixedThreadPool(30);
for(int i=0;i<30;i++) {
normalPriorityExecutorService.submit(/* above Runnable */);
}
I want to check and kill the JVM process if the thread count becomes zero on this Executor Service.
if (normalPriorityExecutorService instanceof ThreadPoolExecutor && ((ThreadPoolExecutor) normalPriorityExecutorService).getActiveCount() ==0) {
log.error("No Normal Priority response listeners available. Shutting down App!");
System.exit(1);
}
From my reading, since these runnables loop forever under normal circumstances, I will have 30 of them active unless they get killed by runtime Errors.
The question is: is using getActiveCount() the right approach for my use case? By the way, when I tried using getPoolSize() instead of getActiveCount(), I did not get the right behavior while testing (I forcibly threw an Error to kill a specific thread): the pool size still remained thirty.

Since you never use the thread pool as a pool, using a thread pool is overkill. Just create a thread group and start your threads.
private final ThreadGroup normalPriorityThreadGroup = new ThreadGroup("NormalPriority");
for (int i = 0; i < 30; i++) {
new Thread(this.normalPriorityThreadGroup, runnable, "NormalPriority-" + i).start();
}
if (this.normalPriorityThreadGroup.activeCount() == 0) {
log.error("No Normal Priority response listeners available. Shutting down App!");
System.exit(1);
}
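For readers who stay with the executor, the following sketch may explain the pool size observation in the question. A task passed to submit() is wrapped in a FutureTask, which catches any Throwable and stores it in the returned Future, so the worker thread does not die and the pool size does not change; only the active count drops. The class name ActiveCountDemo, the observe helper, and the pool size of 2 are assumptions for the illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ActiveCountDemo {
    /** Returns {poolSize, activeCount} seen after one of two submitted tasks dies with an Error. */
    static int[] observe() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        final CountDownLatch started = new CountDownLatch(1);
        final CountDownLatch stop = new CountDownLatch(1);
        try {
            // A healthy "listener" that keeps its worker thread busy until told to stop.
            pool.submit(new Runnable() {
                @Override public void run() {
                    started.countDown();
                    try {
                        stop.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            started.await();
            // A "listener" that dies with an Error.
            Future<?> failed = pool.submit(new Runnable() {
                @Override public void run() { throw new Error("simulated failure"); }
            });
            try {
                failed.get(5, TimeUnit.SECONDS);
            } catch (ExecutionException expected) {
                // submit() stored the Error in the Future; the worker thread itself survived.
            }
            // getActiveCount() is documented as approximate, so poll briefly.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            int active = pool.getActiveCount();
            while (active != 1 && System.nanoTime() < deadline) {
                Thread.sleep(10);
                active = pool.getActiveCount();
            }
            return new int[] { pool.getPoolSize(), active };
        } catch (InterruptedException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            stop.countDown();
            pool.shutdownNow();
        }
    }
}
```

In this sketch the pool size stays at 2 while the active count falls to 1, which matches the behavior described in the question. Note that both getActiveCount() and ThreadGroup.activeCount() are documented as approximations, so an exit decision based on either is best made after re-checking.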

Customized ThreadPoolExecutor

I am writing a customized ThreadPoolExecutor with these extra features:
If the number of threads is at least the core pool size but less than the max pool size, the queue is not full, and there are no idle threads, then create a new thread for the task.
If there are idle threads when a task comes in, assign the task to an idle thread rather than leaving it waiting in the queue.
If all threads (up to the max pool size) are busy, then add new tasks to the queue using the reject method of the RejectedExecutionHandler.
I have overridden execute method of ThreadPoolExecutor version java 1.5.
The new code is as follows:-
public void execute(Runnable command) {
System.out.println(" Active Count: "+getActiveCount()+" PoolSize: "+getPoolSize()+" Idle Count: "+(getPoolSize()-getActiveCount())+" Queue Size: "+getQueue().size());
if (command == null)
throw new NullPointerException();
for (;;) {
if (runState != RUNNING) {
reject(command);
return;
}
if (poolSize < corePoolSize && addIfUnderCorePoolSize(command)) {
return;
}
if (runState == RUNNING && (getPoolSize()-getActiveCount() != 0) && workQueue.offer(command)) {
return;
}
int status = addIfUnderMaximumPoolSize(command);
if (status > 0) // created new thread
return;
if (status == 0) { // failed to create thread
reject(command);
return;
}
if (workQueue.offer(command))
return;
// Retry if created a new thread but it is busy with another task
}
}
The legacy code is as below:-
public void execute(Runnable command) {
if (command == null)
throw new NullPointerException();
for (;;) {
if (runState != RUNNING) {
reject(command);
return;
}
if (poolSize < corePoolSize && addIfUnderCorePoolSize(command))
return;
if (workQueue.offer(command))
return;
int status = addIfUnderMaximumPoolSize(command);
if (status > 0) // created new thread
return;
if (status == 0) { // failed to create thread
reject(command);
return;
}
// Retry if created a new thread but it is busy with another task
}
}
The problem now is that it does not create new threads while threads are idle, yet it does not hand tasks to those idle threads either; instead it adds them to the queue. That is not what we want: a task should never wait, and should be processed as soon as possible even if that requires creating a new thread.
PLEASE HELP ME IN RESOLVING THIS ISSUE. Thanks.

If I understand the question, I believe that I've found a solution to the default behavior of the ThreadPoolExecutor that I show in my answer here:
How to get the ThreadPoolExecutor to increase threads to max before queueing?
Basically you extend LinkedBlockingQueue to have it always return false for queue.offer(...), which will add additional threads to the pool if necessary. If the pool is already at max threads and they are all busy, the RejectedExecutionHandler will be called. It is the handler which then does the put(...) into the queue.
See my code there.
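As a rough illustration of that idea (not the code from the linked answer), a sketch might look like the following. The names EagerThreadsDemo, RefusingQueue, newPool and observe are invented here, and the pool sizes are arbitrary. It relies on LinkedBlockingQueue.put() not going through offer(), so the handler can still enqueue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class EagerThreadsDemo {
    /** Queue whose offer() always fails, so execute() tries to add a thread first. */
    static class RefusingQueue extends LinkedBlockingQueue<Runnable> {
        private static final long serialVersionUID = 1L;
        @Override public boolean offer(Runnable r) { return false; }
    }

    static ThreadPoolExecutor newPool(int core, int max) {
        return new ThreadPoolExecutor(core, max, 60, TimeUnit.SECONDS,
                new RefusingQueue(),
                new RejectedExecutionHandler() {
                    @Override
                    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
                        try {
                            // Pool is at max and all threads are busy: now really enqueue.
                            executor.getQueue().put(r);
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                            throw new RejectedExecutionException(e);
                        }
                    }
                });
    }

    /** Returns {pool size after 4 blocking tasks, queue size after a 5th task}. */
    static int[] observe() {
        ThreadPoolExecutor pool = newPool(1, 4);
        final CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = new Runnable() {
            @Override public void run() {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };
        try {
            for (int i = 0; i < 4; i++) {
                pool.execute(blocker);   // each call adds a thread instead of queueing
            }
            int poolSize = pool.getPoolSize();
            pool.execute(blocker);       // pool is at max, so the handler enqueues this one
            int queued = pool.getQueue().size();
            return new int[] { poolSize, queued };
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }
}
```

With a core size of 1 and a max of 4, the pool grows to 4 threads before anything is queued, and the fifth task is the first to wait. One caveat of this sketch: the handler also runs for tasks rejected after shutdown, so production code would probably want to check isShutdown() first.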

As far as I understand the three features you described, I think using ExecutorService would do much more than what you are currently trying to do: an Executor that provides methods to manage termination and methods that can produce a Future for tracking progress of one or more asynchronous tasks, especially with:
1. Cached thread pool: creates as many threads as it needs to execute tasks in parallel. Previously created threads that are available are reused for new tasks.
2. Fixed thread pool: provides a pool with a fixed number of threads. If no thread is available for a task, the task is put in a queue, waiting for another task to end.
Check out this article for a detailed explanation and a nice example.

Rejection handler in Executors.newScheduledThreadPool

I have an ArrayBlockingQueue on which a single-threaded, fixed-rate scheduled executor works.
Tasks may fail. I want to re-run a failed task, or re-insert it into the queue at high priority (at the head).

Some thoughts here:
Why are you using ArrayBlockingQueue and not PriorityBlockingQueue? It sounds like what you need. At first, give all your elements equal priority.
If you receive an exception, re-insert the task into the queue with a higher priority.
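A minimal sketch of that suggestion follows. Everything in it is hypothetical: the class RetryQueueDemo, the PrioritizedJob type, the failuresLeft field used to simulate failures, and the sequence number added so that jobs of equal priority keep their insertion order (PriorityBlockingQueue itself makes no promise about ties).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

class RetryQueueDemo {
    private static final AtomicLong SEQUENCE = new AtomicLong();

    /** A job with a priority; higher priority runs first, ties run in insertion order. */
    static class PrioritizedJob implements Comparable<PrioritizedJob> {
        final String name;
        final int priority;
        final int failuresLeft; // how many more times this job will "fail" (demo only)
        final long sequence = SEQUENCE.getAndIncrement();

        PrioritizedJob(String name, int priority, int failuresLeft) {
            this.name = name;
            this.priority = priority;
            this.failuresLeft = failuresLeft;
        }

        @Override public int compareTo(PrioritizedJob other) {
            if (priority != other.priority) {
                return Integer.compare(other.priority, priority); // higher priority first
            }
            return Long.compare(sequence, other.sequence);        // then oldest first
        }
    }

    /** Drains the queue on the calling thread; returns the order in which jobs were attempted. */
    static List<String> runAll(PriorityBlockingQueue<PrioritizedJob> queue) {
        List<String> attempts = new ArrayList<String>();
        PrioritizedJob job;
        while ((job = queue.poll()) != null) {
            attempts.add(job.name);
            if (job.failuresLeft > 0) {
                // Failed: re-insert with a higher priority so it runs before waiting jobs.
                queue.add(new PrioritizedJob(job.name, job.priority + 1, job.failuresLeft - 1));
            }
        }
        return attempts;
    }

    static List<String> demo() {
        PriorityBlockingQueue<PrioritizedJob> queue = new PriorityBlockingQueue<PrioritizedJob>();
        queue.add(new PrioritizedJob("A", 0, 0));
        queue.add(new PrioritizedJob("B", 0, 1)); // fails once
        queue.add(new PrioritizedJob("C", 0, 0));
        return runAll(queue);
    }
}
```

Here job B fails on its first attempt and is retried immediately, ahead of C, so the attempt order is A, B, B, C.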

Simplest thing might be a priority queue. Attach a retry number to the task. It starts as zero. After an unsuccessful run, throw away all the ones and increment the zeroes and put them back in the queue at a high priority. With this method, you can easily decide to run everything three times, or more, if you want to later. The down side is you have to modify the task class.
The other idea would be to set up another, non-blocking, thread-safe, high-priority queue. When looking for a new task, you check the non-blocking queue first and run what's there. Otherwise, go to the blocking queue. This might work for you as is, and so far it's the simplest solution. The problem is the high priority queue might fill up while the scheduler is blocked on the blocking queue.
To get around this, you'd have to do your own blocking. Both queues should be non-blocking. (Suggestion: java.util.concurrent.ConcurrentLinkedQueue.) After polling both queues with no results, wait() on a monitor. When anything puts something in a queue, it should call notifyAll() and the scheduler can start up again. Great care is needed lest the notification occur after the scheduler has checked both queues but before it calls wait().
Addition:
Prototype code for the third solution, with manual blocking. Some threading is suggested, but the reader will know his or her own situation best. Which bits of code are apt to block waiting for a lock, which are apt to tie up their thread (and core) for minutes while doing extensive work, and which cannot afford to sit around waiting for the other code to finish all need to be considered. For instance, if a failed run can immediately be rerun on the same thread with no time-consuming cleanup, most of this code can be junked.
private final ConcurrentLinkedQueue<Runnable> mainQueue = new ConcurrentLinkedQueue<Runnable>();
private final ConcurrentLinkedQueue<Runnable> prioQueue = new ConcurrentLinkedQueue<Runnable>();
private final Object entryWatch = new Object();
/** Adds a new job to the queue. */
public void addjob( Runnable runjob ) {
mainQueue.add( runjob );
synchronized (entryWatch) { entryWatch.notifyAll(); }
}
/** The endless loop that does the work. */
public void schedule() {
for (;;) {
Runnable run = getOne(); // Avoids lock if successful.
if (run == null) {
// Both queues are empty.
synchronized (entryWatch) {
// Need to check again. Someone might have added and notifiedAll
// since last check. From this point until, wait, we can be sure
// entryWatch is not notified.
run = getOne();
if (run == null) {
// Both queues are REALLY empty.
try { entryWatch.wait(); }
catch (InterruptedException ie) {}
}
}
}
if (run == null) continue; // Woke up from wait(): poll the queues again.
runit( run );
}
}
/** Helper method for the endless loop. */
private Runnable getOne() {
Runnable run = (Runnable) prioQueue.poll();
if (run != null) return run;
return (Runnable) mainQueue.poll();
}
/** Runs a new job. */
public void runit( final Runnable runjob ) {
// Do everything in another thread. (Optional)
new Thread() {
@Override public void run() {
// Run run. (Possibly in own thread?)
// (Perhaps best in thread from a thread pool.)
runjob.run();
// Handle failure (runit only, NOT in runitLast).
// Defining "failure" left as exercise for reader.
if (failure) {
// Put code here to handle failure.
// Put back in queue.
prioQueue.add( runjob );
synchronized (entryWatch) { entryWatch.notifyAll(); }
}
}
}.start();
}
/** Reruns a job. */
public void runitLast( final Runnable runjob ) {
// Same code as "runit", but don't put "runjob" in "prioQueue" on failure.
}

Java Concurrency in Practice: race condition in BoundedExecutor?

There's something odd about the implementation of the BoundedExecutor in the book Java Concurrency in Practice.
It's supposed to throttle task submission to the Executor by blocking the submitting thread when there are enough threads either queued or running in the Executor.
This is the implementation (after adding the missing rethrow in the catch clause):
public class BoundedExecutor {
    private final Executor exec;
    private final Semaphore semaphore;

    public BoundedExecutor(Executor exec, int bound) {
        this.exec = exec;
        this.semaphore = new Semaphore(bound);
    }

    public void submitTask(final Runnable command) throws InterruptedException, RejectedExecutionException {
        semaphore.acquire();
        try {
            exec.execute(new Runnable() {
                @Override public void run() {
                    try {
                        command.run();
                    } finally {
                        semaphore.release();
                    }
                }
            });
        } catch (RejectedExecutionException e) {
            semaphore.release();
            throw e;
        }
    }
}
When I instantiate the BoundedExecutor with an Executors.newCachedThreadPool() and a bound of 4, I would expect the number of threads instantiated by the cached thread pool to never exceed 4. In practice, however, it does. I've gotten this little test program to create as many as 11 threads:
public static void main(String[] args) throws Exception {
class CountingThreadFactory implements ThreadFactory {
int count;
@Override public Thread newThread(Runnable r) {
++count;
return new Thread(r);
}
}
List<Integer> counts = new ArrayList<Integer>();
for (int n = 0; n < 100; ++n) {
CountingThreadFactory countingThreadFactory = new CountingThreadFactory();
ExecutorService exec = Executors.newCachedThreadPool(countingThreadFactory);
try {
BoundedExecutor be = new BoundedExecutor(exec, 4);
for (int i = 0; i < 20000; ++i) {
be.submitTask(new Runnable() {
@Override public void run() {}
});
}
} finally {
exec.shutdown();
}
counts.add(countingThreadFactory.count);
}
System.out.println(Collections.max(counts));
}
I think there's a tiny time frame between the release of the semaphore and the task ending, in which another thread can acquire a permit and submit a task while the releasing thread hasn't finished yet. In other words, it has a race condition.
Can someone confirm this?

BoundedExecutor was indeed intended as an illustration of how to throttle task submission, not as a way to place a bound on thread pool size. There are more direct ways to achieve the latter, as at least one comment pointed out.
But the other answers don't mention the text in the book that says to use an unbounded queue and to
set the bound on the semaphore to be equal to the pool size plus the
number of queued tasks you want to allow, since the semaphore is
bounding the number of tasks both currently executing and awaiting
execution. [JCiP, end of section 8.3.3]
By mentioning unbounded queues and pool size, we were implying (apparently not very clearly) the use of a thread pool of bounded size.
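The configuration described in that passage can be sketched as follows: a pool of bounded size with an unbounded queue, and a semaphore sized to the pool size plus the number of queued tasks to allow. The class name BoundedFixedPoolDemo, the threadsCreated helper, and the concrete numbers in the usage note are assumptions for illustration; the counting thread factory mirrors the one in the question.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedFixedPoolDemo {
    /** Returns how many threads the pool created while running taskCount throttled tasks. */
    static int threadsCreated(int poolSize, int queuedAllowed, int taskCount) {
        final AtomicInteger created = new AtomicInteger();
        ThreadFactory countingFactory = new ThreadFactory() {
            @Override public Thread newThread(Runnable r) {
                created.incrementAndGet();
                return new Thread(r);
            }
        };
        // Bounded pool size, unbounded queue.
        ExecutorService exec = Executors.newFixedThreadPool(poolSize, countingFactory);
        // Permits = tasks currently executing + tasks allowed to wait in the queue.
        final Semaphore semaphore = new Semaphore(poolSize + queuedAllowed);
        try {
            for (int i = 0; i < taskCount; i++) {
                semaphore.acquire(); // blocks the submitter once the bound is reached
                try {
                    exec.execute(new Runnable() {
                        @Override public void run() {
                            try {
                                // the real work would go here
                            } finally {
                                semaphore.release();
                            }
                        }
                    });
                } catch (RejectedExecutionException e) {
                    semaphore.release();
                    throw e;
                }
            }
            exec.shutdown();
            exec.awaitTermination(30, TimeUnit.SECONDS);
            return created.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            exec.shutdownNow();
        }
    }
}
```

For example, threadsCreated(4, 8, 20000) never reports more than 4 threads, because the thread count is bounded by the pool itself while the semaphore only limits how many tasks are in flight.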
What has always bothered me about BoundedExecutor, however, is that it doesn't implement the ExecutorService interface. A modern way to achieve similar functionality and still implement the standard interfaces would be to use Guava's listeningDecorator method and ForwardingListeningExecutorService class.

You are correct in your analysis of the race condition. There are no synchronization guarantees between the ExecutorService and the Semaphore.
However, I do not know if throttling the number of threads is what the BoundedExecutor is used for. I think it is more for throttling the number of tasks submitted to the service. Imagine you have 5 million tasks to submit, and if you submit more than 10,000 of them you run out of memory.
If you will only ever have 4 threads running at any given time, why would you want to queue up all 5 million tasks? You can use a construct similar to this to throttle the number of tasks queued up at any given time. What you should get out of this is that at any given time there are only 4 tasks running.
Obviously the resolution to this is to use a Executors.newFixedThreadPool(4).

I see as many as 9 threads created at once. I suspect there is a race condition which causes more threads to be created than required.
This could be because there is work to be done before and after running the task. This means that even though there are only 4 threads inside your block of code, there are a number of threads finishing a previous task or getting ready to start a new one.
That is, the thread does a release() while it is still running. Even though it's the last thing you do, it's not the last thing the thread does before acquiring a new task.

how to deal with multiple worker threads that may create new work items

I have a queue that contains work items and I want to have multiple threads work in parallel on those items. When a work item is processed it may result in new work items. The problem I have is that I can't find a solution on how to determine if I'm done. The worker looks like that:
public class Worker implements Runnable {
public void run() {
while (true) {
WorkItem item = queue.nextItem();
if (item != null) {
processItem(item);
}
else {
// the queue is empty, but there may still be other workers
// processing items which may result in new work items
// how to determine if the work is completely done?
}
}
}
}
This seems like a pretty simple problem actually but I'm at a loss. What would be the best way to implement that?
thanks
clarification:
The worker threads have to terminate once none of them is processing an item, but as long as at least one of them is still working they have to wait because it may result in new work items.

What about using an ExecutorService which will allow you to wait for all tasks to finish: ExecutorService, how to wait for all tasks to finish

I'd suggest wait/notify calls. In the else case, your worker threads would wait on an object until notified by the queue that there is more work to do. When a worker creates a new item, it adds it to the queue, and the queue calls notify on the object the workers are waiting on. One of them will wake up to consume the new item.
The methods wait, notify, and notifyAll of class Object support an efficient transfer of control from one thread to another. Rather than simply "spinning" (repeatedly locking and unlocking an object to see whether some internal state has changed), which consumes computational effort, a thread can suspend itself using wait until such time as another thread awakens it using notify. This is especially appropriate in situations where threads have a producer-consumer relationship (actively cooperating on a common goal) rather than a mutual exclusion relationship (trying to avoid conflicts while sharing a common resource).
Source: Threads and Locks

I'd look at something higher level than wait/notify. It's very difficult to get right and avoid deadlocks. Have you looked at java.util.concurrent.CompletionService<V>? You could have a simpler manager thread that polls the service and take()s the results, which may or may not contain a new work item.

Using a BlockingQueue containing items to process along with a synchronized set that keeps track of all elements being processed currently:
BlockingQueue<WorkItem> bQueue;
Set<WorkItem> beingProcessed = Collections.synchronizedSet(new HashSet<WorkItem>());
bQueue.put(workItem);
...
// the following runs over many threads in parallel
while (!(bQueue.isEmpty() && beingProcessed.isEmpty())) {
    WorkItem currentItem = bQueue.poll(50L, TimeUnit.MILLISECONDS); // null for empty queue
    if (currentItem != null) {
        // note: between poll() and add(), both collections can briefly look empty to other threads
        beingProcessed.add(currentItem);
        processItem(currentItem); // possibly bQueue.add(newItem) is called from processItem
        beingProcessed.remove(currentItem);
    }
}
EDIT: as @Hovercraft Full Of Eels suggested, an ExecutorService is probably what you should really use. You can add new tasks as you go along. You can semi-busy wait for termination of all tasks at regular intervals with executorService.awaitTermination(time, timeUnits) and kill all your threads after that.

Here's the beginnings of a queue to solve your problem. Basically, you need to track new work and in-process work.
public class WorkQueue<T> {
    private final List<T> _newWork = new LinkedList<T>();
    private int _inProcessWork;

    public synchronized void addWork(T work) {
        _newWork.add(work);
        notifyAll();
    }

    public synchronized T startWork() throws InterruptedException {
        // wait while there is nothing to take but in-process work may still add more
        while(_newWork.isEmpty() && (_inProcessWork > 0)) {
            wait();
        }
        if(!_newWork.isEmpty()) {
            _inProcessWork++;
            return _newWork.remove(0);
        }
        // everything is done
        return null;
    }

    public synchronized void finishWork() {
        _inProcessWork--;
        if((_inProcessWork == 0) && _newWork.isEmpty()) {
            // wake up waiting workers so they can see that everything is done
            notifyAll();
        }
    }
}
your workers will look roughly like:
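public class Worker<T> implements Runnable {
    private final WorkQueue<T> _queue;

    public Worker(WorkQueue<T> queue) {
        _queue = queue;
    }

    public void run() {
        try {
            T work = null;
            while((work = _queue.startWork()) != null) {
                try {
                    // do work here...
                } finally {
                    _queue.finishWork();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop working when interrupted
        }
    }
}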
The one trick is that you need to add the first work item before you start any workers (otherwise they will all immediately exit).
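The same bookkeeping (counting work that is queued or in process) can also be layered on top of an ExecutorService, as other answers suggest. The sketch below is one possible arrangement, not a canonical one: the class PendingCounterDemo, its fields, and the toy workload (an item of depth d creates two items of depth d - 1) are all invented for the illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PendingCounterDemo {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final AtomicInteger pending = new AtomicInteger();   // queued + in-process items
    private final AtomicInteger processed = new AtomicInteger(); // items handled so far
    private final CountDownLatch done = new CountDownLatch(1);

    /** Counts the item as pending before handing it to the pool. */
    private void submit(final int depth) {
        pending.incrementAndGet();
        pool.execute(new Runnable() {
            @Override public void run() {
                try {
                    processed.incrementAndGet();
                    if (depth > 0) {
                        // Processing an item may create new items.
                        submit(depth - 1);
                        submit(depth - 1);
                    }
                } finally {
                    // Children were counted before this decrement,
                    // so reaching zero means no work is left anywhere.
                    if (pending.decrementAndGet() == 0) {
                        done.countDown();
                    }
                }
            }
        });
    }

    /** Processes a binary tree of work items and returns how many items were handled. */
    static int run(int depth) {
        PendingCounterDemo demo = new PendingCounterDemo();
        try {
            demo.submit(depth);
            if (!demo.done.await(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for work to finish");
            }
            return demo.processed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            demo.pool.shutdown();
        }
    }
}
```

The key design point is the ordering: a new item is counted before it is submitted, and the parent is only uncounted after it has submitted its children, so the counter cannot reach zero while work can still appear. PendingCounterDemo.run(4) processes 31 items (1 + 2 + 4 + 8 + 16).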
