Java Concurrency in Practice: race condition in BoundedExecutor?

There's something odd about the implementation of the BoundedExecutor in the book Java Concurrency in Practice.
It's supposed to throttle task submission to the Executor by blocking the submitting thread when enough tasks are already queued or running in the Executor.
This is the implementation (after adding the missing rethrow in the catch clause):
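import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;

public class BoundedExecutor {
    private final Executor exec;
    private final Semaphore semaphore;

    public BoundedExecutor(Executor exec, int bound) {
        this.exec = exec;
        this.semaphore = new Semaphore(bound);
    }

    public void submitTask(final Runnable command) throws InterruptedException, RejectedExecutionException {
        // Blocks the submitting thread once 'bound' tasks are queued or running.
        semaphore.acquire();
        try {
            exec.execute(new Runnable() {
                @Override public void run() {
                    try {
                        command.run();
                    } finally {
                        // Released by the worker thread just before the wrapper returns.
                        semaphore.release();
                    }
                }
            });
        } catch (RejectedExecutionException e) {
            semaphore.release();
            throw e;
        }
    }
}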
When I instantiate the BoundedExecutor with an Executors.newCachedThreadPool() and a bound of 4, I would expect the number of threads instantiated by the cached thread pool to never exceed 4. In practice, however, it does. I've gotten this little test program to create as many as 11 threads:
public static void main(String[] args) throws Exception {
    // Counts how many threads the pool asks the factory to create.
    class CountingThreadFactory implements ThreadFactory {
        int count;
        @Override public Thread newThread(Runnable r) {
            ++count;
            return new Thread(r);
        }
    }
    List<Integer> counts = new ArrayList<Integer>();
    for (int n = 0; n < 100; ++n) {
        CountingThreadFactory countingThreadFactory = new CountingThreadFactory();
        ExecutorService exec = Executors.newCachedThreadPool(countingThreadFactory);
        try {
            BoundedExecutor be = new BoundedExecutor(exec, 4);
            for (int i = 0; i < 20000; ++i) {
                be.submitTask(new Runnable() {
                    @Override public void run() {}
                });
            }
        } finally {
            exec.shutdown();
        }
        counts.add(countingThreadFactory.count);
    }
    System.out.println(Collections.max(counts));
}
I think there's a tiny time frame between the release of the semaphore and the task ending, in which another thread can acquire a permit and submit a task while the releasing thread hasn't finished yet. In other words, it has a race condition.
Can someone confirm this?

BoundedExecutor was indeed intended as an illustration of how to throttle task submission, not as a way to place a bound on thread pool size. There are more direct ways to achieve the latter, as at least one comment pointed out.
But the other answers don't mention the text in the book that says to use an unbounded queue and to
set the bound on the semaphore to be equal to the pool size plus the
number of queued tasks you want to allow, since the semaphore is
bounding the number of tasks both currently executing and awaiting
execution. [JCiP, end of section 8.3.3]
By mentioning unbounded queues and pool size, we were implying (apparently not very clearly) the use of a thread pool of bounded size.
What has always bothered me about BoundedExecutor, however, is that it doesn't implement the ExecutorService interface. A modern way to achieve similar functionality and still implement the standard interfaces would be to use Guava's listeningDecorator method and ForwardingListeningExecutorService class.

You are correct in your analysis of the race condition. There are no synchronization guarantees between the ExecutorService and the Semaphore.
However, I do not know if throttling the number of threads is what the BoundedExecutor is used for. I think it is more for throttling the number of tasks submitted to the service. Imagine you have 5 million tasks to submit, and submitting more than 10,000 of them at once makes you run out of memory.
You will only ever have 4 threads running at any given time, so why would you want to queue up all 5 million tasks? You can use a construct similar to this to throttle the number of tasks queued up at any given time. What you should get out of this is that at any given time there are only 4 tasks running.
Obviously the resolution to this is to use an Executors.newFixedThreadPool(4).
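As a rough sketch of that combination (not the book's code): the same acquire/release pattern around a fixed-size pool, with the semaphore bound set to pool size plus the number of queued tasks allowed. The class name BoundedSubmitDemo, the method runAndCountThreads and the numbers 4, 6 and 20000 are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedSubmitDemo {
    // Submits 'tasks' empty tasks through a semaphore-throttled fixed pool
    // and returns how many threads the pool created. All names are hypothetical.
    static int runAndCountThreads(int poolSize, int queued, int tasks) throws InterruptedException {
        AtomicInteger created = new AtomicInteger();
        ThreadFactory factory = r -> {
            created.incrementAndGet();
            return new Thread(r);
        };
        // The pool bounds the thread count; the semaphore bounds running + queued tasks.
        ExecutorService exec = Executors.newFixedThreadPool(poolSize, factory);
        Semaphore semaphore = new Semaphore(poolSize + queued);
        try {
            for (int i = 0; i < tasks; i++) {
                semaphore.acquire(); // blocks the submitter when the bound is reached
                try {
                    exec.execute(() -> {
                        try {
                            // empty task body, as in the test program above
                        } finally {
                            semaphore.release();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    semaphore.release();
                    throw e;
                }
            }
        } finally {
            exec.shutdown();
        }
        exec.awaitTermination(1, TimeUnit.MINUTES);
        return created.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("threads created: " + runAndCountThreads(4, 6, 20000));
    }
}
```

Here the early release() no longer matters for the thread count: a task submitted while the releasing worker is still finishing simply waits in the pool's queue instead of triggering a new thread.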

I see as many as 9 threads created at once. I suspect there is a race condition which causes more threads to be created than required.
This could be because there is work to be done before and after running the task. Even though only 4 threads are inside your block of code at a time, other threads may be finishing a previous task or getting ready to start a new one.
That is, the thread calls release() while it is still running. Even though it's the last thing your code does, it's not the last thing the thread does before acquiring a new task.

Related

Executors distribution of task per threads

I am fairly new to Java executors, so this may be an easy question.
ExecutorService executorService = Executors.newFixedThreadPool(NumberOfThreads - 1);
do_work();
for (int i = 1; i < NumberOfThreads; i++)
{
    executorService.execute(new Runnable()
    {
        public void run()
        {
            do_work();
        }
    });
}
My question is:
If I create a fixed thread pool with 'N' threads and execute 'N' tasks, as in the code above, am I guaranteed that each thread will execute only one task (do_work())?

No. It's a pool, and the assignment of threads to tasks doesn't make such guarantees.
e.g. imagine your do_work() method completes immediately. By the time you submit your 2nd Runnable, all the threads in the pool will be available, and any one of them will be a candidate for your job.

how to synchronize a set of multiple threads with respect to a single thread in Java

Suppose that I have an arraylist called myList of threads all of which are created with an instance of the class myRunnable implementing the Runnable interface, that is, all the threads share the same code to execute in the run() method of myRunnable. Now suppose that I have another single thread called singleThread that is created with an instance of the class otherRunnable implementing the Runnable interface.
The synchronization challenge I have to resolve for these threads is the following: I need all of the threads in myList to execute their code up to a certain point. Once they reach this point, they should sleep. Once all of the threads in myList are sleeping, and only then, singleThread should be awakened (singleThread was already asleep). Then singleThread executes its own stuff, and when it is done, it should sleep and all the threads in myList should be awakened. Imagine that the code is wrapped in while(true) loops, so this process must happen again and again.
Here is an example of the situation I've just described including an attempt of solving the synchronization problem:
import java.util.ArrayList;

class myRunnable implements Runnable
{
    public static final Object lock = new Object();
    static int count = 0;

    @Override
    public void run()
    {
        while (true)
        {
            // do stuff
            barrier();
            // do stuff
        }
    }

    void barrier()
    {
        try {
            synchronized (lock) {
                count++;
                if (count == Program.myList.size()) {
                    count = 0;
                    // the last thread to arrive wakes up singleThread
                    synchronized (otherRunnable.lock) {
                        otherRunnable.lock.notify();
                    }
                }
                lock.wait();
            }
        } catch (InterruptedException ex) {}
    }
}

class otherRunnable implements Runnable
{
    public static final Object lock = new Object();

    @Override
    public void run()
    {
        while (true)
        {
            try {
                synchronized (lock) {
                    lock.wait();
                }
            } catch (InterruptedException ex) {}
            // do stuff
            synchronized (myRunnable.lock) {
                myRunnable.lock.notifyAll();
            }
        }
    }
}

class Program
{
    public static ArrayList<Thread> myList;

    public static void main(String[] args)
    {
        myList = new ArrayList<Thread>();
        for (int i = 0; i < 10; i++)
        {
            myList.add(new Thread(new myRunnable()));
            myList.get(i).start();
        }
        new Thread(new otherRunnable()).start();
    }
}
Basically my idea is to use a counter to make sure that threads in myList just wait except the last thread incrementing the counter, which resets the counter to 0, wakes up singleThread by notifying to its lock, and then this last thread goes to sleep as well by waiting to myRunnable.lock. In a more abstract level, my approach is to use some sort of barrier for threads in myList to stop their execution in a critical point, then the last thread hitting the barrier wakes up singleThread and goes to sleep as well, then singleThread makes its stuff and when finished, it wakes up all the threads in the barrier so they can continue again.
My problem is that there is a flaw in my logic (probably there are more). When the last thread hitting the barrier notifies otherRunnable.lock, there is a chance that an immediate context switch could occur, giving the cpu to singleThread, before the last thread could execute its wait on myRunnable.lock (and going to sleep). Then singleThread would execute all its stuff, would execute notifyAll on myRunnable.lock, and all the threads in myList would be awakened except the last thread hitting the barrier because it has not yet executed its wait command. Then, all those threads would do their stuff again and would hit the barrier again, but the count would never be equal to myList.size() because the last thread mentioned earlier would be eventually scheduled again and would execute wait. singleThread in turn would also execute wait in its first line, and as a result we have a deadlock, with everybody sleeping.
So my question is: what would be a good way to synchronize these threads to achieve the behaviour described above while staying safe from deadlocks?

Based on your comment, sounds like a CyclicBarrier would fit your need exactly. From the docs (emphasis mine):
A synchronization aid that allows a set of threads to all wait for each other to reach a common barrier point. CyclicBarriers are useful in programs involving a fixed sized party of threads that must occasionally wait for each other. The barrier is called cyclic because it can be re-used after the waiting threads are released.
Unfortunately, I haven't used them myself, so I can't give you specific pointers on them. I think the basic idea is you construct your barrier using the two-argument constructor with the barrierAction. Have your n threads await() on this barrier after this task is done, after which barrierAction is executed, after which the n threads will continue.
From the javadoc for CyclicBarrier#await():
If the current thread is the last thread to arrive, and a non-null barrier action was supplied in the constructor, then the current thread runs the action before allowing the other threads to continue. If an exception occurs during the barrier action then that exception will be propagated in the current thread and the barrier is placed in the broken state.
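A minimal sketch of that idea, assuming the work of singleThread can be expressed as the barrier action. Note that the action then runs on the last worker thread to arrive rather than on a separate thread, which differs from the setup in the question. The class name BarrierDemo, the method run and the worker and round counts are made up for illustration.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class BarrierDemo {
    // Runs 'workers' threads for 'rounds' iterations each and returns
    // how many times the barrier action was executed. Names are hypothetical.
    static int run(int workers, int rounds) throws InterruptedException {
        AtomicInteger actionRuns = new AtomicInteger();
        // The barrier action stands in for singleThread's work: it runs once per
        // round, after every worker has arrived and before any worker is released.
        CyclicBarrier barrier = new CyclicBarrier(workers, () -> actionRuns.incrementAndGet());

        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    for (int r = 0; r < rounds; r++) {
                        // do stuff
                        barrier.await(); // sleep here until all workers have arrived
                        // do stuff
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return actionRuns.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("barrier action ran " + run(10, 3) + " times");
    }
}
```

Because the barrier does the counting and the waking itself, there is no window between a notify and a wait in which a wake-up could be lost.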

How to shutdown many instances of the ExecutorService Runnables?

I am attempting to understand how to handle many instances of the ExecutorService executing Runnable commands. With regards to the code provided, how many shutdowns are required if I execute a hundred Runnables with the fixed thread pool set to one? I think the code should execute a hundred futures sequentially in the for loop execution order with a single thread (never spawns more than a single thread), and requires a single ExecutorService shutdown. Is this correct? Also, it's ok to call shutdown right after the for loop completes because all hundred of the futures are in queue so that the executorService shutdown will occur automatically after all hundred futures complete. Just looking for some clarification, thanks.
public static void main(String[] args)
{
    ExecutorService executorService = Executors.newFixedThreadPool(1);
    for (int i = 0; i < 100; i++)
    {
        executorService.execute(new Runnable() {
            @Override
            public void run()
            {
                // do stuff
            }
        });
    }
    executorService.shutdown();
}

Looks like you've got the right idea. It doesn't matter how many Runnables you've handed over to the ExecutorService to run or how big a thread pool you've allocated, you only need to call shutdown() once. That will allow all tasks to complete but will not allow you to add any new ones. You may want to call
try {
executorService.awaitTermination(5, TimeUnit.MINUTES);
} catch (InterruptedException e) {
// do stuff
}
to block while all tasks are completed depending on your usage scenario.
If you want to shutdown and attempt to kill all running tasks, instead call the shutdownNow() method. Note that there is no guarantee that it will be able to interrupt running tasks.
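Putting the pieces together, one possible shape is a single shutdown(), a bounded wait, and shutdownNow() only as a fallback. This is a sketch, not the only way to do it; the class name ShutdownDemo, the method run and the 5-second timeout are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ShutdownDemo {
    // Queues 'tasks' trivial tasks on a single-thread pool, shuts the pool down
    // once, and returns how many tasks completed. Names are hypothetical.
    static int run(int tasks) throws InterruptedException {
        ExecutorService executorService = Executors.newFixedThreadPool(1);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            executorService.execute(() -> completed.incrementAndGet());
        }
        // One call is enough: already-queued tasks still run, new ones are rejected.
        executorService.shutdown();
        // Wait for the queue to drain; fall back to interrupting if it takes too long.
        if (!executorService.awaitTermination(5, TimeUnit.SECONDS)) {
            executorService.shutdownNow();
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed: " + run(100));
    }
}
```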

Java threadpools and runnables creating runnables

Bear with me as I'm not terribly savvy in multithreaded programming...
I'm currently building out a system that uses a ThreadPool ExecutorService for various runnables. That much is straightforward. However, I'm looking at the possibility of having the runnables themselves spawn an additional runnable based on what happens in the original runnable (ie, if success, do this, if fail, do this, etc as some tasks must be complete before others execute). It should be noted that the main thread does not need to be notified of the results of these tasks, although it might be handy for handling exceptions, ie, if an external service cannot be contacted and all threads are throwing exceptions as a result, then stop submitting tasks and periodically check on the external service until it comes back up. This isn't completely necessary, but it would be nice.
Ie, submit Task A. Task A does some things. If everything goes well, Task A will execute Task B. If something doesn't work out properly or an exception is thrown, execute Task C. Each child task may also have additional tasks, but only a few levels deep. I'd much rather do something like this than large, snarled conditionals in a single task as this approach allows for much greater flexibility.
However, I'm not certain how this would affect the thread pool. I would assume that any additional thread(s) created from within a thread in the pool would exist outside of the pool as they themselves were not submitted directly to the pool. Is this a correct assumption? If so, it's likely a bad idea (well, if not, it may not be a very good idea anyway) as it could result in a lot more threads as the original thread completes and a new task is submitted while the thread spawned from the earlier task is still going (and may last considerably longer than others).
I've also considered implementing these as Callables instead and placing a response object in the Future that is returned, then add the appropriate Callable to the thread pool based on the response. However, this would tie all actions back to the main thread, which seems an unnecessary bottleneck. I suppose I could place a Runnable into the pool that itself handles the execution of the Callable and subsequent actions, but then I get twice as many threads.
Am I on the right track here or am I completely off the rails?

I have never used this, but it can be useful for you: http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html

There are many ways to do what you want. You need to be careful you don't end up creating too many threads.
The following is an example. You could make it more efficient with an ExecutorCompletionService, and you could use Runnables instead of Callables.
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadsMakeThreads {
    public static void main(String[] args) {
        new ThreadsMakeThreads().start();
    }

    public void start() {
        //Create resources
        ExecutorService threadPool = Executors.newCachedThreadPool();
        Random random = new Random(System.currentTimeMillis());
        int numberOfThreads = 5;
        //Prepare threads
        ArrayList<Leader> leaders = new ArrayList<Leader>();
        for (int i = 0; i < numberOfThreads; i++) {
            leaders.add(new Leader(threadPool, random));
        }
        //Get the results
        try {
            List<Future<Integer>> results = threadPool.invokeAll(leaders);
            for (Future<Integer> result : results) {
                System.out.println("Result is " + result.get());
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (ExecutionException e) {
            e.printStackTrace();
        }
        threadPool.shutdown();
    }

    class Leader implements Callable<Integer> {
        private ExecutorService threadPool;
        private Random random;

        public Leader(ExecutorService threadPool, Random random) {
            this.threadPool = threadPool;
            this.random = random;
        }

        @Override
        public Integer call() throws Exception {
            int numberOfWorkers = random.nextInt(10);
            ArrayList<Worker> workers = new ArrayList<Worker>();
            for (int i = 0; i < numberOfWorkers; i++) {
                workers.add(new Worker(random));
            }
            List<Future<Integer>> tasks = threadPool.invokeAll(workers);
            int result = 0;
            for (Future<Integer> task : tasks) {
                result += task.get();
            }
            return result;
        }
    }

    class Worker implements Callable<Integer> {
        private Random random;

        public Worker(Random random) {
            this.random = random;
        }

        @Override
        public Integer call() throws Exception {
            return random.nextInt(100);
        }
    }
}

Submitting tasks to the thread pool from other tasks is quite a sensible idea. But I am afraid you are thinking of running the new tasks on separate threads, which really can eat all the memory. Just set a limit on the number of threads when the pool is created, and submit new tasks to that thread pool.
This approach can be elaborated in different directions. First, treat tasks as ordinary objects with interface methods, and let those methods decide whether to submit the object to the thread pool. This requires that each task knows its thread pool: pass it as a parameter at creation time. Even more conveniently, keep a reference to the thread pool in a thread-local variable.
You can easily emulate functional programming: an object represents a function call, and it has a corresponding set method for each parameter. When all parameters are set, the object is submitted to the thread pool.
Another direction is actor programming: the task class has a single set method, but it can be called multiple times. If the previous argument has not yet been processed, the set method does not submit the task to the thread pool but simply stores its argument in a queue. The run() method processes all available arguments from the queue and then returns.
All these features are implemented in the dataflow library https://github.com/rfqu/df4j. I wrote it specifically to support task-based parallelism.
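The basic idea at the start of this answer, a task handing its follow-up to the same bounded pool, might look roughly like this. It is a sketch and does not use the df4j library; the class name ChainedTasks, the methods runChain, doWork and demo, and the task labels "B" and "C" are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

public class ChainedTasks {
    // Stand-in for Task A's real work; fails on demand. Hypothetical helper.
    static void doWork(boolean succeed) {
        if (!succeed) {
            throw new IllegalStateException("simulated failure");
        }
    }

    // Runs Task A on the pool. A then submits Task B (success) or Task C (failure)
    // to the same pool. Returns the label of the follow-up task that ran.
    static String runChain(ExecutorService pool, boolean succeed) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<String> outcome = new AtomicReference<>();
        pool.execute(() -> {
            boolean ok = true;
            try {
                doWork(succeed);
            } catch (RuntimeException e) {
                ok = false;
            }
            String next = ok ? "B" : "C";
            // Fire and forget: Task A does not wait for its child, so it
            // never holds a pool thread while the child is still queued.
            pool.execute(() -> {
                outcome.set(next);
                done.countDown();
            });
        });
        done.await(); // only the demo waits; a real caller would not need to
        return outcome.get();
    }

    // Convenience wrapper that creates and shuts down a small bounded pool.
    static String demo(boolean succeed) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            return runChain(pool, succeed);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("after success ran " + demo(true));
        System.out.println("after failure ran " + demo(false));
    }
}
```

No extra threads are created: the child tasks queue up behind whatever else the pool is running. The design choice that matters is that the parent does not block on the child; blocking inside a task on a bounded pool can exhaust the pool's threads.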

Java Thread Sleep

I have a main for-loop that sends out requests to an external system. The external system might take a few seconds or even minutes to respond back.
Also, if the number of requests reaches the MAX_REQUESTS, the current for-loop should SLEEP for a few seconds.
This is my scenario. Let's say the main for-loop goes to sleep for 5 seconds because it has reached MAX_REQUESTS. Then, say, a previous external request returns from callExternalSystem(). What will happen to the main for-loop thread that is currently in the SLEEP state? Will it be interrupted and continue processing, or will it continue to SLEEP?
for(...){
    ...
    while(numRequestsProcessing > MAX_REQUESTS){
        Thread.sleep(SLEEP_TIME);
    }
    ...
    callExternalSystem();
}
Thanks in advance.

Unless you've got some code to interrupt the sleeping thread, it will continue sleeping until the required time has elapsed. If you don't want that to happen, you could possibly use wait()/notify() instead of sleep() so that another thread can notify the object that the main thread is sleeping on, in order to wake it up. That relies on there being another thread to notice that the external system has responded, of course - it's not really clear how you're getting responses back.
EDIT: It sounds like really you should use a Semaphore. Each time the main thread wants to issue a request, it acquires a permit. Each time there's a response, that releases a permit. Then you just need to set it up with as many permits as you want concurrent requests. Use tryAcquire if you want to be able to specify a timeout in the main thread - but think about what you want to do if you already have as many requests outstanding as you're really happy with.
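A rough sketch of the Semaphore approach, assuming each request is handed to a worker thread that releases the permit when the response arrives. Thread.sleep stands in for callExternalSystem(), and the class name ThrottledRequests, the method run and the counters are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThrottledRequests {
    // Issues 'totalRequests' simulated requests with at most 'maxRequests'
    // outstanding, and returns the highest number observed in flight.
    static int run(int maxRequests, int totalRequests) throws InterruptedException {
        Semaphore permits = new Semaphore(maxRequests);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newCachedThreadPool();
        for (int i = 0; i < totalRequests; i++) {
            // Blocks the main loop while maxRequests requests are outstanding,
            // and wakes up as soon as one response releases a permit.
            permits.acquire();
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            pool.execute(() -> {
                try {
                    Thread.sleep(5); // stand-in for callExternalSystem()
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                    permits.release(); // the response came back
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("peak concurrent requests: " + run(3, 30));
    }
}
```

Compared with the sleep loop in the question, the main thread resumes as soon as a response arrives instead of sleeping out the rest of SLEEP_TIME.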

I would use java.util.concurrent.Executors to create a thread pool with MAX_REQUESTS threads. Create a java.util.concurrent.CountDownLatch for however many requests you're sending out at once. Pass the latch to the Runnables that make the request, they call countDown() on the latch when complete. The main thread then calls await(timeout) on the latch. I would also suggest the book "Java Concurrency in Practice".
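A minimal sketch of the latch approach described in this answer, with an empty task body standing in for the real request. The class name LatchDemo, the method run and the 10-second timeout are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LatchDemo {
    // Sends a batch of simulated requests on a pool of 'maxRequests' threads and
    // returns true if the whole batch finished before the timeout.
    static boolean run(int maxRequests, int batchSize) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(maxRequests);
        CountDownLatch latch = new CountDownLatch(batchSize);
        for (int i = 0; i < batchSize; i++) {
            pool.execute(() -> {
                try {
                    // the request to the external system would go here
                } finally {
                    latch.countDown(); // always count down, even if the request fails
                }
            });
        }
        // The main thread waits here instead of polling with Thread.sleep().
        boolean finished = latch.await(10, TimeUnit.SECONDS);
        pool.shutdown();
        return finished;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("batch finished in time: " + run(4, 20));
    }
}
```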

One approach is to use a ThreadPoolExecutor which blocks whenever there is no free thread.
ThreadPoolExecutor executor = new ThreadPoolExecutor(MAX_REQUESTS, MAX_REQUESTS, 60, TimeUnit.SECONDS, new SynchronousQueue<Runnable>(), new RejectedExecutionHandler() {
    @Override
    public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
        try {
            // Block the submitter until a worker thread takes the task.
            executor.getQueue().offer(r, Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
});
for (int i = 0; i < LOTS_OF_REQUESTS; i++) {
    final int finalI = i;
    executor.submit(new Runnable() {
        @Override
        public void run() {
            request(finalI);
        }
    });
}
Another approach is to have the tasks generate their own requests. This way a new request is issued as soon as a thread becomes free.
ExecutorService executor = Executors.newFixedThreadPool(MAX_REQUESTS);
final AtomicInteger counter = new AtomicInteger();
for (int i = 0; i < MAX_REQUESTS; i++) {
    executor.submit(new Runnable() {
        @Override
        public void run() {
            int i;
            // Each worker keeps claiming the next request index until none are left.
            while ((i = counter.getAndIncrement()) < LOTS_OF_REQUESTS)
                request(i);
        }
    });
}
