ExecutorService awaitTermination method not performing as expected - java

I'm trying to write a process in Java that executes a series of tasks concurrently, waits for the tasks to be done, then tags the overall process as complete. Each task has its own information, including when the individual task is complete. I'm using an ExecutorService for the process, and have boiled down the essence of the process as follows:
List<Foo> foos = getFoos();
ExecutorService executorService = Executors.newFixedThreadPool(foos.size());
for (Foo foo : foos) {
executorService.execute(new MyRunnable(foo));
}
executorService.shutdown();
try {
executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
} catch (InterruptedException e) {
// log the error.
}
completeThisProcess();
Each of the MyRunnable objects has a run method that makes a webservice call, then writes the results of the call to the database, including the time the call completed. The completeThisProcess method simply writes the status of the whole process as complete along with the time the process completed.
The problem I'm having is that when I look in the database after the process has completed, the completeThisProcess method has apparently been able to execute before all of the MyRunnables have completed. I'm noticing that the times that are written from the completeThisProcess method are even occasionally upwards of 20-30 seconds before the last MyRunnable task has completed.
Is there anything there that is obviously wrong with the process I've written? Perhaps I'm not understanding the ExecutorService correctly, but I thought that the awaitTermination method should be ensuring that all of the MyRunnable instances have completed their run methods (assuming they complete without exception, of course), which would result in all the sub-tasks having completion times before the overall process's completion time.

If you want to wait for all the tasks to finish, the following approach is reliable.
Make your task class implement the Callable interface instead of Runnable. (With Callable, the call method returns a value; have it return the thread name, for example.)
Create a list of Callable objects and pass it to invokeAll, which waits for all of the tasks to complete. For the code below, assume the task class is named MyCallable.
ExecutorService executorService = Executors.newFixedThreadPool(foos.size());
List<Callable<String>> tasks = new ArrayList<>();
for (Foo foo : foos) {
tasks.add(new MyCallable(foo));
}
executorService.invokeAll(tasks);
invokeAll returns a List of Future objects, in case you want to use the results.
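A minimal sketch of the invokeAll approach, using a lambda as a stand-in for MyCallable (the class name InvokeAllSketch and the "done:" results are made up for illustration). Because invokeAll only returns once every task has finished, each Future it hands back is already complete:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllSketch {

    // Runs one task per name and returns the results in submission order.
    static List<String> runAll(List<String> names) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, names.size()));
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String name : names) {
                // Hypothetical stand-in for the web service call and database write.
                tasks.add(() -> "done:" + name);
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has finished.
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get()); // a task failure surfaces as ExecutionException
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll(List.of("a", "b", "c")));
    }
}
```

Calling get() on each Future is also where an exception thrown inside a task becomes visible to the caller.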
OR
you can use a CountDownLatch.
CountDownLatch cdl = new CountDownLatch(foos.size());
Have each task count down at the end of its run method by calling cdl.countDown().
Call cdl.await() after the for loop; it blocks until the count reaches zero.
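The latch variant could look roughly like this. It is a sketch only: LatchSketch and the counter that stands in for the real work are assumptions, not part of the original code. Counting down in a finally block keeps await() from hanging if a task throws:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class LatchSketch {

    // Runs taskCount tasks (taskCount > 0) and returns how many had finished
    // by the time await() returned.
    static int runAndWait(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(taskCount);
        CountDownLatch latch = new CountDownLatch(taskCount);
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                try {
                    finished.incrementAndGet(); // stand-in for the real work
                } finally {
                    latch.countDown(); // count down even if the work throws
                }
            });
        }
        try {
            latch.await(); // blocks until the count reaches zero
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdown();
        }
        return finished.get();
    }
}
```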

Related

Specify order of execution of task while using single threaded executor

I currently have a bunch of tasks that I want to execute. I am using the single-threaded executor in java. These are mainly of 2 types. Let's call these types TaskA and TaskB. I have 10 tasks of type TaskA and 5 tasks of type TaskB. I have to execute them all but I cannot control the sequence in which they are submitted to the executor. A few tasks of type TaskB might be submitted to the executor before all 10 tasks of type TaskA have been submitted. Is there any way to ensure that all 10 tasks of type TaskA are executed before the 5 tasks of type TaskB? In order to successfully execute all tasks of type TaskB, I need to first execute all tasks of type TaskA. You may think of TaskA tasks to be responsible for data loading and TaskB tasks for data processing. Without loading the data I cannot process it and run into exceptions
Please do let me know if I can phrase the question better if it is unclear
No, the default executor service implementations do not differentiate between submitted tasks.
You need a task manager object.
Define a class that contains the single-threaded executor service as a member field. That class offers methods submit( TaskA ta ) and submit( TaskB tb ).
The second method collects the B tasks, as a queue, holding them for now if we’ve not yet processed ten A tasks.
The first method accepts each A task, submitting it to the executor service immediately, and counts those submissions. On the tenth A task, a flag is set, and all stored B tasks are submitted to the member field executor service.
The second method always checks for that “A tasks done” flag being set. If set, any further B tasks submissions are sent directly to the executor service.
Your task manager class could itself implement the ExecutorService interface. But I don’t know if I would go that far.
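The task manager described above might be sketched as follows. TaskA and TaskB are modeled here as hypothetical marker interfaces extending Runnable, and the expected number of A tasks is passed in rather than hard-coded to ten; both are assumptions for illustration. Since the executor is single-threaded and runs tasks in submission order, releasing the B tasks right after the last A task is submitted is enough to guarantee the ordering:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical marker types standing in for the question's two task kinds.
interface TaskA extends Runnable {}
interface TaskB extends Runnable {}

class TaskManager {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Queue<TaskB> heldBs = new ArrayDeque<>();
    private final int expectedAs;
    private int submittedAs = 0;

    TaskManager(int expectedAs) {
        this.expectedAs = expectedAs;
    }

    // A tasks go straight to the executor; the last expected one releases the held B tasks.
    synchronized void submit(TaskA task) {
        executor.submit(task);
        if (++submittedAs == expectedAs) {
            for (TaskB held : heldBs) {
                executor.submit(held);
            }
            heldBs.clear();
        }
    }

    // B tasks are held back until every expected A task has been submitted.
    synchronized void submit(TaskB task) {
        if (submittedAs >= expectedAs) {
            executor.submit(task);
        } else {
            heldBs.add(task);
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

When passing lambdas, the overloads need a cast to pick the task kind, for example manager.submit((TaskA) () -> loadData()), where loadData is whatever the A task does.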
The way I think you could do this is using the semaphore/locking pattern.
First, you need a lock. Any object will do:
Object lock = new Object();
Then you need a count of how many A tasks have completed.
int completedAs = 0; // variable name is completed 'A's, not 'as'
Both of these should be static or otherwise available to TaskA and TaskB. Then what you can do is only add the TaskB's to the ExecutorService when the appropriate number of TaskA's have completed, like
for (TaskB taskB : allTaskBs) {
synchronized(lock) {
//add the taskB to the ExecutorService
}
}
And then upon taskA completion:
synchronized(lock) {
completedAs++;
if (...) {
lock.notify(); // you can call this multiple times to release multiple B's
}
}
Here is something of a weird solution. Provided you have a default executor.
ExecutorService service = Executors.newSingleThreadExecutor();
You need to keep track of how many A tasks have completed and how many need to complete.
AtomicInteger a = new AtomicInteger(0);
int totalA = 10;
Then for submitting a task.
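void submitTask(Runnable t) {
    // An anonymous class is used instead of a lambda so that 'this' refers to the wrapper itself.
    Runnable r = new Runnable() {
        @Override
        public void run() {
            if (t instanceof TaskA) {
                try {
                    t.run();
                } finally {
                    a.incrementAndGet();
                }
            } else if (t instanceof TaskB) {
                if (a.get() >= totalA) {
                    t.run();
                } else {
                    // Not all A tasks are done yet: put this wrapper back at the end of the queue.
                    service.submit(this);
                }
            } else {
                throw new RuntimeException("not an acceptable task!");
            }
        }
    };
    service.submit(r);
}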
This will filter the TaskA's and the TaskB's, TaskA's will be immediately executed, but TaskB's will be resubmitted.
There are some flaws to this design. I think ThreadPoolExecutor can be setup a little better where you reject a task if it is not ready to be run.
I suspect that you could design your setup a little better. There are tools such as ExecutorCompletionService and CountDownLatch that are made for creating barriers to execution.

ExecutorService SingleThreadExecutor

I have a list of objects, some of which, depending on user interaction, need to do work asynchronously. Something like this:
for(TheObject o : this.listOfObjects) {
o.doWork();
}
The class TheObject implements an ExecutorService (SingleThread!), which is used to do the work. Every object of type TheObject instantiates an ExecutorService. I don't want to make lasagna code. I don't have enough Objects at the same time, to make an extra extraction layer with thread pooling needed.
I want to cite the Java Documentation about CachedThreadPools:
Threads that have not been used for sixty seconds are terminated and
removed from the cache. Thus, a pool that remains idle for long enough
will not consume any resources.
First question: Is this also true for a SingleThreadExecutor? Does the thread get terminated? JavaDoc doesn't say anything about SingleThreadExecutor. It wouldn't even matter in this application, as I have an amount of objects I can count on one hand. Just curiosity.
Furthermore the doWork() method of TheObject needs to call the ExecutorService#.submit() method to do the work async. Is it possible (I bet it is) to call the doWork() method implicitly? Is this a viable way of designing an async method?
void doWork() {
if(!isRunningAsync) {
myExecutor.submit(doWork());
} else {
// Do Work...
}
}
First question: Is this also true for a SingleThreadExecutor? Does the thread get terminated?
Take a look at the source code of Executors, comparing the implementations of newCachedThreadPool and newSingleThreadExecutor:
public static ExecutorService newCachedThreadPool() {
return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
60L, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>());
}
public static ExecutorService newSingleThreadExecutor() {
return new FinalizableDelegatedExecutorService
(new ThreadPoolExecutor(1, 1,
0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<Runnable>()));
}
The primary difference (of interest here) is the 60L, TimeUnit.SECONDS and 0L, TimeUnit.MILLISECONDS.
Effectively (but not actually), these parameters are passed to ThreadPoolExecutor.setKeepAliveTime. Looking at the Javadoc of that method:
A time value of zero will cause excess threads to terminate immediately after executing tasks.
where "excess threads" actually refers to "threads in excess of the core pool size".
The cached thread pool is created with zero core threads, and an (effectively) unlimited number of non-core threads; as such, any of its threads can be terminated after the keep alive time.
The single thread executor is created with 1 core thread and zero non-core threads; as such, there are no threads which can be terminated after the keep alive time: its one core thread remains active until you shut down the entire ThreadPoolExecutor.
(Thanks to @GPI for pointing out that I was wrong in my interpretation before).
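If you did want a single worker thread that goes away when idle, one option is to build the ThreadPoolExecutor yourself and call allowCoreThreadTimeOut(true), which applies the keep-alive time to core threads as well. A sketch (the class and method names are made up; note that, unlike the result of newSingleThreadExecutor, this pool is not wrapped and so remains reconfigurable):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class IdleTimeoutSketch {

    // A single-threaded pool whose one core thread is allowed to time out when idle.
    static ThreadPoolExecutor newExpiringSingleThreadPool(long keepAlive, TimeUnit unit) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, keepAlive, unit, new LinkedBlockingQueue<Runnable>());
        // Needs a keep-alive time greater than zero; otherwise this call throws
        // IllegalArgumentException.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```

With this setting the worker thread is terminated after the keep-alive time elapses without work, and a new one is created when the next task arrives.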
First question:
Threads that have not been used for sixty seconds are terminated and removed from the cache. Thus, a pool that remains idle for long enough will not consume any resources.
Is this also true for a SingleThreadExecutor?
SingleThreadExecutor works differently. It has no time-out concept because of the values configured at creation.
The single thread can still terminate, but the executor guarantees that one thread always exists to handle tasks from the task queue.
From newSingleThreadExecutor documentation:
public static ExecutorService newSingleThreadExecutor()
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.)
Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
Second question:
Furthermore the doWork() method of TheObject needs to call the ExecutorService#.submit() method to do the work async
for(TheObject o : this.listOfObjects) {
o.doWork();
}
can be changed to
ExecutorService executorService = Executors.newSingleThreadExecutor();
executorService.execute(new Runnable() {
public void run() {
System.out.println("Asynchronous task");
}
});
executorService.shutdown();
using the Runnable or Callable interface; put your doWork() code in the run() or call() method. The task will then be executed asynchronously.

invokeAll vs CompletionService

I have a class that is accessed by multiple threads; each thread calls one method of this class. Each method in turn runs a number of Callables. The class uses a thread pool from ExecutorService to execute these Callables through the invokeAll(executableTasks) method.
The setup looks like this:
public class MyClass {
private final ExecutorService threadPool = Runtime.getRuntime().availableProcessors();
public void method1() {
List<SomeObject> results = new ArrayList<>();
List<Callable<Void>> tasks = new ArrayList<Callable<Void>>();
tasks.add(new Callable<Void>(){ ... results.add(someObject);} );
threadPool.invokeAll(tasks);
}
public void method2() {
List<SomeObject> results = new ArrayList<>();
List<Callable<Void>> tasks = new ArrayList<Callable<Void>>();
tasks.add(new Callable<Void>(){ ... results.add(someObject);} );
threadPool.invokeAll(tasks);
}
}
I am confused if this will execute tasks in class concurrently or invokeAll() will block execution till tasks in one method completes(means execution will happen concurrently inside methods but not at class level)? Or Should I use CompletionService to find out the corresponding results of tasks?
ExecutorService#invokeAll executes all the tasks concurrently, but the call itself blocks until all the tasks complete.
For example, let's say you have three tasks that take 6 sec, 2 sec, and 10 sec to complete. If you were to execute these synchronously, it would take at least 6 + 2 + 10 = 18 seconds. However, using invokeAll (on a sufficiently large thread pool), this could take as little as the longest time, or 10 seconds.
This means that the methods method1() and method2() are both blocking methods because of the use of invokeAll(). When you call method1(), it will block until all of the requests added into the list of callables are complete. Same goes for method2(). If these methods are called from different threads, then the tasks in both methods will execute concurrently.
If you want the methods to be asynchronous, you'll want to call threadPool.submit(callable) individually for each task inside the methods and collect the returned futures in a list. You could either return the list of futures or use a CompletionService to help with this, yes.
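As a rough illustration of the CompletionService route, the sketch below submits each task individually and then collects results as they complete. The squaring task and the names CompletionServiceSketch and squares are placeholders, not part of the original class:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionServiceSketch {

    // Submits one task per input and collects results in the order the tasks finish.
    static List<Integer> squares(ExecutorService pool, List<Integer> inputs) {
        CompletionService<Integer> completion = new ExecutorCompletionService<>(pool);
        for (Integer n : inputs) {
            completion.submit(() -> n * n); // returns immediately; the task runs on the pool
        }
        List<Integer> results = new ArrayList<>();
        try {
            for (int i = 0; i < inputs.size(); i++) {
                // take() blocks until the next task, whichever it is, has completed.
                results.add(completion.take().get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        System.out.println(squares(pool, List.of(1, 2, 3)));
        pool.shutdown();
    }
}
```

Each caller creates its own CompletionService on top of the shared pool, so results from method1() and method2() cannot get mixed up even when both run at the same time. The order of the collected results depends on which task finishes first.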
PS - this line in your example won't work:
ExecutorService threadPool = Runtime.getRuntime().availableProcessors();
I think you want this instead:
ExecutorService threadPool = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
Hope this helps.
To me, this will execute the tasks in the class concurrently. invokeAll() waits until all of its tasks have finished, but it blocks only the calling thread; while that thread is waiting, other threads can execute their tasks concurrently.
According to the Java documentation, invokeAll executes all of its tasks concurrently and independently of one another, and repeated calls to invokeAll do the same. That is, one call to invokeAll does not block the execution of tasks submitted by another.
Visit: http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/AbstractExecutorService.html

ThreadPoolExecutor's getActiveCount()

I have a ThreadPoolExecutor that seems to be lying to me when I call getActiveCount(). I haven't done a lot of multithreaded programming however, so perhaps I'm doing something incorrectly.
Here's my TPE
@Override
public void afterPropertiesSet() throws Exception {
BlockingQueue<Runnable> workQueue;
int maxQueueLength = threadPoolConfiguration.getMaximumQueueLength();
if (maxQueueLength == 0) {
workQueue = new LinkedBlockingQueue<Runnable>();
} else {
workQueue = new LinkedBlockingQueue<Runnable>(maxQueueLength);
}
pool = new ThreadPoolExecutor(
threadPoolConfiguration.getCorePoolSize(),
threadPoolConfiguration.getMaximumPoolSize(),
threadPoolConfiguration.getKeepAliveTime(),
TimeUnit.valueOf(threadPoolConfiguration.getTimeUnit()),
workQueue,
// Default thread factory creates normal-priority,
// non-daemon threads.
Executors.defaultThreadFactory(),
// Run any rejected task directly in the calling thread.
// In this way no records will be lost due to rejection
// however, no records will be added to the workQueue
// while the calling thread is processing a Task, so set
// your queue-size appropriately.
//
// This also means MaxThreadCount+1 tasks may run
// concurrently. If you REALLY want a max of MaxThreadCount
// threads don't use this.
new ThreadPoolExecutor.CallerRunsPolicy());
}
In this class I also have a DAO that I pass into my Runnable (FooWorker), like so:
@Override
public void addTask(FooRecord record) {
if (pool == null) {
throw new FooException(ERROR_THREAD_POOL_CONFIGURATION_NOT_SET);
}
pool.execute(new FooWorker(context, calculator, dao, record));
}
FooWorker runs record (the only non-singleton) through a state machine via calculator then sends the transitions to the database via dao, like so:
public void run() {
calculator.calculate(record);
dao.save(record);
}
Once my main thread is done creating new tasks I try and wait to make sure all threads finished successfully:
while (pool.getActiveCount() > 0) {
recordHandler.awaitTermination(terminationTimeout,
terminationTimeoutUnit);
}
What I'm seeing from output logs (which are presumably unreliable due to the threading) is that getActiveCount() is returning zero too early, and the while() loop is exiting while my last threads are still printing output from calculator.
Note I've also tried calling pool.shutdown() then using awaitTermination but then the next time my job runs the pool is still shut down.
My only guess is that inside a thread, when I send data into the dao (since it's a singleton created by Spring in the main thread...), java is considering the thread inactive since (I assume) it's processing in/waiting on the main thread.
Intuitively, based only on what I'm seeing, that's my guess. But... Is that really what's happening? Is there a way to "do it right" without putting a manual incremented variable at the top of run() and a decremented at the end to track the number of threads?
If the answer is "don't pass in the dao", then wouldn't I have to "new" a DAO for every thread? My process is already a (beautiful, efficient) beast, but that would really suck.
As the JavaDoc of getActiveCount states, it's an approximate value: you should not base any major business logic decisions on this.
If you want to wait for all scheduled tasks to complete, then you should simply use
pool.shutdown();
pool.awaitTermination(terminationTimeout, terminationTimeoutUnit);
If you need to wait for a specific task to finish, you should use submit() instead of execute() and then check the Future object for completion (either using isDone() if you want to do it non-blocking or by simply calling get() which blocks until the task is done).
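Since the pool in the question has to stay usable for the next job, the submit()/get() route avoids shutting it down. A sketch, with a counter standing in for calculator.calculate(record) and dao.save(record); FutureWaitSketch and runBatch are hypothetical names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class FutureWaitSketch {

    // Submits `count` tasks, waits for exactly those tasks, and leaves the pool running.
    static int runBatch(ExecutorService pool, int count) {
        AtomicInteger saved = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // Stand-in for the real per-record work.
            futures.add(pool.submit(() -> { saved.incrementAndGet(); }));
        }
        try {
            for (Future<?> future : futures) {
                future.get(); // blocks until this particular task is done
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        }
        return saved.get();
    }
}
```

Because the pool is never shut down, the same ExecutorService can be passed to runBatch again for the next job.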
The documentation suggests that the method getActiveCount() on ThreadPoolExecutor is not an exact number:
getActiveCount
public int getActiveCount()
Returns the approximate number of threads that are actively executing tasks.
Returns: the number of threads
Personally, when I am doing multithreaded work such as this, I use a variable that I increment as I add tasks, and decrement as I grab their output.

Waiting on threads

I have a method that contains the following (Java) code:
doSomeThings();
doSomeOtherThings();
doSomeThings() creates some threads, each of which will run for only a finite amount of time. The problem is that I don't want doSomeOtherThings() to be called until all the threads launched by doSomeThings() are finished. (Also doSomeThings() will call methods that may launch new threads and so on. I don't want to execute doSomeOtherThings() until all these threads have finished.)
This is because doSomeThings(), among other things will set myObject to null, while doSomeOtherThings() calls myObject.myMethod() and I do not want myObject to be null at that time.
Is there some standard way of doing this kind of thing (in Java)?
You may want to have a look at the java.util.concurrent package. In particular, you might consider using the CountDownLatch as in
package de.grimm.game.ui;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class Main {
public static void main(String[] args)
throws Exception {
final ExecutorService executor = Executors.newFixedThreadPool(5);
final CountDownLatch latch = new CountDownLatch(3);
for( int k = 0; k < 3; ++k ) {
executor.submit(new Runnable() {
public void run() {
// ... lengthy computation...
latch.countDown();
}
});
}
latch.await();
// ... reached only after all threads spawned have
// finished and acknowledged so by counting down the
// latch.
System.out.println("Done");
}
}
Obviously, this technique will only work, if you know the number of forked threads beforehand, since you need to initialize the latch with that number.
Another way would be to use condition variables, for example:
boolean done = false;
void functionRunInThreadA() {
synchronized( commonLock ) {
while( !done ) commonLock.wait();
}
// Here it is safe to set the variable to null
}
void functionRunInThreadB() {
// Do something...
synchronized( commonLock ) {
done = true;
commonLock.notifyAll();
}
}
You might need to add exception handling (InterruptedException) and some such.
Take a look at Thread.join() method.
I'm not clear on your exact implementation but it seems like doSomeThings() should wait on the child threads before returning.
Inside of doSomeThings() method, wait on the threads by calling Thread.join() method.
When you create a thread and call that thread's join() method, the calling thread waits until that thread object dies.
Example:
// Create an instance of my custom thread class
MyThread myThread = new MyThread();
// Tell the custom thread object to run
myThread.start();
// Wait for the custom thread object to finish
myThread.join();
What you are looking for is the ExecutorService, used together with futures :)
See http://java.sun.com/docs/books/tutorial/essential/concurrency/exinter.html
So basically collect the futures for all the runnables that you submit to the executor service. Loop all the futures and call the get() methods. These will return when the corresponding runnable is done.
Another useful, more robust synchronization barrier that offers functionality similar to a CountDownLatch is a CyclicBarrier. As with a CountDownLatch, you have to know how many parties (threads) are involved, but it allows you to reuse the barrier as opposed to creating a new instance of a CountDownLatch every time.
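A small sketch of the reuse property, assuming the coordinating thread registers itself as one extra party (BarrierSketch and its counter are illustrative names only). The same CyclicBarrier instance serves every round:

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class BarrierSketch {

    // Runs `rounds` rounds of `workers` threads; returns the units of work counted.
    static int run(int workers, int rounds) {
        AtomicInteger done = new AtomicInteger();
        // The extra party is the coordinating thread that calls this method.
        CyclicBarrier barrier = new CyclicBarrier(workers + 1);
        try {
            for (int round = 0; round < rounds; round++) {
                for (int w = 0; w < workers; w++) {
                    new Thread(() -> {
                        done.incrementAndGet(); // stand-in for the real work
                        awaitQuietly(barrier);
                    }).start();
                }
                // Returns once every worker of this round has reached the barrier;
                // the barrier then resets itself for the next round.
                barrier.await();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (BrokenBarrierException e) {
            throw new IllegalStateException("barrier was broken", e);
        }
        return done.get();
    }

    private static void awaitQuietly(CyclicBarrier barrier) {
        try {
            barrier.await();
        } catch (InterruptedException | BrokenBarrierException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Note that passing the barrier means a worker has reached its await() call, not that its thread has terminated.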
I do like momania's suggestion of using an ExecutorService, collecting the futures and invoking get on all of them until they complete.
Another option is to sleep your main thread, and have it check every so often if the other threads have finished. However, I like Dirk's and Marcus Adams' answers better - just throwing this out here for completeness sake.
It depends on what exactly you are trying to do here. Is your main concern the ability to dynamically determine the various threads that get spawned by the successive methods called from within doSomeThings(), and then to wait until they finish before calling doSomeOtherThings()? Or is it possible to know at compile time which threads are spawned? In the latter case there are a number of solutions, but they all basically involve calling the Thread.join() method on all these threads from wherever they are created.
If it is indeed the former, then you are better off using ThreadGroup and its enumerate() method. This gives you an array of all the threads spawned by doSomeThings(), provided you have properly added the new threads to the ThreadGroup. You can then loop through the thread references in that array and, from the main thread, call join() on each one just before you call doSomeOtherThings().
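A sketch of the ThreadGroup approach, under the assumption that every thread is created inside one known group (the group name and the class ThreadGroupSketch are made up). enumerate() fills an array you supply, and activeCount() is only an estimate, so the array is sized with some slack:

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadGroupSketch {

    // Starts `count` threads in one ThreadGroup, joins them all, returns the work counted.
    static int runAndJoin(int count) {
        ThreadGroup group = new ThreadGroup("doSomeThings");
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < count; i++) {
            // Threads created by these threads would belong to the same group by default.
            new Thread(group, () -> finished.incrementAndGet(), "worker-" + i).start();
        }
        // activeCount() is an estimate, so leave some slack in the array.
        Thread[] threads = new Thread[group.activeCount() + 10];
        int found = group.enumerate(threads); // copies the group's live threads into the array
        try {
            for (int i = 0; i < found; i++) {
                threads[i].join(); // wait for each thread that was still running
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return finished.get();
    }
}
```

If the workers can themselves spawn further threads after the enumerate() call, a single pass is not enough; repeating the enumerate-and-join step until no live threads remain in the group is one way to handle that.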
