Executing two tasks consecutively - java

There's a thread pool with a single thread that is used to perform tasks submitted by multiple threads. Each task actually consists of two parts: perform, which produces a meaningful result, and cleanup, which takes quite some time but returns no meaningful result. At the moment the (obviously incorrect) implementation looks something like this. Is there an elegant way to ensure that another perform task will be executed only after the previous cleanup task has finished?
import java.util.concurrent.*;

public class Main {
    private static class Worker {
        int perform() {
            return 1;
        }
        void cleanup() {
        }
    }

    private static void perform() throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Worker w = new Worker();
        Future f = pool.submit(() -> w.perform());
        pool.submit(w::cleanup);
        int x = (int) f.get();
        System.out.println(x);
    }
}

Is there an elegant way to ensure that another perform task will be executed only after the previous cleanup task has finished?
The most obvious thing to do is to call cleanup() from perform() but I assume there is a reason why you aren't doing that.
You say that your solution is currently "obviously incorrect". Why? Because of race conditions? Then you could add a synchronized block:
synchronized (pool) {
    Future f = pool.submit(() -> w.perform());
    pool.submit(w::cleanup);
}
That would ensure that the cleanup() would come immediately after a perform(). If you are worried about the performance hit with the synchronized, don't be.
Another solution might be to use the ExecutorCompletionService class although I'm not sure how that would help with one thread. I've used it before when I had cleanup tasks running in another thread pool.
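Another way to keep your current structure, sketched below with the names from the question: submit perform() and cleanup() as one task, but complete a CompletableFuture as soon as perform() is done, so the caller gets the result without waiting for cleanup(), while the single thread still guarantees the next perform() starts only after this cleanup() has finished.
CompletableFuture<Integer> result = new CompletableFuture<>();
pool.submit(() -> {
    try {
        // complete as soon as perform() finishes so the caller can read the value
        result.complete(w.perform());
    } catch (Throwable t) {
        result.completeExceptionally(t);
    }
    // cleanup() runs on the same single thread before the next submitted task starts
    w.cleanup();
});
int x = result.join(); // available right after perform(), not after cleanup()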

If you are using Java 8, you can do this with CompletableFuture:
CompletableFuture.supplyAsync(() -> w.perform(), pool)
                 .thenApplyAsync(result -> { w.cleanup(); return result; }, pool)
                 .join();
Note that thenApplyAsync needs a Function, so the cleanup stage passes the result of perform() through; join() then returns that result, but only after cleanup() has also completed.

Related

How to distribute tasks between threads in ThreadPoolExecutor

I have the following problem:
I have a queue of tasks and there are a lot of types of tasks like:
A, B, C, D, ...
I execute these tasks in thread pool.
But I have to restrict tasks of the same type from executing at the same time, so this is bad:
Thread-1: [A, D, C, B, ...]
Thread-2: [A, C, D, B, ...]
Tasks of type A and type B could be executed at the same time, though.
But this is good:
Thread-1: [A,B,A,B,...]
Thread-2: [C,D,D,C,...]
Hence tasks of the same type are always executed sequentially.
What is the easiest way to implement this functionality?
This problem can easily be solved with an actor framework like Akka.
For each type of task, create an actor.
For each separate task, create a message and send it to the actor of corresponding type. Messages can be of type Runnable, as they probably are now, and the actor's reaction method can be
@Override
public void onReceive(Object msg) {
    ((Runnable) msg).run();
}
This way your program will run correctly for any number of threads.
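If you'd rather not pull in an actor framework, a rough equivalent of one actor per type can be sketched with plain java.util.concurrent (the class and method names here are just illustrative):
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class TypedExecutor {
    private final Map<String, ExecutorService> perType = new ConcurrentHashMap<>();

    // Tasks with the same type go to the same single-threaded executor and run
    // sequentially; tasks with different types run in parallel.
    void submit(String type, Runnable task) {
        perType.computeIfAbsent(type, t -> Executors.newSingleThreadExecutor())
               .execute(task);
    }
}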
I think you can implement your own DistributeThreadPool to control the threads. It's like a kind of topic subscriber/publisher structure.
I wrote an example, as follows (a Worker class carrying a type key and a task is assumed, and is included below so the example compiles):
import java.util.*;
import java.util.concurrent.LinkedBlockingDeque;

class DistributeThreadPool {

    Map<String, TypeThread> TypeCenter = new HashMap<String, TypeThread>();

    public void execute(Worker command) {
        TypeCenter.get(command.type).accept(command);
    }

    // Worker is assumed to carry a type key plus the actual task to run
    static class Worker implements Runnable {
        final String type;
        final Runnable task;

        Worker(String type, Runnable task) {
            this.type = type;
            this.task = task;
        }

        @Override
        public void run() {
            task.run();
        }
    }

    class TypeThread implements Runnable {
        Thread t = null;
        LinkedBlockingDeque<Runnable> lbq = null;

        public TypeThread() {
            lbq = new LinkedBlockingDeque<Runnable>();
        }

        public void accept(Runnable inRun) {
            lbq.add(inRun);
        }

        public void start() {
            t = new Thread(this);
            t.start();
        }

        @Override
        public void run() {
            while (!Thread.interrupted()) {
                try {
                    lbq.take().run();
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }
    }

    public DistributeThreadPool(String[] Types) {
        for (String t : Types) {
            TypeThread thread = new TypeThread();
            TypeCenter.put(t, thread);
            thread.start();
        }
    }

    public static void main(String[] args) {
        DistributeThreadPool dtp = new DistributeThreadPool(new String[] {"AB", "CD"});
        Worker w1 = new Worker("AB", () -> System.out.println(Thread.currentThread().getName() + "AB"));
        Worker w2 = new Worker("AB", () -> System.out.println(Thread.currentThread().getName() + "AB"));
        Worker w3 = new Worker("CD", () -> System.out.println(Thread.currentThread().getName() + "CD"));
        Worker w4 = new Worker("CD", () -> System.out.println(Thread.currentThread().getName() + "CD"));
        Worker w5 = new Worker("CD", () -> System.out.println(Thread.currentThread().getName() + "CD"));
        List<Worker> workers = new ArrayList<Worker>();
        workers.add(w1);
        workers.add(w2);
        workers.add(w3);
        workers.add(w4);
        workers.add(w5);
        workers.forEach(e -> dtp.execute(e));
    }
}
CompletableFuture.supplyAsync(this::doTaskA)
.thenAccept(this::useResultFromTaskAinTaskB);
What's happening above is that Task A and the related Task B are actually run in the same thread (one after the other, no need to "get" a new thread to start running Task B).
Or you can use runAsync for Task A if you don't need any information from it, but do need to wait for it to complete before running Task B.
By default, CompletableFutures will use the common ForkJoinPool, but if you want more control over which thread pool gets used, you can pass a second argument to the async methods: your own Executor backed by your own thread pool.
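For example, a sketch of the same chain on a custom pool (doTaskA and useResultFromTaskAinTaskB are the methods from the snippet above; myPool is just an illustrative name):
ExecutorService myPool = Executors.newFixedThreadPool(2);
CompletableFuture.supplyAsync(this::doTaskA, myPool)
                 .thenAcceptAsync(this::useResultFromTaskAinTaskB, myPool);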
Interesting problem. Two questions come to mind:
How many different types of tasks are there?
If there are relatively few, the simplest way may be to create one thread for each type and assign each incoming task to its kind of thread. As long as tasks are balanced between types (and that's a big assumption) utilization will be good enough.
What's the expected timeliness/latency for task completion?
If your problem is flexible on timeliness, you could batch incoming tasks of each kind by count or time interval, submit each batch to the pool, then await completion of that batch before submitting another of the same kind.
You can adapt the second alternative to batch sizes as small as one, in which case the mechanics of awaiting completion become important for efficiency. CompletableFuture would fit the bill here; you could chain the "poll next task of type A and submit to pool" action to the task with thenRunAsync, and fire and forget the task.
You would have to maintain one external task queue per task type; the work queues of the FJ pool would be for in-progress tasks only. Still, this design has a good chance of dealing reasonably with imbalance in task count and workload per type.
Hope this helps.
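As a rough sketch of that last alternative with a batch size of one (the queue and method names are illustrative, and coordinating the very first submission per type still needs care):
// one external FIFO queue per task type; only its head task is ever in the pool
Queue<Runnable> typeAQueue = new ConcurrentLinkedQueue<>();
ExecutorService pool = ForkJoinPool.commonPool();

void submitNextOfTypeA() {
    Runnable next = typeAQueue.poll();
    if (next == null) return;
    // when the current type-A task finishes, poll and submit the next one
    CompletableFuture.runAsync(next, pool)
                     .thenRunAsync(this::submitNextOfTypeA, pool);
}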
Implement a key-ordered executor. Each task should have a key: tasks with the same key are queued and executed sequentially, while tasks with different keys are executed in parallel.
There is an implementation in Netty.
You can try to write it yourself, but it is tricky and error prone; I can see a few bugs in the answer suggested there.
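The core idea can be sketched in a few lines (this is not Netty's implementation, just the general shape): hash each key onto one of N single-threaded lanes, so the same key always lands on the same thread.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class KeyOrderedExecutor {
    private final ExecutorService[] lanes;

    KeyOrderedExecutor(int parallelism) {
        lanes = new ExecutorService[parallelism];
        for (int i = 0; i < parallelism; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Same key -> same lane -> sequential; different keys may run in parallel.
    void execute(Object key, Runnable task) {
        lanes[Math.floorMod(key.hashCode(), lanes.length)].execute(task);
    }
}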

How to shutdown many instances of the ExecutorService Runnables?

I am attempting to understand how to handle many instances of the ExecutorService executing Runnable commands. With regard to the code provided, how many shutdowns are required if I execute a hundred Runnables with the fixed thread pool set to one? I think the code should execute the hundred tasks sequentially, in for-loop order, on a single thread (it never spawns more than one thread), and that it requires a single ExecutorService shutdown. Is this correct? Also, is it OK to call shutdown right after the for loop completes, because all hundred tasks are already queued, so the ExecutorService will shut down automatically after they all complete? Just looking for some clarification, thanks.
public static void main(String[] args)
{
    ExecutorService executorService = Executors.newFixedThreadPool(1);
    for (int i = 0; i < 100; i++)
    {
        executorService.execute(new Runnable() {
            @Override
            public void run()
            {
                // do stuff
            }
        });
    }
    executorService.shutdown();
}
Looks like you've got the right idea. It doesn't matter how many Runnables you've handed over to the ExecutorService to run or how big a thread pool you've allocated; you only need to call shutdown() once. That will allow all tasks to complete but will not allow you to add any new ones. You may want to call
try {
    executorService.awaitTermination(5, TimeUnit.MINUTES);
} catch (InterruptedException e) {
    // do stuff
}
to block while all tasks are completed depending on your usage scenario.
If you want to shutdown and attempt to kill all running tasks, instead call the shutdownNow() method. Note that there is no guarantee that it will be able to interrupt running tasks.
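A common way to combine the two, adapted from the ExecutorService javadoc and shown here only as a sketch:
executorService.shutdown();                            // stop accepting new tasks
try {
    if (!executorService.awaitTermination(60, TimeUnit.SECONDS)) {
        executorService.shutdownNow();                 // try to interrupt running tasks
        if (!executorService.awaitTermination(60, TimeUnit.SECONDS)) {
            System.err.println("Pool did not terminate");
        }
    }
} catch (InterruptedException e) {
    executorService.shutdownNow();
    Thread.currentThread().interrupt();                // preserve the interrupt status
}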

How to test IO issues in Java?

How can I test behavior of my application code for the case of very bad IO performance without using mock streams that sleep (because they would react to interrupts)?
For instance, I want to test a ConcurrentWrapper utility that has a pool of threads for file IO. It submits each operation to an ExecutorService with invokeAll() with a timeout. I want to confirm not only that the call via ConcurrentWrapper exits before the timeout, but also that it somehow made the thread of its inner ExecutorService terminate (to avoid leakage).
I need to somehow simulate slow IO in the inner thread, but in a way that will ignore interrupts (like real IO does).
A bit of clarification: No answer like "sleep and swallow InterruptedException" or "sleep, catch InterruptedException and go back to sleep" is acceptable. I want to test how my code handles interrupts and such instrumentation would defeat the purpose by handling them itself.
You can sleep in a way that will insist on sleeping through interrupts:
long start = System.currentTimeMillis();
long end = start + sleepTime;
for (long now = start; now < end; now = System.currentTimeMillis()) {
    try {
        Thread.sleep(end - now);
    } catch (InterruptedException ignored) {
    }
}
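You can then hide that loop behind whatever IO abstraction your code consumes. For instance, a hypothetical InputStream that is slow and ignores interrupts (sleepUninterruptibly is just the loop above extracted into a helper method):
import java.io.InputStream;

class SlowInputStream extends InputStream {
    private final long delayMillis;

    SlowInputStream(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    @Override
    public int read() {
        sleepUninterruptibly(delayMillis); // blocks and ignores interrupts, like real slow IO
        return -1;                         // then reports end of stream
    }
}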
For testing with timeouts, you can put a maximum time on the test itself; in JUnit you can use the timeout attribute of the @Test annotation:
@Test(timeout = 100)
public void method_withTimeout() {
    while (true);
}
For testing that the method exits, you could use the Future interface, which provides a timeout when getting the result.
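For instance, inside a test method that declares throws Exception, you could bound the wait on the Future and fail if it takes too long (the names here are illustrative):
Future<Integer> f = pool.submit(slowTask);
try {
    Integer result = f.get(2, TimeUnit.SECONDS); // waits at most 2 seconds
} catch (TimeoutException e) {
    f.cancel(true);                              // interrupt the stuck task
    fail("operation did not complete in time");
}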
If I understand your question correctly, ReentrantLock might help.
final ReentrantLock lock = new ReentrantLock();

Callable<Void> c = new Callable<Void>() {
    public Void call() {
        lock.lock();
        try {
            if (Thread.currentThread().isInterrupted()) {
                ...
            }
        }
        finally {
            lock.unlock();
        }
        return null;
    }
};

// Hold the lock before submitting, so the callable blocks inside its lock.lock()
lock.lock();

// Submit to the pool
Future<Void> future = executorService.submit(c);

// you might want to sleep a bit to give the pool a chance
// to pull off the queue.

// Issue a cancel
future.cancel(true);

// Now release the lock, which should let your
// callable continue on to the interrupted check.
lock.unlock();
Note that the "lock" method does not throw any InterruptedException (though there is a method for that called "lockInterruptibly"), and if you look at the code for that class, it's not catching and swallowing (as you've stated would not be what you want).

Java Concurrency in Practice: race condition in BoundedExecutor?

There's something odd about the implementation of the BoundedExecutor in the book Java Concurrency in Practice.
It's supposed to throttle task submission to the Executor by blocking the submitting thread when there are enough threads either queued or running in the Executor.
This is the implementation (after adding the missing rethrow in the catch clause):
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;

public class BoundedExecutor {
    private final Executor exec;
    private final Semaphore semaphore;

    public BoundedExecutor(Executor exec, int bound) {
        this.exec = exec;
        this.semaphore = new Semaphore(bound);
    }

    public void submitTask(final Runnable command) throws InterruptedException, RejectedExecutionException {
        semaphore.acquire();
        try {
            exec.execute(new Runnable() {
                @Override public void run() {
                    try {
                        command.run();
                    } finally {
                        semaphore.release();
                    }
                }
            });
        } catch (RejectedExecutionException e) {
            semaphore.release();
            throw e;
        }
    }
}
When I instantiate the BoundedExecutor with an Executors.newCachedThreadPool() and a bound of 4, I would expect the number of threads instantiated by the cached thread pool to never exceed 4. In practice, however, it does. I've gotten this little test program to create as many as 11 threads:
public static void main(String[] args) throws Exception {
    class CountingThreadFactory implements ThreadFactory {
        int count;
        @Override public Thread newThread(Runnable r) {
            ++count;
            return new Thread(r);
        }
    }
    List<Integer> counts = new ArrayList<Integer>();
    for (int n = 0; n < 100; ++n) {
        CountingThreadFactory countingThreadFactory = new CountingThreadFactory();
        ExecutorService exec = Executors.newCachedThreadPool(countingThreadFactory);
        try {
            BoundedExecutor be = new BoundedExecutor(exec, 4);
            for (int i = 0; i < 20000; ++i) {
                be.submitTask(new Runnable() {
                    @Override public void run() {}
                });
            }
        } finally {
            exec.shutdown();
        }
        counts.add(countingThreadFactory.count);
    }
    System.out.println(Collections.max(counts));
}
I think there's a tiny time frame between the release of the semaphore and the task ending, where another thread can acquire a permit and submit a task while the releasing thread hasn't finished yet. In other words, it has a race condition.
Can someone confirm this?
BoundedExecutor was indeed intended as an illustration of how to throttle task submission, not as a way to place a bound on thread pool size. There are more direct ways to achieve the latter, as at least one comment pointed out.
But the other answers don't mention the text in the book that says to use an unbounded queue and to
set the bound on the semaphore to be equal to the pool size plus the
number of queued tasks you want to allow, since the semaphore is
bounding the number of tasks both currently executing and awaiting
execution. [JCiP, end of section 8.3.3]
By mentioning unbounded queues and pool size, we were implying (apparently not very clearly) the use of a thread pool of bounded size.
What has always bothered me about BoundedExecutor, however, is that it doesn't implement the ExecutorService interface. A modern way to achieve similar functionality and still implement the standard interfaces would be to use Guava's listeningDecorator method and ForwardingListeningExecutorService class.
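As an illustration of a more direct way to bound the work while keeping the standard interfaces (this is an alternative technique, not the book's code): give the pool a fixed size and a bounded queue with a CallerRunsPolicy, and, if you want listenable futures, wrap it with Guava's listeningDecorator:
ExecutorService bounded = new ThreadPoolExecutor(
        4, 4,                                       // fixed pool of 4 threads
        0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<Runnable>(100),      // at most 100 queued tasks
        new ThreadPoolExecutor.CallerRunsPolicy()); // when full, the submitter runs the task itself
ListeningExecutorService service = MoreExecutors.listeningDecorator(bounded);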
You are correct in your analysis of the race condition. There are no synchronization guarantees between the ExecutorService and the Semaphore.
However, I do not know if throttling the number of threads is what the BoundedExecutor is used for. I think it is more for throttling the number of tasks submitted to the service. Imagine you have 5 million tasks that need to be submitted; if you submit more than 10,000 of them at once, you run out of memory.
Well, you will only ever have 4 threads running at any given time, so why would you want to queue up all 5 million tasks? You can use a construct similar to this to throttle the number of tasks queued up at any given time. What you should get out of this is that at any given time only 4 tasks are running.
Obviously the resolution to this is to use Executors.newFixedThreadPool(4).
I see as many as 9 threads created at once. I suspect there is a race condition which causes more threads to be created than required.
This could be because there is work to be done before and after running the task. This means that even though there are only 4 threads inside your block of code, there are a number of threads stopping a previous task or getting ready to start a new one.
i.e. the thread does a release() while it is still running. Even though it's the last thing you do, it's not the last thing the thread does before acquiring a new task.

How can I make sure a threadpool is finished?

The setup:
I am in the process of changing the way a program works under the hood. The current version works like this:
public void threadWork( List<MyCallable> workQueue )
{
    ExecutorService pool = Executors.newFixedThreadPool(someConst);
    List<Future<myOutput>> returnValues = new ArrayList<Future<myOutput>>();
    List<myOutput> finishedStuff = new ArrayList<myOutput>();
    for( int i = 0; i < workQueue.size(); i++ )
    {
        returnValues.add( pool.submit( workQueue.get(i) ) );
    }
    while( !returnValues.isEmpty() )
    {
        try
        {
            // Future.get() waits for a value from the callable
            finishedStuff.add( returnValues.remove(0).get() );
        }
        catch(Throwable iknowthisisbaditisjustanexample){}
    }
    doLotsOfThings(finishedStuff);
}
But the new system is going to use a private inner Runnable to call a synchronized method that writes the data into a global variable. My basic setup is:
public void threadReports( List<String> workQueue )
{
    ExecutorService pool = Executors.newFixedThreadPool(someConst);
    List<MyRunnable> runnables = new ArrayList<MyRunnable>();
    for ( int i = 0; i < workQueue.size(); i++ )
    {
        runnables.add( new MyRunnable( workQueue.get(i) ) );
        pool.submit( runnables.get(i) );
    }
    while( !runnables.isEmpty() )
    {
        try
        {
            runnables.remove(0).wait(); // I realized that this wouldn't work
        }
        catch(Throwable iknowthisisbaditisjustanexample){}
    }
    doLotsOfThings(finishedStuff); // finishedStuff is the global the Runnables write to
}
If you read my comment in the try of the second piece of code you will notice that I don't know how to use wait(). I had thought it was basically like thread.join() but after reading the documentation I see it is not.
I'm okay with changing some structure as needed, but the basic system of taking work, using runnables, having the runnables write to a global variable, and using a threadpool are requirements.
The Question
How can I wait for the threadpool to be completely finished before I doLotsOfThings()?
You should call ExecutorService.shutdown() and then ExecutorService.awaitTermination.
...
pool.shutdown();
if (pool.awaitTermination(<long>, <TimeUnit>)) {
    // finished before timeout
    doLotsOfThings(finishedStuff);
} else {
    // Timeout occurred.
}
Try this:
pool.shutdown();
pool.awaitTermination(WHATEVER_TIMEOUT, TimeUnit.SECONDS);
Have you considered using the Fork/Join framework that is now included in Java 7? If you do not want to use Java 7 yet, you can get the jar for it here.
public void threadReports( List<String> workQueue )
{
    ExecutorService pool = Executors.newFixedThreadPool(someConst);
    Set<Future<?>> futures = new HashSet<Future<?>>();
    for ( int i = 0; i < workQueue.size(); i++ )
    {
        futures.add(pool.submit(new MyRunnable(workQueue.get(i))));
    }
    while( !futures.isEmpty() )
    {
        Set<Future<?>> removed = new HashSet<Future<?>>();
        for (Future<?> f : futures) {
            try {
                f.get(100, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // not done yet, check again on the next pass
            } catch (InterruptedException | ExecutionException e) {
                // handle as appropriate for your application
            }
            if (f.isDone()) removed.add(f);
        }
        futures.removeAll(removed);
    }
    doLotsOfThings(finishedStuff); // finishedStuff is the global the Runnables write to
}
shutdown is a lifecycle method of the ExecutorService and renders the executor unusable after the call. Creating and destroying thread pools in a method is as bad as creating/destroying threads: it pretty much defeats the purpose of using a thread pool, which is to reduce the overhead of thread creation by enabling transparent reuse.
If possible, you should keep your ExecutorService lifecycle in sync with your application: create it when first needed, shut it down when your app is closing down.
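For example, a minimal sketch of an application-scoped executor (the class name and pool size are illustrative; tie the shutdown to your real application lifecycle if you have one, rather than a JVM shutdown hook):
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class AppExecutors {
    private static final ExecutorService POOL = Executors.newFixedThreadPool(8);

    static {
        // shut the pool down when the JVM exits
        Runtime.getRuntime().addShutdownHook(new Thread(POOL::shutdown));
    }

    private AppExecutors() {}

    public static ExecutorService pool() {
        return POOL;
    }
}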
To achieve your goal of executing a bunch of tasks and waiting for them, the ExecutorService provides the method invokeAll(Collection<? extends Callable<T>> tasks) (and the version with timeout if you want to wait a specific period of time.)
Using this method and some of the points mentioned above, the code in question becomes:
// Note: invokeAll(...) takes Callables, so MyRunnable is assumed here to implement
// Callable<myOutput> rather than Runnable.
public void threadReports( List<String> workQueue ) throws InterruptedException {
    List<MyRunnable> runnables = new ArrayList<MyRunnable>(workQueue.size());
    for (String work : workQueue) {
        runnables.add(new MyRunnable(work));
    }
    // Executor is obtained from some applicationContext that takes care of lifecycle mgnt
    // invokeAll(...) will block and return when all callables are executed
    List<Future<myOutput>> results = applicationContext.getExecutor().invokeAll(runnables);
    // I wouldn't use a global variable unless you have a VERY GOOD reason for that,
    // b/c all the threads of the pool doing work will be contending for the lock on that variable.
    // doLotsOfThings(finishedStuff);
    // Note that the List of Futures holds the individual results of each execution.
    // That said, the preferred way to harvest your results would be:
    doLotsOfThings(results);
}
PS: Not sure why threadReports is void. It could/should return the calculation of doLotsOfThings to achieve a more functional design.
