Interrupt Runnable that takes hours - java

I have a ThreadPoolExecutor:
ThreadPoolExecutor service = new ThreadPoolExecutor(N_THREADS, N_THREADS, 0L, TimeUnit.MILLISECONDS, blockingQueue, rejectedExecutionHandler);
The service executes threads implementing the Runnable interface. Each thread processes a file on disk. I found that after several hours, two threads (or cores depending on what htop shows in Linux) were running and had been running for 13 hours. What's even worse is that the remaining cores showed no activity as if they were waiting for the two threads to complete.
Questions:
1 - I have read a lot on how this problem may be resolved but nothing conclusive. As far as I can work out, you CANNOT stop a Runnable using the ThreadPoolExecutor because it is an independent thread that just runs. Using the Future framework:
Future<?> f = service.submit(task);
f.get(XX, TimeUnit.SECONDS);
allows you to set a timeout and fetch the future result, but get blocks, effectively making the implementation serial. Is it possible to interrupt a Runnable after a given time using the ThreadPoolExecutor and get the thread back to the pool so it can pick up a new task and carry on?
2 - My big concern is why, using htop, I see two threads/cores running and no other cores/threads running even though many tasks are still waiting to execute (i.e. there are many files left to process). Any insight?

You could create a second scheduled thread pool to which you would submit a cancellation task for each of the returned Futures. After a given timeout, each of these tasks would check whether its associated Future is done and, if not, cancel it. Cancellation triggers thread interruption, so you might need to support it in your tasks by checking the interrupted flag: Thread.interrupted().
The size of this second thread pool could be minimal, i.e. 1 as this job takes minimum of CPU time.
Code example:
ScheduledExecutorService service = Executors.newScheduledThreadPool(1);
...
while (...) {
    final Future<?> f = pool.submit(...);
    service.schedule(new Runnable() {
        @Override
        public void run() {
            // cancel the task if it is still running after the timeout
            if (!f.isDone()) {
                f.cancel(true);
            }
        }
    }, 1, TimeUnit.MINUTES);
}
// stop accepting new cancellation tasks, then wait for the pending ones
service.shutdown();
service.awaitTermination(1, TimeUnit.MINUTES);

You can tell a thread that you wish to interrupt:
An interrupt is an indication to a thread that it should stop what it is doing and do something else.
You can interrupt your thread with Future.cancel(true). It is your responsibility to implement the Runnable so that it honours that request by checking its Thread.interrupted() state.
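A minimal sketch of what "obeying the interrupt" can look like, assuming Java 8+ and a task whose work can be split into small units. The class and method names (InterruptibleTaskDemo, cancelLongTask) are made up for illustration, and the empty loop body stands in for the real file processing:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class InterruptibleTaskDemo {

    /** Starts an endless task, cancels it, and reports whether the task noticed the interrupt. */
    static boolean cancelLongTask() throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch stopped = new CountDownLatch(1);

        Future<?> future = pool.submit(() -> {
            started.countDown();
            // Poll the interrupt flag between units of work.
            while (!Thread.currentThread().isInterrupted()) {
                // ... process the next chunk of the file here ...
            }
            stopped.countDown(); // reached only after an interrupt was observed
        });

        started.await();     // make sure the task is actually running
        future.cancel(true); // true = interrupt the worker thread

        boolean sawInterrupt = stopped.await(5, TimeUnit.SECONDS);
        pool.shutdown();
        return sawInterrupt && future.isCancelled();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("task stopped after cancel: " + cancelLongTask());
    }
}
```

If the task never checks the flag (and never calls a blocking method that throws InterruptedException), cancel(true) marks the Future as cancelled but the worker thread keeps running the old task.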

To see details about the threads of a process, run:
ps -eLf | grep <PROCESS_PID>
This is useful because htop shows the list of running processes, each of which has at least one thread.

Related

Skip to next task in a single threaded ExecutorService?

I am considering an implementation of an ExecutorService to run a series of tasks. I plan to use the internal queue to have a few tasks waiting for their turn to run. Is there some way to interrupt the task (the Runnable) that is currently running in an ExecutorService thread, and keep the thread alive to run the next task? Or is it only possible to call .shutdown() and then create a new ExecutorService?
I have found this and wanted to know if there are any other solutions.
Instead of interfering with the threads you may want to have a Task class (that extends or wraps the Runnable) which implements an interrupt mechanism (e.g. a boolean flag).
When you execute your task you need to check this flag periodically and, if it is set, the task should stop what it is doing. You might want to return a specific result at this point that tells your code that the task was cancelled successfully.
If a user now decides that they no longer require the results from this task, you will have to set this flag. However, the task might have already completed by then, so you still need to deal with the case where the result already exists but the user no longer cares about it.
An interrupt on a thread level does not guarantee that the thread stops working. This will only work if the thread is in a state where it can receive an interrupt.
Also, you should not interfere with the threads of the ExecutorService directly, as you might unintentionally stop a different task or stop the ExecutorService from working properly.
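A minimal sketch of such a flag-based task. CancellableTask, its step loop and its accessor methods are hypothetical names chosen for illustration, not part of any library:

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical wrapper task: the name and the step loop are illustrative only. */
class CancellableTask implements Runnable {
    private final AtomicBoolean cancelRequested = new AtomicBoolean(false);
    private final int totalSteps;
    private volatile int completedSteps = 0;

    CancellableTask(int totalSteps) {
        this.totalSteps = totalSteps;
    }

    /** Asks the task to stop at its next check; safe to call from any thread. */
    void cancel() {
        cancelRequested.set(true);
    }

    /** True if cancel() was called, even if the task had already finished by then. */
    boolean cancelRequested() {
        return cancelRequested.get();
    }

    int completedSteps() {
        return completedSteps;
    }

    @Override
    public void run() {
        for (int step = 0; step < totalSteps; step++) {
            if (cancelRequested.get()) {
                return; // stop early; the pool thread is free for the next task
            }
            // ... one unit of real work would go here ...
            completedSteps = step + 1;
        }
    }
}
```

Because the task simply returns, the executor's thread stays alive and picks up the next queued task. The caller still has to compare completedSteps() with the expected total to tell a cancelled run from a finished one.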
Why would you want to kill that task and continue with the next one? If it is a question of time, you can have tasks that run longer than a limit you specify cancelled automatically by the method that executes them. E.g.:
ExecutorService executor = Executors.newSingleThreadExecutor();
executor.invokeAll(Arrays.asList(new Task()), 60, TimeUnit.SECONDS); // Timeout of 60 seconds; Task must implement Callable.
executor.shutdown();
If any of the tasks takes longer than 60 seconds, it is cancelled; calling get() on its Future then throws a CancellationException that you must catch.
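A hedged sketch of the timeout variant of invokeAll, assuming Java 8+. The class name, the 200 ms timeout and the sleeping task are placeholders for illustration. Note that invokeAll itself returns normally after the timeout; the CancellationException surfaces when get() is called on the Future of a task that was cancelled:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class InvokeAllTimeoutDemo {

    /** Runs one slow task with a short timeout and reports how it ended. */
    static String runWithTimeout() throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Callable<String> slow = () -> {
            Thread.sleep(60_000); // sleep() responds to interruption
            return "finished";
        };
        try {
            // Tasks still running when the timeout expires are cancelled.
            List<Future<String>> futures =
                    executor.invokeAll(Arrays.asList(slow), 200, TimeUnit.MILLISECONDS);
            try {
                return futures.get(0).get();
            } catch (CancellationException e) {
                return "cancelled";
            } catch (ExecutionException e) {
                return "failed: " + e.getCause();
            }
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runWithTimeout());
    }
}
```

The same caveat as with Future.cancel(true) applies: a task that ignores interruption is reported as cancelled but keeps occupying its thread.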

How is an executor terminated in my Java program?

This question is related to my previous question : Why the speed of a Java process inside multiple loops slows down as it goes?
While trying to track down that problem, I looked closely at my code and found that some executors in my app are not terminated. I'm still learning how to use executors; I copied some sample code from online into my app, and I'm not sure I'm using it correctly.
What's the difference between the following 2 approaches to using executors?
[1]
Executor executor=Executors.newFixedThreadPool(30);
CountDownLatch doneSignal=new CountDownLatch(280);
for (int N=0;N<280;N++)
{
...
executor.execute(new SampleCountRunner(doneSignal,...));
}
try { doneSignal.await(); }
catch (Exception e) { e.printStackTrace(); }
[2]
ExecutorService executor=Executors.newFixedThreadPool(30);
for (int i=0;i<60;i++)
{
...
executor.execute(new xyzRunner(...));
}
executor.shutdown();
while (!executor.isTerminated()) { }
It seems to me after the 1st one is done, the executor still has an active pool of threads running and waiting for more tasks, they DO consume cpu time and memory.
The 2nd one will terminate all active threads in the pool after the shutdown() method is run, and all previously active threads won't take any more cpu time or memory after that point.
So my questions are :
[1] Am I correct ?
[2] How to terminate the pool of threads in the 1st case ? There is no "executor.shutdown()" for Executor
Edit :
Problem solved, I changed Executor in [1] to ExecutorService, and added :
executor.shutdown();
while (!executor.isTerminated()) { }
Now when my program ends, it won't have a lot of threads active any more.
It seems to me after the 1st one is done, the executor still has an active pool of threads running and waiting for more tasks, they DO consume cpu time and memory.
Not exactly. In the first approach, after the tasks are all done (as signalled by the latch), the executor is definitely NOT shut down, but the threads in the executor do NOT consume CPU (they do consume the minimum memory needed for their structures).
In this approach - you are explicitly in control of knowing when and how your tasks are completed. You can know if the tasks have succeeded or failed , and can decide to resubmit the tasks if needed.
The 2nd one will terminate all active threads in the pool after the shutdown() method is run, and all previously active threads won't take any more cpu time or memory after that point.
Again, not exactly. In this approach, the ExecutorService does not shut down immediately after the call to shutdown(). It waits for the already submitted tasks to complete, but here you do not directly know whether these tasks completed successfully or failed (by throwing some Exception).
And until the already submitted tasks are completed, your while (!executor.isTerminated()) { } loop will spin tightly (it will spike the CPU to near 100%).
Thread pools (ExecutorService) should generally speaking not be created/destroyed regularly. Rather they should be long lived (perhaps entire life of application) to avoid the (significant) overhead of thread creation/destruction.
If you want to submit a list of tasks and wait for all to complete, use ExecutorService.invokeAll() rather than trying to track completion by a countdown latch.
The ExecutorService interface provides 2 mechanisms to shut down: shutdown and shutdownNow. The first simply stops taking new jobs and stops the threads once the currently executing and already submitted work is done. The second attempts to interrupt all work in progress and does not start jobs that were submitted but have not yet begun.
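As a sketch of how the two shutdown calls are commonly combined, assuming Java 8+: call shutdown(), block in awaitTermination() instead of spinning on isTerminated(), and fall back to shutdownNow() if the pool does not drain in time. The class name, the 30 second limit and the counter task are illustrative assumptions:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class GracefulShutdownDemo {

    /** Submits taskCount trivial tasks, shuts the pool down, and returns how many ran. */
    static int runAll(int taskCount) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            executor.execute(completed::incrementAndGet);
        }

        executor.shutdown(); // no new tasks; queued tasks still run
        // Blocks without burning CPU, unlike while (!executor.isTerminated()) { }
        if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
            List<Runnable> neverStarted = executor.shutdownNow(); // interrupt the stragglers
            System.err.println(neverStarted.size() + " tasks never started");
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("completed: " + runAll(60));
    }
}
```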

Thread Completion Notification

I have many threads that run concurrently. For a few of these threads, execution depends on the completion of other threads.
For example:
Thread - 1, Thread - 2, Thread - 3
These can run independently.
Thread - 4, depends on the completion of 1 and 2
Thread - 5, depends on the completion of 1 and 3
Thread - 6, depends on the completion of 1, 2, and 3
All the threads are submitted to the executor. Threads 4, 5, and 6 have to implement some waiting mechanism before starting. What are the possible blocking mechanisms available in Java for the above situation?
You get a Future<?> object when you use
Future<?> ExecutorService.submit(Runnable task)
Just pass the future to the thread that must wait for the completion (e.g. via its constructor) and do:
future.get();
which will block until the task for this future has finished.
Proposals to use blocking facilities like CountDownLatch or Future only work on an unbounded Executor, or at least one able to start all 6 threads simultaneously; otherwise there is a risk of thread starvation. So using an Executor has no advantage compared to starting all 6 threads directly, without an Executor.
Meanwhile, the dependencies dictate that no more than 3 threads will run at any moment. In case the actual number of threads matters, you should use event-driven facilities instead of blocking ones. What I mean is an object which collects signals and, when signals from both tasks 1 and 2 have arrived, submits task 4, etc. From a theoretical point of view, this resembles Petri nets. Unfortunately, before CompletableFuture arrived in Java 8, the JDK did not provide standard classes for event-driven task coordination. It is not hard to implement such a signal collector/task emitter from scratch, or you can use the dataflow library df4j.
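On Java 8 and later, CompletableFuture can express this kind of signal collection without blocking a pool thread. The following is only a sketch under that assumption: the class name is made up, the task bodies just record their number, and the pool is deliberately smaller than the number of tasks to show that no thread sits waiting on a dependency:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class DependentTasksDemo {

    /** Runs tasks 1..6 with the question's dependencies and returns the order they ran in. */
    static List<Integer> runGraph() {
        ExecutorService pool = Executors.newFixedThreadPool(2); // fewer threads than tasks
        List<Integer> order = new CopyOnWriteArrayList<>();
        try {
            // Independent tasks
            CompletableFuture<Void> t1 = CompletableFuture.runAsync(() -> order.add(1), pool);
            CompletableFuture<Void> t2 = CompletableFuture.runAsync(() -> order.add(2), pool);
            CompletableFuture<Void> t3 = CompletableFuture.runAsync(() -> order.add(3), pool);

            // Dependent tasks are submitted only once their prerequisites have completed
            CompletableFuture<Void> t4 =
                    CompletableFuture.allOf(t1, t2).thenRunAsync(() -> order.add(4), pool);
            CompletableFuture<Void> t5 =
                    CompletableFuture.allOf(t1, t3).thenRunAsync(() -> order.add(5), pool);
            CompletableFuture<Void> t6 =
                    CompletableFuture.allOf(t1, t2, t3).thenRunAsync(() -> order.add(6), pool);

            CompletableFuture.allOf(t4, t5, t6).join(); // only the caller blocks here
            return order;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("execution order: " + runGraph());
    }
}
```

The exact interleaving varies between runs; the only guarantee is that each dependent task appears after all of its prerequisites.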
You could use an ExecutorService for the threads and, to wait for the work (the tasks you run on those threads) to finish, call the await() method of the CountDownLatch class as follows:
public void finishWork() {
    try {
        System.out.println("START WAITING for thread");
        countDownLatch.await();
        System.out.println("DONE WAITING for thread");
    } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
    }
}
To signal completion, each thread calls the countDown() method of the same class when it finishes. The method above can then be used before calling shutdown() on the ExecutorService (or whenever you need to wait for the threads' work to complete).
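A small sketch of the latch idea applied to "task 4 depends on tasks 1 and 2", assuming Java 8+. The class and variable names are invented for illustration, and the pool is sized so that the waiting task and both prerequisites can run at the same time; with a smaller pool the waiting task could occupy the only thread and starve the others:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class LatchDependencyDemo {

    /** Returns how many prerequisite tasks had finished when task 4 was allowed to proceed. */
    static int runTask4AfterTasks1And2() throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(3); // room for all three tasks
        CountDownLatch prerequisites = new CountDownLatch(2);
        AtomicInteger done = new AtomicInteger();
        try {
            // "Task 4": submitted first, but blocks until the latch reaches zero
            Future<Integer> task4 = pool.submit(() -> {
                prerequisites.await();
                return done.get();
            });

            // "Tasks 1 and 2": do their work, then signal completion
            for (int i = 0; i < 2; i++) {
                pool.execute(() -> {
                    done.incrementAndGet();
                    prerequisites.countDown();
                });
            }
            return task4.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("prerequisites finished before task 4: " + runTask4AfterTasks1And2());
    }
}
```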

With a Java ExecutorService, how do I complete actively executing tasks but halt the processing of waiting tasks?

I am using an ExecutorService (a ThreadPoolExecutor) to run (and queue) a lot of tasks. I am attempting to write some shut down code that is as graceful as possible.
ExecutorService has two ways of shutting down:
I can call ExecutorService.shutdown() and then ExecutorService.awaitTermination(...).
I can call ExecutorService.shutdownNow().
According to the JavaDoc, the shutdown command:
Initiates an orderly shutdown in which previously submitted
tasks are executed, but no new tasks will be accepted.
And the shutdownNow command:
Attempts to stop all actively executing tasks, halts the
processing of waiting tasks, and returns a list of the tasks that were
awaiting execution.
I want something in between these two options.
I want to call a command that:
a. Completes the currently active task or tasks (like shutdown).
b. Halts the processing of waiting tasks (like shutdownNow).
For example: suppose I have a ThreadPoolExecutor with 3 threads. It currently has 50 tasks in the queue with the first 3 actively running. I want to allow those 3 active tasks to complete but I do not want the remaining 47 tasks to start.
I believe I can shutdown the ExecutorService this way by keeping a list of Future objects around and then calling cancel on all of them. But since tasks are being submitted to this ExecutorService from multiple threads, there would not be a clean way to do this.
I'm really hoping I'm missing something obvious or that there's a way to do it cleanly.
Thanks for any help.
I ran into this issue recently. There may be a more elegant approach, but my solution is to first call shutdown(), then pull out the BlockingQueue being used by the ThreadPoolExecutor and call clear() on it (or else drain it to another Collection for storage). Finally, calling awaitTermination() allows the thread pool to finish what's currently on its plate.
For example:
public static void shutdownPool(boolean awaitTermination) throws InterruptedException {
    //call shutdown to prevent new tasks from being submitted
    executor.shutdown();
    //get a reference to the Queue
    final BlockingQueue<Runnable> blockingQueue = executor.getQueue();
    //clear the Queue
    blockingQueue.clear();
    //or else copy its contents here with a while loop and remove()
    //wait for active tasks to be completed
    if (awaitTermination) {
        executor.awaitTermination(SHUTDOWN_TIMEOUT, TimeUnit.SECONDS);
    }
}
This method would be implemented in the directing class wrapping the ThreadPoolExecutor with the reference executor.
It's important to note the following from the ThreadPoolExecutor.getQueue() javadoc:
Access to the task queue is intended primarily for debugging and
monitoring. This queue may be in active use. Retrieving the task queue
does not prevent queued tasks from executing.
This highlights the fact that additional tasks may be polled from the BlockingQueue while you drain it. However, all BlockingQueue implementations are thread-safe according to that interface's documentation, so this shouldn't cause problems.
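For the "drain it to another Collection" variant, BlockingQueue.drainTo() can move the waiting tasks into a list in one call. The following is a sketch only, assuming Java 8+; the class name, the single-thread pool and the latches exist purely to make the behaviour observable:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class DrainQueueDemo {

    /** Returns {number of tasks drained from the queue, number of tasks that actually ran}. */
    static int[] shutdownAndDrain() throws InterruptedException {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger ran = new AtomicInteger();

        // One active task that keeps the only worker busy until released
        executor.execute(() -> {
            started.countDown();
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            ran.incrementAndGet();
        });
        // Five more tasks that have to wait in the queue
        for (int i = 0; i < 5; i++) {
            executor.execute(ran::incrementAndGet);
        }
        started.await();

        executor.shutdown();                  // reject new submissions
        List<Runnable> drained = new ArrayList<>();
        executor.getQueue().drainTo(drained); // keep the waiting tasks instead of discarding them

        release.countDown();                  // let the active task finish normally
        executor.awaitTermination(10, TimeUnit.SECONDS);
        return new int[] {drained.size(), ran.get()};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = shutdownAndDrain();
        System.out.println("drained=" + result[0] + ", ran=" + result[1]);
    }
}
```

Unlike shutdownNow(), this does not interrupt the active task, so it is allowed to finish on its own.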
The shutdownNow() is exactly what you need. You've missed the 1st word Attempts and the entire 2nd paragraph of its javadoc:
There are no guarantees beyond best-effort attempts to stop processing actively executing tasks. For example, typical implementations will cancel via Thread.interrupt(), so any task that fails to respond to interrupts may never terminate.
So, only tasks which are checking Thread#isInterrupted() on a regular basis (e.g. in a while (!Thread.currentThread().isInterrupted()) loop or something), will be terminated. But if you aren't checking on that in your task, it will still keep running.
You can wrap each submitted task with a little extra logic
Runnable wrapper = new Runnable() {
    @Override
    public void run() {
        // skip the real task once the executor has been shut down
        if (executorService.isShutdown()) {
            throw new Error("shutdown");
        }
        task.run();
    }
};
executorService.submit(wrapper);
The overhead of the extra check is negligible. After the executor is shut down, the wrappers will still be executed, but the original tasks won't.

Executor in java

I was trying to run ExecutorService object with FixedThreadPool and I ran into problems.
I expected the program to run in nanoseconds but it hung. I found that I need to use a Semaphore along with it so that the items in the queue do not pile up.
Is there any way to find out whether all the threads of the pool are in use?
Basic code ...
static ExecutorService pool = Executors.newFixedThreadPool(4);
static Semaphore permits = new Semaphore(4);
try {
    permits.acquire();
    pool.execute(p); // Assuming p is runnable on large number of objects
    permits.release();
} catch ( InterruptedException ex ) {
}
This code hangs and I really don't know why. How can I tell whether the pool is currently waiting for all the threads to finish?
By default, if you submit more than 4 tasks to your pool then the extra tasks will be queued until a thread becomes available.
The blog you referenced in your comment uses the semaphore to limit the amount of work that can be queued at once, which won't be a problem for you until you have many thousands of tasks queued up and they start eating into the available memory. There's an easier way to do this, anyway - construct a ThreadPoolExecutor with a bounded queue.* But this isn't your problem.
If you want to know when a task completes, notice that ExecutorService.submit() returns a Future object which can be used to wait for the task's completion:
Future<?> f = pool.submit(p);
f.get();
System.out.println("task complete");
If you have several tasks and want to wait for all of them to complete, either store each Future in a list and then call get() on each in turn, or investigate ExecutorService.invokeAll() (which essentially does the same but in a single method call).
You can also tell whether a task has completed or not:
Future<?> f = pool.submit(p);
while (!f.isDone()) {
    // do something else, task not complete
}
f.get();
Finally, note that even if your tasks are complete, your program may not exit (and thus appears to "hang") if you haven't called shutdown() on the thread pool; the reason is that the threads are still running, waiting to be given more work to do.
*Edit: sorry, I just re-read my answer and realised this part is incorrect - ThreadPoolExecutor offers tasks to the queue and rejects them if they aren't accepted, so a bounded queue has different semantics to the semaphore approach.
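If the goal really is to cap how much work is pending at once, the semaphore approach only works when the permit is released by the task itself when it finishes, not by the submitting thread right after execute(). A sketch under that assumption (Java 8+; the class name and the counters are illustrative only):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedSubmitDemo {

    /** Returns {tasks completed, highest number of tasks pending at the same time}. */
    static int[] run(int taskCount, int maxInFlight) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < taskCount; i++) {
            permits.acquire(); // blocks the submitter while maxInFlight tasks are pending
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            pool.execute(() -> {
                try {
                    completed.incrementAndGet(); // the real work would go here
                } finally {
                    inFlight.decrementAndGet();
                    permits.release(); // released only when the task is done
                }
            });
        }

        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return new int[] {completed.get(), peak.get()};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = run(100, 4);
        System.out.println("completed=" + result[0] + ", peak pending=" + result[1]);
    }
}
```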
You do not need the Semaphore.
If you are hanging it is probably because the threads are locking themselves elsewhere.
Run the code in a debugger and, when it hangs, pause it and see what the threads are doing.
You could change to using a ThreadPoolExecutor. It contains a getActiveCount() method which returns an approximate count of the active threads. Why it is approximate I'm not sure.
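A short sketch of reading getActiveCount(), assuming Java 8+ and a pool constructed directly as a ThreadPoolExecutor (the class name and the latches are only there for the demonstration). The value is a snapshot: threads can start or finish tasks while it is being computed, so it may already be stale when you read it:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ActiveCountDemo {

    /** Starts two blocking tasks on a pool of four and returns the reported active count. */
    static int activeWhileTwoTasksRun() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 4, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        CountDownLatch started = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);

        for (int i = 0; i < 2; i++) {
            pool.execute(() -> {
                started.countDown();
                try {
                    release.await(); // stay busy until the count has been read
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        started.await();                     // both tasks are now running
        int active = pool.getActiveCount();  // threads currently executing a task
        release.countDown();
        pool.shutdown();
        return active;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("active threads: " + activeWhileTwoTasksRun());
    }
}
```

Comparing the result with getMaximumPoolSize() gives a rough answer to "are all the threads of the pool in use?".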
