Executor in java - java

I was trying to run ExecutorService object with FixedThreadPool and I ran into problems.
I expected the program to finish almost instantly, but it hung. I found that I need to use a Semaphore along with it so that the items in the queue do not pile up.
Is there any way to find out whether all the threads of the pool are in use?
Basic code ...
static ExecutorService pool = Executors.newFixedThreadPool(4);
static Semaphore permits = new Semaphore(4);
try {
permits.acquire();
pool.execute(p); // Assuming p is runnable on large number of objects
permits.release();
} catch ( InterruptedException ex ) {
}
This code hangs and I really don't know why. How can I tell whether the pool is currently waiting for all the threads to finish?

By default, if you submit more than 4 tasks to your pool then the extra tasks will be queued until a thread becomes available.
The blog you referenced in your comment uses the semaphore to limit the amount of work that can be queued at once, which won't be a problem for you until you have many thousands of tasks queued up and they start eating into the available memory. There's an easier way to do this, anyway - construct a ThreadPoolExecutor with a bounded queue.* But this isn't your problem.
If you want to know when a task completes, notice that ExecutorService.submit() returns a Future object which can be used to wait for the task's completion:
Future<?> f = pool.submit(p); // submit() returns a Future; execute() returns void
f.get();
System.out.println("task complete");
If you have several tasks and want to wait for all of them to complete, either store each Future in a list and then call get() on each in turn, or investigate ExecutorService.invokeAll() (which essentially does the same but in a single method call).
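The "store each Future in a list" approach could look roughly like the sketch below. The class name, the pool size of 4 and the trivial counting task are placeholders chosen for illustration; they are not from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

final class WaitForAllFutures {

    /** Submits taskCount trivial tasks, waits for every one, and returns how many ran. */
    static int runAndWait(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // size is an arbitrary choice
        AtomicInteger completed = new AtomicInteger();
        Runnable task = () -> completed.incrementAndGet();      // stand-in for real work
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < taskCount; i++) {
                futures.add(pool.submit(task));
            }
            for (Future<?> f : futures) {
                f.get(); // blocks until this particular task has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown(); // lets the worker threads exit so the JVM can stop
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("tasks completed: " + runAndWait(10));
    }
}
```

Because the loop calls get() on every Future, the method only returns once all tasks are done, no matter in which order they finish.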
You can also tell whether a task has completed or not:
Future<?> f = pool.submit(p);
while(!f.isDone()) {
// do something else, task not complete
}
f.get();
Finally, note that even if your tasks are complete, your program may not exit (and thus appears to "hang") if you haven't called shutdown() on the thread pool; the reason is that the threads are still running, waiting to be given more work to do.
*Edit: sorry, I just re-read my answer and realised this part is incorrect - ThreadPoolExecutor offers tasks to the queue and rejects them if they aren't accepted, so a bounded queue has different semantics to the semaphore approach.
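To make the difference concrete, here is a small sketch of what a bounded queue does with the default rejection handler. The one-thread pool, the one-slot ArrayBlockingQueue and the helper names are assumptions picked so that the behaviour is easy to observe.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedQueueRejection {

    /** Returns how many of the submitted tasks a 1-thread pool with a 1-slot queue rejects. */
    static int countRejections(int submissions) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < submissions; i++) {
                try {
                    // every task blocks, so the thread and the queue slot stay occupied
                    pool.execute(() -> awaitQuietly(release));
                } catch (RejectedExecutionException e) {
                    rejected++; // default AbortPolicy: queue full and no spare thread
                }
            }
        } finally {
            release.countDown(); // unblock the accepted tasks
            pool.shutdown();
        }
        return rejected;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + countRejections(3));
    }
}
```

The submitting thread is never blocked here; it gets an exception instead. A semaphore acquired before each submission makes the submitter wait, which is the behaviour the blog post was after.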

You do not need the Semaphore.
If you are hanging it is probably because the threads are locking themselves elsewhere.
Run the code in a debugger and, when it hangs, pause it and see what the threads are doing.

You could change to using a ThreadPoolExecutor. It contains a getActiveCount() method which returns an approximate count of the active threads. Why it is approximate I'm not sure.
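A minimal sketch of reading getActiveCount() follows. It builds the ThreadPoolExecutor directly, with the same settings that Executors.newFixedThreadPool uses as far as I know; the latches and method name are only there to make the demonstration repeatable.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ActiveCountDemo {

    /** Keeps `busy` tasks running on a pool of poolSize threads and reports getActiveCount(). */
    static int activeWhileBusy(int poolSize, int busy) {
        if (busy > poolSize) {
            throw new IllegalArgumentException("busy must not exceed poolSize in this demo");
        }
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(busy);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < busy; i++) {
                pool.execute(() -> {
                    started.countDown();
                    try {
                        release.await(); // simulate a long-running task
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await(); // every task is now inside run()
            return pool.getActiveCount();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("active threads: " + activeWhileBusy(4, 2));
    }
}
```

Comparing getActiveCount() with getMaximumPoolSize() gives a rough "are all threads in use" check, with the caveat that the value can be stale by the time you act on it.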

Related

How can I ensure an ExecutorService pool has completed, without shutting it down?

Currently, I'm making sure my tasks have finished before moving on like so:
ExecutorService pool = Executors.newFixedThreadPool(5);
public Set<Future> EnqueueWork(StreamWrapper stream) {
Set<Future> futureObjs = new HashSet<>();
util.setData(stream);
Callable callable = util;
Future future = pool.submit(callable);
futureObjs.add(future);
pool.shutdown();
try {
pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
} catch (InterruptedException e) {
e.printStackTrace();
}
Node.sendTCP(Node.getNodeByHostname(StorageTopology.getNextPeer()), Coordinator.prepareForTransport(stream));
return futureObjs;
}
However, because of some other threading on my socket, it's possible that multiple calls are made to EnqueueWork - I'd like to make sure the calls to .submit have completed in the current thread, without shutting down the pool for subsequent threads coming in.
Is this possible?
You can check by invoking the isDone() method on all the Future objects in futureObjs; you need to call isDone() in a loop. Calling get() on each Future is another option: since get() is a blocking call, it returns only after the task is completed and the result is ready. But do you really want to keep the pool open after all the tasks are done?
I agree with one of the comments: it seems odd that your executor can be used by different threads. Usually an executor is private to an instance of some class, but anyhow.
What you can do, from the docs, is to check:
getActiveCount() - Returns the approximate number of threads that are actively executing tasks.
NOTE: This is a blocking method, it will take out a lock on the workers of your threadpool and block until it has counted everything
And also check:
getQueue() - Returns the task queue used by this executor. Access to the
task queue is intended primarily for debugging and monitoring.
This queue may be in active use. Retrieving the task queue
does not prevent queued tasks from executing.
If your queue is empty and the activeCount is 0, all your tasks should have finished. I say should because getActiveCount says "approximate". Looking at the impl, this is most likely because the worker internally has a flag indicating that it is locked (in use). There is in theory a slight race between executing and the worker being done and marking itself so.
A better approach would in fact be to track the futures. You would have to check the queue and that all futures are done.
However I think what you really need is to reverse your logic. Instead of the current thread trying to work out if another thread has submitted work in the meantime, you should have the other thread call isShutdown() and simply not submit a new task in that case.
You are approaching this issue from the wrong direction. If you need to know whether or not your tasks are finished, that means you have a dependency of A->B. The executor is the wrong place to ensure that dependency, as much as you don't ask the engine of your car "are we there yet?".
Java offers several features to ensure that a certain state has been reached before starting a new execution path. One of them is the invokeAll method of the ExecutorService, that returns only when all tasks that have been submitted are completed.
pool.invokeAll(listOfAllMyCallables);
// if you reach this point all callables are completed
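As a sketch of how invokeAll lets each caller wait for its own batch while the pool stays open for later callers, consider the following. The squares workload and all names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class InvokeAllBatches {

    /** Runs one batch on the given pool and returns the summed results; the pool stays open. */
    static int sumOfSquares(ExecutorService pool, int n) {
        List<Callable<Integer>> batch = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            final int value = i;
            batch.add(() -> value * value);
        }
        try {
            int sum = 0;
            // invokeAll blocks until every callable in this batch has completed
            for (Future<Integer> f : pool.invokeAll(batch)) {
                sum += f.get();
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        }
    }

    /** Two batches share one pool; shutdown happens once, at the very end. */
    static int[] runTwoBatches() {
        ExecutorService pool = Executors.newFixedThreadPool(5);
        try {
            return new int[] { sumOfSquares(pool, 3), sumOfSquares(pool, 4) };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] sums = runTwoBatches();
        System.out.println(sums[0] + " " + sums[1]);
    }
}
```

Each call waits only for the callables it passed in, so concurrent callers do not need to inspect the executor's state or shut it down.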
You have already added Future to the set. Just add below code block to get the status of each Future task by calling get() with time out period.
In my example, time out is 60 seconds. You can change it as per your requirement.
Sample code:
try{
for(Future future : futureObjs){
System.out.println("future.status = " + future.get(60000, TimeUnit.MILLISECONDS));
}
}catch(Exception err){
err.printStackTrace();
}
Other useful posts:
How to forcefully shutdown java ExecutorService
How to wait for completion of multiple tasks in Java?

Java ThreadPool concepts, and issues with controlling the number of actual threads

I am a newbie to Java concurrency and am a bit confused by several concepts and implementation issues here. Hope you guys can help.
Say, I have a list of tasks stored in a thread-safe list wrapper:
ListWrapper jobs = ....
'ListWrapper' has synchronized fetch/push/append functions, and this 'jobs' object will be shared by multiple worker threads.
And I have a worker 'Runnable' to execute the tasks:
public class Worker implements Runnable{
private ListWrapper jobs;
public Worker(ListWrapper l){
this.jobs=l;
}
public void run(){
while(! jobs.isEmpty()){
//fetch an item from jobs and do sth...
}
}
}
Now in the main function I execute the tasks:
int NTHREADS =10;
ExecutorService service= Executors.newFixedThreadPool(NTHREADS);
//run threads..
int x=3;
for(int i=0; i<x; i++){
service.execute(new Worker(jobs) );
}
I tested this code with 'x=3', and I found that only 3 threads are running at the same time; but as I set 'x=20', I found that only 10 (=NTHREADS) are running at the same time. Seems to me the # of actual threads is the min of the two values.
Now my questions are:
1) Which value ('x' or 'NTHREADS') should I set to control the number of concurrent threads? Or it doesn't matter in either I choose?
2) How is this approach different from simply using the Producer-Consumer pattern --creating a fixed number of 'stud' threads to execute the tasks(shown in the code below)?
Thread t1= new Worker(jobs);
Thread t2= new Worker(jobs);
...
t1.join();
t2.join();
...
Thank you very much!!
[[ There are some good answers here but I thought I'd add some more detail. ]]
I tested this code with 'x=3', and I found that only 3 threads are running at the same time; but as I set 'x=20', I found that only 10 (=NTHREADS) are running at the same time. Seems to me the # of actual threads is the min of the two values.
No, not really. I suspect that the reason you weren't seeing 20 threads is that threads had already finished or had yet to be started. If you call new Thread(...).start() 20 times then you will get 20 threads started. However, if you check immediately none of them may have actually begun to run or if you check later they may have finished.
1) Which value ('x' or 'NTHREADS') should I set to control the number of concurrent threads? Or it doesn't matter in either I choose?
Quoting the Javadocs of Executors.newFixedThreadPool(...):
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue. At any point, at most nThreads threads will be active processing tasks.
So changing the NTHREADS constant changes the number of threads running in the pool. Changing x changes the number of jobs that are executed by those threads. You could have 2 threads in the pool and submit 1000 jobs or you could have 1000 threads and only submit 1 job for them to work on.
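One way to observe this is to count how many tasks are inside run() at the same moment. The sketch below does that with two atomic counters; the 20 ms sleep and the names are arbitrary choices for the demonstration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PoolSizeVersusJobs {

    /** Runs `jobs` short tasks on nThreads threads; returns the peak number running at once. */
    static int peakConcurrency(int nThreads, int jobs) {
        ExecutorService service = Executors.newFixedThreadPool(nThreads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < jobs; i++) {
            service.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the highest value seen
                try {
                    Thread.sleep(20); // stand-in for real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }
        service.shutdown(); // queued jobs still run; no new ones are accepted
        try {
            if (!service.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("jobs did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("x=3:  peak " + peakConcurrency(10, 3));
        System.out.println("x=20: peak " + peakConcurrency(10, 20));
    }
}
```

The peak can never exceed the number of jobs or the pool size, whichever is smaller, which matches what the asker observed.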
Btw, after you have submitted all of your jobs, you should then shutdown the pool which stops all of the threads if all of the jobs have been run.
service.shutdown();
2) How is this approach different from simply using the Producer-Consumer pattern --creating a fixed number of 'stud' threads to execute the tasks(shown in the code below)?
It differs in that it does all of the heavy work for you.
You don't have to create a ListWrapper of the jobs since you get one inside of the ExecutorService. You just submit the jobs to the ExecutorService and it keeps track of them until the threads are available to run them.
You don't have to create any threads or worry about them throwing exceptions and dying because the ExecutorService starts/restarts the threads for you.
If you want your tasks to return information you can make use of the submit(Callable) method and use the Future to get the results of the jobs. Etc, etc..
Doing this code yourself is going to be harder to get right, more code to maintain, and most likely will not perform as well as the code in the JDK that is battle tested and optimized.
You shouldn't create threads by yourself when using a threadpool. Instead of WorkerThread class you should use a class that implements Runnable but is not a thread. Passing a Thread object to the threadpool won't make the thread run actually. The object will be passed to a different internal thread, which will simply execute the run method of your WorkerThread class.
The ExecutorService is simply incompatible with the way you want to write your program.
In the code you have right now, these WorkerThreads will stop to work when your ListWrapper is empty. If you then add something to the list, nothing will happen. This is definitely not what you wanted.
You should get rid of ListWrapper and simply put your tasks directly into the threadpool. The threadpool already incorporates an internal list of jobs shared between the threads. You should just submit your jobs to the threadpool and it will handle them accordingly.
To answer your questions:
1) Which value ('x' or 'NTHREADS') should I set to control the number of concurrent threads? Or it doesn't matter in either I choose?
NTHREADS, the threadpool will create the necessary number of threads.
2) How is this approach different from simply using the Producer-Consumer pattern --creating a fixed number of 'stud' threads to execute the tasks(shown in the code below)?
It's just that ExecutorService automates a lot of things for you. You can choose from a lot of different implementations of threadpools and you can substitute them easily. You can use for instance a scheduled executor. You get extra functionality. Why reinvent the wheel?
For 1) NTHREADS is the maximum threads that the pool will ever run concurrently, but that doesn't mean there will always be that many running. It will only use as many as is needed up to that max value... which in your case is 3.
As the docs say:
At any point, at most nThreads threads will be active processing tasks. If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available
http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Executors.html#newFixedThreadPool-int-
As for 2) using Java's concurrent executors framework is preferred with new code. You get a lot of stuff for free and removes the need for having to handle all of the fiddly thread work yourself.
The number of threads passed into newFixedThreadPool is at most how many threads could be running executing your tasks. If you only have three tasks ever submitted I'd expect the ExecutorService to only create three threads.
To answer your questions:
You should use the number you pass into the constructor to control how many threads are going to be used to execute your tasks.
This differs because of the extra functionality the ExecutorService gives you, as well as the flexibility it gives you such as in the case you need to change your ExecutorService type or number of tasks you'll run (less lines of code to change).
All that is happening is the executor service is only creating as many threads as it needs. NTHREADS is effectively the maximum number of threads it'll create.
There is no point creating ten threads up front if it only has 3 tasks to complete, the other 7 will just be hanging around consuming resources.
If you submit more than NTHREADS number of tasks then it will process that number concurrently and the rest will wait on a queue until a thread becomes free.
This isn't any different from creating a fixed set of your own threads, except the thread management and scheduling is handled for you. The executor service also restarts threads if they are killed by rogue exceptions in your task which you'd otherwise have to code for.
See: The Javadoc on Executors.newFixedThreadPool
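The on-demand thread creation can be checked with getPoolSize(). This sketch constructs a ThreadPoolExecutor with the settings that, to my knowledge, newFixedThreadPool uses; the class and method names are hypothetical.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class LazyThreadCreation {

    /** Submits `tasks` no-op jobs to a pool limited to nThreads; returns the threads created. */
    static int threadsCreated(int nThreads, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                nThreads, nThreads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> { }); // each submission below the limit starts one new thread
        }
        int created = pool.getPoolSize(); // threads currently in the pool
        pool.shutdown();
        return created;
    }

    public static void main(String[] args) {
        System.out.println("3 tasks:  " + threadsCreated(10, 3) + " threads");
        System.out.println("20 tasks: " + threadsCreated(10, 20) + " threads");
    }
}
```

If you would rather have all threads started up front, ThreadPoolExecutor also offers prestartAllCoreThreads().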

Interrupt Runnable that takes hours

I have a ThreadPoolExecutor:
ThreadPoolExecutor service = new ThreadPoolExecutor(N_THREADS, N_THREADS, 0L, TimeUnit.MILLISECONDS, blockingQueue, rejectedExecutionHandler);
The service executes threads implementing the Runnable interface. Each thread processes a file on disk. I found that after several hours, two threads (or cores depending on what htop shows in Linux) were running and had been running for 13 hours. What's even worse is that the remaining cores showed no activity as if they were waiting for the two threads to complete.
Questions:
1 - I have read a lot on how this problem may be resolved but nothing conclusive. As far as I can work out, you CANNOT stop a Runnable using the ThreadPoolExecutor because it is an independent thread that just runs. Using the Future framework:
Future<?> f = f.get(submittedtask,XX)
allows you to set a timeout and fetch the future result, but get blocks all the threads effectively making the implementation serial. Is it possible to interrupt a Runnable after a given time using the threadpoolexecutor, get the thread back to the pool so it can pickup a new task and carry on.
2 - My big concern is why, using htop, I see two threads/cores running and no other core/thread are running despite many tasks are still waiting to execute (i.e. there are many files left to process). Any insight?
You could create a second scheduled thread pool to which you would submit cancellation tasks for each of the returned Futures. Each of these tasks, after a given timeout, would check whether its associated Future is done and, if not, cancel it. Cancellation triggers thread interruption, so you might need to support it in your tasks by checking the interrupted flag: Thread.interrupted().
The size of this second thread pool could be minimal, i.e. 1 as this job takes minimum of CPU time.
Code example:
ScheduledExecutorService service = Executors.newScheduledThreadPool(1);
...
while(...){
final Future<?> f = pool.submit(...);
service.schedule(new Runnable() {
@Override
public void run() {
if(!f.isDone()){
f.cancel(true);
}
}
}, 1, TimeUnit.MINUTES);
}
// request shutdown first; already scheduled cancellation tasks still run by default
service.shutdown();
service.awaitTermination(1, TimeUnit.MINUTES);
You can tell a thread that you wish to interrupt:
An interrupt is an indication to a thread that it should stop what it is doing and do something else.
You can interrupt your thread with Future.cancel(true). It is your responsibility to implement the Runnable so that it honors that request by checking its interrupted state, for example with Thread.interrupted().
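A task that cooperates with cancellation might be sketched like this. The busy loop stands in for the hours-long file processing, and the latches exist only so the example can confirm that the task really stopped; none of these names come from the question.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class InterruptibleTask {

    /** Cancels a long-running task and reports whether it noticed the interrupt and stopped. */
    static boolean cancelAndConfirm() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch stopped = new CountDownLatch(1);
        Future<?> f = pool.submit(() -> {
            started.countDown();
            // stand-in for hours of work: poll the interrupt flag between small chunks
            while (!Thread.currentThread().isInterrupted()) {
                Thread.yield(); // one small chunk of real work would go here
            }
            stopped.countDown(); // clean-up point reached because of the interrupt
        });
        try {
            started.await();
            f.cancel(true); // true = interrupt the thread that is running the task
            return stopped.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("task stopped after cancel: " + cancelAndConfirm());
    }
}
```

Note that isInterrupted() leaves the flag set, while the static Thread.interrupted() clears it. Blocking calls such as Thread.sleep() report the interrupt by throwing InterruptedException instead.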
In order to see details about process thread run:
ps -eLf | grep <PROCESS_PID>
This is useful because htop shows you the list of running processes, and each process has at least one thread.

With a Java ExecutorService, how do I complete actively executing tasks but halt the processing of waiting tasks?

I am using an ExecutorService (a ThreadPoolExecutor) to run (and queue) a lot of tasks. I am attempting to write some shut down code that is as graceful as possible.
ExecutorService has two ways of shutting down:
I can call ExecutorService.shutdown() and then ExecutorService.awaitTermination(...).
I can call ExecutorService.shutdownNow().
According to the JavaDoc, the shutdown command:
Initiates an orderly shutdown in which previously submitted
tasks are executed, but no new tasks will be accepted.
And the shutdownNow command:
Attempts to stop all actively executing tasks, halts the
processing of waiting tasks, and returns a list of the tasks that were
awaiting execution.
I want something in between these two options.
I want to call a command that:
a. Completes the currently active task or tasks (like shutdown).
b. Halts the processing of waiting tasks (like shutdownNow).
For example: suppose I have a ThreadPoolExecutor with 3 threads. It currently has 50 tasks in the queue with the first 3 actively running. I want to allow those 3 active tasks to complete but I do not want the remaining 47 tasks to start.
I believe I can shutdown the ExecutorService this way by keeping a list of Future objects around and then calling cancel on all of them. But since tasks are being submitted to this ExecutorService from multiple threads, there would not be a clean way to do this.
I'm really hoping I'm missing something obvious or that there's a way to do it cleanly.
Thanks for any help.
I ran into this issue recently. There may be a more elegant approach, but my solution is to first call shutdown(), then pull out the BlockingQueue being used by the ThreadPoolExecutor and call clear() on it (or else drain it to another Collection for storage). Finally, calling awaitTermination() allows the thread pool to finish what's currently on its plate.
For example:
public static void shutdownPool(boolean awaitTermination) throws InterruptedException {
//call shutdown to prevent new tasks from being submitted
executor.shutdown();
//get a reference to the Queue
final BlockingQueue<Runnable> blockingQueue = executor.getQueue();
//clear the Queue
blockingQueue.clear();
//or else copy its contents here with a while loop and remove()
//wait for active tasks to be completed
if (awaitTermination) {
executor.awaitTermination(SHUTDOWN_TIMEOUT, TimeUnit.SECONDS);
}
}
This method would be implemented in the directing class wrapping the ThreadPoolExecutor with the reference executor.
It's important to note the following from the ThreadPoolExecutor.getQueue() javadoc:
Access to the task queue is intended primarily for debugging and
monitoring. This queue may be in active use. Retrieving the task queue
does not prevent queued tasks from executing.
This highlights the fact that additional tasks may be polled from the BlockingQueue while you drain it. However, all BlockingQueue implementations are thread-safe according to that interface's documentation, so this shouldn't cause problems.
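The "drain it to another Collection" variant could be sketched with BlockingQueue.drainTo, as below. The single-thread pool and the latches are assumptions that make the outcome predictable; a real pool would have several active tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class DrainOnShutdown {

    /** Returns {tasksRun, tasksDropped} for a 1-thread pool stopped while its first task runs. */
    static int[] shutdownKeepingActiveOnly(int total) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch firstStarted = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger ran = new AtomicInteger();
        for (int i = 0; i < total; i++) {
            executor.execute(() -> {
                firstStarted.countDown();
                try {
                    release.await(); // keeps the active task busy until we say so
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                ran.incrementAndGet();
            });
        }
        List<Runnable> dropped = new ArrayList<>();
        try {
            firstStarted.await();
            executor.shutdown();                  // no new submissions from now on
            executor.getQueue().drainTo(dropped); // remove and keep the waiting tasks
            release.countDown();                  // let the active task finish normally
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            executor.shutdown();
        }
        return new int[] { ran.get(), dropped.size() };
    }

    public static void main(String[] args) {
        int[] r = shutdownKeepingActiveOnly(5);
        System.out.println("ran " + r[0] + ", dropped " + r[1]);
    }
}
```

One caveat: if tasks were added with submit() rather than execute(), the drained entries are the Future wrappers, and a Future whose task was removed this way never completes unless you cancel it yourself.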
The shutdownNow() is exactly what you need. You've missed the 1st word Attempts and the entire 2nd paragraph of its javadoc:
There are no guarantees beyond best-effort attempts to stop processing actively executing tasks. For example, typical implementations will cancel via Thread.interrupt(), so any task that fails to respond to interrupts may never terminate.
So, only tasks which are checking Thread#isInterrupted() on a regular basis (e.g. in a while (!Thread.currentThread().isInterrupted()) loop or something), will be terminated. But if you aren't checking on that in your task, it will still keep running.
You can wrap each submitted task with a little extra logic
Runnable wrapper = new Runnable() {
    @Override
    public void run() {
        // skip the real task if the executor was shut down while this one was queued
        if (executorService.isShutdown()) {
            throw new Error("shutdown");
        }
        task.run();
    }
};
executorService.submit(wrapper);
The overhead of the extra check is negligible. After the executor is shut down, the wrappers will still be executed, but the original tasks won't.

Collecting Return Values from Launched Threads? [latest Java]

I'm looking for the simplest, most straightforward way to implement the following:
main starts and launches 3 threads
all 3 tasks process and end in a resulting value (which I need to return somehow?)
main waits (.join?) on each thread to ensure they have all 3 completed their task
main somehow gets the value from each thread (3 values)
Then the rest is fairly simple, processes the 3 results and then terminates...
Now, I've been doing some reading and found multiple ideas, like:
Using Future, but this is for async work; is this really a good idea when the main thread needs to block waiting for all 3 spawned threads to finish?
Passing in an object (to a thread) and then simply having the thread "fill it" with the result
Somehow using Runnable (not sure how yet).
Anyways - what would be the best, and simplest recommended approach?
Thanks,
List<Callable<Result>> list = ... create list of callables
ExecutorService es = Executors.newFixedThreadPool(3);
List<Future<Result>> results = es.invokeAll(list);
ExecutorService.invokeAll method will return only after all tasks (instances of Callable) finished, either normally or by throwing exception.
For details see ExecutorService (mainly its invokeAll method), Executors, Callable.
You could also use a Semaphore from java.util.concurrent.
Create a new Semaphore with 1 - #threads permits and have main call acquire() on the Semaphore.
When each of the threads you have created has finished its work, get it to call the release() method.
As you have created a Semaphore with a negative number of permits the call to acquire() will block until this number becomes positive. This will not happen until all of your threads have released a permit on the Semaphore.
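A minimal sketch of that semaphore approach, assuming each thread writes its result into its own slot of a shared array (the array, the dummy results and the names are illustrative):

```java
import java.util.concurrent.Semaphore;

final class NegativePermitJoin {

    /** Starts `workers` threads, waits for all of them via a semaphore, and sums their results. */
    static int runWorkers(int workers) {
        Semaphore done = new Semaphore(1 - workers); // a negative initial permit count is allowed
        int[] results = new int[workers];
        for (int i = 0; i < workers; i++) {
            final int id = i;
            new Thread(() -> {
                results[id] = (id + 1) * 10; // each thread fills only its own slot
                done.release();              // signal that this thread has finished
            }).start();
        }
        try {
            done.acquire(); // a permit becomes available only after all `workers` releases
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int sum = 0;
        for (int r : results) {
            sum += r;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("sum of results: " + runWorkers(3));
    }
}
```

With 3 workers the semaphore starts at -2, so the single acquire() in main can only succeed after the third release(). A CountDownLatch expresses the same idea more directly, if you prefer not to rely on negative permits.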
