How to avoid congesting/stalling/deadlocking an executorservice with recursive callable - java

All the threads in an ExecutorService are busy with tasks that wait for tasks that are stuck in the queue of the executor service.
Example code:
ExecutorService es=Executors.newFixedThreadPool(8);
Set<Future<Set<String>>> outerSet=new HashSet<>();
for(int i=0;i<8;i++){
outerSet.add(es.submit(new Callable<Set<String>>() {
#Override
public Set<String> call() throws Exception {
Thread.sleep(10000); //to simulate work
Set<Future<String>> innerSet=new HashSet<>();
for(int j=0;j<8;j++) {
int k=j;
innerSet.add(es.submit(new Callable<String>() {
#Override
public String call() throws Exception {
return "number "+k+" in inner loop";
}
}));
}
Set<String> out=new HashSet<>();
while(!innerSet.isEmpty()) { //we are stuck at this loop because all the
for(Future<String> f:innerSet) { //callable in innerSet are stuckin the queue
if(f.isDone()) { //of es and can't start since all the threads
out.add(f.get()); //in es are busy waiting for them to finish
}
}
}
return out;
}
}));
}
Are there any way to avoid this other than by making more threadpools for each layer or by having a threadpool that is not fixed in size?
A practical example would be if some callables are submitted to ForkJoinPool.commonPool() and then these tasks use objects that also submit to the commonPool in one of their methods.

You should use a ForkJoinPool. It was made for this situation.
Whereas your solution blocks a thread permanently while it's waiting for its subtasks to finish, the work stealing ForkJoinPool can perform work while in join(). This makes it efficient for these kinds of situations where you may have a variable number of small (and often recursive) tasks that are being run. With a regular thread-pool you would need to oversize it, to make sure that you don't run out of threads.
With CompletableFuture you need to handle a lot more of the actual planning/scheduling yourself, and it will be more complex to tune if you decide to change things. With FJP the only thing you need to tune is the amount of threads in the pool, with CF you need to think about then vs. thenAsync as well.

I would recommend trying to decompose the work to use completion stages via CompletableFuture
CompletableFuture.supplyAsync(outerTask)
.thenCompose(CompletableFuture.allOf(innerTasks)
That way your outer task doesn’t hog the execution thread while processing inner tasks, but you still get a Future that resolves when the entire job is done. It can be hard to split those stages up if they’re too tightly coupled though.

The approach that you are suggesting which basically is based on the hypothesis that there is a possible resolution if the number of threads are more than the number of task, will not work here, if you are already allocating a single thread pool. You may try it to see it. It's a simple case of deadlock as you have stated in the comments of your code.
In such a case, use two separate thread pools, one for the outer and another for the inner. And when the task from the inner pool completes, simply return back the value to the outer.
Or you can simply create a thread on the fly, get the work done in it, get the result and return it back to the outer.

Related

Run functions concurrently in java with wait [duplicate]

During the course of my program execution, a number of threads are started. The amount of threads varies depending on user defined settings, but they are all executing the same method with different variables.
In some situations, a clean up is required mid execution, part of this is stopping all the threads, I don't want them to stop immediately though, I just set a variable that they check for that terminates them. The problem is that it can be up to 1/2 second before the thread stops. However, I need to be sure that all threads have stopped before the clean up can continues. The cleanup is executed from another thread so technically I need this thread to wait for the other threads to finish.
I have thought of several ways of doing this, but they all seem to be overly complex. I was hoping there would be some method that can wait for a group of threads to complete. Does anything like this exist?
Just join them one by one:
for (Thread thread : threads) {
thread.join();
}
(You'll need to do something with InterruptedException, and you may well want to provide a time-out in case things go wrong, but that's the basic idea...)
If you are using java 1.5 or higher, you can try CyclicBarrier. You can pass the cleanup operation as its constructor parameter, and just call barrier.await() on all threads when there is a need for cleanup.
Have you seen the Executor classes in java.util.concurrent? You could run your threads through an ExecutorService. It gives you a single object you can use to cancel the threads or wait for them to complete.
Define a utility method (or methods) yourself:
public static waitFor(Collection<? extends Thread) c) throws InterruptedException {
for(Thread t : c) t.join();
}
Or you may have an array
public static waitFor(Thread[] ts) throws InterruptedException {
waitFor(Arrays.asList(ts));
}
Alternatively you could look at using a CyclicBarrier in the java.util.concurrent library to implement an arbitrary rendezvous point between multiple threads.
If you control the creation of the Threads (submission to an ExecutorService) then it appears you can use an ExecutorCompletionService
see ExecutorCompletionService? Why do need one if we have invokeAll? for various answers there.
If you don't control thread creation, here is an approach that allows you to join the threads "one by one as they finish" (and know which one finishes first, etc.), inspired by the ruby ThreadWait class.
Basically by newing up "watching threads" which alert when the other threads terminate, you can know when the "next" thread out of many terminates.
You'd use it something like this:
JoinThreads join = new JoinThreads(threads);
for(int i = 0; i < threads.size(); i++) {
Thread justJoined = join.joinNextThread();
System.out.println("Done with a thread, just joined=" + justJoined);
}
And the source:
public static class JoinThreads {
java.util.concurrent.LinkedBlockingQueue<Thread> doneThreads =
new LinkedBlockingQueue<Thread>();
public JoinThreads(List<Thread> threads) {
for(Thread t : threads) {
final Thread joinThis = t;
new Thread(new Runnable() {
#Override
public void run() {
try {
joinThis.join();
doneThreads.add(joinThis);
}
catch (InterruptedException e) {
// "should" never get here, since we control this thread and don't call interrupt on it
}
}
}).start();
}
}
Thread joinNextThread() throws InterruptedException {
return doneThreads.take();
}
}
The nice part of this is that it works with generic Java threads, without modification, any thread can be joined. The caveat is it requires some extra thread creation. Also this particular implementation "leaves threads behind" if you don't call joinNextThread() the full number of times, and doesn't have an "close" method, etc. Comment here if you'd like a more polished version created. You could also use this same type of pattern with "Futures" instead of Thread objects, etc.

ThreadPoolExecutor's getActiveCount()

I have a ThreadPoolExecutor that seems to be lying to me when I call getActiveCount(). I haven't done a lot of multithreaded programming however, so perhaps I'm doing something incorrectly.
Here's my TPE
#Override
public void afterPropertiesSet() throws Exception {
BlockingQueue<Runnable> workQueue;
int maxQueueLength = threadPoolConfiguration.getMaximumQueueLength();
if (maxQueueLength == 0) {
workQueue = new LinkedBlockingQueue<Runnable>();
} else {
workQueue = new LinkedBlockingQueue<Runnable>(maxQueueLength);
}
pool = new ThreadPoolExecutor(
threadPoolConfiguration.getCorePoolSize(),
threadPoolConfiguration.getMaximumPoolSize(),
threadPoolConfiguration.getKeepAliveTime(),
TimeUnit.valueOf(threadPoolConfiguration.getTimeUnit()),
workQueue,
// Default thread factory creates normal-priority,
// non-daemon threads.
Executors.defaultThreadFactory(),
// Run any rejected task directly in the calling thread.
// In this way no records will be lost due to rejection
// however, no records will be added to the workQueue
// while the calling thread is processing a Task, so set
// your queue-size appropriately.
//
// This also means MaxThreadCount+1 tasks may run
// concurrently. If you REALLY want a max of MaxThreadCount
// threads don't use this.
new ThreadPoolExecutor.CallerRunsPolicy());
}
In this class I also have a DAO that I pass into my Runnable (FooWorker), like so:
#Override
public void addTask(FooRecord record) {
if (pool == null) {
throw new FooException(ERROR_THREAD_POOL_CONFIGURATION_NOT_SET);
}
pool.execute(new FooWorker(context, calculator, dao, record));
}
FooWorker runs record (the only non-singleton) through a state machine via calculator then sends the transitions to the database via dao, like so:
public void run() {
calculator.calculate(record);
dao.save(record);
}
Once my main thread is done creating new tasks I try and wait to make sure all threads finished successfully:
while (pool.getActiveCount() > 0) {
recordHandler.awaitTermination(terminationTimeout,
terminationTimeoutUnit);
}
What I'm seeing from output logs (which are presumably unreliable due to the threading) is that getActiveCount() is returning zero too early, and the while() loop is exiting while my last threads are still printing output from calculator.
Note I've also tried calling pool.shutdown() then using awaitTermination but then the next time my job runs the pool is still shut down.
My only guess is that inside a thread, when I send data into the dao (since it's a singleton created by Spring in the main thread...), java is considering the thread inactive since (I assume) it's processing in/waiting on the main thread.
Intuitively, based only on what I'm seeing, that's my guess. But... Is that really what's happening? Is there a way to "do it right" without putting a manual incremented variable at the top of run() and a decremented at the end to track the number of threads?
If the answer is "don't pass in the dao", then wouldn't I have to "new" a DAO for every thread? My process is already a (beautiful, efficient) beast, but that would really suck.
As the JavaDoc of getActiveCount states, it's an approximate value: you should not base any major business logic decisions on this.
If you want to wait for all scheduled tasks to complete, then you should simply use
pool.shutdown();
pool.awaitTermination(terminationTimeout, terminationTimeoutUnit);
If you need to wait for a specific task to finish, you should use submit() instead of execute() and then check the Future object for completion (either using isDone() if you want to do it non-blocking or by simply calling get() which blocks until the task is done).
The documentation suggests that the method getActiveCount() on ThreadPoolExecutor is not an exact number:
getActiveCount
public int getActiveCount()
Returns the approximate number of threads that are actively executing tasks.
Returns: the number of threads
Personally, when I am doing multithreaded work such as this, I use a variable that I increment as I add tasks, and decrement as I grab their output.

How to have one java thread wait for the result of another thread?

I frequently need to have a thread wait for the result of another thread. Seems like there should be some support for this in java.util.concurrent, but I can't find it.
Exchanger is very close to what I'm talking about, but it's bi-directional. I only want Thread A to wait on Thread B, not have both wait on each other.
Yes, I know I can use a CountDownLatch or a Semaphore or Thread.wait() and then manage the result of the computation myself, but it seems like I must be missing a convenience class somewhere.
What am I missing?
UPDATE
// An Example which works using Exchanger
// but you would think there would be uni-directional solution
protected Exchanger<Integer> exchanger = new Exchanger<Integer>();
public void threadA() {
// perform some computations
int result = ...;
exchanger.exchange(result);
}
public void threadB() {
// retrieve the result of threadA
int resultOfA = exchanger.exchange(null);
}
Are you looking for Future<T>? That's the normal representation of a task which has (usually) been submitted to a work queue, but may not have completed yet. You can find out its completion status, block until it's finished, etc.
Look at ExecutorService for the normal way of obtaining futures. Note that this is focused on getting the result of an individual task, not rather than waiting for a thread to finish. A single thread may complete many tasks in its life time, of course - that's the whole point of a thread pool.
So far, it seems like BlockingQueue may be the best solution I've found.
eg.
BlockingQueue<Integer> queue = new ArrayBlockingQueue<Integer>(1);
The waiting thread will call queue.take() to wait for the result, and the producing queue will call queue.add() to submit the result.
The JDK doesn't provide a convenience class that provides the exact functionality you're looking for. However, it is actually fairly easy to write a small utility class to do just that.
You mentioned the CountDownLatch and your preference regarding it, but I would still suggest looking at it. You can build a small utility class (a "value synchronizer" if you will) pretty easily:
public class OneShotValueSynchronizer<T> {
private volatile T value;
private final CountDownLatch set = new CountDownLatch(1);
public T get() throws InterruptedException {
set.await();
return value;
}
public synchronized void set(T value) {
if (set.getCount() > 0) {
this.value = value;
set.countDown();
}
}
// more methods if needed
}
Since Java 8 you can use CompletableFuture<T>. Thread A can wait for a result using the blocking get() method, while Thread B can pass the result of computation using complete().
If Thread B encounters an exception while calculating the result, it can communicate this to Thread A by calling completeExceptionally().
What's inconvenient in using Thread.join()?
I recently had the same problem, tried using a Future then a CountdownLatch but settled on an Exchanger. They are supposed to allow two threads to swap data but there's no reason why one of those threads can't just pass a null.
In the end I think it was the cleanest solution, but it may depend on what exactly you are trying to achieve.
You might use java.util.concurrent.CountDownLatch for this.
http://download.oracle.com/javase/6/docs/api/java/util/concurrent/CountDownLatch.html
Example:
CountDownLatch latch = new CountDownLatch(1);
// thread one
// do some work
latch.countDown();
// thread two
latch.await();

java-Executor Framework

Please look at my following code....
private static final int NTHREDS = 10;
ExecutorService executor = Executors.newFixedThreadPool(NTHREDS);
while(rs.next()){
webLink=rs.getString(1);
FirstName=rs.getString(2);
MiddleName=rs.getString(3);
Runnable worker = new MyRunnable(webLink,FirstName,MiddleName);// this interface has run method....
executor.execute(worker);
}
//added
public class MyRunnable implements Runnable {
MyRunnable(String webLink,String FirstName,String MiddleName){
** Assigning Values...***
}
#Override
public void run() {
long sum = 0;
**Calling method to crawl by passing those Values**
try {
Thread.sleep(200);
}
catch (InterruptedException e)
{
e.printStackTrace();
}
}
}
In this part if the resultset(rs) having 100 records excutor creating 100 threads..... I need to run this process with in 10 threads. I need your help to know how to get control of threads.. If any thread has completed its task then it should process the immediate available task from the Result Set. Is it possible to achieve using executor framework.
Thanks...
vijay365
The code you've already posted does this. Your code will not immediately spawn 100 threads. It will spawn 10 threads that consume tasks from a queue containing your Runnables.
From the Executors.newFixedThreadPool Javadocs:
Creates a thread pool that reuses a
fixed set of threads operating off a
shared unbounded queue.
Instead of using a static number of threads (10 in this case) you should determine the number dynamically:
final int NTHREADS = Runtime.getRuntime().availableProcessors();
Also, I don't get why you are calling Thread.sleep?
ResultSet is probably a JDBC query result.
This design is almost certain to be doomed to failure.
The JDBC interface implementations are not thread-safe.
ResultSets are scare resources that should be closed in the same scope in which they were created. If you pass them around, you're asking for trouble.
Multi-threaded code is hard to write well and even harder to debug if incorrect.
You are almost certainly headed in the wrong direction with this design. I'd bet a large sum of money that you're guilty of premature optimization. You are hoping that multiple threads will make your code faster, but what will happen is ten threads time slicing on one CPU and taking the same time or longer. (Context switching takes time, too.)
A slightly better idea would be to load the ResultSet into an object or collection, close the ResultSet, and then do some multi-threaded processing on that returned object.
Try executor.submit(worker);

Java: Best way to retrieve timings form multiple threads

We have 1000 threads that hit a web service and time how long the call takes. We wish for each thread to return their own timing result to the main application, so that various statistics can be recorded.
Please note that various tools were considered for this, but for various reasons we need to write our own.
What would be the best way for each thread to return the timing - we have considered two options so far :-
1. once a thread has its timing result it calls a singleton that provides a synchronised method to write to the file. This ensures that all each thread will write to the file in turn (although in an undetermined order - which is fine), and since the call is done after the timing results have been taken by the thread, then being blocked waiting to write is not really an issue. When all threads have completed, the main application can then read the file to generate the statistics.
2. Using the Executor, Callable and Future interfaces
Which would be the best way, or are there any other better ways ?
Thanks very much in advance
Paul
Use the latter method.
Your workers implement Callable. You then submit them to a threadpool, and get a Future instance for each.
Then just call get() on the Futures to get the results of the calculations.
import java.util.*;
import java.util.concurrent.*;
public class WebServiceTester {
public static class Tester
implements Callable {
public Integer call() {
Integer start = now();
//Do your test here
Integer end = now();
return end - start;
}
}
public static void main(String args[]) throws Exception {
ExecutorService pool = Executors.newFixedThreadPool(1000);
Set<Future<Integer>> set = new HashSet<Future<Integer>>();
for (int i =0 ; i < 1000 i++) {
set.add(pool.submit(new Tester()));
}
Set<Integer> results = new Set<Integer>();
for (Future<Integer> future : set) {
results.put(future.get());
}
//Manipulate results however you wish....
}
}
Another possible solution I can think of would be to use a CountDownLatch (from the java concurrency packages), each thread decrementing it (flagging they are finished), then once all complete (and the CountDownLatch reaches 0) your main thread can happily go through them all, asking them what their time was.
The executor framework can be implemented here. The time processing can be done by the Callable object. The Future can help you identify if the thread has completed processing.
You could pass an ArrayBlockingQueue to the threads to report their results to. You could then have a file writing thread that takes from the queue to write to the file.

Categories