Normally when one uses Java 8's parallelStream(), the result is execution via the default, common fork-join pool (i.e. ForkJoinPool.commonPool()).
That is clearly undesirable, however, if one has work that is far from CPU bound, e.g. work that may spend much of its time waiting on I/O. In such cases one will want to use a separate pool, sized according to other criteria (e.g. how much of the time the tasks are likely to actually be using the CPU).
There's no obvious means of getting parallelStream() to use a different pool, but there is a way as detailed here.
Unfortunately, that approach entails invoking the terminal operation on the parallel stream from a fork-join pool thread. The downside is that if the target fork-join pool is completely busy with existing work, the whole execution will wait on it while doing absolutely nothing. Thus the pool can become a bottleneck worse than single-threaded execution. By contrast, when one uses parallelStream() in the "normal" fashion, ForkJoinPool.common.externalHelpComplete() or ForkJoinPool.common.tryExternalUnpush() are used to let the calling thread from outside the pool help with the processing.
Does anyone know of a way to both get parallelStream() to use a non-default fork-join pool and have a calling thread from outside the fork-join pool help in the processing of this work (but not the rest of the fork-join pool's work)?
You can use awaitQuiescence on the pool to help out. However, you can't select which tasks you will help with; it will just take the next pending task from the pool. Thus, if there are more pending tasks, you might end up executing those before getting to your own.
ForkJoinPool forkJoinPool = new ForkJoinPool(1);
// make all threads busy:
forkJoinPool.submit(() -> LockSupport.parkNanos(Long.MAX_VALUE));
// submit our task (may contain your stream operation)
ForkJoinTask<Thread> task = forkJoinPool.submit(() -> Thread.currentThread());
// help out
while (!task.isDone()) // use zero timeout to execute one task only
    forkJoinPool.awaitQuiescence(0, TimeUnit.NANOSECONDS);
System.out.println(Thread.currentThread()==task.get());
will print true.
whereas
ForkJoinPool forkJoinPool = new ForkJoinPool(1);
// make all threads busy:
forkJoinPool.submit(() -> LockSupport.parkNanos(Long.MAX_VALUE));
// overload:
forkJoinPool.submit(() -> LockSupport.parkNanos(Long.MAX_VALUE));
// submit our task (may contain your stream operation)
ForkJoinTask<Thread> task = forkJoinPool.submit(() -> Thread.currentThread());
// help out
while (!task.isDone())
    forkJoinPool.awaitQuiescence(0, TimeUnit.NANOSECONDS);
System.out.println(Thread.currentThread()==task.get());
will hang forever as it attempts to execute the second blocking task.
Nevertheless, it will let the initiating thread help process the pool's pending tasks, which raises the chance of its own task getting executed, as long as there are no infinite tasks (the example above is extreme and was chosen only for demonstration).
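Putting this together with the question's scenario, here is a sketch of running a stream operation in a custom pool while the calling thread helps out; data is a placeholder collection (e.g. a List<String>), and the pool size is arbitrary:
ForkJoinPool customPool = new ForkJoinPool(4);
// run the stream operation inside the custom pool (the known workaround)
ForkJoinTask<Long> task = customPool.submit(
    () -> data.parallelStream().filter(s -> s.length() > 3).count());
// let the calling thread work off the pool's pending tasks while waiting
while (!task.isDone())
    customPool.awaitQuiescence(0, TimeUnit.NANOSECONDS);
long result = task.get();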
But note that the entire relationship between the Fork/Join framework and the Stream API is an implementation detail anyway.
Related
In my Java application I have a Runnable such as:
this.runner = new Runnable() {
    @Override
    public void run() {
        // do something that takes roughly 5 seconds.
    }
};
I need to run this roughly every 30 seconds (although this can vary) in a separate thread. The nature of the code is such that I can run it and forget about it (whether it succeeds or fails). I do this as follows as a single line of code in my application:
(new Thread(this.runner)).start();
Now, this works fine. However, I'm wondering if there is any sort of cleanup I should be doing on each of the thread instances after they finish running? I am doing CPU profiling of this application in VisualVM and I can see that, over the course of 1 hour runtime, a lot of threads are being created. Is this concern valid or is everything OK?
N.B. The reason I start a new Thread instead of simply defining this.runner as a Thread, is that I sometimes need to run this.runner twice simultaneously (before the first run call has finished), and I can't do that if I defined this.runner as a Thread since a single Thread object can only be run again once the initial execution has finished.
Java objects that need to be "cleaned up" or "closed" after use conventionally implement the AutoCloseable interface. This makes it easy to do the clean up using try-with-resources. The Thread class does not implement AutoCloseable, and has no "close" or "dispose" method. So, you do not need to do any explicit clean up.
However
(new Thread(this.runner)).start();
is not guaranteed to immediately start computation of the Runnable. You might not care whether it succeeds or fails, but I guess you do care whether it runs at all. And you might want to limit the number of these tasks running concurrently; you might want only one to run at once, for example. So you might want to join() the thread (or, perhaps, join with a timeout). Joining the thread will ensure that it completes its computation. Joining with a timeout increases the chance that the thread starts its computation (because the current thread will be suspended, freeing a CPU that might run the other thread).
However, creating multiple threads to perform regular or frequent tasks is not recommended. You should instead submit tasks to a thread pool. That will enable you to control the maximum amount of concurrency, and can provide you with other benefits (such as prioritising different tasks), and amortises the expense of creating threads.
You can configure a thread pool to use a fixed-length (bounded) task queue and to cause submitting threads to execute submitted tasks themselves when the queue is full. By doing that you can guarantee that tasks submitted to the thread pool are (eventually) executed. The documentation of ThreadPoolExecutor.execute(Runnable) says it
Executes the given task sometime in the future
which suggests that the implementation guarantees it will eventually run all submitted tasks even if you do not take those specific steps to ensure submitted tasks are executed.
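As a sketch, such an executor could be constructed like this (the pool and queue sizes are arbitrary choices for illustration):
ThreadPoolExecutor executor = new ThreadPoolExecutor(
    4, 4,                                       // core and maximum pool size
    0L, TimeUnit.MILLISECONDS,                  // keep-alive for excess threads
    new ArrayBlockingQueue<>(100),              // bounded task queue
    new ThreadPoolExecutor.CallerRunsPolicy()); // full queue: submitter runs the task itself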
I recommend looking at the Concurrency API; there are numerous predefined facilities for general use. With an ExecutorService you can call the shutdown method after submitting your tasks: the executor then stops accepting new tasks and lets previously submitted tasks run to completion before terminating (call awaitTermination if you need to block until that has happened).
For a short introduction:
https://www.baeldung.com/java-executor-service-tutorial
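To make the question's 30-second case concrete, here is a minimal sketch using a scheduled executor instead of creating a new Thread each time (the pool size of 2 is an arbitrary choice):
ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
// run this.runner roughly every 30 seconds, reusing the pooled threads
scheduler.scheduleWithFixedDelay(this.runner, 0, 30, TimeUnit.SECONDS);
// ... later, when the application shuts down:
scheduler.shutdown();
Note that scheduleWithFixedDelay never overlaps runs of the same task; if two simultaneous runs are sometimes needed, as described in the question, submit this.runner to an ordinary pool each time instead.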
I read a great article about the fork-join framework in Java 7, and the idea is that, with ForkJoinPool and ForkJoinTask, the threads in the pool can pick up subtasks from other tasks, so it's able to use fewer threads to handle more tasks.
Then I tried to use a normal ExecutorService to do the same work, and found I can't tell the difference, since when I submit a new task to the pool, the task will be run on another available thread.
The only difference I can tell is that if I use ForkJoinPool, I don't need to pass the pool to the tasks, because I can call task.fork() to make it run on another thread. But with a normal ExecutorService, I have to pass the pool to the task, or make it static, so that inside the task I can call pool.submit(newTask)
Do I miss something?
(You can view the live code at https://github.com/freewind/fork-join-test/tree/master/src)
Although ForkJoinPool implements ExecutorService, it is conceptually different from 'normal' executors.
You can easily see the difference if your tasks spawn more tasks and wait for them to complete, e.g. by calling
executor.invoke(new Task()); // blocks this thread until new task completes
In a normal executor service, waiting for other tasks to complete will block the current thread. There are two possible outcomes: If your executor service has a fixed number of threads, it might deadlock if the last running thread waits for another task to complete. If your executor dynamically creates new threads on demand, the number of threads might explode and you end up having thousands of threads which might cause starvation.
In contrast, the fork/join framework reuses the thread in the meantime to execute other tasks, so it won't deadlock even though the number of threads is fixed:
new MyForkJoinTask().invoke();
So if you have a problem that you can solve recursively, think of using a ForkJoinPool as you can easily implement one level of recursion as ForkJoinTask.
Just check the number of running threads in your examples.
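For illustration, a minimal RecursiveTask along those lines (the summing task is invented for this sketch, not taken from the linked repository):
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    private final long from, to;
    SumTask(long from, long to) { this.from = from; this.to = to; }

    @Override
    protected Long compute() {
        if (to - from < 1000) {               // small enough: sum directly
            long sum = 0;
            for (long i = from; i <= to; i++) sum += i;
            return sum;
        }
        long mid = (from + to) / 2;
        SumTask left = new SumTask(from, mid);
        left.fork();                          // schedule the left half on the pool
        SumTask right = new SumTask(mid + 1, to);
        return right.compute() + left.join(); // right half in this thread, then join
    }
}
Running new ForkJoinPool(2).invoke(new SumTask(1, 1_000_000)) completes with just two threads, because a worker blocked in join() steals and executes other pending subtasks; a fixed two-thread ExecutorService running the equivalent blocking recursion would deadlock.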
I know that shutdown() and awaitTermination() exist. The problem is that the runnables in the pool need to be able to add an unknown number of other runnables to it (so I can't use a CountDownLatch), and if I call shutdown() those tasks will be rejected. How can I know when they're done?
Work with Future rather than with Runnable. There's this Future#isDone method that may help you.
In case you don't have anything meaningful to return from the Callable, use Callable<Void> and Future<Void>.
Instead of submitting Runnable tasks to an Executor, you should rather use ForkJoinTask/ForkJoinPool instead. A ForkJoinTask runs inside a ForkJoinPool and can spawn an arbitrary number of (sub)tasks and wait for them to complete, without actually blocking the current thread. A ForkJoinTask is complete when all of its sub-tasks are done, so the entire computation is done, when the initial (root) ForkJoinTask is complete.
See Oracle - The Java™ Tutorials - Fork/Join for details.
As all of your tasks are resultless (Runnable), you should subclass RecursiveAction (which is itself a subclass of ForkJoinTask). Implement the method compute(), and spawn an arbitrary number of new tasks there by either calling invoke(subtask), invokeAll(subtask1, subtask2, ...) or subtask.fork() followed by subtask.join().
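A hypothetical sketch of such a subclass; Work, its subdivide() and process() methods are placeholders standing in for your actual task structure, not real API:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.RecursiveAction;

class MyRecursiveAction extends RecursiveAction {
    private final Work work;
    MyRecursiveAction(Work work) { this.work = work; }

    @Override
    protected void compute() {
        List<Work> parts = work.subdivide();    // may produce an unknown number of parts
        if (parts.isEmpty()) {
            work.process();                     // leaf: perform the actual work
        } else {
            List<MyRecursiveAction> subtasks = new ArrayList<>();
            for (Work part : parts)
                subtasks.add(new MyRecursiveAction(part));
            invokeAll(subtasks);                // forks all subtasks and waits for them
        }
    }
}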
The entire computation is executed as follows:
MyRecursiveAction task = new MyRecursiveAction(params);
ForkJoinPool pool = new ForkJoinPool(numberOfThreads);
pool.invoke(task); // will block until task is done
Unfortunately, the advantages of Fork/Join come with some limitations, e.g.:
(...) Computations should ideally avoid synchronized methods or blocks, and should minimize other blocking synchronization apart from joining other tasks or using synchronizers such as Phasers that are advertised to cooperate with fork/join scheduling. Subdividable tasks should also not perform blocking I/O, and should ideally access variables that are completely independent of those accessed by other running tasks. These guidelines are loosely enforced by not permitting checked exceptions such as IOExceptions to be thrown. (...)
For more detail see API docs of ForkJoinTask.
If you are able to use Guava Futures, you can use Futures.allAsList or Futures.successfulAsList. This allows you to wrap a number of Future instances that you got back from the ExecutorService into a single Future which you can then check to see if it is finished using isDone() (or just get(), for that matter, if you want to block until completion).
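A sketch of how that could look, assuming Guava is on the classpath (the submitted tasks here are invented placeholders):
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.ListenableFuture;
import com.google.common.util.concurrent.ListeningExecutorService;
import com.google.common.util.concurrent.MoreExecutors;
import java.util.List;
import java.util.concurrent.Executors;

ListeningExecutorService service =
    MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(4));
ListenableFuture<String> f1 = service.submit(() -> "result one");
ListenableFuture<String> f2 = service.submit(() -> "result two");
// one future that completes when all wrapped futures are done
ListenableFuture<List<String>> all = Futures.allAsList(f1, f2);
List<String> results = all.get(); // or poll all.isDone() instead of blocking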
I have a long-running calculation that I have split up with Java's ForkJoinTask.
Java's FutureTask provides a template method done(). Overriding this method allows for "registering a completion handler".
Is it possible to register a completion handler for a ForkJoinTask?
I am asking because I don't want to have blocking threads in my application - but my application will have a blocking thread as soon as I retrieve the calculation result via calls to result = ForkJoinPool.invoke(myForkJoinTask) or result = ForkJoinPool.submit(myForkJoinTask).get().
I think you mean "lock-free" programming (http://en.wikipedia.org/wiki/Non-blocking_algorithm)? While FutureTask.get() possibly blocks the current thread (and thus leaves a CPU idle), ForkJoinTask.get() (or join()) tries to keep the CPU busy.
This works well if you are able to split your problem into many small pieces (ForkJoinTasks). If one FJTask is internally waiting for the result of another task that is not yet ready, it tries to pick up some other pending task from its ForkJoinPool and executes that in the meantime.
As long as all your tasks are CPU bound, this works fine: all your CPUs are kept busy.
It does NOT work if any of your tasks waits for some external event (e.g. sending a REST call to the Mars rover). Also, the problem should form a DAG, else you may get a deadlock. But as long as you join only tasks you forked earlier within the same task, it works well. Even better if you join the task you forked last.
So it is not too bad to call get() or join() within/between your tasks.
You mentioned a completion handler to solve the problem. If you are implementing the ForkJoinTask yourself, you may have a look at RecursiveTask or even RecursiveAction. You will implement compute(), and you may easily forward the result of each task to some collector at the end of your compute() function instead of returning it.
But you have to consider that your collector will be called concurrently! For adding values or counting completions, have a look at java.util.concurrent.atomic. Avoid using synchronized blocks; otherwise all your tasks have to wait at this single bottleneck and only one CPU keeps working.
I think propagating the results involves more problems than returning them (since the ForkJoinPool handles the latter for you). In addition, it becomes difficult to decide (and to communicate to the outside) at which point the final result is done.
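A sketch of that collector pattern; the counting task is invented for illustration:
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.atomic.AtomicLong;

class CountingAction extends RecursiveAction {
    static final AtomicLong collector = new AtomicLong(); // shared, lock-free accumulator
    private final int[] data;
    private final int from, to;
    CountingAction(int[] data, int from, int to) {
        this.data = data; this.from = from; this.to = to;
    }

    @Override
    protected void compute() {
        if (to - from < 1000) {
            long count = 0;
            for (int i = from; i < to; i++)
                if (data[i] > 0) count++;
            collector.addAndGet(count);  // forward the partial result instead of returning it
        } else {
            int mid = (from + to) / 2;
            invokeAll(new CountingAction(data, from, mid),
                      new CountingAction(data, mid, to));
        }
    }
}
Even so, you only know collector holds the final value once the root task has completed, e.g. after pool.invoke(rootTask) returns, which echoes the difficulty mentioned above.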
I was trying to run ExecutorService object with FixedThreadPool and I ran into problems.
I expected the program to run in nanoseconds, but it hung. I found that I need to use a Semaphore along with it so that the items in the queue do not pile up.
Is there any way I can come to know whether all the threads of the pool are in use?
Basic code ...
static ExecutorService pool = Executors.newFixedThreadPool(4);
static Semaphore permits = new Semaphore(4);

try {
    permits.acquire();
    pool.execute(p); // Assuming p is runnable on large number of objects
    permits.release();
} catch (InterruptedException ex) {
}
This code hangs and I really don't know why. How can I know whether the pool is currently waiting for all the threads to finish?
By default, if you submit more than 4 tasks to your pool then the extra tasks will be queued until a thread becomes available.
The blog you referenced in your comment uses the semaphore to limit the amount of work that can be queued at once, which won't be a problem for you until you have many thousands of tasks queued up and they start eating into the available memory. There's an easier way to do this, anyway - construct a ThreadPoolExecutor with a bounded queue.* But this isn't your problem.
If you want to know when a task completes, notice that ExecutorService.submit() returns a Future object which can be used to wait for the task's completion:
Future<?> f = pool.submit(p);
f.get();
System.out.println("task complete");
If you have several tasks and want to wait for all of them to complete, either store each Future in a list and then call get() on each in turn, or investigate ExecutorService.invokeAll() (which essentially does the same but in a single method call).
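For instance, the list variant could look like this (tasks stands in for whatever collection of Runnables you have):
List<Future<?>> futures = new ArrayList<>();
for (Runnable task : tasks)
    futures.add(pool.submit(task));
for (Future<?> f : futures)
    f.get(); // blocks until that particular task has completed
System.out.println("all tasks complete");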
You can also tell whether a task has completed or not:
Future<?> f = pool.submit(p);
while (!f.isDone()) {
    // do something else, task not complete
}
f.get();
Finally, note that even if your tasks are complete, your program may not exit (and thus appears to "hang") if you haven't called shutdown() on the thread pool; the reason is that the threads are still running, waiting to be given more work to do.
*Edit: sorry, I just re-read my answer and realised this part is incorrect - ThreadPoolExecutor offers tasks to the queue and rejects them if they aren't accepted, so a bounded queue has different semantics to the semaphore approach.
You do not need the Semaphore.
If it is hanging, it is probably because the threads are blocking each other somewhere else.
Run the code in a debugger, pause it when it hangs, and see what the threads are doing.
You could switch to using a ThreadPoolExecutor. It has a getActiveCount() method, which returns an approximate count of the active threads. Why it is approximate, I'm not sure.