ExecutorService.submit() vs ExecutorService.invokeXyz() - java

ExecutorService contains the following methods:
invokeAll(Collection<? extends Callable<T>> tasks)
invokeAny(Collection<? extends Callable<T>> tasks)
submit(Callable<T> task)
I am confused about the use of the terms submit vs invoke. Does it mean that the invokeXyz() methods have the underlying thread pool invoke those tasks as soon as possible, while submit() does some kind of scheduling of the submitted tasks?
This answer says "if we want to wait for completion of all tasks, which have been submitted to ExecutorService". What does "wait for" refer to here?

Both invoke...() and submit() will execute their tasks immediately (assuming threads are available to run the tasks). The difference is that invoke...() will wait for the tasks running in separate threads to finish before returning a result, whereas submit() will return immediately, meaning the task it submitted may still be running in another thread.
In other words, the Future objects returned from invokeAll() are guaranteed to be in a state where Future.isDone() == true. The Future object returned from submit() can be in a state where Future.isDone() == false.
We can easily demonstrate the timing difference.
public static void main(String... args) throws InterruptedException {
Callable<String> c1 = () -> { System.out.println("Hello "); return "Hello "; };
Callable<String> c2 = () -> { System.out.println("World!"); return "World!"; };
List<Callable<String>> callables = List.of(c1, c2);
ExecutorService executor = Executors.newSingleThreadExecutor();
System.out.println("Begin invokeAll...");
List<Future<String>> allFutures = executor.invokeAll(callables);
System.out.println("End invokeAll.\n");
System.out.println("Begin submit...");
List<Future<String>> submittedFutures = callables.stream().map(executor::submit).collect(toList());
System.out.println("End submit.");
executor.shutdown(); // queued tasks still run; without this the worker thread keeps the JVM alive
}
And the result is that the callables print their Hello World message before the invokeAll() method completes; but the callables print Hello World after the submit() method completes.
/*
Begin invokeAll...
Hello
World!
End invokeAll.
Begin submit...
End submit.
Hello
World!
*/
You can play around with this code in an IDE by adding some sleep() time in c1 or c2 and watching as the terminal prints out. This should convince you that invoke...() does indeed wait for something to happen, but submit() does not.
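The same difference can also be checked through Future.isDone() rather than by watching the console. The following is a minimal sketch, not part of the original answer: the class name InvokeVsSubmit, the slow() helper and the 200 ms sleep are assumptions chosen for illustration.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class InvokeVsSubmit {

    // Hypothetical helper: a task that takes a noticeable time before returning.
    static Callable<String> slow(String value) {
        return () -> {
            Thread.sleep(200);
            return value;
        };
    }

    // Returns { all invokeAll futures done on return, submit future done on return }.
    static boolean[] doneOnReturn() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // invokeAll blocks the caller until every task has finished.
            List<Future<String>> invoked = executor.invokeAll(List.of(slow("a"), slow("b")));
            boolean invokedDone = invoked.stream().allMatch(Future::isDone);

            // submit returns right away; the task still has its sleep ahead of it.
            Future<String> submitted = executor.submit(slow("c"));
            boolean submittedDone = submitted.isDone();

            return new boolean[] { invokedDone, submittedDone };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown(); // already submitted tasks still run to completion
        }
    }

    public static void main(String[] args) {
        boolean[] done = doneOnReturn();
        System.out.println("invokeAll futures done on return: " + done[0]);
        System.out.println("submit future done on return: " + done[1]);
    }
}
```

With the sleep in place, the first flag should be true and the second should be false, since the submitted task cannot have finished by the time submit() returns.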

Related

Thread used for Java CompletableFuture composition?

I'm starting to be comfortable with Java CompletableFuture composition, having worked with JavaScript promises. Basically the composition just schedules the chained commands on the indicated executor. But I'm unsure of which thread is running when the composition is performed.
Let's say I have two executors, executor1 and executor2; for simplicity let's say they are separate thread pools. I schedule a CompletableFuture (to use a very loose description):
CompletableFuture<Foo> futureFoo = CompletableFuture.supplyAsync(this::getFoo, executor1);
Then when that is done I transform the Foo to Bar using the second executor:
CompletableFuture<Bar> futureBar = futureFoo.thenApplyAsync(this::fooToBar, executor2);
I understand that getFoo() will be called from a thread in the executor1 thread pool. I understand that fooToBar() will be called from a thread in the executor2 thread pool.
But what thread is used for the actual composition, i.e. after getFoo() finishes and futureFoo is complete, but before the fooToBar() command gets scheduled on executor2? In other words, what thread actually runs the code to schedule the second command on the second executor?
Is the scheduling performed as part of the same thread in executor1 that called getFoo()? If so, would this completable future composition be equivalent to my simply scheduling fooToBar() manually myself in the first command in the executor1 task?
This is intentionally unspecified. In practice, it will be handled by the same code that also handles the chained operations when the variants without the Async suffix are invoked and exhibits similar behavior.
So when we use the following test code
CompletableFuture.supplyAsync(() -> {
LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(1));
return "";
}, r -> new Thread(r, "A").start())
.thenAcceptAsync(s -> {}, r -> {
System.out.println("scheduled by " + Thread.currentThread());
new Thread(r, "B").start();
});
it will likely print
scheduled by Thread[A,5,main]
as the thread that completed the previous stage was used to schedule the depending action.
However when we use
CompletableFuture<String> first = CompletableFuture.supplyAsync(() -> "",
r -> new Thread(r, "A").start());
LockSupport.parkNanos(TimeUnit.SECONDS.toNanos(1));
first.thenAcceptAsync(s -> {}, r -> {
System.out.println("scheduled by " + Thread.currentThread());
new Thread(r, "B").start();
});
it will likely print
scheduled by Thread[main,5,main]
as by the time the main thread invokes thenAcceptAsync, the first future is already completed and the main thread will schedule the action itself.
But that is not the end of the story. When we use
CompletableFuture<String> first = CompletableFuture.supplyAsync(() -> {
LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(5));
return "";
}, r -> new Thread(r, "A").start());
Set<String> s = ConcurrentHashMap.newKeySet();
Runnable submitter = () -> {
String n = Thread.currentThread().getName();
do {
for(int i = 0; i < 1000; i++)
first.thenAcceptAsync(x -> s.add(n+" "+Thread.currentThread().getName()),
Runnable::run);
} while(!first.isDone());
};
Thread b = new Thread(submitter, "B");
Thread c = new Thread(submitter, "C");
b.start();
c.start();
b.join();
c.join();
System.out.println(s);
It may not only print the combinations B A and C A from the first scenario and B B and C C from the second. On my machine it reproducibly also prints the combinations B C and C B indicating that an action passed to thenAcceptAsync by one thread got submitted to the executor by the other thread calling thenAcceptAsync with a different action at the same time.
This is matching the scenarios for the thread evaluating the function passed to thenApply (without the Async) described in this answer. As said at the beginning, that was what I expected as both things are likely handled by the same code. But unlike the thread evaluating the function passed to thenApply, the thread invoking the execute method on the Executor is not even mentioned in the documentation. So in theory, another implementation could use an entirely different thread not calling a method on the future nor completing it.
At the end of this answer is a simple program that mirrors your code snippet and lets you experiment with it.
The output confirms that the executor you supply runs the dependent stage once the stage it is waiting on is complete (unless you explicitly call complete() early enough, in which case the triggering happens in the thread that called complete()). The get() on a Future blocks until the Future is finished.
Run it with an argument and there are two executors, executor 1 and executor 2; run it with no arguments and there is just one. With a single executor, the stages are run sequentially as separate tasks in the same executor, and the output is:
In thread Thread[main,5,main] - getFoo
In thread Thread[main,5,main] - getFooToBar
In thread Thread[pool-1-thread-1,5,main] - Supplying Foo
In thread Thread[pool-1-thread-1,5,main] - fooToBar
In thread Thread[main,5,main] - Completed
OR (two executors - things again run sequentially but using different executors) -
In thread Thread[main,5,main] - getFoo
In thread Thread[main,5,main] - getFooToBar
In thread Thread[pool-1-thread-1,5,main] - Supplying Foo
In thread Thread[pool-2-thread-1,5,main] - fooToBar
In thread Thread[main,5,main] - Completed
Remember: code handed to an executor can start immediately in another thread. In this example, getFoo() was called, and its supplier handed to executor1, before the fooToBar stage was even set up.
Code follows -
package your.test;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;
import java.util.function.Supplier;
public class TestCompletableFuture {
private static void dumpWhichThread(final String msg) {
System.err.println("In thread " + Thread.currentThread().toString() + " - " + msg);
}
private static final class Foo {
final int i;
Foo(int i) {
this.i = i;
}
};
public static Supplier<Foo> getFoo() {
dumpWhichThread("getFoo");
return new Supplier<Foo>() {
@Override
public Foo get() {
dumpWhichThread("Supplying Foo");
return new Foo(10);
}
};
}
private static final class Bar {
final String j;
public Bar(final String j) {
this.j = j;
}
};
public static Function<Foo, Bar> getFooToBar() {
dumpWhichThread("getFooToBar");
return new Function<Foo, Bar>() {
@Override
public Bar apply(Foo t) {
dumpWhichThread("fooToBar");
return new Bar("" + t.i);
}
};
}
public static void main(final String args[]) throws InterruptedException, ExecutionException, TimeoutException {
final TestCompletableFuture obj = new TestCompletableFuture();
obj.running(args.length == 0);
}
private String running(final boolean sameExecutor) throws InterruptedException, ExecutionException, TimeoutException {
final Executor executor1 = Executors.newSingleThreadExecutor();
final Executor executor2 = sameExecutor ? executor1 : Executors.newSingleThreadExecutor();
CompletableFuture<Foo> futureFoo = CompletableFuture.supplyAsync(getFoo(), executor1);
CompletableFuture<Bar> futureBar = futureFoo.thenApplyAsync(getFooToBar(), executor2);
try {
// Try putting a complete here before the get ..
return futureBar.get(50, TimeUnit.SECONDS).j;
}
finally {
dumpWhichThread("Completed");
}
}
}
Which thread triggers the Bar stage to progress? In the example above, it is the executor1 thread. In general, the thread that completes the future (i.e. gives it a value) is the one that releases whatever depends on it. If you completed futureFoo immediately on the main thread, the main thread would be the one triggering it.
So you have to be careful with this. If you have N things all waiting on future results but use only a single-threaded executor, then the first one scheduled will block that executor until it completes. You can extrapolate to M threads and N futures: it can decay into M blocked threads that prevent everything else from progressing.

Does it make sense to use a Future<T> with a CountDownLatch?

I am submitting two jobs using an ExecutorService as below.
final Future<String> futureResultA = executor.submit(jobA);
final Future<String> futureResultB = executor.submit(jobB);
Now, when I want to get the result from this future, I was calling the get() method to await and get the results.
futureResultA.get()
Now, will using a CountDownLatch give me any advantage if I initialize a latch and call countdown on the latch within each of my jobs?
Assuming that the results of the jobs are unrelated to each other, the answer is NO (CountDownLatch will not add any value)
Using a CompletionService will give you an advantage though. Currently, if you wait on Job-A first, then that call blocks until Job-A is done. Only then can you move on to checking the results of Job-B. So even if Job-B finishes before Job-A, you can't use its result until Job-A has also finished.
Using CompletionService, you can get the results from whichever job finishes first and use them without waiting for the other -
public static void main(String[] args) throws Exception {
Callable<String> jobA = () -> {
Thread.sleep(2000);
return "JobA's result";
};
Callable<String> jobB = () -> {
Thread.sleep(1000);
return "JobB's result";
};
Executor executor = Executors.newFixedThreadPool(2);
CompletionService<String> completionService = new ExecutorCompletionService<>(executor);
completionService.submit(jobA);
completionService.submit(jobB);
Future<String> futureWhichCompletedFirst = completionService.take();
System.out.println(futureWhichCompletedFirst.get());
Future<String> futureWhichCompletedNext = completionService.take();
System.out.println(futureWhichCompletedNext.get());
}
CountDownLatch has a counter field, which you can decrement as required. You can then use it to block a calling thread until it has been counted down to zero.
Since Future.get() already serves this purpose in your case, you don't need a CountDownLatch. I would say it is overkill.
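For comparison, here is roughly what the latch-based version would look like. This is a minimal sketch under assumed names (LatchSketch, runAndAwait); the counter increment stands in for the real jobs. Note that the latch only signals that the jobs finished and carries no result, which is why Future.get() already covers the case in the question.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class LatchSketch {

    // Runs `jobs` tasks and returns how many had finished when await() returned.
    static int runAndAwait(int jobs) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        CountDownLatch latch = new CountDownLatch(jobs);
        AtomicInteger finished = new AtomicInteger();
        try {
            for (int i = 0; i < jobs; i++) {
                executor.execute(() -> {
                    try {
                        finished.incrementAndGet(); // stand-in for the real work
                    } finally {
                        latch.countDown(); // always signal, even if the work fails
                    }
                });
            }
            latch.await(); // blocks until the count reaches zero
            return finished.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("jobs finished: " + runAndAwait(2)); // prints "jobs finished: 2"
    }
}
```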

Executing two tasks consecutively

There's a thread pool with a single thread that is used to perform tasks submitted by multiple threads. The task is actually comprised of two parts: perform, which has a meaningful result, and cleanup, which takes quite some time but returns no meaningful result. At the moment the (obviously incorrect) implementation looks something like this. Is there an elegant way to ensure that another perform task will be executed only after the previous cleanup task?
public class Main {
private static class Worker {
int perform() {
return 1;
}
void cleanup() {
}
}
private static void perform() throws InterruptedException, ExecutionException {
ExecutorService pool = Executors.newFixedThreadPool(1);
Worker w = new Worker();
Future f = pool.submit(() -> w.perform());
pool.submit(w::cleanup);
int x = (int) f.get();
System.out.println(x);
}
}
Is there an elegant way to ensure that another perform task will be executed only after previous cleanup task?
The most obvious thing to do is to call cleanup() from perform() but I assume there is a reason why you aren't doing that.
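If the concern is that the caller should not have to wait for the slow cleanup, one possible variation is to keep both calls in a single task but publish the result through a CompletableFuture before cleanup starts. The sketch below is only an illustration: the class PerformThenCleanup, the submit() and callOrder() helpers, and the call-recording Worker are assumptions, not code from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PerformThenCleanup {

    // Hypothetical stand-in for the Worker in the question; it records its calls.
    static class Worker {
        final List<String> calls = Collections.synchronizedList(new ArrayList<>());
        int perform() { calls.add("perform"); return 1; }
        void cleanup() { calls.add("cleanup"); }
    }

    // Queues ONE task that runs perform() and then cleanup(). The result is
    // published as soon as perform() returns, before cleanup() starts.
    static CompletableFuture<Integer> submit(ExecutorService pool, Worker w) {
        CompletableFuture<Integer> result = new CompletableFuture<>();
        pool.execute(() -> {
            try {
                result.complete(w.perform());
            } catch (RuntimeException e) {
                result.completeExceptionally(e);
            } finally {
                w.cleanup(); // same task, so no other perform() can run in between
            }
        });
        return result;
    }

    // Submits two perform/cleanup pairs and returns the order of the calls.
    static List<String> callOrder() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Worker w = new Worker();
        CompletableFuture<Integer> first = submit(pool, w);
        CompletableFuture<Integer> second = submit(pool, w);
        first.join();  // available once the first perform() has returned
        second.join();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS); // let the last cleanup() finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(w.calls);
    }

    public static void main(String[] args) {
        System.out.println(callOrder()); // prints [perform, cleanup, perform, cleanup]
    }
}
```

Because the pool has a single thread and each pair is one task, the pairs cannot interleave regardless of which threads submit them.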
You say that your solution is currently "obviously incorrect". Why? Because of race conditions? Then you could add a synchronized block:
synchronized (pool) {
Future f = pool.submit(() -> w.perform());
pool.submit(w::cleanup);
}
That would ensure that the cleanup() would come immediately after a perform(). If you are worried about the performance hit with the synchronized, don't be.
Another solution might be to use the ExecutorCompletionService class although I'm not sure how that would help with one thread. I've used it before when I had cleanup tasks running in another thread pool.
If you are using Java 8, you can do this with CompletableFuture:
CompletableFuture.supplyAsync(() -> w.perform(), pool)
.thenApplyAsync(result -> { w.cleanup(); return result; }, pool) // run cleanup, then pass the result through
.join();

Is thread starvation deadlock happening here in the code?

//code taken from java concurrency in practice
package net.jcip.examples;
import java.util.concurrent.*;
public class ThreadDeadlock
{
ExecutorService exec = Executors.newSingleThreadExecutor();
public class LoadFileTask implements Callable<String> {
private final String fileName;
public LoadFileTask(String fileName) {
this.fileName = fileName;
}
public String call() throws Exception {
// Here's where we would actually read the file
return "";
}
}
public class RenderPageTask implements Callable<String>
{
public String call() throws Exception
{
Future<String> header, footer;
header = exec.submit(new LoadFileTask("header.html"));
footer = exec.submit(new LoadFileTask("footer.html"));
String page = renderBody();
// Will deadlock -- task waiting for result of subtask
return header.get() + page + footer.get();
}
}
}
This code is taken from Java Concurrency in Practice, and according to the authors a thread starvation deadlock is happening here. Please help me find how and where the thread starvation deadlock happens. Thanks in advance.
The deadlock and starvation occur at the following line:
return header.get() + page + footer.get();
HOW?
It will happen if we add some extra code to the program. It might be this one:
public void startThreadDeadlock() throws Exception
{
Future <String> wholePage = exec.submit(new RenderPageTask());
System.out.println("Content of whole page is " + wholePage.get());
}
public static void main(String[] st)throws Exception
{
ThreadDeadlock tdl = new ThreadDeadlock();
tdl.startThreadDeadlock();
}
Steps leading to the deadlock:
1. A task for rendering the page is submitted to exec via RenderPageTask, a class that implements Callable.
2. exec starts RenderPageTask in a separate thread, the only thread available to execute the other tasks submitted to exec, one after another.
3. Inside the call() method of RenderPageTask, two more tasks are submitted to exec: first LoadFileTask("header.html"), then LoadFileTask("footer.html"). Because exec was obtained via Executors.newSingleThreadExecutor(), which, as mentioned here, uses a single worker thread operating off an unbounded queue, and that thread is already occupied by RenderPageTask, both LoadFileTask instances are enqueued on the unbounded queue, waiting for their turn to be executed by that thread.
4. RenderPageTask returns a String containing the concatenation of the output of LoadFileTask("header.html"), the body of the page, and the output of LoadFileTask("footer.html"). Of these three parts, page is obtained successfully by RenderPageTask. But the other two parts can only be obtained after both tasks have been executed by the single thread of the ExecutorService. That thread will be free only after the call() method of RenderPageTask returns, but call() will return only after LoadFileTask("header.html") and LoadFileTask("footer.html") have returned. So never letting the LoadFileTask instances execute is the starvation, and each task waiting for another task to complete is the deadlock.
I hope this makes clear why the thread starvation deadlock occurs in the code above.
The executor I see is a single-thread executor and it gets two tasks to do. However, these two tasks do not depend on each other and their order of execution does not seem important. Hence the return statement will only pause in the Future.get calls for as long as required to complete one task and then the other.
There will be no deadlock in the code you show.
However I see one more task in the code (RenderPageTask), it is not clear which executor is actually running its code. If it is the same single thread executor, then deadlock is possible as the two submitted tasks cannot be processed before the main task returns (and this task can only return after the two tasks have been processed).
The reason is not very obvious from the code itself but from the original book where the code is copied from: RenderPageTask submits two additional tasks to the Executor to fetch the page header and footer...
If the RenderPageTask were a task independent from the newSingleThreadExecutor, there would be no deadlock at all.
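The effect can be observed without hanging the JVM by waiting with a timeout. This is a minimal sketch with assumed names (StarvationSketch, outerCompletes) and an arbitrary 500 ms timeout: the outer task plays the role of RenderPageTask and the inner one the role of LoadFileTask.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class StarvationSketch {

    // Submits an outer task that submits an inner task to the SAME pool and waits for it.
    // Returns true if the outer task finished, false if it was still stuck after the timeout.
    static boolean outerCompletes(int poolSize) {
        ExecutorService exec = Executors.newFixedThreadPool(poolSize);
        try {
            Future<String> outer = exec.submit(() -> {
                Future<String> inner = exec.submit(() -> "header");
                return inner.get(); // needs a free worker thread to run `inner`
            });
            outer.get(500, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // outer holds the only thread, so inner never starts
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            exec.shutdownNow(); // interrupt a stuck task so the JVM can exit
        }
    }

    public static void main(String[] args) {
        System.out.println("pool of 1 completes: " + outerCompletes(1)); // false
        System.out.println("pool of 2 completes: " + outerCompletes(2)); // true
    }
}
```

With a second worker thread, or with the subtasks sent to a different executor, the inner task can run while the outer one waits, so the outer task completes.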

Java main class ends up before threads execution

I have a multithreaded program and I want to track and print the execution time, but when I run the code, the child threads take longer than the main execution, so the output is either not visible or shows the wrong value, because main terminates earlier.
Here is the code:
public static void main(String[] args) throws CorruptIndexException, IOException, LangDetectException, InterruptedException {
/* Initialization */
long startingTime = System.currentTimeMillis();
Indexer main = new Indexer(); // this class extends Thread
File file = new File(SITES_PATH);
main.addFiles(file);
/* Multithreading through ExecutorService */
ExecutorService es = Executors.newFixedThreadPool(4);
for (File f : main.queue) {
Indexer ind = new Indexer(main.writer, main.identificatore, f);
ind.join();
es.submit(ind);
}
es.shutdown();
/* log creation - code I want to execute when all the threads execution ended */
long executionTime = System.currentTimeMillis()-startingTime;
long minutes = TimeUnit.MILLISECONDS.toMinutes(executionTime);
long seconds = TimeUnit.MILLISECONDS.toSeconds(executionTime)%60;
String fileSize = sizeConversion(FileUtils.sizeOf(file));
Object[] array = {fileSize,minutes,seconds};
logger.info("{} indexed in {} minutes and {} seconds.",array);
}
I tried several solutions such as join(), wait() and notifyAll(), but none of them worked.
I found this Q&A on Stack Overflow which addresses my problem, but join() is ignored, and if I put
es.awaitTermination(timeout, TimeUnit.SECONDS);
the executor service never actually executes the threads.
What would be the solution for running the multithreaded work only in the ExecutorService block and finishing with the main execution at the end?
Given your use case you might as well utilize the invokeAll method. From the Javadoc:
Executes the given tasks, returning a list of Futures holding their
status and results when all complete. Future.isDone() is true for each
element of the returned list. Note that a completed task could have
terminated either normally or by throwing an exception. The results of
this method are undefined if the given collection is modified while
this operation is in progress.
To use:
final Collection<Callable<Object>> tasks = new ArrayList<Callable<Object>>();
for(final File f: main.queue) {
// invokeAll needs Callables; Indexer extends Thread, so adapt it as a Runnable
tasks.add(Executors.callable(new Indexer(main.writer, main.identificatore, f)));
}
final ExecutorService es = Executors.newFixedThreadPool(4);
final List<Future<Object>> results = es.invokeAll(tasks);
This will execute all supplied tasks and wait for them to finish processing before proceeding on your main thread. You will need to tweak the code to fit your particular needs, but you get the gist. A quick note, there is a variant of the invokeAll method that accepts timeout parameters. Use that variant if you want to wait up to a maximum amount of time before proceeding. And make sure to check the results collected after the invokeAll is done, in order to verify the status of the completed tasks.
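The timeout variant and the result check mentioned above could look roughly like this. It is a sketch under assumptions: the class InvokeAllTimeoutSketch, the three sample tasks and the 500 ms limit are made up for illustration and stand in for the Indexer tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class InvokeAllTimeoutSketch {

    // Runs the tasks with a deadline and describes how each one ended.
    static List<String> runWithTimeout(List<Callable<String>> tasks, long timeoutMillis) {
        ExecutorService es = Executors.newFixedThreadPool(4);
        List<String> outcomes = new ArrayList<>();
        try {
            // Tasks still unfinished when the timeout expires are cancelled.
            List<Future<String>> futures =
                    es.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            for (Future<String> f : futures) {
                if (f.isCancelled()) {
                    outcomes.add("timed out");
                    continue;
                }
                try {
                    outcomes.add("ok: " + f.get());
                } catch (ExecutionException e) {
                    // the task threw; the original exception is the cause
                    outcomes.add("failed: " + e.getCause().getMessage());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            es.shutdown();
        }
        return outcomes;
    }

    // Hypothetical sample run: one success, one failure, one task that is too slow.
    static List<String> demo() {
        List<Callable<String>> tasks = List.of(
                () -> "indexed a.html",
                () -> { throw new IllegalStateException("corrupt index"); },
                () -> { Thread.sleep(5_000); return "too slow"; });
        return runWithTimeout(tasks, 500);
    }

    public static void main(String[] args) {
        // prints [ok: indexed a.html, failed: corrupt index, timed out]
        System.out.println(demo());
    }
}
```

The futures come back in the same order as the tasks, so each outcome can be matched to the task that produced it.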
Good luck.
The ExecutorService#submit() method returns a Future object, which can be used for waiting until the submitted task has completed.
The idea is that you collect all of these Futures, and then call get() on each of them. This ensures that all of the submitted tasks have completed before your main thread continues.
Something like this:
ExecutorService es = Executors.newFixedThreadPool(4);
List<Future<?>> futures = new ArrayList<Future<?>>();
for (File f : main.queue) {
Indexer ind = new Indexer(main.writer, main.identificatore, f);
Future<?> future = es.submit(ind);
futures.add(future);
}
// wait for all tasks to complete
for (Future<?> f : futures) {
f.get();
}
// shutdown thread pool, carry on working in main thread...
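If the individual results are not needed, another option is to call shutdown() first and then awaitTermination(), which blocks until the queued tasks have finished or the timeout elapses. The sketch below uses assumed names (AwaitTerminationSketch, runAll) and a trivial counter increment in place of the indexing work; the one minute limit is arbitrary.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class AwaitTerminationSketch {

    // Runs `taskCount` short tasks and returns how many had completed
    // by the time awaitTermination() returned.
    static int runAll(int taskCount) {
        ExecutorService es = Executors.newFixedThreadPool(4);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            es.execute(() -> completed.incrementAndGet()); // stand-in for indexing one file
        }
        es.shutdown(); // stop accepting new tasks; queued ones still run
        try {
            if (!es.awaitTermination(1, TimeUnit.MINUTES)) {
                es.shutdownNow(); // give up on tasks that are still running
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            es.shutdownNow();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        int done = runAll(10);
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(done + " tasks finished in " + elapsed + " ms");
    }
}
```

The elapsed time is measured only after awaitTermination() returns, so it covers the work done in the pool threads as well.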
