How can I combine futures with CompletableFuture.allOf() when the futures are created inside a for loop? I want to create a bunch of futures that run in parallel, and the method should return the result only when all of them have completed:
// Version 1: execute each task asynchronously and return all tasks when finished
public Set<Task> getTasks(){
var executor = Executors.newCachedThreadPool();
var tasks = new LinkedHashSet<Task>();
var futures = new ArrayList<CompletableFuture<Set<Task>>>();
for (var task : user.getTasks()) {
// all futures are executed in parallel
futures.add(CompletableFuture.supplyAsync(() -> execute(task), executor));
}
for (var f : futures) {
// this blocks until the current future is finished
tasks.addAll(f.join());
}
return tasks;
}
Or is there an alternative? I have also tried the following, but it executes the futures one after another (instead of in parallel):
// Version 2:
var executor = Executors.newCachedThreadPool();
var tasks = new LinkedHashSet<Task>();
for (var task : user.getTasks()) {
CompletableFuture.supplyAsync(() -> execute(task), executor)
.thenAccept(tasks::addAll).join();
}
EDIT: in the end I have two versions that come close to solving my problem. However, I guess version A is not right, because parallel threads add elements to the LinkedHashSet asynchronously (which could cause trouble, because LinkedHashSet is not thread safe):
VERSION A (it seems not thread safe):
var executor = Executors.newCachedThreadPool();
var tasks = new LinkedHashSet<Task>();
var futures = new ArrayList<CompletableFuture<Void>>();
for (var t : user.getTasks()) {
futures.add(CompletableFuture.supplyAsync(() -> execute(t), executor).thenAcceptAsync(tasks::addAll));
}
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
and VERSION B (which could be better, but is a little complex):
var executor = Executors.newCachedThreadPool();
var futures = new ArrayList<CompletableFuture<Set<Task>>>();
for (var t : user.getTasks()) {
futures.add(CompletableFuture.supplyAsync(() -> execute(t), executor));
}
Set<Task> o = CompletableFuture
.allOf(futures.toArray(new CompletableFuture[0]))
.thenApplyAsync(v -> futures.stream().flatMap(future -> future.join().stream()))
.join().collect(Collectors.toSet());
I cannot find an easier approach. For completeness, I add the following code, which is the shortest. However, it uses the ForkJoinPool, which should be avoided (?) for long-running tasks:
// VERSION C: execute in parallel without suffering from the CompletableFuture API:
return user.getTasks()
.parallelStream()
.flatMap(t -> execute(t).stream())
.collect(Collectors.toSet());
Your code should work as it is. That is, the for loop in your first example waits for the first future to complete before proceeding to the second future, but in the meantime all the other futures are concurrently running. They typically start to execute as soon as you've called supplyAsync. To prove this, here's a self-contained executable:
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
public class Demo {
public static void main(String[] args) throws InterruptedException {
var executor = Executors.newCachedThreadPool();
var results = new ArrayList<String>();
var futures = new ArrayList<CompletableFuture<String>>();
futures.add(CompletableFuture.supplyAsync(() -> sleep(2), executor));
TimeUnit.MILLISECONDS.sleep(100);
futures.add(CompletableFuture.supplyAsync(() -> sleep(1), executor));
// All futures are executed in parallel
for (var f : futures) {
results.add(f.join());
}
results.forEach(System.out::println);
}
private static String sleep(int seconds) {
var start = LocalTime.now();
try {
TimeUnit.SECONDS.sleep(seconds);
} catch (InterruptedException ignored) {
Thread.currentThread().interrupt();
}
var end = LocalTime.now();
return String.format("Thread %s started at %s and finished at %s",
Thread.currentThread().getId(), start, end);
}
}
The output proves that the second future finished before the first, as expected:
Thread 14 started at 17:49:35.202673531 and finished at 17:49:37.206196631
Thread 15 started at 17:49:35.262183490 and finished at 17:49:36.262342704
CompletableFuture.allOf() is pretty simple here when using Stream API:
CompletableFuture.allOf(user.getTasks().stream()
.map(task -> CompletableFuture.supplyAsync(() -> execute(task), executor))
.toArray(CompletableFuture[]::new))
.join();
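The allOf() call above only waits; it does not hand back the results. A minimal sketch of collecting them afterwards could look like the following. The class name AllOfCollect, the helper runAll and the generic work function are hypothetical stand-ins for the question's execute(task), not part of the original code.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

public class AllOfCollect {

    // Hypothetical helper: runs 'work' on every input in parallel and merges the resulting sets.
    static <I, T> Set<T> runAll(List<I> inputs, Function<I, Set<T>> work, ExecutorService executor) {
        // Start every future first; collect(...) forces the whole stream to run.
        List<CompletableFuture<Set<T>>> futures = inputs.stream()
                .map(in -> CompletableFuture.supplyAsync(() -> work.apply(in), executor))
                .collect(Collectors.toList());

        // allOf completes when every future is done; join() rethrows a failure, if any.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

        // Every future is complete here, so join() returns immediately.
        // The merge happens on the calling thread, so a plain LinkedHashSet is safe.
        Set<T> merged = new LinkedHashSet<>();
        for (CompletableFuture<Set<T>> f : futures) {
            merged.addAll(f.join());
        }
        return merged;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            Set<String> result = runAll(List.of(1, 2, 3),
                    n -> Set.of("task-" + n + "-a", "task-" + n + "-b"), executor);
            System.out.println(result.size() + " tasks collected");
        } finally {
            executor.shutdown();
        }
    }
}
```

Because the merge runs on the caller's thread after all futures are done, this avoids the thread-safety concern raised for VERSION A.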
Of course your second variant will execute one after another:
CompletableFuture.supplyAsync(() -> execute(task), executor)
.thenAccept(tasks::addAll)
.join();
You join that, blocking the thread.
The second problem is the use of newCachedThreadPool. I'll explain that based on the JDK's HttpClient. Early versions stated in the documentation that it uses a cached pool; later that statement was removed from the documentation. Currently it is still in the implementation, but that will also be removed in time. The problem is that such a pool, when used incorrectly, will eat all your resources and kill your application. No more free threads? Sure, I will create a new one, and so on... Eventually this will hurt you. Use a pool with a limited number of threads.
To answer your question, you are looking for some kind of flatMap that could turn CompletableFuture<Set<X>> into Set<CompletableFuture<X>>. Such a non-blocking method does not exist. You need to call join, but you can delay the call to that join via a trick:
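As a rough illustration of the last point, the sketch below runs many short tasks on a fixed-size pool and records how many were running at once. The class BoundedPoolDemo and the method maxConcurrency are made up for this example; the 20 ms sleep merely stands in for real work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedPoolDemo {

    // Runs taskCount short tasks on a pool of poolSize threads and
    // returns the highest number of tasks observed running at once.
    static int maxConcurrency(int poolSize, int taskCount) {
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            List<CompletableFuture<Void>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                futures.add(CompletableFuture.runAsync(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(20); // stand-in for real work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        running.decrementAndGet();
                    }
                }, executor));
            }
            // Excess tasks wait in the pool's queue instead of spawning new threads.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            executor.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + maxConcurrency(4, 20));
    }
}
```

With a fixed pool the peak can never exceed the pool size, whereas a cached pool would create a thread for every task that finds no idle worker.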
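// Streams are lazy and handle one element at a time, so the futures must be
// collected first; otherwise each future would be joined before the next one starts.
List<CompletableFuture<Set<Task>>> futures = user.getTasks().stream()
.map(each -> CompletableFuture.supplyAsync(() -> execute(each), executor))
.collect(Collectors.toList());
Set<Task> tasks = futures.stream()
.map(CompletableFuture::join)
.flatMap(Set::stream)
.collect(Collectors.toSet());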
After trying all the versions above, I have come to the conclusion that the following solution is the best:
// VERSION X is the best
public Set<Task> getTasks(){
var executor = Executors.newCachedThreadPool();
var futures = new ArrayList<Future<Set<Task>>>();
var tasks = new LinkedHashSet<Task>();
for (var t : user.getTasks()) {
futures.add(executor.submit(() -> execute(t)));
}
for (var f : futures) {
try {
tasks.addAll(f.get());
} catch (Exception e) {
e.printStackTrace();
}
}
return tasks;
}
It's the best because:
easy and fast code (no unneeded overhead, lambdas, completableFuture,..)
no exception is suppressed
does not stop the execution of further tasks if one task raises an exception
If anyone can convince me to use other versions, then please add arguments.
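For comparison, the same per-task tolerance can be had with CompletableFuture by mapping each failure to an empty result via handle(). This is only a sketch under assumptions: TolerantJoin, runAllTolerant and the work function are hypothetical names standing in for the execute(t) call above, and logging to System.err is just a placeholder.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

public class TolerantJoin {

    // Runs 'work' for every input; a failing task is reported and contributes
    // an empty set, while the remaining tasks still run and are collected.
    static <I, T> Set<T> runAllTolerant(List<I> inputs, Function<I, Set<T>> work,
                                        ExecutorService executor) {
        List<CompletableFuture<Set<T>>> futures = new ArrayList<>();
        for (I input : inputs) {
            futures.add(CompletableFuture
                    .supplyAsync(() -> work.apply(input), executor)
                    .handle((result, error) -> {
                        if (error != null) {
                            System.err.println("task failed for " + input + ": " + error);
                            return Set.<T>of();
                        }
                        return result;
                    }));
        }
        Set<T> merged = new LinkedHashSet<>();
        for (CompletableFuture<Set<T>> f : futures) {
            merged.addAll(f.join()); // task failures were already mapped to empty sets
        }
        return merged;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            Set<String> result = runAllTolerant(List.of(1, 2, 3), n -> {
                if (n == 2) {
                    throw new IllegalStateException("boom");
                }
                return Set.of("task-" + n);
            }, executor);
            System.out.println(result);
        } finally {
            executor.shutdown();
        }
    }
}
```

Whether this is preferable to the plain Future loop is a matter of taste; it mainly pays off when further asynchronous steps are chained onto the results.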
In my web application, I need to call more than 10 methods in one API call. To make that efficient I use an ExecutorService to run them on multiple threads at the same time. Each method returns a different type of object, except fav_a(), fav_b() and fav_c(). (Sample code is given below for convenience.)
@GetMapping(RequestUrl.INIT)
public ResponseEntity<Map<String, List<?>>> init() throws ExecutionException, InterruptedException {
ExecutorService service = Executors.newFixedThreadPool(6);
Future<List<Object>> method_a = service.submit(() -> someService.method_a());
Future<List<Object>> method_b = service.submit(() -> someService.method_b());
Future<List<Object>> method_c = service.submit(() -> someService.method_c());
Future<List<FavouriteConverter>> fav_a = service.submit(() -> someService.fav_a());
Future<List<FavouriteConverter>> fav_b = service.submit(() -> someService.fav_b());
Future<List<FavouriteConverter>> fav_c = service.submit(() -> someService.fav_c());
service.shutdown();
List<FavouriteConverter> combinedFavourite = Stream.of(fav_a.get(), fav_b.get(), fav_c.get()).flatMap(f -> f.stream()).collect(Collectors.toList());
combinedFavourite=combinedFavourite.stream()
.sorted(Comparator.comparing(FavouriteConverter::get_id, Comparator.reverseOrder()))
.limit(25)
.collect(Collectors.toList());
Map<String, List<?>> map = new HashMap<>();
map.put("method_a", method_a.get());
map.put("method_b", method_b.get());
map.put("method_c", method_c.get());
map.put("favourite", combinedFavourite);
return new ResponseEntity<>(map, HttpStatus.OK);
}
First I need the results of fav_a.get(), fav_b.get() and fav_c.get() to build combinedFavourite. If any one of them is delayed, the logic will be wrong. Creating threads is expensive.
Does Stream automatically handle this kind of situation?
If fav_a(), fav_b() and fav_c() finish their jobs earlier than the other methods, how can I move combinedFavourite into another thread? That is, how do I keep a Future<List<FavouriteConverter>> combinedFavourite waiting until fav_a.get(), fav_b.get() and fav_c.get() have finished? (Assume method_a(), method_b() and method_c() are still running.)
No, Streams are not responsible for joining these threads.
Since you wait for the results of these 3 threads and put them into a map which you return, wrapping such logic in a separate thread doesn't help you as long as you have to wait and return the result.
Use ExecutorService::invokeAll to execute all the tasks; it returns a list of Futures once all of them are complete (that is, when Future::isDone is true for each).
List<Future<List<Object>>> list = service.invokeAll(
Arrays.asList(
() -> someService.method_a(),
() -> someService.method_b(),
() -> someService.method_c()
));
Note these are guaranteed:
The result List<Future> is in the same order as the collection of tasks given (according to its Iterator).
All the tasks will run in separate threads if the number of pooled threads is greater than or equal to the number of tasks (assuming no other tasks are using threads from the same thread pool).
This logic helps you to work with complete results.
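If the combination step should start as soon as its three inputs are ready, regardless of the other calls, CompletableFuture can express that dependency directly. The following is a sketch under assumptions: CombineWhenReady and combineTop are hypothetical names, and plain Integer lists stand in for the FavouriteConverter lists from the question.

```java
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class CombineWhenReady {

    // Returns a future holding the 'limit' largest values from all sources,
    // computed as soon as every source future has completed.
    static CompletableFuture<List<Integer>> combineTop(
            int limit, List<CompletableFuture<List<Integer>>> sources) {
        return CompletableFuture
                .allOf(sources.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> sources.stream()
                        .flatMap(f -> f.join().stream()) // already complete, does not block
                        .sorted(Comparator.reverseOrder())
                        .limit(limit)
                        .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Stand-ins for fav_a(), fav_b(), fav_c() and one unrelated call.
            CompletableFuture<List<Integer>> favA = CompletableFuture.supplyAsync(() -> List.of(3, 9), pool);
            CompletableFuture<List<Integer>> favB = CompletableFuture.supplyAsync(() -> List.of(7), pool);
            CompletableFuture<List<Integer>> favC = CompletableFuture.supplyAsync(() -> List.of(1, 8), pool);
            CompletableFuture<String> other = CompletableFuture.supplyAsync(() -> "method_a result", pool);

            CompletableFuture<List<Integer>> combined = combineTop(3, List.of(favA, favB, favC));

            System.out.println(other.join());
            System.out.println(combined.join()); // [9, 8, 7]
        } finally {
            pool.shutdown();
        }
    }
}
```

The request thread still has to wait for everything before returning, so the overall latency does not improve; the gain is only that the sort and limit run while the slower calls are still in flight.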
I am very new to Java and I want to parallelize a nested for loop using an executor service or any other method in Java. I want to create a fixed number of threads so that the CPU is not completely taken up by them.
for(SellerNames sellerNames : sellerDataList) {
for(String sellerName : sellerNames) {
//getSellerAddress(sellerName)
//parallelize this task
}
}
size of sellerDataList = 1000 and size of sellerNames = 5000.
Now I want to create 10 threads and give each thread an equal chunk of the work. That is, for the i'th sellerDataList entry, the first thread should get the addresses for 500 names, the second thread should get the addresses for the next 500 names, and so on.
What is the best way to do this job?
There are two ways to make it run in parallel: Streams and Executors.
Using streams
You can use parallel streams and leave the rest to the JVM. In this case you don't have much control over what happens when. On the other hand, your code will be easy to read and maintain:
sellerDataList.stream().forEach(sellerNames -> {
Stream<String> stream = StreamSupport.stream(sellerNames.spliterator(), true); // true means use parallel stream
stream.forEach(sellerName -> {
getSellerAddress(sellerName);
});
});
Using an ExecutorService
Suppose you want 5 threads and you want to be able to wait until the tasks complete. Then you can use a fixed thread pool with 5 threads and use Futures so you can wait until they are done.
final ExecutorService executor = Executors.newFixedThreadPool(5); // it's just an arbitrary number
final List<Future<?>> futures = new ArrayList<>();
for (SellerNames sellerNames : sellerDataList) {
for (final String sellerName : sellerNames) {
Future<?> future = executor.submit(() -> {
getSellerAddress(sellerName);
});
futures.add(future);
}
}
try {
for (Future<?> future : futures) {
future.get(); // do anything you need, e.g. isDone(), ...
}
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
}
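The loop above submits one task per name. If you would rather hand each thread a chunk of 500 names, as asked in the question, the list can be partitioned first. This is a minimal sketch, assuming the names are available as a List; ChunkedLookup, partition, processInChunks and the lookup function (a stand-in for getSellerAddress) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ChunkedLookup {

    // Splits items into consecutive chunks of at most chunkSize elements.
    static <T> List<List<T>> partition(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < items.size(); from += chunkSize) {
            chunks.add(items.subList(from, Math.min(from + chunkSize, items.size())));
        }
        return chunks;
    }

    // Applies 'lookup' to every item, one pool task per chunk, preserving input order.
    static <T, R> List<R> processInChunks(List<T> items, int chunkSize,
                                          Function<T, R> lookup, ExecutorService executor) {
        List<Callable<List<R>>> jobs = new ArrayList<>();
        for (List<T> chunk : partition(items, chunkSize)) {
            jobs.add(() -> chunk.stream().map(lookup).collect(Collectors.toList()));
        }
        List<R> results = new ArrayList<>();
        try {
            // invokeAll blocks until every chunk is done and keeps the job order.
            for (Future<List<R>> f : executor.invokeAll(jobs)) {
                results.addAll(f.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a lookup failed", e.getCause());
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> names = IntStream.range(0, 5000)
                .mapToObj(i -> "seller-" + i)
                .collect(Collectors.toList());
        ExecutorService executor = Executors.newFixedThreadPool(10);
        try {
            List<String> addresses = processInChunks(names, 500, n -> "address of " + n, executor);
            System.out.println(addresses.size() + " addresses resolved");
        } finally {
            executor.shutdown();
        }
    }
}
```

With 5000 names and a chunk size of 500 this produces exactly 10 jobs, one per pool thread. Submitting one task per name to the same fixed pool gives a similar thread count with finer-grained load balancing, so chunking mostly helps when per-task overhead matters.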
If you are using a parallel stream, you can still control the number of threads by creating your own ForkJoinPool.
List<Long> aList = LongStream.rangeClosed(firstNum, lastNum).boxed()
.collect(Collectors.toList());
ForkJoinPool customThreadPool = new ForkJoinPool(4);
long actualTotal = customThreadPool.submit(
() -> aList.parallelStream().reduce(0L, Long::sum)).get();
Here on this site, it is described very well.
https://www.baeldung.com/java-8-parallel-streams-custom-threadpool
I have two questions:
1. What is the simplest canonical form for running a Callable as a task in Java 8, capturing and processing the result?
2. In the example below, what is the best/simplest/clearest way to hold the main process open until all the tasks have completed?
Here's the example I have so far -- is this the best approach in Java 8 or is there something more basic?
import java.util.*;
import java.util.concurrent.*;
import java.util.function.*;
public class SimpleTask implements Supplier<String> {
private SplittableRandom rand = new SplittableRandom();
final int id;
SimpleTask(int id) { this.id = id; }
@Override
public String get() {
try {
TimeUnit.MILLISECONDS.sleep(rand.nextInt(50, 300));
} catch(InterruptedException e) {
System.err.println("Interrupted");
}
return "Completed " + id + " on " +
Thread.currentThread().getName();
}
public static void main(String[] args) throws Exception {
for(int i = 0; i < 10; i++)
CompletableFuture.supplyAsync(new SimpleTask(i))
.thenAccept(System.out::println);
System.in.read(); // Or else program ends too soon
}
}
Is there a simpler and clearer Java-8 way to do this? And how do I eliminate the System.in.read() in favor of a better approach?
The canonical way to wait for the completion of multiple CompletableFuture instances is to create a new one depending on all of them via CompletableFuture.allOf. You can use this new future to wait for its completion or schedule new follow-up actions just like with any other CompletableFuture:
CompletableFuture.allOf(
IntStream.range(0,10).mapToObj(SimpleTask::new)
.map(s -> CompletableFuture.supplyAsync(s).thenAccept(System.out::println))
.toArray(CompletableFuture<?>[]::new)
).join();
Of course, it always gets simpler if you forego assigning a unique id to each task. Since your first question was about Callable, I’ll demonstrate how you can easily submit multiple similar tasks as Callables via an ExecutorService:
ExecutorService pool = Executors.newCachedThreadPool();
pool.invokeAll(Collections.nCopies(10, () -> {
LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(
ThreadLocalRandom.current().nextInt(50, 300)));
final String s = "Completed on "+Thread.currentThread().getName();
System.out.println(s);
return s;
}));
pool.shutdown();
The executor service returned by Executors.newCachedThreadPool() is unshared and won’t stay alive, even if you forget to invoke shutdown(), but it can then take up to one minute before all threads are terminated.
Since your first question literally was: “What is the simplest canonical form for running a Callable as a task in Java 8, capturing and processing the result?”, the answer might be that the simplest form still is invoking its call() method directly, e.g.
Callable<String> c = () -> {
LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(
ThreadLocalRandom.current().nextInt(50, 300)));
return "Completed on "+Thread.currentThread().getName();
};
String result = c.call();
System.out.println(result);
There’s no simpler way…
Consider collecting the futures into a list. Then you can use join() on each future to await their completion in the current thread:
List<CompletableFuture<Void>> futures = IntStream.range(0,10)
.mapToObj(id -> supplyAsync(new SimpleTask(id)).thenAccept(System.out::println))
.collect(toList());
futures.forEach(CompletableFuture::join);
I have a list of callables and I want to start them all in parallel, give them 5 seconds to complete, and use the results from any of the tasks that finish within that time.
I tried using executorService.invokeAll with a timeout, but in this case they all need to finish before my timeout.
What is the best way to do this using Java 7?
What I do is submit all the tasks and add the Futures to a list.
You can then wait for the timeout, and get all the Futures where isDone() is true.
Alternatively you can call get on each of the Futures with a decreasing timeout based on the amount of time remaining.
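The decreasing-timeout variant could be sketched as follows. DeadlineCollect and collectWithin are hypothetical names. The helper method itself sticks to Java 7 APIs; only the demo in main uses lambdas and List.of for brevity, which need a newer Java version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class DeadlineCollect {

    // Submits all tasks, then collects whatever finishes before one shared deadline.
    static <V> List<V> collectWithin(List<Callable<V>> tasks, long timeout, TimeUnit unit,
                                     ExecutorService executor) {
        List<Future<V>> futures = new ArrayList<>();
        for (Callable<V> task : tasks) {
            futures.add(executor.submit(task)); // returns immediately, tasks run in parallel
        }
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        List<V> results = new ArrayList<>();
        for (Future<V> f : futures) {
            long remaining = deadline - System.nanoTime();
            try {
                // Once the deadline has passed, 'remaining' is not positive and get()
                // only succeeds for tasks that are already done.
                results.add(f.get(remaining, TimeUnit.NANOSECONDS));
            } catch (TimeoutException e) {
                f.cancel(true); // too slow: give up on this task
            } catch (ExecutionException e) {
                // the task itself failed: skip its result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break; // stop waiting, return what we have so far
            }
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            List<Callable<String>> tasks = List.<Callable<String>>of(
                    () -> "fast",
                    () -> { Thread.sleep(2000); return "slow"; });
            System.out.println(collectWithin(tasks, 500, TimeUnit.MILLISECONDS, executor));
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Unlike a fixed sleep, this returns as soon as all tasks are done, and never waits longer than the overall timeout.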
Just check after 5s if the Future is terminated using isDone:
List<Callable<V>> callables = // ...
ExecutorService es = Executors.newFixedThreadPool(callables.size());
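// submit() returns at once; invokeAll() without a timeout would block until every task is done
List<Future<V>> futures = new ArrayList<>();
for (Callable<V> c : callables) {
futures.add(es.submit(c));
}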
// Wait 5s
Thread.sleep(5000);
List<V> terminatedResults = new ArrayList<>();
for(Future<V> f : futures) {
if(f.isDone()) {
terminatedResults.add(f.get());
} else {
// cancel the future?
}
}
// use terminatedResults
Ok, the answers helped me get to the solution. The issue with Logeart's answer is that I want to give them a max time - so if they finish quicker, I get them all (sorry if this wasn't clear in the question).
The other issue is that isDone() does not catch the case when a task is cancelled - you need to use isCancelled(). So, my working solution was:
ExecutorService executorService = Executors.newCachedThreadPool();
List<Callable<Object>> callables = Arrays.asList(
(Callable<Object>) new Check1Callable(),
(Callable<Object>) new Check2Callable(),
(Callable<Object>) new Check3Callable());
List<Future<Object>> futures = new ArrayList<>();
try {
futures = executorService.invokeAll(callables,maxWaitTime, TimeUnit.SECONDS);
} catch (Exception e) {
}
for (Future thisFuture : futures) {
try {
if (thisFuture.isDone() && !thisFuture.isCancelled()) {
<accept the future's result>
}
} catch (Exception e) {
}
}
I have a completable future (future1) which create 10 completable futures (futureN). Is there a way to set future1 as complete only when all futureN are completed?
I am not sure what you mean by "future creates other futures" but if you have many futures and you want to do something when they are completed you can do it this way:
CompletableFuture.allOf(future2, future3, ..., futureN).thenRun(() -> future1.complete(value));
A CompletableFuture is not something that acts so I'm unsure what you mean by
which create 10 completable futures
I'm assuming you mean you submitted a task with runAsync or supplyAsync. My example won't, but the behavior is the same if you do.
Create your root CompletableFuture. Then run some code asynchronously that creates your futures (through an Executor, runAsync, inside a new Thread, or inline with CompletableFuture return values). Collect the 10 CompletableFuture objects and use CompletableFuture#allOf to get a CompletableFuture that will complete when they are all complete (exceptionally or otherwise). You can then add a continuation to it with thenRun to complete your root future.
For example
public static void main(String args[]) throws Exception {
CompletableFuture<String> root = new CompletableFuture<>();
ExecutorService executor = Executors.newSingleThreadExecutor();
executor.submit(() -> {
CompletableFuture<String> cf1 = CompletableFuture.completedFuture("first");
CompletableFuture<String> cf2 = CompletableFuture.completedFuture("second");
System.out.println("running");
CompletableFuture.allOf(cf1, cf2).thenRun(() -> root.complete("some value"));
});
// once the internal 10 have completed (successfully)
root.thenAccept(r -> {
System.out.println(r); // "some value"
});
Thread.sleep(100);
executor.shutdown();
}