SupplyAsync wait for all CompletableFutures to finish - java

I'm running some async tasks below and need to wait until they all finish. I'm not sure why, but join() isn't forcing a wait for all tasks and the code continues executing without waiting. Is there a reason why the stream of joins isn't working as expected?
The CompletableFuture list is just a stream that maps supplyAsync:
List<Integer> items = Arrays.asList(1, 2, 3);
List<CompletableFuture<Integer>> futures = items
        .stream()
        .map(item -> CompletableFuture.supplyAsync(() -> {
            System.out.println("processing");
            // do some processing here
            return item;
        }))
        .collect(Collectors.toList());
And I wait for the futures like this:
CompletableFuture.allOf(futures.toArray(new CompletableFuture[futures.size()]))
        .thenApply(ignored -> futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList()));
I'm able to get the wait working with futures.forEach(CompletableFuture::join); but I wanted to know why my stream approach wasn't working.

This code:
CompletableFuture.allOf(futures.toArray(new CompletableFuture[futures.size()]))
        .thenApply(ignored -> futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList()));
Does not wait for all the futures in futures to complete. It creates a new future that completes only once every future in futures has completed, and it registers your thenApply callback on that combined future. Neither allOf() nor thenApply() blocks; both return immediately.
This means that futures.stream().map(CompletableFuture::join).collect(Collectors.toList()) only runs after all the async executions have already completed, which is why those join() calls return immediately. The real problem is that nothing in allOf(...).thenApply(...) blocks the calling thread, so your code continues executing without waiting for the tasks.
The easiest solution is to extend your first pipeline so that it joins each future and collects the results into an Integer list:
List<Integer> results = items.stream()
        .map(item -> CompletableFuture.supplyAsync(() -> {
            System.out.println("processing " + item);
            // do some processing here
            return item;
        }))
        .collect(Collectors.toList()) //force-submit all
        .stream()
        .map(CompletableFuture::join) //wait for each
        .collect(Collectors.toList());
If you want to use something like your original code, then your second snippet would have to be changed to:
List<Integer> results = futures.stream()
        .map(CompletableFuture::join)
        .collect(Collectors.toList());
That's because CompletableFuture.allOf does not wait; it just combines all the futures into a new one that completes when they all complete:
Returns a new CompletableFuture that is completed when all of the given CompletableFutures complete.
Alternatively, you could still use allOf(), call join() on the future it returns, and then run the code you currently have inside thenApply():
//wrapper future completes when all futures have completed
CompletableFuture.allOf(futures.toArray(new CompletableFuture[futures.size()]))
.join();
//join() calls below return immediately
List<Integer> result = futures.stream()
.map(CompletableFuture::join)
.collect(Collectors.toList());
The join() calls in the last statement return immediately, because the join() call on the wrapper (allOf()) future will already have waited for all the futures passed to it to complete. This is why I don't see a reason to do this when you can use the first approach.

Related

CompletableFuture is Not waiting for child threads

I am trying to wait for processor.processFiles() to complete; the method returns void and is an @Async method. The busy-wait logic does not cause the process to wait for the method to complete.
Can anybody please point out what I am missing?
try {
    filesList.forEach(files -> {
        List<CompletableFuture<Void>> completableFutures = new ArrayList<>();
        files.forEach(file -> {
            CompletableFuture<Void> completableFuture = CompletableFuture.runAsync(() ->
                    processor.processFiles());
            completableFutures.add(completableFuture);
        });
        while (true) {
            Thread.sleep(5000);
            boolean isComplete = completableFutures.stream().allMatch(result -> result.isDone() == true);
            if (isComplete) {
                break;
            }
            LOGGER.info("processing the file...");
        }
    });
}
catch (Exception e) {
}
finally {
    closeConnections();
}
I think you've overcomplicated things.
filesList.stream().flatMap(List::stream).parallel().forEach(file -> processor.processFiles());
The forEach will run in parallel, and return when all of the files have been processed.
At the very least, don't use side effects to populate a List.
List<CompletableFuture<Void>> completableFutures = files.stream()
        .map(file -> CompletableFuture.runAsync(() -> processor.processFiles()))
        .collect(Collectors.toList());
I agree with the comment.
CompletableFuture<Void> all = CompletableFuture.allOf(completableFutures.toArray(new CompletableFuture[0]));
Then you can call get(), which will wait until all the tasks are completed.
Another way to do this would skip the List + CompletableFuture.allOf and just return a single CompletableFuture:
CompletableFuture<Void> all = files.stream()
        .map(file -> CompletableFuture.runAsync(() -> processor.processFiles()))
        .collect(Collectors.reducing(
                CompletableFuture.completedFuture(null), CompletableFuture::allOf));
That maps each file to a CompletableFuture and then merges all of the resulting futures into a single one. You can call get() on it and it will return when everything is finished.
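For example (just a usage sketch; get() throws checked exceptions that need handling):
try {
    all.get(); //blocks until every runAsync task has finished
} catch (InterruptedException | ExecutionException e) {
    throw new RuntimeException(e);
}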

Intermediate executions on CompletableFutures List

I have a list of CompletableFutures where each future has a varying execution time (50-300 ms).
//CompletableFuture[] futures
CompletableFuture<Void> allFutures = CompletableFuture.allOf(futures);
allFutures.whenComplete((result, throwable) -> {
    //Do Something
});
I want to add an intermediate step: after x ms, I want to do some partial processing on the futures that have already completed, and combine the results later.
//CompletableFuture[] futures
SCHEDULER.schedule(() -> {
    for (Future f : futures) {
        //Do Something else
    }
}, 100, TimeUnit.MILLISECONDS);

CompletableFuture<Void> allFutures = CompletableFuture.allOf(futures);
allFutures.whenComplete((result, throwable) -> {
    //Do Something
});
The above doesn't look pretty to me. Is there a better way of doing this? Does CompletableFuture have something out of the box for this?
This can be achieved through:
CompletableFuture's delayedExecutor (Java 9+)
or, using the Guava libraries:
a FluentFuture.withTimeout() conversion.
Both should be given an executor.
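A minimal sketch of the delayedExecutor variant (Java 9+; the partial-processing body is only a placeholder):
//run the partial processing roughly 100 ms after the futures are created
CompletableFuture.runAsync(() -> {
    for (CompletableFuture<?> f : futures) {
        if (f.isDone()) {
            //do some partial processing on the already-completed futures
        }
    }
}, CompletableFuture.delayedExecutor(100, TimeUnit.MILLISECONDS));

CompletableFuture<Void> allFutures = CompletableFuture.allOf(futures);
allFutures.whenComplete((result, throwable) -> {
    //combine results
});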

CompletableFuture Java 8 unusual behavior

I noticed some unusual behavior with CompletableFutures in Java 8 with streaming.
String[] arr = new String[]{"abc", "def", "cde", "ghj"};
ExecutorService executorService = Executors.newFixedThreadPool(10);
List<String> lst =
        Arrays.stream(arr)
                .map(r ->
                        CompletableFuture.supplyAsync(() -> {
                            try {
                                Thread.sleep(5000);
                                return "e";
                            } catch (Exception e) {
                                e.printStackTrace();
                            }
                            return null;
                        }, executorService)
                )
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
This code above takes 4*5000 = 20 seconds to execute, so this means the futures are waiting on one another.
String[] arr = new String[]{"abc", "def", "cde", "ghj"};
ExecutorService executorService = Executors.newFixedThreadPool(10);
List<CompletableFuture<String>> lst =
        Arrays.stream(arr)
                .map(r ->
                        CompletableFuture.supplyAsync(() -> {
                            try {
                                Thread.sleep(5000);
                                return "d";
                            } catch (Exception e) {
                                e.printStackTrace();
                            }
                            return null;
                        }, executorService)
                )
                .collect(Collectors.toList());

List<String> s =
        lst.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
System.out.println(s);
This code, however, runs in 5 seconds, meaning the futures run in parallel.
What I don't understand: in the second example I collect the list of futures explicitly and then join them, which takes 5 seconds; in the first example I keep everything in a single stream pipeline and it seems to wait.
What's the reasoning behind this?
Streams don't necessarily do one stage, then the next. They can compose operations in any order they choose.
So for example,
Arrays.stream(array).map(e -> f(e)).map(e -> g(e)).collect(toList());
can end up being run the same way as
Arrays.stream(array).map(e -> g(f(e))).collect(toList());
...which would have the results you see: the futures are generated one at a time and immediately joined, instead of all being generated up front and then joined.
In point of fact, if you're not doing something async, it's usually more efficient to do it the second way. That way, the stream framework doesn't have to store all the results of f and then all the results of g: it only has to store the results of g(f(e)). The stream framework can't know you're doing async work, so it does the normal, efficient thing.
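To see this element-at-a-time behaviour directly, here is a small illustration (not from the original post) with a print in each stage:
Stream.of(1, 2, 3)
        .map(i -> { System.out.println("f(" + i + ")"); return i; })
        .map(i -> { System.out.println("g(" + i + ")"); return i; })
        .collect(Collectors.toList());
//prints f(1), g(1), f(2), g(2), f(3), g(3): each element flows through both map
//stages before the next element starts, so a supplyAsync immediately followed by
//join in the same pipeline runs the tasks one at a time.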
I think the issue is with the second map call in the original snippet. The map stage is serial and hence calls the blocking CompletableFuture join for each element of the source array, one after another.

How to reorder Stream of CompletableFutures?

I deal with Streams of CompletableFutures. These take different times to complete. Those taking longer block stream processing while others might already have completed (and I know about Parallel Streams)
Therefore I would like to reorder items in a Stream (e.g. with a buffer) to move completed Futures ahead.
For example, this code blocks stream processing if one getUser call takes long
public static Boolean isValid(User user) { ... }
emails.stream()
// not using ::
// getUser() returns CompletableFuture<User>
.map( e -> getUser(e))
// this line blocks Stream processing
.filter( userF -> isValid( userF.get()) )
.map( f -> f.thenApply(User::getName))
and I would like to have something like
emails.stream()
.map( e -> getUser(e))
// this moves Futures into a bounded buffer
// and puts those finished first
// like CompletionService [1]
// and returns a Stream again
.collect(FutureReorderer.collector())
// this is not the same Stream but
// the one created by FutureReorderer.collector()
.filter( userF -> isValid( userF.get()) )
.map( f -> f.thenApply(User::getName))
[1] For example, CompletionService https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ExecutorCompletionService.html returns completed tasks when calling take() and blocks otherwise. But CompletionService does not take futures; would one need to do cs.submit( () -> f.get() )?
How would I do that?
[Edit]
Changed example to include filter()
Added comment
Added CompletionService link
Having more context would definitely help in tailoring the answer - I have a feeling that the problem is somewhere else and can be solved in an easier way.
But if your question is how to somehow keep completed futures at the beginning, there are a few options:
Sorting the Stream using a custom Comparator:
.sorted(Comparator.comparing(f -> !f.isDone()))
Keep in mind that isDone returns true not only when a future completes successfully.
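For instance, a future that completed exceptionally also counts as done and would be sorted to the front (a small illustration, not from the original answer):
CompletableFuture<String> failed = new CompletableFuture<>();
failed.completeExceptionally(new RuntimeException("boom"));
System.out.println(failed.isDone()); // true
System.out.println(failed.isCompletedExceptionally()); // true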
Storing futures in a PriorityQueue
PriorityQueue<CompletableFuture<String>> queue
= new PriorityQueue<>(Comparator.comparing(f -> !f.isDone()));
When polling elements, the queue will return them according to the provided ordering.
Here it is in action:
PriorityQueue<CompletableFuture<String>> queue
        = new PriorityQueue<>(Comparator.comparing(f -> !f.isDone()));

queue.add(CompletableFuture.supplyAsync(() -> {
    try {
        Thread.sleep(Integer.MAX_VALUE);
    } catch (InterruptedException e) { }
    return "42";
}));
queue.add(CompletableFuture.completedFuture("completed"));

queue.poll(); // "completed"
queue.poll(); // still going on
It's important to remember that if you do want to convert the PriorityQueue to a Stream, you can't simply use stream() - that will not preserve the priority order.
This is the right way to go:
Stream.generate(queue::poll).limit(queue.size())
I assume the requirement in the OP is to execute getUser concurrently and process the resulting futures in completion order. Here is a solution using ExecutorCompletionService:
final CompletionService<User> ecs = new ExecutorCompletionService<>(executor);
emails.stream()
        .map(e -> ecs.submit(() -> getUser(e).get()))
        .collect(Collectors.collectingAndThen(Collectors.toList(), fs -> fs.stream())) // collect the future list for concurrent execution
        .map(f -> {
            try {
                return ecs.take().get();
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        })
        .filter(u -> isValid(u)).map(User::getName)... //TODO;
Or:
final BlockingQueue<Future<User>> queue = new ArrayBlockingQueue<>(emails.size());
final CompletionService<User> ecs = new ExecutorCompletionService<>(executor, queue);
emails.stream().forEach(e -> ecs.submit(() -> getUser(e).get()));
IntStream.range(0, emails.size())
        .mapToObj(i -> {
            try {
                return queue.take().get(); // take() blocks until the next completed future is available
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        })
        .filter(u -> isValid(u)).map(User::getName);
It's simple but not straightforward.

Convert a list of CompletableFutures to one CompletableFuture of a list

I have a list of CompletableFuture instances.
List<CompletableFuture<String>> listOfFutures;
How can I convert them to one future like this:
CompletableFuture<List<String>> futureOfList = convert(listOfFutures);
This is a monadic sequence operation. With cyclops-monad-api (a library I wrote), you can write:
AnyM<Stream<String>> futureStream = AnyMonads.sequence(
AsAnyMList.completableFutureToAnyMList(futures));
CompletableFuture<Stream<String>> futureOfList = futureStream.unwrap();
When you call a terminal operation on the Stream inside futureOfList, e.g. to convert it to a List, it will trigger the join() call on all the original futures, so it should be used in a similar manner to join() itself.
CompletableFuture<List<String>> completed = futureOfList.thenApply(
        s -> s.collect(Collectors.toList()));
To write your own version specifically for CompletableFuture, you could do something like this:
CompletableFuture<Stream<String>> futureOfList = CompletableFuture.completedFuture(1)
        .thenApply(one -> listOfFutures.stream()
                .map(cf -> cf.join()));
Then to join
CompletableFuture<List<String>> completed = futureOfList.thenApply(
        s -> s.collect(Collectors.toList()));
See also this question and answer for a solution using allOf (which won't block any additional threads).
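For reference, that allOf-based conversion usually looks something like this (a sketch of the approach, not the linked answer verbatim):
public static <T> CompletableFuture<List<T>> convert(List<CompletableFuture<T>> futures) {
    return CompletableFuture
            .allOf(futures.toArray(new CompletableFuture[0]))
            .thenApply(ignored -> futures.stream()
                    .map(CompletableFuture::join) // already completed here, so join() doesn't block
                    .collect(Collectors.toList()));
}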
You can do it like this:
// assumes static imports for Stream.concat, Collectors.toList,
// CompletableFuture.completedFuture and Collections.emptyList
public static <T> CompletableFuture<List<T>> convert(List<CompletableFuture<T>> futures) {
    return futures.stream().
            map(f -> f.thenApply(Stream::of)).
            reduce((a, b) -> a.thenCompose(xs -> b.thenApply(ys -> concat(xs, ys)))).
            map(f -> f.thenApply(s -> s.collect(toList()))).
            orElse(completedFuture(emptyList()));
}
