Run IO computations in parallel in Java 8

I'm familiar with functional programming languages, mostly Scala and JavaScript. I'm working on a Java 8 project and am not sure how I'm supposed to run through a list/stream of items, perform some side effect for each of them in parallel using a custom thread pool, and return an object on which it's possible to listen for completion (whether it succeeds or fails).
Currently I have the following code. It seems to work (I'm using the Play framework's Promise implementation as the return type), but it's not ideal because ForkJoinPool is not meant for IO-intensive computations in the first place.
public static F.Promise<Void> performAllItemsBackup(Stream<Item> items) {
ForkJoinPool pool = new ForkJoinPool(3);
ForkJoinTask<F.Promise<Void>> result = pool
.submit(() -> {
try {
items.parallel().forEach(performSingleItemBackup);
return F.Promise.<Void>pure(null);
} catch (Exception e) {
return F.Promise.<Void>throwing(e);
}
});
try {
return result.get();
} catch (Exception e) {
throw new RuntimeException("Unable to get result", e);
}
}
Can someone give me a more idiomatic implementation of the above function? Ideally not using the ForkJoinPool, using a more standard return type, and most recent Java8 APIs? Not sure what I'm supposed to use between CompletableFuture, CompletionStage, ForkJoinTask...

A canonical solution would be
public static CompletableFuture<Void> performAllItemsBackup(Stream<Item> items) {
ForkJoinPool pool = new ForkJoinPool(3);
try {
return CompletableFuture.allOf(
items.map(CompletableFuture::completedFuture)
.map(f -> f.thenAcceptAsync(performSingleItemBackup, pool))
.toArray(CompletableFuture<?>[]::new));
} finally {
pool.shutdown();
}
}
Note that the interaction between ForkJoin pool and parallel streams is an unspecified implementation detail you should not rely on. In contrast, CompletableFuture provides a dedicated API for providing an Executor. It doesn’t even have to be a ForkJoinPool:
public static CompletableFuture<Void> performAllItemsBackup(Stream<Item> items) {
ExecutorService pool = Executors.newFixedThreadPool(3);
try {
return CompletableFuture.allOf(
items.map(CompletableFuture::completedFuture)
.map(f -> f.thenAcceptAsync(performSingleItemBackup, pool))
.toArray(CompletableFuture<?>[]::new));
} finally {
pool.shutdown();
}
}
In either case, you should shut down the executor explicitly instead of relying on automatic cleanup.
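For readers who want to try the fixed-thread-pool variant in isolation, here is a minimal, self-contained sketch. The class name `ParallelBackup`, the method `runAll`, and the use of plain strings as items are assumptions made for illustration; they are not part of the question's code base. It uses `runAsync` per item rather than `completedFuture(...).thenAcceptAsync(...)`, which should behave the same way for this purpose.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;
import java.util.stream.Stream;

// Hypothetical stand-in for the question's backup code; names are illustrative only.
final class ParallelBackup {

    // Runs `action` for every item on a dedicated pool of `threads` workers and
    // returns a future that completes once all of them have finished.
    static <T> CompletableFuture<Void> runAll(Stream<T> items, Consumer<? super T> action, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            return CompletableFuture.allOf(
                items.map(item -> CompletableFuture.runAsync(() -> action.accept(item), pool))
                     .toArray(CompletableFuture<?>[]::new));
        } finally {
            // shutdown() rejects new tasks but lets the already submitted ones finish.
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        ConcurrentLinkedQueue<String> done = new ConcurrentLinkedQueue<>();
        CompletableFuture<Void> all = runAll(Stream.of("a", "b", "c", "d"), done::add, 3);
        all.join(); // block only here, for the sake of the demo
        System.out.println("backed up " + done.size() + " items");
    }
}
```

Because every task is submitted before `shutdown()` is called, the pool drains the queued work and then releases its threads on its own.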
If you need a F.Promise<Void> result, you can use
public static F.Promise<Void> performAllItemsBackup(Stream<Item> items) {
ExecutorService pool = Executors.newFixedThreadPool(3);
try {
return CompletableFuture.allOf(
items.map(CompletableFuture::completedFuture)
.map(f -> f.thenAcceptAsync(performSingleItemBackup, pool))
.toArray(CompletableFuture<?>[]::new))
.handle((v, e) -> e!=null? F.Promise.<Void>throwing(e): F.Promise.pure(v))
.join();
} finally {
pool.shutdown();
}
}
But note that this, like your original code, only returns when the operation has been completed, while the methods returning a CompletableFuture allow the operations to run asynchronously until the caller invokes join or get.
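To illustrate what "listening for completion" can look like without calling join inside the method, here is a small sketch. `CompletionListenerDemo`, `backup`, and `describe` are hypothetical names, and the rule that the item "bad" fails is invented purely to exercise the failure path.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Stream;

final class CompletionListenerDemo {

    // Hypothetical side effect: fails for the item "bad", succeeds otherwise.
    static void backup(String item) {
        if (item.equals("bad")) {
            throw new IllegalStateException("cannot back up " + item);
        }
    }

    // Returns immediately; the work continues on the pool.
    static CompletableFuture<Void> backupAll(Stream<String> items) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            return CompletableFuture.allOf(
                items.map(item -> CompletableFuture.runAsync(() -> backup(item), pool))
                     .toArray(CompletableFuture<?>[]::new));
        } finally {
            pool.shutdown();
        }
    }

    // Maps the outcome to a message instead of rethrowing.
    static CompletableFuture<String> describe(CompletableFuture<Void> all) {
        return all.handle((ignored, error) -> {
            if (error == null) {
                return "success";
            }
            // Failures of dependent stages arrive wrapped in a CompletionException.
            Throwable cause = (error instanceof CompletionException && error.getCause() != null)
                    ? error.getCause() : error;
            return "failure: " + cause.getMessage();
        });
    }
}
```

A caller can chain `describe(backupAll(items)).thenAccept(System.out::println)` and carry on with other work; nothing blocks unless the caller chooses to call join or get.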
To return a truly asynchronous Promise, you have to wrap the entire operation, e.g.
public static F.Promise<Void> performAllItemsBackup(Stream<Item> stream) {
return F.Promise.pure(stream).flatMap(items -> {
ExecutorService pool = Executors.newFixedThreadPool(3);
try {
return CompletableFuture.allOf(
items.map(CompletableFuture::completedFuture)
.map(f -> f.thenAcceptAsync(performSingleItemBackup, pool))
.toArray(CompletableFuture<?>[]::new))
.handle((v, e) -> e!=null? F.Promise.<Void>throwing(e): F.Promise.pure(v))
.join();
} finally {
pool.shutdown();
}
});
}
But it’s better to settle on one API instead of jumping back and forth between two different APIs.

Related

Spring Reactor: adding delay but in an NON blocking way

A small question on how to add a delay to a method in a non-blocking way, please.
A very popular way to simulate long processes is to use Thread.sleep().
However, in Project Reactor this is a blocking operation.
And it is well known that in a reactive project we should not block.
I would like to experiment and simulate long processes: some sort of method that takes a lot of time, but in a NON-blocking way, WITHOUT swapping threads. This is to simulate a method that is just very lengthy, but proven NON-blocking by BlockHound etc.
This construct is very popular:
@Test
public void simulateLengthyProcessingOperationReactor() {
Flux.range(1,5000)
.map(a -> simulateLengthyProcessingOperation(a))
.subscribe(System.out::println);
}
public String simulateLengthyProcessingOperation(Integer input) {
simulateDelayBLOCKING();
return String.format("[%d] on thread [%s] at time [%s]", input, Thread.currentThread().getName(), new Date());
}
public void simulateDelayBLOCKING() {
try {
Thread.sleep(4000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
But it is blocking.
(I know there is the Mono.fromCallable(() -> but this is not the question)
Is it possible to do the same, simulate delay, but NON blocking please?
Also, .delay will not achieve the expected result (simulating a NON blocking lengthy method on the same reactive pipeline)
@Test
public void simulateLengthyProcessingOperationReactor() {
Flux.range(1,5000)
.map(a -> simulateLengthyProcessingOperation(a))
.subscribe(System.out::println);
}
public String simulateLengthyProcessingOperation(Integer input) {
simulateDelay_NON_blocking();
return String.format("[%d] on thread [%s] at time [%s]", input, Thread.currentThread().getName(), new Date());
}
public void simulateDelay_NON_blocking() {
//simulate lengthy process, but WITHOUT blocking
}
Thank you
Of course you can; there is a family of .delay...() methods.
You can, for example, read about the delayElements() method here:
https://projectreactor.io/docs/core/release/api/reactor/core/publisher/Flux.html#delayElements-java.time.Duration-
Be aware that it switches execution to another Scheduler:
Signals are delayed and continue on the parallel default Scheduler.
In the simplest case it would look like this:
public void simulateLengthyProcessingOperationReactor() {
Flux.range(1,5000)
.delayElements(Duration.ofMillis(1000L)) // delay each element for 1000 millis
.subscribe(System.out::println);
}
For your case, you could write your code like this:
@Test
public void simulateLengthyProcessingOperationReactor() {
Flux.range(1,5000)
.concatMap(this::simulateDelay_NON_blocking)
.subscribe(System.out::println);
}
public Mono<String> simulateDelay_NON_blocking(Integer input) {
//simulate lengthy process, but WITHOUT blocking
return Mono.delay(Duration.ofMillis(1000L))
.map(__ -> String.format("[%d] on thread [%s] at time [%s]",
input, Thread.currentThread().getName(), new Date()));
}

Is CompletableFuture followed immediately by a get efficient?

I just found the following code. It submits an asynchronous task but immediately gets the result (so, if I understand correctly, it blocks the current thread until the result is available).
Is it efficient?
public String myMethod() {
CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> {
// my long call to an external API
return "theResult";
});
try {
return future.get(FUTURE_TIMEOUT_DURATION, TimeUnit.MINUTES);
} catch (Exception e) {
throw new RuntimeException(e);
}
}
If the timeout is handled correctly in the call to the external API, do I need this CompletableFuture?
Is it OK to simplify the code like this?
public String myMethod() {
// my long call to an external API
return "theResult";
}
If you don't expect any problems with the timeout, you can most probably remove the code related to the future.
However, there is a possibility that the code uses thread-local variables or otherwise relies on being executed in a separate thread.
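To make concrete what the wrapper buys when the external call has no reliable timeout of its own, here is a sketch. `TimeoutWrapper`, `callWithTimeout`, and `slowCall` are hypothetical names; the slow call is simulated with a sleep and does not represent any real API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

final class TimeoutWrapper {

    // Runs `call` on the common pool and waits at most `timeoutMillis` for it.
    // Returns `fallback` if the call does not finish in time.
    static String callWithTimeout(Supplier<String> call, long timeoutMillis, String fallback) {
        CompletableFuture<String> future = CompletableFuture.supplyAsync(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Marks the future as cancelled; it does not interrupt the running task.
            future.cancel(true);
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        }
    }

    // Hypothetical slow external call, simulated with a sleep.
    static String slowCall() {
        try {
            Thread.sleep(2_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "theResult";
    }
}
```

The caller thread is still blocked while it waits, so the future adds a thread hand-off and an upper bound on the wait, not concurrency. If the external client already enforces its own timeout, the direct call is simpler.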

Mono vs CompletableFuture

CompletableFuture executes a task on a separate thread (it uses a thread pool) and provides a callback function. Let's say I have an API call in a CompletableFuture. Is that API call blocking? Would the thread be blocked until it gets a response from the API? (I know the main thread/Tomcat thread will be non-blocking, but what about the thread on which the CompletableFuture task is executing?)
Mono is completely non-blocking, as far as I know.
Please shed some light on this and correct me if I am wrong.
CompletableFuture is Async. But is it non-blocking?
One thing that is true about CompletableFuture is that it is truly async: it allows you to run your task asynchronously from the caller thread, and APIs such as thenXXX allow you to process the result when it becomes available. On the other hand, CompletableFuture is not always non-blocking. For example, when you run the following code, it will be executed asynchronously on the default ForkJoinPool:
CompletableFuture.supplyAsync(() -> {
try {
Thread.sleep(1000);
}
catch (InterruptedException e) {
}
return 1;
});
It is clear that the thread in the ForkJoinPool that executes the task will be blocked for the duration of the sleep, which means that we can't guarantee that the call will be non-blocking.
On the other hand, CompletableFuture exposes API which allows you to make it truly non-blocking.
For example, you can always do the following:
public CompletableFuture myNonBlockingHttpCall(Object someData) {
var uncompletedFuture = new CompletableFuture(); // creates uncompleted future
myAsyncHttpClient.execute(someData, (result, exception) -> {
if(exception != null) {
uncompletedFuture.completeExceptionally(exception);
return;
}
uncompletedFuture.complete(result);
});
return uncompletedFuture;
}
As you can see, the CompletableFuture API provides the complete and completeExceptionally methods, which let you complete the future whenever needed without blocking any thread.
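A runnable version of this callback-bridging idea might look like the sketch below. `FakeAsyncClient` is a made-up stand-in for a callback-style HTTP client (a timer thread plays the role of the network layer); it is an assumption for illustration and not a real library.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

final class CallbackBridge {

    // Hypothetical callback-style client; a daemon timer thread simulates the I/O layer.
    static final class FakeAsyncClient {
        private final ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "fake-io");
                    t.setDaemon(true);
                    return t;
                });

        // Invokes the callback about 50 ms later with either a result or an error.
        void execute(String request, BiConsumer<String, Throwable> callback) {
            timer.schedule(() -> {
                if (request.isEmpty()) {
                    callback.accept(null, new IllegalArgumentException("empty request"));
                } else {
                    callback.accept("response to " + request, null);
                }
            }, 50, TimeUnit.MILLISECONDS);
        }
    }

    // Bridges the callback API to a CompletableFuture without parking any thread.
    static CompletableFuture<String> call(FakeAsyncClient client, String request) {
        CompletableFuture<String> result = new CompletableFuture<>();
        client.execute(request, (value, error) -> {
            if (error != null) {
                result.completeExceptionally(error);
            } else {
                result.complete(value);
            }
        });
        return result; // returned immediately, before the response arrives
    }
}
```

No thread sleeps or waits on the response here: the future is completed from the client's callback, which is the property the answer describes.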
Mono vs CompletableFuture
In the previous section, we got an overview of CF behavior, but what is the central difference between CompletableFuture and Mono?
It is worth mentioning that we can write a blocking Mono as well. Nothing prevents us from writing the following:
Mono.fromCallable(() -> {
try {
Thread.sleep(1000);
}
catch (InterruptedException e) {
}
return 1;
})
Of course, once we subscribe to the Mono, the caller thread will be blocked. But we can always work around that by adding a subscribeOn operator. Nevertheless, the broader API of Mono is not the key feature.
To understand the main difference between CompletableFuture and Mono, let's go back to the previously mentioned myNonBlockingHttpCall method.
public CompletableFuture myUpperLevelBusinessLogic() {
var future = myNonBlockingHttpCall();
// ... some code
if (something) {
// oh we don't really need anything, let's just throw an exception
var errorFuture = new CompletableFuture();
errorFuture.completeExceptionally(new RuntimeException());
return errorFuture;
}
return future;
}
In the case of CompletableFuture, once the method is called, it will eagerly execute the HTTP call to another service/resource. Even if we do not really need the result of the execution after verifying some pre/post conditions, the execution starts anyway, and additional CPU, DB connections, or whatever machine resources will be allocated for this work.
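The eagerness is easy to observe with plain JDK classes. In the sketch below, `EagerVsDeferred` and `expensiveCall` are hypothetical names, and wrapping the future in a `Supplier` is a workaround chosen for this illustration, not something the CompletableFuture API offers as a built-in lazy mode.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class EagerVsDeferred {

    static final AtomicInteger calls = new AtomicInteger();

    // Hypothetical expensive call; here it only counts its invocations.
    static String expensiveCall() {
        calls.incrementAndGet();
        return "payload";
    }

    // Eager: the call is scheduled as soon as this method is invoked.
    static CompletableFuture<String> eager() {
        return CompletableFuture.supplyAsync(EagerVsDeferred::expensiveCall);
    }

    // Deferred: nothing is scheduled until the caller invokes get() on the supplier.
    static Supplier<CompletableFuture<String>> deferred() {
        return () -> CompletableFuture.supplyAsync(EagerVsDeferred::expensiveCall);
    }
}
```

Note that each `get()` on the supplier starts a fresh call, so this defers the work but does not cache the result.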
In contrast, the Mono type is lazy by definition:
public Mono myNonBlockingHttpCallWithMono(Object someData) {
return Mono.create(sink -> {
myAsyncHttpClient.execute(someData, (result, exception) -> {
if(exception != null) {
sink.error(exception);
return;
}
sink.success(result);
});
});
}
public Mono myUpperLevelBusinessLogic() {
var mono = myNonBlockingHttpCallWithMono();
// ... some code
if (something) {
// oh we don't really need anything, let's just throw an exception
return Mono.error(new RuntimeException());
}
return mono;
}
In this case, nothing will happen until the final mono is subscribed. Only when the Mono returned by the myNonBlockingHttpCallWithMono method is subscribed will the logic provided to Mono.create(Consumer) be executed.
And we can go even further and make our execution even lazier. As you might know, Mono extends Publisher from the Reactive Streams specification. The standout feature of Reactive Streams is backpressure support. Thus, using the Mono API, we can start execution only when the data is really needed and our subscriber is ready to consume it:
Mono.create(sink -> {
AtomicBoolean once = new AtomicBoolean();
sink.onRequest(__ -> {
if(!once.get() && once.compareAndSet(false, true)) {
myAsyncHttpClient.execute(someData, (result, exception) -> {
if(exception != null) {
sink.error(exception);
return;
}
sink.success(result);
});
}
});
});
In this example, we start the call only when the subscriber calls Subscription#request, thereby declaring its readiness to receive data.
Summary
CompletableFuture is async and can be non-blocking
CompletableFuture is eager. You can't postpone the execution, but you can cancel it (which is better than nothing).
Mono is async/non-blocking and can easily execute any call on a different thread by composing the main Mono with different operators.
Mono is truly lazy and allows postponing the start of execution until a subscriber is present and ready to consume data.
Building up on Oleh's answer, a possible lazy solution for CompletableFuture would be
public CompletableFuture myNonBlockingHttpCall(CompletableFuture<ExecutorService> dispatch, Object someData) {
var uncompletedFuture = new CompletableFuture(); // creates uncompleted future
dispatch.thenAccept(x -> x.submit(() -> {
myAsyncHttpClient.execute(someData, (result, exception) -> {
if(exception != null) {
uncompletedFuture.completeExceptionally(exception);
return;
}
uncompletedFuture.complete(result);
});
}));
return uncompletedFuture;
}
Then, later on you simply do
dispatch.complete(executor);
That would make CompletableFuture equivalent to Mono, but without backpressure, I guess.
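The gating idea can be tried out with a stripped-down sketch like the following. `GatedFuture` and `whenDispatched` are hypothetical names, and a plain `Supplier` stands in for the asynchronous HTTP client, so this shows only the deferral mechanism and not the non-blocking I/O part.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Supplier;

final class GatedFuture {

    // Registers `task` to run on whatever executor `dispatch` is eventually completed with.
    // Nothing runs until someone calls dispatch.complete(executor).
    static <T> CompletableFuture<T> whenDispatched(CompletableFuture<Executor> dispatch, Supplier<T> task) {
        CompletableFuture<T> result = new CompletableFuture<>();
        dispatch.thenAccept(executor -> executor.execute(() -> {
            try {
                result.complete(task.get());
            } catch (Throwable t) {
                result.completeExceptionally(t);
            }
        }));
        return result;
    }
}
```

Completing `dispatch` plays a role loosely similar to subscribing to a Mono: it is the signal that releases the work. Unlike a Mono, the gate opens only once and applies to every task registered on it.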

How to fail a java.util.concurrent.Future

There is pretty heavy use of io.vertx.core.Future in the vertx ecosystem:
https://vertx.io/docs/apidocs/io/vertx/core/Future.html
An example of using Vertx Future is here:
private Future<Void> prepareDatabase() {
Future<Void> future = Future.future();
dbClient = JDBCClient.createShared(vertx, new JsonObject(...));
dbClient.getConnection(ar -> {
if (ar.failed()) {
LOGGER.error("Could not open a database connection", ar.cause());
future.fail(ar.cause()); // here
return;
}
SQLConnection connection = ar.result();
connection.execute(SQL_CREATE_PAGES_TABLE, create -> {
connection.close();
if (create.failed()) {
future.fail(create.cause()); // here
} else {
future.complete();
}
});
});
return future;
}
I was under the impression that io.vertx.core.Future had something to do with java.util.concurrent.Future, but it appears that it doesn't. As you can see, the way to tell a Vertx future to fail is to call its fail() method.
On the other hand, we have CompletableFuture which is an implementation of the java.util.concurrent.Future interface:
https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html
I don't see a fail method on the CompletableFuture, I only see "resolve()".
So my guess is that the only way to fail a CompletableFuture is to throw an Exception?
CompletableFuture<String> f = CompletableFuture.supplyAsync(() -> {
if (somethingWentWrong) { // hypothetical condition; an unconditional throw would make the return unreachable
throw new RuntimeException("fail this future");
}
return "This would be the success result";
});
Besides throwing an error, is there a way to "fail" a CompletableFuture?
In other words, using a Vertx Future, we just call f.fail(), but what about with a CompletableFuture?
CompletableFuture encourages you to throw exceptions from the supplyAsync() lambda to signal failures.
As mentioned in the comments, there's also the completeExceptionally() method, which you can use when you have a CompletableFuture at hand and would like to fail it.
https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html#completeExceptionally-java.lang.Throwable-
Since Java 9, there's also the static CompletableFuture.failedFuture(Throwable ex) factory, if you want to return an already failed future.
https://docs.oracle.com/javase/9/docs/api/java/util/concurrent/CompletableFuture.html#failedFuture-java.lang.Throwable-
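As a small illustration of failing a future by hand, in the spirit of Vert.x's fail(), here is a sketch that sticks to the Java 8 API. `FailFuture` and `lookup` are hypothetical names invented for this example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

final class FailFuture {

    // Completes the future by hand: exceptionally for an invalid key, normally otherwise.
    static CompletableFuture<String> lookup(String key) {
        CompletableFuture<String> future = new CompletableFuture<>();
        if (key == null || key.isEmpty()) {
            future.completeExceptionally(new IllegalArgumentException("key must not be empty"));
        } else {
            future.complete("value for " + key);
        }
        return future;
    }

    public static void main(String[] args) {
        System.out.println(lookup("a").join());
        try {
            lookup("").join();
        } catch (CompletionException e) {
            // join() wraps the original exception in a CompletionException.
            System.out.println("failed: " + e.getCause().getMessage());
        }
    }
}
```

On Java 9 and later, the failing branch could instead return `CompletableFuture.failedFuture(...)` directly.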

How to DRY exception handling with Java 8 CompletableFuture when code throws exception?

I don't know about you, but I find it very annoying to see a piece of code repeated, and I came across the following scenario when using services that throw exceptions. As shown below, in each CompletableFuture block I have to do exception handling, and that part is repeated over and over, depending on how many CompletableFutures you have.
CompletableFuture<Void> future1Of15 = CompletableFuture.supplyAsync(() -> {
List<SomePojo> somePojos = null;
try {
somePojos = someServiceThatThrowsException.getAll(SomePojo.class);
} catch (SomeException e) {
//Handle the exception
e.printStackTrace();
}
return somePojos;
}).thenAcceptAsync(result -> { /* do something with the result */ });
CompletableFuture<Void> future2Of15 = CompletableFuture.supplyAsync(() -> {
List<OtherPojo> otherPojos = null;
try {
otherPojos = someServiceThatThrowsException.getAll(OtherPojo.class);
} catch (SomeException e) {
//Handle the exception
e.printStackTrace();
}
return otherPojos;
}).thenAcceptAsync(result -> { /* do something with the result */ });
Now repeat the above x number of times and you notice that the try/catch block is repeated. In my case, I have around 15-20 such calls.
Is there a way that I could turn the above into 1 or 2 lines of code? In other words, how do I stop repeating myself with regard to exception handling inside the supplyAsync lambda?
Just add a method to your class that does all of the repeated code, and takes a Consumer<List<?>> as an argument to pass to thenAcceptAsync on the last line.
private CompletableFuture<Void> getAndAcceptAsync(Consumer<List<?>> resultProcessor) {
return CompletableFuture.supplyAsync(() -> {
List<SomePojo> somePojos = null;
try {
somePojos = someServiceThatThrowsException.getAll(SomePojo.class);
} catch (SomeException e) {
//Handle the exception
e.printStackTrace();
}
return somePojos;
}).thenAcceptAsync(resultProcessor);
}
You can then call this as many times as you need to.
future1Of15 = getAndAcceptAsync(result -> { /* do something */ });
future2Of15 = getAndAcceptAsync(result -> { /* do something else */ });
There are patterns in functional programming for handling effects such as failure. One such pattern is the Try monad, for which vavr, for example, provides an implementation in Java.
Those patterns abstract a lot of boilerplate away via declarative APIs:
CompletableFuture
.supplyAsync(() -> Try.of(() -> methodThatThrows()))
.thenAccept(res -> System.out.println(res));
Or if you aren't bound to using CompletableFuture, you may choose to use the Future monad to further reduce boilerplate code:
Future.of(() -> methodThatThrows())
.onComplete(result -> System.out.println(result));
Either way, what you end up with as the result of that Future<T> is a Try<T>, which can be either a Success<T> or a Failure which you can deal with accordingly.
I do not claim that this is the only approach or the most effective approach, I am sharing the solution that helped me DRY and it might or might not work for you. If you have a better solution, please share it.
I created the following utility method
@Component
public class CompletableFutureUtil {
@Autowired
private SomeGenericServiceThatThrowsException genericService;
public <T> CompletableFuture<Collection<T>> fireupCompletableFutureAsynchronously(Class<T> type) {
CompletableFuture<Collection<T>> future = CompletableFuture.supplyAsync(() -> {
Collection<T> mystery = null;
try {
mystery = genericService.list(type);
} catch (SomeException e) {
e.printStackTrace();
//Handle the exception
}
return mystery;
});
return future;
}
}
And now I can reuse the above utility method in the following way after autowiring it
@Autowired private CompletableFutureUtil futureUtil;
The calls basically become one to two lines max
futureUtil.fireupCompletableFutureAsynchronously(SomePojo.class)
.thenAcceptAsync(result -> { /* do something with result */ });
futureUtil.fireupCompletableFutureAsynchronously(OtherPojo.class)
.thenAcceptAsync(result -> { /* do something with result */ });
Happy DRY+ing
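Another way to keep the try/catch in a single place, without tying the helper to one service, is a small adapter around a throwing supplier. This is only a sketch: `Unchecked` and `ThrowingSupplier` are hypothetical names. Unlike the snippets above, it propagates the failure through the future instead of printing the stack trace and returning null.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Supplier;

final class Unchecked {

    // Hypothetical functional interface for code that may throw checked exceptions.
    @FunctionalInterface
    interface ThrowingSupplier<T> {
        T get() throws Exception;
    }

    // Adapts a throwing supplier to a plain Supplier, so the try/catch lives in one place.
    static <T> Supplier<T> wrap(ThrowingSupplier<T> supplier) {
        return () -> {
            try {
                return supplier.get();
            } catch (RuntimeException e) {
                throw e;
            } catch (Exception e) {
                // Checked exceptions are wrapped so the future completes exceptionally.
                throw new CompletionException(e);
            }
        };
    }

    // Convenience entry point mirroring CompletableFuture.supplyAsync.
    static <T> CompletableFuture<T> supplyAsync(ThrowingSupplier<T> supplier) {
        return CompletableFuture.supplyAsync(wrap(supplier));
    }
}
```

A call site then shrinks to something like `Unchecked.supplyAsync(() -> service.getAll(SomePojo.class)).thenAcceptAsync(...)`, where `service` stands for whatever throwing service is in use, and error handling can be attached once with `exceptionally` or `handle`.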
