Code running on main thread even with subscribeOn specified - java

I'm in the process of migrating an AsyncTaskLoader to RxJava, trying to understand all the details about the RxJava approach to concurrency. Simple things were running ok, however I'm struggling with the following code:
This is the top level method that gets executed:
mCompositeDisposable.add(mDataRepository
    .getStuff()
    .subscribeOn(mSchedulerProvider.io())
    .subscribeWith(...));
mDataRepository.getStuff() looks like this:
public Observable<StuffResult> getStuff() {
    return mDataManager
        .listStuff()
        .flatMap(stuff -> Observable.just(new StuffResult(stuff)))
        .onErrorReturn(throwable -> new StuffResult(null));
}
And the final layer:
public Observable<Stuff> listStuff() {
Log.d(TAG, ".listStuff() - "+Thread.currentThread().getName());
String sql = <...>;
return mBriteDatabase.createQuery(Stuff.TABLE_NAME, sql).mapToList(mStuffMapper);
}
So with the code above, the log prints .listStuff() - main, which is not what I'm looking for, and I'm not really sure why. I was under the impression that by setting subscribeOn, every event pulled from the chain would be processed on the thread specified in the subscribeOn method.
What I think is happening, is that the source-aka-final-layer code, before reaching mBriteDatabase, is not from the RxJava world and therefore is not an event until createQuery is called. So I probably need some sort of a wrapper? I've tried applying .fromCallable, however that's a wrapper for non Rx code, and my database layer returns an observable...

Your Log.d call happens
immediately when listStuff gets called
which is immediately after getStuff gets called
which is the first thing happening in the top level code fragment you show us.
If you need to do it when the subscription happens, you need to be explicit:
public Observable<Stuff> listStuff() {
    String sql = <...>;
    return mBriteDatabase.createQuery(Stuff.TABLE_NAME, sql)
        .mapToList(mStuffMapper)
        // RxJava 2: doOnSubscribe takes a Consumer<Disposable>, so the lambda needs one parameter
        .doOnSubscribe(disposable -> Log.d(TAG, ".listStuff() - " + Thread.currentThread().getName()));
}
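The distinction the answer draws (method body runs when the method is called, deferred work runs where it is scheduled) can be illustrated without RxJava. The following is only a plain-JDK analogy, not RxJava code: the class name, the "io-1" thread name, and the use of a Callable in place of an Observable are all assumptions made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class EagerVsDeferredDemo {

    // Analogue of listStuff(): the method body runs as soon as the method is
    // called; only the returned Callable is deferred until someone executes it.
    static Callable<String> listStuff(String[] bodyThread) {
        bodyThread[0] = Thread.currentThread().getName(); // like the Log.d call in the question
        return () -> Thread.currentThread().getName();    // like work done on subscription
    }

    // Returns { thread that ran the method body, thread that ran the deferred work }.
    static String[] run() {
        // Hypothetical stand-in for the io() scheduler: one thread named "io-1".
        ExecutorService io = Executors.newSingleThreadExecutor(r -> new Thread(r, "io-1"));
        try {
            String[] bodyThread = new String[1];
            Callable<String> deferred = listStuff(bodyThread);  // body runs here, on the caller
            String deferredThread = io.submit(deferred).get();  // deferred part runs on io-1
            return new String[] { bodyThread[0], deferredThread };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            io.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] threads = run();
        System.out.println("method body ran on:   " + threads[0]);
        System.out.println("deferred work ran on: " + threads[1]);
    }
}
```

In the RxJava version, subscribeOn plays the role of io.submit: it only affects what happens at subscription time, not the code that builds the chain.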

Related

Thread safety for method that returns Mono based on mutable attribute in Java

In my Spring Boot application I have a component that is supposed to monitor the health status of another, external system. This component also offers a public method that reactive chains can subscribe to in order to wait for the external system to be up.
@Component
public class ExternalHealthChecker {
private static final Logger LOG = LoggerFactory.getLogger(ExternalHealthChecker.class);
private final WebClient externalSystemWebClient = WebClient.builder().build(); // config omitted
private volatile boolean isUp = true;
private volatile CompletableFuture<String> completeWhenUp = new CompletableFuture<>();
@Scheduled(cron = "0/10 * * ? * *")
private void checkExternalSystemHealth() {
externalSystemWebClient.get() //
.uri("/health") //
.retrieve() //
.bodyToMono(Void.class) //
.doOnError(this::handleHealthCheckError) //
.doOnSuccess(nothing -> this.handleHealthCheckSuccess()) //
.subscribe(); //
}
private void handleHealthCheckError(final Throwable error) {
if (this.isUp) {
LOG.error("External System is now DOWN. Health check failed: {}.", error.getMessage());
}
this.isUp = false;
}
private void handleHealthCheckSuccess() {
// the status changed from down -> up, which has to complete the future that might be currently waited on
if (!this.isUp) {
LOG.warn("External System is now UP again.");
this.isUp = true;
this.completeWhenUp.complete("UP");
this.completeWhenUp = new CompletableFuture<>();
}
}
public Mono<String> waitForExternalSystemUPStatus() {
if (this.isUp) {
LOG.info("External System is already UP!");
return Mono.empty();
} else {
LOG.warn("External System is DOWN. Requesting process can now wait for UP status!");
return Mono.fromFuture(completeWhenUp);
}
}
}
The method waitForExternalSystemUPStatus is public and may be called from many, different threads. The idea behind this is to provide some of the reactive flux chains in the application a method of pausing their processing until the external system is up. These chains cannot process their elements when the external system is down.
someFlux
.doOnNext(record -> LOG.info("Next element"))
.delayUntil(record -> externalHealthChecker.waitForExternalSystemUPStatus())
... // starting processing
The issue here is that I can't really wrap my head around which part of this code needs to be synchronised. I think there should not be an issue with multiple threads calling waitForExternalSystemUPStatus at the same time, as this method is not writing anything. So I feel like this method does not need to be synchronised. However, the method annotated with @Scheduled will also run on its own thread and will in fact write the value of isUp and also potentially change the reference of completeWhenUp to a new, uncompleted future instance. I have marked these two mutable attributes with volatile because, from reading about this keyword in Java, it seems that it would help guarantee that the threads reading these two values see the latest value. However, I am unsure if I also need to add synchronized keywords to part of the code. I am also unsure if the synchronized keyword plays well with Reactor code; I have a hard time finding information on this. Maybe there is also a way of providing the functionality of the ExternalHealthChecker in a more complete, reactive way, but I cannot think of any.
I'd strongly advise against this approach. The problem with threaded code like this is it becomes immensely difficult to follow & reason about. I think you'd at least need to synchronise the parts of handleHealthCheckSuccess() and waitForExternalSystemUPStatus() that reference your completeWhenUp field otherwise you could have a race hazard on your hands (only one writes to it, but it might be read out-of-order after that write) - but there could well be something else I'm missing, and if so it may show as one of these annoying "one in a million" type bugs that's almost impossible to pin down.
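For reference, the synchronisation mentioned above could look roughly like the following. This is a minimal sketch in plain Java, not the original component: the class name UpStatusGate and the methods markDown, markUp and waitForUp are hypothetical, and the Spring and WebClient parts are left out. The point is that the flag and the future reference are always read and written together under one lock.

```java
import java.util.concurrent.CompletableFuture;

class UpStatusGate {
    private final Object lock = new Object();
    // Both fields are guarded by 'lock', so they need not be volatile.
    private boolean up = true;
    private CompletableFuture<String> completeWhenUp = new CompletableFuture<>();

    // Called by the health check when it fails.
    void markDown() {
        synchronized (lock) {
            up = false;
        }
    }

    // Called by the health check when it succeeds.
    void markUp() {
        CompletableFuture<String> toComplete = null;
        synchronized (lock) {
            if (!up) {
                up = true;
                toComplete = completeWhenUp;
                completeWhenUp = new CompletableFuture<>(); // fresh future for the next outage
            }
        }
        // Complete outside the lock so that waiter callbacks never run while holding it.
        if (toComplete != null) {
            toComplete.complete("UP");
        }
    }

    // Returns a future that is already complete if the system is up,
    // otherwise the shared future that completes on the next down -> up change.
    CompletableFuture<String> waitForUp() {
        synchronized (lock) {
            return up ? CompletableFuture.completedFuture("UP") : completeWhenUp;
        }
    }
}
```

A reactive caller could wrap the returned future with Mono.fromFuture, as in the question. The critical sections are short and never block on I/O, which keeps the lock cheap even when called from reactive code.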
There should be a much more reliable & simple way of achieving this though. Instead of using the Spring scheduler, I'd create a flux when your ExternalHealthChecker component is created as follows:
healthCheckStream = Flux.interval(Duration.ofMinutes(10))
.flatMap(i ->
webClient.get().uri("/health")
.retrieve()
.bodyToMono(String.class)
.map(s -> true)
.onErrorResume(e -> Mono.just(false)))
.cache(1);
...where healthCheckStream is a field of type Flux<Boolean>. (Note it doesn't need to be volatile, as you'll never replace it so cross-thread worries don't apply - it's the same stream that will be updated with different results every 10 minutes based on the healthcheck status, whatever thread you'll access it from.)
This essentially creates a stream of healthcheck response values every 10 minutes, always caches the latest response, and turns it into a hot source. Note that cache(1) connects to the interval on the first subscription rather than when the flux is constructed, so subscribe once at startup (for example with healthCheckStream.subscribe()) if the checks should begin immediately. From then on the flux keeps running regardless of how many subscribers it has, and any new subscribers that come in on any thread will always get the latest result, be that a pass or a fail. handleHealthCheckSuccess(), handleHealthCheckError(), isUp, and completeWhenUp are then all redundant and can go, and your waitForExternalSystemUPStatus() can just become a single line:
return healthCheckStream.filter(x -> x).next();
...then job done, you can call that from anywhere and you'll have a Mono that will only complete when the system is up.

is `CompletableFuture.completedFuture ... thenAccept` equivalent to sequential processing?

I'm working on a project with a lot of CompletableFuture.completedFuture ... thenAccept codes, e.g.
public CompletableFuture<Boolean> callee() {
boolean result = ... // Do something and get result - Step A
return CompletableFuture.completedFuture(Boolean.valueOf(result));
}
public void caller() {
callee().thenAccept(result -> {
// Detect if call success or failure - Step B
new Throwable().printStackTrace(); // the debug code: stacktrace shows it is called from caller
});
}
I concluded that Step A and Step B are called sequentially in one thread.
So can I simplify it like this?
public boolean callee() {
boolean result = ... // Do something and get result
return result;
}
public void caller() {
boolean result = callee();
// Detect if call success or failure
}
Yes, you can simplify it like this. The long version:
I think the question should be rather: "Is this usage of CompletableFuture appropriate?". No, it's not. This code is using CompletableFuture like a wrapper, a package, to pass data around and not as a tool to execute code asynchronously. This tool can be used to pass data around between threads, but it's not what this code is doing.
Calling CompletableFuture.completedFuture does nothing but create a new CompletableFuture that is completed with whatever you pass to the method. Then you call thenAccept on it, which has basically the following effect: "Take the result when it's done and let the thread that has calculated the result execute the following code. If the result is already calculated, let the caller execute the following code themself." The "following code" is simply the lambda you pass to thenAccept.
The initial CompletableFuture is completed instantly and the following code gets executed by the thread that calls thenAccept directly. The thread that executes caller and callee does everything itself. So this part is effectively doing nothing asynchronously. Therefore, the code is equivalent to the simpler code in the second example without CompletableFuture.
To actually make use of CompletableFuture, you should run Step A (boolean result = ... // Do something and get result) asynchronously, e.g. by creating the initial future with CompletableFuture.supplyAsync. The chained code then normally runs on the thread that completes the future, or on the calling thread if the future has already completed by the time thenAccept is called.
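The difference is easy to observe by recording which thread runs each piece of code. The sketch below uses only the JDK; the class and method names are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

class CompletedFutureDemo {

    // Name of the thread that runs a thenAccept callback attached to an
    // already-completed future.
    static String callbackThreadOfCompletedFuture() {
        AtomicReference<String> name = new AtomicReference<>();
        CompletableFuture.completedFuture(Boolean.TRUE)
                .thenAccept(result -> name.set(Thread.currentThread().getName()));
        return name.get(); // already set: the callback ran synchronously on the caller
    }

    // Name of the thread that runs the supplier passed to supplyAsync.
    static String supplierThreadOfSupplyAsync() {
        return CompletableFuture
                .supplyAsync(() -> Thread.currentThread().getName())
                .join(); // block only to collect the result for the demo
    }

    public static void main(String[] args) {
        System.out.println("caller thread:             " + Thread.currentThread().getName());
        System.out.println("completedFuture callback:  " + callbackThreadOfCompletedFuture());
        System.out.println("supplyAsync supplier:      " + supplierThreadOfSupplyAsync());
    }
}
```

The first method reports the caller's own thread, which is why the completedFuture version behaves like sequential code; the second reports a different thread because supplyAsync hands the supplier to an executor.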

Monitor progress and intermediate results in Spark

I have a simple Spark task, something like this:
JavaRDD<Solution> solutions = rdd.map(new Solve());
// Select best solution by some criteria
The solve routine takes some time. For a demo application, I need to get some property of each solution as soon as it is calculated, before the call to rdd.map terminates.
I've tried using accumulators and SparkListener, overriding the onTaskEnd method, but it seems to be called only at the end of the mapping, not per thread, e.g.
sparkContext.sc().addSparkListener(new SparkListener() {
public void onTaskEnd(SparkListenerTaskEnd taskEnd) {
// do something with taskEnd.taskInfo().accumulables()
}
});
How can I get an asynchronous message for each map function end?
Spark runs locally or in a standalone cluster mode.
Answers can be in Java or Scala, both are OK.

Vertx Future does not wait

Since I'm using Vertx 3.1 in my stack, I was thinking of using the Future feature that the tool brings, but after reading the API it seems pretty limited to me. I cannot even find a way to make the future wait for an Observable.
Here is my code:
public Observable<CommitToOrderCommand> validateProductRestrictions(CommitToOrderCommand cmd) {
Future<Observable<CommitToOrderCommand>> future = Future.future();
orderRepository.getOrder(cmd, cmd.orderId)
.flatMap(order -> validateOrderProducts(cmd, order))
.subscribe(map -> checkMapValues(map, future, cmd));
Observable<CommitToOrderCommand> result = future.result();
if(errorFound){
throw MAX_QUANTITY_PRODUCT_EXCEED.create("Fail"/*restrictions.getBulkBuyLimit().getDescription())*/);
}
return result;
}
private void checkMapValues(Multimap<String, BigDecimal> totalUnitByRestrictions, Future<Observable<CommitToOrderCommand>> future,
CommitToOrderCommand cmd) {
for (String restrictionName : totalUnitByRestrictions.keySet()) {
Restrictions restrictions = Restrictions.valueOf(restrictionName);
if (totalUnitByRestrictions.get(restrictionName)
.stream()
.reduce(BigDecimal.ZERO, BigDecimal::add)
.compareTo(restrictions.getBulkBuyLimit()
.getMaxQuantity()) > 0) { // compareTo only guarantees the sign of its result, so test > 0 rather than == 1
errorFound = true;
}
}
future.complete(Observable.just(cmd));
}
In the onComplete of my first Observable I'm checking the results, and once that has finished I complete the future to unblock the operation.
But I'm seeing that future.result() does not block until future.complete is invoked, as I was expecting. Instead it just returns null.
Any idea what's wrong here?
Regards.
The Vert.x future doesn't block; instead it works with a handler that is invoked when a result has been injected (see setHandler and isComplete).
If the outer layer of code requires an Observable, you don't need to wrap it in a Future, just return Observable<T>. Future<Observable<T>> doesn't make much sense, you're mixing two ways of doing async results.
Note that there are ways to collapse an Observable into a Future, but the difficulty is that an Observable may emit several items whereas a Future can hold only a single item. You already took care of that by collecting your results into a single emission of map.
Since this Observable only ever emits one item, if you want a Future out of it you should subscribe to it and call future.complete(yourMap) in the onNext method. Also define an onError handler that will call future.fail.
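The non-blocking behaviour described above is not specific to Vert.x. As a rough analogy only, the JDK's CompletableFuture stands in for the Vert.x Future here: getNow plays the role of result() and whenComplete the role of setHandler. The class name and the "order-ok" value are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

class NonBlockingResultDemo {

    // Returns { value read before completion, value seen by the handler }.
    static String[] run() {
        CompletableFuture<String> future = new CompletableFuture<>();

        // Reading the result right away does not wait: nothing has completed
        // the future yet, so the supplied default (null) comes back.
        String early = future.getNow(null);

        // Registering a handler is the non-blocking way to consume the result.
        AtomicReference<String> seenByHandler = new AtomicReference<>();
        future.whenComplete((value, error) -> {
            if (error == null) {
                seenByHandler.set(value);
            }
        });

        // Later, the asynchronous source completes the future,
        // which invokes the handler registered above.
        future.complete("order-ok");

        return new String[] { early, seenByHandler.get() };
    }

    public static void main(String[] args) {
        String[] values = run();
        System.out.println("read before completion: " + values[0]);
        System.out.println("seen by handler:        " + values[1]);
    }
}
```

The early read corresponds to the null the question observed; any logic that depends on the result belongs in the handler or, in the question's case, in the Observable chain itself.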

Pause execution of a method until callback is finished

I am fairly new to Java and extremely new to concurrency. However, I have worked with C# for a while. It doesn't really matter, but for the sake of example, I am trying to pull data off a table on server. I want method to wait until data is completely pulled. In C#, we have async-await pattern which can be used like this:
private async Task<List<ToDoItem>> PullItems ()
{
var newRemoteItems = await (from p in remoteTable select p).ToListAsync();
return newRemoteItems;
}
I am trying to have similar effect in Java. Here is the exact code I'm trying to port (Look inside SynchronizeAsync method.)! However, Java Azure SDK works with callbacks. So, I have a few options:
Use wait and notify pattern. Following code doesn't work since I don't understand what I'm doing.
final List<TEntity> newRemoteItems = new ArrayList<TEntity>();
synchronized( this ) {
remoteTable.where().field("lastSynchronized").gt(currentTimeStamp)
.execute(new TableQueryCallback<TEntity>() {
public void onCompleted(List<TEntity> result,
int count,
Exception exception,
ServiceFilterResponse response) {
if (exception == null) {
newRemoteItems.clear();
for (TEntity item: result) {
newRemoteItems.add(item);
}
}
}
});
}
this.wait();
//DO SOME OTHER STUFF
My other option is to move DO SOME OTHER STUFF right inside the callback's if(exception == null) block. However, this would result in my whole method logic chopped off into the pieces, disturbing the continuous flow. I don't really like this approach.
Now, here are questions:
What is recommended way of doing this? I am completing the tutorial on Java concurrency at Oracle. Still, clueless. Almost everywhere I read, it is recommended to use higher level stuff rather than wait and notify.
What is wrong with my wait and notify?
My implementation blocks the main thread and it's considered a bad practice. But what else can I do? I must wait for the server to respond! Also, doesn't C# await block the main thread? How is that not a bad thing?
Either put DO SOME OTHER STUFF into the callback, or declare a semaphore, call semaphore.release in the callback, and call semaphore.acquire where you want to wait. Remove synchronized(this) and this.wait.
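A minimal sketch of the semaphore variant, using only the JDK. The fetchAsync method is a hypothetical stand-in for the SDK's callback-based query (it is not the Azure API), and the item values are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.function.Consumer;

class CallbackWaitDemo {

    // Hypothetical callback-style API: delivers the result on another thread.
    static void fetchAsync(Consumer<List<String>> callback) {
        new Thread(() -> callback.accept(Arrays.asList("a", "b"))).start();
    }

    // Blocks the calling thread until the callback has delivered its result.
    static List<String> fetchBlocking() {
        final List<String> newRemoteItems = new ArrayList<>();
        final Semaphore done = new Semaphore(0); // no permits until the callback releases one

        fetchAsync(result -> {
            newRemoteItems.addAll(result);
            done.release();                      // signal: the result is ready
        });

        try {
            done.acquire();                      // wait here for the callback
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the callback", e);
        }
        // release() happens-before the matching acquire(), so the list contents are visible here.
        return newRemoteItems;
    }

    public static void main(String[] args) {
        System.out.println(fetchBlocking());
    }
}
```

This only works if the callback is delivered on a different thread from the one that is waiting. If the library posts the callback back to the waiting thread (a UI thread, for instance), acquire never returns, which is one more reason to prefer moving the follow-up work into the callback.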
