I have multiple tasks, each executed on Schedulers.newThread(). Each task is a method that returns an Observable<Long>.
The overall structure looks like this:
public void createAndPerformOperations(int mDataStructureSize, int operationsAmount) {
    disposables.add(Single.fromCallable(() ->
            (new OperationsFactory()).getOperations((new DataStructureFactory()).getMaps(mDataStructureSize)))
        .subscribeOn(Schedulers.newThread())
        .observeOn(AndroidSchedulers.mainThread())
        .subscribe(operations -> {
            for (int i = 0; i < operations.size(); i++) {
                performOperation(operations.get(i), i, operationsAmount);
            }
        }));
}

private void performOperation(Operation operation, int id, int operationsAmount) {
    disposables.add(Observable.defer(() -> operation
            .executeAndReturnUptime(operationsAmount))
        .subscribeOn(Schedulers.newThread())
        .observeOn(AndroidSchedulers.mainThread())
        .subscribe(upTime -> uptimeStream.onNext(new Pair<>(id, upTime))));
}
The uptimeStream is a PublishSubject, which is observed in a Fragment.
I need to somehow call the PublishSubject's observer onComplete method, but only when all tasks are finished and all Pairs are consumed.
Although the number of tasks is known, I'm trying to avoid any kind of hardcoded counter of consumed items, since the code should stay reusable if that number changes. I've tried calling onComplete directly from different places, but nothing really works. The main problem is that I don't just need confirmation that all tasks are finished; I also need to process all the values in the parent Fragment.
I'm fine with removing the PublishSubject altogether if another solution works.
UPD: I'm aware of this solution, but I can't find Observable.from in RxJava 3, and in that solution .zip() doesn't work for me, since I can't manually list all the observables and I also need to include the ID field besides the actual return value.
UPD2: I deleted the HashMap idea because, thinking about it more, the main point is still to get the results one by one and to call onComplete only after all the tasks are completed. So the fromIterable solution does not really fit here.
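To make the goal concrete, here is a rough sketch of the overall shape I'm after (RxJava 3; Pair is android.util.Pair and the factory/Operation types are the ones above): one stream of (id, uptime) pairs whose onComplete fires only once every per-operation observable has completed.
Observable<Pair<Integer, Long>> uptimes =
        Single.fromCallable(() ->
                new OperationsFactory().getOperations(new DataStructureFactory().getMaps(mDataStructureSize)))
            .flatMapObservable(operations ->
                Observable.range(0, operations.size())
                    .flatMap(i -> operations.get(i)
                            .executeAndReturnUptime(operationsAmount)
                            .subscribeOn(Schedulers.newThread())   // each operation on its own thread
                            .map(upTime -> new Pair<>(i, upTime)))); // keep the operation's id

disposables.add(uptimes
        .subscribeOn(Schedulers.newThread())            // build the operations off the main thread
        .observeOn(AndroidSchedulers.mainThread())
        .subscribe(
            pair -> { /* handle each (id, upTime) pair as it arrives */ },
            Throwable::printStackTrace,
            () -> { /* all tasks finished and every pair consumed */ }));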
Related
I still do not understand when to apply this method. It looks similar to Mono.just, but I've heard that a callback is used for heavy operations that need to be performed separately from other flows. Right now I use it like this, but is that correct?
Here is an example of use: I wrap sending a Firebase notification in a callable, since the operation is long.
@Override
public Mono<NotificationDto> sendMessageAllDevice(NotificationDto notification) {
    return Mono.fromCallable(() -> fcmProvider.sendPublicMessage(notification))
            .thenReturn(notification);
}
Or should I have wrapped it in Mono.just here instead?
It depends which thread you want fcmProvider.sendPublicMessage(...) to be run on.
Either the one currently executing sendMessageAllDevice(...):
T result = fcmProvider.sendPublicMessage(notification);
return Mono.just(result);
Or the one(s) the underlying mono relies on:
Callable<T> callable = () -> fcmProvider.sendPublicMessage(notification);
return Mono.fromCallable(callable);
I would guess you need the latter approach.
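For instance, applied to the method from the question, the latter could look like this (a sketch; pushing the blocking call onto Schedulers.boundedElastic() is an assumption about where it should run):
@Override
public Mono<NotificationDto> sendMessageAllDevice(NotificationDto notification) {
    return Mono.fromCallable(() -> fcmProvider.sendPublicMessage(notification))
            .subscribeOn(Schedulers.boundedElastic()) // run the blocking FCM call off the caller's thread
            .thenReturn(notification);
}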
If you use Mono.just(computeX()), computeX() is called immediately. Not what you want (I guess).
If you use Mono.fromCallable(() -> computeX()), the computation is deferred: computeX() is only called when you subscribe to it, possibly after further operators such as .map, .flatMap, etc.
Important: if computeX() returns a Mono, you do not need Mono.fromCallable; it is only meant for blocking code.
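A tiny sketch of that laziness difference (computeX() stands in for any blocking method, here assumed to return a String):
Mono<String> eager = Mono.just(computeX());               // computeX() runs immediately, right here
Mono<String> lazy  = Mono.fromCallable(() -> computeX()); // computeX() has not run yet

lazy.subscribe(result -> System.out.println("computed: " + result)); // only now is computeX() invoked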
As you explained in the description, Mono.fromCallable is used when you want to compute a result asynchronously (mostly some heavy operation).
Since you have already generated the Mono with Mono.fromCallable, you do not have to wrap it again with Mono.just.
I am writing a program which uses a CompletionService to run threaded analyses on a bunch of different objects, where each "analysis" consists of taking in a string and doing some computation to give either true or false as an answer. My code looks essentially like this:
// tasks come from a different method and contain the strings + some other needed info
List<Future<Pair<Pieces, Boolean>>> futures = new ArrayList<>(tasks.size());
for (Task task : tasks) {
    futures.add(executorCompletionService.submit(task));
}
ArrayList<Pair<Pieces, Boolean>> pairs = new ArrayList<>();
int toComplete = tasks.size();
int received = 0;
int failed = 0;
while (received < toComplete) {
    Future<Pair<Pieces, Boolean>> resFuture = executorCompletionService.take();
    received++;
    Pair<Pieces, Boolean> res = resFuture.get();
    if (!res.getValue()) failed++;
    if (failed > 300) {
        // My problem is here
    }
    pairs.add(res);
}
// return pairs and go on to do something else
In the marked section, my goal is to have it abandon the computation if over 300 strings have failed, such that I can move on to a new analysis, calling this method again with some different data. The problem is that since the same CompletionService is used again, if I do not somehow clear the queue, then the worker queue will keep growing as I keep adding more to it every time I use it (since after 300 failures there are likely still many unprocessed strings left).
I have tried looping through the futures list and cancelling all unfinished tasks with something like futures.forEach(future -> future.cancel(true)), but when I next call the method I get a java.util.concurrent.CancellationException when I try to call resFuture.get().
(Edit: It seems that even though I call forEach(future -> future.cancel(true)), this does not guarantee that the worker queue is actually clear afterwards. I do not understand why. It almost seems as if it takes a while to clear the queue, and the code does not wait for this to happen before moving to the next analysis, so occasionally get() is called on a future which has been cancelled.)
I have also tried to do
while (received < toComplete) {
    executorCompletionService.take();
    received++;
}
to empty the queue. While this works, it is barely faster than just running all of the analyses anyway, so it does not help much with efficiency.
My question is if there is a better way to empty the worker queue such that when I next call this code it is as if the CompletionService is new again.
Edit: Another method I have tried is simply creating a new ExecutorCompletionService each time, which is slightly faster than my other solution but is still rather slow and definitely not good practice.
P.S.: I'm also happy to accept any other way of doing this; I am not attached to using a CompletionService, it has just been the easiest thing for what I've done so far.
This has since been resolved, but I have seen other similar questions with no good answer so here is my solution:
Previously, I was using an ExecutorService to create my ExecutorCompletionService(ExecutorService). I switched the ExecutorService to a ThreadPoolExecutor, and since under the hood the ExecutorService already is a ThreadPoolExecutor, all method signatures can be fixed with just a cast. Using the ThreadPoolExecutor gives you much more freedom in the backend; specifically, you can call threadPoolExecutor.getQueue().clear(), which clears all tasks awaiting execution. Finally, I needed to make sure to "drain" the remaining running tasks, so my final cancelling code looked like this:
if (failed > maxFailures) {
    executorService.getQueue().clear();
    while (executorService.getActiveCount() > 0) {
        executorCompletionService.poll();
    }
}
At the end of this code block, the executor will be ready to run again.
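For completeness, a sketch of the setup this relies on (pool size, generic types and variable names are illustrative):
// Executors.newFixedThreadPool is backed by a ThreadPoolExecutor, so the cast is safe
ThreadPoolExecutor executorService =
        (ThreadPoolExecutor) Executors.newFixedThreadPool(8);
CompletionService<Pair<Pieces, Boolean>> executorCompletionService =
        new ExecutorCompletionService<>(executorService);
// getQueue().clear() and getActiveCount() are now available, as used in the cancellation block above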
I need to perform some extra tasks but let the original thread finish up, e.g. send back an HTTP response.
I think I can just do this:
return mainTasksFuture.thenApply(response -> {
    CompletableFuture.runAsync(() -> {
        // extra tasks
    });
    return response;
});
But I remembered there's a thenRunAsync. Is
return mainTasksFuture.thenApply(response -> {
    return response;
}).thenRunAsync(() -> {
    // extra tasks
});
basically another way to do the same thing? In other words, are the then*Async methods terminators (completion methods) that return the previous chain's result in the original thread, then spawn a new thread to execute the rest?
I'm almost certain the answer is no. It just seems it might be that purely based on method names, to someone new to CompletableFutures. I wanted a confirmation though, in case what I'm reading about ForkJoinPool.commonPool is actually saying what I'm doubting, just in a different way.
You wrote
It just *seems* it might be that purely based on method names, to someone new to CompletableFutures.
Well, the method names correctly reflect what the methods do. Both, runAsync and thenRunAsync initiate the asynchronous execution of a Runnable and return a future, which will be completed when the asynchronous execution has finished. So the similarity in the names is justified.
It’s your code which is fundamentally different.
In this variant
return mainTasksFuture.thenApply(response -> {
    CompletableFuture.runAsync(() -> {
        // extra tasks
    });
    return response;
});
you are ignoring the future returned by runAsync entirely, so the future returned by thenApply will be completed as soon as the asynchronous operation has been triggered. The caller can retrieve the result value while the “extra tasks” are still running concurrently.
In contrast, with
return mainTasksFuture.thenApply(response -> {
    return response;
}).thenRunAsync(() -> {
    // extra tasks
});
the thenApply is entirely obsolete as it doesn’t do anything. But you are returning the future returned by thenRunAsync, which will be completed when the asynchronous execution of the Runnable has finished and has the type CompletableFuture<Void>, as the runnable does not produce a value (the future will be completed with null). In the exceptional case, it would get completed with the exception of mainTasksFuture, but in the successful case, it does not pass through the result value.
If the first variant matches your actual intention (the caller should not depend on the completion of the extra tasks), simply don’t model them as a dependency:
mainTasksFuture.thenRunAsync(() -> {
    // extra tasks
});
return mainTasksFuture; // does not depend on the completion of extra task
Otherwise, stay with variant 2 (minus obsolete things)
return mainTasksFuture.thenRunAsync(() -> {
    // extra tasks
}); // depends on the completion of extra task but results in (Void)null
if you don’t need the result value. Otherwise, you can use
return mainTasksFuture.thenApplyAsync(response -> {
    // extra tasks
    return response;
}); // depends on the completion of extra task and returns original result
it would be the same as with
return mainTasksFuture.thenCompose(response ->
        CompletableFuture.runAsync(() -> {
            // extra tasks
        }).thenApply(_void -> response));
which does not ignore the future returned by runAsync, but there’s no advantage in this complication, compared to thenApplyAsync.
Another alternative would be
return mainTasksFuture.whenComplete((response, failure) -> {
    if (failure == null) {
        // extra tasks
    }
});
as the future returned by whenComplete will get completed with the original future’s result when the extra tasks have been completed. But the function is always evaluated, even when the original future completed exceptionally, so it needs another conditional if that’s not desired.
Both runAsync and thenRunAsync execute the Runnable task asynchronously:
executes the given action using this stage's default asynchronous execution facility
Question : In other words, are the then*Async methods terminators (completion methods) that return the previous chain's result in the original thread, then spawn a new thread to execute the rest?
Answer: No. From the documentation: "One stage's execution may be triggered by completion of a single stage, or both of two stages, or either of two stages." So basically when the result is returned depends on how the programmer coded that part; in your case (using thenRunAsync) the result will be returned after the first stage completes, because the second stage (thenRunAsync) takes the first stage's result as input but does not return anything.
Interface CompletionStage
One stage's execution may be triggered by completion of a single stage, or both of two stages, or either of two stages. Dependencies on a single stage are arranged using methods with prefix then. Those triggered by completion of both of two stages may combine their results or effects, using correspondingly named methods. Those triggered by either of two stages make no guarantees about which of the results or effects are used for the dependent stage's computation.
There is also a slight difference between the first example and the second example.
Example 1: In this example the Runnable task gets executed asynchronously and is not waited on before the result is returned; the Function from thenApply and the Runnable from runAsync will be executed concurrently.
return mainTasksFuture.thenApply(response -> {
    CompletableFuture.runAsync(() -> {
        // extra tasks
    });
    return response;
});
Example 2: In this example the Runnable task from thenRunAsync will be executed after the Function from thenApply has completed.
return mainTasksFuture.thenApply(response -> {
    return response;
}).thenRunAsync(() -> {
    // extra tasks
});
TL;DR
How to convert Task.whenAll(List<Task>) into RxJava?
My existing code uses Bolts to build up a list of asynchronous tasks and waits until all of those tasks finish before performing other steps. Essentially, it builds up a List<Task> and returns a single Task which is marked as completed when all tasks in the list complete, as per the example on the Bolts site.
I'm looking to replace Bolts with RxJava and I'm assuming this method of building up a list of async tasks (size not known in advance) and wrapping them all into a single Observable is possible, but I don't know how.
I've tried looking at merge, zip, concat etc... but can't get them to work on the List<Observable> that I'd be building up, as they all seem geared to working on just two Observables at a time, if I understand the docs correctly.
I'm trying to learn RxJava and am still very new to it so forgive me if this is an obvious question or explained in the docs somewhere; I have tried searching. Any help would be much appreciated.
You can use flatMap in case you have dynamic tasks composition. Something like this:
public Observable<Boolean> whenAll(List<Observable<Boolean>> tasks) {
    return Observable.from(tasks)
            // execute in parallel
            .flatMap(task -> task.observeOn(Schedulers.computation()))
            // wait until all tasks are executed
            // be aware: all your observables should emit an onComplete event,
            // otherwise you will wait forever
            .toList()
            // could implement more intelligent logic, e.g. check that everything was successful
            .map(results -> true);
}
Another good example of parallel execution
Note: I do not really know your requirements for error handling. For example, what to do if only one task fails. I think you should verify this scenario.
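In RxJava 2/3, where Observable.from no longer exists, a roughly equivalent sketch would use fromIterable (note that toList() there returns a Single):
public Observable<Boolean> whenAll(List<Observable<Boolean>> tasks) {
    return Observable.fromIterable(tasks)
            // subscribeOn so each task is actually subscribed on the computation scheduler
            .flatMap(task -> task.subscribeOn(Schedulers.computation()))
            // toList() completes only when every task has completed
            .toList()
            .map(results -> true)
            .toObservable();
}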
It sounds like you're looking for the Zip operator.
There are a few different ways of using it, so let's look at an example. Say we have a few simple observables of different types:
Observable<Integer> obs1 = Observable.just(1);
Observable<String> obs2 = Observable.just("Blah");
Observable<Boolean> obs3 = Observable.just(true);
The simplest way to wait for them all is something like this:
Observable.zip(obs1, obs2, obs3, (Integer i, String s, Boolean b) -> i + " " + s + " " + b)
.subscribe(str -> System.out.println(str));
Note that in the zip function, the parameters have concrete types that correspond to the types of the observables being zipped.
Zipping a list of observables is also possible, either directly:
List<Observable<?>> obsList = Arrays.asList(obs1, obs2, obs3);
Observable.zip(obsList, (i) -> i[0] + " " + i[1] + " " + i[2])
.subscribe(str -> System.out.println(str));
...or by wrapping the list into an Observable<Observable<?>>:
Observable<Observable<?>> obsObs = Observable.from(obsList);
Observable.zip(obsObs, (i) -> i[0] + " " + i[1] + " " + i[2])
.subscribe(str -> System.out.println(str));
However, in both of these cases, the zip function can only accept a single Object[] parameter, since neither the types of the observables in the list nor their number are known in advance. This means that the zip function would have to check the number of parameters and cast them accordingly.
Regardless, all of the above examples will eventually print 1 Blah true
EDIT: When using Zip, make sure that the Observables being zipped all emit the same number of items. In the above examples all three observables emitted a single item. If we were to change them to something like this:
Observable<Integer> obs1 = Observable.from(new Integer[]{1,2,3}); //Emits three items
Observable<String> obs2 = Observable.from(new String[]{"Blah","Hello"}); //Emits two items
Observable<Boolean> obs3 = Observable.from(new Boolean[]{true,true}); //Emits two items
Then 1, Blah, true and 2, Hello, true would be the only items passed into the zip function(s). The item 3 would never be zipped, since the other observables have completed.
Of the suggestions proposed, zip() actually combines observable results with each other, which may or may not be what is wanted, but was not asked in the question. In the question, all that was wanted was execution of each of the operations, either one-by-one or in parallel (which was not specified, but linked Bolts example was about parallel execution). Also, zip() will complete immediately when any of the observables complete, so it's in violation of the requirements.
For parallel execution of Observables, flatMap() presented in the other answer is fine, but merge() would be more straightforward. Note that merge will exit on an error of any of the Observables; if you'd rather postpone the exit until all observables have finished, you should be looking at mergeDelayError().
For one-by-one, I think Observable.concat() static method should be used. Its javadoc states like this:
concat(java.lang.Iterable<? extends Observable<? extends T>> sequences)
Flattens an Iterable of Observables into one Observable, one after the other, without interleaving them
which sounds like what you're after if you don't want parallel execution.
Also, if you're only interested in the completion of your task, not return values, you should probably look into Completable instead of Observable.
TL;DR: for one-by-one execution of tasks and an on-completion event when they are all done, I think Completable.concat() is best suited. For parallel execution, Completable.merge() or Completable.mergeDelayError() sounds like the solution. The former will stop immediately on any error in any completable; the latter will execute them all even if one of them has an error, and only then report the error.
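A minimal sketch of those Completable variants (RxJava 2/3; doWorkA()/doWorkB() are placeholders for the real work):
List<Completable> tasks = Arrays.asList(
        Completable.fromAction(() -> doWorkA()),   // placeholder actions
        Completable.fromAction(() -> doWorkB()));

// one-by-one: each task is subscribed only after the previous one has completed
Completable.concat(tasks)
        .subscribe(() -> System.out.println("all tasks finished, in order"));

// all at once: add a subscribeOn per task if they should actually run on separate threads
Completable.merge(tasks)
        .subscribe(() -> System.out.println("all tasks finished"));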
With Kotlin
Observable.zip(obs1, obs2, BiFunction { t1 : Boolean, t2:Boolean ->
})
It's important to set the type for the function's arguments or you will have compilation errors
The last argument's type changes with the number of arguments:
BiFunction for 2
Function3 for 3
Function4 for 4
...
You probably looked at the zip operator that works with 2 Observables.
There is also the static method Observable.zip. It has one form which should be useful for you:
zip(java.lang.Iterable<? extends Observable<?>> ws, FuncN<? extends R> zipFunction)
You can check out the javadoc for more.
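For example (RxJava 1.x), zipping an arbitrary list with that overload, with the FuncN written out to show that it receives the zipped values as an Object[] (the sources here are illustrative):
List<Observable<?>> sources = Arrays.asList(
        Observable.just(1), Observable.just("Blah"), Observable.just(true));

Observable<String> zipped = Observable.zip(sources, new FuncN<String>() {
    @Override
    public String call(Object... args) {
        return args[0] + " " + args[1] + " " + args[2];
    }
});
zipped.subscribe(System.out::println); // prints "1 Blah true"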
I'm writing some computation-heavy code in Kotlin with RxJava Observables and RxKotlin. I want to observe a list of observables until they are all completed, while in the meantime getting updates with the progress and the latest result. At the end it returns the best calculation result. An extra requirement was to run the Observables in parallel to use all my CPU cores. I ended up with this solution:
@Volatile var results: MutableList<CalculationResult> = mutableListOf()

fun doALotOfCalculations(listOfCalculations: List<Calculation>): Observable<Pair<String, CalculationResult>> {
    return Observable.create { subscriber ->
        Observable.concatEager(listOfCalculations.map { calculation: Calculation ->
            doCalculation(calculation).subscribeOn(Schedulers.computation()) // doCalculation returns an Observable with only one result
        }).subscribeBy(
            onNext = {
                results.add(it)
                subscriber.onNext(Pair("A calculation is ready", it))
            },
            onComplete = {
                subscriber.onNext(Pair("Finished: ${results.size}", findBestCalculation(results)))
                subscriber.onComplete()
            },
            onError = {
                subscriber.onError(it)
            }
        )
    }
}
I had a similar problem: I needed to fetch search items from a REST call while also integrating saved suggestions from a RecentSearchProvider.AUTHORITY, and combine them into one unified list. I was trying to use @MyDogTom's solution; unfortunately there is no Observable.from in RxJava anymore. After some research I got a solution that worked for me.
fun getSearchedResultsSuggestions(context: Context, query: String): Single<ArrayList<ArrayList<SearchItem>>> {
    val fetchedItems = ArrayList<Observable<ArrayList<SearchItem>>>(0)
    fetchedItems.add(fetchSearchSuggestions(context, query).toObservable())
    fetchedItems.add(getSearchResults(query).toObservable())
    return Observable.fromArray(fetchedItems)
            .flatMapIterable { data -> data }
            .flatMap { task -> task.observeOn(Schedulers.io()) }
            .toList()
            .map { ArrayList(it) }
}
I created an observable from the array of observables that contains the lists of suggestions and results from the internet, depending on the query. After that you just go over those tasks with flatMapIterable and run them using flatMap, placing the results in an array, which can later be fed into a RecyclerView.
If you use Project Reactor, you can use Mono.when.
Mono.when(publisher1, publisher2)
        .doOnSuccess(ignored -> System.out.println("everything is done!"))
        .block();
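Note that Mono.when only signals completion and carries no value (it is a Mono<Void>); if the combined results are needed, Mono.zip is the closer fit. A small sketch with illustrative sources:
Mono<String> first = Mono.just("first");
Mono<Integer> second = Mono.just(2);

// completes when both are done, but carries no value
Mono.when(first, second)
        .doOnSuccess(ignored -> System.out.println("everything is done!"))
        .block();

// combines the two results into a single value
String combined = Mono.zip(first, second, (a, b) -> a + " / " + b).block();
System.out.println(combined); // first / 2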
I started to play around with RxJava and ReactFX, and I became pretty fascinated with it. But as I'm experimenting I have dozens of questions and I'm constantly researching for answers.
One thing I'm observing (no pun intended) is of course lazy execution. With my exploratory code below, I noticed nothing gets executed until merge.subscribe(pet -> System.out.println(pet)) is called. But what fascinated me is that when I subscribed a second subscriber, merge.subscribe(pet -> System.out.println("Feed " + pet)), it fired the "iteration" again.
What I'm trying to understand is the behavior of the iteration. It does not seem to behave like a Java 8 stream that can only be used once. Is it literally going through each String one at a time and posting it as the value for that moment? And do any new subscribers following any previously fired subscribers receive those items as if they were new?
public class RxTest {
    public static void main(String[] args) {
        Observable<String> dogs = Observable.from(ImmutableList.of("Dasher", "Rex"))
                .filter(dog -> dog.matches("D.*"));
        Observable<String> cats = Observable.from(ImmutableList.of("Tabby", "Grumpy Cat", "Meowmers", "Peanut"));
        Observable<String> ferrets = Observable.from(CompletableFuture.supplyAsync(() -> "Harvey"));

        Observable<String> merge = dogs.mergeWith(cats).mergeWith(ferrets);

        merge.subscribe(pet -> System.out.println(pet));
        merge.subscribe(pet -> System.out.println("Feed " + pet));
    }
}
Observable<T> represents a monad, a chained operation, not the execution of the operation itself. It is descriptive language, rather than the imperative you're used to. To execute an operation, you .subscribe() to it. Every time you subscribe, a new execution stream is created from scratch. Do not confuse streams with threads, as subscriptions are executed synchronously unless you specify a thread change with .subscribeOn() or .observeOn(). You chain new elements onto any existing operation/monad/Observable to add new behaviour, like changing threads, filtering, accumulation, transformation, etc. In case your observable is an expensive operation you don't want to repeat on every subscription, you can prevent recreation by using .cache().
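A small sketch of that re-execution (and of .cache()), in the same RxJava 1.x style as the question:
Observable<String> dogs = Observable.from(ImmutableList.of("Dasher", "Rex"))
        .doOnSubscribe(() -> System.out.println("pipeline (re)subscribed"))
        .filter(dog -> dog.matches("D.*"));

dogs.subscribe(System.out::println); // prints the subscription message, then "Dasher"
dogs.subscribe(System.out::println); // the whole chain runs again: message, then "Dasher"

Observable<String> cachedDogs = dogs.cache();
cachedDogs.subscribe(System.out::println); // runs the chain once more...
cachedDogs.subscribe(System.out::println); // ...and replays "Dasher" without re-running it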
To make any asynchronous/synchronous Observable<T> operation into a synchronous, inlined one, use .toBlocking() to change its type to BlockingObservable<T>. Instead of .subscribe() it contains new methods to execute operations on each result with .forEach(), or to coerce the result with .first().
Observables are a good tool because they're mostly* deterministic (same inputs always yield same outputs unless you're doing something wrong), reusable (you can send them around as part of a command/policy pattern) and can for the most part ignore concurrency, because they should not rely on shared state (a.k.a. doing something wrong). BlockingObservables are good if you're trying to bring an observable-based library into imperative code, or just executing an operation on an Observable that you have 100% confidence is well managed.
Architecting your application around these principles is a change of paradigm that I can't really cover on this answer.
*There are breaches like Subject and Observable.create() that are needed to integrate with imperative frameworks.