Subscribe to an Observable without triggering it and then passing it on

This could get a little bit complicated and I'm not that experienced with Observables and the RX pattern so bear with me:
Suppose you've got some arbitrary SDK method which returns an Observable. You consume the method from a class which is - among other things - responsible for retrieving data and, while doing so, does some caching, so let's call it DataProvider. Then you've got another class which wants to access the data provided by DataProvider. Let's call it Consumer for now. So there we've got our setup.
Side note for all the pattern friends out there: I'm aware that this is not MVP, it's just an example for an analogous, but much more complex problem I'm facing in my application.
That being said, in Kotlin-like pseudo code the described situation would look like this:
class Consumer(val provider: DataProvider) {
    fun logic() {
        provider.getData().subscribe(...)
    }
}
class DataProvider(val sdk: SDK) {
    // Returns the SDK's Observable (the original listing declared Consumer here by mistake)
    fun getData(): Observable {
        val observable = sdk.getData()
        observable.subscribe(/*cache data as it passes through*/)
        return observable
    }
}
class SDK {
    fun getData(): Observable {
        return fetchDataFromNetwork()
    }
}
The problem is that by calling observable.subscribe() in the DataProvider I'm already triggering the Observable, which I don't want. I want the DataProvider to just silently listen; in this example the triggering should be done by the Consumer.
So what's the best RX-compatible solution for this problem? The one outlined in the pseudo code above definitely isn't, for various reasons, one of which is the premature triggering of the network request before the Consumer has subscribed to the Observable. I've experimented with publish().autoConnect(2) before calling subscribe() in the DataProvider, but that doesn't seem to be the canonical way to do this kind of thing. It just feels hacky.
Edit: Through SO's excellent "related" feature I've just stumbled across another question pointing in a different direction, but with a solution that could also be applicable here, namely flatMap(). I knew that one before, but never actually had to use it. It seems like a viable way to me. What's your opinion on that?

If the caching step is not supposed to modify events in the chain, the doOnNext() operator can be used:
class DataProvider(val sdk: SDK) {
fun getData(): Observable<*> = sdk.getData().doOnNext(/*cache data as it passes through*/)
}
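To illustrate why this does not trigger anything early, here is a minimal sketch in plain Java. ColdSource is a hypothetical toy stand-in for a cold, single-value Observable, not the RxJava type, and TapDemo.getData() plays the role of DataProvider.getData(). The point it shows is that doOnNext only wraps the producer; nothing runs until someone subscribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class TapDemo {
    /** Toy stand-in for a cold, single-value Observable (assumption: NOT RxJava). */
    static final class ColdSource<T> {
        private final Supplier<T> producer;

        ColdSource(Supplier<T> producer) {
            this.producer = producer;
        }

        /** Wraps the producer with a side effect; does not run the producer. */
        ColdSource<T> doOnNext(Consumer<? super T> sideEffect) {
            return new ColdSource<>(() -> {
                T value = producer.get();
                sideEffect.accept(value);
                return value;
            });
        }

        /** Only this call actually runs the producer. */
        void subscribe(Consumer<? super T> consumer) {
            consumer.accept(producer.get());
        }
    }

    static final List<String> log = new ArrayList<>();
    static final List<String> cache = new ArrayList<>();

    /** Hypothetical DataProvider.getData(): attaches caching without subscribing. */
    static ColdSource<String> getData() {
        ColdSource<String> sdk = new ColdSource<>(() -> {
            log.add("network request");
            return "payload";
        });
        return sdk.doOnNext(cache::add);
    }

    public static void main(String[] args) {
        ColdSource<String> data = getData();
        System.out.println("before subscribe: " + log);
        data.subscribe(v -> log.add("consumer got " + v));
        System.out.println("after subscribe: " + log + ", cache=" + cache);
    }
}
```

The network request is logged only after the consumer subscribes, and the cache is filled as the value passes through on its way to the consumer.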

Yes, flatMap could be a solution. Moreover, you could split your stream into a chain of small Observables:
public class DataProvider {
private Api api;
private Parser parser;
private Cache cache;
public Observable<List<User>> getUsers() {
return api.getUsersFromNetwork()
.flatMap(parser::parseUsers)
.map(cache::cacheUsers);
}
}
public class Api {
public Observable<Response> getUsersFromNetwork() {
//makes https request or whatever
}
}
public class Parser {
public Observable<List<User>> parseUsers(Response response) {
//parse users
}
}
public class Cache {
public List<User> cacheUsers(List<User> users) {
//cache users
}
}
It's easy to test, maintain, and replace implementations (by using interfaces). You could also easily insert an additional step into your stream (for instance to log, convert, or change the data you receive from the server).
Another quite convenient operator is map. The function you pass to it returns plain Data instead of Observable<Data>, which can make your code even simpler.
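The difference between the two operators can be sketched with the JDK's CompletableFuture, where thenCompose plays the role of flatMap (the step returns a wrapped value) and thenApply plays the role of map (the step returns a plain value). This is only an analogy: unlike an Observable, a CompletableFuture starts running as soon as it is created. The names fetchRaw, parse, and cacheUsers are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

final class ChainDemo {
    static final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    /** Hypothetical network call returning a raw response. */
    static CompletableFuture<String> fetchRaw() {
        return CompletableFuture.supplyAsync(() -> "alice,bob");
    }

    /** Asynchronous step: returns a wrapped value, so it is chained with thenCompose. */
    static CompletableFuture<List<String>> parse(String raw) {
        return CompletableFuture.supplyAsync(() -> Arrays.asList(raw.split(",")));
    }

    /** Synchronous step: returns a plain value, so it is chained with thenApply. */
    static List<String> cacheUsers(List<String> users) {
        cache.put("users", users);
        return users;
    }

    static CompletableFuture<List<String>> getUsers() {
        return fetchRaw()
                .thenCompose(ChainDemo::parse)      // flatMap analogue
                .thenApply(ChainDemo::cacheUsers);  // map analogue
    }

    public static void main(String[] args) {
        System.out.println(getUsers().join());
    }
}
```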

Related

Spring Webflux Proper Way To Find and Save

I created the below method to find an Analysis object, update the results field on it and then lastly save the result in the database but not wait for a return.
public void updateAnalysisWithResults(String uuidString, String results) {
findByUUID(uuidString).subscribe(analysis -> {
analysis.setResults(results);
computeSCARepository.save(analysis).subscribe();
});
}
Subscribing within a subscribe feels poorly written.
Is this a bad practice?
Is there a better way to write this?
UPDATE:
entry point
@PatchMapping("compute/{uuid}/results")
public Mono<Void> patchAnalysisWithResults(@PathVariable String uuid, @RequestBody String results) {
return computeSCAService.updateAnalysisWithResults(uuid,results);
}
public Mono<Void> updateAnalysisWithResults(String uuidString, String results) {
// findByUUID(uuidString).subscribe(analysis -> {
// analysis.setResults(results);
// computeSCARepository.save(analysis).subscribe();
// });
return findByUUID(uuidString)
.doOnNext(analysis -> analysis.setResults(results))
.doOnNext(computeSCARepository::save)
.then();
}

It is not working because you have misunderstood what doOnNext does.
Let's start from the beginning.
Flux and Mono are producers: they produce items. Your application produces things for the calling client, hence it should always return either a Mono or a Flux. If you don't want to return anything, you should return a Mono<Void>.
When the client subscribes to your application, Reactor calls the operators in the opposite direction, from the subscriber upstream, until it finds a producer. This is the subscription phase (declaring the chain of operators beforehand is the assembly phase). If your operators are not all chained together, you are doing what I call breaking the reactive chain.
When you break the chain, the things broken off from the chain won't be executed.
If we look at your example but in a more exploded version:
@Test
void brokenChainTest() {
updateAnalysisWithResults("12345", "Foo").subscribe();
}
public Mono<Void> updateAnalysisWithResults(String uuidString, String results) {
return findByUUID(uuidString)
.doOnNext(analysis -> analysis.setValue(results))
.doOnNext(this::save)
.then();
}
private Mono<Data> save(Data data) {
return Mono.fromCallable(() -> {
System.out.println("Will not print");
return data;
});
}
private Mono<Data> findByUUID(String uuidString) {
return Mono.just(new Data());
}
private static class Data {
private String value;
public void setValue(String value) {
this.value = value;
}
}
In the above example, save is a function that returns a producer. But if we run the test above, you will notice that the print statement is never executed.
This has to do with the usage of doOnNext. If we read the docs for it, they say:
Add behavior triggered when the Mono emits a data successfully.
The Consumer is executed first, then the onNext signal is propagated downstream.
doOnNext takes a Consumer, which returns void. Its signature looks as follows:
public final Mono<T> doOnNext(Consumer<? super T> onNext)
This means that it takes a Consumer that accepts a T (or any supertype of T), and it returns a Mono<T>. So, to keep a long explanation short, you can see that it consumes something but also returns the same something.
What this means is that it is usually used for what are called side effects: something that is done on the side and does not hinder the current flow. One of those things could for instance be logging. Logging would consume, for instance, a string and log it, while we want to keep the string flowing down our program. Or maybe we want to increment a number on the side, or modify some state somewhere. You can read all about side effects here.
You can think of it visually this way:
     _____ side effect (for instance logging)
    /
___/______ main reactive flow
That's why your first doOnNext setter works, because you are modifying a state on the side, you are setting the value on your class hence modifying the state of your class to have a value.
The second statement on the other hand, the save, does not get executed. You see that function is actually returning something we need to take care of.
This is what it looks like:
     save
     _____
    /     \ < Broken return
___/       ____ no main reactive flow
All we have to do is change one single line:
// From
.doOnNext(this::save)
// To
.flatMap(this::save)
flatMap takes whatever is in the Mono, and then we can use that to execute something and then return a "new" something.
So our flow (with flatMap) now looks like this:
   setValue()    save()
    ______       _____
   /            /     \
__/____________/       \______ return to client
So with the use of flatMap we are now saving and returning whatever was returned from that function triggering the rest of the chain.
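The difference can be reproduced without Reactor. The sketch below uses a hypothetical toy type, Lazy, as a stand-in for a lazy Mono (it is not the Reactor class): doOnNext throws away whatever its callback returns, so the Lazy returned by save is never run, while flatMap runs the returned Lazy as part of the chain.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

final class BrokenChainDemo {
    /** Toy stand-in for a lazy Mono (assumption: NOT the Reactor type). */
    static final class Lazy<T> {
        private final Supplier<T> producer;

        Lazy(Supplier<T> producer) {
            this.producer = producer;
        }

        /** Runs a side effect; anything the side effect returns is discarded. */
        Lazy<T> doOnNext(Consumer<? super T> sideEffect) {
            return new Lazy<>(() -> {
                T value = producer.get();
                sideEffect.accept(value);
                return value;
            });
        }

        /** Runs the Lazy returned by the mapper as part of this chain. */
        <R> Lazy<R> flatMap(Function<? super T, Lazy<R>> mapper) {
            return new Lazy<>(() -> mapper.apply(producer.get()).producer.get());
        }

        /** Triggers the whole chain and returns its value. */
        T subscribe() {
            return producer.get();
        }
    }

    static final AtomicInteger saves = new AtomicInteger();

    /** Returns a lazy "save" that only counts when it is actually run. */
    static Lazy<String> save(String data) {
        return new Lazy<>(() -> {
            saves.incrementAndGet();
            return data;
        });
    }

    public static void main(String[] args) {
        new Lazy<String>(() -> "foo").doOnNext(BrokenChainDemo::save).subscribe();
        System.out.println("saves after doOnNext: " + saves.get()); // 0, the returned Lazy was dropped

        new Lazy<String>(() -> "foo").flatMap(BrokenChainDemo::save).subscribe();
        System.out.println("saves after flatMap: " + saves.get());  // 1, the returned Lazy was run
    }
}
```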
If you then choose to ignore whatever is returned from the flatMap, it's completely correct to call then, as you have done, which will
Return a Mono which only replays complete and error signals from this
The general rule is, in a fully reactive application, you should never block.
And you generally don't subscribe unless your application is the final consumer. That means if your application started the request, then you are the consumer of something else, so you subscribe. If a webpage starts off the request, then it is the final consumer and it is the one subscribing.
If you are subscribing in your application that is producing data, it's like you are running a bakery and eating your baked bread at the same time.
Don't do that, it's bad for business :D

Subscribing inside a subscribe is not a good practice. You can use the flatMap operator to solve this problem.
public void updateAnalysisWithResults(String uuidString, String results) {
findByUUID(uuidString).flatMap(analysis -> {
analysis.setResults(results);
return computeSCARepository.save(analysis);
}).subscribe();
}

Converting portable, imperative code to reactive without resorting to blocking?

I have some legacy imperative code for saving & loading objects by key in multiple document datastores. Essentially, it's written portably so the DatastoreClient knows nothing about the data it's storing, but is given a key by the repository using it for predictable retrieval. What would be the best way to make that pattern reactive?
Legacy code is of the form
public class CustomerRepository implements CrudRepository<Customer, Long> {
private final DatastoreClient datastoreClient;
private final ObjectMapper mapper = new ObjectMapper(); //jackson.fasterxml
public CustomerRepository(final DatastoreClient datastoreClient) {
this.datastoreClient = datastoreClient;
}
public void createOrUpdate(Customer c) {
datastoreClient.makeAndStoreDocument(mapper.convertValue(c, Map.class),
c.getId());
}
}
I've managed to rewrite datastoreClient.createOrUpdate(...) to use the Project Reactor types Mono<Map<String,Object>> and Mono<Long|String>, but what is the right way to get the object and key to this method reactively? Or is the better answer to start from scratch on the interfaces?
public Mono<Void> createOrUpdateReactive(final Mono<Customer> customerMono) {
return customerMono.flatMap(customer -> datastoreClient
.makeAndStoreDocument(
Mono.just(mapper.convertValue(customer, Map.class)),
Mono.just(customer.getId())
)
);
}
Doesn't this end up blocking to unpack the real data out of the first Mono?
I added underlying DatastoreClient makeAndStoreDocument function
public class GoogleFirestoreClient implements DatastoreClient {
@Override
public Mono<Void> makeAndStoreDocument(final Mono<Map<String, Object>> model, final Mono<String> key) {
//The client library for Firestore is synchronous/blocking, so we offload the actual request to a separate, elastic thread pool.
//When the result comes back, a separate, asynchronously generated result goes back up the chain.
return Mono.zip(model, key)
.publishOn(Schedulers.boundedElastic())
.doOnNext(tuple -> db.collection(collectionName).document(tuple.getT2()).set(tuple.getT1()))
.retryWhen(Retry.max(3).filter(error -> error instanceof InterruptedException))
.doOnSuccess(tuple -> System.out.println("Wrote object: " + tuple.getT1() + " to Firestore collection " + collectionName))
.doOnError(ExecutionException.class, ee -> logger.error("ExecutionException in createOrUpdateReactive. ", ee))
.doOnError(InterruptedException.class, ie -> logger.error("Reactive CreateOrUpdate interrupted more than limit allows.", ie))
.then();
}
}

but what is the right way to "split" my Mono in the caller?
There is no right answer; you design your API the way you want. As long as you don't call block, or in this specific case subscribe, you can solve this however works best for you, in accordance with your team's decisions on designing the API to the database.
How to design APIs is out of scope for this question, and is extremely opinion-based. What I can suggest in this case is looking into the single responsibility principle, which means one thing does one thing and does it really well.
makeAndStoreDocument does two things (hence the name), which is not inherently wrong, but it can for instance be harder to test, since you need to test for two things in one single unit (if you need to change one thing but not the other, tests need to be rewritten, and complexity can build up).
But now we are in opinion based territory and Stack Overflow is not the site for such discussions, there are better sites for that purpose.
Software Engineering
Code review

Jersey/REST: delegating requests to different sub resources without code duplication?

We created a resource, like:
@Path("whatever")
public class WhateverResource {
    @POST
    public Response createWhatever(CreateBean bean) { ...
    @DELETE
    @Path("/{uuid}")
    public void deleteWhatever(@PathParam("uuid") UUID uuid) { ...
and so on for GET, PUT, HEAD.
Now we figured that we need to check whether the underlying feature is actually enabled. A single check, and when it fails, all operations should simply result in a 501.
My first thought was to duplicate the existing resource, like:
@Path("whatever")
public class WhateverResourceIsntAvailable {
    @POST
    public Response createWhatever(CreateBean bean) {
        throw 501
    @DELETE
    @Path("/{uuid}")
    public void deleteWhatever(@PathParam("uuid") UUID uuid) {
        throw 501
So, two resources, both specifying the exact same operations. Leading to the problem that we can't (easily) invoke that check at the point in time when the resource needs to be registered.
Beyond that, this duplication doesn't look very elegant, and I am wondering if there is a "more canonical" way of solving this?
EDIT: another option would be to add the check into the existing resource, into each resource, but that means: doing the check for each operation. Which can easily be forgotten when adding new operations.
I envision something like having:
a "base resource", that gets registered
when any operation is invoked on that resource, the request should be "delegated", depending on that underlying feature
either to a resource that just gives 501 always
or to the "real" resource that does the real work
And ideally, without duplicating checking code, or duplicating operation end point specs.

Following the suggestion given by user Samsotha, I implemented a simple filter, which is then "connected" via name binding, like:
@Path("whatever")
@MyNewFilter
public class WhateverResource {
...
And:
@MyNewFilter
public class MyNewFilterImpl implements ContainerRequestFilter {
    @Override
    public void filter(ContainerRequestContext context) {
        if (... feature is enabled ) {
            ... nothing to do
        } else {
            // Short-circuit the request with 501 Not Implemented
            context.abortWith(
                Response.status(Response.Status.NOT_IMPLEMENTED).entity("not implemented").build());
        }
    }
}
The major advantage of this approach is the fact that one can annotate individual operations, but also a whole resource, such as my WhateverResource. The latter will make sure that any operation within that resource is going through the filter!
(Further details can be found in any decent Jersey tutorial, like the one at baeldung.)

How can I run a concurrent queue of tasks using Rx?

I've found a lot of examples about this and don't know which is the 'right' implementation.
Basically I've got an object (let's call it NBAManager) and there's a method public Completable generateGame() on this object. The idea is that the generateGame method gets called a lot of times and I want to generate games sequentially, so I was thinking about a concurrent queue. I came up with the following design: I'd create a singleton instance of NBAService, a service for NBAManager, and the body of generateGame() would look like this:
public Completable generateGame(RequestInfo info) {
    return service.generateGame(info);
}
So basically I'll pass up that Completable result. Inside that NBAService object I'll have a queue of requests (a concurrent one, because I want to be able to poll() and add(request) if generateGame() is called while NBAManager is processing one of the earlier requests). I got stuck with this:
What's the right way to write such a job queue in the Rx way? There are so many examples of it. Could you send me a link to a good implementation?
How do I handle the logic of queue execution? I believe we have to execute a job right away if it is the only one, and if there are many we just have to add it to the queue and that's it. How can I control this without a Runnable? I was thinking about using subjects.
Thanks!

There are multiple ways to implement this; you can choose how much RxJava should be involved. The least involvement can use a single-threaded ExecutorService as the "queue" and a CompletableSubject for the delayed completion:
class NBAService {
static ExecutorService exec = Executors.newSingleThreadExecutor();
public static Completable generateGame(RequestInfo info) {
CompletableSubject result = CompletableSubject.create();
exec.submit(() -> {
// do something with the RequestInfo instance
f(info).subscribe(result);
});
return result;
}
}
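The queueing behaviour itself comes from the single-threaded executor, not from RxJava. As a minimal sketch that leaves RxJava out entirely, the version below uses the JDK's CompletableFuture in place of CompletableSubject; the integer request id and the daemon thread factory are assumptions made for the demo.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SequentialQueueDemo {
    // One worker thread: submitted jobs run one at a time, in submission order.
    // A daemon thread is used here so the demo JVM can exit without an explicit shutdown.
    static final ExecutorService exec = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "game-queue");
        thread.setDaemon(true);
        return thread;
    });

    // Records the order in which jobs actually ran.
    static final List<Integer> order = new CopyOnWriteArrayList<>();

    /** Enqueues one job; the returned future completes when that job has run. */
    static CompletableFuture<Void> generateGame(int requestId) {
        return CompletableFuture.runAsync(() -> order.add(requestId), exec);
    }

    public static void main(String[] args) {
        CompletableFuture<?>[] jobs = new CompletableFuture<?>[5];
        for (int i = 0; i < jobs.length; i++) {
            jobs[i] = generateGame(i);
        }
        CompletableFuture.allOf(jobs).join();
        System.out.println(order); // [0, 1, 2, 3, 4]
    }
}
```

Callers can submit from any thread at any time; the executor's internal queue takes care of the add/poll bookkeeping that the question describes.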
A more involved solution would be if you wanted to trigger the execution when the Completable is subscribed to. In this case, you can go with create() and subscribeOn():
class NBAService {
public static Completable generateGame(RequestInfo info) {
return Completable.create(emitter -> {
// do something with the RequestInfo instance
emitter.setDisposable(
f(info).subscribe(emitter::onComplete, emitter::onError)
);
})
.subscribeOn(Schedulers.single());
}
}

Asynchronous multiple query from different datasources or databases

I'm having trouble finding an appropriate solution for the following:
I have several databases with the same structure but different data. When my web app executes a query, it must split the query across the databases, execute it on each asynchronously, then aggregate the results from all databases and return them as a single result. Additionally, I want to be able to pass a list of databases on which the query should be executed, and a maximum time allowed for executing the query. The result must also contain meta information for each database, such as excess execution time.
It would be great if it were possible to use another data source, such as a remote web service with a specific API, rather than a relational database.
I use Spring/Grails and need a Java solution, but I will be glad of any advice.
UPD: I want to find a ready-made solution, maybe a framework or something like that.

This is basic OO. You need to abstract what you are trying to achieve (loading data) from the mechanism you are using to achieve it (a database query or a web-service call).
Such a design would usually involve an interface that defines the contract of what can be done and then multiple implementing classes that make it happen according to their implementation.
For example, you'd end up with an interface that looked something like:
public interface DataLoader
{
public Collection<Data> loadData() throws DataLoaderException;
}
You would then have implementations like JdbcDataLoader, WebServiceDataLoader, etc. In your case you would need another type of implementation that, given one or more instances of DataLoader, runs each of them simultaneously and aggregates the results. This implementation would look something like:
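import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
public class AggregatingDataLoader implements DataLoader
{
    private final Collection<DataLoader> dataLoaders;
    private final ExecutorService executorService;
    public AggregatingDataLoader(ExecutorService executorService, Collection<DataLoader> dataLoaders)
    {
        this.executorService = executorService;
        this.dataLoaders = dataLoaders;
    }
    public Collection<Data> loadData() throws DataLoaderException
    {
        // Wrap every loader in a Callable so the executor can run them in parallel.
        Collection<DataLoaderCallable> dataLoaderCallables = new ArrayList<DataLoaderCallable>();
        for (DataLoader dataLoader : dataLoaders)
        {
            dataLoaderCallables.add(new DataLoaderCallable(dataLoader));
        }
        try
        {
            // invokeAll blocks until every loader has finished.
            List<Future<Collection<Data>>> futures = executorService.invokeAll(dataLoaderCallables);
            Collection<Data> data = new ArrayList<Data>();
            for (Future<Collection<Data>> future : futures)
            {
                data.addAll(future.get());
            }
            return data;
        }
        catch (InterruptedException | ExecutionException e)
        {
            // Assumes DataLoaderException has a (String, Throwable) constructor.
            throw new DataLoaderException("Failed to load data", e);
        }
    }
    private static class DataLoaderCallable implements Callable<Collection<Data>>
    {
        private final DataLoader dataLoader;
        public DataLoaderCallable(DataLoader dataLoader)
        {
            this.dataLoader = dataLoader;
        }
        public Collection<Data> call() throws DataLoaderException
        {
            return dataLoader.loadData();
        }
    }
}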
You'll need to add some timeout and exception handling logic to this, but you get the gist.
The other important thing is that your calling code should only ever use the DataLoader interface, so that you can swap different implementations in and out or use mocks during testing.
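As for the timeout logic mentioned above, one possible approach is the overload of ExecutorService.invokeAll that takes a deadline: tasks that have not finished in time are cancelled, and calling get() on their futures throws CancellationException. The sketch below is self-contained and uses plain lists of strings instead of the Data and DataLoader types; loadAll and its policy of silently skipping late or failed loaders are assumptions, and a real implementation would probably record them as per-source meta information.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class TimeoutAggregationDemo {
    /** Runs all loaders in parallel; loaders that miss the deadline or fail are skipped. */
    static List<String> loadAll(List<Callable<List<String>>> loaders, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, loaders.size()));
        try {
            // Blocks until all tasks are done or the timeout expires, whichever comes first.
            List<Future<List<String>>> futures =
                    pool.invokeAll(loaders, timeoutMillis, TimeUnit.MILLISECONDS);
            List<String> data = new ArrayList<>();
            for (Future<List<String>> future : futures) {
                try {
                    data.addAll(future.get());
                } catch (CancellationException | ExecutionException e) {
                    // This loader timed out (cancelled) or threw; skip its results.
                }
            }
            return data;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Callable<List<String>> fast = () -> Arrays.asList("a", "b");
        Callable<List<String>> slow = () -> {
            Thread.sleep(5_000); // simulates a data source that is too slow
            return Arrays.asList("late");
        };
        System.out.println(loadAll(Arrays.asList(fast, slow), 500));
    }
}
```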

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.
