RxJava - Observable chain with concatWith and map - java

I'm having trouble implementing the following scenario using RxJava (v1.2.1):
I need to handle a request for some data object. I have a meta-data copy of this object which I can return immediately, while making an API call to a remote server to retrieve the whole object data. When I receive the data from the API call I need to process the data before emitting it.
My solution currently looks like this:
return Observable.just(localDataCall())
.concatWith(externalAPICall().map(new DataProcessFunction()));
The first Observable, localDataCall(), should emit the local data, which is then concatenated with the remote API call, externalAPICall(), mapped to the DataProcessFunction.
This solution works but it has a behavior that is not clear to me. When the local data call returns its value, this value goes through the DataProcessFunction even though it's not connected to the first call.
Any idea why this is happening? Is there a better implementation for my use case?

I believe that the issue lies in some part of your code that has not been provided. The data returned from localDataCall() is independent of the new DataProcessFunction() object, unless somewhere within localDataCall you use another DataProcessFunction.
To prove this to you I will create a small example using io.reactivex:rxjava:1.2.1:
import rx.Observable;
import rx.functions.Func1;

public class ConcatExample {
    public static void main(String[] args) {
        Observable.just(foo())
            .concatWith(bar().map(new IntMapper()))
            .subscribe(System.out::println);
    }

    // Called eagerly while the chain is assembled, so "foo" prints first.
    static int foo() {
        System.out.println("foo");
        return 0;
    }

    static Observable<Integer> bar() {
        System.out.println("bar");
        return Observable.just(1, 2);
    }

    // Applied only to the items emitted by bar().
    static class IntMapper implements Func1<Integer, Integer> {
        @Override
        public Integer call(Integer integer) {
            System.out.println("IntMapper " + integer);
            return integer + 5;
        }
    }
}
This prints to the console:
foo
bar
0
IntMapper 1
6
IntMapper 2
7
As can be seen, the value 0 created in foo never gets processed by IntMapper; IntMapper#call is only called twice for the values created in bar. The same can be said for the value created by localDataCall. It will not be mapped by the DataProcessFunction object passed to your map call. Just like bar and IntMapper, only values returned from externalAPICall will be processed by DataProcessFunction.

.concatWith() concatenates all items emitted by one observable with all items emitted by the other observable, so no wonder that .map() is being called twice.
But I do not understand why you need localDataCall() at all in this scenario. Perhaps you might want to use .switchIfEmpty() or .switchOnNext() instead.

Related

Java Reactive stream how to map an object when the object being mapped is also needed on the next step of the stream

I am using Java 11 and project Reactor (from Spring). I need to make a http call to a rest api (I can only make it once in the whole flow).
With the response I need to compute two things:
Check if a document exists in the database (MongoDB). If it does not exist, create it and return it. Otherwise just return it.
Compute some logic on the response and we are done.
In pseudo code it is something like this:
public void computeData(String id) {
httpClient.getData(id) // Returns a Mono<Data>
.flatMap(data -> getDocument(data.getDocumentId()))
// Issue here is we need access to the data object consumed in the previous flatMap but at the same time we also need the document object we get from the previous flatMap
.flatMap(document -> calculateValue(document, data))
.subscribe();
}
public Mono<Document> getDocument(String id) {
// Check if document exists
// If not create document
return document;
}
public Mono<Value> calculateValue(Document doc, Data data) {
// Do something...
return value;
}
The issue is that calculateValue needs the value returned by httpClient.getData, which was already consumed in the first flatMap, and it also needs the document object produced by that same flatMap.
I tried to solve this issue using Mono.zip like below:
public void computeData(String id) {
final Mono<Data> dataMono = httpClient.getData(id);
Mono.zip(
new Mono<Mono<Document>>() {
@Override
public void subscribe(CoreSubscriber<? super Mono<Document>> actual) {
final Mono<Document> documentMono = dataMono.flatMap(data -> getDocument(data.getDocumentId()));
actual.onNext(documentMono);
}
},
new Mono<Mono<Data>>() {
@Override
public void subscribe(CoreSubscriber<? super Mono<Data>> actual) {
actual.onNext(dataMono);
}
}
)
.flatMap(objects -> {
final Mono<Document> documentMono = objects.getT1();
final Mono<Data> originalDataMono = objects.getT2();
return Mono.zip(documentMono, originalDataMono, (document, data) -> calculateValue(document, data));
});
}
But this executes httpClient.getData(id) twice, which goes against my constraint of only calling it once. I understand why it is executed twice (I subscribe to it twice).
Maybe my solution design can be improved somewhere but I do not see where. To me this sounds like a "normal" issue when designing reactive code but I could not find a suitable solution to it so far.
My question is: how can I accomplish this flow in a reactive, non-blocking way while making only one call to the REST API?
PS: I could add all the logic inside one single map, but that would force me to subscribe to one of the Monos inside the map, which is not recommended, and I want to avoid that approach.
EDIT regarding @caco3's comment
I need to subscribe inside the map because both getDocument and calculateValue methods return a Mono.
So, if I wanted to put all the logic inside one single map it would be something like:
public void computeData(String id) {
httpClient.getData(id)
.map(data -> getDocument(data).subscribe(s -> calculateValue(s, data)))
.subscribe();
}
You do not have to subscribe inside map, just continue building the reactive chain inside the flatMap:
getData(id) // Mono<Data>
.flatMap(data -> getDocument(data.getDocumentId()) // Mono<Document>
.switchIfEmpty(createDocument(data.getDocumentId())) // Mono<Document>
.flatMap(document -> calculateValue(document, data)) // Mono<Value>
)
.subscribe();
Boiling it down, your problem is analogous to:
Mono.just(1)
.flatMap(original -> process(original))
.flatMap(processed -> //Help - I need access to the original value and the processed value!
System.out.println(original); //Won't work
);
private static Mono<String> process(int in) {
return Mono.just(in + " is an integer").delayElement(Duration.ofSeconds(2));
}
(Silly example, I know.)
The problem is that map() (and by extension, flatMap()) are transformations: you get access to the new value, and the old one goes away. So in your second flatMap() call, you've got access to "1 is an integer", but not the original value (1).
The solution here is, instead of mapping to the new value, to map to some kind of merged result that contains both the original and new values. Reactor provides a built-in type for that: a Tuple. So editing our original example, we'd have:
Mono.just(1)
.flatMap(original -> operation(original))
.flatMap(processed -> //Help - I need access to the original value and the processed value!
System.out.println(processed.getT1()); //Original
System.out.println(processed.getT2()); //Processed
///etc.
);
private static Mono<Tuple2<Integer, String>> operation(int in) {
return Mono.just(in + " is an integer").delayElement(Duration.ofSeconds(2))
.map(newValue -> Tuples.of(in, newValue));
}
You can use the same strategy to "hold on" to both document and data - no need for inner subscribes or anything of the sort :-)
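The "carry both values forward" idea is not specific to Reactor. As a rough illustration only, here is the same shape sketched with the JDK's CompletableFuture in place of Mono and Map.Entry in place of Tuple2 (both are substitutions, not the Reactor API). The method names process, processKeepingOriginal and describe are hypothetical stand-ins.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

class KeepOriginalSketch {
    // Hypothetical asynchronous step that transforms the input.
    static CompletableFuture<String> process(int in) {
        return CompletableFuture.supplyAsync(() -> in + " is an integer");
    }

    // Pair the original with the processed value so later stages can see both.
    static CompletableFuture<Map.Entry<Integer, String>> processKeepingOriginal(int in) {
        return process(in).thenApply(processed -> Map.entry(in, processed));
    }

    // A later stage that needs the original (key) and the processed value (value).
    static CompletableFuture<String> describe(int in) {
        return processKeepingOriginal(in)
            .thenApply(pair -> pair.getKey() + " -> " + pair.getValue());
    }

    public static void main(String[] args) {
        System.out.println(describe(1).join()); // prints "1 -> 1 is an integer"
    }
}
```

In the question's terms, the pair would hold the Data and the Document, so that calculateValue can receive both without a second HTTP call.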

Generating a Spring Flux from a sequence of paged network calls

I am calling an API, api.magicthegathering.io/v1/cards, using the Spring reactive WebFlux client. The response is a page of 100 cards, along with headers containing links for the "next" and "last" pages, e.g. "last" is api.magicthegathering.io/v1/cards?page=426 (and "next" is simply n+1). I want to generate a Flux<Card> that feeds out each card individually, with a single entry point, e.g. Flux<Card> getAllCards().
I currently have a CardsClient component that returns a Mono<CardPage>. The CardPage has a cards() method on it that returns all cards therein (this is a 1:1 representation of the API's response model). On top of that, I have a CardCatalog component with that getAllCards() method on it.
I have tried using Flux::expand and Flux::generate, which work somewhat, but these implementations have flaws.
Here is a snippet of my current iteration of CardCatalog::getAllCards(). The problem is that the recursive nature of expand is causing redundant calls to client::getNextPage; clearly I am not using the proper method.
@Override
public Flux<Card> getAllCards() {
return client.getFirstPage().flux().expand(client::getNextPage)
.map(Page::cards)
.flatMap(Flux::fromIterable)
.map(mapper::convert)
.cache();
}
Previously I was using generate, but the issue with this is that it would always grab all pages (pretty slow), even if the subscriber only decides to take(20) cards:
@Override
public Flux<Card> getAllCards() {
final Flux<Page> pageFlux =
generate(client::getFirstPage, (response, sink) -> {
final var page = response.block();
sink.next(page);
if (page.next().isPresent()) {
return client.getNextPage(page);
}
sink.complete();
return null;
});
return pageFlux.flatMapIterable(Page::cards).map(mapper::convert);
}
The full code is here: https://github.com/myersadamk/mtg-api-client
Using expand, I added a print to client::getNextPage(). As you can see, the graph it creates makes redundant calls.
Getting https://api.magicthegathering.io/v1/cards?page=1
Getting https://api.magicthegathering.io/v1/cards?page=7
Getting https://api.magicthegathering.io/v1/cards?page=2
Getting https://api.magicthegathering.io/v1/cards?page=8
Getting https://api.magicthegathering.io/v1/cards?page=3
Getting https://api.magicthegathering.io/v1/cards?page=9
Getting https://api.magicthegathering.io/v1/cards?page=4
Getting https://api.magicthegathering.io/v1/cards?page=10
Getting https://api.magicthegathering.io/v1/cards?page=5
Getting https://api.magicthegathering.io/v1/cards?page=11
Getting https://api.magicthegathering.io/v1/cards?page=6
Getting https://api.magicthegathering.io/v1/cards?page=12
Getting https://api.magicthegathering.io/v1/cards?page=7
I want something more like this:
Getting https://api.magicthegathering.io/v1/cards?page=1
Getting https://api.magicthegathering.io/v1/cards?page=2
Getting https://api.magicthegathering.io/v1/cards?page=3
(Final note: it would certainly be faster to parallelize this and call the URIs directly, but it feels a little silly to bypass the next/last mechanic and hard-code the URIs. I may end up doing that, but still want to crack this nut.)
I think this is the sequential non-blocking way of doing this:
public Flux<Card> getAllCards() {
PaginationParams paginationParams = new PaginationParams();
final Flux<Page> pageFlux = Mono
.defer(() -> client.getPage(paginationParams))
.doOnNext(page -> {
if (page.next().isPresent()) {
paginationParams.setPage(page.next().get());
} else {
paginationParams.setPage(null);
}
})
.repeat(() -> paginationParams.getPage() != null);
return pageFlux.flatMapIterable(Page::cards).map(mapper::convert);
}
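To make the "follow the next link only when more cards are requested" behaviour concrete, here is a small sketch using plain java.util.stream rather than Reactor. It is blocking and only an analogy: Page, fetchPage, LAST_PAGE and the fetch counter are hypothetical stand-ins for the real client, and the first page is fetched as soon as the stream is built.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyPagingSketch {
    static final int LAST_PAGE = 4;
    // Counts simulated network calls so the laziness can be observed.
    static final AtomicInteger fetches = new AtomicInteger();

    static final class Page {
        final List<String> cards;
        final Integer next; // null when there is no further page
        Page(List<String> cards, Integer next) {
            this.cards = cards;
            this.next = next;
        }
    }

    // Hypothetical stand-in for one blocking page request (three cards per page).
    static Page fetchPage(int number) {
        fetches.incrementAndGet();
        List<String> cards = List.of("card-" + number + "a", "card-" + number + "b", "card-" + number + "c");
        return new Page(cards, number < LAST_PAGE ? number + 1 : null);
    }

    // Each further page is requested only when the consumer needs more cards.
    static Stream<String> allCards() {
        return Stream.iterate(
                fetchPage(1),
                Objects::nonNull,
                page -> page.next == null ? null : fetchPage(page.next))
            .flatMap(page -> page.cards.stream());
    }

    public static void main(String[] args) {
        List<String> firstFive = allCards().limit(5).collect(Collectors.toList());
        System.out.println(firstFive);
        System.out.println("pages fetched: " + fetches.get());
    }
}
```

Taking five cards touches only the first two pages, which is the sequential, on-demand pattern the question asks for. A Reactor version would need a non-blocking equivalent of the same idea.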
Alright, I've come up with something that works. I decided to use the page count approach to try parallelization, although it is admittedly not faster since network IO remains the bottleneck. I'll probably go back to the header link crawling and use caching. Roughly, magic numbers and all, this is what it looks like:
@Override
public Flux<Card> getAllCards() {
return client.getPageCount().flatMapMany(pageCount ->
Flux.concat(
range(1, pageCount)
.parallel(pageCount / 6).runOn(Schedulers.parallel())
.map(client::getPage)
).map(Page::cards).flatMap(Flux::fromIterable).map(mapper::convert)
);
}

Java - sync call inside async thenCompose

Consider the code below, as I am not able to find better words to ask the question:
CompletionStage<Manager> callingAsyncFunction(int developerId) {
return getManagerIdByDeveloperId(developerId)
.thenCompose(id ->
getManagerById(id, mandatoryComputationToGetManager(id)));
}
mandatoryComputationToGetManager() returns a CompletionStage.
My doubt is this:
I want to call mandatoryComputationToGetManager() and, once its computation has finished, have getManagerById(...) called.
I know one way is to call thenCompose() first for mandatoryComputationToGetManager() and then call another thenCompose() on that result for getManagerById(). But I want to find out whether there is a way, without piping the output of one thenCompose() into another, to hold off until the result of mandatoryComputationToGetManager() is ready.
As far as I understand, in the code above getManagerById() is called even if the result of mandatoryComputationToGetManager() is not ready yet. I want to wait for that result, so that getManagerById() is computed asynchronously only once mandatoryComputationToGetManager() has produced it.
Ideally, you should pipe the output of one thenCompose into another, but there is a way to achieve what you are trying to do: pass the CompletionStage into getManagerById and continue the chain from it there.
CompletionStage<String> callingAsyncFunction(int developerId) {
return getManagerIdByDeveloperId(developerId)
.thenCompose(id -> getManagerById(id, mandatoryComputationToGetManager()));
}
private CompletionStage<String> getManagerById(
Integer id, CompletionStage<String> stringCompletionStage) {
return stringCompletionStage.thenApply(__ -> "test");
}
private CompletionStage<String> mandatoryComputationToGetManager() {
return CompletableFuture.completedFuture("test");
}
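For comparison, here is a minimal sketch of the conventional approach, where one thenCompose is nested inside another so that the lookup starts only after the mandatory computation has completed. The method bodies are hypothetical stand-ins, and the events list exists only to make the ordering visible.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.CopyOnWriteArrayList;

class ManagerLookupSketch {
    // Records the order in which the stand-in steps actually run.
    static final List<String> events = new CopyOnWriteArrayList<>();

    // Hypothetical: pretend the manager id is derived from the developer id.
    static CompletionStage<Integer> getManagerIdByDeveloperId(int developerId) {
        return CompletableFuture.supplyAsync(() -> {
            events.add("managerId");
            return developerId + 100;
        });
    }

    // Hypothetical: the computation that must finish before the lookup.
    static CompletionStage<String> mandatoryComputationToGetManager(int id) {
        return CompletableFuture.supplyAsync(() -> {
            events.add("mandatory");
            return "token-" + id;
        });
    }

    // Hypothetical: takes the finished result, not a pending stage.
    static CompletionStage<String> getManagerById(int id, String token) {
        return CompletableFuture.supplyAsync(() -> {
            events.add("getManager");
            return "manager-" + id + " (" + token + ")";
        });
    }

    static CompletionStage<String> callingAsyncFunction(int developerId) {
        return getManagerIdByDeveloperId(developerId)
            // Nesting keeps id in scope while waiting for the mandatory step.
            .thenCompose(id -> mandatoryComputationToGetManager(id)
                .thenCompose(token -> getManagerById(id, token)));
    }

    public static void main(String[] args) {
        System.out.println(callingAsyncFunction(7).toCompletableFuture().join());
        System.out.println(events);
    }
}
```

Because getManagerById is only invoked inside the inner thenCompose, it cannot start before mandatoryComputationToGetManager has produced its value.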

Method reference and boolean

I have been experimenting with method references in Java 8 (Object::method). What I am trying to do, which I have done before but have forgotten (I last used method references about four months ago), is find the players that are not online using a method reference.
public static Set<Friend> getOnlineFriends(UUID playerUUID)
{
Set<Friend> friends = new HashSet<>(Arrays.asList(ZMFriends.getFriends(playerUUID)));
return friends.stream().filter(Friend::isOnline).collect(Collectors.toSet());
}
public static Set<Friend> getOfflineFriends(UUID playerUUID)
{
Set<Friend> friends = new HashSet<>(Arrays.asList(ZMFriends.getFriends(playerUUID)));
return friends.stream().filter(Friend::isOnline).collect(Collectors.toSet());
}
As you can see, I managed to do it when the player (friend) is online, but I cannot figure out how to filter through the Set and collect the offline players. I'm missing something obvious, but what is it?
Thanks,
Duke.
In your code
public static Set<Friend> getOnlineFriends(UUID playerUUID)
{
Set<Friend> friends = new HashSet<>(Arrays.asList(ZMFriends.getFriends(playerUUID)));
return friends.stream().filter(Friend::isOnline).collect(Collectors.toSet());
}
you are creating a List view of the array returned by ZMFriends.getFriends(playerUUID) and copying its contents to a HashSet, just to call stream() on it.
That’s a waste of resources, as the source type is irrelevant to the subsequent stream operation. You don’t need to have a Set source to get a Set result. So you can implement your operation simply as
public static Set<Friend> getOnlineFriends(UUID playerUUID)
{
return Arrays.stream(ZMFriends.getFriends(playerUUID))
.filter(Friend::isOnline).collect(Collectors.toSet());
}
Further, you should consider whether you really need both, getOnlineFriends and getOfflineFriends in your actual implementation. Creating utility methods in advance, just because you might need them, rarely pays off. See also “You aren’t gonna need it”.
But if you really need both operations, it’s still an unnecessary code duplication. Just consider:
public static Set<Friend> getFriends(UUID playerUUID, boolean online)
{
return Arrays.stream(ZMFriends.getFriends(playerUUID))
.filter(f -> f.isOnline()==online).collect(Collectors.toSet());
}
solving both tasks. It still wastes resources if the application really needs both Sets, as it then has to perform the same operation twice to get them. Consider:
public static Map<Boolean,Set<Friend>> getOnlineFriends(UUID playerUUID)
{
return Arrays.stream(ZMFriends.getFriends(playerUUID))
.collect(Collectors.partitioningBy(Friend::isOnline, Collectors.toSet()));
}
This provides you both Sets at once, the online friends being associated to true, the offline friends being associated to false.
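A short usage sketch of reading both groups from the partitioned map follows. The Friend class here is a hypothetical stand-in for the one in the question, and names are collected instead of Friend objects only to keep the output readable.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class PartitionSketch {
    // Hypothetical minimal Friend type; the real one comes from the question's code base.
    static final class Friend {
        private final String name;
        private final boolean online;
        Friend(String name, boolean online) {
            this.name = name;
            this.online = online;
        }
        String getName() { return name; }
        boolean isOnline() { return online; }
    }

    // One pass over the friends yields both groups, keyed by the predicate result.
    static Map<Boolean, Set<String>> namesByOnlineStatus(List<Friend> friends) {
        return friends.stream().collect(
            Collectors.partitioningBy(Friend::isOnline,
                Collectors.mapping(Friend::getName, Collectors.toSet())));
    }

    public static void main(String[] args) {
        List<Friend> friends = List.of(
            new Friend("Ann", true), new Friend("Bob", false), new Friend("Cy", true));
        Map<Boolean, Set<String>> byStatus = namesByOnlineStatus(friends);
        System.out.println("online:  " + byStatus.get(true));
        System.out.println("offline: " + byStatus.get(false));
    }
}
```

The map returned by partitioningBy always contains both the true and the false key, so neither lookup returns null even when one group is empty.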
There are 2 ways I can think of:
friends.stream().filter(i -> !i.isOnline()).collect(Collectors.toSet());
But I guess that's not what you want, since it's not using a method reference. So maybe something like this:
public static <T> Predicate<T> negation(Predicate<T> predicate) {
return predicate.negate();
}
...
friends.stream().filter(negation(Friend::isOnline)).collect(Collectors.toSet());
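On Java 11 and later, the JDK ships an equivalent helper, Predicate.not, so the hand-written negation method is not needed there. A minimal sketch, again with a hypothetical stand-in for the Friend class:

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class OfflineFilterSketch {
    // Hypothetical minimal Friend type standing in for the one in the question.
    static final class Friend {
        private final String name;
        private final boolean online;
        Friend(String name, boolean online) {
            this.name = name;
            this.online = online;
        }
        String getName() { return name; }
        boolean isOnline() { return online; }
    }

    // Predicate.not (Java 11+) negates a method reference without a lambda.
    static List<String> offlineNames(List<Friend> friends) {
        return friends.stream()
            .filter(Predicate.not(Friend::isOnline))
            .map(Friend::getName)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Friend> friends = List.of(
            new Friend("Ann", true), new Friend("Bob", false), new Friend("Dee", false));
        System.out.println(offlineNames(friends)); // prints [Bob, Dee]
    }
}
```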

java nested http call api library

I was reading the HTTP web service API documentation for the Play Framework (link below).
I cannot understand how the nested calls work with flatMap.
Can someone give me some hints on how to break down this big chunk of function calls?
from http://www.playframework.com/documentation/2.2.x/JavaWS
Composing results
If you want to make multiple calls in sequence,
this can be achieved using flatMap:
public static Promise<Result> index() {
final Promise<Result> resultPromise = WS.url(feedUrl).get().flatMap(
new Function<WS.Response, Promise<Result>>() {
public Promise<Result> apply(WS.Response response) {
return WS.url(response.asJson().findPath("commentsUrl").asText()).get().map(
new Function<WS.Response, Result>() {
public Result apply(WS.Response response) {
return ok("Number of comments: " + response.asJson().findPath("count").asInt());
}
}
);
}
}
);
return resultPromise;
}
flatMap and map are common Scala (or more generally, functional programming) functions. They both accept a function as a parameter. In order to translate the Play WS API into Java (and pretty much everything else), Scala's function type needed to be re-implemented in Java, so that you can take full advantage of the WS library. It's being done here in a similar way the Scala compiler does it. Function<A,B> is an abstract type that requires an apply method. The parameter(s) of apply are the parameters of the function, and the return type of apply is that of the function.
If you have this function in Java:
public String int2String(Integer integer) {
return integer.toString();
}
It would be equivalent to this:
new Function<Integer, String>() {
public String apply(Integer integer) {
return integer.toString();
}
}
Let's start simpler with the case of just one WS call. WS.url(...).get() returns a Promise<WS.Response>. Promise is a container class for a promised value. In order to handle the value it contains (or will eventually contain), we need to use the map function. For a Promise<WS.Response>, map will accept a Function<WS.Response, T> as a parameter, where T is the type you want to map the response to.
As an example, let's define a Function that will just return the body of the WS.Response in a Play HTTP Result:
Function<WS.Response, Result> echo = new Function<WS.Response, Result>() {
public Result apply(WS.Response response) {
return ok(response.asText());
}
};
Now let's use this Function in a WS call in a controller:
public static Promise<Result> index() {
final Promise<Result> resultPromise = WS.url("http://google.com").get().map(echo);
return resultPromise;
}
Once the Promise has been fulfilled, the echo function defined earlier will be executed inside map, returning the Promise<Result>. The two previous blocks of code could also be written like this (combined into one with an anonymous function):
public static Promise<Result> index() {
final Promise<Result> resultPromise = WS.url("http://google.com").get().map(
new Function<WS.Response, Result>() {
public Result apply(WS.Response response) {
return ok(response.asText());
}
}
);
return resultPromise;
}
As a crude example, let's say we need to make two WS calls. The second WS call will depend on the first. Perhaps the first call will give us some URL we will use to make the second WS call.
This is where flatMap comes into picture. We will need two functions to accomplish this task. The first function is what is passed to flatMap, which will be executed when the first WS.Response is received. This first function will use the first response to make the second WS call, which returns another Promise<WS.Response> that must be mapped to get our final result. So we map the second result with a second function that translates the WS.Response into our Result.
So what happened? If we used map instead of flatMap in both instances, the chain of events would go something like this:
The first get() returned a Promise<WS.Response>, then we map the contained WS.Response to a Promise<Result>. That, however, would leave us with a Promise<Promise<Result>>, which is not very desirable. Using flatMap in the outer function instead will flatten the Promises into a single Promise<Result>. Similarly, if you were doing 3 or more nested calls, you would use flatMap at every level whose function itself returns a Promise, and map only at the innermost level.
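The same nesting-versus-flattening distinction can be seen with the JDK's CompletableFuture, where thenApply plays the role of map and thenCompose the role of flatMap. This is an analogy rather than the Play API, and fetchFeed and fetchCommentCount are hypothetical stand-ins for the two WS calls.

```java
import java.util.concurrent.CompletableFuture;

class FlattenSketch {
    // Stand-in for the first call, which yields the URL of a second resource.
    static CompletableFuture<String> fetchFeed() {
        return CompletableFuture.supplyAsync(() -> "comments-url");
    }

    // Stand-in for the second call, which depends on the first result.
    static CompletableFuture<Integer> fetchCommentCount(String commentsUrl) {
        return CompletableFuture.supplyAsync(() -> commentsUrl.length());
    }

    // map-like: the function returns a future, so the futures end up nested.
    static CompletableFuture<CompletableFuture<Integer>> nested() {
        return fetchFeed().thenApply(FlattenSketch::fetchCommentCount);
    }

    // flatMap-like: the inner future is flattened into the outer one,
    // and a final map-like step turns the plain value into the result.
    static CompletableFuture<String> flattened() {
        return fetchFeed()
            .thenCompose(FlattenSketch::fetchCommentCount)
            .thenApply(count -> "Number of comments: " + count);
    }

    public static void main(String[] args) {
        System.out.println(flattened().join()); // prints "Number of comments: 12"
    }
}
```

The return type of nested() shows the problem directly: a caller would have to unwrap two layers, whereas flattened() exposes a single future of the final value.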
This all of course looks much more beautiful in Scala.
