Flux endpoint from infinite java stream - java

I have an issue while processing a flux that is built from a Stream.generate construct.
The Java stream is fetching some data from a remote source, hence I implemented a custom supplier that has the data fetching logic embedded, and then used it to populate the Stream.
Stream.generate(new SearchSupplier(...))
My idea is to detect an empty list and use the Java 9 takeWhile feature:
Stream.generate(new SearchSupplier(this, queryBody))
.takeWhile(either -> either.isRight() && either.get().nonEmpty())
(using Vavr's Either construct)
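To make the takeWhile idea concrete, here is a minimal plain-Java sketch (Java 9+). It assumes a hypothetical PageSupplier that stands in for SearchSupplier and returns a java.util.List per call instead of a Vavr Either; the page contents are made up. The stream ends at the first empty page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TakeWhileDemo {

    // Hypothetical stand-in for SearchSupplier: each get() returns the next page,
    // and an empty list once the (simulated) remote source is exhausted.
    static final class PageSupplier implements Supplier<List<String>> {
        private final int totalPages;
        private final int pageSize;
        private int nextPage = 0;

        PageSupplier(int totalPages, int pageSize) {
            this.totalPages = totalPages;
            this.pageSize = pageSize;
        }

        @Override
        public List<String> get() {
            List<String> page = new ArrayList<>();
            if (nextPage < totalPages) {
                for (int i = 0; i < pageSize; i++) {
                    page.add("doc-" + nextPage + "-" + i);
                }
                nextPage++;
            }
            return page;
        }
    }

    // Flattens pages until the supplier returns an empty one.
    static List<String> fetchAll(int totalPages, int pageSize) {
        return Stream.generate(new PageSupplier(totalPages, pageSize))
                .takeWhile(page -> !page.isEmpty()) // ends the otherwise infinite stream
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(3, 2));
    }
}
```

One caveat: Stream.generate is documented as returning an unordered stream, and the takeWhile Javadoc describes its behavior on unordered streams as nondeterministic. A sequential stream like this one takes the leading prefix in practice, but that is an implementation detail rather than a guarantee.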
The repository layer then builds the Flux:
return Flux.fromStream(
this.searchStream(...) //this is where the stream gets generated
)
.map(Either::get)
.flatMap(Flux::fromIterable);
The "service" layer is composed of some transformation steps on the flux, but the method signature is something like Flux<JsonObject> search(...).
Finally, the controller layer has a GetMapping:
@GetMapping(produces = "application/stream+json")
public Flux search(...) {
return searchService.search(...) // this is the Flux<JsonObject> part
.subscriberContext(...) //stuff I need available during processing
.doOnComplete(() -> log.debug("DONE"));
}
My problem is that the Flux seems to never terminate.
A call from Postman, for example, just shows 'Loading...' in the response section. When I terminate the process from my IDE, the results are flushed to Postman and I see what I'm expecting. The doOnComplete lambda also never gets called.
I noticed that if I change the source of the Flux to:
Flux.fromArray(...) // hardcoded array of lists of JSON objects
the doOnComplete lambda is called, the HTTP connection closes, and the results are displayed in Postman.
Any idea of what might be the issue?
Thanks.

You could create the Flux directly using code that looks like this. Note that I'm adding some assumed methods, which you would need to implement based on how your SearchSupplier works:
Flux<SearchResultType> flux = Flux.generate(
() -> new SearchSupplier(this, queryBody),
(supplier, sink) -> {
SearchResultType current = supplier.next();
if (isNotLast(current)) {
sink.next(current);
} else {
sink.complete();
}
return supplier;
},
supplier -> anyCleanupOperations(supplier)
);

Related

How do I convert this code reactive using reactor/Mono?

I am changing a Java app to use reactive programming to allow an asynchronous and non-blocking flow, but I'm having trouble understanding the concepts needed to achieve this. A stream of siteIds is used to invoke third-party APIs, and eventually the response is saved into some storage.
The code I have now is blocking and I would like to remove that...
generateReport() returns a Mono<BaseResponse> object
getReportAndSave() retrieves and manipulates the report and saves it, then should return boolean.
listResult = siteIds.parallel()
.map(siteId -> generateReport(authToken, requestParams, siteId))
.map(response -> response.block(Duration.ofMinutes(asyncCallTimeout)))
.map(resp -> getReportAndSave(authToken, resp.getRequestId()))
.collect(Collectors.toList());
So far I have this, which should do the same, except I don't know how to get a return value for listResult.
siteIds.forEach(siteId -> generateReport(authToken, requestParams, siteId)
.subscribe(baseResponse -> getReportAndSave(authToken, baseResponse.getRequestId())));
listResult is a List of Booleans, saying if each siteId has successfully been saved into a blob storage.
final Flux<ResultWrapperBean> resultFlux = Flux.fromIterable(siteIds)
// Since generateReport() returns Mono, here you should use flatMap instead of map.
.flatMap(siteId -> generateReport(authToken, requestParams, siteId))
// Use a wrapper bean to save the request id and request result.
.map(resp -> new ResultWrapperBean(resp.getRequestId(), getReportAndSave(authToken, resp.getRequestId())));
resultFlux.subscribe(resultBean -> log.info("RequestId: {}, and request result is {}", resultBean.getRequestId(), resultBean.getResult()));

Java Reactive. How to wait for all data in flux and then process them

I'm getting data from a Mongo reactive repository and updating it. Then I have to collect all the data in one collection and pass it to another service to get more info. Then I should map them into one Flux. My code is:
Flux<Views> views = someRepository.findAllByUsersIn(userId).doOnNext(v -> {
v.setInterlocutor(v.getUsers().stream().filter(u -> !userId.equals(u)).findFirst().orElse(null));
});
return Flux.zip(views.map(view -> conversionService.convert(view, ResponseViewDto.class)), getUserInfo(views).flux())
.flatMap(fZip -> {
ResponseViewDto dto = fZip.getT1();
dto.setInterlocutor(fZip.getT2().get(dto.getInter()));
return Flux.just(dto);
});
getUserInfo collects the user IDs, sends them to another service, and returns the expanded info.
I found that the database query runs twice, and I understand why, but is there a way to run it only once while remaining non-blocking?
Thanks to Adhika Setya Pramudita for the help. The way to do what I need is simply to use the cache() method on the Flux, so that the second subscription replays the results of the first instead of querying again.

Java mono repeat call until collected results complete

I'm picking up Java/Reactor after moving over from C#. I'm well versed in the C# async-await approach to non-blocking calls and am struggling to adapt to Flux/Mono.
I'm implementing a solution where I need to make a call to ElasticSearch via the Java SDK, get results, apply additional filters to strip out ES results, and keep paging through ES until my final collection of results is complete.
The ES SDK doesn't support Reactor but there are examples of Java adapter code that takes the ES callback and converts to a mono (I see a direct correlation to the C# async-await here as this is a non-blocking call to ES). What I then struggle with is the next bit - I need to take the results from the ES mono, filter them.
I do this by calling out to other external services to get additional data based on the results from the ES call, so I need to know the ids of each page of content the ES mono result before I can apply the filtering (effectively a kind of block), then apply the in-memory filters and if I don't have enough content, then go back to ES to get the next page... repeat until I have enough data or there are no more results from ES.
This appears to be very difficult to achieve compared to C# but I probably just don't understand the Java paradigm correctly.
My problem is that I can't use block(), as this throws an error in Reactor 3.2, so I don't know how to "wait" until the Mono calls to ES and the external services are complete before continuing. In C#, this would be as simple as a call to an async method with an await to handle the implicit callbacks.
My blocking version (works in IntelliJ, fails when published via Maven and then run in a web server) is effectively:
do {
    var sr = GetSearchRequest(xxxx);
    this.elasticsearch.results(sr)
        .map(r -> chunk.addAll(r))
        .block();
    if (chunk.size() == 0) {
        isComplete = true;
    }
    else {
        var filtered = postFilterResults(chunk);
        finalResults.add(filtered);
        if (finalResults.size() >= MAXIMUM_RESULTS) {
            isComplete = true;
        }
        esPage = esPage + 1;
    }
} while (isComplete == false);
If I try subscribe() or other non-blocking Reactor calls, then (obviously) the code skips over the "get ES" bit and hits the do-while, looping repeatedly until the callback from ES finally happens and the subscribed map is invoked.
I think I need to perform an "async block" for each ES call but I don't know how.
To answer my own question... The underlying issue IMO is that Flux/Mono simply is not like any existing programming style, in that it absolutely forces you to work within the fluent style that Reactor mandates. This is very similar to C# LINQ, but it's almost a "false friend", as even things like loops need to be expressed in Reactor.
In this case, the key issue to solve is paging, and how to keep doing it within a loop. It is very unclear how to achieve this, as a subscription to a Flux "locks in" the original parameters, so repeating the subscription call simply gets the same page again. The solution is to use the Flux.defer method, which forces the Flux to be built lazily on each repeated subscription. You then need atomic integers to keep track of the page counter across the different calls. Again, this is something that C# handles for you, so it can catch a .NET developer out.
Something like:
//The response from the elasticsearch adapter is a Flux<T> but we do not want to filter
//results on a row by row basis as this incurs one call for each row to the DB/Network
//(as appropriate). We choose to batch these up
var result = new SearchResult();
var page = new AtomicInteger();
var chunkSize = new AtomicInteger();
//Use a defer so we recalculate the subscription to the search with the new page count
var results = Flux.defer(() -> elasticsearch.results(GetSearchRequest(request, lc, pf, page.get()))
.doOnComplete(() -> {
chunkSize.set(0);
page.getAndAdd(1);
})
.collectList()
.map(chunk -> {
chunkSize.set(chunk.size());
return chunk;
})
.map(chunk -> postFilterResults(request, chunk, pf))
.map(filtered -> result.getDocuments().addAll(filtered)));
//Repeat the deferred flux (recalculating each time) until we have enough content or we don't get anything from the search engine
return results
.repeat()
.takeUntil(r -> chunkSize.get() == 0 || result.getDocuments().size() >= this.elasticsearch.getMaximumSearchResults())
.take(this.elasticsearch.getMaximumSearchResults())
.collectList()
.flatMap(r -> {
result.setTotalHits(result.getDocuments().size());
return Mono.just(result);
});
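The eager-versus-deferred distinction behind this answer can be shown without Reactor. The following is a plain-Java analogy, not Reactor code: a Supplier plays the role that Flux.defer plays above, and fetchPage is a hypothetical stand-in for the Elasticsearch call, with made-up data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class DeferredPagingDemo {
    static final int TOTAL_PAGES = 3; // assumption: the fake index holds three pages

    // Hypothetical stand-in for the search call: two hits per page, then nothing.
    static List<String> fetchPage(int page) {
        List<String> hits = new ArrayList<>();
        if (page < TOTAL_PAGES) {
            hits.add("hit-" + page + "-a");
            hits.add("hit-" + page + "-b");
        }
        return hits;
    }

    // Eager: the page argument is evaluated once, so every "repeat" sees page 0.
    static List<String> eagerRepeat(int times) {
        AtomicInteger page = new AtomicInteger();
        List<String> lockedIn = fetchPage(page.get());
        List<String> results = new ArrayList<>();
        for (int i = 0; i < times; i++) {
            results.addAll(lockedIn);
            page.incrementAndGet(); // has no effect on the already-built result
        }
        return results;
    }

    // Deferred: the call is rebuilt on every repeat, so the counter is re-read each time.
    // Repeats until a page comes back empty or enough results are collected.
    static List<String> collect(int maxResults) {
        AtomicInteger page = new AtomicInteger();
        Supplier<List<String>> nextChunk = () -> fetchPage(page.getAndIncrement());
        List<String> results = new ArrayList<>();
        while (results.size() < maxResults) {
            List<String> chunk = nextChunk.get();
            if (chunk.isEmpty()) {
                break; // the source is exhausted
            }
            results.addAll(chunk);
        }
        // Trim in case the last chunk overshot the limit.
        return new ArrayList<>(results.subList(0, Math.min(results.size(), maxResults)));
    }

    public static void main(String[] args) {
        System.out.println(eagerRepeat(2));
        System.out.println(collect(5));
    }
}
```

In the eager variant, incrementing the counter changes nothing because the request was already built with page 0, which is the "same page again" symptom described above. The deferred variant walks through the pages and stops on an empty chunk or when the limit is reached.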

Applying a Single to an ObservableSource and not over-reading

I'm pretty new to RX in general, and rxjava in particular, pardon mistakes.
This operation depends on two async operations.
The first uses a filter function to attempt to get a single entity from a list returned by an async Observable.
The second is an async operation that communicates with a device and produces an Observable of status updates.
I want to take the Single that is created from the filter function, apply that to pairReader(...), and subscribe to its Observable for updates. I can get this to work as shown, but only if I include the take(1) call marked with a comment below; otherwise I get an exception because the chain tries to pull another value from the Single.
Observable<DeviceCredential> getCredentials() {
return deviceCredentialService()
.getCredentials()
.flatMapIterable(event -> event.getData());
}
Single<Organization> getOrgFromCreds(String orgid) {
return getCredentials()
// A device is logically constrained to only have a single cred per org
.map(DeviceCredential::getOrganization)
.filter(org -> org.getId().equals(orgid))
.take(1) // Without this I get an exception
.singleOrError();
}
Function<Organization, Observable<Reader.EnrollmentState>> pairReader(String name) {
return org -> readerService().pair(name, org);
}
getOrgFromCreds(orgid)
.flatMapObservable(pairReader(readerid))
.subscribe(state -> {
switch(state) {
case BEGUN:
LOG.d(TAG, "Pairing begun");
break;
case PAIRED:
LOG.d(TAG, "Pairing success");
callback.success();
break;
case NOTIFIED_SERVER:
LOG.d(TAG, "Pairing server notified");
break;
}},
error -> {
Crashlytics.logException(error);
callback.error(error.getLocalizedMessage());
});
If the source stream emits more than one item, singleOrError() is supposed to emit an error; this is documented behavior.
For your case, use either first(defaultItem) or firstOrError() instead.
Single<Organization> getOrgFromCreds(String orgid) {
return getCredentials()
.map(DeviceCredential::getOrganization)
.filter(org -> org.getId().equals(orgid))
.firstOrError();
}
If I got you right, you need to make some action using previously retrieved async data. So, you could use .zip() operator.
Here is an example:
Observable.zip(
getOrgFromCreds().toObservable(),
getCredentials(),
(first, second) -> /*create output object here*/
)
.subscribe(
(n) -> /*do onNext*/,
(e) -> /*do onError*/
);
Note that the .zip() operator waits for an emission from each of the two streams and then creates the combined emission using the function you provided at "create output object here".
If you don't want to wait for both items - you can use .combineLatest().
The problem here turned out to be that the API was designed in an odd way (and unfortunately has extremely poor documentation). I couldn't figure out why I was getting duplicates, and thought I was using flatMapIterable incorrectly.
What the deviceCredentialService.getCredentials() call actually creates is an observable that emits DataEvent objects which are simple wrappers over a list of results, and with a flag of where the results came from.
The API designer wanted to allow the user to use locally cached data to fill the UI immediately while a longer request to a REST API executes. The DataEvent.from property is an enum that flags the source, either from the local device cache or from the remote API call.
The way I solved this was to simply ignore the results coming from local cache and only emit results from the API:
Observable<DeviceCredential> getCredentials() {
return deviceCredentialService()
.getCredentials()
// Only get creds from network
.filter(e -> e.getFrom() == SyncedDataSourceObservableFactory.From.SOURCE)
.flatMapIterable(e -> e.getData());
}
Single<Organization> getOrgFromCreds(String orgid) {
return getCredentials()
// A device is logically constrained to only have a single cred per org
.map(DeviceCredential::getOrganization)
.filter(org -> org.getId().equals(orgid))
.singleOrError();
}
The plan then is to use memoization to cache entities in a way that gives the implementing app access to cache invalidation. Since the provided interface doesn't allow squelching the API call, there is no way to work only with the cache if the app feels it is fresh.
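The difference between the variants in this thread comes down to how many matching items reach the final operator. As a rough plain-Java sketch (java.util.stream rather than RxJava, with hypothetical DataEvent and From types and made-up data), single below mimics singleOrError by throwing on a second match, while first mimics firstOrError:

```java
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.stream.Collectors;

class SingleVsFirstDemo {
    enum From { CACHE, SOURCE } // hypothetical mirror of the API's source flag

    // Hypothetical wrapper: a batch of organization ids plus where it came from.
    static final class DataEvent {
        final From from;
        final List<String> orgIds;

        DataEvent(From from, List<String> orgIds) {
            this.from = from;
            this.orgIds = orgIds;
        }
    }

    // The same credentials arrive twice: once from cache, once from the network.
    static List<DataEvent> events() {
        return Arrays.asList(
                new DataEvent(From.CACHE, Arrays.asList("org-1", "org-2")),
                new DataEvent(From.SOURCE, Arrays.asList("org-1", "org-2")));
    }

    // Flattens the events and keeps the ids equal to orgId,
    // optionally ignoring everything that came from the cache.
    static List<String> matches(String orgId, boolean networkOnly) {
        return events().stream()
                .filter(e -> !networkOnly || e.from == From.SOURCE)
                .flatMap(e -> e.orgIds.stream())
                .filter(orgId::equals)
                .collect(Collectors.toList());
    }

    // Like singleOrError: exactly one item, or an exception.
    static String single(List<String> items) {
        if (items.isEmpty()) throw new NoSuchElementException("no item");
        if (items.size() > 1) throw new IllegalArgumentException("more than one item");
        return items.get(0);
    }

    // Like firstOrError: the first item, ignoring any later ones.
    static String first(List<String> items) {
        if (items.isEmpty()) throw new NoSuchElementException("no item");
        return items.get(0);
    }

    public static void main(String[] args) {
        System.out.println(first(matches("org-1", false)));  // duplicates are tolerated
        System.out.println(single(matches("org-1", true)));  // one match once cache events are dropped
    }
}
```

Calling single(matches("org-1", false)) throws, because the cached and the network copy both match. That is the situation the take(1) workaround was hiding, and filtering on the source flag removes the duplicate at its origin.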

Mono.zip functionality not working as expected

Mono.zip(
Mono.fromCallable(() -> queueDao.getNumberOfMessageInQueue(cartDaemonUrl)),
Mono.fromCallable(() -> queueDao.getNumberOfMessageInQueue(orderConfirmationDaemonUrl))
//ono.fromCallable(()->queueDao.getNumberOfMessageInQueue(cartDaemonUrl))
)
.map(T -> updateQueueCountToDb(T.getT1().block(), T.getT2().block())).
doOnSuccess(row -> log.info("Queue info inserted into db rows ")).
doOnError(e -> log.error("Error while inserting data stacktrace{}", e.getStackTrace()));
I am not able to figure out why control is not entering the updateQueueCountToDb method. I have added logs inside that method too, and even those logs are not printed to the console.
Following up on the comment by @Jesper: it is correct that you have to subscribe to a reactive stream to put it into action. Otherwise you are only building a pipeline that you expect data to flow through, and nothing runs.
I am also learning reactive streams so, I have created a quick example below that should help you:
Mono<String> first=Mono.just("first");
Mono<String> second=Mono.just("second");
Mono<Tuple2<String, String>> zipped=Mono.zip(first,second);
zipped.map(tuple->someOtherOperation(tuple.getT1(),tuple.getT2()))
.doOnSuccess(s->System.out.println("Success"))
.doOnError(s->System.out.println("Error"))
.subscribe();
My implementation of someOtherOperation is:
public static Mono<Void> someOtherOperation(String a, String b)
{
System.out.println("Performing Operation "+a+":"+b);
return Mono.empty();
}
So running my code snippet inside a java application will print following on your console:
Performing Operation first:second
Success
Also, subscribe is not the only method to subscribe to reactive streams, have a look at this document http://projectreactor.io/docs/core/release/reference/#_simple_ways_to_create_a_flux_or_mono_and_subscribe_to_it
