Invoking non-blocking operations sequentially while consuming from a Flux including retries - java

So my use-case is to consume messages from Kafka in a Spring Webflux application while programming in the reactive style using Project Reactor, and to perform a non-blocking operation for each message in the same order as the messages were received from Kafka. The system should also be able to recover on its own.
Here is the code snippet that is set up to consume from Kafka:
Flux<ReceiverRecord<Integer, DataDocument>> messages = Flux.defer(() -> {
KafkaReceiver<Integer, DataDocument> receiver = KafkaReceiver.create(options);
return receiver.receive();
});
messages.map(this::transformToOutputFormat)
.map(this::performAction)
.flatMapSequential(receiverRecordMono -> receiverRecordMono)
.doOnNext(record -> record.receiverOffset().acknowledge())
.doOnError(error -> logger.error("Error receiving record", error))
.retryBackoff(100, Duration.ofSeconds(5), Duration.ofMinutes(5))
.subscribe();
As you can see, what I do is: take the message from Kafka, transform it into an object intended for a new destination, then send it to the destination, and then acknowledge the offset to mark the message as consumed and processed. It is critical to acknowledge the offset in the same order as the messages being consumed from Kafka so that we don't move the offset beyond messages that were not fully processed (including sending some data to the destination). Hence I'm using a flatMapSequential to ensure this.
For simplicity let's assume the transformToOutputFormat() method is an identity transform.
public ReceiverRecord<Integer, DataDocument> transformToOutputFormat(ReceiverRecord<Integer, DataDocument> record) {
return record;
}
The performAction() method needs to do something over the network, say call an HTTP REST API. So the appropriate APIs return a Mono, which means the chain needs to be subscribed to. Also, I need the ReceiverRecord to be returned by this method so that the offset can be acknowledged in the flatMapSequential() operator above. Because I need the Mono subscribed to, I'm using flatMapSequential above. If not, I could have used a map instead.
public Mono<ReceiverRecord<Integer, DataDocument>> performAction(ReceiverRecord<Integer, DataDocument> record) {
return Mono.just(record)
.flatMap(receiverRecord ->
HttpClient.create()
.port(3000)
.get()
.uri("/makeCall?data=" + receiverRecord.value().getData())
.responseContent()
.aggregate()
.asString()
)
.retryBackoff(100, Duration.ofSeconds(5), Duration.ofMinutes(5))
.then(Mono.just(record));
}
I have two conflicting needs in this method:
1. Subscribe to the chain that makes the HTTP call
2. Return the ReceiverRecord
Using a flatMap() means my return type changes to a Mono. Using doOnNext() in the same place would retain the ReceiverRecord in the chain, but would not allow the HttpClient response to be subscribed to automatically.
I can't add .subscribe() after asString(), because I want to wait till the HTTP response is completely received before the offset is acknowledged.
I can't use .block() either, since this code runs on a parallel scheduler thread, where blocking is not permitted.
As a result, I need to cheat and return the record object from the method scope.
The other thing is that on a retry inside performAction it switches threads. Since flatMapSequential() eagerly subscribes to each Mono in the outer flux, this means that while acknowledgement of offsets can be guaranteed in order, we can't guarantee that the HTTP call in performAction will be performed in the same order.
So I have two questions.
Is it possible to return record in a natural way rather than returning the method scope object?
Is it possible to ensure that both the HTTP call as well as the offset acknowledgement are performed in the same order as the messages for which these operations are occurring?

Here is the solution I have come up with.
Flux<ReceiverRecord<Integer, DataDocument>> messages = Flux.defer(() -> {
KafkaReceiver<Integer, DataDocument> receiver = KafkaReceiver.create(options);
return receiver.receive();
});
messages.map(this::transformToOutputFormat)
.delayUntil(this::performAction)
.doOnNext(record -> record.receiverOffset().acknowledge())
.doOnError(error -> logger.error("Error receiving record", error))
.retryBackoff(100, Duration.ofSeconds(5), Duration.ofMinutes(5))
.subscribe();
Instead of using flatMapSequential to subscribe to the performAction Mono and preserve order, I delay the request for more messages from the Kafka receiver until the action has completed. This gives me the one-at-a-time processing that I need.
As a result, performAction doesn't need to return a Mono of ReceiverRecord. I also simplified it to the following:
public Mono<String> performAction(ReceiverRecord<Integer, DataDocument> record) {
// Return the Mono so that delayUntil() can subscribe to it
return HttpClient.create()
.port(3000)
.get()
.uri("/makeCall?data=" + record.value().getData())
.responseContent()
.aggregate()
.asString()
.retryBackoff(100, Duration.ofSeconds(5), Duration.ofMinutes(5));
}
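The ordering guarantee described above (the next message's action starts only after the previous message has been acknowledged) can be illustrated without any Reactor or Kafka dependency. The following is only a plain-JDK analogue built on CompletableFuture chaining, not the Reactor implementation; the class name, the integer "messages" and the simulated delays are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

class SequentialAckSketch {

    // Hypothetical stand-in for the HTTP call: completes asynchronously after a delay.
    // Later messages get shorter delays, so they would overtake earlier ones
    // if the actions were started concurrently.
    static CompletableFuture<String> performAction(int message, List<String> log) {
        Executor delayed =
            CompletableFuture.delayedExecutor(50L - message * 10L, TimeUnit.MILLISECONDS);
        return CompletableFuture.supplyAsync(() -> {
            log.add("action:" + message);
            return "response-" + message;
        }, delayed);
    }

    // Chains the messages so that each action starts only after the previous
    // message's action has completed and its "offset" has been acknowledged.
    static List<String> processInOrder(List<Integer> messages) {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);
        for (int message : messages) {
            chain = chain
                .thenCompose(ignored -> performAction(message, log))
                .thenRun(() -> log.add("ack:" + message));
        }
        chain.join(); // block only here, at the edge of the sketch
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(processInOrder(List.of(1, 2, 3)));
    }
}
```

Because every stage is attached to the completion of the previous one, the log alternates strictly between action and ack in message order, which is the same one-at-a-time behaviour the delayUntil pipeline relies on.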

Related

Create never ending hot stream in spring flux

I have a consumer that serves data. Data is consumed and processed - not reactive. Then I take this data and send it to:
Sinks.many().multicast().onBackpressureBuffer(Queues.SMALL_BUFFER_SIZE, false);
I am using
sink.emitNext(
message, retryOn(Sinks.EmitFailureHandler.FAIL_FAST, message));
as suggested in other stack posts, where I am taking care of
Sinks.EmitResult.FAIL_NON_SERIALIZED
Still stuck on reactor.core.Exceptions$OverflowException: Backpressure overflow during Sinks.Many#emitNext
On the front end, the subscriber is an EventSource. My GET endpoint (simplified, tried many things here):
@GetMapping(path = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<ServerSentEvent<Message>> streamData() {
return service
.getSink()
.asFlux()
.map(e -> ServerSentEvent.builder(e).event(e.getType().getId()).build());
}
I have tried the suggestions in other posts, but I have evidently not managed to build a never-ending hot event stream in which a subscriber, once connected, never misses a message.
Code is welcome, but opinions matter more.
Recap: a never-ending hot event source where no message is lost to the subscriber.
EDIT: Why the suggested posts don't work for me: I am missing the part between the consumer (of the queue), the sink, and the GET endpoint; I still end up with reactor.core.Exceptions$OverflowException: Backpressure overflow during Sinks.Many#emitNext
A consumer receives data from queue. Data is processed non reactive. Result data sent to... Should be received in EventSource in front end.
Used sink to send the processed data:
sink = Sinks.many().multicast().directAllOrNothing();
sink.emitNext(
ServerSentEvent.builder(message).event(message.getType().getId()).build(),
retryOnNonSerialized(Sinks.EmitFailureHandler.FAIL_FAST,message)
);
Controller:
@PostConstruct
private void loadFlux() {
log.info("Constructing Flux....");
// Build the shared hot Flux once; every subscriber to /stream receives this same instance
flux = service
.getSink()
.asFlux()
.publishOn(Schedulers.boundedElastic())
.onBackpressureBuffer()
.onBackpressureDrop(message -> log.debug("[STREAM] Backpressure message drop: {}", message))
.share();
}
@GetMapping(path = "/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<ServerSentEvent<Message>> streamData() {
return flux;
}
Works for me for never ending hot stream.

What is the proper way to wait till all Mono responses are returned from downstream APIs

I'm quite new to Mono and Flux. I'm trying to join several downstream API responses in a traditional blocking application. I don't want to collect a list of Monos; I want a List of the payloads returned by the downstream APIs, which I fetch from the Monos. However, the 'result' returned to the controller sometimes contains only some, or none, of the downstream API responses. What is the correct way to do this? I've read several posts. "How to iterate Flux and mix with Mono" states:
"You should not call subscribe anywhere in a web application. If this is bound to an HTTP request, you're basically triggering the reactive pipeline with no guarantee about resources or completion."
"Calling subscribe triggers the pipeline but does not wait until it's complete."
Should I be using CompletableFuture?
In my Service I attempted
var result = new ArrayList<List<X>>();
List<Mono<X>> monoList = apiCall();
Flux.fromIterable(monoList)
.flatMap(m -> m.doOnSuccess(
x -> {
result.add(x.getData());
}
)).subscribe();
I also attempted the following in controller, but the method returns without waiting for subscribe to complete
var result = new ArrayList<List<X>>();
Flux.concat(
this.service.callApis(result, ...)
).subscribe();
return result;
In my service
public Mono<Void> callApis(List<List<X>> result, ..) {
...
return Flux.fromIterable(monoList)
.flatMap(m -> m.doOnSuccess(
x -> {
result.add(x.getData()...);
}
)).then();
}
The Project Reactor documentation (which is very good) has a section called Which operator do I need?. You need to create a Flux from your API calls, combine the results, and then return to the synchronous world.
In your case, it looks like all your downstream services have the same API, so they all return the same type and it doesn't really matter what order those responses appear in your application. Also, I'm assuming that apiCall() returns a List<Mono<Response>>. You probably want something like
Flux.fromIterable(apiCall()) // Flux<Mono<Response>>
.flatMap(mono -> mono) // Flux<Response>
.map(response -> response.getData()) // Flux<List<X>>
.collectList() // Mono<List<List<X>>>
.block(); // List<List<X>>
The fromIterable(...).flatMap(x->x) construct just converts your List<Mono<R>> into a Flux<R>.
map() is used to extract the data part of your response.
collectList() creates a Mono that waits until the Flux completes, and gives a single result containing all the data lists.
block() subscribes to the Mono returned by the previous operator, and blocks until it is complete, which will (in this case) be when all the Monos returned by apiCall() have completed.
There are many possible alternatives here, and which is most suitable will depend on your exact use case.
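Since the question asks about CompletableFuture, one such alternative can be sketched with the JDK alone. This is a hedged illustration, not a recommendation over collectList().block(): the class name, callApi and its String payloads are hypothetical stand-ins for the real downstream calls. If the calls already return Mono, Mono.toFuture() can turn each one into a CompletableFuture.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class JoinAllSketch {

    // Hypothetical downstream call that returns its payload asynchronously.
    static CompletableFuture<List<String>> callApi(String name) {
        return CompletableFuture.supplyAsync(() -> List.of(name + "-a", name + "-b"));
    }

    // Starts every call, waits until all of them have completed,
    // then collects the payloads in the order the calls were issued.
    static List<List<String>> fetchAll(List<String> apis) {
        List<CompletableFuture<List<String>>> futures = apis.stream()
            .map(JoinAllSketch::callApi)
            .collect(Collectors.toList());

        // allOf completes only when every future has completed.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

        // Every future is already complete here, so join() returns immediately.
        return futures.stream()
            .map(CompletableFuture::join)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(List.of("x", "y")));
    }
}
```

Unlike the flatMap version, which emits results in arrival order, this sketch returns the payloads in the order of the input list. In both cases the caller gets the list back only after every call has finished, instead of mutating a shared list from a callback.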

How to limit concurrent http requests with Mono & Flux

I want to limit the number of concurrent HTTP requests made from a List of Monos.
As requests complete (responses are received), the service should start further requests, so that at most 15 requests are in flight at any time.
A single request returns a list, and each result triggers another request.
It is those follow-up requests that I want to send with limited concurrency,
because too many simultaneous HTTP requests put the server on the other side in trouble.
I used flatMapMany like below.
public Flux<JsonNode> syncData() {
return service1
.getData(param1)
.flatMapMany(res -> {
List<Mono<JsonNode>> totalTask = new ArrayList<>();
Map<String, Object> originData = service2.getDataFromDB(param2);
res.withArray("data").forEach(row -> {
String id = row.get("id").asText();
if (originData.containsKey(id)) {
totalTask.add(service1.updateRequest(param3));
} else {
totalTask.add(service1.deleteRequest(param4));
}
originData.remove(id);
});
for (String leftId : originData.keySet()) { // entries left in originData had no matching row
totalTask.add(service1.createRequest(param5));
}
return Flux.merge(totalTask);
});
}
void syncData() {
syncDataService.syncData().????;
}
I tried chaining .window(15), but it doesn't work. All the requests are sent simultaneously.
How can I handle Flux for my goal?
I am afraid Project Reactor doesn't provide a built-in implementation of either rate limiting or time limiting.
However, you can find a number of third-party libraries that provide such functionality and are compatible with Project Reactor. As far as I know, resilience4j-reactor supports this and is also compatible with Spring and Spring Boot.
The RateLimiterOperator checks if a downstream subscriber/observer can acquire a permission to subscribe to an upstream Publisher. If the rate limit would be exceeded, the RateLimiterOperator could either delay requesting data from the upstream or it can emit a RequestNotPermitted error to the downstream subscriber.
RateLimiter rateLimiter = RateLimiter.ofDefaults("name");
Mono.fromCallable(backendService::doSomething)
.transformDeferred(RateLimiterOperator.of(rateLimiter))
More about RateLimiter module itself here: https://resilience4j.readme.io/docs/ratelimiter
You can use limitRate on a Flux. You will probably need to restructure your code a bit, but see the docs here: https://projectreactor.io/docs/core/release/api/reactor/core/publisher/Flux.html#limitRate-int-
flatMap takes a concurrency parameter: https://projectreactor.io/docs/core/release/api/reactor/core/publisher/Flux.html#flatMap-java.util.function.Function-int-
Mono<User> getById(int userId) { ... }
Flux.just(1, 2, 3, 4).flatMap(client::getById, 2)
will limit the number of concurrent requests to 2.
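The effect of a concurrency cap ("start the next request only when one of the in-flight requests has finished") can be observed with a small dependency-free experiment. The sketch below is a blocking, thread-pool based analogue written with the JDK only. It is not how flatMap is implemented, and the class name, the counters and the simulated 20 ms call are assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

class BoundedConcurrencySketch {
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger maxInFlight = new AtomicInteger();

    // Hypothetical remote call; records how many calls overlap in time.
    static String getById(int userId) {
        int now = inFlight.incrementAndGet();
        maxInFlight.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(20); // simulated network latency
            return "user-" + userId;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            inFlight.decrementAndGet();
        }
    }

    // A pool with `concurrency` threads can run at most that many calls at once;
    // the remaining calls wait in the pool's queue until a thread is free.
    static List<String> fetchAll(List<Integer> ids, int concurrency) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        try {
            List<CompletableFuture<String>> futures = ids.stream()
                .map(id -> CompletableFuture.supplyAsync(() -> getById(id), pool))
                .collect(Collectors.toList());
            return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(List.of(1, 2, 3, 4), 2));
        System.out.println("max in flight: " + maxInFlight.get());
    }
}
```

With a limit of 2, the recorded maximum never exceeds 2, however many ids are submitted. In the Reactor version the same bound comes from the second argument to flatMap, without dedicating a thread to each waiting call.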

Project Reactor async send email with retry on error

I need to send some data after a user registers. I want to make the first attempt on the main thread, but if there is an error, I want to retry 5 times at 10-minute intervals.
@Override
public void sendRegisterInfo(MailData data) {
Mono.just(data)
.doOnNext(this::send)
.doOnError(ex -> logger.warn("Main queue {}", ex.getMessage()))
.doOnSuccess(d -> logger.info("Send mail to {}", d.getRecipient()))
.onErrorResume(ex -> retryQueue(data))
.subscribe();
}
private Mono<MailData> retryQueue(MailData data) {
return Mono.just(data)
.delayElement(Duration.of(10, ChronoUnit.MINUTES))
.doOnNext(this::send)
.doOnError(ex -> logger.warn("Retry queue {}", ex.getMessage()))
.doOnSuccess(d -> logger.info("Send mail to {}", d.getRecipient()))
.retry(5);
}
It works.
But I've got some questions:
Is it correct to perform the operation in doOnNext?
Is it correct to use delayElement to add a delay between executions?
Is the thread blocked while waiting for the delay?
And what is the best practice for retrying on error with a delay between attempts?
doOnXXX for logging is fine. But for the actual element processing, you should prefer flatMap over doOnNext (assuming your processing is asynchronous / can be converted to returning a Flux/Mono).
This is correct. Another way is to turn the code around and start from a Flux.interval, but here delayElement is better IMO.
The delay runs on a separate thread/scheduler (by default, Schedulers.parallel()), so not blocking the main thread.
There's actually a Retry builder dedicated to that kind of use case in the reactor-extra addon: https://github.com/reactor/reactor-addons/blob/master/reactor-extra/src/main/java/reactor/retry/Retry.java
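The policy asked for in the question (first attempt on the calling thread, then a fixed number of delayed retries that do not block the caller) can be sketched with the JDK alone. This is a hedged illustration, not the reactor-extra Retry builder: the class and method names are hypothetical, and the delay is passed in milliseconds so the sketch can be tried quickly. In reactor-core 3.3.4 and later, retryWhen(Retry.fixedDelay(5, Duration.ofMinutes(10))) should express a comparable policy.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class DelayedRetrySketch {

    // Runs the first attempt on the calling thread; failed attempts are retried
    // on the scheduler after delayMillis, up to maxRetries times.
    static <T> CompletableFuture<T> withRetry(Callable<T> task, int maxRetries,
                                              long delayMillis,
                                              ScheduledExecutorService scheduler) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(task, maxRetries, delayMillis, scheduler, result);
        return result;
    }

    private static <T> void attempt(Callable<T> task, int retriesLeft, long delayMillis,
                                    ScheduledExecutorService scheduler,
                                    CompletableFuture<T> result) {
        try {
            result.complete(task.call());
        } catch (Exception ex) {
            if (retriesLeft == 0) {
                result.completeExceptionally(ex); // retries exhausted
                return;
            }
            // Waiting happens inside the scheduler, so no thread sleeps during the delay.
            scheduler.schedule(
                () -> attempt(task, retriesLeft - 1, delayMillis, scheduler, result),
                delayMillis, TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger attempts = new AtomicInteger();
        // Hypothetical mail sender that fails on its first two attempts.
        String outcome = DelayedRetrySketch.<String>withRetry(() -> {
            if (attempts.incrementAndGet() < 3) {
                throw new IllegalStateException("mail server unavailable");
            }
            return "sent";
        }, 5, 10, scheduler).join();
        System.out.println(outcome + " after " + attempts.get() + " attempts");
        scheduler.shutdown();
    }
}
```

As with delayElement, the waiting is done by a scheduler rather than by the thread that called sendRegisterInfo, so the caller returns right after the first attempt.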

Modeling an event Sink in RxJava for events that need onComplete/onError

I'm in the process of writing a client for Apache Mesos' new HTTP Scheduler API using RxJava and RxNetty.
I've managed to successfully create the connection with RxNetty and create an Observable<Event> from the resulting chunked stream.
Now I'm at the point of trying to model a sink that can be used to send calls back to Mesos in order to claim/decline resource offers, acknowledge task status updates, etc.
The message that will be sent to Mesos is a Call, and I need to be able to provide an onCompleted or onError for every Call that comes into the Sink. This is because Mesos performs synchronous validation on each Call sent to it.
I'm essentially trying to allow for the following:
final MesosSchedulerClient client = new MesosSchedulerClient();
final Observable<Event> events = client.openEventStream(subscribeCall);
final Observable<Observable<Call>> ackCalls = events
.filter(event -> event.getType() == Event.Type.UPDATE && event.getUpdate().getStatus().hasUuid())
.zipWith(frameworkIDObservable, (Event e, AtomicReference<FrameworkID> fwId) -> {
final TaskStatus status = e.getUpdate().getStatus();
final Call ackCall = ackUpdate(fwId.get(), status.getUuid(), status.getAgentId(), status.getTaskId());
return Observable.just(ackCall)
.doOnCompleted(() -> { ... })
.doOnError((err) -> { ... });
});
client.sink(ackCalls);
Right now I've come up with a custom object[1] that extends Subject and specifies the Call and Action0 for onCompleted and Action1<Throwable> for onError. Though, I would prefer to use the existing constructs from RxJava if possible. Sample usage of what I've come up with[2].
Any guidance would be greatly appreciated.
[1] https://github.com/BenWhitehead/mesos-rxjava/blob/sink-operation/mesos-rxjava-core/src/main/java/org/apache/mesos/rx/java/SinkOperation.java#L17
[2] https://github.com/BenWhitehead/mesos-rxjava/blob/sink-operation/mesos-rxjava-example/mesos-rxjava-example-framework/src/main/java/org/apache/mesos/rx/java/example/framework/sleepy/Main.java#L117-L124
The solution I ended up with was to create a custom Subscriber that processes the event stream and sends the requests back to Mesos.
