How do I configure backpressure in Spring WebFlux? - java

I'm trying to understand how to apply backpressure in Spring WebFlux. I understand the theory of backpressure, but I can't reproduce it, so I don't fully understand it.
Let's take the following example:
public void test() throws InterruptedException {
EmitterProcessor<String> processor = EmitterProcessor.create();
new Thread(() -> {
int i = 0;
while(runThread) {
try {
Thread.sleep(100);
} catch (InterruptedException ignored) {
}
processor.onNext("Value: " + i);
i++;
}
processor.onComplete();
}).start();
processor
.subscribe(makeSubscriber("FIRST - "), Throwable::printStackTrace);
}
private Consumer<String> makeSubscriber(String label) {
return v -> {
System.out.println(label + v);
try {
Thread.sleep(1000);
} catch (InterruptedException ignored) {
}
};
}
I have created a Hot Flux in the form of an EmitterProcessor and in a separate thread I start producing data for it.
A bit lower, I subscribe to it. The subscriber is slower than the rate at which elements are being produced, so the issues should start to occur, right?
But the subscriber logic is run on the producer thread. When I call processor.onNext(), it synchronously calls all the subscribers, so if the subscribers are slow, the publisher is slowed down as well. So, then backpressure doesn't even seem useful.
I have also tried making two Spring Boot WebFlux applications, one with a Flux endpoint and one that consumes the endpoint, so I can be certain the consumer runs on a separate thread. But then, any attempt I make at backpressure in the consumer does nothing. There is no buffer being filled, there is nothing being dropped or anything!
Can anyone give me a concrete example of backpressure? Preferably in Spring WebFlux but I'll take any reactive Java library.

The documentation for the variant of the subscribe method you have chosen reads:
The subscription will request an unbounded demand (Long.MAX_VALUE).
That is, you switched off backpressure yourself.
To use backpressure, subscribe with Flux.subscribe(Subscriber) and request elements explicitly through the Subscription.
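To make the idea of bounded demand concrete, here is a minimal sketch that uses only the JDK's java.util.concurrent.Flow API rather than Reactor, so it is an analogue of the advice above and not the Reactor code itself. The class and method names (BoundedDemandDemo, OneAtATimeSubscriber, run) are made up for illustration. The subscriber asks for one element at a time, and the publisher is given a buffer of 2, so submit() blocks the producer once the subscriber falls behind. In Reactor, the same pattern would go through a Subscriber passed to Flux.subscribe(Subscriber), for example a BaseSubscriber that calls request(1).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

final class BoundedDemandDemo {

    /** Subscriber that requests one element at a time instead of Long.MAX_VALUE. */
    static final class OneAtATimeSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription subscription) {
            this.subscription = subscription;
            subscription.request(1); // initial demand: exactly one element
        }

        @Override
        public void onNext(Integer item) {
            received.add(item);
            subscription.request(1); // ask for the next element only when ready
        }

        @Override
        public void onError(Throwable error) {
            done.countDown();
        }

        @Override
        public void onComplete() {
            done.countDown();
        }
    }

    /** Publishes 0..count-1 through a small buffer and returns what the subscriber received. */
    static List<Integer> run(int count) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber();
        // Buffer capacity 2: submit() blocks while the subscriber's buffer is full.
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>(executor, 2)) {
            publisher.subscribe(subscriber);
            for (int i = 0; i < count; i++) {
                publisher.submit(i);
            }
        } // close() signals onComplete once the buffered elements are delivered
        try {
            subscriber.done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // prints [0, 1, 2, 3, 4]
    }
}
```

Adding a Thread.sleep inside onNext would make the effect visible: the producer loop then advances only as fast as the subscriber requests elements, plus the two buffered slots.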

Related

Leverage PriorityBlockingQueue to build producer-consumer pattern in Java Reactor

In my project, a Spring scheduler periodically scans the DB for "TO BE DONE" tasks, then distributes them to a task consumer for subsequent handling. The current implementation puts a Reactor Sink between producer and consumer.
Sinks.Many<Task> taskSink = Sinks.many().multicast().onBackpressureBuffer(1000, false);
Producer:
Flux<Date> dates = loadDates();
dates.filterWhen(...)
.concatMap(date -> taskManager.getTaskByDate(date))
.doOnNext(taskSink::tryEmitNext)
.subscribe();
Consumer:
taskProcessor.process(taskSink.asFlux())
.subscribeOn(Schedulers.boundedElastic())
.subscribe();
Using a Sink works fine in most cases. But when the system is under heavy load, the system maintainer wants to know:
How many tasks are still sitting in the Sink?
Is it possible to clear all tasks within the Sink?
Is it possible to prioritize tasks within the Sink?
Unfortunately, a Sink cannot fulfill all of these needs.
So I created a wrapper class that includes a Map and a PriorityBlockingQueue. I referenced the implementation from this link: https://stackoverflow.com/a/71009712/19278017.
After that, I revised the original producer-consumer code as below:
Task queue:
MergingQueue<Task> taskQueue = new PriorityMergingQueue();
Producer:
Flux<Date> dates = loadDates();
dates.filterWhen(...)
.concatMap(date -> taskManager.getTaskByDate(date))
.doOnNext(taskQueue::enqueue)
.subscribe();
Consumer:
taskProcessor.process(Flux.create(sink -> {
sink.onRequest(n -> {
try {
while (!sink.isCancelled() && n > 0) {
// poll() blocks the calling thread for up to one second
Task task = taskQueue.poll(1, TimeUnit.SECONDS);
if (task != null) {
sink.next(task);
n--;
}
}
} catch (InterruptedException e) {
....
}
});
}))
.subscribeOn(Schedulers.boundedElastic())
.subscribe();
I got some questions as below:
Will calling .poll() in this code be an issue? I came across a thread hang during longevity testing, but I'm not sure whether it's due to the poll() call.
Is there any alternative solution in Reactor that works like a PriorityBlockingQueue?
The goal of reactive programming is to avoid blocking operations. PriorityBlockingQueue.poll() will cause issues as it will block the thread waiting for the next element.
There is, however, an alternative solution in Reactor: the unicast version of Sinks.Many allows using an arbitrary Queue for buffering via Sinks.many().unicast().onBackpressureBuffer(Queue<T>). By using a PriorityQueue instantiated outside of the Sink, you can fulfill all three requirements.
Here is a short demo where I emit a Task every 100ms:
public record Task(int prio) {}
private static void log(Object message) {
System.out.println(LocalTime.now(ZoneOffset.UTC).truncatedTo(ChronoUnit.MILLIS) + ": " + message);
}
public void externalBufferDemo() throws InterruptedException {
Queue<Task> taskQueue = new PriorityQueue<>(Comparator.comparingInt(Task::prio).reversed());
Sinks.Many<Task> taskSink = Sinks.many().unicast().onBackpressureBuffer(taskQueue);
taskSink.asFlux()
.delayElements(Duration.ofMillis(100))
.subscribe(task -> log(task));
for (int i = 0; i < 10; i++) {
taskSink.tryEmitNext(new Task(i));
}
// Show amount of tasks sitting in the Sink:
log("Nr of tasks in sink: " + taskQueue.size());
// Clear all tasks in the sink after 350ms:
Thread.sleep(350);
taskQueue.clear();
log("Nr of tasks after clear: " + taskQueue.size());
Thread.sleep(1500);
}
Output:
09:41:11.347: Nr of tasks in sink: 9
09:41:11.450: Task[prio=0]
09:41:11.577: Task[prio=9]
09:41:11.687: Task[prio=8]
09:41:11.705: Nr of tasks after clear: 0
09:41:11.799: Task[prio=7]
Note that delayElements has an internal queue of size 1, which is why Task 0 was picked up before Task 1 was emitted, and why Task 7 was picked up after the clear.
If multicast is required, you can transform your flux using one of the many operators enabling multicasting.

How to create blocking backpressure with rxjava Flowables?

I have a function that returns a Flowable which continually reads from a database and emits the results.
public Flowable<String> scan() {
Flowable<String> flow = Flowable.create((FlowableOnSubscribe<String>) emitter -> {
Result result = new Result();
while (!result.hasData()) {
result = request.query(skip, limit);
result.getResult()
.getFeatures().forEach(feature -> emitter.onNext(feature));
}
}, BackpressureStrategy.BUFFER)
.subscribeOn(Schedulers.io());
return flow;
}
Then I have another object that can call this method.
myObj.scan()
.parallel()
.runOn(Schedulers.computation())
.map(feature -> {
//Heavy Computation
})
.sequential()
.blockingSubscribe(msg -> {
logger.debug("Successfully processed " + msg);
}, (e) -> {
logger.error("Failed to process features because of error with scan", e);
});
My heavy computation section could potentially take a very long time. So long in fact that there is a good chance that the database requests will load the whole database into memory before the consumer finishes the first couple entries.
I have read up on backpressure with RxJava, but the only four options essentially make me drop data or replace it with the latest value.
Is there a way to make the call to emitter.onNext(feature) block until there is more room in the Flowable?
I.e., I want to treat the Flowable as a blocking queue where a push will sleep if the queue is at capacity.
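The blocking-queue behavior described above can be sketched with the JDK alone. This is not RxJava code; it only illustrates the semantics being asked for, under the assumption that a bounded ArrayBlockingQueue sits between the database reader and the heavy computation. The names BlockingHandoffDemo, transfer and END are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class BlockingHandoffDemo {

    // Hypothetical end-of-stream marker, compared by identity so it cannot clash with real data.
    private static final String END = new String("END");

    /** Moves every feature through a bounded queue; the producer blocks whenever the queue is full. */
    static List<String> transfer(List<String> features, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);

        Thread producer = new Thread(() -> {
            try {
                for (String feature : features) {
                    queue.put(feature); // blocks while the queue holds `capacity` elements
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        List<String> processed = new ArrayList<>();
        try {
            for (String feature = queue.take(); feature != END; feature = queue.take()) {
                processed.add("processed " + feature); // stand-in for the heavy computation
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(transfer(List.of("a", "b", "c"), 1)); // prints [processed a, processed b, processed c]
    }
}
```

With a capacity of 1 the reader can never be more than one element ahead of the consumer, so memory use stays bounded no matter how slow the computation is.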

Guaranteed delivery of multiple messages to Kafka cluster

If I publish several messages in a row to a Kafka cluster (using the new Producer API), I get a Future from the producer for each message.
Now, assuming I have configured my producer to have max.in.flight.requests.per.connection = 1 and retries > 0 can I just wait on the last future and be certain that all previous have also been delivered (and in order)? Or do I need to wait on all Futures?
In code, can I do this:
Producer<String, String> producer = new KafkaProducer<>(myConfig);
Future<?> f = null;
for(MessageType message : messages){
f = producer.send(new ProducerRecord<String,String>("myTopic", message.getKey(), message.getValue()));
}
try {
f.get();
} catch(ExecutionException e) {
//handle exception
}
instead of this:
Producer<String, String> producer = new KafkaProducer<>(myConfig);
List<Future<?>> futureList = new ArrayList<>();
for(MessageType message : messages){
futureList.add(producer.send(new ProducerRecord<String,String>("myTopic", message.getKey(), message.getValue())));
}
try {
for(Future<?> f : futureList) {
f.get();
}
} catch(ExecutionException e) {
//handle exception
}
and be assured that if nothing is caught here (from first snippet):
try {
f.get();
} catch(ExecutionException e) {
then all my messages have been stored in the cluster in order (whether or not the producer performed any retries under the hood) and if something goes wrong then I WILL get an exception there even if it was not the last future (that I'm waiting on) that first encountered the problem?
Are there any more strange corner cases to be aware of?
You can do this, but only if you a) set retries to be infinite (or effectively infinite) and b) are ok discarding data if you encounter a non-retriable exception.
To explain a bit more, Kafka has two classes of exceptions. Retriable exceptions are failures where you might be able to succeed if you run it again. For example, the NotEnoughReplicasException indicates that there are fewer replicas than you require and so the request gets rejected. But if a failed broker comes back online, then you might have enough replicas, be back in good shape, and the request will succeed if you send it again. In contrast, a SerializationException is not retriable because we have no reason to believe that if you try to serialize again the result will be different.
The producer retries only apply up to the point you hit a non-retriable exception. So if you never hit any of these, use infinite retries, and use the other settings you mentioned, the ordering and successful delivery are guaranteed once the final future has been resolved. However, since you might encounter non-retriable exceptions, it is definitely much better to handle each future (or callback) and ensure you at least log something if a request fails.
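To illustrate why waiting only on the last future can hide a failure, here is a small JDK-only sketch. It does not use the Kafka client: CompletableFuture instances stand in for the futures returned by producer.send, and the middle one is completed with a simulated non-retriable error. The names CheckEveryFutureDemo, failedIndexes and sampleFutures are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

final class CheckEveryFutureDemo {

    /** Waits on every future and returns the indexes of those that completed with an error. */
    static List<Integer> failedIndexes(List<? extends Future<?>> futures) {
        List<Integer> failed = new ArrayList<>();
        for (int i = 0; i < futures.size(); i++) {
            try {
                futures.get(i).get();
            } catch (ExecutionException e) {
                failed.add(i); // at least record which send failed
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return failed;
    }

    /** Three simulated sends: the second fails, the last one succeeds. */
    static List<Future<String>> sampleFutures() {
        CompletableFuture<String> failedSend = new CompletableFuture<>();
        failedSend.completeExceptionally(new IllegalArgumentException("simulated non-retriable error"));
        return List.of(
                CompletableFuture.completedFuture("m0"),
                failedSend,
                CompletableFuture.completedFuture("m2"));
    }

    public static void main(String[] args) {
        System.out.println(failedIndexes(sampleFutures())); // prints [1]
    }
}
```

Calling get() on only the last element of sampleFutures() would return normally, even though the second send failed.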
Further to what Ewen said, you could also make a call to flush() after you finished sending all your messages in the loop. This call will block until all futures have been completed, so after this you can check the futures for any exceptions. You'd need to hold on to all futures to be able to do this though.
An alternative would be to use a callback with your sends and store any returned exceptions, as shown below. Calling flush again ensures that all sends have completed before you check for exceptions.
Producer<String, String> producer = new KafkaProducer<>(myConfig);
final ArrayList<Exception> exceptionList = new ArrayList<>();
for(MessageType message : messages){
producer.send(new ProducerRecord<String, String>("myTopic", message.getKey(), message.getValue()), new Callback() {
@Override
public void onCompletion(RecordMetadata metadata, Exception exception) {
if (exception != null) {
exceptionList.add(exception);
}
}
});
}
producer.flush();
if (!exceptionList.isEmpty()) {
// do stuff
}

Rx java OutOfMemory

EDITED: see this question which is more clear and precise:
RxJava flatMap and backpressure strange behavior
I'm currently writing a data synchronization job with RxJava, and I'm quite a novice with reactive programming and especially with the RxJava library.
My job is quite simple: I have a list of element IDs, I call a web service to get each element by ID, do some processing, and make multiple calls to push the data to the DB.
I load the data from the WS with 1 io thread and push the data to the DB with multiple io threads.
However, I always end up with an OutOfMemory error.
At first I thought that loading the data from the WS was faster than storing it in the DB.
But since both the WS call and the DB call are synchronous, shouldn't they exert backpressure on each other?
Thank you for your help.
My code pretty much looks like this:
@Test
public void test() {
int MAX_CONCURRENT_LOAD = 1;
int MAX_CONCURRENT_STORE = 2;
List<Integer> ids = IntStream.range(0, 10000).boxed().collect(Collectors.toList());
Observable.from(ids)
.flatMap(this::produce, MAX_CONCURRENT_LOAD)
.flatMap(this::consume, MAX_CONCURRENT_STORE)
.toBlocking().forEach(s -> System.out.println("Value " + s));
System.out.println("Finished");
}
private Observable<Integer> produce(final int value) {
return Observable.<Integer>create(s -> {
try {
if (!s.isUnsubscribed()) {
Thread.sleep(500); //Here I call WS to retrieve data
s.onNext(value);
s.onCompleted();
}
} catch (Exception e) {
s.onError(e);
}
}).subscribeOn(Schedulers.io());
}
private Observable<Boolean> consume(Integer value) {
return Observable.<Boolean>create(s -> {
try {
if (!s.isUnsubscribed()) {
Thread.sleep(10000); //Here I call DB to store data
s.onNext(true);
s.onCompleted();
}
} catch (Exception e) {
s.onNext(false);
s.onCompleted();
}
}).subscribeOn(Schedulers.io());
}
It seems your WS is poll based so if you use fromCallable instead of your custom Observable, you get proper backpressure:
return Observable.<Integer>fromCallable(() -> {
Thread.sleep(500); //Here I call WS to retrieve data
return value;
}).subscribeOn(Schedulers.io());
Otherwise, if you have blocking WS and blocking database, you can use them to backpressure each other:
Observable.from(ids)
.map(id -> db.store(ws.get(id)))
.subscribeOn(Schedulers.io())
.toBlocking().subscribe(...)
and potentially leave off subscribeOn and toBlocking as well.

Proper termination of a stuck Couchbase Observable

I'm trying to delete a batch of couchbase documents in rapid fashion according to some constraint (or update the document if the constraint isn't satisfied). Each deletion is dubbed a "parcel" according to my terminology.
When executing, I run into a very strange behavior - the thread in charge of this task starts working as expected for a few iterations (at best). After this "grace period", couchbase gets "stuck" and the Observable doesn't call any of its Subscriber's methods (onNext, onComplete, onError) within the defined period of 30 seconds.
When the latch timeout occurs (see implementation below), the method returns but the Observable keeps executing (I noticed that when it kept printing debug messages when stopped with a breakpoint outside the scope of this method).
I suspect couchbase is stuck because after a few seconds, many Observables are left in some kind of a "ghost" state - alive and reporting to their Subscriber, which in turn have nothing to do because the method in which they were created has already finished, eventually leading to java.lang.OutOfMemoryError: GC overhead limit exceeded.
I don't know if what I claim here makes sense, but I can't think of another reason for this behavior.
How should I properly terminate an Observable upon timeout? Should I? Any other way around?
public List<InfoParcel> upsertParcels(final Collection<InfoParcel> parcels) {
final CountDownLatch latch = new CountDownLatch(parcels.size());
final List<JsonDocument> docRetList = new LinkedList<JsonDocument>();
Observable<JsonDocument> obs = Observable
.from(parcels)
.flatMap(parcel ->
Observable.defer(() ->
{
return bucket.async().get(parcel.key).firstOrDefault(null);
})
.map(doc -> {
// In-memory manipulation of the document
return updateDocs(doc, parcel);
})
.flatMap(doc -> {
boolean shouldDelete = ... // Decide by inner logic
if (shouldDelete) {
if (doc.cas() == 0) {
return Observable.just(doc);
}
return bucket.async().remove(doc);
}
return (doc.cas() == 0 ? bucket.async().insert(doc) : bucket.async().replace(doc));
})
);
obs.subscribe(new Subscriber<JsonDocument>() {
@Override
public void onNext(JsonDocument doc) {
docRetList.add(doc);
latch.countDown();
}
@Override
public void onCompleted() {
// Due to a bug in RxJava, onError() / retryWhen() does not intercept exceptions thrown from within the map/flatMap methods.
// Therefore, we need to recalculate the "conflicted" parcels and send them for update again.
while(latch.getCount() > 0) {
latch.countDown();
}
}
@Override
public void onError(Throwable e) {
// Same reason as above
while (latch.getCount() > 0) {
latch.countDown();
}
}
});
latch.await(30, TimeUnit.SECONDS);
// Recalculating remaining failed parcels and returning them for another cycle of this method (there's a loop outside)
}
I think this is indeed due to the fact that using a countdown latch doesn't signal the source that the flow of data processing should stop.
You could lean more on RxJava by using toList().timeout(30, TimeUnit.SECONDS).toBlocking().single() instead of collecting into an external list (which is unsynchronized and thus unsafe) and using the CountDownLatch.
This will block until a List of your documents is returned.
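The difference between "stop waiting" and "stop the work" can be shown with a JDK-only sketch. It uses neither RxJava nor the Couchbase SDK: ExecutorService.invokeAll with a timeout plays the role of the timeout operator, because it cancels every task that has not finished when the time is up, whereas a CountDownLatch timeout only releases the waiting thread. The names TimeoutCancelDemo, collectWithTimeout and sampleTasks are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class TimeoutCancelDemo {

    /** Runs all tasks, returns the results that arrived in time, and cancels the rest. */
    static List<String> collectWithTimeout(List<Callable<String>> tasks, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        try {
            // invokeAll cancels (and interrupts) every task still running when the timeout expires.
            List<Future<String>> futures = pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            List<String> completed = new ArrayList<>();
            for (Future<String> future : futures) {
                if (future.isCancelled()) {
                    continue; // timed out: the work was stopped, not just abandoned
                }
                try {
                    completed.add(future.get()); // already done, so this does not block
                } catch (ExecutionException e) {
                    // the task itself failed; skipped in this sketch
                }
            }
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    /** One task that finishes immediately and one that simulates a stuck call. */
    static List<Callable<String>> sampleTasks() {
        Callable<String> fast = () -> "fast";
        Callable<String> stuck = () -> {
            Thread.sleep(10_000);
            return "stuck";
        };
        return List.of(fast, stuck);
    }

    public static void main(String[] args) {
        System.out.println(collectWithTimeout(sampleTasks(), 500));
    }
}
```

The stuck task is interrupted after the timeout instead of lingering in the background, which is the property the latch-based version lacks.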
When you create your Couchbase environment in code, set computationPoolSize to something large. When the Couchbase client runs out of threads while using async, it just stops working and won't ever call the callback.
