Spring Reactor: switchIfEmpty with onErrorContinue - Bug or Feature? - java

I have made an interesting observation using switchIfEmpty in conjunction with onErrorContinue.
Given the following pipeline (in Kotlin):
Flux.fromIterable(keys).concatMap { key ->
    someRepository.findByKey(key)
        .map { domain -> someFunction(domain) }
        .flatMap { domain -> someRepository.save(domain) }
        .switchIfEmpty(Mono.defer { onEmptyFunction().toMono() })
}
The outer pipeline maps a Flux of keys to a Mono of a domain object, pipes it through someFunction, and saves the result via someRepository.save.
If findByKey completes without data (the domain object cannot be found in the repository), the pipeline goes straight to switchIfEmpty. Okay, so far so good.
This works perfectly until the inner pipeline throws an exception: then the pipeline does not continue any further, and the subscriber is notified via its onError consumer function.
What if we want to continue the inner pipeline even if an exception is thrown and just try the next key from the outer Flux?
That's possible if we introduce a new operator, onErrorContinue:
Flux.fromIterable(keys).concatMap { key ->
    someRepository.findByKey(key)
        .map { domain -> someFunction(domain) }
        .flatMap { domain -> someRepository.save(domain) }
        .switchIfEmpty(Mono.defer { onEmptyFunction().toMono() })
}.onErrorContinue(::someLog)
Now the pipeline should continue working even if the inner pipeline throws an exception (and it does so, I tested it).
However, here comes the interesting observation:
The switchIfEmpty function is ALSO getting called if the inner pipeline throws an Exception.
This means switchIfEmpty gets executed in TWO events:
- findByKey completes with an empty Mono
- inner pipeline throws an Exception (e.g. someFunction)
This is fundamentally different from the above pipeline without onErrorContinue, where switchIfEmpty is NOT executed if the inner pipeline throws an exception.
Questions:
Is this intended behaviour?
Is this a bug?
Can I somehow control that switchIfEmpty is ONLY executed on empty data, not on errors?
This is a really dangerous situation which bugged me for a while!
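One common way to keep switchIfEmpty empty-only is to scope error recovery per key inside concatMap rather than relying on onErrorContinue downstream. A minimal sketch, with hypothetical stand-ins (findByKey, process) replacing the repository calls from the question:

```java
import java.util.List;

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class SwitchIfEmptyScope {

    // Hypothetical stand-in for someRepository.findByKey(key).
    static Mono<String> findByKey(String key) {
        return "missing".equals(key) ? Mono.empty() : Mono.just(key);
    }

    // Hypothetical stand-in for someFunction / save, which may fail.
    static Mono<String> process(String domain) {
        if ("boom".equals(domain)) {
            return Mono.error(new IllegalStateException("boom"));
        }
        return Mono.just(domain.toUpperCase());
    }

    public static Flux<String> pipeline(List<String> keys) {
        return Flux.fromIterable(keys).concatMap(key ->
                findByKey(key)
                        .flatMap(SwitchIfEmptyScope::process)
                        // fires only when the inner chain was truly empty
                        .switchIfEmpty(Mono.just("DEFAULT"))
                        // placed AFTER switchIfEmpty: an error drops this key and
                        // moves on to the next one, without triggering switchIfEmpty
                        .onErrorResume(e -> Mono.empty()));
    }
}
```

With this arrangement an erroring key is simply skipped, while an empty findByKey still falls through to the default; no onErrorContinue is involved.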


Spring Webflux Proper Way To Find and Save

I created the below method to find an Analysis object, update the results field on it and then lastly save the result in the database but not wait for a return.
public void updateAnalysisWithResults(String uuidString, String results) {
    findByUUID(uuidString).subscribe(analysis -> {
        analysis.setResults(results);
        computeSCARepository.save(analysis).subscribe();
    });
}
This feels poorly written to subscribe within a subscribe.
Is this a bad practice?
Is there a better way to write this?
UPDATE:
entry point
@PatchMapping("compute/{uuid}/results")
public Mono<Void> patchAnalysisWithResults(@PathVariable String uuid, @RequestBody String results) {
    return computeSCAService.updateAnalysisWithResults(uuid, results);
}
public Mono<Void> updateAnalysisWithResults(String uuidString, String results) {
    // findByUUID(uuidString).subscribe(analysis -> {
    //     analysis.setResults(results);
    //     computeSCARepository.save(analysis).subscribe();
    // });
    return findByUUID(uuidString)
            .doOnNext(analysis -> analysis.setResults(results))
            .doOnNext(computeSCARepository::save)
            .then();
}
The reason it is not working is that you have misunderstood what doOnNext does.
Let's start from the beginning.
A Flux or a Mono is a producer; it produces items. Your application produces things to the calling client, hence it should always return either a Mono or a Flux. If you don't want to return anything you should return a Mono<Void>.
When the client subscribes to your application, what reactor will do is call all operators in the opposite direction until it finds a producer. This is what is called the assembly phase. If all your operators don't chain together, you are doing what I call breaking the reactive chain.
When you break the chain, the things broken off from the chain won't be executed.
If we look at your example but in a more exploded version:
@Test
void brokenChainTest() {
    updateAnalysisWithResults("12345", "Foo").subscribe();
}

public Mono<Void> updateAnalysisWithResults(String uuidString, String results) {
    return findByUUID(uuidString)
            .doOnNext(analysis -> analysis.setValue(results))
            .doOnNext(this::save)
            .then();
}

private Mono<Data> save(Data data) {
    return Mono.fromCallable(() -> {
        System.out.println("Will not print");
        return data;
    });
}

private Mono<Data> findByUUID(String uuidString) {
    return Mono.just(new Data());
}

private static class Data {
    private String value;

    public void setValue(String value) {
        this.value = value;
    }
}
In the above example, save is a function that returns a producer. But if we run the test, you will notice that the print statement is never executed.
This has to do with the usage of doOnNext. If we read the docs for it, it says:
Add behavior triggered when the Mono emits a data successfully.
The Consumer is executed first, then the onNext signal is propagated downstream.
doOnNext takes a Consumer that returns void. If we look at doOnNext, the function signature looks as follows:
public final Mono<T> doOnNext(Consumer<? super T> onNext)
This means it takes a Consumer of T (or any supertype of T) and returns a Mono<T>. So, to keep a long explanation short: it consumes something, and the same something keeps flowing downstream.
This means doOnNext is usually used for what are called side effects: things done on the side that do not hinder the current flow. Logging is one of those things; it consumes, for instance, a string and logs it while we keep the string flowing down our program. Or maybe we want to increment a number on the side, or modify some state somewhere.
You can think of it visually this way:
_____ side effect (for instance logging)
/
___/______ main reactive flow
That's why your first doOnNext works: you are modifying state on the side, setting the value on your class.
The second statement, the save, on the other hand does not get executed. That function actually returns something we need to take care of.
This is what it looks like:
save
_____
/ \ < Broken return
___/ ____ no main reactive flow
All we have to do is change one single line:
// From
.doOnNext(this::save)
// To
.flatMap(this::save)
flatMap takes whatever is in the Mono, and then we can use that to execute something and then return a "new" something.
So our flow (with flatMap) now looks like this:
setValue() save()
______ _____
/ / \
__/____________/ \______ return to client
So with the use of flatMap we are now saving and returning whatever was returned from that function triggering the rest of the chain.
If you then choose to ignore whatever is returned from the flatMap, it is completely correct to do as you have done and call then(), which will
Return a Mono which only replays complete and error signals from this
The general rule is, in a fully reactive application, you should never block.
And you generally don't subscribe unless your application is the final consumer. If your application started the request, then you are the consumer of something else, so you subscribe. If a webpage starts off the request, then it is the final consumer and it is subscribing.
If you are subscribing in your application that is producing data its like you are running a bakery and eating your baked breads at the same time.
don't do that, its bad for business :D
Subscribing inside a subscribe is not good practice. You can use the flatMap operator to solve this problem.
public void updateAnalysisWithResults(String uuidString, String results) {
    findByUUID(uuidString).flatMap(analysis -> {
        analysis.setResults(results);
        return computeSCARepository.save(analysis);
    }).subscribe();
}

Difference between Flux.subscribe(Consumer<? super T> consumer>) and Flux.doOnNext(Consumer<? super T> onNext)

Just starting to understand reactive programming with Reactor and I've come across this code snippet from a tutorial here building-a-chat-application-with-angular-and-spring-reactive-websocket
class ChatSocketHandler(val mapper: ObjectMapper) : WebSocketHandler {
    val sink = Sinks.replay<Message>(100)
    val outputMessages: Flux<Message> = sink.asFlux()

    override fun handle(session: WebSocketSession): Mono<Void> {
        println("handling WebSocketSession...")
        session.receive()
            .map { it.payloadAsText }
            .map { Message(id = UUID.randomUUID().toString(), body = it, sentAt = Instant.now()) }
            .doOnNext { println(it) }
            .subscribe(
                { message: Message -> sink.next(message) },
                { error: Throwable -> sink.error(error) }
            )
        return session.send(
            Mono.delay(Duration.ofMillis(100))
                .thenMany(outputMessages.map { session.textMessage(toJson(it)) })
        )
    }

    fun toJson(message: Message): String = mapper.writeValueAsString(message)
}
I understand what it does, but not why the author uses a consumer within the subscribe method vs chaining another doOnNext(consumer), i.e. the lines:
    .doOnNext { println(it) }
    .subscribe(
        { message: Message -> sink.next(message) },
        { error: Throwable -> sink.error(error) }
    )
From the Reactor documentation I have read that Flux.subscribe(Consumer<? super T> consumer):
Subscribe a Consumer to this Flux that will consume all the elements in the sequence. It will request an unbounded demand (Long.MAX_VALUE).
For a passive version that observe and forward incoming data see doOnNext(java.util.function.Consumer).
However from that I don't understand why one would choose one over the other, to me they seem functionally identical.
The difference is much more conventional rather than functional - the difference being side-effects vs a final consumer.
The doOnXXX series of methods are meant for user-designed side-effects as the reactive chain executes - logging being the most normal of these, but you may also have metrics, analytics, etc. that require a view into each element as it passes through. The key with all of these is that it doesn't make much sense to have any of these as a final consumer (such as the println() in your above example.)
On the contrary, the subscribe() consumers are meant to be a "final consumer", and usually called by your framework (such as Webflux) rather than by user code - so this case is a bit of an exception to that rule. In this case they're actively passing the messages in this reactive chain to another sink for further processing - so it doesn't make much sense to have this as a "side-effect" style method, as you wouldn't want the Flux to continue beyond this point.
(Addendum: As said above, the normal approach with reactor / Webflux is to let Webflux handle the subscription, which isn't what's happening here. I haven't looked in detail to see if there's a more sensible way to achieve this without a user subscription, but in my experience there usually is, and calling subscribe manually is usually a bit of a code smell as a result. You should certainly avoid it in your own code wherever you can.)
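The conventional split described above can be illustrated with a tiny self-contained sketch (plain Flux, no WebSocket; the names are made up):

```java
import java.util.ArrayList;
import java.util.List;

import reactor.core.publisher.Flux;

public class PassiveVsTerminal {

    public static List<String> demo() {
        List<String> log = new ArrayList<>();
        Flux.just("a", "b")
                // passive side effect: observes each element, which keeps flowing downstream
                .doOnNext(v -> log.add("side-effect:" + v))
                // final consumer: subscribing is what actually starts the flow
                .subscribe(v -> log.add("consumed:" + v));
        return log;
    }
}
```

Since Flux.just emits synchronously on subscription, the log interleaves per element: the doOnNext side effect fires first, then the subscribe consumer receives the same element.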

How to handle exceptions thrown in subscriptions to processors in Project Reactor

Consider the following test:
@Test
void test() {
    DirectProcessor<Object> objectTopicProcessor = DirectProcessor.create();
    Runnable r = mock(Runnable.class);
    objectTopicProcessor.subscribe(next -> { throw new RuntimeException("eee"); });
    objectTopicProcessor.subscribe(next -> r.run());
    assertThrows(RuntimeException.class, () -> objectTopicProcessor.onNext("")); // exception is thrown
    verify(r).run(); // it's not run
}
Imagine that I build an API where I expose the processor to the client.
When someone has multiple subscriptions and one of them throws exception, the other calls are not executed. Furthermore, exception is propagated and thrown out from objectTopicProcessor.onNext(""). I'd like to prevent such behavior.
I know that the client can wrap his code in a try-catch inside the subscription, but is there any other way? Sometimes, for example, a NullPointerException may happen, or the client can forget about checking the exception. For the API it's also inconvenient to force clients to try-catch all exceptions.
What are the best strategies to handle such cases?
In this example, the code passed to the subscribe method is by default executed on the main thread. It first encounters the exception and fails immediately without executing the second subscribe block.
In order to achieve parallelism, use the .publishOn(scheduler) method:
@Test
void test() {
    DirectProcessor<Object> processor = DirectProcessor.create();
    Flux<Object> flux = processor.publishOn(Schedulers.parallel());
    Runnable r = mock(Runnable.class);
    flux.subscribe(next -> { throw new RuntimeException("eee"); });
    flux.subscribe(next -> r.run());
    processor.onNext(""); // onNext no longer throws an exception
    verify(r, timeout(1000)).run();
}

Spring Cassandra Repository - Saving a record in a background thread

I've been working on creating a record in my database on a background thread but I don't get any response in the console (No errors, exceptions or logs).
Below is the code
In my spring component I have:
ExecutorService tPool = Executors.newFixedThreadPool(15);

// This is a repository that extends CassandraRepository
@Autowired
MyRepository myRepository;

CompletableFuture<Boolean> myBool = CompletableFuture.supplyAsync(() -> {
    // processing here
    return new doSomeProcessing(arg1, arg2, arg3).process();
}, tPool);

myBool.whenComplete((isTrue, throwThis) -> {
    if (isTrue) {
        // do something
    }
});
In my class doSomeProcessing, I have the method process():
public boolean process() {
    // This appears in the console
    LOG.info("About to try and save the record on the background thread");
    // After setting the repository in the thread class constructor
    myRepository.save(arg1, arg2, arg3);
    // This doesn't appear in the console
    LOG.info("The record should be saved");
    return true;
}
But the database doesn't show any new records and the console doesn't show any errors or exceptions or the last log statement.
How would you go about saving a record on a background thread using Spring with Cassandra?
Any explanation is greatly appreciated.
I've seen and tried the below with the async service and transactional as well as a few others:
How do I use JpaRepository in a backend thread?
How do I properly do a background thread when using Spring Data and Hibernate?
When a CompletableFuture is completed exceptionally, there's no stacktrace because the exception is still unhandled. It's stored until the user does something that "activates" that exception. For example calling get() would directly throw that exception.
When doing more complex things with CompletableFuture the exception is stored along the results of the future (hence BiConsumer to have result and exception as parameters), but it's up to you to check if there is an exception and handle it.
Since you can chain the futures and therefore encounter multiple exceptions, you end up with documentation like the following:
If the supplied action itself encounters an exception, then the
returned stage exceptionally completes with this exception unless this
stage also completed exceptionally.
If you understand that on the first read, you're talented.
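The "stored until activated" behaviour can be shown with a minimal, self-contained sketch (plain java.util.concurrent; the class name and messages are made up):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class StoredExceptionDemo {

    public static String demo() {
        CompletableFuture<Boolean> future = CompletableFuture.supplyAsync(() -> {
            // This exception is stored inside the future; nothing is printed anywhere.
            throw new IllegalStateException("save failed");
        });
        try {
            // join() (like get()) "activates" the stored exception and rethrows it,
            // wrapped in a CompletionException.
            future.join();
            return "no error";
        } catch (CompletionException e) {
            return "caught: " + e.getCause().getMessage();
        }
    }
}
```

Until join()/get() is called, or a whenComplete/handle stage actually inspects its Throwable parameter, the failure stays silently parked; that is exactly why the background save in the question dies without a trace.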

Observable's doOnError correct location

I am kind of new to Observers, and I am still trying to figure them out. I have the following piece of code:
observableKafka.getRealTimeEvents()
    .filter(this::isTrackedAccount)
    .filter(e -> LedgerMapper.isDepositOrClosedTrade((Transaction) e.getPayload()))
    .map(ledgerMapper::mapLedgerTransaction)
    .map(offerCache::addTransaction)
    .filter(offer -> offer != null) // Offer may have been removed from cache since last check
    .filter(Offer::isReady)
    .doOnError(throwable -> {
        LOG.info("Exception thrown on realtime events");
    })
    .forEach(awardChecker::awardFailOrIgnore);
getRealTimeEvents() returns an Observable<Event>.
Does the location of .doOnError matter? Also, what is the effect of adding more than one call to it in this piece of code? I have realised I can do it and all of them get invoked, but I am not sure what its purpose could be.
Yes, it does. doOnError acts when an error is passing through the stream at that specific point, so if the operator(s) before doOnError throw(s), your action will be called. However, if you place the doOnError further up, it may or may not be called depending on what downstream operators are in the chain.
Given
Observer<Object> ignore = new Observer<Object>() {
    @Override public void onCompleted() {
    }
    @Override public void onError(Throwable e) {
    }
    @Override public void onNext(Object t) {
    }
};
For example, the following code will always call doOnError:
Observable.<Object>error(new Exception()).doOnError(e -> log(e)).subscribe(ignore);
However, this code won't:
Observable.just(1).doOnError(e -> log(e))
    .flatMap(v -> Observable.<Integer>error(new Exception())).subscribe(ignore);
Most operators will bounce back exceptions that originate downstream.
Adding multiple doOnError calls is viable if you transform an exception via onErrorResumeNext or onExceptionResumeNext:
Observable.<Object>error(new RuntimeException())
    .doOnError(e -> log(e))
    .onErrorResumeNext(Observable.<Object>error(new IllegalStateException()))
    .doOnError(e -> log(e)).subscribe(ignore);
otherwise, you'd log the same exception at multiple locations of the chain.
The doOn??? methods are there for side effects: processing that isn't really your core business value, let's say. Logging is a perfectly fine use for that.
That said, sometimes you want to do something more meaningful with an error, like retrying, or displaying a message to a user, etc... For these cases, the "rx" way would be to process the error in a subscribe call.
doOnError (and the other doOn methods) wraps the original Observable into a new one and adds behavior to it (around its onError method, obviously). That's why you can call it many times. Also one benefit of being able to call it anywhere in the chain is that you can access errors that would otherwise be hidden from the consumer of the stream (the Subscriber), for instance because there's a retry down in the chain...
