Writing Aspects For Reactive Pipelines - java

I am writing Aspects for methods that return promises. Consider the following method:
public Mono<Stream> publishToKafka(Stream s) {
    // publishToKafka is asynchronous
    return Mono.just(s).flatMap(worker::publishToKafka);
}
I want to cache whether the publish was successful. Since this is a cross-cutting concern, an Aspect looks like the best design. Here's my Aspect for it.
@Around("@annotation....")
public Object cache(ProceedingJoinPoint pjp) throws Throwable {
    // get the data to cache from the annotation
    Object result = pjp.proceed();
    cache.cache("key", "data");
    return result;
}
Now, since publishToKafka is asynchronous, the target method returns as soon as the thread switch happens, and cache.cache() is called right away. This is not what I want. What I want is that the result is cached iff the event was successfully published to Kafka. The following advice works.
@Around("@annotation....")
public <T extends Stream<T>> Mono<T> cache(ProceedingJoinPoint pjp) throws Throwable {
    // get the data to cache from the annotation
    return ((Mono<T>) pjp.proceed()).doOnNext(a -> cache.cache(key, data));
}
I want to understand what's going on here. Does this happen during the assembly time of the pipeline? Or during the execution time (pjp.proceed() returns a promise) to which my advice adds the doOnNext operator?
I need to understand assembly vs. execution time in the context of this example.

Both Spring AOP and AspectJ aspects are always executed synchronously in the same thread as the intercepted joinpoint. Thus, if your intercepted method returns immediately and the return value is something like a promise, a future or nothing (void) in combination with a callback, you cannot expect to magically get the asynchronous result in the aspect's advice. You do need to make the aspect aware of the asynchronous situation.
Having said that, I also want to mention that I have never used reactive programming before; I only know the concept. From what I see in your advice, the solution should work, but one thing is not so nice: you make the advice return the new Mono instance produced by your doOnNext(..) call. Maybe it would be cleaner to return the original Mono you get from proceed() after having registered your caching callback on it, just so as to avoid any side effects.
I don't know what else to explain; the situation is pretty clear. Feel free to ask directly related follow-up questions if my explanation does not suffice.
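To make assembly vs. execution time concrete, here is a minimal sketch of such an advice. The annotation name, the cache field and the key handling are placeholders, not your actual code:
@Around("@annotation(com.example.CacheOutcome)") // hypothetical annotation
public Object cacheAdvice(ProceedingJoinPoint pjp) throws Throwable {
    // Assembly time: proceed() runs synchronously on the calling thread and
    // returns the Mono immediately; nothing has been published to Kafka yet.
    Mono<?> pipeline = (Mono<?>) pjp.proceed();
    // Still assembly time: doOnNext only registers a callback and returns a
    // decorated Mono. The callback body runs at execution time, i.e. when a
    // subscriber arrives and the publish actually emits a value.
    return pipeline.doOnNext(value -> cache.cache("key", value));
}
The advice itself never waits for Kafka; it only decorates the pipeline, and the registered callback fires once per subscription in which the publish succeeds.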

Related

What is the deal with Mono.fromCallable? (reactive)

I still do not understand when to apply this method. It looks similar to Mono.just, but I have heard that fromCallable is meant for heavy operations that need to be performed separately from other flows. I currently use it like this, but is it correct?
Here is an example of use: I wrap sending a Firebase notification in a callable, since the operation is long.
@Override
public Mono<NotificationDto> sendMessageAllDevice(NotificationDto notification) {
    return Mono.fromCallable(() -> fcmProvider.sendPublicMessage(notification))
        .thenReturn(notification);
}
Or should I still have wrapped it in Mono.just here?
It depends which thread you want fcmProvider.sendPublicMessage(...) to be run on.
Either the one currently executing sendMessageAllDevice(...):
T result = fcmProvider.sendPublicMessage(notification);
return Mono.just(result);
Or the one(s) the underlying mono relies on:
Callable<T> callable = () -> fcmProvider.sendPublicMessage(notification);
return Mono.fromCallable(callable);
I would guess you need the latter approach.
If you use Mono.just(computeX()), computeX() is called immediately, on the current thread. Not what you want (I guess).
If you use Mono.fromCallable(() -> computeX()), the computation is deferred: computeX() is only called once the Mono is subscribed to.
Important: if computeX() returns a Mono, you do not need Mono.fromCallable; it is only for wrapping blocking code.
As you explained in the description, Mono.fromCallable is used when you want to defer a computation (mostly some heavy operation) until subscription time.
Since you have already wrapped the call in Mono.fromCallable, you do not have to wrap it again with Mono.just.
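A minimal sketch of the difference, where computeX() stands in for any blocking call:
// Eager: computeX() runs right now, on the current thread,
// even if nobody ever subscribes to the Mono.
Mono<Integer> eager = Mono.just(computeX());
// Lazy: computeX() has not run yet; it executes at subscription
// time, once per subscriber.
Mono<Integer> lazy = Mono.fromCallable(() -> computeX());
lazy.subscribe(x -> System.out.println("computed: " + x)); // computeX() runs here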

Update an object's state in Reactor

Given the following method:
private Mono<UserProfileUpdate> upsertUserIdentifier(UserProfileUpdate profileUpdate, String id) {
    return userIdentifierRepository.findUserIdentifier(id)
        .switchIfEmpty(Mono.defer(() -> {
            profileUpdate.setNewUser(true);
            return createProfileIdentifier(profileUpdate.getType(), id);
        }))
        .map(userIdentifier -> {
            profileUpdate.setProfileId(userIdentifier.getProfileId());
            return profileUpdate;
        });
}
The switchIfEmpty and map operators mutate the profileUpdate object. Is it safe to mutate inside the switchIfEmpty operator? Regarding map, if I have understood correctly, this is not safe, and the profileUpdate object must be immutable, right? E.g.:
private Mono<UserProfileUpdate> upsertUserIdentifier(UserProfileUpdate profileUpdate, String id) {
    return userIdentifierRepository.findUserIdentifier(id)
        .switchIfEmpty(Mono.defer(() -> {
            profileUpdate.setNewUser(true);
            return createProfileIdentifier(profileUpdate.getType(), id);
        }))
        .map(userIdentifier -> profileUpdate.withProfileId(userIdentifier.getProfileId()));
}
Later in the chain, another method mutates the object:
public Mono<UserProfileUpdate> transform(UserProfileUpdate profUpdate) {
    if (profUpdate.isNewUser()) {
        profUpdate.getAttributesToSet().putAll(profUpdate.getAttributesToSim());
    } else if (!profUpdate.getAttributesToSim().isEmpty()) {
        return userProfileRepository.findUserProfileById(profUpdate.getProfileId())
            .map(profile -> {
                profUpdate.getAttributesToSet().putAll(
                    collectMissingAttributes(profUpdate.getAttributesToSim(), profile.getAttributes().keySet()));
                return profUpdate;
            });
    }
    return Mono.just(profUpdate);
}
The above methods are called as follows:
Mono.just(update)
    .flatMap(u -> upsertUserIdentifier(u, id))
    .flatMap(this::transform);
Nebulous answer, but... it depends!
The danger of mutating an input parameter in the returned Mono or Flux comes from the fact that said Mono or Flux could be subscribed multiple times. In such a case, suddenly you have a shared resource on your hands, and it can lead to puzzling issues.
But if the methods in question are called from a well controlled context, it can be safe.
In your case, flatMap ensures that the inner publishers are subscribed to only once. So as long as you only use these methods inside such flatMaps, they can safely mutate their input parameter (it is kept in the scope of the flatmapping function).
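To illustrate the hazard, a sketch using the mutating upsertUserIdentifier from the question:
Mono<UserProfileUpdate> mono = upsertUserIdentifier(update, id);
// Two independent subscriptions share the same 'update' instance: both
// may call setNewUser(true) and setProfileId(...), interleaving unpredictably.
mono.subscribe();
mono.subscribe();
// Inside flatMap this cannot happen: the inner publisher created by the
// lambda is subscribed to exactly once per element.
Mono.just(update).flatMap(u -> upsertUserIdentifier(u, id)).subscribe();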
There are arguably two factors here: the "dangerous" aspect and the "style" aspect.
Simon has covered the "dangerous" aspect very well; there is one thing I'd add, however. Even though your code is "safe" within the method we can see here (due to the guarantees behind a single subscription to an inner flatMap publisher), we still can't absolutely guarantee it's safe in a wider context, since we don't know what else has visibility of, or might mutate, your profileUpdate parameter. If it's created, immediately passed into this method only, then read after this method completes, then sure, it's fine. If it was created at some point in the past, perhaps passed around to a few methods that may or may not mutate it, passed back, passed into this method, then into a few other methods... then it might be safe, but it becomes increasingly difficult to analyse and say so for certain. And if it's not safe, it becomes just as difficult to track down where that one-in-a-million bug caused by the behaviour might occur.
Now, your code may look nothing like this complex mess I've just described - but with just a few lines changed here and there, or "just doing one more mutation" with this object elsewhere before it's passed in, it could start to get there.
That leads into the "style" aspect.
Personally, I'm very much a fan of keeping everything as part of the reactive chain where at all possible. Even ignoring the potential for bad mutations and side effects, code written this way is much harder to read: you have to mentally keep track of both the value being passed through the chain and the values external to the chain being mutated. With this example it's reasonably trivial; in larger examples it becomes almost unreadable (at least to my brain!)
So with that in mind, if I were reviewing this code I'd strongly prefer UserProfileUpdate to be immutable, then to use something like this instead:
private Mono<UserProfileUpdate> upsertUserIdentifier(UserProfileUpdate profileUpdate, String id) {
    return userIdentifierRepository.findUserIdentifier(id)
        .switchIfEmpty(Mono.defer(() -> createProfileIdentifier(profileUpdate.withNewUser(true), id)))
        .map(userIdentifier -> profileUpdate.withProfileId(userIdentifier.getProfileId()));
}
...note this isn't a drop-in replacement, in particular:
createProfileIdentifier() would just take a UserProfileUpdate object and an id, and be expected to return a new UserProfileUpdate object from those parameters;
UserProfileUpdate would need to be enhanced with @With (Lombok) or a similar implementation to allow it to produce immutable copies of itself with only a single value changed.
Other code would likely need to be modified similarly to encapsulate the profileUpdate as part of the reactive chain, and not rely on mutating it.
However, in my opinion at least, the resulting code would be far more robust and readable.
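For reference, a minimal hand-rolled sketch of the withers this assumes; Lombok's @With would generate equivalent methods, and the fields shown are illustrative:
public final class UserProfileUpdate {
    private final boolean newUser;
    private final String profileId;
    public UserProfileUpdate(boolean newUser, String profileId) {
        this.newUser = newUser;
        this.profileId = profileId;
    }
    // Each "wither" returns a copy with one field changed, leaving
    // the original instance untouched.
    public UserProfileUpdate withNewUser(boolean newUser) {
        return new UserProfileUpdate(newUser, this.profileId);
    }
    public UserProfileUpdate withProfileId(String profileId) {
        return new UserProfileUpdate(this.newUser, profileId);
    }
}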

How are Spring Boot @Async methods actually async/non-blocking?

The following example is taken from Spring's Getting Started guide, Creating Asynchronous Methods.
@Service
public class GitHubLookupService {

    @Async
    public CompletableFuture<User> findUser(String user) throws InterruptedException {
        logger.info("Looking up " + user);
        String url = String.format("https://api.github.com/users/%s", user);
        User results = restTemplate.getForObject(url, User.class);
        // Artificial delay of 1s for demonstration purposes
        Thread.sleep(1000L);
        return CompletableFuture.completedFuture(results);
    }
}
As far as my knowledge of async methods in computer science goes: it should return immediately, and it should be non-blocking.
So let's say somewhere my code calls findUser() like so:
CompletableFuture<User> user = service.findUser("foo");
This would actually block. It would block a different thread on the executor service, but it would still block, due to the Thread.sleep(1000L). Correct?
So how is this async?
I mean, the whole point of CompletableFuture is to get a reference to a computation that will be completed in the future. But here, when I get back the completed future, the computation is already over, i.e. we are using CompletableFuture.completedFuture(results).
So what is the point of having a CompletableFuture in this case? If I am going to block and return only when my computation is over and I have the results, I might as well just return the result and not a CompletableFuture.
How is this truly non-blocking/async?
The only non-blocking aspect I find here is the offload to a different thread, nothing else.
Am I going wrong somewhere? What am I missing?
Thanks.
The problem lies in the way you create your Future. The code you use is
CompletableFuture.completedFuture(results)
Quoting from the Javadoc, this is only a wrapper between the sync and async worlds; the computation itself was done synchronously:
Returns a new CompletableFuture that is already completed with the given value.
This is useful in certain situations where you only want to do asynchronous work for some inputs. Consider
(x) -> x==0 ? CompletableFuture.completedFuture(0) : CompletableFuture.supplyAsync(expensiveComputation)
I hope this makes the difference clear - if you want truly async computations, you need to use the supplyAsync function:
Returns a new CompletableFuture that is asynchronously completed by a task running in the ForkJoinPool.commonPool() with the value obtained by calling the given Supplier.
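For example (lookupUser and doOtherWork are hypothetical stand-ins):
CompletableFuture<User> future =
        CompletableFuture.supplyAsync(() -> lookupUser("foo")); // runs on the common pool
doOtherWork();             // the caller is not blocked
User user = future.join(); // block only at the point the result is needed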
The detail you're missing is that when @Async is used (and correctly configured), a proxy bean will be used, wrapping your service bean. Any calls to the asynchronous methods through the proxy will use a Spring TaskExecutor to asynchronously run the method.
Wrapping the method response in a synchronous Future such as CompletableFuture.completedFuture is necessary so the return type can be a Future. However, the Future you return is not the one returned by the proxy. Rather, the proxy returns a Future provided by the TaskExecutor, which will be processed asynchronously. The Future you create through e.g. CompletableFuture.completedFuture is unwrapped by the proxy, and its completion result is returned by the proxy's Future.
Proxying documentation
I don't see all of the aforementioned proxying details explicitly stated in either the Spring reference documentation or in the @Async or @EnableAsync Javadocs. However, the details can be pieced together by reading between the lines of what is provided.
The @Async Javadoc mentions the service proxy in passing, and explains why CompletableFuture.completedFuture is used in the service method's implementation:
A Future handle returned from the proxy will be an actual asynchronous Future that can be used to track the result of the asynchronous method execution. However, since the target method needs to implement the same signature, it will have to return a temporary Future handle that just passes a value through: e.g. Spring's AsyncResult, EJB 3.1's AsyncResult, or CompletableFuture.completedFuture(Object).
The fact that proxying is involved is also made apparent by the fact that two of the @EnableAsync annotation elements specify proxying details: mode and proxyTargetClass.
Question example
Finally, applying this to the question example makes it concrete. Code that calls the findUser method on the GitHubLookupService bean will actually be calling a method on a proxy class, rather than directly on the GitHubLookupService instance. The proxy class's findUser method submits a task to Spring's TaskExecutor and returns a CompletableFuture that will be completed asynchronously when the submitted task completes.
The submitted task calls the actual findUser method on the non-proxied GitHubLookupService. This will perform the REST call, sleep 1 second, and return a completed CompletableFuture with the REST results.
Since this task is happening in a separate thread created by Spring's TaskExecutor, the calling code will continue to proceed past the GitHubLookupService.findUser call immediately, even though it will take at least 1 second for it to return.
If the result of the findUser call is used in the calling code (using e.g. CompletableFuture.get()), the value it will get from that Future will be the same results value passed to CompletableFuture.completedFuture in the GitHubLookupService code.
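Conceptually, the proxy behaves roughly like the sketch below. This is not Spring's actual implementation; target and taskExecutor are stand-ins for the wrapped bean and the configured executor:
public CompletableFuture<User> findUser(String user) {
    return CompletableFuture.supplyAsync(() -> {
        try {
            // Invoke the real method on the target bean; it returns an
            // already-completed future, which the proxy unwraps here.
            return target.findUser(user).get();
        } catch (Exception e) {
            throw new CompletionException(e);
        }
    }, taskExecutor);
}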

Should I use "then" or "flatMap" for control flow?

So, I'm trying to work with WebFlux, and I've got the scenario "check if an object exists; if so, do stuff; otherwise, indicate an error".
That can be written in reactor as:
public Mono<Void> handleObjectWithSomeId(Mono<IdType> id) {
    return id
        .flatMap(repository::exists) // repository.exists returns Mono<Boolean>
        .flatMap(e -> e ? Mono.just(e) : Mono.error(new DoesntExistException()))
        .then(
            // someBusinessLogic returns plain void, hence fromRunnable
            Mono.fromRunnable(this::someBusinessLogic)
        );
}
or as:
public Mono<Void> handleObjectWithSomeId(Mono<IdType> id) {
    return id
        .flatMap(repository::exists) // repository.exists returns Mono<Boolean>
        .flatMap(e -> e ? Mono.just(e) : Mono.error(new DoesntExistException()))
        .map(e -> { // argument deliberately ignored
            this.someBusinessLogic();
            return e;
        })
        .then();
}
Let's assume that the return type of someBusinessLogic cannot be changed; it has to be plain void, not Mono<Void>.
In both cases, if the object doesn't exist, an appropriate Mono.error(...) is produced.
While I understand that then and flatMap have different semantics, effectively I get the same result. Even though in the second case I'm using map against its meaning, I get to skip wrapping the business logic in an ad-hoc Mono (then plus fromRunnable) in favor of a simple map with an ignored argument (which seems more readable). My point is, both approaches have advantages and disadvantages when it comes to readability and code quality.
So, here's a summary:
Using then
pros
is semantically correct
cons
in many cases (like above) requires wrapping in ad-hoc Mono/Flux
Using flatMap
pros
simplifies continued "happy scenario" code
cons
is semantically incorrect
What are other pros/cons of both approaches? What should I take under consideration when choosing an operator?
I've found this Reactor issue, which states that there is no real difference in speed.
TL;DR: If you care about the result of the previous computation, use map(), flatMap(), or another map variant. Otherwise, if you just want the previous stream to finish, use then().
You can see a detailed log of execution for yourself, by placing an .log() call in both methods:
public Mono<Void> handleObjectWithSomeId(Mono<IdType> id) {
    return id.log()
        .flatMap(...)
        ...;
}
Like all other operations in Project Reactor, the semantics for then() and flatMap() are already defined. The context mostly defines how these operators should work together to solve your problem.
Let's consider the context you provided in the question. What flatMap() does is, whenever it gets an event, it executes the mapping function asynchronously.
Since we have a Mono<> after the last flatMap() in the question, it will provide the result of the previous single computation, which we ignore. Note that if we had a Flux<> instead, the computation would be done for every element.
On the other hand, then() doesn't care about the preceding sequence of events; it only cares about the completion signal.
That's why, in your example, it doesn't matter very much which one you use. However, in other contexts you might choose accordingly.
You might also find the "Which operator do I need?" section of the Project Reactor reference helpful.
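A small sketch of the difference in what flows downstream, assuming a repository.exists variant that takes the id directly:
Mono<Boolean> exists = repository.exists(id);
// map: the Boolean is handed to the lambda, even if the lambda ignores it.
Mono<String> viaMap = exists.map(e -> "done");
// then: the Boolean is discarded; only the completion signal is kept, and
// the next Mono starts after the first one completes.
Mono<String> viaThen = exists.then(Mono.fromCallable(() -> "done"));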

Why does my RxJava Observable not emit or complete unless it's blocking?

Background
I have a number of RxJava Observables (either generated from Jersey clients, or stubs using Observable.just(someObject)). All of them should emit exactly one value. I have a component test that mocks out all the Jersey clients and uses Observable.just(someObject), and I see the same behaviour there as when running the production code.
I have several classes that act upon these observables, perform some calculations (& some side-effects - I might make them direct return values later) and return empty void observables.
At one point, in one such class, I'm trying to zip several of my source observables up and then map them - something like the below:
public Observable<Void> doCalculation() {
    return Observable.zip(
        getObservable1(),
        getObservable2(),
        getObservable3(),
        UnifyingObject::new
    ).concatMap(unifyingObject -> unifyingObject.processToNewObservable());
}
// in UnifyingObject
public Observable<Void> processToNewObservable() {
    // ... do some calculation ...
    return Observable.empty();
}
The calculating classes are then all combined and waited on:
// Wait for rule computations to complete
List<Observable<Void>> calculations = ...;
Observable.zip(calculations, results -> results)
.toBlocking().lastOrDefault(null);
The problem
The trouble is, processToNewObservable() is never executed. By process of elimination, I can see that getObservable1() is the trouble: if I replace it with Observable.just(null), everything executes as I'd imagine (but with a null value where I want a real one).
To reiterate, getObservable1() returns an Observable from a Jersey client in production code, but that client is a Mockito mock returning Observable.just(someValue) in my test.
Investigation
If I convert getObservable1() to blocking, then wrap the first value in just(), again, everything executes as I'd imagine (but I don't want to introduce the blocking step):
Observable.zip(
    Observable.just(getObservable1().toBlocking().first()),
    getObservable2(),
    getObservable3(),
    UnifyingObject::new
).concatMap(unifyingObject -> unifyingObject.processToNewObservable());
My first thought was that perhaps something else was consuming the value emitted from my observable, and zip was seeing that it was already complete, thus determining that the result of zipping them should be an empty observable. I've tried adding .cache() onto every observable source I can think of that might be related, but that hasn't altered the behaviour.
I've also tried adding next / error / complete / finally handlers on getObservable1 (without converting it to blocking) just before the zip, but none of them executed either:
getObservable1()
    .doOnNext(...)
    .doOnCompleted(...)
    .doOnError(...)
    .finallyDo(...);
Observable.zip(
    getObservable1(),
    getObservable2(),
    getObservable3(),
    UnifyingObject::new
).concatMap(unifyingObject -> unifyingObject.processToNewObservable());
The question
I'm very new to RxJava, so I'm pretty sure I'm missing something fundamental. The question is: what stupid thing could I be doing? If that's not obvious from what I've said so far, what can I do to help diagnose the issue?
The Observable must emit to start the chain. You have to think of your pipeline as a declaration of what will happen when the Observable emits.
You didn't share what was actually being observed, but Observable.just() causes the Observable to emit the wrapped object immediately.
Based on the response in the comment, either one of the getObservable methods doesn't emit any value but just completes, or the Mockito mocking does something wrong. The following standalone example works for me. Could you check it and start slowly mutating it to see where things break?
Observable.zip(
    Observable.just(1),
    Observable.just(2),
    Observable.just(3),
    (a, b, c) -> new Integer[] { a, b, c })
    .concatMap(a -> Observable.from(a))
    .subscribe(System.out::println);
Note: I didn't find my answer here very satisfying, so I dug in a bit further and found a much smaller reproduction case, so I've asked a new question here: Why does my RxJava Observable emit only to the first consumer?
I've figured out at least part of my troubles (and, apologies to all who tried to answer, I don't think you stood much of a chance given my explanation).
The various classes which perform these calculations were all returning Observable.empty() (as per processToNewObservable() in my original example). As far as I can tell, Observable.zip() doesn't subscribe to the Nth observable it's zipping until the (N-1)th observable has emitted a value.
My original example claimed it was getObservable1() that was misbehaving; that was actually slightly inaccurate, it was a later observable in the parameter list. As I understand it, making it blocking and then turning the value back into an Observable worked because calling first() forced its execution, and I got the side effects I wanted.
If I change all my calculating classes to return Observable.just(null) instead, everything works: the final zip() of all the calculation classes' observables works through them all, so all the expected side effects happen.
Returning a null Void seems like I'm definitely Doing Something Wrong from a design point of view, but at least this particular question is answered.
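The effect is easy to reproduce in isolation. A sketch in RxJava 1.x: zipping with an empty source completes without the zip function ever being invoked, mirroring what the Observable.empty() results did to the final zip:
Observable.zip(
        Observable.just(1),
        Observable.<Integer>empty(), // like a calculation returning empty()
        (a, b) -> a + b)             // never invoked
    .subscribe(
        v -> System.out.println("next: " + v),  // never prints
        Throwable::printStackTrace,
        () -> System.out.println("completed")); // prints immediately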
