RxJava Subject Cache Result With Retry

I have Observable<FeaturedItemList> getFeatured(), which is called every time the page is opened. This function is called from two different components on the same page. Since it retrieves data from the network, I cache the result and make it shareable with a ReplaySubject.
public Observable<FeaturedItemList> getFeatured() {
if(mFeaturedReplaySubject == null) {
mFeaturedReplaySubject = ReplaySubject.create();
getFromNetwork().subscribe(mFeaturedReplaySubject);
}
return mFeaturedReplaySubject;
}
Then I realized that when the request fails for some reason and the user comes back to that page, it will not show any results unless the user kills the app. So I decided to add some retry logic. Here's what I do:
public Observable<FeaturedItemList> getFeatured() {
synchronized (this) {
if (mFeaturedReplaySubject == null) {
mFeaturedReplaySubject = ReplaySubject.create();
getFromNetwork().subscribe(mFeaturedReplaySubject);
return mFeaturedReplaySubject;
} else {
return mFeaturedReplaySubject.onErrorResumeNext(throwable -> {
mFeaturedReplaySubject = null;
return getFeatured();
});
}
}
}
While this works, I'm afraid I'm doing something wrong here, or that there's a case this approach doesn't cover.
Is there any better approach?
Also for sharing the observable using subject, I read somewhere that I can use connect(), publish(), and share() but I'm not sure how to use it.

The code
private Observable<FeaturedItemList> mFeatured =
// initialized on construction
getFromNetwork()
.retry(3) // number of times to retry
.cache();
public Observable<FeaturedItemList> getFeatured() {
return mFeatured;
}
Explanation
Network source
Your getFromNetwork() function should return a regular observable that accesses the network every time it is subscribed to. It should not access the network when the function itself is invoked. For example:
Future<FeaturedItemList> makeAsyncNetworkRequest() {
... initiate network request here ...
}
Observable<FeaturedItemList> getFromNetwork() {
return Observable.fromCallable(this::makeAsyncNetworkRequest)
.flatMap(Observable::fromFuture);
}
Retry
There is a family of .retryXxx() operators, which are activated only on errors. They either re-subscribe to the source or pass the error down the line, subject to various conditions. When there is no error these operators do nothing. I used the simple retry(count) in my example; it re-subscribes up to the specified number of times, without any delay. You can add a delay or more complicated logic with retryWhen().
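To make the counting concrete, here is a minimal plain-Java sketch of what retry(count) does conceptually. It is not RxJava code: RetrySketch and callWithRetry are hypothetical names, and a Supplier stands in for the re-subscribable source. The point it illustrates is that the first attempt is not counted as a retry, so retry(3) allows up to four attempts in total.

```java
import java.util.function.Supplier;

final class RetrySketch {
    /**
     * Runs the task once and, on failure, re-runs it up to maxRetries more times.
     * A success returns immediately; the last failure is rethrown.
     */
    static <T> T callWithRetry(Supplier<T> task, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

A task that fails twice and then succeeds is called three times with callWithRetry(task, 3); a task that always fails is called four times before the error reaches the caller.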
Cache
cache() operator records the sequence of events and replays it to all new subscribers. The bad thing is that it is not refreshable. It stores the outcome of upstream forever, whether it is success or error, and never retries again.
Alternative to cache()
replay().refCount() replays events to all existing subscribers, but forgets everything as soon as all of them unsubscribe (or complete). It will re-subscribe to getFromNetwork() when new subscriber(s) arrive (with retry on error of course).
Operators mentioned but not needed
share() is a shorthand for publish().refCount(). It allows multiple concurrent subscribers to share single subscription, i.e. makes single call to subscribe() instead of doing it for every subscriber. Both cache() and replay().refCount() incorporate same functionality.


Do while loop behaving unexpectedly, for some inexplicable reason

I've been all over the internet and the Java docs regarding this one; I can't seem to figure out what it is about do while loops I'm not understanding. Here's the background: I have some message handler code that takes some JSON formatted data from a REST endpoint, parses it into a runnable task, then adds this task to a linked blocking queue for processing by the worker thread. Meanwhile, on the worker thread, I have this do while loop to process the message tasks:
do {
PublicTask currentTask = pubMsgQ.poll();
currentTask.run();
} while(pubMsgQ.size() > 0);
pubMsgQ is a LinkedBlockingQueue<PublicTask> (PublicTask implements the Runnable interface). I can't see any problems with this loop (obviously, or else I wouldn't be here), but this is how it behaves during execution: Upon entering the do block, pubMsgQ is polled and returns the runnable task as expected. The task is then run successfully with expected results, but then we get to the while statement. Now, according to the Java docs, poll() should return and remove the head of the queue, so I should expect that pubMsgQ.size() will return 0, right? Wrong I guess, because somehow the while statement passes and the program enters the do block again; of course this time pubMsgQ.poll() returns null (as I would have expected it should) and the program crashes with NullPointerException. What? Please explain like I'm five...
EDIT:
I decided to leave my original post as is above; because I think I actually explain the undesired behavior of that specific piece of the code quite succinctly (the loop is being executed twice while I'm fairly certain there is no way the loop should be executing twice). However, I realize that probably doesn't give enough context for that loop's existence and purpose in the first place, so here is the complete breakdown for what I am actually trying to accomplish with this code as I am sure there is a better way to implement this altogether anyways.
What this loop is actually a part of is a message handler class which implements the MessageHandler interface belonging to my Client Endpoint class [correction from my previous post; I had said the messages coming in were JSON formatted strings from a REST endpoint. This is technically not true: they are JSON formatted strings being received through a web socket connection. Note that while I am using the Spring framework, this is not a STOMP client; I am only using the built-in javax WebSocketContainer as this is more lightweight and easier for me to implement]. When a new message comes in onMessage() is called, which passes the JSON string to the MessageHandler; so here is the code for the entire MessageHandler class:
public class MessageHandler implements com.innotech.gofish.AutoBrokerClient.MessageHandler {
private LinkedBlockingQueue<PublicTask> pubMsgQ = new LinkedBlockingQueue<PublicTask>();
private LinkedBlockingQueue<AuthenticatedTask> authMsgQ = new LinkedBlockingQueue<AuthenticatedTask>();
private MessageLooper workerThread;
private CyclicBarrier latch = new CyclicBarrier(2);
private boolean running = false;
private final boolean authenticated;
public MessageHandler(boolean authenticated) {
this.authenticated = authenticated;
}
@Override
public void handleMessage(String msg) {
try {
//Create new Task and submit it to the message queue:
if(authenticated) {
AuthenticatedTask msgTsk = new AuthenticatedTask(msg);
authMsgQ.put(msgTsk);
} else {
PublicTask msgTsk = new PublicTask(msg);
pubMsgQ.put(msgTsk);
}
//Check status of worker thread:
if(!running) {
workerThread = new MessageLooper();
running = true;
workerThread.start();
} else if(running && !workerThread.active) {
latch.await();
latch.reset();
}
} catch(InterruptedException | BrokenBarrierException e) {
e.printStackTrace();
}
}
private class MessageLooper extends Thread {
boolean active = false;
public MessageLooper() {
}
@Override
public synchronized void run() {
while(running) {
active = true;
if(authenticated) {
do {
AuthenticatedTask currentTask = authMsgQ.poll();
currentTask.run();
if(GoFishApplication.halt) {
GoFishApplication.reset();
}
} while(authMsgQ.size() > 0);
} else {
do {
PublicTask currentTask = pubMsgQ.poll();
currentTask.run();
} while(pubMsgQ.size() > 0);
}
try {
active = false;
latch.await();
} catch (InterruptedException | BrokenBarrierException e) {
e.printStackTrace();
}
}
}
}
}
You can probably see where I'm going with this... what this jury-rigged code is trying to do is act as a facsimile of the Looper class provided by the Android SDK. The desired behavior is that, as messages are received, the handleMessage() method adds them to the queue, and they are processed separately on the worker thread for as long as there are messages to process. If there are no more messages to process, the worker thread waits until the handler notifies it that more messages have been received, at which point it resumes processing until the queue is once again empty. Rinse and repeat until the user stops the program.
Of course, the closest thing the JDK provides to this is the ThreadPoolExecutor (which I know is probably the proper way to implement this), but for the life of me I couldn't figure out how to use it for this exact case. Finally, as a quick aside so I can be sure to explain everything fully, the reason there are two queues (and a public and an authenticated handler) is that there are two web socket connections. One is an authenticated channel for sending/receiving private messages; the other is unauthenticated and used only to send/receive public messages. There should be no interference, however, given that the authenticated status is final and set at construction, and each Client Endpoint is passed its own Handler, which is instantiated at the time of server connection.
You appear to have a number of concurrency / threading bugs in your code.
Assumptions:
It looks like there could be multiple MessageHandler objects, each with its own pair of queues and (supposedly) at most one MessageLooper thread. It also looks as if a given MessageHandler could be used by multiple request worker threads.
If that is the case, then one problem is that MessageHandler is not thread-safe. Specifically, the handleMessage is accessing and updating fields of the MessageHandler instance without doing any synchronization.
Some of the fields are initialized during object creation and then never changed. They are probably OK. (But you should declare them as final to be sure!) But some of the variables are supposed to change during operation, and they must be handled correctly.
One section that rings particular alarm bells is this:
if (!running) {
workerThread = new MessageLooper();
running = true;
workerThread.start();
} else if (running && !workerThread.active) {
latch.await();
latch.reset();
}
Since this is not synchronized, and the variables are not volatile:
There are race conditions if two threads call this code simultaneously; e.g. between testing running and assigning true to it.
If one thread sets running to true, there are no guarantees that a second thread will see the new value.
The net result is that you could potentially get two or more MessageLooper threads for a given set of queues. That breaks your assumptions in the MessageLooper code.
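As an illustration of how the check-then-act race on running can be closed, here is a minimal sketch that replaces the plain boolean with an AtomicBoolean. The class and method names (SingleWorkerStart, startIfNeeded, raceDemo) are made up for this example and are not part of the code in the Question; it only covers starting the worker, not the latch logic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class SingleWorkerStart {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final AtomicInteger workersStarted = new AtomicInteger();

    /** Starts the worker only if no other caller has started it yet. */
    boolean startIfNeeded(Runnable worker) {
        // compareAndSet does the test and the assignment as one atomic step,
        // and its result is visible to every thread.
        if (running.compareAndSet(false, true)) {
            workersStarted.incrementAndGet();
            new Thread(worker, "message-looper").start();
            return true;
        }
        return false;
    }

    int workersStarted() {
        return workersStarted.get();
    }

    /** Calls startIfNeeded from several threads at once; returns how many workers were started. */
    static int raceDemo(int callers) {
        SingleWorkerStart handler = new SingleWorkerStart();
        CountDownLatch go = new CountDownLatch(1);
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    go.await(); // line all callers up, then release them together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                handler.startIfNeeded(() -> { });
            });
            threads[i].start();
        }
        go.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return handler.workersStarted();
    }
}
```

However many threads call startIfNeeded at the same moment, exactly one of them wins the compareAndSet, so raceDemo returns 1.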
Looking at the MessageLooper code, I see that you have declared the run method as synchronized. Unfortunately, that doesn't help. The problem is that the run method will be synchronizing on this ... which is the specific instance of MessageLooper. And it will acquire the lock once and release it once. In short, the synchronized is wrong.
(For Java synchronized methods and synchronized blocks to work properly, 1) the threads involved need to synchronize on the same object (i.e. the same primitive lock), and 2) all read and write operations on the state guarded by the lock need to be done while holding the lock. This applies to use of Lock objects as well.)
So ...
There is no synchronization between a MessageLooper thread and any other threads that are adding to or removing from the queues.
There are no guarantees that the MessageLooper thread will notice changes to the running flag.
As I previously noted, you could have two or more MessageLooper polling the same pair of queues.
In short, there are lots of possible explanations for strange behavior in the code in the Question. This includes the specific problem you noticed with the queue size.
Writing correct multi-threaded code is difficult. This is why you should be using an ExecutorService rather than attempting to roll your own code.
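For what it's worth, here is a minimal sketch of what the ExecutorService version could look like. ExecutorMessageHandler, shutdownAndWait and demo are hypothetical names, and plain Runnables stand in for PublicTask / AuthenticatedTask. A single-thread executor already provides the behavior described in the Question: it keeps an internal queue, runs tasks one at a time in submission order, and parks the worker thread while the queue is empty.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ExecutorMessageHandler {
    // One worker thread; the executor owns the queue and the waiting logic.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    /** Queues a task; safe to call from any thread. */
    void handleMessage(Runnable task) {
        worker.execute(task);
    }

    /** Stops accepting tasks and waits for the queued ones to finish. */
    boolean shutdownAndWait(long timeout, TimeUnit unit) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Submits three tasks and returns the order in which they ran. */
    static List<String> demo() {
        List<String> processed = Collections.synchronizedList(new ArrayList<>());
        ExecutorMessageHandler handler = new ExecutorMessageHandler();
        for (String msg : new String[] {"a", "b", "c"}) {
            handler.handleMessage(() -> processed.add(msg));
        }
        if (!handler.shutdownAndWait(5, TimeUnit.SECONDS)) {
            throw new IllegalStateException("worker did not finish in time");
        }
        return processed;
    }
}
```

With this shape there is no running flag, no latch and no poll() loop to get wrong; one such handler per web socket connection would keep the public and authenticated queues separate.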
But if you do need to roll your own concurrency code, I recommend buying and reading "Java Concurrency in Practice" by Brian Goetz et al. It is still the only good textbook on this topic ...

How in WebFlux to stop publisher when request is aborted by client?

SpringBoot v2.5.1
There is an endpoint that returns the result of a long-running process, which is created somehow
(for simplicity, it is Mono.fromCallable( ... long running ... )).
The client makes a request and triggers the publisher to do the work, but after several seconds the client aborts the request (i.e. the connection is lost). The process still continues to use resources to compute a result that will be thrown away.
What is the mechanism for notifying Project Reactor's event loop that the work in progress is no longer needed and should be cancelled?
@RestController
class EndpointSpin {
@GetMapping("/spin")
Mono<Long> spin() {
AtomicLong counter = new AtomicLong(0);
Instant stopTime = Instant.now().plus(Duration.of(1, ChronoUnit.HOURS));
return Mono.fromCallable(() -> {
while (Instant.now().isBefore(stopTime)) {
counter.incrementAndGet();
if (counter.get() % 10_000_000 == 0) {
System.out.println(counter.get());
}
// of course this does not work
if (Thread.currentThread().isInterrupted()){
break;
}
}
return counter.get();
});
}
}
fromCallable doesn't shield you from blocking computation inside the Callable, which your example demonstrates.
The primary means of cancellation in Reactive Streams is the cancel() signal propagated from downstream via the Subscription.
Even with that, the fundamental requirement of avoiding blocking code inside reactive code still holds, because if the operators are simple enough (i.e. synchronous), a blocking step could even prevent the propagation of the cancel() signal...
A way to adapt non-reactive code while still getting notified about cancellation is Mono.create: it exposes a MonoSink (via a Consumer<MonoSink>) which can be used to push elements downstream, and it also has an onCancel handler.
You would need to rewrite your code to, e.g., check an AtomicBoolean on each iteration of the loop, and have that AtomicBoolean flipped in the sink's onCancel handler:
Mono.create(sink -> {
AtomicBoolean isCancelled = new AtomicBoolean();
sink.onCancel(() -> isCancelled.set(true));
while (...) {
...
if (isCancelled.get()) break;
}
});
Another thing that is important to note in your example: the AtomicLong is shared state. If you subscribe a second time to the returned Mono, both subscriptions will share the counter and increment it / check it in parallel, which is probably not good.
Creating these state variables inside the Consumer<MonoSink> of Mono.create ensures that each subscription gets its own separate state.
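The two ideas above (a flag checked on every iteration, and state owned by each run rather than shared) can be sketched without Reactor at all. The following is a plain-Java analogue, not Reactor code: CancellableSpin is a hypothetical class, each instance plays the role of one subscription, and cancel() plays the role of the sink's onCancel handler.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

final class CancellableSpin {
    // Both fields belong to this instance, so every run gets its own state.
    private final AtomicBoolean cancelled = new AtomicBoolean(false);
    private final AtomicLong counter = new AtomicLong();
    private final Thread thread = new Thread(this::spin, "spin");

    private void spin() {
        while (!cancelled.get()) { // the flag is checked on every iteration
            counter.incrementAndGet();
        }
    }

    /** Creates a new, independent run and starts its loop. */
    static CancellableSpin start() {
        CancellableSpin run = new CancellableSpin();
        run.thread.start();
        return run;
    }

    /** Flips the flag the loop is watching. */
    void cancel() {
        cancelled.set(true);
    }

    long count() {
        return counter.get();
    }

    /** Waits up to millis for the loop to exit; returns true if it has stopped. */
    boolean awaitStopped(long millis) {
        try {
            thread.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }
}
```

Cancelling one run stops only that run's loop; a second run started with start() keeps counting until its own cancel() is called.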

RxJava error handling for hot observable

I'm pretty new to RxJava and have some questions on patterns etc.
I'm creating an observable using the code below:
public Observable<Volume> getVolumeObservable(Epic epic) {
return Observable.create(event -> {
try {
listeners.add(streamingAPI.subscribeForChartCandles(epic.getName(), MINUTE, new HandyTableListenerAdapter() {
@Override
public void onUpdate(int i, String s, UpdateInfo updateInfo) {
if (updateInfo.getNewValue(CONS_END).equals(ONE)) {
event.onNext(new Volume(Integer.parseInt(updateInfo.getNewValue(LAST_TRADED_VOLUME))));
}
}
}));
} catch (Exception e) {
LOG.error("Error from volume observable", e);
}
});
}
Everything is working as expected, but I have some questions on error handling.
If I understand correctly, this is to be viewed as a "hot observable", i.e. events will happen regardless of whether there is a subscription or not (onUpdate is a callback used by a remote server which I have no control over).
I've chosen not to call onError here since I don't want the observable to stop emitting events because of a single exception. Is there a better pattern to use? .retry() comes to mind, but I'm not sure that it makes sense for a hot observable?
Also, how is the observable represented when the subscription has been created, but before the first onNext is called? Is it just an Observable.empty()?
1) Your observable is not hot. The distinguishing factor is whether multiple subscribers share the same subscription. Observable.create() invokes subscribe function for every subscriber, i.e. it is cold.
It is easy to make it hot though. Just add share() operator. It will subscribe with first subscriber and unsubscribe with last one. Do not forget to implement unsubscribe functionality with something like this:
event.setCancellable(() -> listeners.remove(...));
2) Errors could be recoverable and not recoverable.
In case you consider an error to be self-recoverable (no action required from your side) you should not call onError as this will kill your observable (no further events would be emitted). You can possibly notify your subscribers by emitting special Volume message with error details attached.
In case an error is fatal, e.g. you have failed to add listener, so there could be no further messages, you should not silently ignore this. Emit onError as your observable is not functional anyway.
In case an error requires actions from you, typically retry, or retry with timeout, you can add one of retryXxx() operators. Do this after create() but before share().
3) Observable is an object with subscribe() method. How exactly it is represented depends on the method you created it with. See source code of create() for example.

RxJava pattern for requesting a remote Observable with a temporary cache

The use case is this:
I want to temporarily cache the latest emitted expensive Observable response, but after it expires, return to the expensive source Observable and cache it again, etc.
A pretty basic network cache scenario, but I'm really struggling to get it working.
private Observable<String> getContentObservable() {
// expensive upstream source (API, etc.)
Observable<String> sourceObservable = getSourceObservable();
// cache 1 result for 30 seconds, then return to the source
return sourceObservable
.replay(1, 30, TimeUnit.SECONDS)
.autoConnect()
.switchIfEmpty(sourceObservable);
}
Initial request: goes to source
Second request within 30 seconds of source emitting: delivered from cache
Third request outside of cache expiry window: nothing. I subscribe to it and I get no data, but it's not switching to the upstream source Observable.
It looks as if I'm just connecting to my ConnectableObservable from autoConnect() and it's never completing with empty, so it's never triggering my switchIfEmpty().
How can I use this combination of replay(1,x,x) and switchIfEmpty()?
Or am I just approaching this wrong from the start?
return sourceObservable
.replay(1, 30, TimeUnit.SECONDS)
.autoConnect()
.switchIfEmpty(sourceObservable);
Initial request: goes to source
Second request within 30 seconds of source emitting: delivered from cache
Third request outside of cache expiry window: nothing. I subscribe to it and I get no data, but it's not switching to the upstream source Observable.
The problem here is that replay just repeats the sequence emitted by sourceObservable in the last 30 seconds. When you subscribe after 30 seconds, the replayed sequence has no events, not even onCompleted(), so switchIfEmpty() cannot work: it depends on the onCompleted() signal arriving without any prior emissions to know that the stream is empty.
In general, replay alone is not sufficient for a cache scenario. What you need is a way to resubscribe when the cache has expired, and to do so on demand, meaning only when some client subscribes. (You could build a cache that refreshes itself every 30 seconds, but I guess that's not the desired behavior.)
So, as @Yurly Kulikov suggested, you need to maintain state, and to control the subscription in order to maintain it.
But I think there is a major flaw in that solution, as it's not exactly thread-safe. If two clients, say A and B, subscribe one after the other, then while A is executing the request and waiting to save the new result in the cache, B can subscribe as well, and a second request will be executed because the cached value has not been set yet by A (its network request has not finished yet).
I suggest a similar approach with a different implementation, which I suggested here:
public class CachedRequest<T> {
private final AtomicBoolean expired = new AtomicBoolean(true);
private final Observable<T> source;
private final long cacheExpirationInterval;
private final TimeUnit cacheExpirationUnit;
private Observable<T> current;
public CachedRequest(Observable<T> o, long cacheExpirationInterval,
TimeUnit cacheExpirationUnit) {
source = o;
current = o;
this.cacheExpirationInterval = cacheExpirationInterval;
this.cacheExpirationUnit = cacheExpirationUnit;
}
private Observable<T> getCachedObservable() {
return Observable.defer(() -> {
if (expired.compareAndSet(true, false)) {
current = source.cache();
Observable.timer(cacheExpirationInterval, cacheExpirationUnit)
.subscribe(aLong -> expired.set(true));
}
return current;
});
}
}
With defer you can return the right Observable according to the cache expiration status, so every subscription made within the expiration window gets the cached Observable (using cache()), meaning the request is performed only once. After the cache expires, the next subscription triggers a new request and sets a new timer to reset the cache expiration.
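If it helps to see the expiry rule in isolation, here is a plain-Java sketch of the same idea without RxJava. It is a synchronous analogue and not a drop-in replacement for the class above: ExpiringCache is a hypothetical name, a Supplier stands in for the expensive source, and the clock is injected (an assumption made only so that expiry can be exercised without sleeping).

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class ExpiringCache<T> {
    private final Supplier<T> source;      // the expensive call
    private final long ttlNanos;           // how long a fetched value stays valid
    private final LongSupplier nanoClock;  // e.g. System::nanoTime in production
    private T value;
    private long fetchedAt;
    private boolean loaded;

    ExpiringCache(Supplier<T> source, long ttlNanos, LongSupplier nanoClock) {
        this.source = source;
        this.ttlNanos = ttlNanos;
        this.nanoClock = nanoClock;
    }

    /** Returns the cached value, refetching only when nothing is cached or the entry has expired. */
    synchronized T get() {
        long now = nanoClock.getAsLong();
        if (!loaded || now - fetchedAt >= ttlNanos) {
            value = source.get(); // other callers wait here instead of fetching again
            fetchedAt = now;
            loaded = true;
        }
        return value;
    }
}
```

Because get() holds the lock while fetching, a second caller that arrives during the fetch waits for the result rather than issuing its own request, which is the property the two-subscriber scenario above is about. The trade-off is that callers block, which the Observable-based version avoids.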
So it turns out you can use Jake Wharton's RxReplayingShare to cache the last value even after dispose.
https://github.com/JakeWharton/RxReplayingShare
You will have to maintain a state shared between many callers. That is why you cannot create the Observable every time getContentObservable() is called.
One way to do it is to create an Observable outside, hold the internal state in the Observable (e.g. using buffer), but implementing stateful behaviour is often easier without Observables.
Here is an example with shared state in a field:
private Optional<String> cached = Optional.empty();
private Observable<String> getContentObservable() {
//use defer to delay cache evaluation to the point when someone subscribes
return Observable.defer(
() ->
cached.isPresent()
? Observable.just(cached.get())
: fetchAndCache()
)
//use the same scheduler for every cached field access
.subscribeOn(scheduler);
}
private Observable<String> fetchAndCache() {
Observable<String> cachedSource = getSourceObservable()
//I assume you only need one, make sure it is 1
.take(1)
.cache();
cachedSource
.observeOn(scheduler)
//side-effect stores the state
.doOnNext(str -> cached = Optional.of(str))
.flatMap(str -> Observable.timer(30, TimeUnit.SECONDS, scheduler))
//another side-effect clears the cache
.subscribe(l -> cached = Optional.empty());
return cachedSource;
}

RxJava: Reading multiple subscriptions and performing an action based on their results?

I have a network call that depends on the inputs of multiple UI elements. It's basically an interface for a transaction, where the user can pick things like the amount, currency, and destination. Before the request is fired off, I need to verify everything (for example, whether the user's balance covers the amount, whether the destination is valid, etc.). I have Observables for all of these network calls, but I'm not sure of the best way to start all of these calls concurrently and use their results to determine what action to take.
Basically, the ideal flow is for each condition to have a failure case (which I can determine in code for each), and if any of those failure cases are met, display an error to the user saying which inputs were invalid. If all of the checks pass, fire off the transaction.
How should I go about this?
If I understood correctly, the signatures of your Observables look similar to this:
// verifier Observables which perform network calls
Observable<Verification1> test1 = ...
Observable<Verification2> test2 = ...
...
// Observable to fire the transaction
Observable<TransactionResult> fireTransaction = ...
// represents the clicks on the "go" button of the UI
Observable<Void> goButtonClicks = ...
Then you could combine all these Observables as follows:
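goButtonClicks.flatMap(theVoid -> {
    return Observable.zip(
        test1.subscribeOn(Schedulers.io()),
        test2.subscribeOn(Schedulers.io()),
        (v1, v2) -> {
            if (v1 and v2 pass all your requirements) {
                return fireTransaction;
            } else {
                // TransactionFailure is assumed to be a subtype of TransactionResult
                return Observable.<TransactionResult>just(new TransactionFailure("error"));
            }
        }
    )
    // zip emits an Observable<TransactionResult> here; flatten it so that
    // fireTransaction is actually subscribed to
    .flatMap(result -> result);
}).subscribe(transactionResult -> {
    UI.showMessage(transactionResult.getMessage());
});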
The .subscribeOn(Schedulers.io()) calls give you parallelism in the verification network calls, and zip allows you to "wait" on all results.
However, I guess that on the server side, you will have to do all these tests again for security reasons. So if you can change the architecture, you may want to always "fire" the transaction in the UI, let the server make the checks, and return a success/failure notification to the UI.
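For comparison, the same "run the checks in parallel, then decide" shape can be sketched with the JDK's CompletableFuture, where thenCombine plays the role of zip. This is an analogue and not RxJava code: ParallelChecks, verifyAndSend and the message strings are made up for the example, and the Suppliers stand in for the verification and transaction network calls.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

final class ParallelChecks {
    /**
     * Runs both checks concurrently and fires the transaction only if both pass;
     * otherwise returns a message naming the inputs that failed.
     */
    static String verifyAndSend(Supplier<Boolean> balanceCheck,
                                Supplier<Boolean> destinationCheck,
                                Supplier<String> fireTransaction) {
        // supplyAsync starts each check on the common pool right away
        CompletableFuture<Boolean> balanceOk = CompletableFuture.supplyAsync(balanceCheck);
        CompletableFuture<Boolean> destinationOk = CompletableFuture.supplyAsync(destinationCheck);
        return balanceOk
                // thenCombine waits for both results, like zip
                .thenCombine(destinationOk, (b, d) -> {
                    if (b && d) {
                        return fireTransaction.get();
                    }
                    return "invalid input:" + (b ? "" : " amount") + (d ? "" : " destination");
                })
                .join();
    }
}
```

Since the combining function sees every result at once, it can report all invalid inputs together, which matches the "say which inputs were invalid" requirement in the question.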
