I am observing the lines produced by a NetworkResource, wrapping it in an Observable.create. Here is the code, missing try/catch and cancellation for simplicity:
fun linesOf(resource: NetworkResource): Observable<String> =
    Observable.create { emitter ->
        while (!emitter.isDisposed) {
            val line = resource.readLine()
            Log.i(TAG, "Emitting: $line")
            emitter.onNext(line)
        }
    }
The problem is that later I want to turn it into a Flowable using observable.toFlowable(LATEST) to add backpressure in case my consumer can't keep up, but depending on how I do it, the consumer stops receiving items after item 128.
A) this way everything works:
val resource = ...
linesOf(resource)
.subscribeOn(Schedulers.io())
.observeOn(AndroidSchedulers.mainThread())
.toFlowable(BackpressureStrategy.LATEST)
.subscribe { Log.i(TAG, "Consuming: $it") }
B) here the consumer gets stuck after 128 items (but the emitting continues):
val resource = ...
linesOf(resource)
.toFlowable(BackpressureStrategy.LATEST)
.subscribeOn(Schedulers.io())
.observeOn(AndroidSchedulers.mainThread())
.subscribe { Log.i(TAG, "Consuming: $it") } // <-- stops after 128
In option A) everything works without any issues, and I can see the Emitting: ... log side by side with the Consuming: ... log.
In option B) I can see the Emitting: ... log message happily emitting new lines, but I stop seeing the Consuming: ... log message after item 128, even though the emitting continues.
Question: Can someone help me understand why this happens?
First of all, you are using the wrong type and wrong operator. Using Flowable removes the need for conversion. Using Flowable.generate gets you backpressure:
Flowable.generate(emitter -> {
    String line = resource.readLine();
    if (line == null) {
        emitter.onComplete();
    } else {
        emitter.onNext(line);
    }
});
Second, your version hangs because of a same-pool deadlock caused by subscribeOn. Requests from downstream are scheduled on the same worker, behind your eager emission loop, and cannot take effect, so the consumer stops receiving items after the default 128 elements. Use Flowable.subscribeOn(scheduler, false) to avoid this case.
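To make the scheduling problem concrete, here is a minimal JDK-only model of it. It is not RxJava and not how the library is implemented: a single-thread executor stands in for the io() worker, the `requested` counter for downstream demand (starting at the default 128), and `replenish` for a downstream request. All names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Simplified model (assumption): one worker thread, demand tracked in a counter.
final class SamePoolModel {

    /** Emits `total` items and returns how many reached the consumer. */
    static int run(boolean requestOnSameWorker, int total) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        AtomicLong requested = new AtomicLong(128);   // initial downstream demand
        AtomicInteger delivered = new AtomicInteger();
        try {
            worker.submit(() -> {
                for (int i = 0; i < total; i++) {
                    if (requested.get() == 0) {
                        continue;                      // no demand: the item is dropped
                    }
                    requested.decrementAndGet();
                    delivered.incrementAndGet();
                    // The consumer asks for one more item after each delivery.
                    Runnable replenish = requested::incrementAndGet;
                    if (requestOnSameWorker) {
                        worker.execute(replenish);     // queued behind this loop
                    } else {
                        replenish.run();               // takes effect immediately
                    }
                }
            }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();                         // queued requests drain, then the thread exits
        }
        return delivered.get();
    }

    public static void main(String[] args) {
        System.out.println(run(true, 1000));   // requests stuck behind the loop
        System.out.println(run(false, 1000));  // requests applied directly
    }
}
```

In this model, routing requests through the busy worker caps delivery at the initial 128 items, while applying them directly lets every item through, which is the difference the second argument of subscribeOn is meant to control.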
Sample code:
Flux<Integer> fluxSrc = Flux.<Integer> create(e -> {
e.next(1);
try {
Thread.sleep(500);
} catch (InterruptedException e1) {
throw new RuntimeException(e1);
}
e.complete();
})
.publishOn(Schedulers.single())
.publish().autoConnect(2);
Flux<Integer> fluxA = fluxSrc
.publishOn(Schedulers.single())
.map(j -> 10 + j);
fluxA.subscribe(System.out::println);
Mono<Integer> monoB = fluxSrc
.publishOn(Schedulers.single())
.reduce(20, (j, k) -> {
try {
Thread.sleep(1000);
} catch (InterruptedException e1) {
throw new RuntimeException(e1);
}
return j + k;
});
monoB.subscribe(System.out::println);
Mono.when(fluxA, monoB)
.block();
System.out.println("After");
This produces the following output:
11
After
21
Why does it not wait for both publishers (fluxA and monoB) to complete? How should I structure the code so I make sure all publishers complete before After is reached?
By using .publish(), fluxSrc is turned into a hot flux. Consider:
Hot publishers, on the other hand, do not depend on any number of
subscribers. They might start publishing data right away and would
continue doing so whenever a new Subscriber comes in (in which case
said subscriber would only see new elements emitted after it
subscribed). For hot publishers, something does indeed happen before
you subscribe.
(https://projectreactor.io/docs/core/release/reference/#reactor.hotCold)
One way to fix it is to get rid of publish and operate on a cold stream. Another is to change .autoConnect(2); to .autoConnect(3); because you want to start processing data only when the third subscription, Mono.when(fluxA, monoB).block();, is reached (the previous two are fluxA.subscribe and monoB.subscribe).
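The hot-publisher behavior quoted above can be illustrated with a tiny JDK-only model. This is not Reactor code; `HotSource` is a hypothetical class that simply forwards each value to whoever is subscribed at the moment of emission.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a hot publisher: values go only to current subscribers.
final class HotSource {
    private final List<Consumer<Integer>> subscribers = new ArrayList<>();

    void subscribe(Consumer<Integer> subscriber) {
        subscribers.add(subscriber);
    }

    void emit(int value) {
        for (Consumer<Integer> subscriber : subscribers) {
            subscriber.accept(value);
        }
    }

    public static void main(String[] args) {
        HotSource source = new HotSource();
        List<Integer> early = new ArrayList<>();
        List<Integer> late = new ArrayList<>();

        source.subscribe(early::add);
        source.emit(1);                 // only the early subscriber is present
        source.subscribe(late::add);    // too late: the value is already gone

        System.out.println("early=" + early + " late=" + late);
    }
}
```

A subscriber that arrives after the only element has been emitted sees nothing, which is why the number of subscribers that triggers the connection matters.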
Edit:
Mono.when did wait for the sources to finish, but it got the onComplete signal from the previous subscription.
What probably happened is:
flux A was subscribed to by fluxA.subscribe(System.out::println);, emitted 11 and printed it.
flux B was subscribed to by monoB.subscribe(System.out::println); and started its reduction.
Mono.when was subscribed to (which triggered "multicasting": the fluxes were subscribed to a second time).
The first reduction started; its result will be 21.
Another reduction started and finished immediately with result 20 (it reduced an empty stream, because the only item from fluxSrc had already been consumed by the other reduction).
flux A sent onComplete to both subscribers.
flux B sent onComplete with the reduction result 20. It was passed to the subscription made by Mono.when, which is why it wasn't printed.
Both fluxes had sent onComplete to the Mono.when subscription, so After was printed.
Around that time the first reduction completed with the value 21, which was passed to monoB.subscribe(System.out::println);
I am trying to set up an exponential back off via an Observable.timer if the network is down or if a given service is down. I have a retryWhen when there are errors.
I have two issues. First, I cannot get the timer to work: no matter what time I set, it always runs immediately. From what I understand of the docs, it should wait for the delay and then send a complete, but when I look at the logs, I see no delay.
Second, because I wanted the retry value when it is returned, I used subscribe to get it; however, when an Observable.error is returned, it throws an exception when I do the calculations. For this second issue, I plan to check the type of Observable and act depending on the type.
Any ideas on what I may be doing wrong would be great.
return Observable.zip(
locationObservable,
oAdapterService.getIssuerInformation(sponsorCode),
oAdapterService.getOfferInformation(sponsorCode, activity.getOfferCode()),
(LocationInfo a, IssuerInfo b, OfferInfo c) -> {
OAdapterUtil.setLocationInfo(activity, a);
OAdapterUtil.setIssuerInfo(activity, b);
OAdapterUtil.setOfferInfo(activity, c);
return activity;
})
.retryWhen(errors -> errors.zipWith(Observable.range(1, maxRetries), (error, retries) -> {
if (retries++ < maxRetries) {
log.debug("Issues with Service call for transaction ID {} with initiator ID {}, retry count {}"
,activity.getTransactionId(),activity.getInitiatorId() ,retries);
return Observable.just(retries);
}
log.error("Tried to call Service {} time(s) for for transaction ID {} with initiator ID {}, error is {} "
,maxRetries,activity.getTransactionId(),activity.getInitiatorId(),error);
return Observable.error(error);
}
).flatMap(x -> {
log.debug("X value in flat map is {}",x.toString());
x.subscribe(currentValue -> {
log.debug("X value in subscribe is with subscribe {}",currentValue.toString());
double retryCount = Double.parseDouble(currentValue.toString()) + 2.0 ;
log.debug("retry count {}",retryCount);
long exponentialBackOff =(long)Math.pow(2.0, retryCount);
log.debug("exp back off {}",exponentialBackOff);
// Observable.timer(exponentialBackOff, TimeUnit.SECONDS);
});
Observable.timer(10, TimeUnit.SECONDS);
return x;
// Observable.timer(backoffPeriod, TimeUnit.MILLISECONDS);
}
));
You have an orphan line of code:
Observable.timer(10, TimeUnit.SECONDS);
The only thing this line of code does is to create an observable. The result is discarded because nothing is done with it.
If you need to back off, then do:
return x.delay(10, TimeUnit.SECONDS);
inside of the flatMap() operator. Remove the x.subscribe(...) call; any logging should be done before returning.
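If the fixed 10 seconds should become the exponential value the question computes, the delay can be calculated in a small helper first. The sketch below is JDK-only and hypothetical (`Backoff`, `delaySeconds`, and `callWithBackoff` are not part of RxJava); it mirrors the question's `Math.pow(2.0, retry + 2.0)` and applies it in a plain retry loop. Inside the operator chain, the same value would be what is passed to delay.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

// Hypothetical helper, not an RxJava API.
final class Backoff {

    /** Delay for the given 1-based retry, mirroring pow(2, retry + 2) from the question. */
    static long delaySeconds(int retry) {
        return (long) Math.pow(2.0, retry + 2.0);
    }

    /**
     * Calls `action`, retrying up to `maxRetries` times after a failure.
     * `sleepSeconds` is injected so the waiting strategy can be swapped out.
     */
    static <T> T callWithBackoff(Supplier<T> action, int maxRetries, LongConsumer sleepSeconds) {
        for (int retry = 1; ; retry++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (retry > maxRetries) {
                    throw e;                           // retries exhausted
                }
                sleepSeconds.accept(delaySeconds(retry)); // wait before the next attempt
            }
        }
    }
}
```

Passing the sleeper in as a parameter keeps the delay computation separate from the waiting itself, the same separation the operator chain has between computing the value and applying the delay.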
I have a function that returns a Flowable which continually reads from a database and emits the results.
public Flowable<String> scan() {
    Flowable<String> flow = Flowable.create((FlowableOnSubscribe<String>) emitter -> {
        Result result = new Result();
        while (!result.hasData()) {
            result = request.query(skip, limit);
            // Emit each feature downstream
            result.getResult()
                .getFeatures().forEach(feature -> emitter.onNext(feature));
        }
    }, BackpressureStrategy.BUFFER)
    .subscribeOn(Schedulers.io());
    return flow;
}
Then I have another object that can call this method.
myObj.scan()
.parallel()
.runOn(Schedulers.computation())
.map(feature -> {
//Heavy Computation
})
.sequential()
.blockingSubscribe(msg -> {
logger.debug("Successfully processed " + msg);
}, (e) -> {
logger.error("Failed to process features because of error with scan", e);
});
My heavy computation section could potentially take a very long time. So long in fact that there is a good chance that the database requests will load the whole database into memory before the consumer finishes the first couple entries.
I have read up on backpressure in RxJava, but the only four options I found essentially make me drop data or replace it with the latest value.
Is there a way to make the call that emits a feature block until there is more room in the Flowable?
I.e., I want to treat the Flowable as a blocking queue where a push sleeps if the queue is past its capacity.
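For reference, the blocking-queue behavior described here looks like this with plain JDK classes. This is only a sketch of the desired semantics, not an RxJava solution; `BoundedHandoff`, `run`, and the `DONE` sentinel are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of "push sleeps when the queue is full" using a bounded queue.
final class BoundedHandoff {
    private static final int DONE = -1; // sentinel marking the end of the data

    static List<Integer> run(int items, int capacity) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    queue.put(i);          // blocks while the queue is full
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        List<Integer> consumed = new ArrayList<>();
        try {
            for (int v = queue.take(); v != DONE; v = queue.take()) {
                consumed.add(v);           // stands in for the heavy computation
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return consumed;
    }
}
```

With a capacity of, say, 4, the producer can never run more than four items ahead of the consumer, so memory use stays bounded no matter how slow the consumer is.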
I'm trying to delete a batch of couchbase documents in rapid fashion according to some constraint (or update the document if the constraint isn't satisfied). Each deletion is dubbed a "parcel" according to my terminology.
When executing, I run into a very strange behavior - the thread in charge of this task starts working as expected for a few iterations (at best). After this "grace period", couchbase gets "stuck" and the Observable doesn't call any of its Subscriber's methods (onNext, onComplete, onError) within the defined period of 30 seconds.
When the latch timeout occurs (see implementation below), the method returns but the Observable keeps executing (I noticed that when it kept printing debug messages when stopped with a breakpoint outside the scope of this method).
I suspect couchbase is stuck because after a few seconds, many Observables are left in some kind of a "ghost" state - alive and reporting to their Subscriber, which in turn have nothing to do because the method in which they were created has already finished, eventually leading to java.lang.OutOfMemoryError: GC overhead limit exceeded.
I don't know if what I claim here makes sense, but I can't think of another reason for this behavior.
How should I properly terminate an Observable upon timeout? Should I? Any other way around?
public List<InfoParcel> upsertParcels(final Collection<InfoParcel> parcels) {
    final CountDownLatch latch = new CountDownLatch(parcels.size());
    final List<JsonDocument> docRetList = new LinkedList<JsonDocument>();
    Observable<JsonDocument> obs = Observable
        .from(parcels)
        .flatMap(parcel ->
            Observable.defer(() ->
                {
                    return bucket.async().get(parcel.key).firstOrDefault(null);
                })
                .map(doc -> {
                    // In-memory manipulation of the document
                    return updateDocs(doc, parcel);
                })
                .flatMap(doc -> {
                    boolean shouldDelete = ... // Decide by inner logic
                    if (shouldDelete) {
                        if (doc.cas() == 0) {
                            return Observable.just(doc);
                        }
                        return bucket.async().remove(doc);
                    }
                    return (doc.cas() == 0 ? bucket.async().insert(doc) : bucket.async().replace(doc));
                })
        );
    obs.subscribe(new Subscriber<JsonDocument>() {
        @Override
        public void onNext(JsonDocument doc) {
            docRetList.add(doc);
            latch.countDown();
        }
        @Override
        public void onCompleted() {
            // Due to a bug in RxJava, onError() / retryWhen() does not intercept exceptions thrown from within the map/flatMap methods.
            // Therefore, we need to recalculate the "conflicted" parcels and send them for update again.
            while (latch.getCount() > 0) {
                latch.countDown();
            }
        }
        @Override
        public void onError(Throwable e) {
            // Same reason as above
            while (latch.getCount() > 0) {
                latch.countDown();
            }
        }
    });
    latch.await(30, TimeUnit.SECONDS);
    // Recalculating remaining failed parcels and returning them for another cycle of this method (there's a loop outside)
}
I think this is indeed because a CountDownLatch timing out doesn't signal the source that data processing should stop.
You could use more of RxJava: use toList().timeout(30, TimeUnit.SECONDS).toBlocking().single() instead of collecting into an external list (which is unsynchronized and thus unsafe) and using the CountDownLatch.
This will block until a List of your documents is returned.
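The difference between a latch timing out and a timeout that stops the work can be shown with a JDK-only analogy. This is not the Couchbase or RxJava API; `TimeoutCancel` and `run` are hypothetical names, and the sleeping loop stands in for the remote calls.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Analogy only: waiting with a timeout and then cancelling the work itself.
final class TimeoutCancel {

    /** Returns true if the worker observed the cancellation and stopped. */
    static boolean run(long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicBoolean stopped = new AtomicBoolean(false);
        Future<?> work = pool.submit(() -> {
            try {
                while (true) {
                    Thread.sleep(10);       // stands in for a slow remote call
                }
            } catch (InterruptedException e) {
                stopped.set(true);          // the source was told to stop
            }
        });
        try {
            work.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            work.cancel(true);              // unlike a latch timeout, this interrupts the worker
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return stopped.get();
    }
}
```

A latch that merely times out would return to the caller while the loop keeps running, whereas cancelling on timeout reaches the producer as well.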
When you create your Couchbase environment in code, set computationPoolSize to something large. When the Couchbase client runs out of threads while using async, it just stops working and won't ever call the callback.
In the code below I need to release some resources on unsubscription (where it logs "release").
Observable first = Observable.create(new Observable.OnSubscribe<Object>() {
    @Override
    public void call(Subscriber<? super Object> subscriber) {
        subscriber.add(Subscriptions.create(() -> {
            log("release");
        }));
    }
}).doOnUnsubscribe(() -> log("first"));
Observable second = Observable.create(…).doOnUnsubscribe(() -> log("second"));
Observable result = first.mergeWith(second).doOnUnsubscribe(() -> log("result"));
Subscription subscription = result.subscribe(…);
//…
subscription.unsubscribe();
But it logs only “result”. Looks like unsubscription is not propagated to merge’s child observables. So how to handle unsubscription inside of first observable’s Observable.OnSubscribe?
Most of the time, calling unsubscribe only has an effect on a live sequence and may not propagate if certain sequences have already completed: operators may not keep their sources around, so that they avoid memory leaks. The main idea is that operators release any resources they manage on termination, just before or just after they call their downstream's onError or onCompleted methods, but this is applied somewhat inconsistently in 1.x.
If you want to make sure resources are released, look at the using operator, which releases your resource upon termination or unsubscription:
Observable.using(
() -> "resource",
r -> Observable.just(r),
r -> System.out.println("Releasing " + r))
.subscribe(System.out::println);