RxJava2. Flowable to behave same way as Observable does?

Using both Observable and Flowable in the same RxJava 2 project seems like mental overhead.
For example, the interfaces have to signal that one method returns an Observable and so has no backpressure, while another returns a Flowable and so supports it.
But at the end of the day, can we just use Flowable everywhere to make things simpler?
If so:
Can I turn off backpressure on a Flowable so that it is as fast as an Observable and behaves the same way?

My answer to myself: use Flowable everywhere, for every API that is exposed over the network.
My use case: I use RxJava to define a reactive API for the web and to interact with other microservices over the network.
From my understanding, in Project Reactor (https://projectreactor.io/, another Rx implementation) there is not even a question of whether to go with backpressure or not: every Flux has it.
Going back to RxJava 2: it seems to me that Observable was kept primarily for backward compatibility (I have found no good explanation to the contrary).
And when it comes to HTTP, I found that I usually need backpressure even for smaller sets. Even if I return fewer than 1000 items to the client, that does not mean the client can handle them all at once (it may be busy doing other things), so I would go with backpressure.
So, since I do not want to overcomplicate the API and the code, I will go with Flowable everywhere until I see a need for Observable.

Related

Reactive web-crawler with limited concurrent request to the same domain

I'm working on an open-source web crawling project. I noticed that the application occasionally floods the websites it's crawling with requests (I get back 429 Too Many Requests). Because of this, I want to limit the concurrent request count to one, with a one-second delay between requests to the same domain.
I came up with this code to do that:
Flux.generate(downloaderQueueConsumer)
    .doFirst(this::initializeProcessing)
    .flatMap(this::evaluateDocumentLocation)
    .groupBy(this::parseDocumentDomain, 100000)
    .flatMap(documentSourceItem1 -> documentSourceItem1
        .delayElements(Duration.ofSeconds(1))
        .doOnNext(this::incrementProcessedCount)
        .flatMap(this::downloadDocument)
        .flatMap(this::archiveDocument)
        .doOnNext(this::incrementArchivedCount)
    )
    .doFinally(this::finishProcessing)
    .subscribe();
My problem with this code is that it does not limit the number of parallel requests per domain to one. Is there a way to achieve that?

You'd probably need to maintain some sort of state external to the Flux if you wanted to do it this way - there's no obvious way to store and alter this sort of mutable data within the Flux itself.
That being said, this isn't the approach I'd recommend for rate limiting - I've instead done something similar to the following which is a nicer and more robust solution:
Map a 429 status code to a "rate limit" exception (you'll likely need to define this exception type yourself)
Pull in reactor-extra, then use Retry to use exponential backoff with jitter (or whatever backoff strategy you prefer.)
This will give you more control over your specific retry strategy as well as likely making your code more readable.
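As a rough illustration of the second step, here is a dependency-free sketch of exponential backoff with jitter in plain Java. It is not the reactor-extra Retry API. The names RateLimitException, BackoffRetry, delayFor and callWithBackoff, and the jitter range of 50 to 100 percent of the capped delay, are assumptions made for this example.

```java
import java.time.Duration;
import java.util.Random;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical exception type that an HTTP 429 response would be mapped to.
class RateLimitException extends RuntimeException {
    RateLimitException(String message) {
        super(message);
    }
}

class BackoffRetry {
    // Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`,
    // then scaled by a random factor in [0.5, 1.0) so clients do not retry in lockstep.
    static Duration delayFor(int attempt, Duration base, Duration max, Random random) {
        long exponential = base.toMillis() << Math.min(attempt, 20);
        long capped = Math.min(exponential, max.toMillis());
        double jitter = 0.5 + random.nextDouble() / 2;
        return Duration.ofMillis((long) (capped * jitter));
    }

    // Calls `request` until it succeeds or `maxRetries` retries have been used.
    // Only RateLimitException triggers a retry. The `sleeper` is injected so that
    // the actual waiting (for example Thread.sleep) can be swapped out in tests.
    static <T> T callWithBackoff(Supplier<T> request, int maxRetries,
                                 Duration base, Duration max,
                                 Random random, Consumer<Duration> sleeper) {
        for (int attempt = 0; ; attempt++) {
            try {
                return request.get();
            } catch (RateLimitException e) {
                if (attempt >= maxRetries) {
                    throw e;
                }
                sleeper.accept(delayFor(attempt, base, max, random));
            }
        }
    }
}
```

In a reactive pipeline the same policy would be expressed with the library's retry operator instead of a loop; the sketch only shows how the delays are derived.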

What does share operator do in RxJava? When should I use it?

I know that share() is a replacement of publish().refCount().
Then from the RxJava wiki:
Observable.publish( ) — represents an Observable as a Connectable Observable
ConnectableObservable.refCount( ) — makes a Connectable Observable behave like an ordinary Observable
This confuses me. If publish().refCount() just behaves like an ordinary Observable, why should I use it? How does this API make sense?

You're right: Observable.share is just a shortcut for publish().refCount(). I think the description you quoted is not entirely clear, as ConnectableObservable.refCount does a little bit more :)
If you transform your Observable into a ConnectableObservable, it will not emit items (even if something is subscribed) until ConnectableObservable.connect is called explicitly. It basically defers execution of the subscribe method and prevents it from being executed once per subscriber. This technique is often used to make sure that all subscribers are subscribed before the observable starts emitting items (in other words, connect() is called after everyone has subscribed).
If you have more than one subscriber (which often happens), you have to handle their subscriptions and unsubscriptions yourself, and this is where things get tricky. This is why refCount() was introduced. This operator returns a new Observable that keeps track of how many subscribers are subscribed to it and stays connected as long as there is at least one subscription. It also connects automatically when the first subscriber appears.
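The reference counting described above can be modelled without RxJava. The following is only a sketch of the idea, not the library's implementation; RefCountedSource and its onConnect/onDisconnect hooks are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Minimal model of publish().refCount(): one shared upstream connection,
// opened for the first subscriber and closed when the last one leaves.
class RefCountedSource<T> {
    private final List<Consumer<T>> subscribers = new ArrayList<>();
    private final Runnable onConnect;     // hypothetical hook: start the upstream
    private final Runnable onDisconnect;  // hypothetical hook: stop the upstream

    RefCountedSource(Runnable onConnect, Runnable onDisconnect) {
        this.onConnect = onConnect;
        this.onDisconnect = onDisconnect;
    }

    // Registers a subscriber and returns a handle that unsubscribes it when run.
    synchronized Runnable subscribe(Consumer<T> subscriber) {
        subscribers.add(subscriber);
        if (subscribers.size() == 1) {
            onConnect.run(); // first subscriber: connect automatically
        }
        return () -> unsubscribe(subscriber);
    }

    private synchronized void unsubscribe(Consumer<T> subscriber) {
        if (subscribers.remove(subscriber) && subscribers.isEmpty()) {
            onDisconnect.run(); // last subscriber gone: disconnect
        }
    }

    // Called by the upstream: every current subscriber sees the same item.
    synchronized void emit(T item) {
        for (Consumer<T> subscriber : new ArrayList<>(subscribers)) {
            subscriber.accept(item);
        }
    }
}
```

With two subscribers, the connect hook runs once, both receive each emitted item, and the disconnect hook runs only after the second one unsubscribes.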
PS. I'm learning how to use RxJava, if I am wrong - please point it out!

What is the difference between Observable and Flowable in RxJava 2.0?

The Observable and Flowable interfaces seem to be identical. Why was Flowable introduced in RxJava 2.0? When should I prefer Flowable over Observable?

As stated in the documentation:
A small regret about introducing backpressure in RxJava 0.x is that
instead of having a separate base reactive class, the Observable
itself was retrofitted. The main issue with backpressure is that many
hot sources, such as UI events, can't be reasonably backpressured and
cause unexpected MissingBackpressureException (i.e., beginners don't
expect them).
We try to remedy this situation in 2.x by having
io.reactivex.Observable non-backpressured and the new
io.reactivex.Flowable be the backpressure-enabled base reactive class.
Use Observable when you have relatively few items over time (<1000) and/or there's no risk of the producer overflooding consumers and thus causing OOM.
Use Flowable when you have a relatively large number of items and you need to carefully control how the producer behaves in order to avoid resource exhaustion and/or congestion.
Backpressure
Backpressure arises when an observable emits items faster than the consumer can process them, so emitted but unconsumed items pile up.
A backpressure strategy determines how those unconsumed items, emitted by observables but not yet consumed by subscribers, are managed and controlled.
Ref link
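To make the mechanism concrete: Flowable follows the Reactive Streams protocol, in which the consumer states how many items it is ready for through request(n). The sketch below uses the JDK's java.util.concurrent.Flow interfaces (Java 9+) rather than RxJava so that it runs without dependencies. RangePublisher and BatchingSubscriber are hypothetical names, and validation of non-positive request amounts is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

// A synchronous publisher of 0..count-1 that never emits more than was requested.
class RangePublisher implements Flow.Publisher<Integer> {
    private final int count;

    RangePublisher(int count) {
        this.count = count;
    }

    @Override
    public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
        subscriber.onSubscribe(new Flow.Subscription() {
            private int next = 0;
            private long demand = 0;
            private boolean emitting = false;
            private boolean done = false;

            @Override
            public void request(long n) {
                if (done) {
                    return;
                }
                demand += n;
                if (emitting) {
                    return; // re-entrant call from onNext: the outer loop picks up the new demand
                }
                emitting = true;
                while (demand > 0 && next < count && !done) {
                    demand--;
                    subscriber.onNext(next++);
                }
                emitting = false;
                if (next == count && !done) {
                    done = true;
                    subscriber.onComplete();
                }
            }

            @Override
            public void cancel() {
                done = true;
            }
        });
    }
}

// A subscriber that asks for `batch` items at a time and only asks again
// once the previous batch has been fully received.
class BatchingSubscriber implements Flow.Subscriber<Integer> {
    final List<Integer> received = new ArrayList<>();
    int requestCalls = 0;
    boolean completed = false;

    private final int batch;
    private Flow.Subscription subscription;
    private int pending = 0;

    BatchingSubscriber(int batch) {
        this.batch = batch;
    }

    @Override
    public void onSubscribe(Flow.Subscription s) {
        subscription = s;
        requestBatch();
    }

    @Override
    public void onNext(Integer item) {
        received.add(item);
        if (--pending == 0) {
            requestBatch();
        }
    }

    @Override
    public void onError(Throwable t) {
    }

    @Override
    public void onComplete() {
        completed = true;
    }

    private void requestBatch() {
        pending = batch;
        requestCalls++;
        subscription.request(batch);
    }
}
```

Subscribing a BatchingSubscriber with a batch size of 2 to a RangePublisher of 5 items delivers all five values, but the publisher is never more than two items ahead of what the consumer asked for.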

Observable vs Flowable rxJava2

I have been looking at the new RxJava 2 and I'm not quite sure I understand the idea of backpressure anymore...
I'm aware that we have Observable, which does not have backpressure support, and Flowable, which does.
So, as an example, let's say I have a Flowable with interval:
Flowable.interval(1, TimeUnit.MILLISECONDS, Schedulers.io())
        .observeOn(AndroidSchedulers.mainThread())
        .subscribe(new Consumer<Long>() {
            @Override
            public void accept(Long aLong) throws Exception {
                // do smth
            }
        });
This is going to crash after around 128 values, and the reason is pretty obvious: I am consuming items more slowly than they arrive.
But then we have the same with Observable:
Observable.interval(1, TimeUnit.MILLISECONDS, Schedulers.io())
        .observeOn(AndroidSchedulers.mainThread())
        .subscribe(new Consumer<Long>() {
            @Override
            public void accept(Long aLong) throws Exception {
                // do smth
            }
        });
This will not crash at all; even when I add some delay to the consumer, it still works. To make the Flowable work, let's say I add the onBackpressureDrop operator: the crash is gone, but not all values are emitted either.
So the basic question I cannot answer right now is: why should I care about backpressure when I can use a plain Observable and still receive all values without managing the buffer? Or, from the other side, what advantages does backpressure give me when it comes to managing and handling consumption?

In practice, backpressure manifests as bounded buffers. Flowable.observeOn has a buffer of 128 elements that gets drained as fast as the downstream can take it. You can increase this buffer size individually to handle bursty sources, and all the backpressure-management practices from 1.x still apply. Observable.observeOn has an unbounded buffer that keeps collecting the elements, and your app may run out of memory.
You may use Observable for example:
handling GUI events
working with short sequences (less than 1000 elements total)
You may use Flowable for example:
cold and non-timed sources
generator like sources
network and database accessors
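The difference between the bounded and the unbounded buffer described above can be shown with plain JDK queues. This is a simplified model with a consumer that is completely stalled, not RxJava's internals; BufferDemo and its method names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

class BufferDemo {
    // Bounded buffer: counts how many of `produced` items no longer fit
    // while the consumer is not draining anything.
    static int rejectedByBoundedBuffer(int produced, int capacity) {
        Queue<Integer> buffer = new ArrayBlockingQueue<>(capacity);
        int rejected = 0;
        for (int i = 0; i < produced; i++) {
            // offer() returns false once the buffer is full. This is roughly the
            // situation in which a Flowable source that cannot slow down fails
            // with MissingBackpressureException.
            if (!buffer.offer(i)) {
                rejected++;
            }
        }
        return rejected;
    }

    // Unbounded buffer: nothing is rejected, but every unconsumed item stays in memory.
    static int retainedByUnboundedBuffer(int produced) {
        Queue<Integer> buffer = new ArrayDeque<>();
        for (int i = 0; i < produced; i++) {
            buffer.offer(i);
        }
        return buffer.size();
    }
}
```

With 1000 produced items and a capacity of 128, the bounded variant rejects 872 items, while the unbounded variant holds all 1000 in memory and keeps growing for as long as the producer runs.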

Backpressure occurs when your observable (publisher) creates more events than your subscriber can handle. You can end up with subscribers missing events, or with a huge queue of events that eventually leads to running out of memory. Flowable takes backpressure into consideration. Observable does not. That's it.
It reminds me of a funnel that overflows when it has too much liquid. Flowable can help prevent that:
[image missing: with tremendous backpressure]
[image missing: with Flowable, there is much less backpressure]
RxJava 2 has a few backpressure strategies you can use depending on your use case. By strategy I mean that RxJava 2 supplies a way to handle the objects that cannot be processed because of the overflow (backpressure).
The strategies are the values of the BackpressureStrategy enum: MISSING, ERROR, BUFFER, DROP and LATEST.
I won't go through them all, but if, for example, you do not care about the items that overflow, you can use a drop strategy like this:
observable.toFlowable(BackpressureStrategy.DROP)
As far as I know, there is a 128-item limit on the queue; after that there can be an overflow (backpressure). Even if it's not exactly 128, it's close to that number. Hope this helps someone.
If you need to change the buffer size from 128, it looks like it can be done like this (but watch any memory constraints):
myObservable.toFlowable(BackpressureStrategy.MISSING).onBackpressureBuffer(256); // but using MISSING might be slower.
In software development, a backpressure strategy usually means you're telling the emitter to slow down a bit because the consumer cannot handle the velocity at which you're emitting events.
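A small simulation may help to see what a drop strategy does to the data. It is a plain-Java model under simplifying assumptions (one item produced per tick, a consumer that becomes ready for one more item every consumerPeriod ticks), not RxJava code; DropStrategyDemo and deliveredWithDrop are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class DropStrategyDemo {
    // The producer emits one item per tick (the tick number itself).
    // The consumer signals readiness for one more item every `consumerPeriod` ticks.
    // Items that arrive while there is no outstanding demand are discarded.
    static List<Integer> deliveredWithDrop(int ticks, int consumerPeriod) {
        List<Integer> delivered = new ArrayList<>();
        long demand = 0;
        for (int tick = 0; tick < ticks; tick++) {
            if (tick % consumerPeriod == 0) {
                demand++; // consumer requests one more item
            }
            if (demand > 0) {
                demand--;
                delivered.add(tick); // there was demand: the item goes through
            }
            // otherwise the item is dropped and never seen by the consumer
        }
        return delivered;
    }
}
```

For 10 ticks and a consumer period of 3, only the items 0, 3, 6 and 9 are delivered. Nothing crashes and memory stays flat, but the consumer sees only part of the stream.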

The fact that your Flowable crashed after emitting 128 values without backpressure handling doesn't mean it will always crash after exactly 128 values: sometimes it will crash after 10, and sometimes it will not crash at all. I believe this is what happened when you tried the example with Observable - there happened to be no backpressure, so your code worked normally, next time it may not. The difference in RxJava 2 is that there is no concept of backpressure in Observables anymore, and no way to handle it. If you're designing a reactive sequence that will probably require explicit backpressure handling - then Flowable is your best choice.

RxJava instead of AsyncTask?

I have come across several instances where people tried to persuade me to use RxJava instead of Android's standard AsyncTask construct.
In my opinion, RxJava offers a lot more features but loses to AsyncTask in simplicity.
Are there any use cases that suit one approach better than the other, or, more generally, can RxJava be considered superior?

The full power of RxJava is visible when you use it on Java 8, preferably with a library like Retrofit. It allows you to trivially chain operations together, with full control of error handling. For example, consider the following code given id: an int that specifies the order and apiClient: a Retrofit client for the order management microservice:
apiClient
    .getOrder(id)
    .subscribeOn(Schedulers.io())
    .flatMapIterable(Order::getLineItems)
    .flatMap(lineItem ->
        apiClient.getProduct(lineItem.getProductId())
                 .subscribeOn(Schedulers.io())
                 .map(product -> product.getCurrentPrice() * lineItem.getCount()),
        5)
    .reduce((a, b) -> a + b)
    .retry((count, e) -> count <= 2 && (e instanceof RetrofitError))
    .onErrorReturn(e -> -1)
    .subscribe(System.out::println);
This will asynchronously calculate the total price of an order, with the following properties:
at most 5 requests against the API in flight at any one time (and you can tweak the IO scheduler to have a hard cap for all requests, not just for a single observable chain)
up to 2 retries in case of network errors
-1 in case of failure (an antipattern TBH, but that's another discussion)
Also, IMO the .subscribeOn(Schedulers.io()) after each network call should be implicit - you can do that by modifying how you create the Retrofit client. Not bad for 11+2 lines of code, even if it's more backend-ish than Android-ish.

RxBinding/RxAndroid by Jake Wharton provides some nice threading functionality that you can use to make async calls, but RxJava provides waaay more benefits and functionality than just dealing with async threading. That said, there is a pretty steep learning curve (IMO). Also, it should be noted that there is nothing wrong with using AsyncTasks; you can just write more eloquent solutions with Rx (also IMO).
TLDR you should make an effort to use it. Retrofit and RxJava work together nicely for your AsyncTask replacement purposes.
