I have come across several instances where people tried to persuade me to use RxJava instead of Android's standard AsyncTask construct.
In my opinion, RxJava offers a lot more features but loses to AsyncTask in simplicity.
Are there any use cases that suit one approach better than the other, or, more generally, can RxJava be considered superior?
The full power of RxJava is visible when you use it on Java 8, preferably with a library like Retrofit. It allows you to trivially chain operations together, with full control of error handling. For example, consider the following code, given id (an int that identifies the order) and apiClient (a Retrofit client for the order management microservice):
apiClient
    .getOrder(id)
    .subscribeOn(Schedulers.io())
    .flatMapIterable(Order::getLineItems)
    .flatMap(lineItem ->
            apiClient.getProduct(lineItem.getProductId())
                    .subscribeOn(Schedulers.io())
                    .map(product -> product.getCurrentPrice() * lineItem.getCount()),
        5)
    .reduce((a, b) -> a + b)
    .retry((count, e) -> count < 2 && (e instanceof RetrofitError))
    .onErrorReturn(e -> -1)
    .subscribe(System.out::println);
This will asynchronously calculate the total price of an order, with the following properties:
at most 5 requests against the API in flight at any one time (and you can tweak the IO scheduler to have a hard cap for all requests, not just for a single observable chain; see the sketch after this list)
up to 2 retries in case of network errors
-1 in case of failure (an antipattern TBH, but that's another discussion)
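For the hard-cap part, one way (just a sketch, with an illustrative pool size of 16) is to back the scheduler with a fixed thread pool and use it instead of Schedulers.io():

ExecutorService pool = Executors.newFixedThreadPool(16);   // at most 16 requests running at once, across all chains
Scheduler cappedIo = Schedulers.from(pool);
// then use .subscribeOn(cappedIo) wherever the chain above uses Schedulers.io()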
Also, IMO the .subscribeOn(Schedulers.io()) after each network call should be implicit - you can do that by modifying how you create the Retrofit client (see the sketch below). Not bad for 11+2 lines of code, even if it's more backend-ish than Android-ish.
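A sketch of that, assuming Retrofit 2 with its RxJava call adapter (ApiClient is an illustrative interface name); every Observable the client returns then subscribes on the IO scheduler by default:

Retrofit retrofit = new Retrofit.Builder()
        .baseUrl("https://api.example.com/")
        .addConverterFactory(GsonConverterFactory.create())
        .addCallAdapterFactory(RxJavaCallAdapterFactory.createWithScheduler(Schedulers.io()))
        .build();
ApiClient apiClient = retrofit.create(ApiClient.class);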
RxBinding/RxAndroid by Jake Wharton provides some nice threading functionality that you can use to make async calls, but RxJava provides way more benefits/functionality than just dealing with async threading. That said, there is a pretty steep learning curve (IMO). Also, it should be noted that there is nothing wrong with using AsyncTasks; you can just write more eloquent solutions with Rx (also, IMO).
TLDR you should make an effort to use it. Retrofit and RxJava work together nicely for your AsyncTask replacement purposes.
For a long time, Spring has been recommending RestTemplate for sync http requests. However, nowadays the documentation says:
NOTE: As of 5.0 this class is in maintenance mode, with only minor requests for changes and bugs to be accepted going forward. Please, consider using the org.springframework.web.reactive.client.WebClient which has a more modern API and supports sync, async, and streaming scenarios.
But I haven't been able to see how one is recommended to use WebClient for sync scenarios. There is this in the documentation:
WebClient can be used in synchronous style by blocking at the end for the result
and I've seen some codebases using .block() all over the place. However, my problem with this is that with some experience in reactive frameworks, I've grown to understand that blocking a reactive call is a code smell and should really be used in testing only. For example this page says
Sometimes you can only migrate part of your code to be reactive, and you need to reuse reactive sequences in more imperative code.
Thus if you need to block until the value from a Mono is available, use Mono#block() method. It will throw an Exception if the onError event is triggered.
Note that you should avoid this by favoring having reactive code end-to-end, as much as possible. You MUST avoid this at all cost in the middle of other reactive code, as this has the potential to lock your whole reactive pipeline.
So is there something I've missed that avoids block()s but allows you to do sync calls, or is using block() everywhere really the way?
Or is the intent of the WebClient API to imply that one just shouldn't do blocking anywhere in the codebase anymore? As WebClient seems to be the only alternative for future HTTP calls offered by Spring, is the only viable choice in the future to use non-blocking calls throughout the codebase, and to change the rest of the codebase to accommodate that?
There's a related question here, but it focuses only on the exception that occurs, whereas I would be interested to hear what the approach should be in general.
Firstly, according to the WebClient Java docs:
public interface WebClient
Non-blocking, reactive client to perform HTTP requests, exposing a fluent, reactive API over underlying HTTP client libraries such as Reactor Netty. Use static factory methods create() or create(String), or builder() to prepare an instance.
So WebClient is not designed to be blocking.
However, the responses that WebClient returns (through its ResponseSpec) are of type reactor.core.publisher.Mono<T> or reactor.core.publisher.Flux<T>, and it is these Reactor types that provide the blocking methods.
WebClient was designed to be a reactive client.
As you might have seen from reactive libraries in other languages, for example RxJS for JavaScript, reactive programming is usually based on functional programming.
What happens here is that Flux and Mono from the Reactor project allow you to call block() in order to get synchronous execution without the need for functional programming.
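For example, a minimal sketch of a synchronous call (the base URL, the /users/{id} endpoint and the User class are illustrative assumptions):

WebClient client = WebClient.create("https://example.com");
User user = client.get()
        .uri("/users/{id}", 42)
        .retrieve()
        .bodyToMono(User.class)   // reactive Mono<User>...
        .block();                 // ...turned synchronous by blocking the calling thread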
Here is a part of an article that I find quite interesting:
Extractors: The Subscribers from the Dark Side
There is another way to subscribe to a sequence, which is to call Mono.block() or Mono.toFuture() or Flux.toStream() (these are the "extractor" methods: they get you out of the Reactive types into a less flexible, blocking abstraction). Flux also has converters collectList() and collectMap() that convert from Flux to Mono. They don't actually subscribe to the sequence, but they do throw away any control you might have had over the subscription at the level of the individual items.
Warning: A good rule of thumb is "never call an extractor". There are some exceptions (otherwise the methods would not exist). One notable exception is in tests, because it's useful to be able to block to allow results to accumulate. These methods are there as an escape hatch to bridge from Reactive to blocking, if you need to adapt to a legacy API, for instance Spring MVC. When you call Mono.block() you throw away all the benefits of Reactive Streams.
So can you do synchronous programming without using the block() operations?
Yes, you can, but then you have to think in terms of functional programming for your application.
Example
public void doSomething1() {
    webClientCall_1....subscribe(response1 -> {
        // ...do something else...
        webClientCall_2....subscribe(response2 -> {
            // ...do something else more, with response1 and response2 available here...
        });
    });
}
This is called subscribe() callback hell. You can avoid it using the .block() methods, but, as the article quoted above mentions, they throw away the reactive nature of the library.
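For completeness: if the second call does not depend on the result of the first, the nesting can also be avoided without blocking by composing the two calls, for example with Mono.zip (here webClientCall_1 and webClientCall_2 are assumed to be a Mono<Response1> and a Mono<Response2>, type names illustrative):

Mono.zip(webClientCall_1, webClientCall_2)
    .subscribe(tuple -> {
        Response1 response1 = tuple.getT1();   // result of the first call
        Response2 response2 = tuple.getT2();   // result of the second call
        // ...do something with both responses, still without blocking...
    });

If the second call does depend on the first response, flatMap is the usual composition operator instead.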
I've just discovered the joys of RxJava and its 10000 public methods, but I am struggling to understand how (and if) we should incorporate async apis into reactive pipelines.
To give an example, let's say I'm building a pipeline that:
takes keys from some cold source (or hot, in which case let's say we already have a way of dealing with an overactive source)
fetches data for those keys using an asynchronous client (or just applies any kind of async processing)
batches the data and
saves it into storage.
If we had a blocking api for step #2, it might look something like this.
source.map(key -> client.callBlocking(key))
      .buffer(500, TimeUnit.MILLISECONDS, 100)
      .subscribe(dataList -> storage.batchSave(dataList));
With a couple more calls, we could parallelise this, making it so that 100 threads are waiting on client.callBlocking at any given time.
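For illustration, I imagine the "couple more calls" would look roughly like this (just a sketch, with 100 as the concurrency cap):

source.flatMap(key ->
            Observable.fromCallable(() -> client.callBlocking(key))
                      .subscribeOn(Schedulers.io()),     // each blocking call gets an IO thread
        100)                                             // at most 100 calls in flight at once
      .buffer(500, TimeUnit.MILLISECONDS, 100)
      .subscribe(dataList -> storage.batchSave(dataList));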
But what if the api we have is already asynchronous and we want to make use of that? I imagine the same pipeline would look something like this
source.magicMethod(new Processor() {
// When downstream requests more items
public void request(int count) {
upstream.request(count);
}
// When upstream delivers us an item
public void onNext(Object key) {
client.callAsync(key)
.onResult((data) -> downstream.onNext(data));
}
})
.buffer(500, TimeUnit.MILLISECONDS, 100)
.subscribe(data -> storage.batchSave(data));
What I want to know is which method is magicMethod. Or perhaps incorporating async calls into a pipeline is a terrible idea and we should never do it. (There is also the question of pre-fetching, so that downstream code does not necessarily have to wait for data after requesting it, but let's put that aside for now.)
Note that this is not a question about parallelism. The second version could run perfectly well in a single thread (plus whatever threads the client may or may not be using under the hood)
Also, while the question is about RxJava, I'd be just as happy to see an answer using Reactor.
Thanks for helping a poor old reactive noob :)
It seems like mental overhead to use both Observable and Flowable in the same project with RxJava 2.
Say, to distinguish in the interfaces that this method returns Observable, so there is no back-pressure, while that one returns Flowable, so it comes with back-pressure support.
But at the end of the day, can we just use Flowable everywhere to make things simpler?
Then:
Can I make Flowable faster by turning off backpressure, so that it behaves the same way as Observable does?
My answer to myself: use Flowable everywhere, for all APIs that are exposed over the network.
My use case: I use RxJava for defining Rx APIs for the Web and for interacting with other microservices over the network.
From my understanding, in the projectreactor https://projectreactor.io/ project (another Rx implementation) there is not even a question of whether to go with backpressure or not: it is there for every Flux.
Going back to RxJava 2: it seems to me that in RxJava it was decided to keep Observable primarily for backward-compatibility purposes (I have not found a good explanation beyond that).
And when it comes to HTTP, I found that I usually need backpressure even for smaller result sets. Even if I return fewer than 1000 items to the client, that does not mean it can handle them all at once (it can be busy doing other stuff), so I would go with backpressure.
So, since I do not want to overcomplicate things / the API / the code, I will go with Flowable everywhere until I see a real need for Observable.
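And if some library still hands me an Observable, I can bridge it into the all-Flowable API by picking an explicit backpressure strategy; a minimal sketch (RxJava 2):

Flowable<Integer> flowable = Observable.range(1, 1_000_000)       // an Observable source, no backpressure
        .toFlowable(BackpressureStrategy.BUFFER);                 // bridge it, buffering items the consumer has not requested yet
flowable.subscribe(System.out::println);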
How are reactive streams different from non-blocking I/O? What is it that the Java 8 future API cannot do that reactive streams can?
Like non-blocking I/O, Reactive Extensions (ReactiveX) offers a non-blocking programming style.
Not only that, ReactiveX models everything as a stream and offers many operators on streams. This makes asynchronous programming much easier and saves us from callback hell ;)
I recommend reading this document.
http://reactivex.io/intro.html
And here are some good slides for ReactiveX.
https://speakerdeck.com/benjchristensen/applying-reactive-programming-with-rxjava-at-goto-chicago-2015
http://sssslide.com/speakerdeck.com/android10/the-mayans-lost-guide-to-rxjava-on-android
The main reason is that ReactiveX provides operators such as subscribeOn or observeOn to run your pipeline asynchronously. It also provides other functionality that Java 8, or Scala with its default functional programming facilities, does not provide.
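A minimal sketch of those asynchronous operators (expensiveCall() and transform() are hypothetical placeholders):

Observable.fromCallable(() -> expensiveCall())        // assume expensiveCall() does blocking work
          .subscribeOn(Schedulers.io())               // the call runs on an IO thread
          .observeOn(Schedulers.computation())        // downstream operators run on a computation thread
          .map(result -> transform(result))           // assume transform() is CPU-bound
          .subscribe(System.out::println);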
Here you can see an example of the asynchronous operators, to understand how they work: https://github.com/politrons/reactive/blob/master/src/test/java/rx/observables/scheduler/ObservableAsynchronous.java
And more general examples of RxJava here: https://github.com/politrons/reactive
Non-blocking I/O is a lower-level abstraction than reactive streams, which offer you constructs like these (consider serviceX to be a Retrofit client):
Observable.zip(
        service1.getFoo(1),
        service2.doBar(xyz),
        service3.makeBaz("meh"),
        (a, b, c) -> a + b + c)
    .flatMap(abc -> service4.somethingElse(abc))   // call the 4th service with the combined result
    .onErrorReturn(e -> "error");
This, in a handful of lines, contacts 3 services in parallel and, when all of them have returned, contacts a 4th service with their results.
I want to parse multiple files to extract the required data and then write the output into an XML file. I have used the Callable interface to implement this. My colleague asked me to use a Java 8 feature which does this job easily. I am really confused about which one of them I should use now.
list.parallelStream().forEach(a -> {
    System.out.println(a);
});
Using concurrency or a parallel stream only helps if you have independent tasks to work on. A good example of when you wouldn't do this is when you are locking on a shared resource, e.g.
// makes no sense to use parallel here.
list.parallelStream().forEach(a -> {
    // locks System.out so only one thread at a time can do any work.
    System.out.println(a);
});
However, as a general question, I would use parallelStream for processing data, instead of the concurrency libraries directly, because:
a functional style of coding discourages shared mutable state. (Actually, you are not supposed to have any mutable state in functional programming, but Java is not really a functional language.)
it's easier to write and understand for processing data.
it's easier to test whether using parallel helps or not. Most likely it won't, and you can just as easily change it back to being serial.
IMHO, given that the chances of parallel code really helping are low, the best feature of parallelStream is not how simple it is to add, but how simple it is to take out.
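For the original use case, a hedged sketch of the parallel-stream version (files is assumed to be a List<String>, and parseFile() and writeToXml() are hypothetical placeholders): keep the independent work inside the stream and do the shared write once at the end, outside it.

List<String> parsed = files.parallelStream()
        .map(file -> parseFile(file))          // independent work per file, no shared state
        .collect(Collectors.toList());         // gather results without any locking
writeToXml(parsed);                            // single write at the end, outside the stream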
The concurrency library is better if you have ad hoc work which is difficult to model as a stream of data, e.g. a worker pool for client requests might be simpler to implement using an ExecutorService.
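For comparison, a minimal self-contained sketch of the ExecutorService/Callable approach for the same use case (parseFile() and writeToXml() are again hypothetical placeholders):

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.*;
import java.util.stream.Collectors;

public class ParseWithExecutor {
    public static void main(String[] args) throws Exception {
        List<String> files = Arrays.asList("a.txt", "b.txt", "c.txt");
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Callable<String>> tasks = files.stream()
                    .map(file -> (Callable<String>) () -> parseFile(file))
                    .collect(Collectors.toList());
            for (Future<String> future : pool.invokeAll(tasks)) {   // blocks until all tasks finish
                writeToXml(future.get());
            }
        } finally {
            pool.shutdown();
        }
    }

    static String parseFile(String file) { return "data from " + file; }  // placeholder for the real parsing
    static void writeToXml(String data)  { System.out.println(data); }    // placeholder for the real XML output
}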