For a long time, Spring recommended RestTemplate for synchronous HTTP requests. However, nowadays the documentation says:
NOTE: As of 5.0 this class is in maintenance mode, with only minor requests for changes and bugs to be accepted going forward. Please, consider using the org.springframework.web.reactive.client.WebClient which has a more modern API and supports sync, async, and streaming scenarios.
But I haven't been able to see how one is recommended to use WebClient for sync scenarios. There is this in the documentation:
WebClient can be used in synchronous style by blocking at the end for the result
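In practice that seems to boil down to something like this (just a sketch; the base URL, endpoint and response type here are placeholders):

WebClient client = WebClient.create("https://example.org");   // placeholder base URL

String body = client.get()
        .uri("/resource")                 // placeholder endpoint
        .retrieve()
        .bodyToMono(String.class)
        .block();                         // blocks the calling thread until the response arrives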
And I've seen some codebases using .block() all over the place. However, my problem with this is that, with some experience in reactive frameworks, I've grown to understand that blocking a reactive call is a code smell and should really be used in testing only. For example, this page says:
Sometimes you can only migrate part of your code to be reactive, and you need to reuse reactive sequences in more imperative code.
Thus if you need to block until the value from a Mono is available, use Mono#block() method. It will throw an Exception if the onError event is triggered.
Note that you should avoid this by favoring having reactive code end-to-end, as much as possible. You MUST avoid this at all cost in the middle of other reactive code, as this has the potential to lock your whole reactive pipeline.
So is there something I've missed that avoids block()s but allows you to do sync calls, or is using block() everywhere really the way?
Or is the intent of WebClient API to imply that one just shouldn't do blocking anywhere in your codebase anymore? As WebClient seems to be the only alternative for future http calls offered by Spring, is the only viable choice in the future to use non-blocking calls throughout your codebase, and change the rest of the codebase to accommodate that?
There's a related question here but it focuses on the occurring exception only, whereas I would be interested to hear what should be the approach in general.
Firstly, according to the WebClient Javadocs:
public interface WebClient
Non-blocking, reactive client to perform HTTP requests, exposing a fluent, reactive API over underlying HTTP client libraries such as
Reactor Netty. Use static factory methods create() or create(String),
or builder() to prepare an instance.
So WebClient is not designed to be blocking.
However, the response that WebClient returns (via its ResponseSpec) is sometimes of type reactor.core.publisher.Flux<T> and sometimes of type reactor.core.publisher.Mono<T>. It is Flux and Mono, from the Reactor project, that have the blocking methods.
WebClient was designed to be a reactive client.
As you might have seen in reactive libraries for other languages, for example RxJS for JavaScript, reactive programming is usually based on functional programming.
What Flux and Mono from the Reactor project offer here are the block() methods, which give you synchronous execution without the need for functional programming.
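As a small sketch (independent of WebClient), these are the kinds of blocking methods Mono and Flux expose:

Mono<String> mono = Mono.just("value");
String value = mono.block();                        // waits for the single value (or throws on error)

Flux<Integer> flux = Flux.range(1, 3);
Integer first = flux.blockFirst();                  // waits for the first element
List<Integer> all = flux.collectList().block();     // collects everything into a Mono<List>, then waits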
Here is a part of an article that I find very interesting:
Extractors: The Subscribers from the Dark Side
There is another way to
subscribe to a sequence, which is to call Mono.block() or
Mono.toFuture() or Flux.toStream() (these are the "extractor"
methods — they get you out of the Reactive types into a less flexible,
blocking abstraction). Flux also has converters collectList() and
collectMap() that convert from Flux to Mono. They don’t actually
subscribe to the sequence, but they do throw away any control you
might have had over the subscription at the level of the individual
items.
Warning A good rule of thumb is "never call an extractor". There are
some exceptions (otherwise the methods would not exist). One notable
exception is in tests because it’s useful to be able to block to allow
results to accumulate. These methods are there as an escape hatch to
bridge from Reactive to blocking; if you need to adapt to a legacy
API, for instance Spring MVC. When you call Mono.block() you throw
away all the benefits of the Reactive Streams
So can you write your logic without using the block() operations?
Yes, you can, but then you have to think in terms of functional programming
for your application.
Example
public void doSomething1() {
    webClientCall_1....subscribe(response1 -> {
        // ...do something else...
        webClientCall_2....subscribe(response2 -> {
            // ...do something else more, with response1 and response2 available here...
        });
    });
}
This is called subscribe callback hell. You can avoid it by using the block() methods, but, as the quoted article mentions, they throw away the reactive nature of the library.
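For completeness, the reactive way out of that nesting is to compose the calls instead of subscribing inside a subscription, for example with flatMap (a sketch; webClientCall_1, webClientCall_2, combine and logger are hypothetical methods/fields, with the calls returning Mono):

webClientCall_1()
        .flatMap(response1 -> webClientCall_2(response1)          // second call depends on the first
                .map(response2 -> combine(response1, response2))) // both responses available here
        .subscribe(result -> logger.info("Result: {}", result));  // single subscription at the edge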
Related
I am working on several spring-boot applications which have the traditional pattern of thread-per-request. We are using Spring-boot-webflux to acquire WebClient to perform our RESTful integration between the applications. Hence our application design requires that we block the publisher right after receiving a response.
Recently, we've been discussing whether we are unnecessarily spending resources by using a reactive module in our otherwise blocking application design. As I understand it, WebClient makes use of an event loop, assigning worker threads to perform the reactive actions. So using WebClient with .block() would park the original thread while another thread performs the HTTP request. Compared to the alternative, RestTemplate, it seems like WebClient would spend additional resources by using the event loop.
Is it correct that partially introducing spring-webflux in this way consumes additional resources while yielding no performance benefit, neither single-threaded nor concurrent? We are not expecting ever to upgrade our current stack to be fully reactive, so the argument of gradual migration does not apply.
In this presentation Rossen Stoyanchev from the Spring team explains some of these points.
WebClient will use a limited number of threads - 2 per core for a total of 12 threads on my local machine - to handle all requests and their responses in the application. So if your application receives 100 requests and makes one request to an external server for each, WebClient will handle all of those using those threads in a non-blocking / asynchronous manner.
Of course, as you mention, once you call block your original thread will block, so it would be 100 threads + 12 threads for a total of 112 threads to handle those requests. But keep in mind that these 12 threads do not grow in number as you make more requests, and they don't do the I/O heavy lifting, so it's not like WebClient is spawning threads to actually perform the requests or keeping them busy in a thread-per-request fashion.
I'm not sure whether a thread waiting under block() behaves the same as one making a blocking call through RestTemplate - it seems to me that in the former the thread should be idle, waiting for the NIO call to complete, while in the latter the thread is handling the I/O work itself, so maybe there's a difference there.
It gets interesting if you begin using the reactor goodies, for example handling requests that depend on one another, or many requests in parallel. Then WebClient definitely gets an edge as it'll perform all concurrent actions using the same 12 threads, instead of using a thread per request.
As an example, consider this application:
import java.util.Collection;
import java.util.List;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.ApplicationRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.reactive.function.client.WebClient;

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

@SpringBootApplication
public class SO72300024 {

    private static final Logger logger = LoggerFactory.getLogger(SO72300024.class);

    public static void main(String[] args) {
        SpringApplication.run(SO72300024.class, args);
    }

    @RestController
    @RequestMapping("/blocking")
    static class BlockingController {

        // Simulates a slow blocking endpoint
        @GetMapping("/{id}")
        String blockingEndpoint(@PathVariable String id) throws Exception {
            logger.info("Got request for {}", id);
            Thread.sleep(1000);
            return "This is the response for " + id;
        }

        @GetMapping("/{id}/nested")
        String nestedBlockingEndpoint(@PathVariable String id) throws Exception {
            logger.info("Got nested request for {}", id);
            Thread.sleep(1000);
            return "This is the nested response for " + id;
        }
    }

    @Bean
    ApplicationRunner run() {
        return args -> {
            Flux.just(callApi(), callApi(), callApi())
                    .flatMap(responseMono -> responseMono)
                    .collectList()
                    .block()
                    .stream()
                    .flatMap(Collection::stream)
                    .forEach(logger::info);
            logger.info("Finished");
        };
    }

    private Mono<List<String>> callApi() {
        WebClient webClient = WebClient.create("http://localhost:8080");
        logger.info("Starting");
        // 10 requests, each followed by a dependent nested request
        return Flux.range(1, 10).flatMap(i ->
                webClient
                        .get().uri("/blocking/{id}", i)
                        .retrieve()
                        .bodyToMono(String.class)
                        .doOnNext(resp -> logger.info("Received response {} - {}", i, resp))
                        .flatMap(resp -> webClient.get().uri("/blocking/{id}/nested", i)
                                .retrieve()
                                .bodyToMono(String.class)
                                .doOnNext(nestedResp -> logger.info("Received nested response {} - {}", i, nestedResp))))
                .collectList();
    }
}
If you run this app, you can see that all 30 requests are handled immediately, in parallel, by the same 12 (on my computer) threads. Neat! If you think you can benefit from that kind of parallelism in your logic, it's probably worth giving WebClient a shot.
If not, while I wouldn't actually worry about the "extra resource spending" given the reasons above, I don't think it would be worth adding the whole reactor/webflux dependency for this - besides the extra baggage, in day-to-day operations it is a lot simpler to reason about and debug RestTemplate and the thread-per-request model.
Of course, as others have mentioned, you ought to run load tests to have proper metrics.
According to the official Spring documentation, RestTemplate is in maintenance mode and will probably not be supported in future versions:
As of 5.0 this class is in maintenance mode, with only minor requests
for changes and bugs to be accepted going forward. Please, consider
using the org.springframework.web.reactive.client.WebClient which has
a more modern API and supports sync, async, and streaming scenarios
As for system resources, it really depends on your use case, and I would recommend running some performance tests, but it seems that for low workloads the blocking client could perform better owing to its dedicated thread per connection. As load increases, the NIO clients tend to perform better.
Update - Reactive API vs Http Client
It's important to understand the difference between the Reactive API (Project Reactor) and the HTTP client. Although WebClient uses the Reactive API, it doesn't add any additional concurrency until we explicitly use operators like flatMap or delay that can schedule execution on different thread pools. If we just use
webClient
.get()
.uri("<endpoint>")
.retrieve()
.bodyToMono(String.class)
.block()
the code will be executed on the caller thread, the same as with a blocking client.
If we enable debug logging for this code, we will see that the WebClient code is executed on the caller thread, but for network operations execution is switched to a reactor-http-nio-... thread.
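One way to observe this (just an illustrative sketch) is to log the current thread name at different points of the chain:

String response = webClient
        .get()
        .uri("<endpoint>")
        .retrieve()
        .bodyToMono(String.class)
        .doOnNext(body -> System.out.println(
                "response received on " + Thread.currentThread().getName())) // typically a reactor-http-nio-* thread
        .block();                                                            // the caller thread waits here

System.out.println("result handled on " + Thread.currentThread().getName()); // back on the caller thread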
The main difference is that internally WebClient uses an asynchronous client based on non-blocking I/O (NIO). These clients use the reactor pattern (an event loop) to maintain separate thread pool(s), which allows them to handle a large number of concurrent connections.
The purpose of I/O reactors is to react to I/O events and to dispatch
event notifications to individual I/O sessions. The main idea of I/O
reactor pattern is to break away from the one thread per connection
model imposed by the classic blocking I/O model.
By default, Reactor Netty is used, but you could consider the Jetty Reactive HttpClient, Apache HttpComponents (async), or even the AWS Common Runtime (CRT) HTTP Client if you create the required adapter (not sure one already exists).
In general, you can see a trend across the industry toward async I/O (NIO), because it is more resource-efficient for applications under high load.
In addition, to use resources efficiently the whole flow must be async. By using block() we implicitly reintroduce the thread-per-connection approach, which eliminates most of the benefits of NIO. At the same time, using WebClient with block() could be considered a first step in migrating to a fully reactive application.
Great question.
Last week we considered migrating from RestTemplate to WebClient.
This week, I started testing the performance of the blocking WebClient against RestTemplate, and to my surprise RestTemplate performed better in scenarios where the response payloads were large. The difference was considerable, with RestTemplate taking less than half the time to respond and using fewer resources.
I'm still carrying out the performance tests; I have now started testing with a wider range of concurrent users.
The application is MVC and is using Spring 5.3.19 and Spring Boot 2.6.7.
For performance testing I'm using JMeter, and for health checks VisualVM/JConsole.
I am using spring-kafka to implement a consumer that reads messages from a certain topic. Each of these messages is processed by exporting it into another system via a REST API. For that, the code uses the WebClient from the Spring WebFlux project, which results in reactive code:
@KafkaListener(topics = "${some.topic}", groupId = "my-group-id")
public void listenToTopic(final ConsumerRecord<String, String> record) {
    // minimal, non-reactive code here (logging, serializing the string)
    webClient.get().uri(...).retrieve().bodyToMono(String.class)
        // long, reactive chain here
        .subscribe();
}
Now I am wondering if this setup is reasonable or if this could cause a lot of issues because the KafkaListener logic from spring-kafka isn't inherently reactive. I wonder if it is necessary to use reactor-kafka instead.
My understanding of the whole reactive world and also the Kafka world is very limited, but here is what I am currently assuming the above setup would entail:

1. The listenToTopic function will return almost immediately, because the bulk of the work is done in a reactive chain, which does not block the function from returning. From what I understand, the KafkaListener logic will therefore assume that the message has been properly processed right there and then, so it will probably acknowledge it and at some point also commit it.
2. This also means that the processing of the messages could get out of order: work could still be going on in a previous reactive chain while the KafkaListener already fetches the next record. So if the application relies on the messages being fully processed in strict order, the above setup would be bad; if it does not, the setup might be okay.
3. Another issue with the above setup is that the application could overload itself with work if a lot of messages come in. Because the listener function returns almost immediately, a large number of messages could be processing inside reactive chains at the same time.
4. The retry logic that comes built in with the @KafkaListener machinery would not really work here, because exceptions inside the reactive chain would not trigger it. Any retry logic would have to be handled by the reactive code inside the listener function itself.
5. When using reactor-kafka instead of the @KafkaListener annotation, one could change the behaviour described in point 1. Because the listener would then be integrated into the reactive chain, it would be possible to acknowledge a message only when the reactive chain has actually finished. This way, from what I understand, the next message would only be fetched after the previous one has been fully processed, which would probably also address the issues described in points 2-4.
The question: Is my understanding of the situation correct? Are there other issues that could be caused by this setup that I have missed?
Your understanding is correct; either switch to a non-reactive rest client (e.g. RestTemplate) or use reactor-kafka for the consumer.
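For the reactor-kafka route, a consumer that acknowledges a record only after the WebClient call has completed could look roughly like this (a sketch; the topic name, the endpoint, and the pre-configured consumerProps and webClient are assumptions):

ReceiverOptions<String, String> options = ReceiverOptions.<String, String>create(consumerProps)
        .subscription(Collections.singleton("some.topic"));       // placeholder topic

KafkaReceiver.create(options)
        .receive()                                                 // Flux<ReceiverRecord<String, String>>
        .concatMap(record -> webClient.get()
                .uri("/export/{id}", record.key())                 // placeholder endpoint
                .retrieve()
                .bodyToMono(String.class)
                .doOnSuccess(body -> record.receiverOffset().acknowledge()))  // ack only once the call has finished
        .subscribe();

Here concatMap processes one record at a time and preserves order, which addresses the out-of-order and overload concerns described above.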
I have a service which uses Spring's RestTemplate to call out to multiple URLs.
To improve performance I'd like to perform these requests in parallel. Two options available to me are:
Java 8 parallel streams, leveraging the fork-join common pool
CompletableFuture, using an isolated thread pool
Just wondering if it is best practice to use parallel streams with blocking I/O calls?
A ForkJoinPool isn't ideal for doing IO work since you don't gain any of the benefits of its work stealing properties. If you planned to use the commonPool and other parts of your app did as well, you might interfere with them. A dedicated thread pool, an ExecutorService for example, is probably the better solution among those two.
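For example, a minimal sketch of that option (restTemplate and the URLs are placeholders):

ExecutorService pool = Executors.newFixedThreadPool(4);        // dedicated pool, separate from the common pool

CompletableFuture<String> f1 = CompletableFuture.supplyAsync(
        () -> restTemplate.getForObject("https://example.org/a", String.class), pool);
CompletableFuture<String> f2 = CompletableFuture.supplyAsync(
        () -> restTemplate.getForObject("https://example.org/b", String.class), pool);

CompletableFuture.allOf(f1, f2).join();                         // wait for both requests to finish
pool.shutdown();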
I'd like to suggest something even better. Instead of writing all the async wrapping code yourself, consider using Spring's AsyncRestTemplate. It's included in the Spring Web library and its API is almost identical to RestTemplate.
Spring's central class for asynchronous client-side HTTP access.
Exposes similar methods as RestTemplate, but returns ListenableFuture
wrappers as opposed to concrete results.
[...]
Note: by default AsyncRestTemplate relies on standard JDK facilities
to establish HTTP connections. You can switch to use a different HTTP
library such as Apache HttpComponents, Netty, and OkHttp by using a
constructor accepting an AsyncClientHttpRequestFactory.
ListenableFuture instances can easily be converted to CompletableFuture instances through ListenableFuture::completable().
As noted in the Javadoc, you can control what async mechanism you want to use by specifying an AsyncClientHttpRequestFactory. There are a number of built-in implementations, one for each of the libraries listed. Internally, some of these libraries might do what you suggested and run blocking IO on dedicated thread pools. Others, like Netty (if memory serves), use non-blocking IO to run the connections. You might gain some benefit from that.
Then it's up to you how you reduce the results. With CompletableFuture, you have access to the anyOf and allOf helpers and any of the combination instance methods.
For example,
URI exampleURI = URI.create("https://www.stackoverflow.com");
AsyncRestTemplate template = new AsyncRestTemplate(/* specific request factory */);
var future1 = template.exchange(exampleURI, HttpMethod.GET, null, String.class).completable();
var future2 = template.exchange(exampleURI, HttpMethod.GET, null, String.class).completable();
var future3 = template.exchange(exampleURI, HttpMethod.GET, null, String.class).completable();
CompletableFuture.allOf(future1, future2, future3).thenRun(() -> {
// you're done
});
AsyncRestTemplate has since been deprecated in favor of Spring WebFlux's WebClient. That API is considerably different, so I won't go into it (except to say that it does let you get back a CompletableFuture as well).
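If you do move to WebClient, the bridge back to a CompletableFuture is a one-liner (a sketch; the base URL and endpoint are placeholders):

CompletableFuture<String> future = WebClient.create("https://example.org")  // placeholder base URL
        .get()
        .uri("/resource")
        .retrieve()
        .bodyToMono(String.class)
        .toFuture();                                                         // Mono -> CompletableFuture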
CompletableFuture would be a better way to do this, as it is semantically more related to the task, and you can keep the code flow going while the task proceeds.
If you use streams, besides the awkwardness of lambdas with exception handling inside, and the fact that a stream pipeline is semantically not so related to the task, you will have to wait for all of them to finish even though they are running in parallel. To avoid that you would need futures, but then you are back to the first solution.
You might consider a mix, using streams to create the futures, as sketched below. But given that this is a set of blocking I/O requests, you will probably not have enough requests or time to take advantage of parallel streams; the library will probably not split the tasks in parallel for you, and you will be better off with a loop.
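That mix might look like this (a sketch; urls, callUrl and pool are hypothetical names for the endpoint list, the blocking call and the dedicated ExecutorService):

List<CompletableFuture<String>> futures = urls.stream()
        .map(url -> CompletableFuture.supplyAsync(() -> callUrl(url), pool))  // submit to the dedicated pool
        .collect(Collectors.toList());

List<String> results = futures.stream()
        .map(CompletableFuture::join)                                          // block once, at the end, for all of them
        .collect(Collectors.toList());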
I came across several instances when people were trying to persuade me into using RxJava instead of Android's standard AsyncTask construct.
In my opinion RxJava offers a lot more features but loses in simplicity against AsyncTask.
Are there any use cases that suit one approach better than the other, or, more generally, can RxJava even be considered superior?
The full power of RxJava is visible when you use it on Java 8, preferably with a library like Retrofit. It allows you to trivially chain operations together, with full control of error handling. For example, consider the following code given id: an int that specifies the order and apiClient: a Retrofit client for the order management microservice:
apiClient
    .getOrder(id)
    .subscribeOn(Schedulers.io())
    .flatMapIterable(Order::getLineItems)
    .flatMap(lineItem ->
            apiClient.getProduct(lineItem.getProductId())
                    .subscribeOn(Schedulers.io())
                    .map(product -> product.getCurrentPrice() * lineItem.getCount()),
            5)
    .reduce((a, b) -> a + b)
    .retry((count, e) -> count < 2 && (e instanceof RetrofitError))
    .onErrorReturn(e -> -1)
    .subscribe(System.out::println);
This will asynchronously calculate the total price of an order, with the following properties:
at most 5 requests against the API in flight at any one time (and you can tweak the IO scheduler to have a hard cap for all requests, not just for a single observable chain)
up to 2 retries in case of network errors
-1 in case of failure (an antipattern TBH, but that's another discussion)
Also, IMO the .subscribeOn(Schedulers.io()) after each network call should be implicit - you can do that by modifying how you create the Retrofit client. Not bad for 11+2 lines of code, even if it's more backend-ish than Android-ish.
RxBinding/RxAndroid by Jake Wharton provides some nice threading functionality that you can use to make async calls, but RxJava provides waaay more benefits/functionality than just dealing with async threading. That said, there is a pretty steep learning curve (IMO). Also, it should be noted that there is nothing wrong with using AsyncTasks; you can just write more eloquent solutions with Rx (also, IMO).
TL;DR: you should make an effort to use it. Retrofit and RxJava work together nicely for your AsyncTask replacement purposes.
How are reactive streams different from non-blocking I/O? What is it that the Java 8 future API cannot do that reactive streams can do?
Like non-blocking I/O, Reactive Extensions (ReactiveX) offers a non-blocking programming style.
Not only that, ReactiveX models everything as a stream and offers many operations on a stream. This functionality makes asynchronous programming very easy and saves us from callback hell ;)
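As a tiny illustration (apiService and its methods are hypothetical Retrofit-style calls; getOrders is assumed to emit each order as an Observable), chaining operators replaces nested callbacks:

apiService.getUser(userId)
        .flatMap(user -> apiService.getOrders(user.getId()))  // next async step, no nested callback
        .filter(Order::isOpen)
        .map(Order::getTotal)
        .subscribe(
                total -> System.out.println("total: " + total),
                error -> System.err.println("failed: " + error));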
I recommend reading this document:
http://reactivex.io/intro.html
And here are some good slides on ReactiveX:
https://speakerdeck.com/benjchristensen/applying-reactive-programming-with-rxjava-at-goto-chicago-2015
http://sssslide.com/speakerdeck.com/android10/the-mayans-lost-guide-to-rxjava-on-android
The main reason is that ReactiveX provides operators, such as subscribeOn or observeOn, to run your pipeline asynchronously. It also provides other functionality that Java 8 or Scala does not provide by default in its functional programming.
Here you can see an example of the asynchronous operators and how they work: https://github.com/politrons/reactive/blob/master/src/test/java/rx/observables/scheduler/ObservableAsynchronous.java
And more general RxJava examples here: https://github.com/politrons/reactive
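A minimal sketch of those operators (blockingCall is a hypothetical blocking method):

Observable.just("some-request")
        .subscribeOn(Schedulers.io())                 // upstream work runs on an I/O thread
        .map(request -> blockingCall(request))        // the blocking call no longer runs on the caller thread
        .observeOn(Schedulers.computation())          // downstream operators switch to a computation thread
        .subscribe(response -> System.out.println(
                "handled on " + Thread.currentThread().getName()));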
Non-blocking I/O is a lower-level abstraction than reactive streams, which offer you constructs like this (consider serviceX to be a Retrofit client):
Observable.zip(
        service1.getFoo(1),
        service2.doBar(xyz),
        service3.makeBaz("meh"),
        (a, b, c) -> service4.somethingElse(a + b + c))
    .onErrorReturn(e -> "error");
This, in 7 lines, contacts 3 services in parallel and, when all of them have returned, contacts a 4th with their results.