I prepare a bunch of requests that I want to send in parallel to an external webservice.
In this flow, I continue to process the response directly (eg inserting something into a database).
Problem: I want to track the maximum request time (for one request!), excluding the processing.
But as written, this will only track the global time including any subprocess:
StopWatch watch = new StopWatch();
watch.start();
Flux.fromIterable(requests)
.flatMap(req -> webClient.send(req, MyResponse.class)
.doOnSuccess(rsp -> processResponse(rsp))) //assume some longer routine
.collectList()
.block();
watch.stop();
System.out.println(watch.getTotalTimeMillis());
Question: how can I measure the maximum time the requests took, excluding the processResponse() time?
When using elapsed on a mono, you will get back a mono of a tuple with both the elapsed time and the original object in it. You have to unwrap them to use those. I wrote an example (a bit simplified from your code) in a test to see it working:
@Test
public void elapsed() {
Flux.fromIterable(List.of(1, 2, 3, 4, 5))
.flatMap(req -> Mono.delay(Duration.ofMillis(100L * req))
.map(it -> "response_" + req)
.elapsed()
.doOnNext(it -> System.out.println("I took " + it.getT1() + " MS"))
.map(Tuple2::getT2)
.doOnSuccess(rsp -> processResponse(rsp)))
.collectList()
.block();
}
@SneakyThrows
public void processResponse(Object it) {
System.out.println("This is the response: " + it);
Thread.sleep(1000);
}
The output looks like this:
I took 112 MS
This is the response: response_1
I took 205 MS
This is the response: response_2
I took 305 MS
This is the response: response_3
I took 403 MS
This is the response: response_4
I took 504 MS
This is the response: response_5
Those numbers include both the delay (which in your case would be the webClient.send() call) and a little overhead from the reactive pipeline itself. The time is measured between subscription (which happens when flatMap subscribes to the inner publisher for a specific request) and the next signal (the result of the map in my case; in yours, the result of the WebClient request). Because elapsed() sits before doOnSuccess, the time spent in processResponse() is not included.
Your code would look something like this:
Flux.fromIterable(requests)
.flatMap(req -> webClient.send(req, MyResponse.class)
.elapsed()
.doOnNext(it -> System.out.println("I took " + it.getT1() + " MS"))
.map(Tuple2::getT2)
.doOnSuccess(rsp -> processResponse(rsp))) //assume some longer routine
.collectList()
.block();
Note that if you want to use a StopWatch instead, that is also possible by doing something like:
Flux.fromIterable(List.of(1, 2, 3, 4, 5)).flatMap(req -> {
StopWatch stopWatch = new StopWatch();
return Mono.fromRunnable(stopWatch::start)
.then(Mono.delay(Duration.ofMillis(100L * req)).map(it -> "response_" + req).doOnNext(it -> {
stopWatch.stop();
System.out.println("I took " + stopWatch.getTime() + " MS");
}).doOnSuccess(this::processResponse));
}).collectList().block();
But personally I would recommend the .elapsed() solution since it's a bit cleaner.
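Both variants above print each request's duration, but the question asks for the maximum. A minimal sketch of keeping a running maximum follows, using only the JDK so it runs on its own. The class name MaxElapsedTracker and its methods are made up for illustration; in the Reactor pipeline, record() would be fed from the elapsed() tuple as shown in the comment.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: keeps the largest duration seen so far; safe to call from several threads. */
public class MaxElapsedTracker {
    private final AtomicLong maxMillis = new AtomicLong(0);

    /** Records one request duration, keeping it only if it is the largest so far. */
    public void record(long elapsedMillis) {
        maxMillis.accumulateAndGet(elapsedMillis, Math::max);
    }

    public long max() {
        return maxMillis.get();
    }

    public static void main(String[] args) {
        MaxElapsedTracker tracker = new MaxElapsedTracker();
        // In the Reactor pipeline this call would sit right after .elapsed():
        //   .doOnNext(it -> tracker.record(it.getT1()))
        // Here the sample durations from the output above are replayed instead.
        for (long elapsed : List.of(112L, 205L, 305L, 403L, 504L)) {
            tracker.record(elapsed);
        }
        System.out.println("Max request time: " + tracker.max() + " ms");
    }
}
```

An AtomicLong is used because flatMap may deliver results on different threads, so a plain long field could lose updates.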
I would avoid stopwatch directly in that method. Rather create a metrics wrapper which can be used at other places as well.
You can leverage .doOnSubscribe(), .doOnError(), .doOnSuccess()
But to answer your question, you can have a timer something like this
public void sendRequest() {
Flux.fromIterable(requests)
.flatMap(req -> webClient.send(req, MyResponse.class)
.transform(timerPublisher("time took for " + req.id)))
.collectList()
.block();
}
//this can be made sophisticated by determining what kind of publisher it is
//mono or flux
private <T> Function<Mono<T>, Publisher<T>> timerPublisher(String metric) {
StopWatchHelper stopWatch = new StopWatchHelper(metric);
// start() returns the Consumer<Subscription> that starts the watch on subscription
return mono -> mono.doOnSubscribe(stopWatch.start())
.doOnSuccess(documentRequest -> stopWatch.record())
.doOnError(stopWatch::record);
}
private class StopWatchHelper{
private StopWatch stopWatch;
private String metric;
public StopWatchHelper(String metric){
this.metric = metric;
stopWatch = new StopWatch();
}
public Consumer<Subscription> start() {
return (s) -> stopWatch.start();
}
public void record(){
if(stopWatch.isStarted()){
System.out.println(String.format("Metric %s took %s", metric, stopWatch.getTime()));
}
}
public void record(Throwable t){
if(stopWatch.isStarted()){
System.out.println(String.format("Metric %s took %s, reported in error %s", metric, stopWatch.getTime(), t));
}
}
}
PS: Avoid using .block() -> it beats the purpose :)
Spring Boot provides an out-of-the-box feature that adds instrumentation to your WebClient.
You can "enable" these metrics by using the auto-configured WebClient.Builder to create your WebClient instances, i.e.:
@Bean
public WebClient myCustomWebClient(WebClient.Builder builder) {
return builder
// your custom web client config code
.build();
}
This instrumentation will time each individual API call made by your WebClient and register it in your configured MeterRegistry (by default under the timer name http.client.requests).
I want to provide some data using Reactor's Flux. Since it may take a lot of time to provide this data, I decided to introduce a ping mechanism (e.g. to keep tcp connection alive and not get timeouts). Here is my simplified solution:
public class Example {
private final DataProvider dataProvider;
public Example(DataProvider dataProvider) {
this.dataProvider = dataProvider;
}
public Flux<String> getData() {
AtomicBoolean inProgress = new AtomicBoolean(true);
Flux<String> dataFlux = dataProvider.provide()
.doFinally(ignoreIt -> inProgress.set(false));
return dataFlux.mergeWith(ping(inProgress::get));
}
private Publisher<String> ping(Supplier<Boolean> inProgress) {
return Flux.interval(Duration.ofSeconds(1), Duration.ofSeconds(1))
.map((tick) -> "ping " + tick)
.takeWhile(ignoreIt -> inProgress.get());
}
interface DataProvider {
Flux<String> provide();
}
public static void main(String[] args) {
Callable<String> dataProviderLogic = () -> {
Thread.sleep(3500);
return "REAL DATA - SHOULD TERMINATE PING";
};
// wrapping synchronous call
DataProvider dataProvider = () -> Mono.fromCallable(dataProviderLogic)
.subscribeOn(Schedulers.boundedElastic())
.flux();
new Example(dataProvider).getData()
.doOnNext(data -> System.out.println("GOT: " + data))
.blockLast();
}
}
Above code prints on console:
GOT: ping 0
GOT: ping 1
GOT: ping 2
GOT: REAL DATA - SHOULD TERMINATE PING
So it works as expected.
The question is: how can I test this ping mechanism in a Junit5 test, so it won't take a lot of time (e.g. several seconds)?
In an ideal world I would like to write a test which imitates a delay for the data provision, check if expected number of pings was generated and verify if complete signal was emitted (to make sure that ping flux terminates as expected). Of course I would like to have a unit test, which can be run in ms.
I tried this, but with no luck:
@Test
void test() {
TestPublisher<String> publisher = TestPublisher.create();
Flux<String> data = new Example(publisher::flux).getData();
StepVerifier.withVirtualTime(() -> data)
.thenAwait(Duration.ofMillis(3500))
.then(() -> publisher.emit("REAL DATA - SHOULD TERMINATE PING"))
.then(publisher::complete)
.expectNextCount(4)
.verifyComplete();
}
Above test ends up with this error:
java.lang.AssertionError: expectation "expectNextCount(4)" failed (expected: count = 4; actual: counted = 1; signal: onComplete())
Is it possible at all to use virtual time for internally created Flux.interval?
Any ideas for an alternative ping solution will be appreciated.
Although the ping mechanism above is not the best one (I would suggest a Sink instead of the AtomicBoolean and takeUntilOther instead of takeWhile), the problem in my case was probably that not all of the Flux was assembled inside withVirtualTime. Flux.interval only picks up the virtual-time scheduler if it is created inside the supplier passed to withVirtualTime; in the failing test, getData() was called before that. This code works as expected in the above case:
@Test
void test() {
StepVerifier.withVirtualTime(() -> {
Flux<String> data = Flux.just("REAL DATA - SHOULD TERMINATE PING").delayElements(Duration.ofMillis(3200));
return new Example(() -> data).getData();
})
.thenAwait(Duration.ofMillis(3500))
.expectNextCount(4)
.thenAwait(Duration.ofMillis(1000))
.verifyComplete();
}
I have a collection of 20 items, I will create a loop for the items and make API Calls to get the data, based on the data returned, I will have to update in the database. This requirement is simple and I am able to accomplish in plain Java.
Now, for performance, I am learning RxJava. Many articles on the internet refer to the async-http-client library for async HTTP calls, but that library seems out of date and its maintainer is planning to hand it over to someone else; the one given in the RxJava library also looks like it was developed in 2014. Since I am new to RxJava, can you please help me with the right approach?
I am currently getting all the data and converting to observables like below
Observable<ENV> envs= Observable.fromIterable(allEnvs);
I would also like to know whether the above code is fine, or whether I should construct the observable like the following. This is a snippet in Groovy, which I will have to write in Java.
val createObserver = Observable.create(ObservableOnSubscribe<String> { emitter ->
emitter.onNext("Hello World")
emitter.onComplete()
})
Kindly help me in choosing the best approach
Imagine that the http call is represented by class below :
public class HttpCall implements Callable<String> {
private final int i;
private HttpCall(int i) {
this.i = i;
}
@Override
public String call() {
try {
Thread.sleep(2000);
} catch (InterruptedException e) {
e.printStackTrace();
}
return "Something for : " + i;
}
}
It waits 2 sec and then emits a string (the http call result).
To combine all the items resulting from different http calls we can use merge operator. But before that we need to transform the Callable to an Observable by using fromCallable operator.
void sequentially() {
List<Observable<String>> httpRequests = IntStream.range(0, 20)
.mapToObj(HttpCall::new)
.map(Observable::fromCallable)
.collect(Collectors.toList());
Observable.merge(httpRequests)
.timestamp(TimeUnit.SECONDS)
.subscribe(e -> System.out.println("Elapsed time : " + e.time() + " -- " + e.value() + ". Executed on thread : " + Thread.currentThread().getName()));
}
Because all the requests are executed on the same thread, the order is maintained :
Elapsed time : 1602122218 -- Something for : 0. Executed on thread : main
Elapsed time : 1602122220 -- Something for : 1. Executed on thread : main
Elapsed time : 1602122222 -- Something for : 2. Executed on thread : main
...
As you can see the items are separated by 2 sec.
To run each request in its own thread we need to tell Rx that we need a thread for each call. Easy-peasy, just switch to one of the suggested schedulers. IO is what we need (as it's an IO operation).
void parallel() {
List<Observable<String>> httpRequests = IntStream.range(0, 20)
.mapToObj(HttpCall::new)
.map(httpCall -> Observable.fromCallable(httpCall)
.subscribeOn(Schedulers.io())
) // take a thread from the IO pool
.collect(Collectors.toList());
Observable.merge(httpRequests)
.timestamp(TimeUnit.SECONDS)
.subscribe(e -> System.out.println("Elapsed time : " + e.time() + " -- " + e.value() + ". Executed on thread : " + Thread.currentThread().getName()));
}
This time the order is not guaranteed and the items are produced at almost the same time:
Elapsed time : 1602123707 -- Something for : 2. Executed on thread : RxCachedThreadScheduler-3
Elapsed time : 1602123707 -- Something for : 0. Executed on thread : RxCachedThreadScheduler-1
Elapsed time : 1602123707 -- Something for : 1. Executed on thread : RxCachedThreadScheduler-1
...
The code could be shortened to:
Observable.range(0, 20)
.map(HttpCall::new)
.flatMap(httpCall -> Observable.fromCallable(httpCall).subscribeOn(Schedulers.io()))
.timestamp(TimeUnit.SECONDS)
.subscribe(e -> System.out.println("Elapsed time : " + e.time() + " -- " + e.value() + ". Executed on thread : " + Thread.currentThread().getName()));
merge uses flatMap behind the scenes.
I have a two endpoints : /parent and /child/{parentId}
Both will return List
Let's assume each call will take two seconds.
So If I call /parent and got 10 parents in list, and I want to call and populate each child, I will need 22 seconds in total (2 secs for /parent, 10 times /child/{parentId} with 2 seconds each)
In Spring and Java 10, I can use RestTemplate, combined with Future to do async call.
In this snippet, /slow-five is the call to the parent, while /slow-six is the call to the child.
public List<Child> runSlow2() {
ExecutorService executor = Executors.newFixedThreadPool(5);
long start = System.currentTimeMillis();
RestTemplate restTemplate = new RestTemplate();
var futures = new ArrayList<Future<List<Child>>>();
var result = new ArrayList<Child>();
System.out.println("Start took (ms) : " + (System.currentTimeMillis() - start));
var responseFive = restTemplate.exchange("http://localhost:8005/api/r/slow-five", HttpMethod.GET, null,
new ParameterizedTypeReference<ResponseWrapper<Parent>>() {
});
for (var five : responseFive.getBody().getData()) {
// prepare future
var future = executor.submit(new Callable<List<Child>>() {
@Override
public List<Child> call() throws Exception {
var endpointChild = "http://localhost:8005/api/r/slow-six/" + five.getId();
var responseSix = restTemplate.exchange(endpointChild, HttpMethod.GET, null,
new ParameterizedTypeReference<ResponseWrapper<Child>>() {
});
return responseSix.getBody().getData();
}
});
futures.add(future);
}
for (var f : futures) {
try {
result.addAll(f.get());
} catch (Exception e) {
e.printStackTrace();
}
}
System.out.println("Before return took (ms) : " + (System.currentTimeMillis() - start));
return result;
}
Ignore the ResponseWrapper. It's just wrapper class like this
public class ResponseWrapper<T> {
private List<T> data;
private String next;
}
The code works fine. It took about 3-4 seconds to gather all children from 10 parents. But I don't think it's efficient.
Furthermore, Spring 5 has WebClient that should be able to do this kind of thing.
However, I can't find any sample for this kind of hierarchical call. Most WebClient samples involve only a simple call to a single endpoint without any dependency.
Any clue how I can use WebClient to achieve the same thing, i.e. calling multiple /child endpoints asynchronously and merging the results?
Thanks
It took about 3-4 seconds to gather all childs from 10 parents.
I think we should make clear what slows down the method runSlow2(). Your method makes multiple calls to endpoints. You already improve performance by executing the calls in parallel and gathering their results. I don't think RestTemplate is slow and there is nothing wrong with your code; maybe your endpoints are slow.
One improvement would be, instead of making multiple parallel calls to /child/{parentId}, to introduce a new endpoint which accepts a list of parentId values.
Hope it helps.
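To make the "execute in parallel and gather" point concrete, here is a minimal JDK-only sketch of the fan-out using CompletableFuture. It is not the WebClient version the question asks about; fetchChildren is a hypothetical stand-in for the /child/{parentId} call, and the sleep only simulates latency. With a pool of 5 threads and 10 two-second child calls, the calls need at least two waves, so roughly 4 seconds for the children is the floor regardless of the client used.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class FanOutSketch {
    // Hypothetical stand-in for GET /child/{parentId}; the sleep simulates latency.
    static List<String> fetchChildren(int parentId) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return List.of("child-" + parentId + "-a", "child-" + parentId + "-b");
    }

    /** Starts one child call per parent, then merges the results in parent order. */
    static List<String> fetchAllChildren(List<Integer> parentIds, ExecutorService pool) {
        // Start every call first so they overlap (collecting forces all submissions)...
        List<CompletableFuture<List<String>>> futures = parentIds.stream()
                .map(id -> CompletableFuture.supplyAsync(() -> fetchChildren(id), pool))
                .collect(Collectors.toList());
        // ...then wait for each one and flatten the lists.
        return futures.stream()
                .map(CompletableFuture::join)
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(10);
        try {
            System.out.println(fetchAllChildren(List.of(1, 2, 3), pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Sizing the pool to at least the number of parents lets all child calls run in a single wave, at the cost of one thread per in-flight call.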
I have a Flowable that we are returning in a function that will continually read from a database and add it to a Flowable.
public Flowable<String> scan() {
Flowable<String> flow = Flowable.create((FlowableOnSubscribe<String>) emitter -> {
Result result = new Result();
while (!result.hasData()) {
result = request.query(skip, limit);
partialResult.getResult()
.getFeatures().forEach(feature -> emmitter.emit(feature));
}
}, BackpressureStrategy.BUFFER)
.subscribeOn(Schedulers.io());
return flow;
}
Then I have another object that can call this method.
myObj.scan()
.parallel()
.runOn(Schedulers.computation())
.map(feature -> {
//Heavy Computation
})
.sequential()
.blockingSubscribe(msg -> {
logger.debug("Successfully processed " + msg);
}, (e) -> {
logger.error("Failed to process features because of error with scan", e);
});
My heavy computation section could potentially take a very long time. So long in fact that there is a good chance that the database requests will load the whole database into memory before the consumer finishes the first couple entries.
I have read up on backpressure with rxjava but the only 4 options essentially make me drop data or replace it with the last.
Is there a way to make it so that when I call emmitter.emit(feature) the call blocks until there is more room in the Flowable?
I.E I want to treat the Flowable as a blocking queue where push will sleep if the queue is past the capacity.
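The behaviour described here (a push that sleeps while the buffer is full) can be sketched with a plain JDK ArrayBlockingQueue. This is only an illustration of the desired semantics, not an RxJava solution; the class name, the sentinel value, and the integer items are assumptions made for the example. Within RxJava itself, Flowable.generate may be worth a look, since its callback is invoked only as the downstream requests items.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedHandOffSketch {
    private static final int DONE = -1; // sentinel marking the end of the data

    /** Pushes 'count' items through a queue of 'capacity' slots; returns what the consumer saw. */
    static List<Integer> run(int count, int capacity) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        List<Integer> consumed = new ArrayList<>();

        // Producer: stands in for the database scan loop.
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < count; i++) {
                    queue.put(i); // blocks while the queue is full
                }
                queue.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // Consumer: the producer can never get more than 'capacity' items ahead of this loop.
        for (int item = queue.take(); item != DONE; item = queue.take()) {
            consumed.add(item);
        }
        producer.join();
        return consumed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(10, 2));
    }
}
```

The capacity bounds how much data is held in memory at once, which is the property the unbounded BUFFER strategy lacks.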
Hello, I would like to know how to call two or more web services or REST services in parallel and compose a response from the calls.
I have found some examples on the web using other technologies, but I cannot get it to work with Reactor.
// start task A asynchronously
CompletableFuture<ResponseA> futureA = asyncServiceA.someMethod(someParam);
// start task B asynchronously
CompletableFuture<ResponseB> futureB = asyncServiceB.someMethod(someParam);
CompletableFuture<String> combinedFuture = futureA
.thenCombine(futureB, (a, b) -> a.toString() + b.toString());
// wait till both A and B complete
String finalValue = combinedFuture.join();
////////////////////////////////////////////////////////////////////////////////
static void Run()
{
//Follow steps at this link for adding a reference to the necessary .NET library:
//http://stackoverflow.com/questions/9611316/system-net-http-missing-from-namespace-using-net-4-5
//Create an HTTP Client
var client = new HttpClient();
//Call first service
var task1 = client.GetAsync("http://www.cnn.com");
//Call second service
var task2 = client.GetAsync("http://www.google.com");
//Create list of all returned async tasks
var allTasks = new List<Task<HttpResponseMessage>> { task1, task2 };
//Wait for all calls to return before proceeding
Task.WaitAll(allTasks.ToArray());
}
Let's imagine you need to hit 2 services, so you need 2 base WebClient instances (each configured with the correct base URL and e.g. an authentication scheme):
@Bean
public WebClient serviceAClient(String authToken) {
return WebClient.builder()
.baseUrl("http://serviceA.com/api/v2/")
.defaultHeader(HttpHeaders.AUTHORIZATION, "Basic " + authToken)
.defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
.build();
}
@Bean
public WebClient serviceBClient(String authToken) {
return WebClient.builder()
.baseUrl("https://api.serviceB.com/")
.defaultHeader(HttpHeaders.AUTHORIZATION, "token " + authToken)
.defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
.build();
}
From there, let's assume you get these 2 webclients injected in your controller (as qualified beans). Here's the code to make a joint call to both using Reactor:
Mono<ResponseA> respA = webclientA.get()
.uri("/sub/path/" + foo)
.retrieve()
.bodyToMono(ResponseA.class);
Mono<ResponseB> respB = webclientB.get()
.uri("/path/for/b")
.retrieve()
.bodyToMono(ResponseB.class);
Mono<String> join = respA.zipWith(respB, (a, b) -> a.toString() + b.toString());
return join;
Note the zip function could produce something more meaningful like a business object out of the 2 responses. The resulting Mono<String> only triggers the 2 requests if something subscribes to it (in the case of Spring WebFlux, the framework will do that if you return it from a controller method).
If you're using Reactor with Spring, what you need is the zip operator, which runs your processes and combines their results once they have finished.
/**
* Zip operator execute the N number of Flux independently, and once all them are finished, results
* are combined in TupleN object.
*/
@Test
public void zip() {
Flux<String> flux1 = Flux.just("hello ");
Flux<String> flux2 = Flux.just("reactive");
Flux<String> flux3 = Flux.just(" world");
Flux.zip(flux1, flux2, flux3)
.map(tuple3 -> tuple3.getT1().concat(tuple3.getT2()).concat(tuple3.getT3()))
.map(String::toUpperCase)
.subscribe(value -> System.out.println("zip result:" + value));
}
You can see more about reactive technology here https://github.com/politrons/reactive
If you already have a synchronous implementation you can easily add some reactor features to make it run in parallel thanks to Mono.fromCallable() method.
Mono<ResponseA> responseA = Mono
.fromCallable(() -> blockingserviceA.getResponseA())
.subscribeOn(Schedulers.boundedElastic()); // runs on a separate thread once subscribed (Schedulers.elastic() in older Reactor versions)
Mono<ResponseB> responseB = Mono
.fromCallable(() -> blockingserviceB.getResponseB())
.subscribeOn(Schedulers.boundedElastic());
// At that point nothing has been called yet: responseA and responseB only describe the work to do
AggregatedStuff aggregatedStuff = Mono.zip(responseA, responseB) // zip as many Mono as you want
.flatMap(r -> doStuff(r.getT1(), r.getT2())) // do whatever needed with the results
.block(); // execute the async calls, and then execute flatMap transformation
The important difference between fromCallable() and just() is that the argument passed to just() is evaluated immediately, on the calling thread, while the Mono is being assembled, whereas fromCallable() is lazy: the callable only runs when something subscribes, e.g. when you call block() or subscribe().
Mono<ResponseA> responseA = Mono
.just(blockingserviceA.getResponseA()); // execute directly
Mono<ResponseB> responseB = Mono
.just(blockingserviceB.getResponseB()); // execute directly
// Above code have been executed sequentially, you can access response result
// => yes it is kind of useless, and yes it is exactly how I did it the first time!
So avoid using just() for heavy tasks that you want to run in parallel. Using just() for instantiation is completely correct, since you would not want to create a new thread, and get the overhead that comes with it, every time you instantiate a String or any other object.
PS: As Simon Baslé pointed out, you can use WebClient to directly return Mono and Flux and do async calls, but if you already have your API clients implemented and don't have the option of refactoring the entire application, fromCallable() is a simple way to set up an asynchronous process without refactoring too much code.
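The eager-versus-lazy difference comes down to how Java evaluates method arguments, and it can be seen without Reactor at all. The sketch below is a plain-JDK analogy, not Reactor code: eager() plays the role of just(), lazy() the role of fromCallable(), and slowCall() is a hypothetical stand-in for the blocking service call.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class EagerVsLazySketch {
    static final AtomicInteger calls = new AtomicInteger();

    // Hypothetical stand-in for blockingserviceA.getResponseA(); counts its invocations.
    static String slowCall() {
        calls.incrementAndGet();
        return "response";
    }

    /** Like just(value): the value has already been computed when this method is entered. */
    static <T> Supplier<T> eager(T value) {
        return () -> value;
    }

    /** Like fromCallable(work): the work is only described here, not run. */
    static <T> Supplier<T> lazy(Supplier<T> work) {
        return work;
    }

    public static void main(String[] args) {
        Supplier<String> a = eager(slowCall());                 // slowCall() runs right here
        System.out.println("after eager: " + calls.get());      // 1
        Supplier<String> b = lazy(EagerVsLazySketch::slowCall); // nothing runs yet
        System.out.println("after lazy:  " + calls.get());      // still 1
        b.get();                                                // now the call happens
        System.out.println("after get:   " + calls.get());      // 2
    }
}
```

In the Reactor case the equivalent of b.get() is the subscription, which is also the point where subscribeOn() can move the work to another thread.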