Efficient way to do async call in Spring REST - java

I have two endpoints: /parent and /child/{parentId}
Both will return a List.
Let's assume each call takes two seconds.
So if I call /parent and get 10 parents in the list, and I want to call and populate each child, I will need 22 seconds in total (2 seconds for /parent, plus 10 calls to /child/{parentId} at 2 seconds each).
In Spring with Java 10, I can use RestTemplate combined with Future to make async calls.
In this snippet, /slow-five is the call to the parent, while /slow-six is the call to the children.
public List<Child> runSlow2() {
    ExecutorService executor = Executors.newFixedThreadPool(5);
    long start = System.currentTimeMillis();
    RestTemplate restTemplate = new RestTemplate();
    var futures = new ArrayList<Future<List<Child>>>();
    var result = new ArrayList<Child>();
    System.out.println("Start took (ms) : " + (System.currentTimeMillis() - start));

    var responseFive = restTemplate.exchange("http://localhost:8005/api/r/slow-five", HttpMethod.GET, null,
            new ParameterizedTypeReference<ResponseWrapper<Parent>>() {
            });

    for (var five : responseFive.getBody().getData()) {
        // prepare future
        var future = executor.submit(new Callable<List<Child>>() {
            @Override
            public List<Child> call() throws Exception {
                var endpointChild = "http://localhost:8005/api/r/slow-six/" + five.getId();
                var responseSix = restTemplate.exchange(endpointChild, HttpMethod.GET, null,
                        new ParameterizedTypeReference<ResponseWrapper<Child>>() {
                        });
                return responseSix.getBody().getData();
            }
        });
        futures.add(future);
    }

    for (var f : futures) {
        try {
            result.addAll(f.get());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    System.out.println("Before return took (ms) : " + (System.currentTimeMillis() - start));
    return result;
}
Ignore the ResponseWrapper. It's just a wrapper class like this:
public class ResponseWrapper<T> {
    private List<T> data;
    private String next;
}
The code works fine; it took about 3-4 seconds to gather all children from 10 parents. But I don't think it's efficient.
Furthermore, Spring 5 has WebClient, which should be able to do this kind of thing.
However, I can't find any sample for this kind of hierarchical call. Most samples on WebClient involve only a simple call to a single endpoint without dependency.
Any clue how I can use WebClient to achieve the same thing: calling multiple /child endpoints asynchronously and merging the results?
Thanks

It took about 3-4 seconds to gather all children from 10 parents.
I think we should be clear about what slows down the runSlow2() method. Your method makes multiple calls to endpoints, and you improve performance by executing those calls in parallel and gathering their results. I don't think RestTemplate is slow and there is nothing wrong with your code; most likely your endpoints themselves are slow.
One improvement would be that, instead of making multiple parallel calls to /child/{parentId}, you introduce a new endpoint which accepts a list of parent IDs.
Hope it helps.
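If you do want the WebClient version, here is a minimal, untested sketch of the hierarchical fan-out, assuming the same ResponseWrapper, Parent and Child types and the /slow-five and /slow-six endpoints from the question:
import java.util.List;
import org.springframework.core.ParameterizedTypeReference;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public Mono<List<Child>> runWithWebClient() {
    WebClient webClient = WebClient.create("http://localhost:8005");
    return webClient.get()
            .uri("/api/r/slow-five")
            .retrieve()
            .bodyToMono(new ParameterizedTypeReference<ResponseWrapper<Parent>>() {})
            // turn the parent wrapper into a stream of parents
            .flatMapMany(wrapper -> Flux.fromIterable(wrapper.getData()))
            // fan out: one /slow-six call per parent, at most 10 in flight at a time
            .flatMap(parent -> webClient.get()
                    .uri("/api/r/slow-six/{id}", parent.getId())
                    .retrieve()
                    .bodyToMono(new ParameterizedTypeReference<ResponseWrapper<Child>>() {})
                    .flatMapIterable(ResponseWrapper::getData), 10)
            .collectList();
}
Because the child calls run concurrently as soon as the parents arrive, the total time should approach 2 seconds for /slow-five plus roughly 2 seconds for the slowest batch of child calls; a blocking caller can still do runWithWebClient().block() to get the List<Child>.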

Related

N+1 HTTP calls batching with simultaneous queue

Suppose I have a method which fetches data from an HTTP API:
public R getResource(String id) {
    // HTTP call to the API
    return fetch("http://example.com/api/" + id);
}
But http://example.org/api/ supports multiple IDs at a time, say
http://example.org/api/id1,id2,id3
In a multi-threaded environment, I want to block until I have collected 'm' IDs and then, in one shot, get the data from the API.
Also, to avoid infinite/long blocks, there should be a wait timeout.
For m=5:
Let's say 20 threads arrive concurrently to call this method; then 4 batches of requests should be sent to the HTTP API.
Any implementation suggestions or existing frameworks to support this batching?
Edit suggestions are welcome.
Use a BlockingQueue with a thread doing BlockingQueue::poll(long timeout, TimeUnit unit), with the timeout computed so that, for example, the first request waits no longer than some fixed duration.
The polling thread will gather the IDs from the queue into its own list until it either has m IDs or the maximum waiting duration has been reached. There should be only one such thread.
In the above list, there should be entries containing both the ID and a CompletableFuture<R>, which gets completed using the result of the call. The future is what you give to the caller. Instead of a list, you may want to use a Map<String, CompletableFuture<R>> so that on request completion you can complete the futures easily. Actually, the queue should contain the future too, so you can give it back to the caller.
A rough sketch:
class ResourceMultigetter<R> {
    private final BlockingQueue<Map.Entry<String, CompletableFuture<R>>> newEntries = ...;
    private final Map<String, CompletableFuture<R>> collected = ...;
    private long millisOfFirstWaitingRequest;
    private volatile boolean stopped;

    class Processor implements Runnable {
        @Override
        public void run() { // run by the polling thread
            while (!stopped) {
                final Map.Entry<String, CompletableFuture<R>> e = newEntries.poll(....);
                if (e == null) {
                    if (!timeHasElapsed()) continue;
                } else {
                    if (collected.isEmpty()) {
                        millisOfFirstWaitingRequest = System.currentTimeMillis();
                    }
                    collected.put(e.getKey(), e.getValue());
                    if (collected.size() < m && !timeHasElapsed()) continue;
                }
                final List<String> processedIds = callTheServer();
                processedIds.forEach(id -> collected.remove(id));
            }
        }
    }

    public CompletableFuture<R> enqueue(String id) {
        final CompletableFuture<R> result = new CompletableFuture<>();
        newEntries.add(new AbstractMap.SimpleImmutableEntry<>(id, result));
        return result;
    }
}
You'd initialize it like this:
ResourceMultigetter<R> resourceMultigetter = new ResourceMultigetter<>();
new Thread(resourceMultigetter.new Processor()).start();
The client code would do something like:
R r = resourceMultigetter.enqueue(id).join(); // this blocks until the batch containing this ID has been processed

Right way to handle parallel API calls in Java

I have a web application in Java where, as part of handling a client HTTP request, I need to make 2 API calls. The way I am planning to implement this is to offload one API call to a thread pool, make the other call in the same thread, and then combine the results.
I want to process the API1 call in parallel but don't want it to block in a queue. Hence, if no threads are available, I do it sequentially.
This is what I have come up with.
// this is already created in setup, just listing here for reference.
ThreadPoolExecutor tpe = new ThreadPoolExecutor(1, 2, 300, TimeUnit.SECONDS, new SynchronousQueue<>());
.....
private Future<Integer> getDataFromAPI1(ThreadPoolExecutor tpe) {
    try {
        return tpe.submit(new Callable<Integer>() {
            @Override
            public Integer call() throws Exception {
                // ....make API call here
                return 1; // return result
            }
        });
    } catch (RejectedExecutionException r) {
        // do sequentially and throw any exception encountered
        // .....
        return CompletableFuture.completedFuture(1); // return the result
    }
}

public Integer handle(String reqStub) throws Exception {
    Future<Integer> f1 = getDataFromAPI1(tpe);
    // make API call2 here in same thread
    // ... this populates r2
    Integer r1 = f1.get();
    // now return final result based on 2 results
    return r1 + r2;
}
Assume that exception handling is done by the caller of the handle() method.
Does the code snippet look good in terms of correctness and performance?
Are there better ways of achieving the same?
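One alternative, sketched roughly below with CompletableFuture, keeps the same idea but makes the fallback explicit; it assumes the tpe from the setup above and hypothetical callApi1() and callApi2() methods that wrap the two blocking HTTP calls:
// Rough sketch only: tpe is the ThreadPoolExecutor from the setup above;
// callApi1() and callApi2() are hypothetical blocking wrappers around the two HTTP calls.
public Integer handle(String reqStub) {
    CompletableFuture<Integer> f1;
    try {
        // run API1 on the pool if a thread is free
        f1 = CompletableFuture.supplyAsync(this::callApi1, tpe);
    } catch (RejectedExecutionException rejected) {
        // no pool thread available: fall back to calling API1 sequentially
        f1 = CompletableFuture.completedFuture(callApi1());
    }
    Integer r2 = callApi2(); // API2 always runs on the request thread
    Integer r1 = f1.join();  // waits only if API1 is still running
    return r1 + r2;
}
Using join() also avoids the checked exceptions of Future.get(), so the caller only has to deal with an unchecked CompletionException.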

Java 8: How can I convert a for loop to run in parallel?

for (int i = 0; i < 100000; i++) {
    // REST API request.
    restTemplate.exchange(url, HttpMethod.GET, request, String.class);
}
I have a situation where I have to request a resource for 100k users and it takes 70 minutes to finish. I tried to clean up my code as much as possible, and I was able to reduce it by only 4 minutes.
Since each request is independent of the others, I would love to send the requests in parallel (maybe in chunks of 10s, 100s, or even 1000s, each of which finishes quickly). I'm hoping I can reduce the time to 10 minutes or something close. How do I calculate which chunk size would get the job done quickly?
I have found the following way, but I can't tell if the program processes all 20 at a time, or 5 at a time, or 10 at a time.
IntStream.range(0, 20).parallel().forEach(i -> {
    // ... do something here
});
I appreciate your help. I am open to any suggestions or criticism!
UPDATE: I was able to use IntStream and the task finished in 28 minutes. But I am not sure this is the best I can do.
I used the following code in Java 8 and it did the job. I was able to reduce the batch job from 28 minutes down to 3:39 minutes.
IntStream.range(0, 100000).parallel().forEach(i -> {
    restTemplate.exchange(url, HttpMethod.GET, request, String.class);
});
The standard call to parallel() will use the common ForkJoinPool, whose default parallelism is the number of cores your machine has available minus one.
If you want to specify the parallelism yourself, you have several possibilities:
Change the parallelism of the common pool: System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "20")
Use your own pool:
Example:
int allRequestsCount = 20;
int parallelism = 4; // vary on your own
ForkJoinPool forkJoinPool = new ForkJoinPool(parallelism);
IntStream.range(0, parallelism).forEach(i -> forkJoinPool.submit(() -> {
    int chunkSize = allRequestsCount / parallelism;
    IntStream.range(i * chunkSize, i * chunkSize + chunkSize)
            .forEach(num -> {
                // Simulate long running operation
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                System.out.println(Thread.currentThread().getName() + ": " + num);
            });
}));
This implementation is just exemplary, to give you an idea.
For your situation you can work with the fork/join framework or create an ExecutorService thread pool.
ExecutorService service = null;
try {
    service = Executors.newFixedThreadPool(8);
    service.submit(() -> {
        // do your task
    });
} catch (Exception e) {
    e.printStackTrace();
} finally {
    if (service != null) {
        service.shutdown();
    }
}
try {
    service.awaitTermination(1, TimeUnit.MINUTES);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
if (service.isTerminated())
    System.out.println("All threads have been finished");
else
    System.out.println("At least one thread running");
And using the fork/join framework:
class RequestHandler extends RecursiveAction {
    int start;
    int end;

    public RequestHandler(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    protected void compute() {
        if (end - start <= 10) {
            // REST request
        } else {
            int middle = start + (end - start) / 2;
            invokeAll(new RequestHandler(start, middle), new RequestHandler(middle, end));
        }
    }
}

public class MainClass {
    public static void main(String[] args) {
        ForkJoinTask<?> task = new RequestHandler(0, 100000);
        ForkJoinPool pool = new ForkJoinPool();
        pool.invoke(task);
    }
}
I've written a short article about that. It contains a simple tool that allows you to control the pool size:
https://gt-dev.blogspot.com/2016/07/java-8-threads-parallel-stream-how-to.html

Rx java OutOfMemory

EDITED: see this question, which is clearer and more precise:
RxJava flatMap and backpressure strange behavior
I'm currently writing a data synchronization job with RxJava, and I'm quite a novice with reactive programming and especially the RxJava library.
My job is quite simple: I have a list of element IDs, I call a webservice to get each element by ID, do some processing, and make multiple calls to push the data to the DB.
I load the data from the WS with 1 IO thread and push the data to the DB with multiple IO threads.
However, I always end up with an OutOfMemoryError.
My first thought was that loading the data from the WS is faster than storing it in the DB.
But as both the WS call and the DB call are synchronous, shouldn't they exert backpressure on each other?
Thank you for your help.
My code pretty much looks like this:
@Test
public void test() {
    int MAX_CONCURRENT_LOAD = 1;
    int MAX_CONCURRENT_STORE = 2;
    List<Integer> ids = IntStream.range(0, 10000).boxed().collect(Collectors.toList());

    Observable.from(ids)
            .flatMap(this::produce, MAX_CONCURRENT_LOAD)
            .flatMap(this::consume, MAX_CONCURRENT_STORE)
            .toBlocking().forEach(s -> System.out.println("Value " + s));
    System.out.println("Finished");
}

private Observable<Integer> produce(final int value) {
    return Observable.<Integer>create(s -> {
        try {
            if (!s.isUnsubscribed()) {
                Thread.sleep(500); // Here I call WS to retrieve data
                s.onNext(value);
                s.onCompleted();
            }
        } catch (Exception e) {
            s.onError(e);
        }
    }).subscribeOn(Schedulers.io());
}

private Observable<Boolean> consume(Integer value) {
    return Observable.<Boolean>create(s -> {
        try {
            if (!s.isUnsubscribed()) {
                Thread.sleep(10000); // Here I call DB to store data
                s.onNext(true);
                s.onCompleted();
            }
        } catch (Exception e) {
            s.onNext(false);
            s.onCompleted();
        }
    }).subscribeOn(Schedulers.io());
}
It seems your WS is poll-based, so if you use fromCallable instead of your custom Observable, you get proper backpressure:
return Observable.fromCallable(() -> {
    Thread.sleep(500); // Here I call WS to retrieve data
    return value;
}).subscribeOn(Schedulers.io());
Otherwise, if you have a blocking WS and a blocking database, you can use them to backpressure each other:
Observable.from(ids)
    .map(id -> db.store(ws.get(id)))
    .subscribeOn(Schedulers.io())
    .toBlocking().subscribe(...)
and potentially leave off subscribeOn and toBlocking as well.

How to use AsyncRestTemplate to make multiple calls simultaneously?

I don't understand how to use AsyncRestTemplate effectively for making external service calls. For the code below:
class Foo {
    public void doStuff() throws Exception {
        Future<ResponseEntity<String>> future1 = asyncRestTemplate.getForEntity(
                url1, String.class);
        String response1 = future1.get().getBody();

        Future<ResponseEntity<String>> future2 = asyncRestTemplate.getForEntity(
                url2, String.class);
        String response2 = future2.get().getBody();

        Future<ResponseEntity<String>> future3 = asyncRestTemplate.getForEntity(
                url3, String.class);
        String response3 = future3.get().getBody();
    }
}
Ideally I want to execute all 3 calls simultaneously and process the results once they're all done. However, each external service call doesn't seem to be fetched until get() is called, and get() blocks. So doesn't that defeat the purpose of AsyncRestTemplate? I might as well use RestTemplate.
So I don't understand how I can get them to execute simultaneously.
Simply don't call the blocking get() before dispatching all of your asynchronous calls:
class Foo {
    public void doStuff() throws Exception {
        ListenableFuture<ResponseEntity<String>> future1 = asyncRestTemplate
                .getForEntity(url1, String.class);
        ListenableFuture<ResponseEntity<String>> future2 = asyncRestTemplate
                .getForEntity(url2, String.class);
        ListenableFuture<ResponseEntity<String>> future3 = asyncRestTemplate
                .getForEntity(url3, String.class);

        String response1 = future1.get().getBody();
        String response2 = future2.get().getBody();
        String response3 = future3.get().getBody();
    }
}
You can do both the dispatching and the getting in loops, but note that this style of results gathering is inefficient, as it can get stuck waiting on the next unfinished future.
You could add all the futures to a collection and iterate through it, testing each future with the non-blocking isDone(). When that call returns true, you can then call get().
This way your en masse results gathering is optimised, rather than waiting on the next slow future result in the order the get()s were called.
Better still, you can register callbacks with each ListenableFuture returned by AsyncRestTemplate, and then you don't have to worry about cyclically inspecting the potential results.
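For illustration, a minimal sketch of that callback style, using Spring's org.springframework.util.concurrent.ListenableFutureCallback (the handling bodies are left as placeholders):
ListenableFuture<ResponseEntity<String>> future1 =
        asyncRestTemplate.getForEntity(url1, String.class);

future1.addCallback(new ListenableFutureCallback<ResponseEntity<String>>() {
    @Override
    public void onSuccess(ResponseEntity<String> result) {
        // process this response as soon as the call completes
    }

    @Override
    public void onFailure(Throwable ex) {
        // handle the failure for this call
    }
});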
If you don't have to use AsyncRestTemplate, I would suggest using RxJava instead. The RxJava zip operator is what you are looking for. Check the code below:
private rx.Observable<String> externalCall(String url, int delayMilliseconds) {
    return rx.Observable.create(
            subscriber -> {
                try {
                    Thread.sleep(delayMilliseconds); // simulate long operation
                    subscriber.onNext("response(" + url + ") ");
                    subscriber.onCompleted();
                } catch (InterruptedException e) {
                    subscriber.onError(e);
                }
            }
    );
}

public void callServices() {
    rx.Observable<String> call1 = externalCall("url1", 1000).subscribeOn(Schedulers.newThread());
    rx.Observable<String> call2 = externalCall("url2", 4000).subscribeOn(Schedulers.newThread());
    rx.Observable<String> call3 = externalCall("url3", 5000).subscribeOn(Schedulers.newThread());

    rx.Observable.zip(call1, call2, call3, (resp1, resp2, resp3) -> resp1 + resp2 + resp3)
            .subscribeOn(Schedulers.newThread())
            .subscribe(response -> System.out.println("done with: " + response));
}
All requests to the external services will be executed in separate threads; when the last call has finished, the transformation function (in the example, a simple string concatenation) will be applied, and the result (the concatenated string) will be emitted from the 'zip' observable.
What I understand from your question is that you have a predefined asynchronous method, and what you are trying to do is call this method asynchronously using the RestTemplate class.
I have written a method that will help you call your methods asynchronously.
public void testMyAsynchronousMethod(String... args) throws Exception {
    // Start the clock
    long start = System.currentTimeMillis();

    // Kick off multiple, asynchronous lookups
    ListenableFuture<ResponseEntity<String>> future1 = asyncRestTemplate
            .getForEntity(url1, String.class);
    ListenableFuture<ResponseEntity<String>> future2 = asyncRestTemplate
            .getForEntity(url2, String.class);
    ListenableFuture<ResponseEntity<String>> future3 = asyncRestTemplate
            .getForEntity(url3, String.class);

    // Wait until they are all done
    while (!(future1.isDone() && future2.isDone() && future3.isDone())) {
        Thread.sleep(10); // 10-millisecond pause between each check
    }

    // Print results, including elapsed time
    System.out.println("Elapsed time: " + (System.currentTimeMillis() - start));
    System.out.println(future1.get());
    System.out.println(future2.get());
    System.out.println(future3.get());
}
You might want to use the CompletableFuture class (javadoc).
Transform your calls into CompletableFutures. For instance:
final CompletableFuture<ResponseEntity<String>> cf = CompletableFuture.supplyAsync(() -> {
    try {
        return future.get();
    } catch (InterruptedException | ExecutionException e) {
        throw new RuntimeException(e);
    }
});
Next, call the CompletableFuture::allOf method with your 3 newly created completable futures.
Call the join() method on the result. Once the resulting completable future is resolved, you can get the results from each of the separate completable futures you created above.
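A minimal sketch of those two steps, assuming cf1, cf2 and cf3 are the CompletableFutures created by wrapping each call as shown above:
CompletableFuture<Void> all = CompletableFuture.allOf(cf1, cf2, cf3);
all.join(); // blocks until every wrapped call has completed

// once allOf has resolved, each individual future can be read without further blocking
ResponseEntity<String> response1 = cf1.join();
ResponseEntity<String> response2 = cf2.join();
ResponseEntity<String> response3 = cf3.join();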
I think you are misunderstanding a few things here. When you call the getForEntity method, the requests are already fired. When the get() method of the future object is called, you are just waiting for the request to complete. So in order to fire all three requests within the same subsecond, you just have to do:
// Each of the lines below will fire an http request when it's executed
Future<ResponseEntity<String>> future1 = asyncRestTemplate.getForEntity(url1, String.class);
Future<ResponseEntity<String>> future2 = asyncRestTemplate.getForEntity(url2, String.class);
Future<ResponseEntity<String>> future3 = asyncRestTemplate.getForEntity(url3, String.class);
After all these lines run, all the requests have already been fired (most probably within the same subsecond). Then you can do whatever you want in the meantime. As soon as you call any of the get() methods, you are waiting for that request to complete. If it has already completed, it will return immediately.
// do whatever you want in the meantime
// get the response of the http call and wait if it's not completed
String response1 = future1.get().getBody();
String response2 = future2.get().getBody();
String response3 = future3.get().getBody();
I don't think any of the previous answers actually achieve parallelism. The problem with @diginoise's response is that, as soon as we call get(), we're blocked. Consider that the calls are really slow, such that future1 takes 3 seconds to complete, future2 2 seconds and future3 3 seconds again. With 3 get() calls one after another, we end up waiting 3 + 2 + 3 = 8 seconds.
@Vikrant Kashyap's answer blocks as well, on while (!(future1.isDone() && future2.isDone() && future3.isDone())). Besides, the while loop is a pretty ugly looking piece of code for 3 futures; what if you have more? @lkz's answer uses a different technology than you asked for, and even then, I'm not sure zip is going to do the job. From the Observable Javadoc:
zip applies this function in strict sequence, so the first item
emitted by the new Observable will be the result of the function
applied to the first item emitted by each of the source Observables;
the second item emitted by the new Observable will be the result of
the function applied to the second item emitted by each of those
Observables; and so forth.
Due to Spring's widespread popularity, they try very hard to maintain backward compatibility, and in doing so they sometimes make compromises in the API. The AsyncRestTemplate methods returning ListenableFuture are one such case. If they had committed to Java 8+, CompletableFuture could have been used instead. Why does that matter? Since we won't be dealing with thread pools directly, we don't have a good way to know when all the ListenableFutures have completed. CompletableFuture has an allOf method that creates a new CompletableFuture that is completed when all of the given CompletableFutures complete. Since we don't have that in ListenableFuture, we will have to improvise.
I've not compiled the following code, but it should be clear what I'm trying to do. I'm using Java 8 because it's the end of 2016.
// Lombok FTW
@RequiredArgsConstructor
public final class CounterCallback implements ListenableFutureCallback<ResponseEntity<String>> {
    private final LongAdder adder;

    public void onFailure(Throwable ex) {
        adder.increment();
    }

    public void onSuccess(ResponseEntity<String> result) {
        adder.increment();
    }
}

ListenableFuture<ResponseEntity<String>> f1 = asyncRestTemplate
        .getForEntity(url1, String.class);
// more futures: f2, f3, ...

LongAdder adder = new LongAdder();
ListenableFutureCallback<ResponseEntity<String>> callback = new CounterCallback(adder);
Stream.of(f1, f2, f3)
        .forEach(f -> f.addCallback(callback));

for (int counter = 1; adder.sum() < 3 && counter < 10; counter++) {
    Thread.sleep(1000);
}

// either all futures are done or we're done waiting
Map<Boolean, List<ListenableFuture<ResponseEntity<String>>>> futures = Stream.of(f1, f2, f3)
        .collect(Collectors.partitioningBy(Future::isDone));
Now we have a Map for which futures.get(Boolean.TRUE) gives us all the futures that completed, and futures.get(Boolean.FALSE) gives us the ones that didn't. We will want to cancel the ones that didn't complete.
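For instance, a minimal way to do that:
// cancel whatever did not complete within the waiting window
futures.get(Boolean.FALSE).forEach(f -> f.cancel(true));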
This code does a few things that are important with parallel programming:
It doesn't block.
It limits the operation to some maximum allowed time.
It clearly separates successful and failure cases.
