Running Async Threading for Bulk API Calls in Spring Boot

Running Async Threading for Bulk API Calls in Spring Boot - java

I am trying to call a lot of APIs that needs a lot of time to finish. so I've tried to use threading to make it faster. I'm using Spring Boot, and the internet suggest me to use #Async but there's a problem. after using it for testing, I've realized that #Async only works for every async method calls. I don't know if I was mistaken about this, but overall I've created a controller that adds a lot of #Async method calls (up to 50 #Async method calls) and then run the CompletableFuture.allOf(apiCallsList.toArray(new CompletableFuture[0])).join()
Is there a way to make it simpler? because I don't know how to set the taskExecutor to limit the threadingPool if it was written like this.
Example code:
//get 50 person objects to the personPool
if (personPool.size() == 50) {
List<CompletableFuture<Map<String, String>>> results = new ArrayList<>();
for (Person person: personPool) {
//calls async api method call to a list
results.add(service.asyncApiProcess(service.callApi(person.getName())));
}
//run all async methods in paralel
CompletableFuture.allOf(results.toArray(new CompletableFuture[0])).join();
for (CompletableFuture<Map<String, String>> result : results) {
//write to xlsx
}
personPool.clear(); //clears the personPool for the next 50 data
}

Related

Caching a function with refresh rate

I have the following scenario. Below is a minimalist version of what I am trying to do
in a simple spring boot REST API Controller and Service.
func()
{
String lv=vService.getlv("1.2.1");
String mv=vService.getmv("1.3.1");
}
#Service
public class VService{
public String getlv(String version){
JsonArray lvVersions=makeHTTPGETLv();
String result=getVersion(lvVersions,version);
return result;
}
public String getMv(String version){
JsonArray mvVersions=makeHTTPGETMv();
String result=getVersion(mvVersions,version);
return result;
}
private String getVersion(JsonArray versions, String version){
Map<String,String> versionMap=new HashMap<>();
for(JsonElement je: versions){
JsonObject jo=je.getAsJsonObject();
//some logic
versionMap.put(jo.get("ver").getAsString(),jo.get("rver").getAsString);
}
return versionMap.get(version);
}
}
The getVersion method internally builds a map every time it is called.
The getlv method and getMv method both invokes the getVersion method
In the call getVersion(lvVersions,version); the lvVersions value will always be same,
It is a JSON array containing the mapping between versions.
And so each time getlv is called we are making a GET request makeHTTPGETLv and then
searching for the corresponding mapping of the given version.
Due to the static nature of this, it could be cached.
The multiple GET calls to makeHTTPGETLv could be avoided as it will always return the same JSON array (the changes could occur in some days though) and also the map that is built inside the getVersion method is unchanging.
Since the JsonArray could change, which is the response of a GET request,
So there could be a logic to update the cache every x minute. which could be 5 minutes or 60 mins.
How do I achieve this in spring boot?
Also to avoid taking the time for the first call, I could use Eager loading.
Steps Taken:
I tried using the #Cacheable annotation on top of the functions. It didn't have any effect. It seems I will have to put some more logic there. I could put out a separate function that builds the map that is being built inside the getVersion function. And so use this map for every call. But this map needs to be updated periodically (could be 5 mins, 60 mins, or even 1 day) which will require a GET call and building up the map. This reduces the problem of updating the Map every x minute. Because then I could directly use the map to fetch the corresponding version and avoid the GET call and parsing the response every time.
Similar Reducible/Smaller/alternate problem:
class Test {
private Map<String,String> map;
private void buildMap(){
}
}
The buildMap function updates the map attribute. Is there a way that buildMap function gets called every 30 minutes so that the map remains updated? That sounds like a cron job. Not sure if some caching is helpful here or if cron job and how to achieve that in spring boot.

Thousands of rest calls with spring boot

Let's say that we have the following entities: Project and Release, which is a one to many relationship.
Upon an event consumption from an SQS queue where a release id is sent as part of the event, there might be scenarios where we might have to create thousands of releases in our DB, where for each release we have to make a rest call to a 3rd party service in order to get some information for each release.
That means that we might have to make thousands of calls, in some cases more than 20k calls just to retrieve the information for the different releases and store it in the DB.
Obviously this is not scalable, so I'm not really sure what's the way to go in this scenario.
I know I might use a CompletableFuture, but I'm not sure how to use that with spring.
The http client that I am using is WebClient.
Any ideas?

You can make the save queries in a method transactional by adding the annotation #Transactional above the method signature. The method should also be public, or else this annotation is ignored.
As for using CompletableFuture in spring; You could make a http call method asynchronous by adding the #Async annotation above its signature and by letting it return a CompletableFuture as a return type. You should return a completed future holding the response value from the http call. You can easily make a completed future with the method CompletableFuture.completedFuture(yourValue). Spring will only return the completed future once the asynchronous method is done executing everything int its code block. For #Async to work you must also add the #EnableAsync annotation to one of your configuration classes. On top of that the #Async annotated method must be public and cannot be called by a method from within the same class. If the method is private or is called from within the same class then the #Async annotation will be ignored and instead the method will be executed in the same thread as the calling method is executed.
Next to an #Async annotated method you could also use a parallelStream to execute all 20K http calls in parallel. For example:
List<Long> releaseIds = new ArrayList<>();
Map<Long,ReleaseInfo> releaseInfo = releaseIds.parallelStream().map(releaseId -> new AbstractMap.SimpleEntry<>(releaseId, webClient.getReleaseInfo(releaseId)).collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
Lastly you could also use a ThreadPoolExecutor to execute the http calls in parallel. An example:
List<Long> releaseIds = new ArrayList<>();
ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors()); //I've made the amount of threads in the pool equal to the amount of available CPU processors on the machine.
//Submit tasks to the executor
List<Future<ReleaseInfo>> releaseInfoFutures = releaseIds.stream().map(releaseId -> executor.submit(() -> webClient.getReleaseInfo(releaseId)).collect(Collectors.toList());
//Wait for all futures to complete and map all non-null values to ReleaseInfo list.
List<ReleaseInfo> releaseInfo = releaseInfoFutures.stream().map(this::getValueAfterFutureCompletion).filter(releaseInfo -> releaseInfo != null).collect(Collectors.toList());
private ReleaseInfo getValueAfterFutureCompletion(Future<ReleaseInfo> future){
ReleaseInfo releaseInfo = null;
try {
releaseInfo = future.get();
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
} finally {
return releaseInfo;
}
}
Make sure to call shutdownNow() on ThreadPoolExecutor after you're done with it to avoid memory leaks.

How can I run a concurrent queue of tasks using Rx?

I've found a lot of examples about it and doesn't know what's the 'right' implementation right there.
Basically I've got a object (let's call it NBAManager) and there's a method public Completable generateGame() for this object. The idea is that generateGame method gets called a lot of times and I want to generate games in a sequential way: I was thinking about concurrent queue. I came up with the following design: I'd create a singleton instance of NBAService: service for NBAManager and the body of generateGame() will look like this:
public Completable generateGame(RequestInfo info)
return service.generateGame(info);
So basically I'll pass up that Completable result. And inside of that NBAService object I'll have a queue (a concurrent one, because I want to have an opportunity to poll() and add(request) if there's a call of generateGame() while NBAManager was processing one of the earlier requests) of requests. I got stuck with this:
What's the right way to write such a job queue in Rx way? There're so many examples of it. Could you send me a link of a good implementation?
How do I handle the logic of queue execution? I believe we've to execute if there's one job only and if there're many then we just have to add it and that's it. How can I control it without runnable? I was thinking about using subjects.
Thanks!

There are multiple ways to implement this, you can choose how much RxJava should be invoked. The least involvement can use a single threaded ExecutorService as the "queue" and CompletableSubject for the delayed completion:
class NBAService {
static ExecutorService exec = Executors.newSingleThreadedExecutor();
public static Completable generateGame(RequestInfo info) {
CompletableSubject result = CompletableSubject.create();
exec.submit(() -> {
// do something with the RequestInfo instance
f(info).subscribe(result);
});
return result;
}
}
A more involved solution would be if you wanted to trigger the execution when the Completable is subscribed to. In this case, you can go with create() and subscribeOn():
class NBAService {
public static Completable generateGame(RequestInfo info) {
return Completable.create(emitter -> {
// do something with the RequestInfo instance
emitter.setDisposable(
f(info).subscribe(emitter::onComplete, emitter::onError)
);
})
.subscribeOn(Schedulers.single());
}
}

Wrapping blocking I/O in project reactor

I have a spring-webflux API which, at a service layer, needs to read from an existing repository which uses JDBC.
Having done some reading on the subject, I would like to keep the execution of the blocking database call separate from the rest of my non-blocking async code.
I have defined a dedicated jdbcScheduler:
#Bean
public Scheduler jdbcScheduler() {
return Schedulers.fromExecutor(Executors.newFixedThreadPool(maxPoolSize));
}
And an AsyncWrapper utility to use it:
#Component
public class AsyncJdbcWrapper {
private final Scheduler jdbcScheduler;
#Autowired
public AsyncJdbcWrapper(Scheduler jdbcScheduler) {
this.jdbcScheduler = jdbcScheduler;
}
public <T> Mono<T> async(Callable<T> callable) {
return Mono.fromCallable(callable)
.subscribeOn(jdbcScheduler)
.publishOn(Schedulers.parallel());
}
}
Which is then used to wrap jdbc calls like so:
Mono<Integer> userIdMono = asyncWrapper.async(() -> userDao.getUserByUUID(request.getUserId()))
.map(userOption -> userOption.map(u -> u.getId())
.orElseThrow(() -> new IllegalArgumentException("Unable to find user with ID " + request.getUserId())));
I've got two questions:
1) Am I correctly pushing the execution of blocking calls to another set of threads? Being fairly new to this stuff I'm struggling with the intricacies of subscribeOn()/publishOn().
2) Say I want to make use of the resulting mono, e.g call an API with the result of the userIdMono, on which scheduler will that be executed? The one specifically created for the jdbc calls, or the main(?) thread that reactor usually operates within? e.g.
userIdMono.map(id -> someApiClient.call(id));

1) Use of subscribeOn is correctly putting the JDBC work on the jdbcScheduler
2) Neither, the results of the Callable - while computed on the jdbcScheduler, are publishOn the parallel Scheduler, so your map will be executed on a thread from the Schedulers.parallel() pool (rather than hogging the jdbcScheduler).

Concept of promises in Java

Is there a concept of using promises in java (just like ut is used in JavaScript) instead of using nested callbacks ?
If so, is there an example of how the callback is implemented in java and handlers are chained ?

Yep! Java 8 calls it CompletableFuture. It lets you implement stuff like this.
class MyCompletableFuture<T> extends CompletableFuture<T> {
static final Executor myExecutor = ...;
public MyCompletableFuture() { }
public <U> CompletableFuture<U> newIncompleteFuture() {
return new MyCompletableFuture<U>();
}
public Executor defaultExecutor() {
return myExecutor;
}
public void obtrudeValue(T value) {
throw new UnsupportedOperationException();
}
public void obtrudeException(Throwable ex) {
throw new UnsupportedOperationException();
}
}
The basic design is a semi-fluent API in which you can arrange:
(sequential or async)
(functions or actions)
triggered on completion of
i) ("then") ,or ii) ("andThen" and "orThen")
others. As in:
MyCompletableFuture<String> f = ...; g = ...
f.then((s -> aStringFunction(s)).thenAsync(s -> ...);
or
f.andThen(g, (s, t) -> combineStrings).or(CompletableFuture.async(()->...)....

UPDATE 7/20/17
I wanted to edit that there is also a Library called "ReactFX" which is supposed to be JavaFX as a reactive framework. There are many Reactive Java libraries from what I've seen, and since Play is based on the Reactive principal, I would assume that these Reactive libraries follow that same principal of non-blocking i/o, async calls from server to client and back while having communication be send by either end.
These libraries seem to be made for the client side of things, but there might be a Server reactive library as well, but I would assume that it would be wiser to use Play! with one of these client side reactive libraries.
You can take a look at https://www.playframework.com/
which implements this functionality here
https://www.playframework.com/documentation/2.2.0/api/java/play/libs/F.Promise.html
Additonal reading https://www.playframework.com/documentation/2.5.x/JavaAsync
Creating non-blocking actions
Because of the way Play works, action code must be as fast as possible, i.e., non-blocking. So what should we return from our action if we are not yet able to compute the result? We should return the promise of a result!
Java 8 provides a generic promise API called CompletionStage. A CompletionStage<Result> will eventually be redeemed with a value of type Result. By using a CompletionStage<Result> instead of a normal Result, we are able to return from our action quickly without blocking anything. Play will then serve the result as soon as the promise is redeemed.
The web client will be blocked while waiting for the response, but nothing will be blocked on the server, and server resources can be used to serve other clients.
How to create a CompletionStage
To create a CompletionStage<Result> we need another promise first: the promise that will give us the actual value we need to compute the result:
CompletionStage<Double> promiseOfPIValue = computePIAsynchronously();
CompletionStage<Result> promiseOfResult = promiseOfPIValue.thenApply(pi ->
ok("PI value computed: " + pi)
);
Play asynchronous API methods give you a CompletionStage. This is the case when you are calling an external web service using the play.libs.WS API, or if you are using Akka to schedule asynchronous tasks or to communicate with Actors using play.libs.Akka.
A simple way to execute a block of code asynchronously and to get a CompletionStage is to use the CompletableFuture.supplyAsync() helper:
CompletionStage<Integer> promiseOfInt = CompletableFuture.supplyAsync(() -> intensiveComputation());
Note: It’s important to understand which thread code runs on which promises. Here, the intensive computation will just be run on another thread.
You can’t magically turn synchronous IO into asynchronous by wrapping it in a CompletionStage. If you can’t change the application’s architecture to avoid blocking operations, at some point that operation will have to be executed, and that thread is going to block. So in addition to enclosing the operation in a CompletionStage, it’s necessary to configure it to run in a separate execution context that has been configured with enough threads to deal with the expected concurrency. See Understanding Play thread pools for more information.
It can also be helpful to use Actors for blocking operations. Actors provide a clean model for handling timeouts and failures, setting up blocking execution contexts, and managing any state that may be associated with the service. Also Actors provide patterns like ScatterGatherFirstCompletedRouter to address simultaneous cache and database requests and allow remote execution on a cluster of backend servers. But an Actor may be overkill depending on what you need.
Async results
We have been returning Result up until now. To send an asynchronous result our action needs to return a CompletionStage<Result>:
public CompletionStage<Result> index() {
return CompletableFuture.supplyAsync(() -> intensiveComputation())
.thenApply(i -> ok("Got result: " + i));
}
Actions are asynchronous by default
Play actions are asynchronous by default. For instance, in the controller code below, the returned Result is internally enclosed in a promise:
public Result index() {
return ok("Got request " + request() + "!");
}
Note: Whether the action code returns a Result or a CompletionStage<Result>, both kinds of returned object are handled internally in the same way. There is a single kind of Action, which is asynchronous, and not two kinds (a synchronous one and an asynchronous one). Returning a CompletionStage is a technique for writing non-blocking code.
Some info on CompletionStage
https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletionStage.html
which is a subclass of the class mentioned in #Debosmit Ray's answer called CompletableFuture
This Youtube Video by LinkedIn dev Mr. Brikman explains a bit about Promises in
https://youtu.be/8z3h4Uv9YbE?t=15m46s
and
https://www.youtube.com/watch?v=4b1XLka0UIw
I believe the first video gives an example of a promise, the second video might also give some good info, I don't really recall which video had what content.
Either way the information here is very good, and worth looking into.
I personally do not use Play, but I have been looking at it for a long, long time, as it does a lot of really good stuff.

If you want to do Promise even before Java7, "java-promise" may be useful. (Of course it works with Java8)
You can easily control asynchronous operations like JavaScript's Promise.
https://github.com/riversun/java-promise
example
import org.riversun.promise.Promise;
public class Example {
public static void main(String[] args) {
Promise.resolve("foo")
.then(new Promise((action, data) -> {
new Thread(() -> {
String newData = data + "bar";
action.resolve(newData);
}).start();
}))
.then(new Promise((action, data) -> {
System.out.println(data);
action.resolve();
}))
.start();
System.out.println("Promise in Java");
}
}
result:
Promise in Java
foobar

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Running Async Threading for Bulk API Calls in Spring Boot - java

Related

Caching a function with refresh rate

Thousands of rest calls with spring boot

How can I run a concurrent queue of tasks using Rx?

Wrapping blocking I/O in project reactor

Concept of promises in Java

Categories

Resources