Best way to run multiple HystrixCommand in parallel - java

I have a List<HystrixCommand<?>> commands; what is the best way to execute these commands and collect the results such that the commands run in parallel?
I have tried something like this:
List<Future<?>> futures = commands.stream()
.map(HystrixCommand::queue)
.collect(Collectors.toList());
List<?> results = futures.stream()
.map(Future::get)
.collect(Collectors.toList());
Does this run the commands in parallel?
That is, when calling HystrixCommand.queue() followed by Future.get() on the same thread, does the .get() call block on one command and delay the other commands?
I ask because I couldn't find any documentation for this.
I have also looked at HystrixCollapser, but this still requires creating and running the individual commands (like above) in the createCommand method.

Ok I have investigated this and figured it out... by creating some simple examples rather than debugging production code...
My initial code was correct:
List<Request> requests = ...; // some expensive requests
List<HystrixCommand<?>> commands = getCommands(requests);
List<Future<?>> futures = commands.stream()
.map(HystrixCommand::queue)
.collect(Collectors.toList());
List<?> results = futures.stream()
.map(Future::get)
.collect(Collectors.toList());
The commands do indeed run in parallel.
The .get() method does block, but since all the commands have been queued (prior to any .get() call) they are all running (or queued to run).
Say the second command is faster to completion than the first. The first .get() will block, but when it eventually returns, the second .get() call will return immediately, as the second command was able to complete while the first command was blocking. (Assuming core size >=2.)
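For a self-contained illustration of this queue-then-get pattern, here is a minimal sketch; EchoCommand and its 100 ms sleep are made-up stand-ins for the real commands, and since Future.get() throws checked exceptions it is wrapped in the lambda:
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.stream.Collectors;

public class ParallelHystrixExample {

    // Made-up command for illustration: echoes its input after some "expensive" work.
    static class EchoCommand extends HystrixCommand<String> {
        private final String input;

        EchoCommand(String input) {
            super(HystrixCommandGroupKey.Factory.asKey("Echo"));
            this.input = input;
        }

        @Override
        protected String run() throws Exception {
            Thread.sleep(100); // stand-in for the expensive request
            return input;
        }
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("a", "b", "c");

        // Queue everything first so all commands are running before any get() blocks.
        List<Future<String>> futures = inputs.stream()
                .map(EchoCommand::new)
                .map(HystrixCommand::queue)
                .collect(Collectors.toList());

        // Future.get() throws checked exceptions, so wrap it when using a stream.
        List<String> results = futures.stream()
                .map(f -> {
                    try {
                        return f.get();
                    } catch (InterruptedException | ExecutionException e) {
                        throw new RuntimeException(e);
                    }
                })
                .collect(Collectors.toList());

        System.out.println(results);
    }
}
With the default command thread pool (core size 10), the three commands run concurrently and the total wall time is roughly that of a single command.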
In terms of HystrixCollapser, I misunderstood the API. HystrixCollapser is used to collapse many HystrixCollapser requests into a single batched HystrixCommand, not the other way around. So I had to modify my code to wrap my requests with HystrixCollapser rather than HystrixCommand:
List<Request> requests = ...; // some expensive requests
List<HystrixCollapser<?>> commands = getCommands(requests);
HystrixRequestContext context = HystrixRequestContext.initializeContext();
try {
List<Future<?>> futures = commands.stream()
.map(HystrixCollapser::queue)
.collect(Collectors.toList());
List<?> results = futures.stream()
.map(Future::get)
.collect(Collectors.toList());
} finally {
context.shutdown();
}
JMH benchmarks and full example source here

Related

Concurrent execution of UPDATE statements in java

I have a list of objects for which I submit an update DB statement as below.
Current code:
List<MyObj> updateObj = ....<logic here>;
updateObj.stream().forEach(obj -> updateMyDB(obj));
Here each obj is processed sequentially, which takes too much time. Now I wish to parallelize it. What is the best way to do this?
Option 1:
updateObj.parallelStream().forEach(obj -> updateMyDB(obj));
Option 2:
ExecutorService executor = Executors.newFixedThreadPool(10);
updateObj.stream().forEach(obj -> CompletableFuture.runAsync(() -> updateMyDB(obj), executor));
The intention here is to parallelize the operations. Any ideas on how to achieve this?
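For illustration, a minimal sketch along the lines of Option 2, with a bounded pool and an explicit join so the caller actually waits for every update; MyObj and updateMyDB are the placeholders from the question:
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class BatchUpdater {

    void updateAll(List<MyObj> updateObj) {
        // Bounded pool: the database sees at most 10 concurrent updates.
        ExecutorService executor = Executors.newFixedThreadPool(10);
        try {
            List<CompletableFuture<Void>> futures = updateObj.stream()
                    .map(obj -> CompletableFuture.runAsync(() -> updateMyDB(obj), executor))
                    .collect(Collectors.toList());

            // Block until every update has finished; join() rethrows any failure unchecked.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            executor.shutdown();
        }
    }

    void updateMyDB(MyObj obj) { /* the existing DB call from the question */ }
}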

Java mono repeat call until collected results complete

I'm picking up Java/Reactor after moving over from C#. I'm well versed in the C# async-await approach to non-blocking calls and am struggling to adapt to Flux/Mono.
I'm implementing a solution where I need to make a call to ElasticSearch via the Java SDK, get results, apply additional filters to strip out ES results, and keep paging through ES until my final collection of results is complete.
The ES SDK doesn't support Reactor but there are examples of Java adapter code that takes the ES callback and converts to a mono (I see a direct correlation to the C# async-await here as this is a non-blocking call to ES). What I then struggle with is the next bit - I need to take the results from the ES mono, filter them.
I do this by calling out to other external services to get additional data based on the results from the ES call, so I need to know the ids of each page of content from the ES mono result before I can apply the filtering (effectively a kind of block), then apply the in-memory filters, and if I don't have enough content, go back to ES to get the next page... repeat until I have enough data or there are no more results from ES.
This appears to be very difficult to achieve compared to C# but I probably just don't understand the Java paradigm correctly.
My problem is that I can't use "block()", as this throws an error in Reactor 3.2, so I don't really know how to "wait" until the mono calls to ES and the external services are complete before continuing. In C#, this would be as simple as calling an async method with an await to handle the implicit callbacks.
My blocking version (works in IntelliJ, fails when published via Maven and then run in a webserver) is effectively:
do {
    var sr = GetSearchRequest(xxxx);
    this.elasticsearch.results(sr)
        .map(r -> chunk.add(r))
        .block();
    if (chunk.size() == 0) {
        isComplete = true;
    }
    else {
        var filtered = postFilterResults(chunk);
        finalResults.add(filtered);
        if (finalResults.size() >= MAXIMUM_RESULTS) {
            isComplete = true;
        }
        esPage = esPage + 1;
    }
} while (isComplete == false);
If I try to subscribe() or other non-blocking Reactor calls, then (obviously) the code skips over the "get ES" bit and hits the do-while, looping repeatedly until the callback from ES finally happens and the subscribed map is invoked.
I think I need to perform an "async block" for each ES call but I don't know how.
To answer my own question... The underlying issue, IMO, is that Flux/Mono simply is not like any existing programming style, in that it absolutely forces you to work within the fluent style that Reactor mandates. This is very similar to C# LINQ, but it's almost a "false friend", as even things like loops need to be expressed with Reactor operators.
In this case, the key issue to solve is paging, and keeping it going within a loop. It is very unclear how to achieve this, as a subscription to a flux "locks in" the original parameters, so repeating the subscription call simply gets the same page again. The solution is to use the Flux.defer method, which forces lazy building of the subscription on each repeated invocation. You then need AtomicIntegers to keep track of the page counter across the different calls. Again, this is something that C# handles for you, so it can catch a .NET developer out.
Something like:
//The response from the elasticsearch adapter is a Flux<T> but we do not want to filter
//results on a row by row basis as this incurs one call for each row to the DB/Network
//(as appropriate). We choose to batch these up
var result = new SearchResult();
var page = new AtomicInteger();
var chunkSize = new AtomicInteger();
//Use a defer so we recalculate the subscription to the search with the new page count
var results = Flux.defer(() -> elasticsearch.results(GetSearchRequest(request, lc, pf, page.get()))
.doOnComplete(() -> {
chunkSize.set(0);
page.getAndAdd(1);
})
.collectList()
.map(chunk -> {
chunkSize.set(chunk.size());
return chunk;
})
.map(chunk -> postFilterResults(request, chunk, pf))
.map(filtered -> result.getDocuments().addAll(filtered)));
//Repeat the deferred flux (recalculating each time) until we have enough content or we don't get anything from the search engine
return results
.repeat()
.takeUntil(r -> chunkSize.get() == 0 || result.getDocuments().size() >= this.elasticsearch.getMaximumSearchResults())
.take(this.elasticsearch.getMaximumSearchResults())
.collectList()
.flatMap(r -> {
result.setTotalHits(result.getDocuments().size());
return Mono.just(result);
});

Using Executors in very high-load environment

I managed to write a REST API using the Stripe Framework. Inside my API, I have several tasks which need to execute, and I combine their results. I came up with an approach, borrowed from JavaScript, which spawns the tasks on several threads and joins them, rather than running them one after another. So I used ExecutorService, but I found a bottleneck in this implementation: when the number of requests is quite big, tasks take longer to finish than I expect.
My question is about alternative ways to achieve the same purpose:
How can I create an Executor per request?
How can I expand an Executor's pool size?
To demonstrate, consider this approach in JavaScript:
import Promise from 'bluebird';
let tasks = [];
tasks.push(task01);
tasks.push(task02);
Promise.all(tasks).then(results => { /* do_sth_here */ });
Bringing this idea to Java, I have implemented it like below:
ExecutorService exec = Executors.newCachedThreadPool();
List<Callable<Promise>> tasks = new ArrayList<>();
List<Future<Promise>> PromiseAll;
try {
    tasks.add(() -> TaskPromises(Input));
    tasks.add(() -> TaskPromise(Input));
    PromiseAll = exec.invokeAll(tasks);
    for (Future<Promise> fr : PromiseAll) {
        // do_some_thing_next
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt();
}
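For comparison, a sketch of the same Promise.all-style fan-out with one shared, explicitly sized pool and CompletableFuture; Promise, Input, TaskPromises and TaskPromise are the placeholders from the question, and the pool size of 50 is only an assumed starting point, not a recommendation:
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PromiseAllExample {
    // One pool shared across requests rather than one per request; size it for the workload
    // (50 here is only an assumed starting point for an IO-heavy service).
    private static final ExecutorService EXEC = Executors.newFixedThreadPool(50);

    List<Promise> promiseAll(Input input) {
        CompletableFuture<Promise> task1 =
                CompletableFuture.supplyAsync(() -> TaskPromises(input), EXEC);
        CompletableFuture<Promise> task2 =
                CompletableFuture.supplyAsync(() -> TaskPromise(input), EXEC);

        // Equivalent of JavaScript's Promise.all: wait for both, then combine the results.
        return CompletableFuture.allOf(task1, task2)
                .thenApply(v -> Arrays.asList(task1.join(), task2.join()))
                .join();
    }

    Promise TaskPromises(Input input) { /* first task from the question */ return null; }
    Promise TaskPromise(Input input)  { /* second task from the question */ return null; }
}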

Using the faster output from 2 threads

I want to work with two threads in my Java program for a tiny part. I need to make the first call to a database and the second call to an API, both calls with the same input, and then work with the output of whichever thread finishes first.
It's my first time programming with threads and I'm very confused. I've seen tutorials and they mainly explain how to get two separate things done with threads so I'm a little lost.
Can someone please help or re-direct me to any useful link they may have?
So far, as I understand it, should it look something like this?
Thread thread1 = new Thread(func1());
Thread thread2 = new Thread(func2());
thread1.start();
thread2.start();
But then how do I extract the output of the functions? How would I know which one has finished first?
-----------UPDATE 1---------
After trying CompletableFuture (thanks for the help Johan!) I have something like this:
CompletableFuture<Object> getData = CompletableFuture.anyOf(
CompletableFuture.runAsync(() -> getDataFromDB(clientData)),
CompletableFuture.runAsync(() -> getDataFromApi(clientData))
);
getData.thenApply(dataObject -> {
// Cast the returned Object to the actual type of your data,
// assuming both getDataFromDb and getDataFromApi
// return the same result type
Object data = (String) dataObject;
// Work with the returned data.
result = (String) data;
});
But I get this error for getData.thenApply():
The method thenApply(Function) in the type CompletableFuture is not applicable for the arguments (( dataObject) -> {})
Since I know that getData is of type String, would it be okay to just convert it to String and store the result?
As @Johan Hirsch suggests, try CompletableFuture. I've just tried this and it works:
CompletableFuture.anyOf(
CompletableFuture.supplyAsync(() -> getDataFromDB(clientData)),
CompletableFuture.supplyAsync(() -> getDataFromApi(clientData)))
.thenApply(item -> (String) item)
.thenAccept(result -> {
// Consume the data
System.out.println(result);
});
Beware that I'm currently consuming the data, so it doesn't return anything. If you just want to pass the result to another CompletableFuture, change the thenAccept method to a thenApply.
Java 8 provides a very nice utility class called CompletableFuture, which can help in your case.
Create two CompletableFuture, one for each of your tasks, and then use the CompletableFuture.anyOf method to wait for either one to finish.
CompletableFuture<Object> getData = CompletableFuture.anyOf(
    CompletableFuture.supplyAsync(() -> getDataFromDb()),
    CompletableFuture.supplyAsync(() -> getDataFromApi())
);
getData.thenAccept(dataObject -> {
// Cast the returned Object to the actual type of your data,
// assuming both getDataFromDb and getDataFromApi
// return the same result type
TData data = (TData)dataObject;
// Work with the returned data.
processData(data);
});
You can use ExecutorService.invokeAny
Executes the given tasks, returning the result of one that has completed successfully (i.e., without throwing an exception), if any do. Upon normal or exceptional return, tasks that have not completed are cancelled. The results of this method are undefined if the given collection is modified while this operation is in progress.
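A minimal sketch of that approach, assuming getDataFromDB and getDataFromApi from the question both return a String (ClientData is a made-up type for the shared input):
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

String fetchFastest(ClientData clientData) throws InterruptedException, ExecutionException {
    ExecutorService executor = Executors.newFixedThreadPool(2);
    try {
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> getDataFromDB(clientData));
        tasks.add(() -> getDataFromApi(clientData));

        // Returns the result of whichever task completes successfully first;
        // the task that has not completed yet is cancelled.
        return executor.invokeAny(tasks);
    } finally {
        executor.shutdown();
    }
}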

Vertx Future does not wait

Since I'm using Vert.x 3.1 in my stack, I was thinking of using the Future feature that the toolkit brings, but after reading the API it seems pretty limited to me. I cannot even find a way to make the future wait for an Observable.
Here is my code:
public Observable<CommitToOrderCommand> validateProductRestrictions(CommitToOrderCommand cmd) {
Future<Observable<CommitToOrderCommand>> future = Future.future();
orderRepository.getOrder(cmd, cmd.orderId)
.flatMap(order -> validateOrderProducts(cmd, order))
.subscribe(map -> checkMapValues(map, future, cmd));
Observable<CommitToOrderCommand> result = future.result();
if(errorFound){
throw MAX_QUANTITY_PRODUCT_EXCEED.create("Fail"/*restrictions.getBulkBuyLimit().getDescription())*/);
}
return result;
}
private void checkMapValues(Multimap<String, BigDecimal> totalUnitByRestrictions, Future<Observable<CommitToOrderCommand>> future,
CommitToOrderCommand cmd) {
for (String restrictionName : totalUnitByRestrictions.keySet()) {
Restrictions restrictions = Restrictions.valueOf(restrictionName);
if (totalUnitByRestrictions.get(restrictionName)
.stream()
.reduce(BigDecimal.ZERO, BigDecimal::add)
.compareTo(restrictions.getBulkBuyLimit()
.getMaxQuantity()) == 1) {
errorFound = true;
}
}
future.complete(Observable.just(cmd));
}
In the onComplete of my first Observable I'm checking the results, and once that finishes I complete the future to unblock the operation.
But I'm seeing that future.result() does not block until future.complete() is invoked, as I was expecting. Instead it just returns null.
Any idea what's wrong here?
Regards.
The Vert.x future doesn't block but rather works with a handler that is invoked when a result has been injected (see setHandler and isComplete).
If the outer layer of code requires an Observable, you don't need to wrap it in a Future, just return Observable<T>. Future<Observable<T>> doesn't make much sense, you're mixing two ways of doing async results.
Note that there are ways to collapse an Observable into a Future, but the difficulty is that an Observable may emit several items whereas a Future can hold only a single item. You already took care of that by collecting your results into a single map emission.
Since this Observable only ever emits one item, if you want a Future out of it you should subscribe to it and call future.complete(yourMap) in the onNext method. Also define an onError handler that will call future.fail.
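For illustration, a sketch of that shape inside validateProductRestrictions, assuming the collected Multimap is the value worth completing the future with and that checkMapValues no longer needs the Future parameter:
// io.vertx.core.Future from Vert.x 3.x, RxJava 1 Observable as in the question.
Future<Multimap<String, BigDecimal>> future = Future.future();

orderRepository.getOrder(cmd, cmd.orderId)
    .flatMap(order -> validateOrderProducts(cmd, order))
    .subscribe(
        map -> future.complete(map),   // onNext: the single collected map
        err -> future.fail(err));      // onError: propagate the failure

// React to the result via a handler instead of blocking on future.result().
future.setHandler(ar -> {
    if (ar.succeeded()) {
        checkMapValues(ar.result(), cmd); // assumes a variant that no longer takes the Future
    } else {
        // handle ar.cause()
    }
});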
