Java parallel db calls - java

I have a web application that needs to be extremely fast, but its processing requires access to multiple data sources. I therefore decided that making the calls in parallel might be a useful optimization.
Basically, I want to make many different DB calls in parallel. Could you recommend a simple and reliable way, and the technologies, to achieve this? It would be useful if you could suggest a few frameworks and design patterns.
Right now I am using Spring.

You can use the Java 8 CompletableFuture. It lets you call an existing synchronous method asynchronously.
Say you have a list of requests in the form List<DBRequest> listRequest that you want to run in parallel. You can stream over the list and launch all requests asynchronously in the following way:
List<CompletableFuture<DBResult>> listFutureResult =
    listRequest.stream()
        .map(req -> CompletableFuture.supplyAsync(
            () -> dbLaunchRequest(req), executor))
        .collect(Collectors.toList());
List<DBResult> listResult =
    listFutureResult.stream()
        .map(CompletableFuture::join)
        .collect(Collectors.toList());
To be effective, you should supply your own custom Executor:
private final Executor executor =
    Executors.newFixedThreadPool(Math.min(listRequest.size(), 100),
        new ThreadFactory() {
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r);
                t.setDaemon(true); // daemon threads do not keep the JVM alive
                return t;
            }
        });
This way you have enough threads, but not too many. Marking the threads as daemons allows the program to exit even if one thread is blocked.
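For readers who want to try the two snippets above together, here is a minimal, self-contained sketch. It is only an illustration under assumptions: dbLaunchRequest is a hypothetical stand-in that sleeps instead of querying a database, requests and results are plain strings, and the pool size of 10 is arbitrary.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class ParallelDbCalls {

    // Shared pool of daemon threads; the size (10) is an arbitrary assumption.
    private static final ExecutorService EXECUTOR = Executors.newFixedThreadPool(10, r -> {
        Thread t = new Thread(r);
        t.setDaemon(true); // do not keep the JVM alive because of a blocked call
        return t;
    });

    // Hypothetical stand-in for a blocking database call.
    static String dbLaunchRequest(String request) {
        try {
            Thread.sleep(100); // simulate query latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result-of-" + request;
    }

    // Starts every request on the pool, then waits; results keep the request order.
    static List<String> runAll(List<String> requests) {
        List<CompletableFuture<String>> futures = requests.stream()
                .map(req -> CompletableFuture.supplyAsync(() -> dbLaunchRequest(req), EXECUTOR))
                .collect(Collectors.toList());
        return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        List<String> results = runAll(Arrays.asList("users", "orders", "stock"));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(results + " in about " + elapsedMs + " ms");
    }
}
```

A fixed size is used here because newFixedThreadPool rejects a size of 0, which Math.min(listRequest.size(), 100) would produce for an empty list. Collecting the futures into a list before joining matters: joining inside the same stream pipeline would run the requests one after another.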
You can find clear explanations of these techniques in chapter 11 of the book Java 8 in Action.
== UPDATE for Java 7 ==
If you are stuck with Java 7, you can use the following solution:
class DBResult {}
class DBRequest implements Callable<DBResult> {
    @Override
    public DBResult call() { return new DBResult(); }
}
class AsyncTest {
    public void test() {
        try {
            for (Future<DBResult> futureResult : ((ExecutorService) executor).invokeAll(listRequest)) {
                futureResult.get();
            }
        } catch (InterruptedException | ExecutionException ex) {
            Logger.getLogger(SoTest.class.getName()).log(Level.SEVERE, null, ex);
        }
    }
}
All requests are run asynchronously and you then wait for their completion, in the order of the list.
Finally, to answer the subsidiary question in the comment, you don't have to create a thread pool for each request.
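As a rough illustration of that last point, the sketch below keeps one application-wide pool and reuses it for every batch of calls, written in Java 7 style (anonymous classes, no lambdas). The class name, the query helper, and the pool size of 8 are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;

class SharedPoolInvokeAll {

    // One pool for the whole application, reused by every incoming request.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(8, new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        }
    });

    // Hypothetical stand-in for a DB request; sleeps instead of querying.
    static Callable<String> query(final String name) {
        return new Callable<String>() {
            @Override
            public String call() throws Exception {
                Thread.sleep(50);
                return name + ":done";
            }
        };
    }

    // Runs the batch on the shared pool and returns results in list order.
    static List<String> runBatch(List<Callable<String>> batch) {
        List<String> results = new ArrayList<String>();
        try {
            for (Future<String> future : POOL.invokeAll(batch)) {
                results.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(runBatch(Arrays.asList(query("users"), query("orders"))));
        System.out.println(runBatch(Arrays.asList(query("stock")))); // same pool, second batch
    }
}
```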

Related

How to call multiple Uni concurrently

Recently, I've been working on a project where I have to make 2 asynchronous calls at the same time. Since I'm working with Quarkus, I ended up trying to use Mutiny and the Vert.x library. However, I cannot get my code working with Unis. In the code below, I would expect both Unis to be called and the one that finishes fastest to be returned. However, it seems that when combining Unis, it simply returns the first one in the list, even though the first Uni should take longer.
The below code prints out one one when it should print out two two since the uniFast should finish first. How do I combine Unis and have the faster one return first?
@Test
public void testUniJion() {
    var uniSLow = Uni.createFrom().item(() -> {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "one";
    });
    var uniFast = Uni.createFrom().item(() -> {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return "two";
    });
    var resp = Uni.join().first(uniSLow, uniFast).withItem().await().indefinitely();
    System.out.println(resp);
    var resp2 = Uni.combine().any().of(uniSLow, uniFast).await().indefinitely();
    System.out.println(resp2);
}
Note: This is not the actual code I am trying to implement. In my code, I am trying to fetch from 2 different databases, and one database often has a lot more latency than the other. However, Uni seems to always wait for the slower database. I'm simply trying to understand Mutiny and Unis better, so I made this code example.
The problem is that you are not telling Mutiny which thread each Uni should run on. If I add a System.out to your example:
// Slow and Fast for the different Uni
System.out.println( "Slow - " + Thread.currentThread().getId() + ":" + Thread.currentThread().getName() );
I get the following output:
Slow - 1:Test worker
one
Slow - 1:Test worker
Fast - 1:Test worker
one
The output shows that everything runs on the same thread and therefore when we block the first one, the second one is blocked too.
That's why the output is one one.
One way to run the Unis in parallel is to use a different executor at subscription:
ExecutorService executorService = Executors.newFixedThreadPool( 5 );
uniSlow = uniSlow.runSubscriptionOn( executorService );
uniFast = uniFast.runSubscriptionOn( executorService );
Now, when I run the test, I have the expected output:
Slow - 16:pool-3-thread-1
Fast - 17:pool-3-thread-2
two
Slow - 18:pool-3-thread-3
Fast - 19:pool-3-thread-4
two
Note that this time Slow and Fast are running on different threads.
The Mutiny guide has a section about the difference between emitOn vs. runSubscriptionOn and some examples on how to change the emission thread.
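The underlying idea (the faster task can only win if the two tasks run on different threads) is not specific to Mutiny. The sketch below is a plain-JDK analogy using CompletableFuture.anyOf on a two-thread pool; it is not Mutiny code, and the class and method names are invented for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class FirstToFinish {

    // Blocks the calling thread for the given time, then returns the value.
    static String sleepThenReturn(long millis, String value) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return value;
    }

    // Starts both tasks on separate pool threads and returns whichever finishes first.
    static String race(long slowMillis, long fastMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<String> slow =
                    CompletableFuture.supplyAsync(() -> sleepThenReturn(slowMillis, "one"), pool);
            CompletableFuture<String> fast =
                    CompletableFuture.supplyAsync(() -> sleepThenReturn(fastMillis, "two"), pool);
            return (String) CompletableFuture.anyOf(slow, fast).join();
        } finally {
            pool.shutdownNow(); // interrupt the task that lost the race
        }
    }

    public static void main(String[] args) {
        System.out.println(race(1000, 100));
    }
}
```

With a single-thread pool instead of two threads, the slow task would occupy the only thread and finish first, which mirrors the single "Test worker" thread in the output above.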

Is the Apache Ignite callback function blocking or non-blocking?

I am using Apache Ignite, and there is a code block that looks like this:
query.setLocalListener(cacheEntryEvents -> {
    Set<String> taskIds = new HashSet<>();
    for (CacheEntryEvent<? extends String, ? extends String> e : cacheEntryEvents) {
        if (e.getEventType().equals(EventType.CREATED)) {
            taskIds.add(e.getKey());
        }
    }
    if (!taskIds.isEmpty()) {
        Set<String> processedTaskId = this.process(taskIds);
        taskProcessed.addAndGet(processedTaskId.size());
        removeCacheEntryExecutor.removeCache(processedTaskId, inactiveTaskCache);
        log.info("Finish processing {} distributed inactive task", processedTaskId.size());
        log.info("---------------------------Current total processed task: " + taskProcessed);
    }
});
My question is: is the callback function in this local listener executed as a non-blocking function in a separate thread, or does it block the listener? If the latter, what happens when this.process(taskIds) takes too long to execute? If the former, how is the separate thread managed?
No, a Continuous Query should not block the caller; the callback function is executed on a dedicated thread. Check a similar question.
To answer the second question, you need to share your this.process(taskIds) method first.

Thread safety for method that returns Mono based on mutable attribute in Java

In my Spring Boot application I have a component that is supposed to monitor the health status of another, external system. This component also offers a public method that reactive chains can subscribe to in order to wait for the external system to be up.
@Component
public class ExternalHealthChecker {
    private static final Logger LOG = LoggerFactory.getLogger(ExternalHealthChecker.class);
    private final WebClient externalSystemWebClient = WebClient.builder().build(); // config omitted
    private volatile boolean isUp = true;
    private volatile CompletableFuture<String> completeWhenUp = new CompletableFuture<>();

    @Scheduled(cron = "0/10 * * ? * *")
    private void checkExternalSystemHealth() {
        externalSystemWebClient.get() //
            .uri("/health") //
            .retrieve() //
            .bodyToMono(Void.class) //
            .doOnError(this::handleHealthCheckError) //
            .doOnSuccess(nothing -> this.handleHealthCheckSuccess()) //
            .subscribe(); //
    }

    private void handleHealthCheckError(final Throwable error) {
        if (this.isUp) {
            LOG.error("External System is now DOWN. Health check failed: {}.", error.getMessage());
        }
        this.isUp = false;
    }

    private void handleHealthCheckSuccess() {
        // the status changed from down -> up, which has to complete the future that might be currently waited on
        if (!this.isUp) {
            LOG.warn("External System is now UP again.");
            this.isUp = true;
            this.completeWhenUp.complete("UP");
            this.completeWhenUp = new CompletableFuture<>();
        }
    }

    public Mono<String> waitForExternalSystemUPStatus() {
        if (this.isUp) {
            LOG.info("External System is already UP!");
            return Mono.empty();
        } else {
            LOG.warn("External System is DOWN. Requesting process can now wait for UP status!");
            return Mono.fromFuture(completeWhenUp);
        }
    }
}
The method waitForExternalSystemUPStatus is public and may be called from many, different threads. The idea behind this is to provide some of the reactive flux chains in the application a method of pausing their processing until the external system is up. These chains cannot process their elements when the external system is down.
someFlux
    .doOnNext(record -> LOG.info("Next element"))
    .delayUntil(record -> externalHealthChecker.waitForExternalSystemUPStatus())
    ... // starting processing
The issue here is that I can't really wrap my head around which part of this code needs to be synchronised. I think there should not be an issue with multiple threads calling waitForExternalSystemUPStatus at the same time, as this method is not writing anything, so I feel it does not need to be synchronised. However, the method annotated with @Scheduled will also run on its own thread, and it will in fact write the value of isUp and potentially change the reference of completeWhenUp to a new, uncompleted future instance. I have marked these two mutable attributes as volatile because, from reading about this keyword in Java, it seems it would help guarantee that the threads reading these two values see the latest value. However, I am unsure whether I also need to add synchronized to parts of the code. I am also unsure whether the synchronized keyword plays well with Reactor code; I have a hard time finding information on this. Maybe there is also a way of providing the functionality of the ExternalHealthChecker in a more complete, reactive way, but I cannot think of any.
I'd strongly advise against this approach. The problem with threaded code like this is it becomes immensely difficult to follow & reason about. I think you'd at least need to synchronise the parts of handleHealthCheckSuccess() and waitForExternalSystemUPStatus() that reference your completeWhenUp field otherwise you could have a race hazard on your hands (only one writes to it, but it might be read out-of-order after that write) - but there could well be something else I'm missing, and if so it may show as one of these annoying "one in a million" type bugs that's almost impossible to pin down.
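Purely to illustrate what "synchronise the parts that reference completeWhenUp" could mean, here is a stripped-down sketch in plain Java with no Spring or Reactor. The UpGate class and its method names are hypothetical, and it returns a completed future when the system is up rather than an empty Mono; it is a sketch of the locking idea, not a drop-in replacement for the component above.

```java
import java.util.concurrent.CompletableFuture;

class UpGate {
    private final Object lock = new Object();
    private boolean up = true;                        // guarded by lock
    private CompletableFuture<String> completeWhenUp  // guarded by lock
            = new CompletableFuture<>();

    // Called by the health check when it fails.
    void markDown() {
        synchronized (lock) {
            up = false;
        }
    }

    // Called by the health check when it succeeds.
    void markUp() {
        CompletableFuture<String> toComplete;
        synchronized (lock) {
            if (up) {
                return; // no down -> up transition, nothing to release
            }
            up = true;
            toComplete = completeWhenUp;
            completeWhenUp = new CompletableFuture<>(); // fresh future for the next outage
        }
        // Complete outside the lock so waiters' callbacks never run while it is held.
        toComplete.complete("UP");
    }

    // Reads the flag and the future under the same lock, so a caller can never
    // see "down" together with the fresh future created by a concurrent markUp().
    CompletableFuture<String> whenUp() {
        synchronized (lock) {
            return up ? CompletableFuture.completedFuture("UP") : completeWhenUp;
        }
    }

    public static void main(String[] args) {
        UpGate gate = new UpGate();
        gate.markDown();
        CompletableFuture<String> waiting = gate.whenUp();
        System.out.println("done before markUp? " + waiting.isDone());
        gate.markUp();
        System.out.println("done after markUp? " + waiting.isDone());
    }
}
```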
There should be a much more reliable & simple way of achieving this though. Instead of using the Spring scheduler, I'd create a flux when your ExternalHealthChecker component is created as follows:
healthCheckStream = Flux.interval(Duration.ofMinutes(10))
    .flatMap(i ->
        webClient.get().uri("/health")
            .retrieve()
            .bodyToMono(String.class)
            .map(s -> true)
            .onErrorResume(e -> Mono.just(false)))
    .cache(1);
...where healthCheckStream is a field of type Flux<Boolean>. (Note it doesn't need to be volatile, as you'll never replace it so cross-thread worries don't apply - it's the same stream that will be updated with different results every 10 minutes based on the healthcheck status, whatever thread you'll access it from.)
This essentially creates a stream of healthcheck response values every 10 minutes, always caches the latest response, and turns it into a hot source. This means that the "nothing happens until you subscribe" doesn't apply in this case - the flux will start executing immediately, and any new subscribers that come in on any thread will always get the latest result, be that a pass or a fail. handleHealthCheckSuccess() and handleHealthCheckError(), isUp, and completeWhenUp are then all redundant, they can go - and then your waitForExternalSystemUPStatus() can just become a single line:
return healthCheckStream.filter(x -> x).next();
...then job done, you can call that from anywhere and you'll have a Mono that will only complete when the system is up.

Using Executors in very high-load environment

I managed to write a REST API using Stripe Framework. Inside my API, I have several tasks which need to be executed and have their results combined. I came up with an approach, borrowed from JavaScript, that spawns the tasks onto several threads and joins them, rather than running them sequentially. I used ExecutorService for this, but I found a bottleneck in the implementation: when the number of requests is large, tasks take longer to finish than I expect.
My question is about alternative ways to achieve the same purpose:
How can I create an Executor per request?
How can I expand the Executor's size?
To demonstrate, consider this approach in JavaScript:
import Promise from 'bluebird';
let tasks = [];
tasks.push(task01);
tasks.push(task02);
Promise.all(tasks).then(results => { do_sth_here!} )
Bringing this idea to Java, I have implemented it as below:
ExecutorService exec = Executors.newCachedThreadPool();
List<Callable<Promise>> tasks = new ArrayList<>();
List<Future<Promise>> PromiseAll;
try {
    tasks.add(() -> TaskPromises(Input));
    tasks.add(() -> TaskPromise(Input));
    PromiseAll = exec.invokeAll(tasks);
    for (Future<Promise> fr : PromiseAll) {
        // do_some_thing_next
    }
} catch (InterruptedException e) {
    // invokeAll is interruptible; restore the flag so callers can react
    Thread.currentThread().interrupt();
}
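For comparison, a closer Java counterpart to Promise.all(tasks).then(...) might be built on CompletableFuture.allOf, which does not block a thread while waiting. This is only a sketch under assumptions: the all helper and the class name are invented here, and the tasks are trivial suppliers running on the common pool rather than the poster's TaskPromise methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class PromiseAllSketch {

    // Completes with all results, in input order, once every future has completed.
    static <T> CompletableFuture<List<T>> all(List<CompletableFuture<T>> futures) {
        return CompletableFuture
                .allOf(futures.toArray(new CompletableFuture<?>[0]))
                .thenApply(ignored -> futures.stream()
                        .map(CompletableFuture::join) // already complete, so join does not block
                        .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        CompletableFuture<String> task01 = CompletableFuture.supplyAsync(() -> "result01");
        CompletableFuture<String> task02 = CompletableFuture.supplyAsync(() -> "result02");

        all(Arrays.asList(task01, task02))
                .thenAccept(results -> System.out.println("All done: " + results))
                .join(); // only so this demo does not exit before the callback runs
    }
}
```

If any of the input futures completes exceptionally, the future returned by all does too, so error handling can be attached in one place with exceptionally or handle.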

Pause execution of a method until callback is finished

I am fairly new to Java and extremely new to concurrency. However, I have worked with C# for a while. It doesn't really matter, but for the sake of example, I am trying to pull data off a table on a server. I want the method to wait until the data is completely pulled. In C#, we have the async-await pattern, which can be used like this:
private async Task<List<ToDoItem>> PullItems ()
{
var newRemoteItems = await (from p in remoteTable select p).ToListAsync();
return newRemoteItems;
}
I am trying to have similar effect in Java. Here is the exact code I'm trying to port (Look inside SynchronizeAsync method.)! However, Java Azure SDK works with callbacks. So, I have a few options:
Use wait and notify pattern. Following code doesn't work since I don't understand what I'm doing.
final List<TEntity> newRemoteItems = new ArrayList<TEntity>();
synchronized( this ) {
    remoteTable.where().field("lastSynchronized").gt(currentTimeStamp)
        .execute(new TableQueryCallback<TEntity>() {
            public void onCompleted(List<TEntity> result,
                                    int count,
                                    Exception exception,
                                    ServiceFilterResponse response) {
                if (exception == null) {
                    newRemoteItems.clear();
                    for (TEntity item: result) {
                        newRemoteItems.add(item);
                    }
                }
            }
        });
}
this.wait();
//DO SOME OTHER STUFF
My other option is to move DO SOME OTHER STUFF right inside the callback's if(exception == null) block. However, this would result in my whole method logic chopped off into the pieces, disturbing the continuous flow. I don't really like this approach.
Now, here are questions:
What is recommended way of doing this? I am completing the tutorial on Java concurrency at Oracle. Still, clueless. Almost everywhere I read, it is recommended to use higher level stuff rather than wait and notify.
What is wrong with my wait and notify?
My implementation blocks the main thread and it's considered a bad practice. But what else can I do? I must wait for the server to respond! Also, doesn't C# await block the main thread? How is that not a bad thing?
Either put DO SOME OTHER STUFF into the callback, or declare a semaphore, call semaphore.release in the callback, and call semaphore.acquire where you want to wait. Remove synchronized(this) and this.wait.
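The semaphore variant could look roughly like the sketch below. The QueryCallback interface and executeAsync method are hypothetical stand-ins for the SDK's callback API (they are not the Azure SDK's real types), and the "query" just sleeps on another thread before invoking the callback.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Semaphore;

class WaitForCallback {

    // Hypothetical callback type standing in for the SDK's callback interface.
    interface QueryCallback {
        void onCompleted(List<String> result, Exception exception);
    }

    // Hypothetical async API: does its work on another thread, then calls back.
    static void executeAsync(QueryCallback callback) {
        new Thread(() -> {
            try {
                Thread.sleep(100); // simulate the round trip to the server
                callback.onCompleted(Arrays.asList("item1", "item2"), null);
            } catch (InterruptedException e) {
                callback.onCompleted(null, e);
            }
        }).start();
    }

    // Blocks the calling thread until the callback has run, then returns the items.
    static List<String> pullItems() {
        final List<String> newRemoteItems = new ArrayList<>();
        final Semaphore done = new Semaphore(0); // no permits until the callback releases one

        executeAsync((result, exception) -> {
            if (exception == null) {
                newRemoteItems.addAll(result);
            }
            done.release(); // always release, even on failure, so the waiter wakes up
        });

        done.acquireUninterruptibly(); // wait here for the callback
        // DO SOME OTHER STUFF would go here
        return newRemoteItems;
    }

    public static void main(String[] args) {
        System.out.println(pullItems());
    }
}
```

Because the wait blocks the calling thread, this only works if the callback is delivered on a different thread; if the SDK delivers callbacks on the same thread that is waiting, the wait would never end.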
