I'm trying to optimize my Futures management techniques.
Suppose we have this typical processing scenario: I run a query to fetch some records from a database:
SELECT * FROM mytable WHERE mycondition;
This query returns a lot of rows that I need to process with something like:
while (recordset has more results) {
MyRow row = recordset.getNextRow(); // Get the next row
processRow(row); // Process the row
}
Now suppose that all the rows are independent of each other, and that processRow is slow because it performs some heavy processing and queries against a C* (Cassandra) cluster:
void processRow(MyRow row) {
// Fetch some useful data from the DB
int metadataid = row.getMetadataID();
Metadata metadata = getMetadataFromCassandra(metadataid);
// .... perform more processing on the row .....
// Store the processing result in the DB
ProcessingResult result = ....;
insertProcessingResultIntoCassandra(result);
}
A serial approach like this is expected to perform poorly, so parallel execution is worth pursuing.
With this basic processing structure in mind, here are the transformations I performed on the algorithm to get a significant speedup.
STEP 1: parallelize row processing
This is pretty straightforward. I created an Executor that gets the jobs done in parallel. Then I wait for all the jobs to finish. The code looks like:
ThreadPoolExecutor executor = (ThreadPoolExecutor)Executors.newCachedThreadPool();
int failedJobs = 0;
ArrayList<Future<Boolean>> futures = new ArrayList<>();
while (recordset has more results) {
final MyRow row = recordset.getNextRow(); // Get the next row
// Create the async job and send it to the executor
Callable<Boolean> c = new Callable<Boolean>() {
@Override
public Boolean call() {
try {
processRow(row);
} catch (Exception e) {
return false; // Job failed
}
return true; // Job is OK
}
};
futures.add(executor.submit(c));
}
// All jobs submitted. Wait for the completion.
while (futures.size() > 0) {
Future<Boolean> future = futures.remove(0);
Boolean result = false;
try {
result = future.get();
} catch (Exception e) {
e.printStackTrace();
}
failedJobs += (result ? 0 : 1);
}
STEP 2: limit the number of concurrent rows
So far so good; but unless the number of jobs is low, this is expected to fail with an out-of-memory error: a cached thread pool spawns a new thread whenever no idle one is available, so the main loop would keep creating threads without bound as it submits jobs. I can solve this problem by capping the number of concurrently submitted jobs:
final int MAX_JOBS = 1000;
while (recordset has more results) {
....
futures.add(executor.submit(c));
while (futures.size() >= MAX_JOBS) {
Future<Boolean> future = futures.remove(0);
Boolean result = false;
try {
result = future.get();
} catch (Exception e) {
e.printStackTrace();
}
failedJobs += (result ? 0 : 1);
}
}
Simply put, once a threshold (1000 here) is reached, I wait for the first job in the list to complete before submitting more. This works effectively and gives a good speedup.
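For comparison, another way to cap in-flight work is a fixed pool guarded by a semaphore: submission blocks once MAX_JOBS permits are taken. A sketch (the pool size is a guess, and InterruptedException handling is elided):
ExecutorService executor = Executors.newFixedThreadPool(16); // pool size is a guess
final Semaphore inFlight = new Semaphore(MAX_JOBS);
final AtomicInteger failedJobs = new AtomicInteger();

while (recordset has more results) {                // pseudocode, as above
    final MyRow row = recordset.getNextRow();
    inFlight.acquire();                             // blocks while MAX_JOBS jobs are in flight
    executor.submit(new Runnable() {
        @Override
        public void run() {
            try {
                processRow(row);
            } catch (Exception e) {
                failedJobs.incrementAndGet();
            } finally {
                inFlight.release();                 // free a slot whatever happened
            }
        }
    });
}
inFlight.acquire(MAX_JOBS);                         // wait for all outstanding jobs to finish
executor.shutdown();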
STEP 3: parallelize the single row processing
This is the step where I'd like a bit of help. I expect 1000 jobs to accumulate quickly in the queue because of the slow IO; that is, I expect the JVM to fire 1000 threads to accommodate all the jobs. Now, 1000 threads on a machine with only 8 cores usually slow everything down, and I think that with more carefully tuned parallelism this number could be lowered.
Currently, the getMetadataFromCassandra function is a wrapper around session.executeAsync, but manages retries:
public static ResultSet getMetadataFromCassandra(...) {
int retries = 0;
// Loop here
while (retries < MAX_RETRIES) {
// Execute the query
ResultSetFuture future = session.executeAsync(statement);
try {
// Try to get the result
return future.get(1000 * (int)Math.pow(2, retries), TimeUnit.MILLISECONDS);
} catch (Exception e) {
// Ooops. An error occurred. Cancel the future and schedule it again
future.cancel(true);
if (retries == MAX_RETRIES - 1) { // last attempt: log before giving up
e.printStackTrace();
String stackTrace = Throwables.getStackTraceAsString(e);
logToFile("Failed to execute query. Stack trace: " + stackTrace);
}
retries++;
}
}
return null;
}
As you can see, this is a blocking function, because I call .get() on the ResultSetFuture; each thread blocks waiting for the IO. So although the driver call is asynchronous, my overall approach is not, and I feel like I'm wasting a lot of hardware resources.
QUESTION
In my mind, I should be able to be notified when the .executeAsync results are available (or the timeout occurs), "freeing" the thread and allowing the same thread to perform other things.
Simply put, it seems I need to transform the sequential structure of processRow into a pipeline: the query is executed asynchronously and, when its results become available, the remaining part of the processing is performed. And of course, I want the main loop to wait for the whole pipelined process to finish, not just its first part.
In other words, the main loop submits a job (call it jobJob) and gets back a Future (call it jobFuture) that it can .get() to wait for completion. However, jobJob fires the "query" sub-job (call it queryJob) asynchronously, so I get another Future (call it queryFuture) that should be used to fire the "process" sub-job (call it processJob). At this point I'm simply nesting Futures and blocking deep in the chain before completing the Future that represents jobJob, which means I'm back where I started!
Before I go the hard route and implement this sort of pipeline as a Finite State Machine, I had a look at:
ForkJoinPool executor class
ListenableFuture from the Guava library
CompletableFuture class
None of them seems to satisfy my requirement of pipelining this process, or perhaps I just didn't find a clear explanation of how to perform such an apparently simple task. Can anyone enlighten me on this topic?
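For concreteness, here is roughly the shape I'm trying to express, in CompletableFuture style. Every helper in this sketch (fetchMetadataAsync, computeResult, insertResultAsync) is hypothetical, and bridging the driver's ResultSetFuture into such a chain is exactly the part I don't see:
// Hypothetical pipeline: no thread blocks between the stages.
CompletableFuture<Boolean> processRowPipelined(MyRow row) {
    return fetchMetadataAsync(row.getMetadataID())               // queryJob: async C* read
            .thenApply(metadata -> computeResult(row, metadata)) // processJob: CPU-bound work
            .thenCompose(result -> insertResultAsync(result))    // async C* write
            .thenApply(ignored -> Boolean.TRUE)                  // the whole jobJob succeeded
            .exceptionally(ex -> Boolean.FALSE);                 // the whole jobJob failed
}
The main loop would then keep at most MAX_JOBS of these pipelines in flight and wait on the returned futures, exactly as in STEP 2.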
Any help is really appreciated.
Related
I am using Java 8, and I want to know the recommended way to enforce a timeout on 3 jobs that I would like to execute asynchronously, retrieving each result from its future. Note that the timeout is the same for all 3 jobs. I also want to cancel a job if it runs beyond the time limit.
I am thinking something like this:
// Submit jobs async
List<CompletableFuture<String>> futures = submitJobs(); // Uses CompletableFuture.supplyAsync
CompletableFuture<Void> allFutures = CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));
try {
allFutures.get(100L, TimeUnit.MILLISECONDS);
} catch (TimeoutException e){
for (CompletableFuture<String> f : futures) {
if(!f.isDone()) {
/*
 From the Javadoc of cancel():
 @param mayInterruptIfRunning this value has no effect in this
 implementation because interrupts are not used to control
 processing.
 */
f.cancel(true);
}
}
}
List<String> output = new ArrayList<>();
for (CompletableFuture<String> fu : futures) {
if(!fu.isCancelled()) { // Is this needed?
output.add(fu.join());
}
}
return output;
Will something like this work? Is there a better way?
How do I cancel the future properly? The Javadoc says the thread cannot be interrupted, so if I cancel a future and then call join(), will I get the result immediately, since the thread will not be interrupted?
Is it recommended to use join() or get() to get the result after waiting is over?
It is worth noting that calling cancel on CompletableFuture is effectively the same as calling completeExceptionally on the current stage. The cancellation will not impact prior stages. With that said:
In principle, something like this will work, assuming upstream cancellation is not necessary.
CompletableFuture cancellation will not interrupt the current thread. Cancellation will cause all downstream stages to be triggered immediately with a CancellationException (will short circuit the execution flow).
'join' and 'get' are effectively the same in the case where the caller is willing to wait indefinitely. Join handles wrapping the checked Exceptions for you. If the caller wants to timeout, get will be needed.
Here is a segment illustrating the behavior on cancellation. Note how downstream stages are never started, while upstream stages continue running even after cancellation.
public static void main(String[] args) throws Exception
{
int maxSleepTime = 1000;
Random random = new Random();
AtomicInteger value = new AtomicInteger();
List<String> calculatedValues = new ArrayList<>();
Supplier<String> process = () -> {
    try {
        Thread.sleep(random.nextInt(maxSleepTime));
        System.out.println("Stage 1 Running!");
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    return Integer.toString(value.getAndIncrement());
};
List<CompletableFuture<String>> stage1 = IntStream.range(0, 10).mapToObj(val -> CompletableFuture.supplyAsync(process)).collect(Collectors.toList());
List<CompletableFuture<String>> stage2 = stage1.stream().map(Test::appendNumber).collect(Collectors.toList());
List<CompletableFuture<String>> stage3 = stage2.stream().map(Test::printIfCancelled).collect(Collectors.toList());
CompletableFuture<Void> awaitAll = CompletableFuture.allOf(stage2.toArray(new CompletableFuture[0]));
try
{
/*Wait 1/2 the time, some should be complete. Some not complete -> TimeoutException*/
awaitAll.get(maxSleepTime / 2, TimeUnit.MILLISECONDS);
}
catch(TimeoutException ex)
{
for(CompletableFuture<String> toCancel : stage2)
{
boolean irrelevantValue = false;
if(!toCancel.isDone())
toCancel.cancel(irrelevantValue);
else
calculatedValues.add(toCancel.join());
}
}
System.out.println("All futures Cancelled! But some Stage 1's may still continue printing anyways.");
System.out.println("Values returned as of cancellation: " + calculatedValues);
Thread.sleep(maxSleepTime);
}
private static CompletableFuture<String> appendNumber(CompletableFuture<String> baseFuture)
{
return baseFuture.thenApply(val -> { System.out.println("Stage 2 Running"); return "#" + val; });
}
private static CompletableFuture<String> printIfCancelled(CompletableFuture<String> baseFuture)
{
return baseFuture
        .thenApply(val -> { System.out.println("Stage 3 Running!"); return val; })
        .exceptionally(ex -> { System.out.println("Stage 3 Cancelled!"); return ex.getMessage(); });
}
If it is necessary to cancel the upstream process (ex: cancel some network call), custom handling will be needed.
After calling cancel you cannot join the future, since you get an exception.
One way to terminate the computation is to give it a reference to its own future and check it periodically: if it has been cancelled, abort the computation from inside.
This can be done when the computation is a loop where the check can be made at each iteration, as sketched below.
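A minimal sketch of that idea, using a hand-completed CompletableFuture and a single worker (the loop body is illustrative):
import java.util.concurrent.*;

public class CancelFromInside {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        final CompletableFuture<Long> future = new CompletableFuture<>();
        executor.submit(() -> {
            long sum = 0;
            for (long i = 0; i < 1_000_000_000L; i++) {
                if (future.isCancelled()) {
                    return;              // cancellation observed: abort from inside
                }
                sum += i;                // one unit of work per iteration
            }
            future.complete(sum);        // publish the result if never cancelled
        });

        Thread.sleep(100);
        future.cancel(true);             // the loop notices on its next check
        executor.shutdown();
    }
}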
Do you need it to be a CompletableFuture? Because another way is to avoid CompletableFuture and use a plain Future or a FutureTask instead: if you execute it with an Executor, calling future.cancel(true) will terminate the computation if possible.
Answering the question "if I call join(), will I get the result immediately": no, you will not. join() blocks and waits for the computation to complete; there is no way to force a computation that takes a long time to finish in a shorter time. What you can do is call future.complete(value) to provide a default result that other threads holding a reference to that future will see.
In the following code I have an asynchronous method, invoked over REST by authenticated users, in which I run a loop that periodically checks a cache for new data.
@Async
public CompletableFuture<List<Data>> pollData(Long previousMessageId, Long userId) throws InterruptedException {
// check db at first, if there are new data no need go to loop and waiting
List<Data> data = dataRepository.findByLastAndByUser(previousMessageId, userId);
// data not found, so enter the polling loop for a while
if (data.size() == 0) {
short c = 0;
while (c < 100) {
// check if some new data added or not, if yes break loop
if (cache.getIfPresent(userId) != null) {
break;
}
c++;
Thread.sleep(1000);
System.out.println("SEQUENCE: " + c + " in " + Thread.currentThread().getName());
}
// check database on the end of loop or after break from loop
data = dataRepository.findByLastAndByUser(previousMessageId, userId);
}
// clear data for that recipient and return result
cache.clear(userId);
return CompletableFuture.completedFuture(data);
}
and executor bean:
@Bean
public Executor asyncExecutor() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(2);
executor.setMaxPoolSize(2);
executor.setQueueCapacity(500);
executor.initialize();
return executor;
}
I run this check in a separate thread for every request because the data differ for each user.
I need to optimize this code for many users (about 10k active users). In its current state it does not work well: when there are more requests, they wait for a free thread, and every additional request takes very long (5 minutes instead of 100 seconds, for example).
Can you help me improve it? Thanks in advance.
In case there are no other concurrent calls to the pollData method, it takes at most ~100 s.
The parameter maxPoolSize defines the maximum number of threads that can run your @Async method concurrently.
So (number of users × execution time) / number of threads = 10,000 × 100 / 2 = 500,000 seconds.
I haven't completely understood the goal you want to reach with this method, but I suggest you review the design of this functionality (for example, take a look at Spring's cache support and @CacheEvict).
Notice that, in case you have multiple @Async methods, you can bind each one to its own pool configuration by giving the same name to the annotations, @Bean("Pool1") and @Async("Pool1"), as sketched below.
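For instance, a sketch of that binding inside a @Configuration class (the pool name and sizes here are made up, not recommendations):
@Bean("pollingPool")
public Executor pollingPool() {
    ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(50);     // sized for the polling workload rather than 2
    executor.setMaxPoolSize(50);
    executor.setQueueCapacity(10000);
    executor.initialize();
    return executor;
}

// The @Async method then names the pool it must run on:
@Async("pollingPool")
public CompletableFuture<List<Data>> pollData(Long previousMessageId, Long userId) {
    // ... same body as in the question ...
}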
I don't fully understand what you want to do, but I think it will clearly fill your thread pool quickly.
You should try a message broker or something similar instead.
Rather than answering a request by waiting for something new to happen, connect your clients via AMQP, WebSockets, webhooks, etc. When your server detects new information, it notifies the clients.
That way you don't need to occupy one thread per client.
I have the following function, in pseudo-code:
Result calc(Data data) {
if (data.isFinal()) {
return new Result(data); // This is the actual lengthy calculation
} else {
List<Result> results = new ArrayList<Result>();
for (int i=0; i<data.numOfSubTasks(); ++i) {
results.add(calc(data.subTask(i)));
}
return new Result(results); // merge all results in to a single result
}
}
I want to parallelize it, using a fixed number of threads.
My first attempt was:
ExecutorService executorService = Executors.newFixedThreadPool(numOfThreads);
Result calc(Data data) {
if (data.isFinal()) {
return new Result(data); // This is the actual lengthy calculation
} else {
List<Result> results = new ArrayList<Result>();
List<Callable<Void>> callables = new ArrayList<Callable<Void>>();
for (int i = 0; i < data.numOfSubTasks(); ++i) {
    final int index = i; // capture the loop variable for the anonymous class
    callables.add(new Callable<Void>() {
        public Void call() {
            results.add(calc(data.subTask(index))); // note: ArrayList is not thread-safe
            return null; // Callable<Void> must return a value
        }
    });
}
executorService.invokeAll(callables); // wait for all sub-tasks to complete
return new Result(results); // merge all results in to a single result
}
}
However, this quickly got stuck in a deadlock, because, while the top recursion level waits for all threads to finish, the inner levels also wait for threads to become available...
How can I efficiently parallelize my program without deadlocks?
Your problem is a general design problem when using ThreadPoolExecutor for tasks with dependencies.
I see two options:
1) Make sure to submit tasks in a bottom-up order, so that you never have a running task that depends on a task which didn't start yet.
2) Use the "direct handoff" strategy (See ThreadPoolExecutor documentation):
ThreadPoolExecutor executor = new ThreadPoolExecutor(poolSize, poolSize, 0, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
The idea is using a synchronous queue so that tasks never wait in a real queue. The rejection handler takes care of tasks which don't have an available thread to run on. With this particular handler, the submitter thread runs the rejected tasks.
This executor configuration guarantees that tasks are never rejected, and that you never have deadlocks due to inter-task dependencies.
You should split your approach into two phases:
1) create the whole tree, down until data.isFinal() == true
2) recursively collect the results (only possible if the merging does not trigger further operations/calls)
To do that, you can use Futures to make the results async; that means all results of calc will be of type Future<Result>.
Immediately returning a Future frees the current thread and gives room for processing others. When collecting the results (new Result(results)) you wait for all of them to be ready (scatter-gather pattern; you can use a semaphore to wait for all results). The collection itself walks the tree, and checking for (or waiting on) the results happens in a single thread.
Overall you build a tree of Futures that is used to collect the results, and only the "expensive" operations run in the thread pool.
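A minimal sketch of that two-phase shape (Node is a hypothetical helper; Data and Result are from the question):
// Hypothetical helper: a node is either a submitted leaf or a list of children.
class Node {
    final Future<Result> leafFuture; // non-null for leaves
    final List<Node> children;       // non-null for inner nodes
    Node(Future<Result> f) { leafFuture = f; children = null; }
    Node(List<Node> c) { leafFuture = null; children = c; }
    boolean isLeaf() { return leafFuture != null; }
}

// Phase 1: walk the data tree, submitting only the leaf calculations to the pool.
Node buildTree(final Data data, ExecutorService pool) {
    if (data.isFinal()) {
        return new Node(pool.submit(new Callable<Result>() {
            public Result call() { return new Result(data); } // the expensive part
        }));
    }
    List<Node> children = new ArrayList<Node>();
    for (int i = 0; i < data.numOfSubTasks(); ++i) {
        children.add(buildTree(data.subTask(i), pool));
    }
    return new Node(children);
}

// Phase 2: collect the results in the calling thread; only leaves ever block.
Result collect(Node node) throws Exception {
    if (node.isLeaf()) {
        return node.leafFuture.get(); // wait for the pool to finish this leaf
    }
    List<Result> results = new ArrayList<Result>();
    for (Node child : node.children) {
        results.add(collect(child));
    }
    return new Result(results);       // merge, as in the original code
}
Since pool threads only ever run leaf calculations and never wait on other tasks, a fixed-size pool cannot deadlock on itself here.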
I'm writing a game engine that performs alpha-beta search on a game state, and I'm trying to parallelize it. What I have so far works at first, but then it seems to slow to a halt. I suspect this is because I'm not correctly disposing of my threads.
When playing against the computer, the game calls on the getMove() function of a MultiThreadedComputerPlayer object. Here is the code for that method:
public void getMove(){
int n = board.legalMoves.size();
threadList = new ArrayList<WeightedMultiThread>();
moveEvals = new HashMap<Tuple, Integer>();
// Whenever a thread finishes its work at a given depth, it awaits() the other threads
// When all threads are finished, the move evaluations are updated and the threads continue their work.
CyclicBarrier barrier = new CyclicBarrier(n, new Runnable(){
public void run() {
for(WeightedMultiThread t : threadList){
moveEvals.put(t.move, t.eval);
}
}
});
// Prepare and start the threads
for (Tuple move : board.legalMoves) {
MCBoard nextBoard = board.clone();
nextBoard.move(move);
threadList.add(new WeightedMultiThread(nextBoard, weights, barrier));
moveEvals.put(move, 0);
}
for (WeightedMultiThread t : threadList) {t.start();}
// Let the threads run for the maximum amount of time per move
try {
Thread.sleep(timePerMove);
} catch (InterruptedException e) {System.out.println(e);}
for (WeightedMultiThread t : threadList) {
t.stop();
}
// Play the best move
Integer best = infHolder.MIN;
Tuple nextMove = board.legalMoves.get(0);
for (Tuple m : board.legalMoves) {
if (moveEvals.get(m) > best) {
best = moveEvals.get(m);
nextMove = m;
}
}
System.out.println(nextMove + " is the choice of " + name + " given evals:");
for (WeightedMultiThread t : threadList) {
System.out.println(t);
}
board.move(nextMove);
}
And here is the run() method of the threads in question:
public void run() {
startTime = System.currentTimeMillis();
while(true) {
int nextEval = alphabeta(0, infHolder.MIN, infHolder.MAX);
try{barrier.await();} catch (Exception e) {}
eval = nextEval;
depth += 1;
}
}
I need to be able to interrupt all the threads when time is up-- how am I supposed to implement this? As of now I'm constantly catching (and ignoring) InterruptedExceptions.
Thread.stop was deprecated for a reason. When you stop a thread in the middle of its work, it doesn't get the chance to properly release the resources it was using, and it doesn't notify other threads of its completion, something that's very important in multi-threaded apps. I'm not surprised your performance tanks; I'd be willing to bet your memory usage shoots through the roof. You also never properly retire the threads: you start and stop them without creating new objects, which means whatever broken state their variables were left in is probably still plaguing them.
A better way is to set a flag that tells the thread it should return. Include in your WeightedMultiThread class a volatile boolean named something like shouldQuit, and set it to false every time start() is called. Then, instead of while (true), loop on while (!shouldQuit), and instead of t.stop(), use t.shouldQuit = true. After you do that to every thread, have another loop that checks each thread with t.isAlive(); once every thread has returned, go about your business. You should have much better results that way.
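A sketch of that flag (the field and loop body are assumptions about your WeightedMultiThread):
class WeightedMultiThread extends Thread {
    volatile boolean shouldQuit = false; // volatile, so the writer's change is seen by the worker

    @Override
    public void run() {
        while (!shouldQuit) {
            // ... one iteration of the deepening search, as in the original run() ...
        }
    }
}

// In getMove(), instead of t.stop():
for (WeightedMultiThread t : threadList) {
    t.shouldQuit = true;                 // ask each thread to stop after its current iteration
}
for (WeightedMultiThread t : threadList) {
    try {
        t.join();                        // wait until the thread has actually returned
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}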
This looks like an ideal place to use an ExecutorService. You can create Callable instances that implement the parallel tasks, submit them to the ExecutorService, then use awaitTermination to enforce a timeout.
For example:
public void getMove() {
ExecutorService service = Executors.newFixedThreadPool(board.legalMoves.size());
List<Future<Something>> futures = new ArrayList<Future<Something>>(board.legalMoves.size());
for (Tuple move : board.legalMoves) {
futures.add(service.submit(new WeightedMultiThread(...)));
}
service.awaitTermination(timePerMove, TimeUnit.MILLISECONDS); // no shutdown() was requested, so this simply waits out the time budget
service.shutdownNow(); // Terminate all still-running jobs
for (Future<Something> future : futures) {
if (future.isDone()) {
Something something = future.get();
// Add best move logic here
}
}
...
}
Replace Something with something that encapsulates information about the move that has been evaluated. I'd suggest Something be a class that holds the Tuple and its associated score. Your WeightedMultiThread class can do something like this:
class WeightedMultiThread implements Callable<Something> {
public Something call() {
// Compute score
...
// Return an appropriate data structure
return new Something(tuple, score);
}
}
Even better would be to create the ExecutorService once and re-use it for each call to getMove. Creating threads is expensive, so best to only do it once if you can. If you take this approach then you should not call shutdownNow, but instead use the Future.cancel method to terminate jobs that have not completed in time. Make sure your WeightedMultiThread implementation checks for thread interruption and throws an InterruptedException. That's usually a good way to write a long-running task that needs to be interruptible.
EDIT:
Since you're doing a level-by-level exploration of the game space, I'd suggest that you encode that in the getMove function rather than in the Tuple evaluation code, e.g.
public Tuple getMove() {
ExecutorService service = ...
Tuple best = null;
long timeRemaining = MAX_TIME;
for (int depth = 0; depth < MAX_DEPTH && timeRemaining > 0; ++depth) {
long start = System.currentTimeMillis();
best = evaluateMoves(depth, service, timeRemaining);
long end = System.currentTimeMillis();
timeRemaining -= (end - start);
}
return best;
}
private Tuple evaluateMoves(int depth, ExecutorService service, long timeRemaining) {
List<Future<Whatever>> futures = ...; // submit one job per move at this depth, collecting the Futures
service.awaitTermination(timeRemaining, TimeUnit.MILLISECONDS);
// Find best move
...
return best;
}
That could probably be cleaner, but you get the idea.
The most sensible way is to use the interruption mechanism: the Thread.interrupt() and Thread.isInterrupted() methods. This ensures your message is delivered to the thread even if it sits inside a blocking call (remember that some methods declare throwing InterruptedException?).
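For example, the run() method from the question could honor interruption along these lines (a sketch reusing the question's alphabeta, barrier, eval, and depth):
public void run() {
    try {
        while (!Thread.currentThread().isInterrupted()) {
            int nextEval = alphabeta(0, infHolder.MIN, infHolder.MAX);
            barrier.await();         // throws InterruptedException if this thread is interrupted
            eval = nextEval;
            depth += 1;
        }
    } catch (InterruptedException | BrokenBarrierException e) {
        // Time is up: fall through and let the thread terminate cleanly.
    }
}

// In getMove(), when the time budget expires:
for (WeightedMultiThread t : threadList) {
    t.interrupt();                   // also wakes threads blocked in barrier.await()
}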
P.S. It would be useful to read Brian Goetz's "Java Concurrency in Practice" Chapter 7: Cancellation and Shutdown.
I'm using the ExecutorService in Java to invoke threads with invokeAll(). Afterwards, I get the result set with future.get(). It's really important that I receive the results in the same order I created the threads.
Here is a snippet:
try {
final List<CallObject> threads = new ArrayList<CallObject>();
// create threads
for (String name : collection)
{
final CallObject object = new CallObject(name);
threads.add(object);
}
// start all Threads
results = pool.invokeAll(threads, 3, TimeUnit.SECONDS);
for (Future<String> future : results)
{
try
{
// this method blocks until it receives the result, unless there is a
// timeout set.
final String rs = future.get();
if (future.isDone())
{
// if future.isDone() = true, a timeout did not occur.
// do something
}
else
{
// timeout
// log it and do something
break;
}
}
catch (Exception e)
{
}
}
}
catch (InterruptedException ex)
{
}
Is it assured that I receive the results from future.get() in the same order I created the CallObjects and added them to my ArrayList? I know the documentation for invokeAll() says the following:
"Returns a list of Futures representing the tasks, in the same sequential order as produced by the iterator for the given task list. If the operation did not time out, each task will have completed. If it did time out, some of these tasks will not have completed."
But I wanted to make sure I understood it correctly...
Thanks for answers! :-)
This is exactly what this piece of the statement is saying:
returns a list of Futures representing the tasks, in the same
sequential order as produced by the iterator for the given task list.
You will get the Futures in the exact order in which you inserted the items in the original list of Callables.
As per the documentation, you will get the futures in the same order.
A Future object is just a reference to the task, and Future#get() is a blocking call.
For example, suppose we submitted 4 tasks:
Task 1 -> Completed
Task 2 -> Completed
Task 3 -> Timed out
Task 4 -> Completed
and we iterate as in our code:
for (Future<String> future : futures) {
    future.get();
}
For tasks 1 and 2, get() returns immediately. Then the iteration blocks, waiting for task 3 to complete: even though task 4 has already completed, the loop is stuck on task 3. Only once task 3 completes (or its timed wait expires) does the iteration continue.
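If tasks are submitted individually with submit() (so there is no invokeAll timeout cancelling stragglers for you), one way to keep a slow task from stalling the whole iteration is to spread a single deadline over the loop. A sketch, assuming futures is a List<Future<String>> and a 3-second overall budget:
long deadline = System.currentTimeMillis() + 3000;    // overall budget for the batch
for (Future<String> future : futures) {
    long remaining = Math.max(0, deadline - System.currentTimeMillis());
    try {
        String rs = future.get(remaining, TimeUnit.MILLISECONDS); // waits at most the leftover budget
        // use rs
    } catch (TimeoutException e) {
        future.cancel(true);                          // give up on this task only, keep iterating
    } catch (InterruptedException | ExecutionException e) {
        // log and decide whether to continue or abort
    }
}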