I'm currently working on an enterprise application that performs long, non-linear tasks.
An abstraction of the workflow:
Gather necessary information (can take minutes, but is not always necessary)
Process data (always takes very long)
Notify several workers who post-process the result (in new tasks)
I have created two services that handle steps 1 and 2.
As the services shouldn't know about each other, I want a higher-order component that coordinates the 3 steps of a task. Think of it as a Callable which sends the task to service 1, wakes up again when service 1 returns a result, sends it to service 2, ..., sends the final result to all post-processors, and ends the task.
But as there are likely to be 100,000s of queued tasks, I don't want to start 100,000s of threads with task-control callables which, even if idle 99.9% of the time, would still be a massive overhead.
So does anybody have an idea for controlling this producer-consumer, queue-like pattern encapsulated in a task object, or know of a framework that addresses my concern?
Besides actor frameworks, I would suggest two main approaches that work with plain old Java:
Using an ExecutorService to which we submit tasks. The proper sequencing of steps can be synchronized using Future objects. The overall set of tasks can be synchronized using a Phaser, as shown below.
Using the Fork/Join framework
Here is an example using a simple executor service. The Workflow class is given an executor and a phaser (a synchronization barrier). Each time the workflow is executed, it submits a new task for each of the steps (i.e., data collection, processing, and post-processing). Each task uses the phaser to indicate when it starts and stops.
public class Workflow {
private final ExecutorService executor;
private final Phaser phaser;
public Workflow(ExecutorService executor, Phaser phaser) {
this.executor = executor;
this.phaser = phaser;
}
public void execute(int request) throws InterruptedException, ExecutionException {
executor.submit(() -> {
phaser.register();
// Data collection
Future<Integer> input = executor.submit(() -> {
phaser.register();
System.out.println("Gathering data for call " + request);
phaser.arrive();
return request;
});
// Data Processing
Future<Integer> result = executor.submit(() -> {
phaser.register();
System.out.println("Processing call " + request);
Thread.sleep(5000);
phaser.arrive();
return request;
});
// Post processing
Future<Integer> ack = executor.submit(() -> {
phaser.register();
System.out.println("Notifying processors for call " + request);
phaser.arrive();
return request;
});
final Integer output = ack.get();
phaser.arrive();
return output;
});
}
}
The caller object uses the phaser to know when all subtasks (steps) have completed before shutting down the executor.
public static void main(String[] args) throws InterruptedException, ExecutionException {
final Phaser phaser = new Phaser();
final ExecutorService executor = Executors.newCachedThreadPool();
Workflow workflow = new Workflow(executor, phaser);
phaser.register();
for (int request=0 ; request<10 ; request++) {
workflow.execute(request);
}
phaser.arriveAndAwaitAdvance();
executor.shutdown();
executor.awaitTermination(30, TimeUnit.SECONDS);
}
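As a complement to the example above, here is a minimal sketch of the same barrier idea in which every party is registered by the caller before its task is queued, so the caller cannot reach the barrier before a task has registered. The class name `PhaserBarrierSketch`, the method `runAll`, and the counter standing in for real work are assumptions made for illustration; they are not part of the original answer.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.AtomicInteger;

class PhaserBarrierSketch {

    /** Runs taskCount tasks and returns how many had finished when the barrier opened. */
    public static int runAll(int taskCount) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        Phaser phaser = new Phaser(1); // the caller is the first registered party
        AtomicInteger finished = new AtomicInteger();
        try {
            for (int i = 0; i < taskCount; i++) {
                phaser.register(); // register in the caller, before the task is queued
                executor.submit(() -> {
                    try {
                        finished.incrementAndGet(); // stands in for the real work
                    } finally {
                        phaser.arriveAndDeregister(); // always signal, even on failure
                    }
                });
            }
            phaser.arriveAndAwaitAdvance(); // blocks until every registered task has arrived
            return finished.get();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("Finished tasks: " + runAll(10));
    }
}
```

Registering in the submitting thread is a design choice of this sketch: it trades a little coupling (the caller must know how many tasks it starts) for a barrier that does not depend on thread scheduling.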
We use multiple thread groups in our projects for parallel execution, as below:
ThreadPoolExecutor executorService = (ThreadPoolExecutor) Executors.newFixedThreadPool(5);
My question is how to terminate the other thread groups when an exception occurs in any one of them.
thanks.
One option is to have a separate service which
tracks the relevant threadpools
tracks an exception flag
accepts task submissions on your behalf, so it can wrap Runnables in a try-catch which sets the exception flag to true
periodically checks if the exception flag is true and, if so, attempts to shutdown all relevant threadpools
For example you could have something like below.
public class ThreadpoolService {
private final AtomicBoolean threadpoolException = new AtomicBoolean(false);
private final Set<ExecutorService> threadPools = new HashSet<>();
private final ScheduledExecutorService tracker = Executors.newSingleThreadScheduledExecutor();
public ThreadpoolService() {
// Start a thread tracking if an exception occurs in the threadpools, and if so attempts to shut them down
tracker.scheduleAtFixedRate(() -> {
if (threadpoolException.get()) {
shutdownThreadPools();
}
// Run the tracker every second, with an initial delay of 1 second before the first run
}, 1000, 1000, TimeUnit.MILLISECONDS);
}
private void shutdownThreadPools() {
// For each threadpool create a completable future which attempts to shut it down
final var threadpoolStopTasks = threadPools.stream()
.map(tp -> CompletableFuture.runAsync(() -> {
try {
tp.shutdown();
// Await termination, force if taking too long
if (!tp.awaitTermination(1000, TimeUnit.MILLISECONDS)) {
tp.shutdownNow();
}
} catch (InterruptedException e) {
tp.shutdownNow();
Thread.currentThread().interrupt();
}
}))
.collect(Collectors.toList());
// Create a completable future from all of the above stop tasks, wait for it to complete then
// stop the executor this tracker is running in
CompletableFuture.allOf(threadpoolStopTasks.toArray(new CompletableFuture[0]))
.thenApply((v) -> {
tracker.shutdownNow();
return null;
})
.join();
}
public void submit(ExecutorService threadPool, Runnable task) {
threadPools.add(threadPool);
threadPool.submit(() -> {
try {
task.run();
} catch (Exception e) {
// do stuff
threadpoolException.set(true);
}
});
}
public void shutdown() throws InterruptedException {
shutdownThreadPools();
tracker.shutdown();
if (!tracker.awaitTermination(1000, TimeUnit.MILLISECONDS)) {
tracker.shutdownNow();
}
}
}
Then in your program
final var threadpoolService = new ThreadpoolService();
// Then, wherever you use a threadpool, delegate to the above for task submission
final var tp1 = Executors.newFixedThreadPool(5);
threadpoolService.submit(tp1, () -> {
// some task which may fail exceptionally
return;
});
When your program needs to shutdown for some other reason
threadpoolService.shutdown();
Of note is that an exception triggering the shutdown of these threadpools is not recoverable, i.e. the threadpools and the ThreadpoolService are no longer in a functional state after shutdown. Really, this should trigger the end of the program; you could enhance this to register a shutdown hook which ends the program.
It should also be noted that I've made a lot of assumptions, including:
use of the default fork-join pool for CompletableFutures (you can just pass your own executor service)
expectation the CompletableFuture.allOf will finish in a timely manner (you can add a timeout)
hardcoded time intervals (you can make these configurable)
It also doesn't cover the below, both of which can be resolved by using a guard (maybe threadpoolException) on appropriate methods and returning some value or throwing an exception as appropriate
race conditions on the various methods (e.g. calling shutdown while a shutdown is in progress)
calling submit following a shutdown
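The guard mentioned above could look roughly like the following sketch. It uses its own `AtomicBoolean` rather than reusing `threadpoolException`, and the class name `GuardedPoolSketch` and the boolean return value of `shutdown()` are assumptions made for illustration only.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicBoolean;

class GuardedPoolSketch {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final ExecutorService pool = Executors.newFixedThreadPool(2);

    /** Rejects new work once shutdown has begun. */
    public void submit(Runnable task) {
        if (stopped.get()) {
            throw new RejectedExecutionException("service is shut down");
        }
        // If shutdown slips in right after the check, the pool itself
        // throws RejectedExecutionException, so the outcome is the same.
        pool.submit(task);
    }

    /** Returns true only for the single caller that actually performs the shutdown. */
    public boolean shutdown() {
        if (!stopped.compareAndSet(false, true)) {
            return false; // another caller already started (or finished) the shutdown
        }
        pool.shutdown();
        return true;
    }

    public static void main(String[] args) {
        GuardedPoolSketch service = new GuardedPoolSketch();
        service.submit(() -> System.out.println("running"));
        System.out.println("first shutdown performed: " + service.shutdown());
        System.out.println("second shutdown performed: " + service.shutdown());
    }
}
```

`compareAndSet` makes the shutdown idempotent: concurrent or repeated calls see the flag already set and return without touching the pool again.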
I am developing an API. This API needs to do 2 DB queries to get the result.
I tried following strategies:
Used callable as return type in Controller.
Created 2 threads in the service (using Callable and CountDownLatch) to run the 2 queries in parallel and detect when they finish.
public class PetService {
public Object getData() {
CountDownLatch latch = new CountDownLatch(2);
AsyncQueryDBTask<Integer> firstQuery= new AsyncQueryDBTask<>(latch);
AsyncQueryDBTask<Integer> secondQuery= new AsyncQueryDBTask<>(latch);
latch.await();
}
public class AsyncQueryDBTask<T> implements Callable {
private CountDownLatch latch;
public AsyncQueryDBTask(CountDownLatch latch) { this.latch = latch;}
@Override
public T call() throws Exception {
//Run query
latch.countDown();
}
It worked fine, but I feel that I am breaking the structure of Spring somewhere.
I wonder what the most efficient way to get data in Spring 4 is.
- How do I know when both threads have completed their queries?
- How do I control thread resources, such as acquiring and releasing threads?
Thanks in advance.
You generally don't want to create your own threads or manage thread lifecycles in an application server. Instead, you can submit tasks to an ExecutorService, which pools background worker threads.
Conveniently, Spring has the @Async annotation that handles all of that for you. In your example, you would create 2 async methods that return a Future:
public class PetService {
public Object getData() {
Future<Integer> futureFirstResult = runFirstQuery();
Future<Integer> futureSecondResult = runSecondQuery();
Integer firstResult = futureFirstResult.get();
Integer secondResult = futureSecondResult.get();
}
@Async
public Future<Integer> runFirstQuery() {
//do query
return new AsyncResult<>(result);
}
@Async
public Future<Integer> runSecondQuery() {
//do query
return new AsyncResult<>(result);
}
}
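As long as you configure a ThreadPoolTaskExecutor and enable async methods, Spring will handle submitting the tasks for you. Note that with the default proxy-based setup, @Async only takes effect when the method is invoked through the Spring proxy (typically from another bean); a direct call from a method of the same class runs synchronously.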
NOTE: The get() method blocks the current thread until a result is returned by the worker thread but doesn't block other worker threads. It's generally advisable to put a timeout to prevent blocking forever.
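To illustrate the timeout advice, here is a minimal sketch in plain Java (no Spring) that waits a bounded time for a result. The class name `TimedGetSketch`, the sleep standing in for a slow query, and the `-1` fallback value are assumptions made for this example.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimedGetSketch {

    /** Waits at most timeoutMillis for a task that takes taskMillis; returns -1 on timeout. */
    public static int queryWithTimeout(long taskMillis, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Integer> future = executor.submit(() -> {
            Thread.sleep(taskMillis); // stands in for a slow query
            return 42;
        });
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it does not keep running
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } catch (ExecutionException e) {
            throw new IllegalStateException("query failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(queryWithTimeout(10, 2000));  // fast task: result arrives in time
        System.out.println(queryWithTimeout(2000, 50));  // slow task: the wait is abandoned
    }
}
```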
What is the proper way to implement concurrency in Java applications? I know about Threads and stuff, of course, and I have been programming in Java for 10 years now, but I haven't had much experience with concurrency.
For example, I have to asynchronously load a few resources, and only after all have been loaded, can I proceed and do more work. Needless to say, there is no order how they will finish. How do I do this?
In JavaScript, I like using the jQuery.deferred infrastructure, to say
$.when(deferred1,deferred2,deferred3...)
.done(
function(){//here everything is done
...
});
But what do I do in Java?
You can achieve it in multiple ways.
1. ExecutorService invokeAll() API
Executes the given tasks, returning a list of Futures holding their status and results when all complete.
2. CountDownLatch
A synchronization aid that allows one or more threads to wait until a set of operations being performed in other threads completes.
A CountDownLatch is initialized with a given count. The await methods block until the current count reaches zero due to invocations of the countDown() method, after which all waiting threads are released and any subsequent invocations of await return immediately. This is a one-shot phenomenon -- the count cannot be reset. If you need a version that resets the count, consider using a CyclicBarrier.
3. ForkJoinPool, or newWorkStealingPool() in Executors, is another way
Have a look at related SE questions:
How to wait for a thread that spawns it's own thread?
Executors: How to synchronously wait until all tasks have finished if tasks are created recursively?
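As a rough illustration of the first option, the sketch below submits a batch of tasks with invokeAll() and collects the results once all of them have completed. The class name `InvokeAllSketch` and the string-building task standing in for real resource loading are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllSketch {

    /** Loads every resource in parallel and returns the results in submission order. */
    public static List<String> loadAll(List<String> names) {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String name : names) {
                tasks.add(() -> "loaded " + name); // hypothetical stand-in for real loading
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every task has completed
            for (Future<String> future : executor.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a load task failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAll(Arrays.asList("a", "b", "c")));
    }
}
```

The returned futures are in the same order as the submitted tasks, regardless of which task finishes first.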
I would use parallel stream.
Stream.of(runnable1, runnable2, runnable3).parallel().forEach(r -> r.run());
// do something after all these are done.
If you need this to be asynchronous, then you might use a pool or Thread.
I have to asynchronously load a few resources,
You could collect these resources like this.
List<String> urls = ....
Map<String, String> map = urls.parallelStream()
.collect(Collectors.toMap(u -> u, u -> download(u)));
This will give you a mapping of all the resources once they have been downloaded concurrently. By default, the level of concurrency is roughly the number of CPUs you have, because parallel streams run on the common ForkJoinPool.
If I'm not using parallel Streams or Spring MVC's TaskExecutor, I usually use CountDownLatch. Instantiate with # of tasks, reduce once for each thread that completes its task. CountDownLatch.await() waits until the latch is at 0. Really useful.
Read more here: JavaDocs
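A minimal sketch of the CountDownLatch pattern described above might look like this. The class name `LatchSketch` and the counter standing in for the real task are assumptions for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class LatchSketch {

    /** Starts workerCount threads and returns how many had completed when await() returned. */
    public static int runWorkers(int workerCount) {
        CountDownLatch latch = new CountDownLatch(workerCount); // one count per task
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < workerCount; i++) {
            new Thread(() -> {
                try {
                    completed.incrementAndGet(); // stands in for the real task
                } finally {
                    latch.countDown(); // count down even if the task throws
                }
            }).start();
        }
        try {
            latch.await(); // returns once the count reaches zero
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed workers: " + runWorkers(5));
    }
}
```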
Personally, I would do something like this if I am using Java 8 or later.
// Retrieving instagram followers
CompletableFuture<Integer> instagramFollowers = CompletableFuture.supplyAsync(() -> {
// getInstaFollowers(userId);
return 0; // default value
});
// Retrieving twitter followers
CompletableFuture<Integer> twitterFollowers = CompletableFuture.supplyAsync(() -> {
// getTwFollowers(userId);
return 0; // default value
});
System.out.println("Calculating Total Followers...");
CompletableFuture<Integer> totalFollowers = instagramFollowers
.thenCombine(twitterFollowers, (instaFollowers, twFollowers) -> {
return instaFollowers + twFollowers; // can be replaced with method reference
});
System.out.println("Total followers: " + totalFollowers.get()); // blocks until both the above tasks are complete
I used supplyAsync() as I am returning some value (no. of followers in this case) from the tasks otherwise I could have used runAsync(). Both of these run the task in a separate thread.
Finally, I used thenCombine() to join both the CompletableFuture. You could also use thenCompose() to join two CompletableFuture if one depends on the other. But in this case, as both the tasks can be executed in parallel, I used thenCombine().
The methods getInstaFollowers(userId) and getTwFollowers(userId) are simple HTTP calls or something.
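For contrast with thenCombine(), here is a small sketch of the dependent case, where the second call needs the result of the first and thenCompose() is the better fit. The methods `findUserId` and `countFollowers` are hypothetical stand-ins for HTTP calls and are not part of the example above.

```java
import java.util.concurrent.CompletableFuture;

class ComposeSketch {

    // Hypothetical lookup: resolves a user name to an id asynchronously.
    static CompletableFuture<String> findUserId(String name) {
        return CompletableFuture.supplyAsync(() -> "id-" + name);
    }

    // Hypothetical lookup: here the "follower count" is just the id length.
    static CompletableFuture<Integer> countFollowers(String userId) {
        return CompletableFuture.supplyAsync(() -> userId.length());
    }

    /** The second call needs the result of the first, so the stages are chained. */
    public static int followersOf(String name) {
        return findUserId(name)
                .thenCompose(ComposeSketch::countFollowers)
                .join();
    }

    public static void main(String[] args) {
        System.out.println(followersOf("ann")); // "id-ann" has 6 characters, so this prints 6
    }
}
```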
You can use a ThreadPool and Executors to do this.
https://docs.oracle.com/javase/tutorial/essential/concurrency/pools.html
This is an example using threads. It's a static ExecutorService with a fixed size of 50 threads.
public class ThreadPoolExecutor {
private static final ExecutorService executorService = Executors.newFixedThreadPool(50,
new ThreadFactoryBuilder().setNameFormat("thread-%d").build());
private static ThreadPoolExecutor instance = new ThreadPoolExecutor();
public static ThreadPoolExecutor getInstance() {
return instance;
}
public <T> Future<? extends T> queueJob(Callable<? extends T> task) {
return executorService.submit(task);
}
public void shutdown() {
executorService.shutdown();
}
}
The business logic for the executor is used like this (you can use Callable or Runnable; a Callable can return something, a Runnable cannot):
public class MultipleExecutor implements Callable<ReturnType> {//your code}
And the call to the executor:
ThreadPoolExecutor threadPoolExecutor = ThreadPoolExecutor.getInstance();
List<Future<? extends ReturnType>> results = new LinkedList<>();
for (Type Type : typeList) {
Future<? extends ReturnType> future = threadPoolExecutor.queueJob(
new MultipleExecutor(needed parameters));
results.add(future);
}
for (Future<? extends ReturnType> result : results) {
try {
if (result.get() != null) {
result.get(); // here you get the return of one thread
}
} catch (InterruptedException | ExecutionException e) {
logger.error(e, e);
}
}
You can achieve the same behaviour as with $.Deferred in jQuery using a Java 8 class called CompletableFuture. This class provides an API for working with promises. To create async code, you can use one of its static creation methods, such as #runAsync or #supplyAsync, and then apply some computation to the results with #thenApply.
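A rough Java counterpart to `$.when(...).done(...)` can be sketched with CompletableFuture.allOf(). The class name `AllOfSketch` and the string-building supplier standing in for real loading are assumptions for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class AllOfSketch {

    /** Loads all resources concurrently; the returned future completes when every load is done. */
    public static CompletableFuture<List<String>> loadAll(List<String> names) {
        List<CompletableFuture<String>> futures = names.stream()
                .map(name -> CompletableFuture.supplyAsync(() -> "loaded " + name))
                .collect(Collectors.toList());
        return CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]))
                // every future is complete at this point, so join() does not block
                .thenApply(ignored -> futures.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        loadAll(Arrays.asList("a", "b", "c"))
                .thenAccept(results -> System.out.println("All done: " + results))
                .join(); // only so the demo does not exit before the callback runs
    }
}
```

The callback passed to thenAccept() plays the role of the `done` handler: it runs once, after all loads have finished, in whatever order they completed.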
I usually opt for an async notify-start, notify-progress, notify-end approach:
class Task extends Thread {
private ThreadLauncher parent;
public Task(ThreadLauncher parent) {
super();
this.parent = parent;
}
public void run() {
doStuff();
parent.notifyEnd(this);
}
public /*abstract*/ void doStuff() {
// ...
}
}
class ThreadLauncher {
public void stuff() {
for (int i=0; i<10; i++)
new Task(this).start();
}
public void notifyEnd(Task who) {
// ...
}
}
I have a situation that I need to work on.
I have a class which has a send method, for example:
@Singleton
class SendReport {
public void send() {}
}
The send method is called from a user click on web page, and must return immediately, but must start a sequence of tasks that will take time
send
->|
| |-> Task1
<-| |
<-|
|
|-> Task2 (can only start when Task1 completes/throws exception)
<-|
|
|-> Task3 (can only start when Task2 completes/throws exception)
<-|
I am new to the Java concurrency world and have been reading about it. As I understand it, I need an ExecutorService to which I submit() a job (Task1) for processing, getting a Future back to continue.
Am I correct?
The difficult part for me to understand and design is
- How and where to handle exceptions by any such task?
- As far as I see, do I have to do something like?
ExecutorService executorService = Executors.newFixedThreadPool(1);
Future futureTask1 = executorService.submit(new Callable(){
public Object call() throws Exception {
System.out.println("doing Task1");
return "Task1 Result";
}
});
if (futureTask1.get() != null) {
Future futureTask2 = executorService.submit(new Callable(){
public Object call() throws Exception {
System.out.println("doing Task2");
return "Task2 Result";
}
}
... and so on for Task 3
Is it correct?
if yes, is there a better recommended way?
Thanks
Dependent task execution is made easy with Dexecutor
Disclaimer : I am the owner
Here is an example, it can run the following complex graph very easily, you can refer this for more details
Here is an example
If you just have a line of tasks that each need to be called on completion of the previous one then, as stated and discussed in the previous answers, I don't think you need multiple threads at all.
If you have a pool of tasks and some of them needs to know the outcome of another task while others don't care you can then come up with a dependent callable implementation.
public class DependentCallable implements Callable {
private final String name;
private final Future pre;
public DependentCallable(String name, Future pre) {
this.name = name;
this.pre = pre;
}
@Override
public Object call() throws Exception {
if (pre != null) {
pre.get();
//pre.get(10, TimeUnit.SECONDS);
}
System.out.println(name);
return name;
}
}
A few other things to take care of, based on the code in your question: get rid of the future.get() calls between submits, as stated in previous replies, and use a thread pool whose size is greater than the depth of the dependencies between callables.
Your current approach will not work, as it will block until everything has completed, which is what you wanted to avoid.
future.get() is blocking,
so after submitting the first task your code will wait until it has finished, and only then will the next task be submitted, followed by another wait. There is no advantage over a single thread executing the tasks one by one.
So, if anything, the code would need to be:
Future futureTask2 = executorService.submit(new Callable(){
    public Object call() throws Exception {
        futureTask1.get(); // wait for Task1 inside the worker, not in the caller
        System.out.println("doing Task2");
        return "Task2 Result";
    }
});
Your graph suggests that the subsequent task should execute despite exceptions. An ExecutionException will be thrown from get() if the computation failed, so you need to guard the get() call with an appropriate try/catch.
Since Task1 and Task2 have to be completed one after another, why do you want them executed in different threads? Why not have one thread whose run method deals with Task1, Task2, ... one by one? As you said, it should not be your "main" thread; it can be a job in the executor, but a single job that handles all the tasks.
I personally don't like anonymous inner classes and callbacks (which is what you are more or less mimicking with a chain of futures). If I had to implement a sequence of tasks, I would implement a queue of tasks and processors that execute them.
Mainly because it is "more manageable": I could monitor the content of the queue or even remove unnecessary tasks.
So I would have a BlockingQueue<JobDescription> into which I would submit a JobDescription containing all the data necessary for executing the task.
I would implement threads (processors) whose run() method contains an infinite loop in which they take a job from the queue, do the task, and put the follow-up task back into the queue. Something along those lines.
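The queue-and-processor idea could be sketched roughly as follows. The fields of `JobDescription`, the `STOP` marker used to end the loop, and the log list are all assumptions added so the example is self-contained and terminates; a real processor would likely run for the lifetime of the application.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

class JobQueueSketch {

    /** Hypothetical job description: which job it is and which step to run next. */
    static final class JobDescription {
        final String name;
        final int step;

        JobDescription(String name, int step) {
            this.name = name;
            this.step = step;
        }
    }

    // Marker object that tells the processor to stop (an addition for this sketch).
    private static final JobDescription STOP = new JobDescription("stop", 0);

    /** Runs `steps` consecutive steps of one job through the queue and returns the log. */
    public static List<String> process(String name, int steps) {
        BlockingQueue<JobDescription> queue = new LinkedBlockingQueue<>();
        List<String> log = new CopyOnWriteArrayList<>();

        Thread processor = new Thread(() -> {
            try {
                while (true) {
                    JobDescription job = queue.take(); // blocks while the queue is empty
                    if (job == STOP) {
                        return;
                    }
                    log.add(job.name + " step " + job.step); // stands in for the real work
                    // Queue the follow-up step, or the stop marker after the last one.
                    queue.put(job.step < steps ? new JobDescription(job.name, job.step + 1) : STOP);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        processor.start();
        queue.add(new JobDescription(name, 1));
        try {
            processor.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(process("report", 3));
    }
}
```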
But if the Tasks are predefined at the send method, I would simply have them submitted as one job and then execute in one thread. If they are always sequential then there is no point in splitting them between different threads.
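Submitting the whole sequence as one job might look like the sketch below: send() returns immediately, and a single background thread runs the tasks in order, moving on to the next task even when one throws. The class name `SequentialJobSketch`, the log of results, and the `demo` helper are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SequentialJobSketch {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    /** Returns immediately; the tasks then run one after another on a background thread. */
    public Future<List<String>> send(List<Callable<String>> tasks) {
        return executor.submit(() -> {
            List<String> log = new ArrayList<>();
            for (Callable<String> task : tasks) {
                try {
                    log.add(task.call());
                } catch (Exception e) {
                    // a failed task is recorded, and the next task still runs
                    log.add("failed: " + e.getMessage());
                }
            }
            return log;
        });
    }

    public void shutdown() {
        executor.shutdown();
    }

    /** Demo helper: three tasks, the second of which fails. */
    public static List<String> demo() {
        SequentialJobSketch report = new SequentialJobSketch();
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "Task1 Result");
        tasks.add(() -> { throw new IllegalStateException("Task2 broke"); });
        tasks.add(() -> "Task3 Result");
        try {
            return report.send(tasks).get(); // only the demo blocks; send() itself does not
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            report.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```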
You need to add one more task if you want to return send request immediately. Please check the following example. It submits the request to the background thread which will execute the tasks sequentially and then returns.
Callable Objects for 3 long running tasks.
public class Task1 implements Callable<String> {
public String call() throws Exception {
Thread.sleep(5000);
System.out.println("Executing Task1...");
return Thread.currentThread().getName();
}
}
public class Task2 implements Callable<String> {
public String call() throws Exception {
Thread.sleep(5000);
System.out.println("Executing Task2...");
return Thread.currentThread().getName();
}
}
public class Task3 implements Callable<String> {
public String call() throws Exception {
Thread.sleep(5000);
System.out.println("Executing Task3...");
return Thread.currentThread().getName();
}
}
Main method that gets request from the client and returns immediately, and then starts executing tasks sequentially.
public class ThreadTest {
public static void main(String[] args) {
final ExecutorService executorService = Executors.newFixedThreadPool(5);
executorService.submit(new Runnable() {
public void run() {
try {
Future<String> result1 = executorService.submit(new Task1());
if (result1.get() != null) {
Future<String> result2 = executorService.submit(new Task2());
if (result2.get() != null) {
executorService.submit(new Task3());
}
}
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
}
});
System.out.println("Submitted request...");
}
}
I'm writing a swing application with HttpClient and I need a way to make a download list because I need to wait 1 minute (for example) before starting a new download.
So I would like to create a waiting list of threads (downloads).
I would have a class that takes a time parameter and contains a list of threads and when I add a thread in the list it starts if there is no running thread. Otherwise it waits for its turn.
Is there any tool to do that ?
Thanks a lot for your help.
Yes. ScheduledExecutorService. You can create a fixed length service via Executors.newScheduledThreadPool(corePoolSize). When you are ready to submit the task to wait the amount of time just submit it to ScheduledExecutorService.schedule
ScheduledExecutorService e = Executors.newScheduledThreadPool(10);
private final long defaultWaitTimeInMinutes = 1;
public void submitTaskToWait(Runnable r){
e.schedule(r, defaultWaitTimeInMinutes, TimeUnit.MINUTES);
}
Here the task will launch 1 minute after being submitted. To address your last point: if there are tasks currently downloading (with this configuration, that means 10 tasks downloading at once) when the 1 minute is up, the submitted runnable will have to wait until one of the other downloads is complete.
Keep in mind that this deviates a bit from the way you are designing it. You wouldn't create a new thread for each new task; rather, you would submit the task to a service that already has thread(s) waiting. For instance, if you only want one task to download at a time, you change from Executors.newScheduledThreadPool(10) to Executors.newScheduledThreadPool(1).
Edit: I'll leave my previous answer but update it with a solution for submitting a task that starts exactly 1 minute after the previous task completes. You would use two ExecutorServices: one to submit to the scheduled executor, and the other to do the timed executions. The first executor waits for each scheduled task to complete and then continues with the other tasks that are queued up.
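ExecutorService e = Executors.newSingleThreadExecutor();
ScheduledExecutorService scheduledService = Executors.newScheduledThreadPool(1);
public void submitTask(final Runnable r){
    e.submit(new Runnable(){
        public void run(){
            ScheduledFuture<?> future = scheduledService.schedule(r, defaultWaitTimeInMinutes, TimeUnit.MINUTES);
            try {
                // Block this single worker until the scheduled task has finished
                future.get();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            } catch (ExecutionException ex) {
                // The task itself failed; handle or log as appropriate
            }
        }
    });
}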
Now, when future.get() completes, the next Runnable submitted through submitTask will be picked up and scheduled to run a minute later. Note that this only fits if you want every task to wait the 1 minute, even when no other tasks have been submitted.
I think this would be a wrong way of going about the problem. A bit more logical way would be to create "download job" objects which will be added to a job queue. Create a TimerTask which would query this "queue" every 1 minute, pick up the Runnable/Callable jobs and submit them to the ExecutorService.
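The job-queue idea can be sketched as below. Note that this sketch swaps the TimerTask for a ScheduledExecutorService with scheduleWithFixedDelay, which measures the delay from the end of one run to the start of the next and runs each job directly on the polling thread instead of handing it to a second ExecutorService. The class name `SpacedJobQueueSketch` and the `demo` helper are assumptions for illustration.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class SpacedJobQueueSketch {
    private final Queue<Runnable> jobs = new ConcurrentLinkedQueue<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public SpacedJobQueueSketch(long delay, TimeUnit unit) {
        // Poll the queue repeatedly; the delay is measured from the end of one run
        // to the start of the next, so jobs never overlap.
        scheduler.scheduleWithFixedDelay(() -> {
            Runnable job = jobs.poll();
            if (job != null) {
                try {
                    job.run();
                } catch (RuntimeException e) {
                    // swallowed so that one failed job does not cancel the poller
                }
            }
        }, 0, delay, unit);
    }

    public void add(Runnable job) {
        jobs.add(job);
    }

    public void shutdown() {
        scheduler.shutdownNow();
    }

    /** Demo helper: runs jobCount jobs and returns their start times in milliseconds. */
    public static List<Long> demo(int jobCount, long delayMillis) {
        SpacedJobQueueSketch queue = new SpacedJobQueueSketch(delayMillis, TimeUnit.MILLISECONDS);
        List<Long> startTimes = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(jobCount);
        for (int i = 0; i < jobCount; i++) {
            queue.add(() -> {
                startTimes.add(System.nanoTime() / 1_000_000);
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            queue.shutdown();
        }
        return startTimes;
    }

    public static void main(String[] args) {
        System.out.println(demo(3, 200));
    }
}
```

One consequence of polling is that a job added to an empty queue may wait up to one full delay before it starts, since it is only picked up at the next poll.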
You could use the built-in ExecutorService. You can queue up tasks as Runnables and they will run on the available threads. If you want only a single task to run at a time use newFixedThreadPool(1);
ExecutorService executor = Executors.newFixedThreadPool(1);
You could then append an artificial Thread.sleep at the beginning of each Runnable run method to ensure that it waits the necessary amount of time before starting (not the most elegant choice, I know).
The Java Concurrency package contains classes for doing what you ask. The general construct you're talking about is an Executor which is backed by a ThreadPool. You generate a list of Runnables and send them to an Executor. The Executor has a ThreadPool behind it which will run the Runnables as threads become available.
So as an example here, you could have a Runnable like:
private static class Downloader implements Runnable {
private String file;
public Downloader(String file) {
this.file = file;
}
@Override
public void run() {
// Use HttpClient to download file.
}
}
Then you can use it by creating Downloader objects and submitting them to an ExecutorService:
public static void main(String[] args) throws Exception {
ExecutorService executorService = Executors.newFixedThreadPool(5);
for (String file : args) {
executorService.submit(new Downloader(file));
}
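// Stop accepting new tasks; without this, awaitTermination always waits the full timeout
executorService.shutdown();
executorService.awaitTermination(100, TimeUnit.SECONDS);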
}
It is maybe not the best solution but here is what I came up with thanks to the answer of John Vint. I hope it will help someone else.
package tests;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
public class RunnableQueue
{
private long waitTime;
private TimeUnit unit;
ExecutorService e;
public RunnableQueue(long waitTime, TimeUnit unit) {
e = Executors.newSingleThreadExecutor();
this.waitTime = waitTime;
this.unit = unit;
}
public void submitTask(final Runnable r){
e.submit(new Runnable(){
public void run(){
Thread t = new Thread(r);
t.start();
try {
t.join();
Thread.sleep(unit.toMillis(waitTime));
} catch (InterruptedException e) {
e.printStackTrace();
}
}
});
}
public static void main(String[] args) {
RunnableQueue runQueue = new RunnableQueue(3, TimeUnit.SECONDS);
for(int i=1; i<11; i++)
{
runQueue.submitTask(new DownloadTask(i));
System.out.println("Submitted task " + i);
}
}
}