Elegant way to run parallel threads in Spring 4 - java

I am developing an API. This API needs to do 2 DB queries to get the result.
I tried the following strategies:
Used Callable as the return type in the Controller.
Created 2 threads in the Service (using Callable and CountDownLatch) to run the 2 queries in parallel and detect when they finish.
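public class PetService {
    public Object getData() throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(2);
        AsyncQueryDBTask<Integer> firstQuery = new AsyncQueryDBTask<>(latch);
        AsyncQueryDBTask<Integer> secondQuery = new AsyncQueryDBTask<>(latch);
        // (starting the two tasks on their own threads is omitted here)
        latch.await(); // blocks until both tasks have called countDown()
        return null; // placeholder: combine and return the two results
    }
}
public class AsyncQueryDBTask<T> implements Callable<T> {
    private final CountDownLatch latch;
    public AsyncQueryDBTask(CountDownLatch latch) { this.latch = latch; }
    @Override
    public T call() throws Exception {
        T result = null; // run the query here and assign its result
        latch.countDown(); // signal that this task has finished
        return result;
    }
}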
It worked fine, but I feel that I am breaking the structure of Spring somewhere.
I wonder what the most efficient way to get the data in Spring 4 is.
- How do I know when both threads, each running its own query, have completed their job?
- How do I control thread resources, such as acquiring and releasing threads?
Thanks in advance.

You generally don't want to create your own threads in an application server, or manage their lifecycles yourself. In application servers, you can submit tasks to an ExecutorService, which pools background worker threads.
Conveniently, Spring has the @Async annotation that handles all of that for you. In your example, you would create 2 async methods that return a Future:
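public class PetService {
    public Object getData() throws InterruptedException, ExecutionException {
        Future<Integer> futureFirstResult = runFirstQuery();
        Future<Integer> futureSecondResult = runSecondQuery();
        Integer firstResult = futureFirstResult.get();
        Integer secondResult = futureSecondResult.get();
        return firstResult + secondResult; // placeholder: combine the results as needed
    }
    // Caveat: @Async is applied through a Spring proxy, so a call made from inside
    // the same bean (as in getData() above) bypasses it and runs synchronously.
    // In a real application these methods would live in a separate bean.
    @Async
    public Future<Integer> runFirstQuery() {
        Integer result = 0; // placeholder: do the query and assign its result
        return new AsyncResult<>(result);
    }
    @Async
    public Future<Integer> runSecondQuery() {
        Integer result = 0; // placeholder: do the query and assign its result
        return new AsyncResult<>(result);
    }
}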
As long as you configure a ThreadPoolTaskExecutor and enable async methods, Spring will handle submitting the tasks for you.
NOTE: The get() method blocks the calling thread until the worker thread returns a result, but it doesn't block other worker threads. It's generally advisable to pass a timeout, i.e. get(timeout, unit), so the caller cannot block forever.
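The timeout advice can be tried out with plain java.util.concurrent, without any Spring setup. This is only a minimal sketch: slowQuery() is a hypothetical stand-in for a real DB call, and the pool size, the timeout and the -1 error value are arbitrary choices for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimedQueries {
    // Hypothetical stand-in for a DB query: sleeps, then returns a value.
    static int slowQuery(long millis, int value) throws InterruptedException {
        Thread.sleep(millis);
        return value;
    }

    // Runs two queries in parallel and waits at most timeoutMillis for each result.
    // Returns the sum, or -1 if a query timed out or failed.
    static int sumWithTimeout(long firstMillis, long secondMillis, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<Integer> first = pool.submit(() -> slowQuery(firstMillis, 1));
            Future<Integer> second = pool.submit(() -> slowQuery(secondMillis, 2));
            return first.get(timeoutMillis, TimeUnit.MILLISECONDS)
                    + second.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return -1;
        } catch (ExecutionException | TimeoutException e) {
            return -1;
        } finally {
            pool.shutdownNow(); // interrupts any query that is still running
        }
    }
}
```

In a Spring application the pool would normally be the configured task executor rather than one created per call; the per-call pool here just keeps the sketch self-contained.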

Related

How to manage threads in Spring TaskExecutor framework

I have a BlockingQueue of Runnable - I can simply execute all tasks using one of TaskExecutor implementations, and all will be run in parallel.
However, some Runnables depend on others: they need to wait until another Runnable has finished before they can be executed.
The rule is quite simple: every Runnable has a code. Two Runnables with the same code cannot run simultaneously, but if the codes differ they should run in parallel.
In other words, all running Runnables need to have different codes, and all "duplicates" should wait.
The problem is that there's no event/method/whatsoever that fires when a thread ends.
I can build such a notification into every Runnable, but I don't like this approach, because it will fire just before the thread ends, not after it has ended.
java.util.concurrent.ThreadPoolExecutor has an afterExecute method, but it needs to be overridden; Spring uses only the default implementation, so this hook is effectively ignored.
Even if I do that, it gets complicated, because I need to track two additional collections: one with the Runnables already executing (no implementation gives access to this information) and one with those postponed because they have a duplicate code.
I like the BlockingQueue approach because there's no polling: a thread simply activates when something new is in the queue. But maybe there's a better approach to managing such dependencies between Runnables, so should I give up on BlockingQueue and use a different strategy?
If the number of different codes is not that large, the approach with a separate single-thread executor for each possible code, offered by BarrySW19, is fine.
If the total number of threads becomes unacceptable, then, instead of a single-thread executor, we can use an actor (from Akka or another similar library):
public class WorkerActor extends UntypedActor {
public void onReceive(Object message) {
if (message instanceof Runnable) {
Runnable work = (Runnable) message;
work.run();
} else {
// report an error
}
}
}
As in the original solution, ActorRefs for WorkerActors are collected in a HashMap. When an ActorRef workerActorRef corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted for execution with workerActorRef.tell(job, ActorRef.noSender()).
If you don't want to have a dependency on an actor library, you can program WorkerActor from scratch:
public class WorkerActor implements Runnable, Executor {
    private final Executor executor = ForkJoinPool.commonPool(); // or can be assigned in the constructor
    private final LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private boolean running = false;
    @Override
    public synchronized void execute(Runnable job) {
        queue.add(job); // the queue is unbounded, so add() neither blocks nor fails
        if (!running) {
            executor.execute(this); // execute this worker, not job!
            running = true;
        }
    }
    @Override
    public void run() {
        for (;;) {
            Runnable work;
            synchronized (this) {
                work = queue.poll();
                if (work == null) {
                    running = false; // nothing left: the next execute() restarts the worker
                    return;
                }
            }
            work.run(); // run outside the lock so execute() is never blocked by a job
        }
    }
}
When a WorkerActor worker corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted to execution with worker.execute(job).
One alternative strategy which springs to mind is to have a separate single-thread executor for each possible code. Then, when you want to submit a new Runnable, you simply look up the correct executor for its code and submit the job to it.
This may, or may not be a good solution depending on how many different codes you have. The main thing to consider would be that the number of concurrent threads running could be as high as the number of different codes you have. If you have many different codes this could be a problem.
Of course, you could use a Semaphore to restrict the number of concurrently running jobs; you would still create one thread per code, but only a limited number could actually execute at the same time. For example, this would serialise jobs by code, allowing up to three different codes to run concurrently:
public class MultiPoolExecutor {
private final Semaphore semaphore = new Semaphore(3);
private final ConcurrentMap<String, ExecutorService> serviceMap
= new ConcurrentHashMap<>();
public void submit(String code, Runnable job) {
ExecutorService executorService = serviceMap.computeIfAbsent(
code, (k) -> Executors.newSingleThreadExecutor());
executorService.submit(() -> {
semaphore.acquireUninterruptibly();
try {
job.run();
} finally {
semaphore.release();
}
});
}
}
Another approach would be to modify the Runnable to release a lock and check for jobs which could be run upon completion (so avoiding polling) - something like this example, which keeps all the jobs in a list until they can be submitted. The boolean latch ensures only one job for each code has been submitted to the thread pool at any one time. Whenever a new job arrives or a running one completes the code checks again for new jobs which can be submitted (the CodedRunnable is simply an extension of Runnable which has a code property).
public class SubmissionService {
private final ExecutorService executorService = Executors.newFixedThreadPool(5);
private final ConcurrentMap<String, AtomicBoolean> locks = new ConcurrentHashMap<>();
private final List<CodedRunnable> jobs = new ArrayList<>();
public void submit(CodedRunnable codedRunnable) {
synchronized (jobs) {
jobs.add(codedRunnable);
}
submitWaitingJobs();
}
private void submitWaitingJobs() {
synchronized (jobs) {
for(Iterator<CodedRunnable> iter = jobs.iterator(); iter.hasNext(); ) {
CodedRunnable nextJob = iter.next();
AtomicBoolean latch = locks.computeIfAbsent(
nextJob.getCode(), (k) -> new AtomicBoolean(false));
if(latch.compareAndSet(false, true)) {
iter.remove();
executorService.submit(() -> {
try {
nextJob.run();
} finally {
latch.set(false);
submitWaitingJobs();
}
});
}
}
}
}
}
The downside of this approach is that the code needs to scan through the entire list of waiting jobs after each task completes. Of course, you could make this more efficient - a completing task would actually only need to check for other jobs with the same code, so the jobs could be stored in a Map<String, List<Runnable>> structure instead to allow for faster processing.
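The map-based variant could look roughly like the sketch below. It is not the code from the answer above: the class name PerCodeSerializer and its methods are made up for illustration, and it assumes the caller passes the code alongside the job instead of using CodedRunnable. A code is present in the map exactly while one of its jobs is running, and the queue stored under it holds the jobs waiting behind that one, so a completing job only has to look at its own code.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.Executor;

class PerCodeSerializer {
    private final Executor executor;
    // Key present = a job with this code is running; value = jobs waiting behind it.
    private final Map<String, Queue<Runnable>> waiting = new HashMap<>();

    PerCodeSerializer(Executor executor) {
        this.executor = executor;
    }

    synchronized void submit(String code, Runnable job) {
        Queue<Runnable> queue = waiting.get(code);
        if (queue != null) {
            queue.add(job); // same code is already running: park the job
        } else {
            waiting.put(code, new ArrayDeque<>());
            start(code, job);
        }
    }

    private void start(String code, Runnable job) {
        executor.execute(() -> {
            try {
                job.run();
            } finally {
                onFinished(code); // runs even if the job throws
            }
        });
    }

    private synchronized void onFinished(String code) {
        Runnable next = waiting.get(code).poll();
        if (next != null) {
            start(code, next); // hand the code over to the next waiting job
        } else {
            waiting.remove(code); // nothing waiting: the code is free again
        }
    }
}
```

Usage would be something like new PerCodeSerializer(Executors.newFixedThreadPool(5)).submit("A", job). Jobs with different codes run in parallel up to the pool size, and jobs with the same code run one after another in submission order.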

Workflow Design pattern combined with Task Pattern?

I'm currently working on an enterprise application that performs long, non-linear tasks.
An abstraction of the workflow:
1. Gather necessary information (can take minutes, but is not always necessary)
2. Process data (always takes very long)
3. Notify several workers who post-process the result (in new tasks)
Now, I have created 2 services that can solve steps 1 and 2.
As the services shouldn't know of each other, I want to have a higher-order component that coordinates the 3 steps of a task. Think of it as a Callable which sends the task to service 1, wakes up again when service 1 returns a result, sends it to service 2, ..., sends the final result to all post-processors and ends the task.
But as there are likely to be 100,000s of queued tasks, I don't want to start 100,000s of threads with task-control callables which, even if idle 99.9% of the time, would still be a massive overhead.
So, does anybody have an idea for controlling this producer-consumer, queue-like pattern encapsulated in a task object, or know of a framework that simplifies this?
Besides actor frameworks, I would suggest two main approaches that work with plain old Java:
1. Using an ExecutorService to which we submit tasks. The proper sequencing of steps can be synchronized using Future objects. The overall set of tasks can be synchronized using a Phaser, as shown below.
2. Using the Fork/Join framework
Here is an example using a simple executor service. The Workflow class is given an executor and a phaser (a synchronization barrier). Each time the workflow is executed, it submits a new task for each of the steps (i.e., data collection, processing, and post-processing). Each task uses this phaser to indicate when it starts and stops.
public class Workflow {
    private final ExecutorService executor;
    private final Phaser phaser;
    public Workflow(ExecutorService executor, Phaser phaser) {
        this.executor = executor;
        this.phaser = phaser;
    }
    public void execute(int request) throws InterruptedException, ExecutionException {
        // Register on the caller's thread, before submitting, so the caller
        // cannot pass the barrier before this task has been counted
        phaser.register();
        executor.submit(() -> {
            try {
                // Data collection
                phaser.register();
                Future<Integer> input = executor.submit(() -> {
                    try {
                        System.out.println("Gathering data for call " + request);
                        return request;
                    } finally {
                        phaser.arrive();
                    }
                });
                // Data processing: waits for the collected data
                phaser.register();
                Future<Integer> result = executor.submit(() -> {
                    try {
                        Integer data = input.get();
                        System.out.println("Processing call " + data);
                        Thread.sleep(5000);
                        return data;
                    } finally {
                        phaser.arrive();
                    }
                });
                // Post-processing: waits for the processed result
                phaser.register();
                Future<Integer> ack = executor.submit(() -> {
                    try {
                        Integer processed = result.get();
                        System.out.println("Notifying processors for call " + processed);
                        return processed;
                    } finally {
                        phaser.arrive();
                    }
                });
                return ack.get();
            } finally {
                phaser.arrive();
            }
        });
    }
}
The caller uses the phaser to know when all subtasks (steps) have completed, before shutting down the executor.
public static void main(String[] args) throws InterruptedException, ExecutionException {
final Phaser phaser = new Phaser();
final ExecutorService executor = Executors.newCachedThreadPool();
Workflow workflow = new Workflow(executor, phaser);
phaser.register();
for (int request=0 ; request<10 ; request++) {
workflow.execute(request);
}
phaser.arriveAndAwaitAdvance();
executor.shutdown();
executor.awaitTermination(30, TimeUnit.SECONDS);
}
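Since the question is worried about threads sitting idle, it may be worth comparing the above with a CompletableFuture chain (Java 8+), which is a different technique from the Phaser example: each step is scheduled only once the previous one has completed, so no thread blocks between steps. The sketch below is only an illustration; gather(), process() and notifyWorkers() are hypothetical placeholders for the real services.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

class AsyncWorkflow {
    // Hypothetical steps standing in for the real services.
    static int gather(int request) { return request * 10; }
    static int process(int data) { return data + 1; }
    static String notifyWorkers(int result) { return "done:" + result; }

    // Each stage is handed to the executor only after the previous stage has
    // completed, so no thread sits blocked waiting between the steps.
    static CompletableFuture<String> run(int request, Executor executor) {
        return CompletableFuture
                .supplyAsync(() -> gather(request), executor)
                .thenApplyAsync(AsyncWorkflow::process, executor)
                .thenApplyAsync(AsyncWorkflow::notifyWorkers, executor);
    }
}
```

With this shape, 100,000 queued requests are 100,000 small future objects waiting in the executor's queue rather than 100,000 blocked threads; the number of threads is whatever the executor passed to run() provides.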

How to wait for completion of multiple tasks in Java?

What is the proper way to implement concurrency in Java applications? I know about Threads and stuff, of course, and I have been programming in Java for 10 years now, but I haven't had much experience with concurrency.
For example, I have to asynchronously load a few resources, and only after all have been loaded, can I proceed and do more work. Needless to say, there is no order how they will finish. How do I do this?
In JavaScript, I like using the jQuery.deferred infrastructure, to say
$.when(deferred1,deferred2,deferred3...)
.done(
function(){//here everything is done
...
});
But what do I do in Java?
You can achieve it in multiple ways.
1. ExecutorService invokeAll() API
Executes the given tasks, returning a list of Futures holding their status and results when all complete.
2. CountDownLatch
A synchronization aid that allows one or more threads to wait until a set of operations being performed in other threads completes.
A CountDownLatch is initialized with a given count. The await methods block until the current count reaches zero due to invocations of the countDown() method, after which all waiting threads are released and any subsequent invocations of await return immediately. This is a one-shot phenomenon -- the count cannot be reset. If you need a version that resets the count, consider using a CyclicBarrier.
3. ForkJoinPool, or newWorkStealingPool() in Executors, is another way.
Have a look at related SE questions:
How to wait for a thread that spawns it's own thread?
Executors: How to synchronously wait until all tasks have finished if tasks are created recursively?
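For the first option, a minimal invokeAll() sketch could look like this. The load() method is a hypothetical placeholder for the real resource loading, and the pool size of 3 is arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllDemo {
    // Hypothetical loader standing in for real resource loading.
    static String load(String name) {
        return name.toUpperCase();
    }

    // Loads all resources in parallel and returns once every one is done.
    static List<String> loadAll(List<String> names) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (String name : names) {
                tasks.add(() -> load(name));
            }
            List<String> results = new ArrayList<>();
            // invokeAll() blocks until all tasks have completed and returns
            // the futures in the same order as the task list
            for (Future<String> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```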
I would use parallel stream.
Stream.of(runnable1, runnable2, runnable3).parallel().forEach(r -> r.run());
// do something after all these are done.
If you need this to be asynchronous, then you might use a pool or Thread.
I have to asynchronously load a few resources,
You could collect these resources like this.
List<String> urls = ....
Map<String, String> map = urls.parallelStream()
.collect(Collectors.toMap(u -> u, u -> download(u)));
This will give you a mapping of all the resources once they have been downloaded concurrently. The concurrency will be the number of CPUs you have by default.
If I'm not using parallel Streams or Spring MVC's TaskExecutor, I usually use CountDownLatch. Instantiate with # of tasks, reduce once for each thread that completes its task. CountDownLatch.await() waits until the latch is at 0. Really useful.
Read more here: JavaDocs
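As a rough illustration of that pattern (the "work" here is just a counter increment, and the 5 second limit is an arbitrary assumption):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class LatchDemo {
    // Starts `tasks` threads, waits until all of them have finished,
    // and returns how many completed their work.
    static int runAndWait(int tasks) {
        CountDownLatch latch = new CountDownLatch(tasks);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            new Thread(() -> {
                try {
                    completed.incrementAndGet(); // placeholder for the real work
                } finally {
                    latch.countDown(); // count down even if the work throws
                }
            }).start();
        }
        try {
            // Wait with a timeout so a stuck task cannot block the caller forever
            if (!latch.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }
}
```

Calling countDown() in a finally block matters: if a task throws before counting down, await() without a timeout would never return.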
Personally, I would do something like this if I am using Java 8 or later.
// Retrieving instagram followers
CompletableFuture<Integer> instagramFollowers = CompletableFuture.supplyAsync(() -> {
// getInstaFollowers(userId);
return 0; // default value
});
// Retrieving twitter followers
CompletableFuture<Integer> twitterFollowers = CompletableFuture.supplyAsync(() -> {
// getTwFollowers(userId);
return 0; // default value
});
System.out.println("Calculating Total Followers...");
CompletableFuture<Integer> totalFollowers = instagramFollowers
.thenCombine(twitterFollowers, (instaFollowers, twFollowers) -> {
return instaFollowers + twFollowers; // can be replaced with method reference
});
System.out.println("Total followers: " + totalFollowers.get()); // blocks until both the above tasks are complete
I used supplyAsync() as I am returning some value (no. of followers in this case) from the tasks otherwise I could have used runAsync(). Both of these run the task in a separate thread.
Finally, I used thenCombine() to join both the CompletableFuture. You could also use thenCompose() to join two CompletableFuture if one depends on the other. But in this case, as both the tasks can be executed in parallel, I used thenCombine().
The methods getInstaFollowers(userId) and getTwFollowers(userId) are simple HTTP calls or something.
You can use a ThreadPool and Executors to do this.
https://docs.oracle.com/javase/tutorial/essential/concurrency/pools.html
This is an example where I use threads. It's a static ExecutorService with a fixed size of 50 threads.
public class ThreadPoolExecutor {
private static final ExecutorService executorService = Executors.newFixedThreadPool(50,
new ThreadFactoryBuilder().setNameFormat("thread-%d").build());
private static ThreadPoolExecutor instance = new ThreadPoolExecutor();
public static ThreadPoolExecutor getInstance() {
return instance;
}
public <T> Future<? extends T> queueJob(Callable<? extends T> task) {
return executorService.submit(task);
}
public void shutdown() {
executorService.shutdown();
}
}
The business logic for the executor is used like this (you can use Callable or Runnable; a Callable can return a value, a Runnable cannot):
public class MultipleExecutor implements Callable<ReturnType> { /* your code */ }
And the call to the executor:
ThreadPoolExecutor threadPoolExecutor = ThreadPoolExecutor.getInstance();
List<Future<? extends ReturnType>> results = new LinkedList<>();
for (Type type : typeList) {
    Future<? extends ReturnType> future = threadPoolExecutor.queueJob(
            new MultipleExecutor(/* needed parameters */));
    results.add(future);
}
for (Future<? extends ReturnType> result : results) {
    try {
        ReturnType value = result.get(); // blocks until this task is done
        if (value != null) {
            // here you have the return value of one task
        }
    } catch (InterruptedException | ExecutionException e) {
        logger.error(e, e);
    }
}
The same behaviour as with $.Deferred in jQuery can be achieved in Java 8 with a class called CompletableFuture. This class provides an API for working with promises. To create async code you can use one of its static factory methods, such as #runAsync or #supplyAsync, and then apply further computation to the results with #thenApply.
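The closest counterpart to $.when(...) is probably CompletableFuture.allOf(). A small sketch, assuming a hypothetical load() method in place of the real resource loading:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class WhenAllDemo {
    // Hypothetical loader standing in for real resource loading.
    static String load(String name) {
        return "loaded " + name;
    }

    // Starts one async load per name and returns a future that completes
    // with all results once every load has finished.
    static CompletableFuture<List<String>> loadAll(List<String> names) {
        List<CompletableFuture<String>> futures = names.stream()
                .map(n -> CompletableFuture.supplyAsync(() -> load(n)))
                .collect(Collectors.toList());
        return CompletableFuture
                .allOf(futures.toArray(new CompletableFuture[0]))
                // every future is complete here, so join() does not block
                .thenApply(ignored -> futures.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }
}
```

The returned future plays the role of the combined deferred: attach thenAccept(...) to it for the "here everything is done" callback, or call join() if the caller has to wait.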
I usually opt for an async notify-start, notify-progress, notify-end approach:
class Task extends Thread {
private ThreadLauncher parent;
public Task(ThreadLauncher parent) {
super();
this.parent = parent;
}
public void run() {
doStuff();
parent.notifyEnd(this);
}
public /*abstract*/ void doStuff() {
// ...
}
}
class ThreadLauncher {
public void stuff() {
for (int i=0; i<10; i++)
new Task(this).start();
}
public void notifyEnd(Task who) {
// ...
}
}

How to run concurrent job with dependent tasks?

I have a situation that I need to work on
I have a class which has send method, example
@Singleton
class SendReport {
public void send() {}
}
The send method is called from a user click on web page, and must return immediately, but must start a sequence of tasks that will take time
send
->|
| |-> Task1
<-| |
<-|
|
|-> Task2 (can only start when Task1 completes/throws exception)
<-|
|
|-> Task3 (can only start when Task2 completes/throws exception)
<-|
I am new to Java concurrent world and was reading about it. As per my understanding, I need a Executor Service and submit() a job(Task1) to process and get the Future back to continue.
Am I correct?
The difficult part for me to understand and design is
- How and where to handle exceptions by any such task?
- As far as I see, do I have to do something like?
ExecutorService executorService = Executors.newFixedThreadPool(1);
Future futureTask1 = executorService.submit(new Callable(){
public Object call() throws Exception {
System.out.println("doing Task1");
return "Task1 Result";
}
});
if (futureTask1.get() != null) {
Future futureTask2 = executorService.submit(new Callable(){
public Object call() throws Exception {
System.out.println("doing Task2");
return "Task2 Result";
}
});
// ... and so on for Task 3
}
Is it correct?
if yes, is there a better recommended way?
Thanks
Dependent task execution is made easy with Dexecutor
Disclaimer : I am the owner
It can run a complex graph of dependent tasks very easily; refer to the Dexecutor documentation for more details.
If you just have a line of tasks that need to be called on completion of the previous one, then, as stated and discussed in the previous answers, I don't think you need multiple threads at all.
If you have a pool of tasks and some of them need to know the outcome of another task while others don't care, you can come up with a dependent callable implementation.
public class DependentCallable implements Callable<Object> {
    private final String name;
    private final Future<?> pre;
    public DependentCallable(String name, Future<?> pre) {
        this.name = name;
        this.pre = pre;
    }
    @Override
    public Object call() throws Exception {
        if (pre != null) {
            pre.get(); // wait for the task this one depends on
            //pre.get(10, TimeUnit.SECONDS);
        }
        System.out.println(name);
        return name;
    }
}
A few other things you need to take care of, based on the code in your question: get rid of the future.get() calls in between submits, as stated in previous replies, and use a thread pool whose size is greater than the depth of dependencies between callables.
Your current approach will not work, because it blocks until everything has completed, which is what you wanted to avoid.
future.get() is blocking,
so after submitting the first task your code will wait until it has finished before the next task is submitted, then wait again. There is no advantage over a single thread executing the tasks one by one.
So, if anything, the code would need to be:
Future futureTask2 = executorService.submit(new Callable(){
    public Object call() throws Exception {
        futureTask1.get();
        System.out.println("doing Task2");
        return "Task2 Result";
    }
});
Your graph suggests that the subsequent task should execute despite exceptions. get() throws an ExecutionException if there was a problem with the computation, so you need to guard the get() call with an appropriate try/catch.
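Putting those two points together, a guarded chain might be sketched as follows. The after() helper and the task bodies are made up for illustration, and run() only blocks at the end so the example has something to return; a real send() method would submit the tasks and return immediately.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class GuardedChain {
    // Wraps a task so that it first waits for its predecessor and then runs
    // regardless of whether the predecessor succeeded or threw.
    static Callable<String> after(Future<String> previous, Callable<String> task) {
        return () -> {
            try {
                previous.get(); // wait for the previous task
            } catch (ExecutionException e) {
                // the previous task failed; carry on anyway
            }
            return task.call();
        };
    }

    static List<String> run() {
        // Pool size 3 so each waiting task has its own thread (chain depth is 3)
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            Callable<String> failing = () -> {
                throw new IllegalStateException("Task1 failed");
            };
            Future<String> f1 = pool.submit(failing);
            Future<String> f2 = pool.submit(after(f1, () -> "Task2 Result"));
            Future<String> f3 = pool.submit(after(f2, () -> "Task3 Result"));
            return Arrays.asList(f2.get(), f3.get());
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

All three submit() calls return at once; the waiting happens on the pool threads, not on the caller.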
Since Task1 and Task2 have to be completed one after another, why do you want them executed in different threads? Why not have one thread with a run method that deals with Task1, Task2, ... one by one? As you said, it should not be your "main" thread; it can be an executor job, but one that handles all the tasks.
I personally don't like anonymous inner classes and callbacks (that is what you kind of mimic with a chain of futures). If I had to implement a sequence of tasks, I would actually implement a queue of tasks and processors that execute them.
Mainly because it is "more manageable", as I could monitor the content of the queue or even remove unnecessary tasks.
So I would have a BlockingQueue<JobDescription> into which I would submit a JobDescription containing all the data necessary for the task execution.
I would implement threads (processors) that, in their run() method, have an infinite loop in which they take a job from the queue, do the task, and put the following task back into the queue. Something along those lines.
But if the Tasks are predefined at the send method, I would simply have them submitted as one job and then execute in one thread. If they are always sequential then there is no point in splitting them between different threads.
You need to add one more task if you want to return send request immediately. Please check the following example. It submits the request to the background thread which will execute the tasks sequentially and then returns.
Callable Objects for 3 long running tasks.
public class Task1 implements Callable<String> {
public String call() throws Exception {
Thread.sleep(5000);
System.out.println("Executing Task1...");
return Thread.currentThread().getName();
}
}
public class Task2 implements Callable<String> {
public String call() throws Exception {
Thread.sleep(5000);
System.out.println("Executing Task2...");
return Thread.currentThread().getName();
}
}
public class Task3 implements Callable<String> {
public String call() throws Exception {
Thread.sleep(5000);
System.out.println("Executing Task3...");
return Thread.currentThread().getName();
}
}
Main method that gets request from the client and returns immediately, and then starts executing tasks sequentially.
public class ThreadTest {
public static void main(String[] args) {
final ExecutorService executorService = Executors.newFixedThreadPool(5);
executorService.submit(new Runnable() {
public void run() {
try {
Future<String> result1 = executorService.submit(new Task1());
if (result1.get() != null) {
Future<String> result2 = executorService.submit(new Task2());
if (result2.get() != null) {
executorService.submit(new Task3());
}
}
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
}
});
System.out.println("Submitted request...");
}
}

How to run a Listener in a different thread or do its calculation in a different thread

I'm trying to build a cache with Google Guava and want to do some calculation on the expired objects. A removalListener notifies me, if some object was removed.
How can I run the removalListener in a different thread than the main application or pass the expired object (in the simple example below, that would be the Integer 3) to a different thread that handles the calculation?
Edit: As the calculation is rather short, but happens often, I would rather not create a new thread each time (would be thousands of threads), but have one (or maybe two) who calculate all objects.
Simple example:
Cache<String, Integer> cache = CacheBuilder.newBuilder().maximumSize(100)
.expireAfterAccess(100, TimeUnit.NANOSECONDS)
.removalListener(new RemovalListener<String, Integer>() {
public void onRemoval(final RemovalNotification notification) {
if (notification.getCause() == RemovalCause.EXPIRED) {
System.out.println("removed " + notification.getValue());
// do calculation=> this should be in another thread
}
}
})
.build();
cache.put("test1", 3);
cache.cleanUp();
To run your listener in an executor, wrap it with RemovalListeners.asynchronous.
.removalListener(asynchronous(new RemovalListener() { ... }, executor))
Create an ExecutorService using one of the Executors factory methods, and submit a new Runnable to this executor each time you need to:
private ExecutorService executor = Executors.newSingleThreadExecutor();
...
public void onRemoval(final RemovalNotification notification) {
if (notification.getCause() == RemovalCause.EXPIRED) {
System.out.println("removed " + notification.getValue());
submitCalculation(notification.getValue());
}
}
private void submitCalculation(final Integer value) {
Runnable task = new Runnable() {
@Override
public void run() {
// call your calculation here
}
};
executor.submit(task);
}
You can create a new class and implement the java.lang.Runnable interface like so:
public class MyWorkerThread implements Runnable {
public MyWorkerThread(/*params*/) {
//set your instance variables here
//then start the thread
(new Thread(this)).start();
}
public void run() {
//do useful things
}
}
When you create a new MyWorkerThread by calling the constructor, execution is returned to the calling code as soon as the constructor is finished, and a separate thread is started that runs the code inside the run() method.
If you might want to create MyWorkerThread objects without immediately starting them off, you can remove the Thread.start() code from the constructor, and call the thread manually from the instance later like so;
MyWorkerThread t = new MyWorkerThread();
//later
(new Thread(t)).start();
Or if you want to keep a reference to the Thread object so you can do groovy things like interrupt and join, do it like so;
Thread myThread = new Thread(t);
myThread.start();
//some other time
myThread.interrupt();
You can simply create an intermediate queue for expired entries (the removal listener just adds each expired object to this queue), for example a blocking in-memory queue such as ArrayBlockingQueue or LinkedBlockingDeque.
Then you can set up a thread pool of handlers (with a configurable size) that consume objects using the poll() method.
If you need a high-performance queue, I can recommend a more advanced non-blocking queue implementation. You can also read more about high-performance non-blocking queues here: Add the first element to a ConcurrentLinkedQueue atomically
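A bare-bones version of the queue idea, with a single consumer thread, might look like the sketch below. The names are invented, the "calculation" is just a running total, and a negative value is used as a hypothetical stop signal. It uses take(), which blocks until an entry arrives, rather than a bare poll(), which returns null on an empty queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class ExpiredEntryWorker {
    private final BlockingQueue<Integer> expired = new LinkedBlockingQueue<>();
    private final AtomicInteger total = new AtomicInteger();
    private final Thread consumer;

    ExpiredEntryWorker() {
        consumer = new Thread(() -> {
            try {
                while (true) {
                    int value = expired.take(); // blocks until an entry arrives
                    if (value < 0) {
                        return; // negative value: hypothetical stop signal
                    }
                    total.addAndGet(value); // placeholder for the real calculation
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.setDaemon(true);
        consumer.start();
    }

    // Called from the removal listener; only enqueues, so it returns immediately.
    void onExpired(int value) {
        expired.add(value);
    }

    // Lets the consumer drain the queue, stops it, and returns the total.
    int stopAndGetTotal() {
        expired.add(-1);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return total.get();
    }
}
```

In the removal listener from the question, onRemoval would then just call worker.onExpired(notification.getValue()) for expired entries.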
Use an executor service to dispatch your task to a different thread.
An ExecutorService has an internal blocking queue that is used for safe publishing of references between the producer and the consumer threads. The factory class Executors can be used to create different ExecutorService instances with different thread management strategies.
private ExecutorService cleanupExecutor = Executors.newFixedThreadPool(CLEANUP_THREADPOOL_SIZE);
...
public void onRemoval(final RemovalNotification notification) {
if (notification.getCause() == RemovalCause.EXPIRED) {
System.out.println("removed " + notification.getValue());
doAsyncCalculation(notification.getValue());
}
}
private void doAsyncCalculation(final Object obj) {
    cleanupExecutor.submit(new Runnable() {
        public void run() {
            expensiveOperation(obj);
        }
    });
}
In doAsyncCalculation you are creating new tasks to be run but not new threads. The executor service takes care of dispatching the task to the threads in the executorService's associated thread pool.
