I have a similar situation to that described in this question:
Java email sending queue - fixed number of threads sending as many messages as are available
In that I have a blocking queue that is fed commands (ICommandTask extends Callable<Object>), which a thread pool takes off the queue and runs. The blocking queue provides thread synchronization and isolation between the calling thread and the executing thread. Different objects throughout the program can submit ICommandTasks to the command queue, which is why I've made addTask() static.
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import com.mypackage.tasks.ICommandTask;
public enum CommandQueue
{
INSTANCE;
private final BlockingQueue<ICommandTask> commandQueue;
private final ExecutorService executor;
private CommandQueue()
{
commandQueue = new LinkedBlockingQueue<ICommandTask>();
executor = Executors.newCachedThreadPool();
}
public static void start()
{
new Thread(INSTANCE.new WaitForProducers()).start();
}
public static void addTask(ICommandTask command)
{
INSTANCE.commandQueue.add(command);
}
private class WaitForProducers implements Runnable
{
@Override
public void run()
{
ICommandTask command;
while(true)
{
try
{
command = INSTANCE.commandQueue.take();
executor.submit(command);
}
catch (InterruptedException e)
{
// logging etc.
}
}
}
}
}
In the main program, during startup, the command queue is started with the following call, which initializes the CommandQueue singleton and starts WaitForProducers in a separate thread.
CommandQueue.start();
I wanted to ask whether this setup is a recommended way to achieve what I want, particularly in a heavily multithreaded environment: multiple producers feeding a single executor through a singleton enum (so that different parts of the program can access it), with a separate thread that takes tasks off the queue and submits them to a thread pool.
So far it seems to be working OK, but I plan on creating objects similar to CommandQueue to handle different types of tasks, each stored in its own queue, e.g. OrderQueue, EventQueue, NegotiationQueue, etc. So it needs to be reasonably scalable and thread-safe.
Thanks in advance.
Related
I have a BlockingQueue of Runnable - I can simply execute all tasks using one of TaskExecutor implementations, and all will be run in parallel.
However, some Runnables depend on others, which means they need to wait until another Runnable finishes before they can be executed.
The rule is quite simple: every Runnable has a code. Two Runnables with the same code cannot run simultaneously, but if the codes differ they should run in parallel.
In other words, all running Runnables need to have different codes; all "duplicates" should wait.
The problem is that there's no event/method/whatsoever that fires when a thread ends.
I can build such a notification into every Runnable, but I don't like this approach, because it will fire just before the thread ends, not after it has ended.
java.util.concurrent.ThreadPoolExecutor has the method afterExecute, but it needs to be implemented - Spring uses only the default implementation, and this method is ignored.
Even if I do that, it's getting complicated, because I need to track two additional collections: one with the Runnables already executing (no implementation gives access to this information) and one with those postponed because they have a duplicate code.
I like the BlockingQueue approach because there's no polling; a thread simply wakes up when something new is in the queue. But maybe there's a better approach to managing such dependencies between Runnables, so should I give up on BlockingQueue and use a different strategy?
If the number of different codes is not that large, the approach with a separate single thread executor for each possible code, offered by BarrySW19, is fine.
If the total number of threads becomes unacceptable, then, instead of a single thread executor, we can use an actor (from Akka or another similar library):
public class WorkerActor extends UntypedActor {
public void onReceive(Object message) {
if (message instanceof Runnable) {
Runnable work = (Runnable) message;
work.run();
} else {
// report an error
}
}
}
As in the original solution, ActorRefs for WorkerActors are collected in a HashMap. When an ActorRef workerActorRef corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted to execution with workerActorRef.tell(job).
If you don't want to have a dependency on the actor library, you can program WorkerActor from scratch:
import java.util.concurrent.Executor;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.LinkedBlockingQueue;
public class WorkerActor implements Runnable, Executor {
    Executor executor = ForkJoinPool.commonPool(); // or can be assigned in the constructor
    LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    boolean running = false;
    @Override
    public synchronized void execute(Runnable job) {
        // The queue is unbounded, so add() cannot fail; put() would force us to handle InterruptedException.
        queue.add(job);
        if (!running) {
            executor.execute(this); // execute this worker, not job!
            running = true;
        }
    }
    @Override
    public void run() {
        for (;;) {
            Runnable work = null;
            synchronized (this) {
                work = queue.poll();
                if (work == null) {
                    running = false;
                    return;
                }
            }
            work.run();
        }
    }
}
When a WorkerActor worker corresponding to the given code is obtained (retrieved or created), the Runnable job is submitted to execution with worker.execute(job).
One alternate strategy which springs to mind is to have a separate single thread executor for each possible code. Then, when you want to submit a new Runnable you simply lookup the correct executor to use for its code and submit the job.
This may, or may not be a good solution depending on how many different codes you have. The main thing to consider would be that the number of concurrent threads running could be as high as the number of different codes you have. If you have many different codes this could be a problem.
Of course, you could use a Semaphore to restrict the number of concurrently running jobs; you would still create one thread per code, but only a limited number could actually execute at the same time. For example, this would serialise jobs by code, allowing up to three different codes to run concurrently:
public class MultiPoolExecutor {
private final Semaphore semaphore = new Semaphore(3);
private final ConcurrentMap<String, ExecutorService> serviceMap
= new ConcurrentHashMap<>();
public void submit(String code, Runnable job) {
ExecutorService executorService = serviceMap.computeIfAbsent(
code, (k) -> Executors.newSingleThreadExecutor());
executorService.submit(() -> {
semaphore.acquireUninterruptibly();
try {
job.run();
} finally {
semaphore.release();
}
});
}
}
Another approach would be to modify the Runnable to release a lock and check for jobs which could be run upon completion (so avoiding polling) - something like this example, which keeps all the jobs in a list until they can be submitted. The boolean latch ensures only one job for each code has been submitted to the thread pool at any one time. Whenever a new job arrives or a running one completes the code checks again for new jobs which can be submitted (the CodedRunnable is simply an extension of Runnable which has a code property).
public class SubmissionService {
private final ExecutorService executorService = Executors.newFixedThreadPool(5);
private final ConcurrentMap<String, AtomicBoolean> locks = new ConcurrentHashMap<>();
private final List<CodedRunnable> jobs = new ArrayList<>();
public void submit(CodedRunnable codedRunnable) {
synchronized (jobs) {
jobs.add(codedRunnable);
}
submitWaitingJobs();
}
private void submitWaitingJobs() {
synchronized (jobs) {
for(Iterator<CodedRunnable> iter = jobs.iterator(); iter.hasNext(); ) {
CodedRunnable nextJob = iter.next();
AtomicBoolean latch = locks.computeIfAbsent(
nextJob.getCode(), (k) -> new AtomicBoolean(false));
if(latch.compareAndSet(false, true)) {
iter.remove();
executorService.submit(() -> {
try {
nextJob.run();
} finally {
latch.set(false);
submitWaitingJobs();
}
});
}
}
}
}
}
The downside of this approach is that the code needs to scan through the entire list of waiting jobs after each task completes. Of course, you could make this more efficient - a completing task would actually only need to check for other jobs with the same code, so the jobs could be stored in a Map<String, List<Runnable>> structure instead to allow for faster processing.
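A minimal sketch of that per-code map idea follows. It is only an illustration, not a tested replacement for the class above: the name KeyedSubmissionService and its methods are made up for this example, and it uses a Map<String, Queue<Runnable>> rather than a List so that the next waiting job can be polled cheaply.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ExecutorService;

// Hypothetical sketch: jobs with the same code run one at a time, in arrival order.
class KeyedSubmissionService {
    private final ExecutorService executorService;
    // An entry exists only while a job with that code is in flight;
    // its queue holds the jobs waiting behind the running one.
    private final Map<String, Queue<Runnable>> waiting = new HashMap<>();

    KeyedSubmissionService(ExecutorService executorService) {
        this.executorService = executorService;
    }

    public synchronized void submit(String code, Runnable job) {
        Queue<Runnable> queue = waiting.get(code);
        if (queue != null) {
            queue.add(job); // same code is already running: park the job
        } else {
            waiting.put(code, new ArrayDeque<>());
            start(code, job);
        }
    }

    private void start(String code, Runnable job) {
        executorService.execute(() -> {
            try {
                job.run();
            } finally {
                onFinished(code); // only looks at jobs with the same code
            }
        });
    }

    private synchronized void onFinished(String code) {
        Runnable next = waiting.get(code).poll();
        if (next == null) {
            waiting.remove(code); // nothing waiting: the code is free again
        } else {
            start(code, next);
        }
    }
}
```

A completing job touches only the queue for its own code, so the cost no longer depends on how many jobs are waiting for other codes.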
I want to create a health checker, which will check the health of a java process. My process does a lot of things and is multi threaded. Various exceptions could be thrown, like Service / SQL / IO, etc. My plan is to call the HealthChecker to check for the process, from the catch block, in the individual threads. This will check for all the different healths, and in the case where there is any issue it will pause the threads, and log appropriately. There will be other processes which will read the logs by the process, and alert support to take appropriate actions.
Below is the general structure of the java process.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class Schedular {
private static int numOfTasks = 10 ;
public static void main(String[] args) {
ExecutorService service = Executors.newFixedThreadPool(5);
while(true){
for(int i=0;i<numOfTasks;i++){
service.execute(new Workers());
}
}
}
}
class Workers implements Runnable{
@Override
public void run() {
/*
* This can throw different exceptions , eg:
*/
try{
}catch(Exception e){
e.printStackTrace();
HealthChecker.checkHealth();
}
}
}
class HealthChecker{
public static void checkHealth() {
//Check health and then , log and pause all the threads
}
}
I am not able to figure out a way to pause all the threads. If there is a db exception I want all the threads to pause. I am requesting some suggestions.
You need a way to block the threads until some event occurs that allows the threads to continue. I see some major issues with the code:
1) The while(true) in your main thread is likely to end in an OutOfMemoryError (not a StackOverflowError, since nothing recurses). With each iteration of the while loop you queue 10 more tasks on the executor, and this just continues unbounded.
2) There is no loop in your run(), so even if an exception is caught and we wait for the HealthChecker, the run() method would still exit. A loop is not needed in run() if the main thread keeps submitting new tasks to take the place of the finished ones, but that logic is not presently in the main loop.
But setting those concerns aside here is one way to block worker threads until some event (presumably a HealthCheck all clear) occurs.
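import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
public class Schedular {
    private static int numOfTasks = 10;
    public static void main(String[] args) {
        ExecutorService service = Executors.newFixedThreadPool(5);
        HealthChecker hChecker = new HealthChecker();
        for (int i = 0; i < numOfTasks; i++) {
            service.execute(new Workers(hChecker));
        }
    }
}
class Workers implements Runnable {
    private final HealthChecker hChecker;
    public Workers(HealthChecker hChecker) {
        this.hChecker = hChecker;
    }
    @Override
    public void run() {
        while (true) {
            try {
                doWork();
            } catch (InterruptedException ie) {
                // run() cannot rethrow a checked exception, so restore the flag and stop
                Thread.currentThread().interrupt();
                return;
            } catch (Exception e) {
                e.printStackTrace();
                hChecker.checkHealth(); // blocks while the checker holds the permit
            }
        }
    }
    private void doWork() throws Exception {
        /*
         * Placeholder for the real work. This can throw different exceptions, e.g.:
         */
    }
}
class HealthChecker implements Runnable {
    private final Semaphore semaphore = new Semaphore(1, true);
    private volatile boolean inErrorState; // placeholder: set by the real health checks
    private boolean paused; // used only by the monitoring thread
    public void checkHealth() {
        // Waits for the permit, then hands it straight back.
        semaphore.acquireUninterruptibly();
        semaphore.release();
    }
    @Override
    public void run() {
        // code to check for errors that cause threads to pause.
        if (inErrorState && !paused) {
            semaphore.acquireUninterruptibly(); // workers now block in checkHealth()
            paused = true;
        } else if (!inErrorState && paused) {
            semaphore.release(); // all clear: blocked workers continue
            paused = false;
        }
    }
}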
A few things worth mentioning.
1) The main thread only submits 10 tasks, versus an unbounded number. You can adjust this as needed.
2) The worker is long lived, meaning it will continue running even if it encounters exceptions, except for an InterruptedException.
3) HealthChecker is no longer used statically; it is instead a shared object.
4) HealthChecker is a Runnable that can be executed in its own thread to monitor for errors. I did not add the code to execute this thread.
5) HealthChecker uses a Semaphore to make the threads block until the error state is cleared. I looked for other objects that can do this, like CountDownLatch, CyclicBarrier or Phaser, but this one came closest to giving us what we need to block all the threads from one point (the run() method).
It's not perfect, but I think it gets you a little bit closer to what you want.
You're venturing pretty far afield from best practices, but you didn't ask about best practices for monitoring the health of threads - so I won't answer that question. Instead, I'll just answer the question you asked: how can I pause a set of threads managed by an ExecutorService?
Assuming that your Workers.run() will eventually end without intervention (in other words, it's not in an infinite loop - intentional or otherwise), the right thing to do is to call service.shutdown() (where service is your instance of ExecutorService). To do this, you can pass service in to HealthChecker.checkHealth() as a new parameter. Calling shutdown() stops the executor from accepting new tasks, lets the already-submitted tasks complete, and then stops the executor.
If Workers.run() will not naturally complete, best practice says that you need to change your code such that it will. There is a Thread.stop() method you can call to halt the thread and a Thread.suspend() method you can call to suspend the thread. Both of these are double-bad ideas for you to use for two reasons:
They are Deprecated and will leave the Threads in a super-unhealthy state. You will have very difficult problems in the future if you use them.
You are using ExecutorService. That means you are delegating thread management to that class. If you go messing with the state of the Threads underneath ExecutorService, it can't manage the thread pool for you and, again, you will have very difficult problems in the future.
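As a rough illustration of the shutdown() route described above, here is a minimal sketch. The class and method names (PoolStopper, stopGracefully) are invented for this example; the ExecutorService calls themselves are the standard ones.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: stop a pool without touching its threads directly.
class PoolStopper {
    /** Returns true if every submitted task finished within the timeout. */
    static boolean stopGracefully(ExecutorService service, long timeout, TimeUnit unit) {
        service.shutdown(); // no new tasks are accepted; submitted tasks still run
        try {
            if (service.awaitTermination(timeout, unit)) {
                return true;
            }
            // Tasks are still running: shutdownNow() interrupts the workers,
            // which only helps if the tasks respond to interruption.
            service.shutdownNow();
            return false;
        } catch (InterruptedException e) {
            service.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return false;
        }
    }
}
```

A health check could call something like PoolStopper.stopGracefully(service, 30, TimeUnit.SECONDS) once it decides the process should stop taking work. Note that this stops the pool rather than pausing it; a stopped ExecutorService cannot be restarted.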
I am trying to figure out how to use the types from the java.util.concurrent package to parallelize processing of all the files in a directory.
I am familiar with the multiprocessing package in Python, which is very simple to use, so ideally I am looking for something similar:
public interface FictionalFunctor<T>{
void handle(T arg);
}
public class FictionalThreadPool {
public FictionalThreadPool(int threadCount){
...
}
public <T> FictionalThreadPoolMapResult<T> map(FictionalFunctor<T> functor, List<T> args){
// Executes the given functor on each and every arg from args in parallel. Returns, when
// all the parallel branches return.
// FictionalThreadPoolMapResult allows to abort the whole mapping process, at the least.
}
}
dir = getDirectoryToProcess();
pool = new FictionalThreadPool(10); // 10 threads in the pool
pool.map(new FictionalFunctor<File>(){
@Override
public void handle(File file){
// process the file
}
}, dir.listFiles());
I have a feeling that the types in java.util.concurrent allow me to do something similar, but I have absolutely no idea where to start.
Any ideas?
Thanks.
EDIT 1
Following the advice given in the answers, I have written something like this:
public void processAllFiles() throws IOException {
ExecutorService exec = Executors.newFixedThreadPool(6);
BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<Runnable>(5); // Figured we can keep the contents of 6 files simultaneously.
exec.submit(new MyCoordinator(exec, tasks));
for (File file : dir.listFiles(getMyFilter())) {
try {
tasks.add(new MyTask(file));
} catch (IOException exc) {
System.err.println(String.format("Failed to read %s - %s", file.getName(), exc.getMessage()));
}
}
}
public class MyTask implements Runnable {
private final byte[] m_buffer;
private final String m_name;
public MyTask(File file) throws IOException {
m_name = file.getName();
m_buffer = Files.toByteArray(file);
}
@Override
public void run() {
// Process the file contents
}
}
private class MyCoordinator implements Runnable {
private final ExecutorService m_exec;
private final BlockingQueue<Runnable> m_tasks;
public MyCoordinator(ExecutorService exec, BlockingQueue<Runnable> tasks) {
m_exec = exec;
m_tasks = tasks;
}
@Override
public void run() {
while (true) {
Runnable task = m_tasks.remove();
m_exec.submit(task);
}
}
}
How I thought the code works is:
The files are read one after another.
A file contents are saved in a dedicated MyTask instance.
A blocking queue with the capacity of 5 to hold the tasks. I count on the ability of the server to keep the contents of at most 6 files at one time - 5 in the queue and another fully initialized task waiting to enter the queue.
A special MyCoordinator task fetches the file tasks from the queue and dispatches them to the same pool.
OK, so there is a bug - more than 6 tasks can be created. Some will be submitted, even though all the pool threads are busy. I've planned to solve it later.
The problem is that it does not work at all. The MyCoordinator thread blocks on the first remove - this is fine. But it never unblocks, even though new tasks were placed in the queue. Can anyone tell me what am I doing wrong?
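A note on the queue calls used above: in BlockingQueue, add() and remove() are the throwing variants, while put() and take() are the ones that wait. The small sketch below shows the difference on a queue of capacity 1; the class and method names are made up for illustration. If that is what happens here, the coordinator would not be blocked at all: remove() would throw NoSuchElementException on the empty queue, and because the task was handed over with exec.submit(), the exception would end up inside the discarded Future instead of being printed.

```java
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo of throwing vs. waiting calls on a bounded BlockingQueue.
class QueueCallsDemo {
    // remove() on an empty queue throws immediately; it never waits.
    static boolean removeThrowsWhenEmpty() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1);
        try {
            queue.remove();
            return false;
        } catch (NoSuchElementException expected) {
            return true;
        }
    }

    // add() on a full bounded queue throws IllegalStateException; it never waits.
    static boolean addThrowsWhenFull() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1);
        queue.add("first");
        try {
            queue.add("second");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // put() and take() are the waiting variants.
    static String handOff(String item) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(1);
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(50); // let the consumer reach take() first
                queue.put(item);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        return queue.take(); // waits until the producer has put the item
    }
}
```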
The thread pool you are looking for is the ExecutorService interface. You can create a fixed-size thread pool using Executors.newFixedThreadPool. This allows you to easily implement a producer-consumer pattern, with the pool encapsulating all the queue and worker functionality for you:
ExecutorService exec = Executors.newFixedThreadPool(10);
You can then submit tasks in the form of objects whose type implements Runnable (or Callable if you want to also get a result):
class ThreadTask implements Runnable {
public void run() {
// task code
}
}
...
exec.submit(new ThreadTask());
// alternatively, using an anonymous type
exec.submit(new Runnable() {
public void run() {
// task code
}
});
A big word of advice on processing multiple files in parallel: if you have a single mechanical disk holding the files it's wise to use a single thread to read them one-by-one and submit each file to a thread pool task as above, for processing. Do not do the actual reading in parallel as it will degrade performance.
A simpler solution than using ExecutorService is to implement your own producer-consumer scheme. Have a thread that creates tasks and puts them on a LinkedBlockingQueue or ArrayBlockingQueue, and have worker threads that take tasks from this queue and run them. You may need a special kind of task, say ExitTask, that forces a worker to exit. At the end of the job, if you have n workers, you need to add n ExitTasks to the queue.
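The scheme above could be sketched roughly as follows. This is only one possible shape: the class name PoisonPillWorkers and its methods are invented for the example, and the exit task is represented by a single shared Runnable instance that workers recognize by identity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical hand-rolled producer-consumer with one exit task per worker.
class PoisonPillWorkers {
    // Sentinel compared by identity; it is never run.
    private static final Runnable EXIT_TASK = new Runnable() {
        @Override
        public void run() { }
    };

    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final List<Thread> workers = new ArrayList<>();

    PoisonPillWorkers(int workerCount) {
        for (int i = 0; i < workerCount; i++) {
            Thread worker = new Thread(this::workLoop);
            workers.add(worker);
            worker.start();
        }
    }

    void submit(Runnable task) {
        queue.add(task); // unbounded queue, so this never blocks
    }

    // Queues one exit task per worker, then waits for all workers to stop.
    void shutdownAndWait() throws InterruptedException {
        for (int i = 0; i < workers.size(); i++) {
            queue.add(EXIT_TASK);
        }
        for (Thread worker : workers) {
            worker.join();
        }
    }

    private void workLoop() {
        try {
            while (true) {
                Runnable task = queue.take(); // waits until a task is available
                if (task == EXIT_TASK) {
                    return;
                }
                task.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because the exit tasks go to the back of the queue, every task submitted before shutdownAndWait() is still processed before the workers stop.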
Basically, what @Tudor said: use an ExecutorService. I wanted to expand on his code, but I always feel strange editing other people's posts. Here's a skeleton of what you would submit to the ExecutorService:
public class MyFileTask implements Runnable {
final File fileToProcess;
public MyFileTask(File file) {
fileToProcess = file;
}
public void run() {
// your code goes here, e.g.
handle(fileToProcess);
// if you prefer, implement Callable instead
}
}
See also my blog post here for some more details if you get stuck
Since processing Files often leads to IOExceptions, I'd prefer a Callable (which can throw a checked Exception) to a Runnable, but YMMV.
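To make the Callable suggestion concrete, here is a small sketch. FileSizeTask and sizeOf are hypothetical names, and "processing" is reduced to counting bytes so the example stays short. The point is that call() can declare IOException, and the caller gets it back as the cause of an ExecutionException.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical task: unlike Runnable.run(), call() may throw a checked exception.
class FileSizeTask implements Callable<Long> {
    private final Path file;

    FileSizeTask(Path file) {
        this.file = file;
    }

    @Override
    public Long call() throws IOException {
        return (long) Files.readAllBytes(file).length; // stand-in for real processing
    }

    // Submits the task and turns a wrapped IOException back into a plain one.
    static long sizeOf(ExecutorService pool, Path file) throws IOException, InterruptedException {
        Future<Long> result = pool.submit(new FileSizeTask(file));
        try {
            return result.get();
        } catch (ExecutionException e) {
            if (e.getCause() instanceof IOException) {
                throw (IOException) e.getCause();
            }
            throw new IllegalStateException(e.getCause());
        }
    }
}
```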
A code sample for demonstration of the idea from the title:
executor.submit(runnable1);
executor.submit(runnable2);
I need to be sure that runnable1 will finish before runnable2 starts, and I haven't found any guarantee of this behavior in the executors documentation.
About the problem I'm solving:
I need to write lots of logs to a file. Each log entry requires a lot of precomputing (formatting and some other stuff). So, I want to put each logging task on a kind of queue and process these tasks in a separate thread. And, of course, it's important to keep the log ordering.
A single-threaded executor will perform all tasks in the order submitted. You would only use a thread pool with multiple threads if you wanted the tasks to be performed concurrently.
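For the logging case, that could look something like the sketch below. OrderedLogWriter is a made-up name, and an in-memory list stands in for the log file. The expensive formatting runs on the single background thread, and entries come out in submission order because the executor from Executors.newSingleThreadExecutor() runs its tasks sequentially.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch: formatting happens off the caller's thread, order is kept.
class OrderedLogWriter {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    // Stand-in for the real log file.
    private final List<String> written = Collections.synchronizedList(new ArrayList<>());

    // The supplier does the expensive formatting on the background thread.
    void log(Supplier<String> formatter) {
        executor.execute(() -> written.add(formatter.get()));
    }

    // Waits for the queued entries, then returns a copy of everything written.
    List<String> closeAndGet() throws InterruptedException {
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
        return new ArrayList<>(written);
    }
}
```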
Adding tasks to a queue can be expensive in itself. You can use an Exchanger like this
http://vanillajava.blogspot.com/2011/09/exchange-and-gc-less-java.html?z#!/2011/09/exchange-and-gc-less-java.html
This avoids using a queue or creating objects.
An alternative, which is faster, is to use a memory-mapped file, which doesn't require a background thread (actually the OS is working in the background). This is much faster again. It supports sub-microsecond latencies and millions of messages per second.
https://github.com/peter-lawrey/Java-Chronicle
You could create a simple wrapper like the one below so that all your Runnables are executed in the same thread (i.e. sequentially), and submit that wrapper to the executor instead. That does not address the logging issue.
class MyRunnable implements Runnable {
private List<Runnable> runnables = new ArrayList<>();
public void add(Runnable r) {
runnables.add(r);
}
@Override
public void run() {
for (Runnable r : runnables) {
r.run();
}
}
}
//......
MyRunnable r = new MyRunnable();
r.add(runnable1);
r.add(runnable2);
executor.submit(r);
Presumably you are doing some post-analysis of the logfile? Have you considered not caring about the order they're written in and re-ordering offline later? You could allocate a unique id at submit time using a timestamp or an AtomicLong.
A code sketch (untested) would look like this:
import java.util.concurrent.atomic.AtomicLong;
class MyProcessor {
    public void work() {
        for (Object data : allData) {
            executor.submit(new MySequencedRunnable(data));
        }
    }
}
class MySequencedRunnable implements Runnable {
    private static final AtomicLong LOG_SEQUENCE_ID = new AtomicLong(0);
    private final long sequenceId;
    private final Object data;
    MySequencedRunnable(Object data) {
        // Allocated here, on the submitting thread, so the id reflects submit order.
        this.sequenceId = LOG_SEQUENCE_ID.incrementAndGet();
        this.data = data;
    }
    public void run() {
        LOGGER.log(sequenceId, data);
    }
}
Also consider, if you're using something like log4j, using NDC or MDC to assist with the re-ordering.
First of all, I must say that I am quite new to the API java.util.concurrent, so maybe what I am doing is completely wrong.
What do I want to do?
I have a Java application that basically runs 2 separate processes (called myFirstProcess and mySecondProcess), and these must run at the same time.
So, I tried to do that:
public void startMyApplication() {
ExecutorService executor = Executors.newFixedThreadPool(2);
FutureTask<Object> futureOne = new FutureTask<Object>(myFirstProcess);
FutureTask<Object> futureTwo = new FutureTask<Object>(mySecondProcess);
executor.execute(futureOne);
executor.execute(futureTwo);
while (!(futureOne.isDone() && futureTwo.isDone())) {
try {
// I wait until both processes are finished.
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
logger.info("Processing finished");
executor.shutdown();
// Do some processing on results
...
}
myFirstProcess and mySecondProcess are instances of classes that implement Callable<Object>, and all their processing is done in the call() method.
It is working quite well, but I am not sure that it is the correct way to do that.
Is this a good way to do what I want? If not, can you give me some hints to enhance my code (and still keep it as simple as possible)?
You'd be better off using the get() method.
futureOne.get();
futureTwo.get();
Both calls block until the corresponding task has finished, which saves you the sleep-and-poll loop you are using now; that loop is neither efficient nor elegant.
As a bonus, there is the overload get(long timeout, TimeUnit unit), which lets you set a maximum time to wait for the result; if the timeout elapses first, it throws a TimeoutException instead of blocking indefinitely, so your code can carry on.
See the Java API for more info.
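A minimal sketch of the timed variant, assuming you want to give up on a task that takes too long. The helper name getOrCancel and the choice to return null on timeout are illustrative only.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper around Future.get(long, TimeUnit).
class FutureTimeouts {
    // Returns the result, or null if the task did not finish in time.
    static <T> T getOrCancel(Future<T> future, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException {
        try {
            return future.get(timeout, unit); // blocks for at most the given time
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the task if it is still running
            return null;
        }
    }
}
```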
The uses of FutureTask above are tolerable, but definitely not idiomatic. You're actually wrapping an extra FutureTask around the one you submitted to the ExecutorService. Your FutureTask is treated as a Runnable by the ExecutorService. Internally, it wraps your FutureTask-as-Runnable in a new FutureTask and returns it to you as a Future<?>.
Instead, you should submit your Callable<Object> instances to a CompletionService. You drop two Callables in via submit(Callable<V>), then turn around and call CompletionService#take() twice (once for each submitted Callable). Those calls will block until one and then the other submitted tasks are complete.
Given that you already have an Executor in hand, construct a new ExecutorCompletionService around it and drop your tasks in there. Don't spin and sleep waiting; CompletionService#take() will block until one of your tasks is complete (either finished running or canceled) or the thread waiting on take() is interrupted.
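Something along these lines, as a sketch: the class and method names (CompletionOrder, runAll) are invented for the example, while ExecutorCompletionService, submit() and take() are the standard API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorCompletionService;

// Hypothetical helper: run all tasks, collect results as they complete.
class CompletionOrder {
    static <T> List<T> runAll(Executor executor, List<Callable<T>> tasks)
            throws InterruptedException, ExecutionException {
        CompletionService<T> completion = new ExecutorCompletionService<>(executor);
        for (Callable<T> task : tasks) {
            completion.submit(task);
        }
        List<T> results = new ArrayList<>();
        for (int i = 0; i < tasks.size(); i++) {
            // take() blocks until the next task has completed;
            // get() rethrows a task failure as ExecutionException.
            results.add(completion.take().get());
        }
        return results; // in completion order, not submission order
    }
}
```

With two tasks, this is the "call take() twice" approach described above; if a task fails, the failure surfaces from get() at the point where that task's result is collected.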
Yuval's solution is fine. As an alternative you can also do this:
ExecutorService executor = Executors.newFixedThreadPool(2);
FutureTask<Object> futureOne = new FutureTask<Object>(myFirstProcess);
FutureTask<Object> futureTwo = new FutureTask<Object>(mySecondProcess);
executor.execute(futureOne);
executor.execute(futureTwo);
executor.shutdown();
try {
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
} catch (InterruptedException e) {
// interrupted
}
What is the advantage of this approach? There's not a lot of difference really except that this way you stop the executor accepting any more tasks (you can do that the other way too). I tend to prefer this idiom to that one though.
Also, if either get() throws an exception you may end up in a part of your code that assumes both tasks are done, which might be bad.
You can use the invokeAll(Collection ...) method:
package concurrent.threadPool;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
public class InvokeAll {
    public static void main(String[] args) throws Exception {
        ExecutorService service = Executors.newFixedThreadPool(5);
        // invokeAll blocks until every task has completed (normally or with an exception)
        List<Future<String>> futureList = service.invokeAll(Arrays.<Callable<String>>asList(new Task1(), new Task2()));
        // get() on the failed task rethrows its exception wrapped in an ExecutionException
        System.out.println(futureList.get(1).get());
        System.out.println(futureList.get(0).get());
    }
    private static class Task1 implements Callable<String> {
        @Override
        public String call() throws Exception {
            Thread.sleep(1000 * 10);
            return "1000 * 5";
        }
    }
    private static class Task2 implements Callable<String> {
        @Override
        public String call() throws Exception {
            Thread.sleep(1000 * 2);
            int i = 3;
            if (i == 3)
                throw new RuntimeException("Its Wrong");
            return "1000 * 2";
        }
    }
}
You may want to use a CyclicBarrier if you are interested in starting the threads at the same time, or waiting for them to finish and then do some further processing.
See the javadoc for more information.
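As a rough sketch of the "start at the same time" use of CyclicBarrier, assuming two tasks that each get their own thread: BarrierStart and runTogether are made-up names, and waiting for completion is done with plain join() here rather than a second barrier.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

// Hypothetical sketch: neither task begins until both threads are ready.
class BarrierStart {
    // Returns once both tasks have finished.
    static void runTogether(Runnable first, Runnable second) throws InterruptedException {
        CyclicBarrier startLine = new CyclicBarrier(2);
        Thread a = new Thread(() -> awaitThenRun(startLine, first));
        Thread b = new Thread(() -> awaitThenRun(startLine, second));
        a.start();
        b.start();
        a.join();
        b.join();
    }

    private static void awaitThenRun(CyclicBarrier startLine, Runnable task) {
        try {
            startLine.await(); // released only when both threads have arrived
            task.run();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (BrokenBarrierException e) {
            // The other thread was interrupted while waiting; skip the task.
        }
    }
}
```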
If you have more than 2 future tasks, please consider ListenableFuture.
When several operations should begin as soon as another operation
starts -- "fan-out" -- ListenableFuture just works: it triggers all of
the requested callbacks. With slightly more work, we can "fan-in," or
trigger a ListenableFuture to get computed as soon as several other
futures have all finished.