As far as I know, ExecutorCompletionService hands back the results of the Future objects regardless of the order in which the tasks were submitted to the inbound queue, i.e. whichever task completes first has its result put into the outbound queue. On the other hand, FixedThreadPool also executes tasks in parallel, so what is the difference between the two? (I'm not sure whether FixedThreadPool gives the output sequentially, in the order the tasks were fed to the inbound queue.)
Thanks.
FixedThreadPool is one of the variations of Executor. It uses the ThreadPoolExecutor class with the same value for corePoolSize and maximumPoolSize. This means that if you create a FixedThreadPool with 10 threads, it never runs more than 10 threads, and once they have been started it keeps them alive. If any of these threads is terminated by a failing task, the thread pool creates a new one to keep the required number.
CompletionService arranges that submitted tasks are, upon completion, placed on a queue.
This means that the results of all submitted tasks end up in a queue, and you can process them later.
When you submit a task to a CompletionService, it wraps the task so that the result of the asynchronous task is saved to the queue once it completes. It doesn't create parallelism itself; instead, CompletionService delegates to an Executor that runs the tasks on its threads. You can pass it a FixedThreadPool, for example.
All tasks submitted to a FixedThreadPool or a CompletionService run in parallel, with no guarantee about the order in which they complete.
CompletionService can be used when you need to know when all of your tasks are finished. Example:
// Task extends Callable<Result>
List<Task> tasks = new ArrayList<Task>();
CompletionService<Result> cs = new ExecutorCompletionService<Result>(Executors.newFixedThreadPool(10));
tasks.forEach(task -> cs.submit(task));
for (int i = 0; i < tasks.size(); i++) { // you should know the exact number of submitted tasks
    Result r = cs.take().get();
    // process r
}
FixedThreadPool can be used in any other case, when you want to run tasks in parallel without waiting for the results.
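To make the ordering difference concrete, here is a minimal sketch. The class name CompletionOrderDemo, the sleeper helper and the delay values are made up for illustration; only the java.util.concurrent calls are real. With a plain pool you iterate your own list of Futures, so you see results in submission order, while take() on the CompletionService returns whichever task finished next.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class CompletionOrderDemo {

    // Hypothetical task: sleeps for the given time, then returns it as its "result".
    static Callable<Integer> sleeper(final int millis) {
        return () -> {
            Thread.sleep(millis);
            return millis;
        };
    }

    // Plain pool: we walk our own Future list, so results are read in submission order.
    static List<Integer> viaFutures(int[] delays) {
        ExecutorService pool = Executors.newFixedThreadPool(delays.length);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int d : delays) {
                futures.add(pool.submit(sleeper(d)));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get()); // blocks on each future in turn
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // CompletionService on the same kind of pool: take() yields the next finished task.
    static List<Integer> viaCompletionService(int[] delays) {
        ExecutorService pool = Executors.newFixedThreadPool(delays.length);
        try {
            CompletionService<Integer> cs = new ExecutorCompletionService<>(pool);
            for (int d : delays) {
                cs.submit(sleeper(d));
            }
            List<Integer> results = new ArrayList<>();
            for (int i = 0; i < delays.length; i++) {
                results.add(cs.take().get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] delays = {600, 300, 50};
        System.out.println("Future list:       " + viaFutures(delays));
        System.out.println("CompletionService: " + viaCompletionService(delays));
    }
}
```

In both cases the tasks themselves run in parallel; the only thing that changes is the order in which the caller gets to see the results.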
Also, note the difference between FixedThreadPool and CachedThreadPool. The first one is usually used when you need to keep threads alive and limit their number. The second one is bounded only by system resources: it creates as many threads as it needs to run the submitted tasks in parallel. If a thread in a CachedThreadPool stays idle, it is removed automatically after a timeout (60 seconds by default).
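A rough sketch of how the two differ in configuration. The constructor arguments below are what the OpenJDK implementations of Executors.newFixedThreadPool and Executors.newCachedThreadPool use, as far as I know; the class PoolConfigDemo and its method names are invented for the example.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolConfigDemo {

    // Fixed-style pool: core == max, unbounded queue holds the waiting tasks.
    static ThreadPoolExecutor fixedStyle(int n) {
        return new ThreadPoolExecutor(n, n, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    // Cached-style pool: no core threads, practically unbounded maximum,
    // idle threads are reclaimed after 60 seconds, tasks are handed off directly.
    static ThreadPoolExecutor cachedStyle() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    static String describe(ThreadPoolExecutor p) {
        return "core=" + p.getCorePoolSize()
                + " max=" + p.getMaximumPoolSize()
                + " keepAliveSeconds=" + p.getKeepAliveTime(TimeUnit.SECONDS);
    }

    public static void main(String[] args) {
        ThreadPoolExecutor fixed = fixedStyle(10);
        ThreadPoolExecutor cached = cachedStyle();
        System.out.println("fixed:  " + describe(fixed));
        System.out.println("cached: " + describe(cached));
        fixed.shutdown();
        cached.shutdown();
    }
}
```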
In my code, I have a class containing a static final variable
private static final ForkJoinPool pool = new ForkJoinPool(availableCPUs - 1);
I have a long running task submitted to the pool, which would take all the CPU resource. Any other tasks submitted would be hanging.
However, when I switched to create a common pool
private static final ForkJoinPool pool = ForkJoinPool.commonPool();
All the tasks can be submitted and executed.
I was just wondering what the differences are between these two pieces of code. commonPool() still calls new ForkJoinPool() and passes the availableCPUs - 1
Also I noticed that commonPool() uses a factory of type SafeForkJoinWorkerThreadFactory while new ForkJoinPool() uses ForkJoinPool$DefaultForkJoinWorkerThreadFactory. Does this matter?
Thank you very much!
I think I figured it out.
ForkJoin maintains two types of queues: one general inbound queue, and one worker queue for each worker thread. All worker threads fetch from the general inbound queue first and populate their own worker queues. After a worker thread finishes all the tasks in its worker queue, it tries to steal from other worker threads. If there is no task to steal from other worker threads, the worker thread fetches from the general inbound queue again.
However, with the common pool, the main thread also helps to process the tasks. The main thread does not have a worker queue, though. Therefore, after finishing one task, the main thread is able to fetch from the general inbound queue.
Since the ForkJoin queues are LIFO by default, the main thread is able to fetch the last submitted tasks.
Documentation says:
The common pool is by default constructed with default parameters.
ForkJoinPool()
Creates a ForkJoinPool with parallelism equal to Runtime.availableProcessors(), using the default thread factory, no UncaughtExceptionHandler, and non-async LIFO processing mode.
So what makes you think that new ForkJoinPool(availableCPUs - 1) and ForkJoinPool.commonPool() would be pools of the same size?
If you only have 2 CPUs, then availableCPUs - 1 means you're creating a pool of 1 thread, i.e. it can only process one task at a time, so a long-running task will block all other tasks.
But with 2 CPUs, availableProcessors() means you're getting a common pool with 2 threads, i.e. it can process other tasks while a single long-running task is being processed.
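Rather than assuming the sizes, you can print them. This is only a small sketch (the class name ForkJoinParallelismDemo and the helper are made up); getParallelism() and the static getCommonPoolParallelism() are real ForkJoinPool methods. Note that the common pool's parallelism can also be overridden with the java.util.concurrent.ForkJoinPool.common.parallelism system property, so the numbers depend on your JVM and machine.

```java
import java.util.concurrent.ForkJoinPool;

final class ForkJoinParallelismDemo {

    // Builds a pool the way the question does and reports its parallelism.
    static int customParallelism(int requested) {
        ForkJoinPool pool = new ForkJoinPool(requested);
        try {
            return pool.getParallelism();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // new ForkJoinPool(0) throws IllegalArgumentException, hence the max(1, ...).
        int requested = Math.max(1, cpus - 1);
        System.out.println("availableProcessors:        " + cpus);
        System.out.println("new ForkJoinPool(cpus - 1): " + customParallelism(requested));
        System.out.println("commonPool parallelism:     " + ForkJoinPool.commonPool().getParallelism());
        System.out.println("getCommonPoolParallelism(): " + ForkJoinPool.getCommonPoolParallelism());
    }
}
```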
I have a server that has multiple worker threads implemented using java executorService (ie. a thread pool)
My problem is that I am not able to log, every second, the number of jobs waiting to be processed when no idle thread is available in the thread pool.
NOTE: Logging itself is not my problem. I want to be able to see how many tasks/jobs are waiting to be processed by a worker thread. Is there any way to see the length of the waiting queue (not the thread pool) inside the executor service?
I have no idea how to implement this thing.
The ThreadPoolExecutor constructor accepts a BlockingQueue parameter which is the Queue implementation used to store the waiting jobs. You can request this queue using getQueue() method, then check the size of the queue:
System.out.println("Number of waiting jobs: "+executor.getQueue().size());
Note that this method is not available in the ExecutorService interface, thus it's better to construct the ThreadPoolExecutor explicitly instead of using Executors.newFixedThreadPool and friends:
ThreadPoolExecutor executor = new ThreadPoolExecutor(nThreads, nThreads,
0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<Runnable>());
While Executors.newFixedThreadPool in OpenJDK/OracleJDK does the same, it's not specified, thus using (ThreadPoolExecutor)Executors.newFixedThreadPool(nThreads) may cause ClassCastException in future Java versions or alternative JDK implementations.
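Putting this together with the "every second" requirement from the question, one possible sketch is a separate single-thread scheduler that samples the queue size. The class QueueDepthDemo, the pool size of 2 and the latch used to keep the workers busy are all assumptions made for the demo.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class QueueDepthDemo {

    static ThreadPoolExecutor newPool(int nThreads) {
        return new ThreadPoolExecutor(nThreads, nThreads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    // Submits `tasks` jobs that all block on `gate`, then returns how many sit in the queue.
    static int queuedWhileBlocked(ThreadPoolExecutor pool, int tasks, CountDownLatch gate) {
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    gate.await(); // stand-in for a long-running job
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        return pool.getQueue().size();
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newPool(2);
        CountDownLatch gate = new CountDownLatch(1);

        // Separate scheduler thread that samples the queue once per second.
        ScheduledExecutorService monitor = Executors.newSingleThreadScheduledExecutor();
        monitor.scheduleAtFixedRate(
                () -> System.out.println("Number of waiting jobs: " + pool.getQueue().size()),
                0, 1, TimeUnit.SECONDS);

        queuedWhileBlocked(pool, 10, gate);
        Thread.sleep(2500);  // let the monitor print a few samples
        gate.countDown();    // release the workers
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        monitor.shutdownNow();
    }
}
```

The value is a snapshot: tasks that a worker has already picked up are no longer in the queue, so the number only covers jobs that are still waiting.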
If you can assume that the ExecutorService implementation used by your server is ThreadPoolExecutor, then you can use the method getQueue(), which returns the queue of tasks that have not been assigned to a worker yet. The size() of that queue is the number of waiting tasks.
/**
 * Returns the task queue used by this executor. Access to the
 * task queue is intended primarily for debugging and monitoring.
 * This queue may be in active use. Retrieving the task queue
 * does not prevent queued tasks from executing.
 *
 * @return the task queue
 */
public BlockingQueue<Runnable> getQueue() {
    return workQueue;
}
So you can run something like this:
if(LOGGER.isDebugEnabled()) {
LOGGER.debug(String.format("Pending tasks: %d", executor.getQueue().size()));
}
Just as a suggestion use ThreadPoolExecutor instead of ExecutorService.
You can take advantage of the blocking queue present in the ThreadPoolExecutor class. This would give you the count of tasks waiting.
Also, the ThreadPoolExecutor class has methods to get the counts of submitted and executed tasks.
Refer to the ThreadPoolExecutor and BlockingQueue documentation.
Hope this helps
I'd suggest keeping a counter of the tasks added to the pool and a counter of the tasks completed/started by the pool. Simple and works with any threading paradigm.
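A minimal sketch of that counter idea, assuming you control the place where tasks are handed to the pool. CountingSubmitter is a hypothetical helper, not a JDK class. The derived numbers are approximate because the counters are read one after another.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical wrapper that counts tasks as they are submitted, started and completed.
final class CountingSubmitter {
    private final ExecutorService delegate;
    private final AtomicLong submitted = new AtomicLong();
    private final AtomicLong started = new AtomicLong();
    private final AtomicLong completed = new AtomicLong();

    CountingSubmitter(ExecutorService delegate) {
        this.delegate = delegate;
    }

    void execute(final Runnable task) {
        submitted.incrementAndGet();
        try {
            delegate.execute(() -> {
                started.incrementAndGet();
                try {
                    task.run();
                } finally {
                    completed.incrementAndGet(); // counted even if the task throws
                }
            });
        } catch (RejectedExecutionException e) {
            submitted.decrementAndGet(); // the pool never accepted this task
            throw e;
        }
    }

    long waiting()        { return submitted.get() - started.get(); }
    long running()        { return started.get() - completed.get(); }
    long completedCount() { return completed.get(); }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountingSubmitter counting = new CountingSubmitter(pool);
        for (int i = 0; i < 20; i++) {
            counting.execute(() -> {
                try {
                    Thread.sleep(50); // stand-in for real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        System.out.println("waiting=" + counting.waiting() + " running=" + counting.running());
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("completed=" + counting.completedCount());
    }
}
```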
Is there any way to know at a given point in time how many runnables are waiting to be executed by the ExecutorService? For example, assuming that the loop is invoking the execute method faster than the runnables can complete and a surplus accumulates, is there any way to get a running count of the accumulated runnables?
ExecutorService es = Executors.newFixedThreadPool(50);
while (test) {
es.execute(new MyRunnable());
}
Is there any way to know at a given point in time how many runnables are waiting to be executed by the ExecutorService.
Yes. Instead of using the Executors... calls, you should instantiate your own ThreadPoolExecutor. Below is what the Executors.newFixedThreadPool(50) is returning:
ThreadPoolExecutor threadPool = new ThreadPoolExecutor(50, 50,
0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
Once you have the ThreadPoolExecutor, it has a number of getter methods that give you data on the pool. The number of outstanding jobs should be:
threadPool.getQueue().size();
Also available from ThreadPoolExecutor are:
getActiveCount()
getCompletedTaskCount()
getCorePoolSize()
getLargestPoolSize()
getMaximumPoolSize()
getPoolSize()
getTaskCount()
If you want to throttle the number of jobs in the queue so you don't get too far ahead, you should use a bounded BlockingQueue and the RejectedExecutionHandler. See my answer here: Process Large File for HTTP Calls in Java
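As a rough illustration of throttling, here is a sketch with an ArrayBlockingQueue and the built-in ThreadPoolExecutor.CallerRunsPolicy. That policy is my choice for the example and not necessarily what the linked answer uses: when the queue is full, the submitting thread runs the task itself, which slows the producer down. The class name and the sizes (4 threads, capacity 10) are arbitrary.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedQueueDemo {

    // Pool whose queue never holds more than queueCapacity waiting tasks.
    static ThreadPoolExecutor boundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // Submits `jobs` short tasks, waits for the pool to drain, returns how many ran.
    static int runJobs(ThreadPoolExecutor pool, int jobs) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < jobs; i++) {
            // When the queue is full, this call runs the task on the calling thread.
            pool.execute(() -> {
                try {
                    Thread.sleep(2); // stand-in for real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = boundedPool(4, 10);
        System.out.println("jobs executed: " + runJobs(pool, 200));
    }
}
```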
You can use ThreadPoolExecutor implementation and call toString() method.
Returns a string identifying this pool, as well as its state, including indications of run state and estimated worker and task counts.
There are more methods in this implementation to get you different type of counts.
You can use following two methods:
public long getTaskCount()
Returns the approximate total number of tasks that have ever been scheduled for execution. Because the states of tasks and threads may change dynamically during computation, the returned value is only an approximation.
public long getCompletedTaskCount()
Returns the approximate total number of tasks that have completed execution. Because the states of tasks and threads may change dynamically during computation, the returned value is only an approximation, but one that does not ever decrease across successive calls.
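For example, the difference between the two gives an estimate of the tasks that are still queued or running. A small sketch, where the class name TaskCountDemo and the helper outstanding are made up, and the result is only an approximation while the pool is active:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class TaskCountDemo {

    // Approximate number of tasks scheduled but not yet finished (queued + running).
    static long outstanding(ThreadPoolExecutor pool) {
        return pool.getTaskCount() - pool.getCompletedTaskCount();
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(2, 2, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        for (int i = 0; i < 6; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(200); // stand-in for real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        System.out.println("outstanding (approx): " + outstanding(pool));
        System.out.println(pool); // toString() includes pool size, active, queued and completed counts

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("completed: " + pool.getCompletedTaskCount());
    }
}
```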
Cheers !!
Is there any way to find the number of tasks completed after a call to invokeAll()? It seems that that returns the list of booleans of completed tasks when all of the threads are completed.
I have a pool of 1000 tasks and want to take a look at them in intervals of 100 without having to divide them into 100-task batches.
For compatibility reasons I also have to work with Java 6, so newer methods won't help.
Also, as a side question: does invokeAll() process the tasks in a FIFO manner? That is, do the tasks get started in the order in which they were added to the task list?
Thanks
I have a pool of 1000 tasks and want to take a look at them in intervals of 100 without having to divide them into 100-task batches.
You should consider using an ExecutorCompletionService which allows you to get notified once a single job has finished instead of having to wait for all jobs to complete using invokeAll(). Then you can put each of the finished jobs into a collection and then act on them when you get 100.
Maybe something like:
CompletionService<Result> ecs = new ExecutorCompletionService<Result>(executor);
for (Callable<Result> s : solvers)
    ecs.submit(s);
int n = solvers.size();
List<Result> batch = new ArrayList<Result>();
for (int i = 0; i < n; ++i) {
    Result r = ecs.take().get();
    batch.add(r);
    if (batch.size() >= 100) {
        process(batch);
        batch.clear();
    }
}
if (!batch.isEmpty()) {
    process(batch);
}
does invokeAll() process the tasks in a FIFO manner? That is, do the tasks get started in the order in which they were added to the task list?
The tasks are submitted to the thread pool in FIFO order and are also dequeued by the threads in FIFO order. However, once each thread has a job, there are race conditions which may cause some re-ordering of the actual task "start" and certainly of the finish.
Tasks will be added in the order you specify, and approximately started in that order if you have multiple threads. The order they are completed will be roughly in that order if they take the same amount of time.
I would build a List<Future> which you can poll periodically to see how many are done.
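A sketch of the polling idea, with made-up names (FuturePollingDemo, countDone) and arbitrary sizes. It uses lambdas and the diamond operator for brevity, so it assumes Java 8 or later. On Java 6 you would write anonymous Callable classes instead, while the isDone() polling itself stays the same.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class FuturePollingDemo {

    // Counts how many of the given futures have finished (normally, by exception, or cancelled).
    static int countDone(List<? extends Future<?>> futures) {
        int done = 0;
        for (Future<?> f : futures) {
            if (f.isDone()) {
                done++;
            }
        }
        return done;
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            final int id = i;
            futures.add(pool.submit(() -> {
                Thread.sleep(5); // stand-in for real work
                return id;
            }));
        }
        pool.shutdown(); // no new tasks; already submitted ones keep running

        int done;
        do {
            Thread.sleep(100); // poll periodically
            done = countDone(futures);
            System.out.println(done + " of " + futures.size() + " done");
        } while (done < futures.size());
    }
}
```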
I have a CPU-intensive application, which can be written in Java. The application consists of a few jobs (threads) that run independently.
If I start all the threads at once, the system will be overloaded. How could I start at most n threads at once, and when one thread finishes then a new one is started? By limiting the number of threads running at once, I intend to leave some other processors/cores available for other tasks.
Thanks
You should formulate your threads' tasks as Runnables or Callables and then submit them to a fixed thread pool executor. The executor will manage a pool of worker threads and run your tasks from an internal queue on those threads.
See Executors#newFixedThreadPool for a factory method that creates the type of executor you want.
Use a fixed size executor pool.
ExecutorService executor = Executors.newFixedThreadPool(4);
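To see that the pool really caps how many jobs run at once, here is a small sketch that records the peak number of tasks running simultaneously. The class MaxConcurrencyDemo, the sleep standing in for CPU work, and the choice of "cores minus one" threads are assumptions for the demo.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class MaxConcurrencyDemo {

    // Runs `jobs` short tasks on a fixed pool of `n` threads and
    // returns the highest number of tasks observed running at the same time.
    static int peakConcurrency(int n, int jobs) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(n);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < jobs; i++) {
            executor.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20); // stand-in for a CPU-intensive job
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        executor.shutdown();                 // queued jobs start as threads free up
        executor.awaitTermination(30, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Leave one core free for other work, but always use at least one thread.
        int n = Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
        System.out.println("pool size: " + n);
        System.out.println("peak concurrent jobs: " + peakConcurrency(n, 40));
    }
}
```

All the jobs can be submitted up front; the ones that do not get a thread simply wait in the executor's internal queue until a worker becomes free.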