I have a CPU-intensive application, which can be written in Java. The application consists of a few jobs (threads) that run independently.
If I start all the threads at once, the system will be overloaded. How can I run at most n threads at once, starting a new one whenever a running one finishes? By limiting the number of threads running at once, I intend to leave some processors/cores available for other tasks.
Thanks
You should formulate your threads' tasks as Runnables or Callables and then submit them to a fixed thread pool executor. The executor will manage a pool of worker threads and run your tasks from an internal queue on those threads.
See Executors#newFixedThreadPool for a factory method that creates the type of executor you want.
Use a fixed size executor pool.
ExecutorService executor = Executors.newFixedThreadPool(4);
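A minimal sketch of how this could look end to end. The class name FixedPoolDemo, the busyWork() placeholder job, and the choice of "cores minus one" for n are illustrative assumptions, not part of the question. The counter only exists to show that no more than n jobs run at the same time; the remaining jobs wait in the executor's queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class FixedPoolDemo {
    /** Runs jobCount jobs on at most maxThreads threads and returns the peak number seen running at once. */
    static int runJobs(int jobCount, int maxThreads) {
        ExecutorService executor = Executors.newFixedThreadPool(maxThreads);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < jobCount; i++) {
                futures.add(executor.submit(() -> {
                    int now = running.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    busyWork();                      // hypothetical stand-in for a real job
                    running.decrementAndGet();
                }));
            }
            for (Future<?> f : futures) {
                f.get();                             // wait for every job to finish
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();                     // let the worker threads exit
        }
        return peak.get();
    }

    /** Placeholder CPU-bound work. */
    private static long busyWork() {
        long sum = 0;
        for (int i = 0; i < 5_000_000; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumption: leave one core free for other work, but use at least one thread.
        int n = Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
        System.out.println("peak concurrent jobs: " + runJobs(20, n));
    }
}
```

The peak can never exceed n because the pool never has more than n worker threads.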
Related
I have Executors.newFixedThreadPool(/* nThreads= */ 2) executor service. I noticed that sometimes when I pass TWO tasks to the executor service, it runs only ONE task, while I expect it to run TWO tasks. Is that possible and why?
I have two tasks which communicate with each other. These two tasks are put inside fixed thread pool of size two because I want both tasks to be running at the same time.
Executors make sure the thread pool is reused as efficiently as possible, but they don't guarantee that all tasks run at the same time. Could you use two threads from two separate thread pools, each with only one thread?
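One way to check whether both tasks really run at the same time is to let them rendezvous on a latch. This is only a sketch: the class name TwoTasksDemo and the 2-second timeout are made up for illustration. It assumes the pool is otherwise idle; if other tasks already occupy one of the two threads, the second task waits in the queue and the check fails.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class TwoTasksDemo {
    /** Returns true if both tasks were running at the same moment. */
    static boolean bothRanTogether() {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        CountDownLatch bothStarted = new CountDownLatch(2);
        // Each task announces itself, then waits (with a timeout) for the other one.
        Callable<Boolean> task = () -> {
            bothStarted.countDown();
            return bothStarted.await(2, TimeUnit.SECONDS);
        };
        try {
            Future<Boolean> first = executor.submit(task);
            Future<Boolean> second = executor.submit(task);
            return first.get() && second.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("both tasks ran together: " + bothRanTogether());
    }
}
```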
In my code, I have a class containing a static final variable
private static final ForkJoinPool pool = new ForkJoinPool(availableCPUs - 1);
I have a long-running task submitted to the pool, which takes all the CPU resources. Any other tasks submitted afterwards hang.
However, when I switched to the common pool
private static final ForkJoinPool pool = ForkJoinPool.commonPool();
All the tasks can be submitted and executed.
I was just wondering what the differences are between these two pieces of code. commonPool() still calls new ForkJoinPool() and passes availableCPUs - 1.
Also I noticed that commonPool() uses a factory of type SafeForkJoinWorkerThreadFactory while new ForkJoinPool() uses ForkJoinPool$DefaultForkJoinWorkerThreadFactory. Does this matter?
Thank you very much!
I think I figured it out.
ForkJoinPool maintains two types of queues: one general inbound queue, and one worker queue per worker thread. All worker threads fetch from the general inbound queue first and populate their own worker queues. After a worker thread finishes all the tasks in its queue, it tries to steal from other worker threads. If there are no tasks to steal from other worker threads, the worker thread fetches from the general inbound queue again.
However, with common pool, the main thread will also help to process the tasks. The main thread does not have a worker queue though. Therefore, after finishing one task, the main thread will be able to fetch from general inbound queue.
Since the ForkJoin queues are LIFO by default, the main thread will be able to fetch the most recently submitted tasks.
Documentation says:
The common pool is by default constructed with default parameters.
ForkJoinPool()
Creates a ForkJoinPool with parallelism equal to Runtime.availableProcessors(), using the default thread factory, no UncaughtExceptionHandler, and non-async LIFO processing mode.
So what makes you think that new ForkJoinPool(availableCPUs - 1) and ForkJoinPool.commonPool() would be pools of the same size?
If you only have 2 CPUs, then availableCPUs - 1 means you're creating a pool of 1 thread, i.e. it can only process one task at a time, so a long-running task will block all other tasks.
But with 2 CPUs, availableProcessors() means you're getting a common pool with 2 threads, i.e. it can process other tasks while a single long-running task is being processed.
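Rather than reasoning about the sizes, it may be easiest to print them on the machine in question. A small sketch, with the class name PoolSizeCheck and the helper customParallelism made up for illustration. No expected output is shown because the values depend on the number of cores and on JVM settings such as the java.util.concurrent.ForkJoinPool.common.parallelism system property.

```java
import java.util.concurrent.ForkJoinPool;

class PoolSizeCheck {
    /** Parallelism of a pool built like the one in the question, for a given CPU count. */
    static int customParallelism(int availableCpus) {
        // new ForkJoinPool(0) throws IllegalArgumentException, so clamp to at least 1.
        ForkJoinPool pool = new ForkJoinPool(Math.max(1, availableCpus - 1));
        try {
            return pool.getParallelism();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors     = " + cpus);
        System.out.println("custom pool parallelism = " + customParallelism(cpus));
        System.out.println("common pool parallelism = " + ForkJoinPool.getCommonPoolParallelism());
    }
}
```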
When creating a thread pool with
Executors.newScheduledThreadPool(42);
I can schedule tasks in it, as the thread pool is of type ScheduledExecutorService. ScheduledExecutorService is a subinterface of ExecutorService, so I can also submit normal Runnables or Callables.
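To make the setup concrete, here is a small sketch of one pool handling both kinds of work. The class name SharedSchedulerDemo, the pool size of 2, and the 50 ms delay are arbitrary choices for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class SharedSchedulerDemo {
    /** Runs one delayed task and one plain task on the same pool and joins their results. */
    static String runBoth() {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        try {
            // Scheduled work: runs once, after a delay.
            ScheduledFuture<String> delayed =
                    pool.schedule(() -> "scheduled", 50, TimeUnit.MILLISECONDS);
            // Ordinary work: submitted through the inherited ExecutorService API.
            Future<String> immediate = pool.submit(() -> "submitted");
            return immediate.get() + "+" + delayed.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runBoth());
    }
}
```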
Is it advisable to share one single thread pool in the application or would it be better to have two separate ones?
The scheduled tasks are not that time-critical, and their on-time execution cannot be guaranteed even with a separate thread pool if there are too many waiting threads.
If I have two separate thread pools, what is a good size for them, based on:
Number of tasks scheduled (you can assume that the number is constant)
Number of CPU/cores.
Is it possible to have a set of thread pools that share threads from a large thread pool instead of creating new threads?
In our RESTful API application, a request may involve several parallel tasks. To improve performance, we want to execute parallel tasks in a thread pool which has a fixed number (say 200) of threads. But we also want to limit the maximum number of threads that each request can use. So I am wondering whether it is possible to create a sub thread pool with a maximum pool size for each request, which does not create threads itself but tries to fetch a thread from the global thread pool, and puts the job into a queue if no threads are available.
Has anyone done something similar? Or is there any other better solution?
Thanks!!
Instead of thread pools, think of executors. An executor consists of two things: a job queue and a thread pool. What you need is a light-weight executor for each request which has a job queue but no threads. Instead, it passes jobs from its queue to the main executor (which owns all the threads). The trick is that the light-weight executor counts how many jobs it has submitted to the main executor, and stops passing jobs once that number reaches the limit. Each job, before being passed to the main executor, is wrapped in an object of type Runnable which a) holds a reference to the parent light-weight executor, b) executes the wrapped job and c) notifies the referenced executor when the job is finished, so that the executor can pass another job to the main executor (or just decrease the job counter, if there are no jobs in the queue).
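A rough sketch of such a light-weight executor, to make the idea concrete. The names BoundedExecutor and BoundedExecutorDemo are hypothetical, and the sketch ignores the case where the main executor rejects a job (the counter would then need to be corrected).

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-request executor: owns a queue and a counter, but no threads. */
class BoundedExecutor implements Executor {
    private final Executor main;                 // the executor that owns all the threads
    private final int limit;                     // max jobs this request may run at once
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private int inFlight;                        // jobs currently handed to the main executor

    BoundedExecutor(Executor main, int limit) {
        this.main = main;
        this.limit = limit;
    }

    @Override
    public void execute(Runnable job) {
        synchronized (this) {
            if (inFlight >= limit) {             // limit reached: keep the job in our own queue
                pending.add(job);
                return;
            }
            inFlight++;
        }
        main.execute(wrap(job));
    }

    /** Wraps a job so that this executor is notified when the job finishes. */
    private Runnable wrap(Runnable job) {
        return () -> {
            try {
                job.run();
            } finally {
                onJobFinished();
            }
        };
    }

    private void onJobFinished() {
        Runnable next;
        synchronized (this) {
            next = pending.poll();
            if (next == null) {                  // nothing queued: just free the slot
                inFlight--;
                return;
            }
        }
        main.execute(wrap(next));                // hand the freed slot to the next queued job
    }
}

class BoundedExecutorDemo {
    /** Returns the peak number of jobs of one "request" that ran at the same time. */
    static int peakConcurrency(int poolSize, int limit, int jobs) {
        ExecutorService mainPool = Executors.newFixedThreadPool(poolSize);
        BoundedExecutor perRequest = new BoundedExecutor(mainPool, limit);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(jobs);
        try {
            for (int i = 0; i < jobs; i++) {
                perRequest.execute(() -> {
                    peak.accumulateAndGet(running.incrementAndGet(), Math::max);
                    try {
                        Thread.sleep(10);        // stand-in for real work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    running.decrementAndGet();
                    done.countDown();
                });
            }
            done.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            mainPool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak for this request: " + peakConcurrency(8, 2, 10));
    }
}
```

With a main pool of 8 threads and a per-request limit of 2, the request never occupies more than 2 of the shared threads, and no extra threads are created.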
You could create a thread pool for every task, e.g. Executors.newFixedThreadPool(10). This will do what you ask for, with the inefficiency of potentially creating threads that a particular task instance doesn't need.
I read a great article about the fork-join framework in Java 7. The idea is that, with ForkJoinPool and ForkJoinTask, the threads in the pool can pick up subtasks from other tasks, so the pool is able to use fewer threads to handle more tasks.
Then I tried to use a normal ExecutorService to do the same work, and found I can't tell the difference, since when I submit a new task to the pool, the task will be run on another available thread.
The only difference I can tell is that with ForkJoinPool I don't need to pass the pool to the tasks, because I can call task.fork() to make a task run on another thread. With a normal ExecutorService, I have to pass the pool to the task, or make it static, so that inside the task I can call pool.submit(newTask).
Am I missing something?
(You can view the live code at https://github.com/freewind/fork-join-test/tree/master/src)
Although ForkJoinPool implements ExecutorService, it is conceptually different from 'normal' executors.
You can easily see the difference if your tasks spawn more tasks and wait for them to complete, e.g. by calling
executor.submit(new Task()).get(); // blocks this thread until the new task completes
In a normal executor service, waiting for other tasks to complete will block the current thread. There are two possible outcomes: If your executor service has a fixed number of threads, it might deadlock if the last running thread waits for another task to complete. If your executor dynamically creates new threads on demand, the number of threads might explode and you end up having thousands of threads which might cause starvation.
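The fixed-size case can be reproduced in a few lines. This is a sketch with made-up names (NestedWaitDemo, innerTaskStarves) and a pool of one thread; the inner wait uses a timeout so that the demo reports the problem instead of hanging forever.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class NestedWaitDemo {
    /** Returns true if the inner task could not run while the outer task was waiting for it. */
    static boolean innerTaskStarves() {
        ExecutorService executor = Executors.newFixedThreadPool(1);
        try {
            Future<Boolean> outer = executor.submit(() -> {
                // The only worker thread is busy running this outer task,
                // so the inner task stays in the queue.
                Future<String> inner = executor.submit(() -> "inner done");
                try {
                    inner.get(200, TimeUnit.MILLISECONDS);
                    return false;                // inner task completed: no starvation
                } catch (TimeoutException e) {
                    return true;                 // without the timeout this would block forever
                }
            });
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();              // discard the still-queued inner task
        }
    }

    public static void main(String[] args) {
        System.out.println("inner task starved: " + innerTaskStarves());
    }
}
```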
In contrast, the fork/join framework reuses the waiting thread in the meantime to execute other tasks, so it won't deadlock even though the number of threads is fixed:
new MyForkJoinTask().invoke();
So if you have a problem that you can solve recursively, think of using a ForkJoinPool, as you can easily implement one level of recursion as a ForkJoinTask.
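As an illustration, here is a sketch of a recursive sum written as a RecursiveTask. The class name RangeSum and the threshold of 1,000 are arbitrary choices for this example. Each task splits its range in two, forks one half, and computes the other half itself; join() does not tie up the worker while it waits, so this completes even with a parallelism of 1.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

/** Sums the integers in the half-open range [from, to) by recursive splitting. */
class RangeSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;  // below this size, sum sequentially
    private final long from;
    private final long to;

    RangeSum(long from, long to) {
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (long i = from; i < to; i++) {
                sum += i;
            }
            return sum;
        }
        long mid = from + (to - from) / 2;
        RangeSum left = new RangeSum(from, mid);
        RangeSum right = new RangeSum(mid, to);
        left.fork();                             // queue the left half for this or another worker
        long rightResult = right.compute();      // compute the right half on the current thread
        return left.join() + rightResult;        // the worker can run other tasks while joining
    }

    /** Convenience helper: runs the sum on a pool with the given parallelism. */
    static long sum(long from, long to, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new RangeSum(from, to));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Deliberately use a single worker thread: the nested joins still complete.
        System.out.println(sum(0, 1_000_000, 1));
    }
}
```

Note that the tasks never reference the pool: fork() submits to whichever pool the current task is running in.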
Just check the number of running threads in your examples.