Can fixedThreadPool have less threads than it was assigned to? - java

I have Executors.newFixedThreadPool(/* nThreads= */ 2) executor service. I noticed that sometimes when I pass TWO tasks to the executor service, it runs only ONE task, while I expect it to run TWO tasks. Is that possible and why?
I have two tasks which communicate with each other. These two tasks are put inside fixed thread pool of size two because I want both tasks to be running at the same time.

Executors make sure you will reuse the thread poll most efficiently. But it doesn't gurantee all tasks are executed all at once. I am wondering if you can use 2 threads coming from 2 threadpool which only having 1 thread?

Related

Executor pool limit number of threads at a time

I have a situation in which I have to run some 10,000 threads. Obviously, one machine cannot run these many threads in parallel. Is there any way by which we can ask Thread pool to run some specific number of threads in the beginning and as soon as one thread finishes, the threads which are left can start their processing ?
Executors.newFixedThreadPool(nThreads) is what most likely you are looking for. There will only be as many threads running at one time as the number of threads specified. And yes one machine cannot run 10,000 threads at once in parallel, but it will be able to run them concurrently. Depending on how resource intensive each thread is, it may be more efficient in your case to use
Executors.newCachedThreadPool() wherein as many threads are created as needed, and threads that have finished are reused.
Using Executors.newFixedThreadPool(10000) with invokeAll will throw an OutOfMemory exception with that many threads. You still could use it by submitting tasks to it instead of invoking all tasks at same time, that's I would say safer than just invokeAll.
For this use case. You can have a ThreadPollExecuter with Blocking Queue. http://howtodoinjava.com/core-java/multi-threading/how-to-use-blockingqueue-and-threadpoolexecutor-in-java/ this tutorial explains that very well.
It sounds like you want to run 10,000 tasks on a group of threads. A relatively simple approach is to create a List and then add all the tasks to the list, wrapping them in Runnable. Then, create a class that takes the list in the constructor and pops a Runnable of the list and then runs it. This activity must be synchronized in some manner. The class exits when the list is empty. Start some number of threads using this class. They'll burn down the list and then stop. Your main thread can monitor the length of the list.

Difference between ForkJoinPool and normal ExecutionService?

I read a great article about the fork-join framework in Java 7, and the idea is that, with ForkJoinPool and ForkJoinTask, the threads in the pool can get the sub tasks from other tasks, so it's able to use less threads to handle more tasks.
Then I tried to use a normal ExecutorService to do the same work, and found I can't tell the difference, since when I submit a new task to the pool, the task will be run on another available thread.
The only difference I can tell is if I use ForkJoinPool, I don't need to pass the pool to the tasks, because I can call task.fork() to make it running on another thread. But with normal ExecutorService, I have to pass the pool to the task, or make it a static, so inside the task, I can call pool.submit(newTask)
Do I miss something?
(You can view the living code from https://github.com/freewind/fork-join-test/tree/master/src)
Although ForkJoinPool implements ExecutorService, it is conceptionally different from 'normal' executors.
You can easily see the difference if your tasks spawn more tasks and wait for them to complete, e.g. by calling
executor.invoke(new Task()); // blocks this thread until new task completes
In a normal executor service, waiting for other tasks to complete will block the current thread. There are two possible outcomes: If your executor service has a fixed number of threads, it might deadlock if the last running thread waits for another task to complete. If your executor dynamically creates new threads on demand, the number of threads might explode and you end up having thousands of threads which might cause starvation.
In opposite, the fork/join framework reuses the thread in the meantime to execute other tasks, so it won't deadlock although the number of threads is fixed:
new MyForkJoinTask().invoke();
So if you have a problem that you can solve recursively, think of using a ForkJoinPool as you can easily implement one level of recursion as ForkJoinTask.
Just check the number of running threads in your examples.

Use only a subset of threads in an ExecutorService

In a typical JAVA application, one configures a global ExecutorService for managing a global thread pool. Lets say I configure a fixed thread pool of 100 threads:
ExecutorService threadPool = Executors.newFixedThreadPool(100);
Now lets say that I have a list of 1000 files to upload to a server, and for each upload I create a callable that will handle the upload of this one file.
List<Callable> uploadTasks = new ArrayList<Callable>();
// Fill the list with 1000 upload tasks
How can I limit the max number of concurrent uploads to, lets say, 5?
if I do
threadPool.invokeAll(uploadTasks);
I dont have control on how many threads my 1000 tasks will take. Potentially 100 uploads will run in parallel, but I only want max 5.
I would like to create some sort of sub-executor, wich uses a subset of the threads of the parent executor.
I dont want to create a new separated executorService just for upload, because i want to manage my thread pool globally.
Does any of you know how do to that or if an existing implementation exists? Ideally something like
ExecutorService uploadThreadPool = Executors.createSubExecutor(threadPool,5);
Many thanks,
Antoine
In a typical JAVA application, one configures a global ExecutorService for managing a global thread pool.
I'm not doing this, but maybe i'm atypical. :-) Back to your question:
As having a guard (possibly using a Semaphore inside your Callables will only clutter your global Executor which tasks waiting for each other, you'll have to either
use some external logic which ensures only 5 jobs are running at any time, which could in itself be a Callable submitted to your one Executor. This could be done by wrapping your download jobs with some logic which will pull the next job from a queue (containing the URLs or whatever) once one job is completed. Then you submit five of those "drain download queue" jobs to your Executor and call it done. But atypical as I am, I'd just
use a separate Executor for the downloads. This also gives you the ability to name your threads appropriately (in the Executors ThreadFactory), which might help with debugging and makes nice thread dumps.

Java ThreadPool concepts, and issues with controlling the number of actual threads

I am a newbie to Java concurrency and am a bit confused by several concepts and implementation issues here. Hope you guys can help.
Say, I have a list of tasks stored in a thread-safe list wrapper:
ListWrapper jobs = ....
'ListWrapper' has synchronized fetch/push/append functions, and this 'jobs' object will be shared by multiple worker threads.
And I have a worker 'Runnable' to execute the tasks:
public class Worker implements Runnable{
private ListWrapper jobs;
public Worker(ListWrapper l){
this.jobs=l;
}
public void run(){
while(! jobs.isEmpty()){
//fetch an item from jobs and do sth...
}
}
}
Now in the main function I execute the tasks:
int NTHREADS =10;
ExecutorService service= Executors.newFixedThreadPool(NTHREADS);
//run threads..
int x=3;
for(int i=0; i<x; i++){
service.execute(new Worker(jobs) );
}
I tested this code with 'x=3', and I found that only 3 threads are running at the same time; but as I set 'x=20', I found that only 10 (=NTHREADS) are running at the same time. Seems to me the # of actual threads is the min of the two values.
Now my questions are:
1) Which value ('x' or 'NTHREADS') should I set to control the number of concurrent threads? Or it doesn't matter in either I choose?
2) How is this approach different from simply using the Producer-Consumer pattern --creating a fixed number of 'stud' threads to execute the tasks(shown in the code below)?
Thread t1= new Worker(jobs);
Thread t2= new Worker(jobs);
...
t1.join();
t2.join();
...
Thank you very much!!
[[ There are some good answers here but I thought I'd add some more detail. ]]
I tested this code with 'x=3', and I found that only 3 threads are running at the same time; but as I set 'x=20', I found that only 10 (=NTHREADS) are running at the same time. Seems to me the # of actual threads is the min of the two values.
No, not really. I suspect that the reason you weren't seeing 20 threads is that threads had already finished or had yet to be started. If you call new Thread(...).start() 20 times then you will get 20 threads started. However, if you check immediately none of them may have actually begun to run or if you check later they may have finished.
1) Which value ('x' or 'NTHREADS') should I set to control the number of concurrent threads? Or it doesn't matter in either I choose?
Quoting the Javadocs of Executors.newFixedThreadPool(...):
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue. At any point, at most nThreads threads will be active processing tasks.
So changing the NTHREADS constant changes the number of threads running in the pool. Changing x changes the number of jobs that are executed by those threads. You could have 2 threads in the pool and submit 1000 jobs or you could have 1000 threads and only submit 1 job for them to work on.
Btw, after you have submitted all of your jobs, you should then shutdown the pool which stops all of the threads if all of the jobs have been run.
service.shutdown();
2) How is this approach different from simply using the Producer-Consumer pattern --creating a fixed number of 'stud' threads to execute the tasks(shown in the code below)?
It differs in that it does all of the heavy work for you.
You don't have to create a ListWrapper of the jobs since you get one inside of the ExecutorService. You just submit the jobs to the ExecutorService and it keeps track of them until the threads are available to run them.
You don't have to create any threads or worry about them throwing exceptions and dying because the ExecutorService starts/restarts the threads for you.
If you want your tasks to return information you can make use of the submit(Callable) method and use the Future to get the results of the jobs. Etc, etc..
Doing this code yourself is going to be harder to get right, more code to maintain, and most likely will not perform as well as the code in the JDK that is battle tested and optimized.
You shouldn't create threads by yourself when using a threadpool. Instead of WorkerThread class you should use a class that implements Runnable but is not a thread. Passing a Thread object to the threadpool won't make the thread run actually. The object will be passed to a different internal thread, which will simply execute the run method of your WorkerThread class.
The ExecutorService is simply incompatible with the way you want to write your program.
In the code you have right now, these WorkerThreads will stop to work when your ListWrapper is empty. If you then add something to the list, nothing will happen. This is definitely not what you wanted.
You should get rid of ListWrapper and simply put your tasks directly into the threadpool. The threadpool already incorporates an internal list of jobs shared between the threads. You should just submit your jobs to the threadpool and it will handle them accordingly.
To answer your questions:
1) Which value ('x' or 'NTHREADS') should I set to control the number of concurrent threads? Or it doesn't matter in either I choose?
NTHREADS, the threadpool will create the necessary number of threads.
2) How is this approach different from simply using the Producer-Consumer pattern --creating a fixed number of 'stud' threads to execute the tasks(shown in the code below)?
It's just that ExecutorService automates a lot of things for you. You can choose from a lot of different implementations of threadpools and you can substitute them easily. You can use for instance a scheduled executor. You get extra functionality. Why reinvent the wheel?
For 1) NTHREADS is the maximum threads that the pool will ever run concurrently, but that doesn't mean there will always be that many running. It will only use as many as is needed up to that max value... which in your case is 3.
As the docs say:
At any point, at most nThreads threads will be active processing tasks. If additional tasks are submitted when all threads are active, they will wait in the queue until a thread is available
http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Executors.html#newFixedThreadPool-int-
As for 2) using Java's concurrent executors framework is preferred with new code. You get a lot of stuff for free and removes the need for having to handle all of the fiddly thread work yourself.
The number of threads passed into newFixedThreadPool is at most how many threads could be running executing your tasks. If you only have three tasks ever submitted I'd expect the ExecutorService to only create three threads.
To answer your questions:
You should use the number you pass into the constructor to control how many threads are going to be used to execute your tasks.
This differs because of the extra functionality the ExecutorService gives you, as well as the flexibility it gives you such as in the case you need to change your ExecutorService type or number of tasks you'll run (less lines of code to change).
All that is happening is the executor service is only creating as many threads as it needs. NTHREADS is effectively the maximum number of threads it'll create.
There is no point creating ten threads up front if it only has 3 tasks to complete, the other 7 will just be hanging around consuming resources.
If you submit more than NTHREADS number of tasks then it will process that number concurrently and the rest will wait on a queue until a thread becomes free.
This isn't any different from creating a fixed set of your own threads, except the thread management and scheduling is handled for you. The executor service also restarts threads if they are killed by rogue exceptions in your task which you'd otherwise have to code for.
See: The Javadoc on Executorservice.newFixedThreadPool

Java Multithreading - Starting a lot of threads and keeping them idle, vs lazy loading new threads?

I'm in a situation where I have 4 groups of 7 threads each. My CPU (core i7) is supposed to be able to handle 8 threads, so I'm considering going through each group one at a time, running the 7 threads, then moving to the 2nd group, running its 7 threads, then 3rd and 4th groups in the same way, and then starting back at 1st group, until user sends a stop command.
My question is, once each group of 7 threads has finished processing, should I keep those threads idle, or shut them down completely and restart a new group of 7 threads at the next iteration? Which method will be faster? This is for a very speed intensive app, so I need everything to happen as quickly as possible.
I will be using a FixedThreadPool to manage each group of 7 threads. So I could either just invokeAll() and then leave them alone (presumably to idle), or I could shutdown() each threadpool after the invokeAll() and start a new thread pool at the next iteration.
Which method will be faster?
My question is, once each group of 7 threads has finished one cycle of processing, should I keep those threads idle, or shut them down completely and restart a new group of 7 threads at the next cycle?
I would use a single ExecutorService thread-pool and reuse the same threads for all tasks. See the tutorial on the subject. A thread-pool is designed to execute any Runnable or Callable class so they are task agnostic. For example, you might have ParentResult and ChildResult classes. You can submit a Callable<ParentResult> to the thread-pool which will return a Future<ParentResult> and you can submit a Callable<ChildResult> to the same thread-pool which will return a Future<ChildResult>.
The only reason why you'd want to have "groups of threads" is if each thread has some state that it must maintain -- a database connection or something. Even then many people use thread-pools since it does much of the concurrency heavy lifting for you.
If you do have to keep this state then I would certainly not shutdown the pools and then restart them later. A dormant thread/pool is taking no system resources aside from memory. The only reason why you would ever do this is if you are forking 100s of thread for the task but at that point, you should consider re-architecting your application.
You need not to schedule your threads manually. Start all 28 threads at once - this would not be slower, but well can be faster.
When you say your processor has 8 threads, I think you mean it has has 4 cores with hyperthreading. Java does not use threads in the same sense as your processor, so those 7 threads are of a different type to your processors.
The JVM handles processor usage, and is (IIRC) limited to using 1 core. The threads java uses are specific to the JVM, and are wholly separate.
As for your actual question, try testing different thread combinations to see which is fastest, which will give you a more accurate answer than arm-chair theorising.
I too prefer Alexei Kaigorodov suggestion to start all 28 threads. But I suggest you to replace newFixedThreadPoolwith new Executors API: ( since Java 8)
static ExecutorService newWorkStealingPool()
Creates a work-stealing thread pool using all available processors as its target parallelism level.
Above API returns ForkJoinPool type of ExecutorService
Now you don't need to worry utilization of idle threads. Java will take care better utilization of idle threads with work-stealing mechanism.
If you still need four different groups of FixedThreadPools, you can proceed with invokeAll. Don't shutdown ExecutorService to switch between multiple pools. You one ExecutorService effectively. If you want to poll the result of Future tasks using invokeAll, use CompletableFuture and poll it to know the status of task execution.
static CompletableFuture<Void> runAsync(Runnable runnable, Executor executor)
Returns a new CompletableFuture that is asynchronously completed by a task running in the given executor after it runs the given action.

Categories