I have used multithreading in many of applications I wrote . While reading more I came across ThreadPoolExecutors. I couldn't not differentiate between the two scenario wise .
Still what I understand is I should use multithreading when I have a task I want to divide a task in to multiple small tasks to utilize CPU and do the work faster . And use ThreadPoolExecutor when I have a set to tasks and each task can be run independent of each other.
Please correct me if I am wrong . Thanks
A ThreadPoolExecutor is just a high level API that enables you to run tasks in multiple threads while not having to deal with the low level Thread API. So it does not really make sense to differentiate between multithreading and ThreadPoolExecutor.
There are many flavours of ThreadPoolExecutors, but most of them allow more than one thread to run in parallel. Typically, you would use an Executor Service and use the Executors factory.
For example, a ExecutorService executor = Executors.newFixedThreadPool(10); will run the tasks you submit in 10 threads.
ThreadPoolExecutor is one way of doing multithreading. It's typically used when you
have independent operations that don't require coordination (though nothing prevents you
from coordinating, but you have to be careful)
want to limit the capacity of how many operations you're executing at once, and (optionally) want to queue operations when for execution if the pool is currently working in all threads.
Java 7 has another built in class called a ForkJoinPool which is typically used for Map-Reduce type operations. For instance, one can imagine implementing a merge sort using a ForkJoinPool by splitting the array in 1/2 at each fork point, waiting for the results, and merging the results together.
Thread pools (executors) are one form of multithreading, specifically an implementation of the single producer - multiple consumer pattern, in which a thread repeatedly puts work in a queue for a team of worker threads to execute. It is implemented using regular threads and brings several benefits:
thread anonymity - you don't explicitly control which thread does what; just fire off tasks and they'll be handled by the pool.
it encapsulates a work queue and thread team - no need to bother implementing you own thread-safe queue and looping threads.
load-balancing - since workers take new tasks as they finish previous ones, work is uniformly distributed, provided there is a sufficiently large number of tasks available.
thread recycling - just create a single pool at the beginning an keep feeding it tasks. No need to keep starting and killing threads every time work needs to be done.
Given the above, it is true that pools are suited for tasks that are usually independent of each-other and usually short-lived (long I/O operations will just tie up threads from the pool that won't be able to do other tasks).
ThreadPoolExecutor is a form of multithreading, with a simpler API to use than directly using Threads, where you submit tasks indeed. However, tasks can submit other tasks, so they need not be independent. As for the division of tasks into sub-tasks, you may be thinking of the new fork/join API in JDK7.
From source code documentation of ThreadPoolExecutor
/*
* <p>Thread pools address two different problems: they usually
* provide improved performance when executing large numbers of
* asynchronous tasks, due to reduced per-task invocation overhead,
* and they provide a means of bounding and managing the resources,
* including threads, consumed when executing a collection of tasks.
* Each {#code ThreadPoolExecutor} also maintains some basic
* statistics, such as the number of completed tasks.
*
* <p>To be useful across a wide range of contexts, this class
* provides many adjustable parameters and extensibility
* hooks. However, programmers are urged to use the more convenient
* {#link Executors} factory methods {#link
* Executors#newCachedThreadPool} (unbounded thread pool, with
* automatic thread reclamation), {#link Executors#newFixedThreadPool}
* (fixed size thread pool) and {#link
* Executors#newSingleThreadExecutor} (single background thread), that
* preconfigure settings for the most common usage
* scenarios.
*/
ThreadPoolExecutor is one way of achieving concurrency. There are many ways in achieving concurrency :
Executors framework provides different APIs. Some of important APIs are listed below.
static ExecutorService newFixedThreadPool(int nThreads)
Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue.
static ExecutorService newCachedThreadPool()
Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available.
static ScheduledExecutorService newScheduledThreadPool(int corePoolSize)
Creates a thread pool that can schedule commands to run after a given delay, or to execute periodically.
static ExecutorService newWorkStealingPool()
Creates a work-stealing thread pool using all available processors as its target parallelism level.
Have a look at below SE questions:
java Fork/Join pool, ExecutorService and CountDownLatch
How to properly use Java Executor?
Related
like - network operation and bitmap manipulating an image loading and other kinds of work can I create a single TheadPoolExecuter for my whole application and execute on it.
if the answer is no -> why? and how to create thread pool for every single operation?
or if yes -> is performance problem occurs?
thanks in advance.
Both of approach have advantages and disadvantages.
In case of single thread pool (singleton implementation, I suppose):
➕ you have one entry point to submit background task
➕ it easily to implement and control life cycle
➖ if you have a lot of different quick tasks and some long running task, long running tasks may hold all thread in limited pool while user wait some quick action in UI
Different thread pools (one pool for one type of task):
➕ thread pool of long-running tasks can accumulate task while quick task can be executed in their own thread pool in-depend
➕ you know everything about tasks in your application - you can fine-tune pool size for every type of task, setup threads priority, initial stack size etc. with thread factory
➕ if you define thread group and thread name, it can help you in debug
➖ have different thread pools involve to hard control their life cycle
➖ this implementation will not give a lot of benefits in poor separation by tasks classes
Any case, you need some compromise and an assessment of the advantages
Talking teorically i think you can do that and according to oracle documentation should be improve your performance:
Thread pools address two different problems: they usually provide
improved performance when executing large numbers of asynchronous
tasks, due to reduced per-task invocation overhead, and they provide a
means of bounding and managing the resources, including threads,
consumed when executing a collection of tasks. Each ThreadPoolExecutor
also maintains some basic statistics, such as the number of completed
tasks.
Is this possible to run multithreaded Java application in a deterministic fashion? I mean to have always the same thread switching in two different runs of my application.
Reason for that is to run simulation in exactly the same conditions in every run.
Similar case is when one gives some arbitrary seed when using random number generator to obtain always the same "random" sequence.
I am not aware of any practical way to do this.
In theory, it would be possible to implement a bytecode interpreter with an entirely deterministic behavior under certain assumptions1. You would need to simulate the multiple threads by implementing the threads and the thread scheduling entirely in software and using a single native thread.
1 - For example, no I/O, and no use of the system clock.
No it is not possible (other than to simulate it yourself) to use multiple threads interleaving in the same way each time around. Threads are not designed to do that.
If you want deterministic results, don't use threads.
As quoted by OldCurmudgeon, it's not possible with multi threading.
If you decide to use single Thread, I prefer newSingleThreadExecutor to normal Thread due to flexibility and advantages of newSingleThreadExecutor
Use
newSingleThreadExecutor from Executors
public static ExecutorService newSingleThreadExecutor()
Creates an Executor that uses a single worker thread operating off an unbounded queue. (Note however that if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.)
Tasks are guaranteed to execute sequentially, and no more than one task will be active at any given time. Unlike the otherwise equivalent newFixedThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
Related SE questions:
Difference between Executors.newFixedThreadPool(1) and Executors.newSingleThreadExecutor()
ExecutorService vs Casual Thread Spawner
I'm in a situation where I have 4 groups of 7 threads each. My CPU (core i7) is supposed to be able to handle 8 threads, so I'm considering going through each group one at a time, running the 7 threads, then moving to the 2nd group, running its 7 threads, then 3rd and 4th groups in the same way, and then starting back at 1st group, until user sends a stop command.
My question is, once each group of 7 threads has finished processing, should I keep those threads idle, or shut them down completely and restart a new group of 7 threads at the next iteration? Which method will be faster? This is for a very speed intensive app, so I need everything to happen as quickly as possible.
I will be using a FixedThreadPool to manage each group of 7 threads. So I could either just invokeAll() and then leave them alone (presumably to idle), or I could shutdown() each threadpool after the invokeAll() and start a new thread pool at the next iteration.
Which method will be faster?
My question is, once each group of 7 threads has finished one cycle of processing, should I keep those threads idle, or shut them down completely and restart a new group of 7 threads at the next cycle?
I would use a single ExecutorService thread-pool and reuse the same threads for all tasks. See the tutorial on the subject. A thread-pool is designed to execute any Runnable or Callable class so they are task agnostic. For example, you might have ParentResult and ChildResult classes. You can submit a Callable<ParentResult> to the thread-pool which will return a Future<ParentResult> and you can submit a Callable<ChildResult> to the same thread-pool which will return a Future<ChildResult>.
The only reason why you'd want to have "groups of threads" is if each thread has some state that it must maintain -- a database connection or something. Even then many people use thread-pools since it does much of the concurrency heavy lifting for you.
If you do have to keep this state then I would certainly not shutdown the pools and then restart them later. A dormant thread/pool is taking no system resources aside from memory. The only reason why you would ever do this is if you are forking 100s of thread for the task but at that point, you should consider re-architecting your application.
You need not to schedule your threads manually. Start all 28 threads at once - this would not be slower, but well can be faster.
When you say your processor has 8 threads, I think you mean it has has 4 cores with hyperthreading. Java does not use threads in the same sense as your processor, so those 7 threads are of a different type to your processors.
The JVM handles processor usage, and is (IIRC) limited to using 1 core. The threads java uses are specific to the JVM, and are wholly separate.
As for your actual question, try testing different thread combinations to see which is fastest, which will give you a more accurate answer than arm-chair theorising.
I too prefer Alexei Kaigorodov suggestion to start all 28 threads. But I suggest you to replace newFixedThreadPoolwith new Executors API: ( since Java 8)
static ExecutorService newWorkStealingPool()
Creates a work-stealing thread pool using all available processors as its target parallelism level.
Above API returns ForkJoinPool type of ExecutorService
Now you don't need to worry utilization of idle threads. Java will take care better utilization of idle threads with work-stealing mechanism.
If you still need four different groups of FixedThreadPools, you can proceed with invokeAll. Don't shutdown ExecutorService to switch between multiple pools. You one ExecutorService effectively. If you want to poll the result of Future tasks using invokeAll, use CompletableFuture and poll it to know the status of task execution.
static CompletableFuture<Void> runAsync(Runnable runnable, Executor executor)
Returns a new CompletableFuture that is asynchronously completed by a task running in the given executor after it runs the given action.
What benefit is there to use Executors over just Threads in a Java program.
Such as
ExecutorService pool = Executors.newFixedThreadPool(2);
void someMethod() {
//Thread
new Thread(new SomeRunnable()).start();
//vs
//Executor
pool.execute(new SomeRunnable());
}
Does an executor just limit the number of threads it allows to have running at once (Thread Pooling)? Does it actually multiplex runnables onto the threads it creates instead? If not is it just a way to avoid having to write new Thread(runnable).start() every time?
Yes, executors will generally multiplex runnables onto the threads they create; they'll constrain and manage the number of threads running at once; they'll make it much easier to customize concurrency levels. Generally, executors should be preferred over just creating bare threads.
Creating new threads is expensive. Because Executors uses a thread pool, you get to easily reuse threads, resulting in better performance.
Does an executor just limit the number of threads it allows to have running at once (Thread Pooling)?
Executors#newFixedThreadPool(int), Executors#newSingleThreadExecutor do this, each one under different terms (read the proper javadoc to know more about it).
Does it actually multiplex runnables onto the threads it creates instead?
Yes
If not is it just a way to avoid having to write new Thread(runnable).start() every time?
ExecutorService helps you to control the way you handle threads. Of course, you can do this manually, but there's no need to reinvent the wheel. Also, there are other functionalities that ExecutorService provides you like executing asynchronous tasks through the usage of Future instances.
There are multiple concerns related to thread.
managing threads
resource utilization
creation of thread
Executors provides different kind of implementation for creating a pool of threads. Also thread creation is a costly affair. Executors creates and manages these threads internally. Details about it can be found in the below link.
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html
As I said over in a related question, Threads are pretty bad. Executors (and the related concurrency classes) are pretty good:
Caveat: Around here, I strongly discourage the use of raw Threads. I
much prefer the use of Callables and FutureTasks (From the javadoc: "A
cancellable asynchronous computation"). The integration of timeouts,
proper cancelling and the thread pooling of the modern concurrency
support are all much more useful to me than piles of raw Threads.
For example, I'm currently replacing a legacy piece of code that used a disjoint Thread running in a loop with a self-timer to determine how long it should Thread.sleep() after each iteration. My replacement will use a very simple Runnable (to hold a single iteration), a ScheduledExecutorService to run one of the iterations and the Future resulting from the scheduleAtAFixedRate method to tune the timing between iterations.
While you could argue that replacement will be effectively equivalent to the legacy code, I'll have replaced an arcane snarl of Thread management and wishful thinking with a compartmentalized set of functionality that separates the concerns of the GUI (are we currently running?) from data processing (playback at 5x speed) and file management (cancel this run and choose another file).
Executors provides newCachedThreadPool() and newScheduledThreadPool(), but not newCachedScheduledThreadPool(), what gives here? I have an application that receives bursty messages and needs to schedule a fairly lengthy processing step after a fixed delay for each. The time constraints aren't super tight, but I would prefer to have more threads created on the fly if I exceed the pool size and then have them trimmed back during periods of inactivity. Is there something I've missed in the concurrent library, or do I need to write my own?
By design the ScheduledThreadPoolExecutor is a fixed size. You can use a single threaded version that submits to a normal ExecutorService for performing the task. This event thread + worker pool is fairly ease to coordinate and the flexibility makes up for the dedicated thread. I've used this in the past to replace TimerTasks and other non-critical tasks to utilize a common executor as a system-wide pool.
Suggested here Why does ScheduledThreadPoolExecutor only accept a fixed number of threads? workaround:
scheduledExecutor = new ScheduledThreadPoolExecutor(128); //no more than 128 threads
scheduledExecutor.setKeepAliveTime(10, TimeUnit.SECONDS);
scheduledExecutor.allowCoreThreadTimeOut(true);
java.util.concurrent.Executors is nothing more than a collection of static convenience methods that construct common arrangements of executors.
If you want something specific that isn't offered by Executors, then feel free to construct your own instance of the implemention classes, using the examples in Executors as a guide.
Like skaffman says, Executors is only a collection of factory method. if you need a particular instance, you can always check all existing Executor implementors. In your case, i think that calling one of the various constructors of ScheduledThreadPoolExecutor would be a good idea.