For work such as network operations, bitmap manipulation, image loading and other kinds of tasks, can I create a single ThreadPoolExecutor for my whole application and execute everything on it?
If the answer is no, why? And how should I create a thread pool for each kind of operation?
If yes, will it cause performance problems?
Thanks in advance.
Both approaches have advantages and disadvantages.
In the case of a single thread pool (a singleton implementation, I suppose):
➕ you have one entry point for submitting background tasks
➕ it is easy to implement and to control its life cycle
➖ if you have a lot of quick tasks and some long-running tasks, the long-running tasks may occupy every thread in a limited pool while the user waits for some quick action in the UI
Different thread pools (one pool per type of task):
➕ the pool for long-running tasks can accumulate tasks while quick tasks are executed independently in their own pool
➕ you know everything about the tasks in your application, so you can fine-tune the pool size for every type of task and set thread priority, initial stack size, etc. with a thread factory
➕ if you define a thread group and thread names, it helps with debugging
➖ having several thread pools makes their life cycles harder to control
➖ this approach brings little benefit if the tasks are poorly separated into classes
In any case, you need a compromise and an assessment of the advantages.
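A minimal sketch of the "different pools" option, assuming two task categories. The class name AppExecutors, the pool sizes and the thread-name prefixes are illustrative assumptions, not something from the question:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder class; names and pool sizes are illustrative assumptions.
final class AppExecutors {
    // Small pool reserved for quick tasks the UI is waiting on.
    static final ExecutorService QUICK = Executors.newFixedThreadPool(2, named("quick"));
    // Separate pool so slow work (downloads, bitmap decoding) cannot starve quick tasks.
    static final ExecutorService LONG_RUNNING = Executors.newFixedThreadPool(4, named("long"));

    private AppExecutors() {}

    // Thread factory that gives each thread a recognizable name, which helps when debugging.
    static ThreadFactory named(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread t = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            t.setDaemon(true); // pool threads should not keep the JVM alive on their own
            return t;
        };
    }

    // Runs a trivial task on the given pool and reports which thread executed it.
    static String threadNameOn(ExecutorService pool) {
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return pool.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Both pools have to be shut down, which is the life-cycle cost mentioned above.
    static void shutdownAll() {
        QUICK.shutdown();
        LONG_RUNNING.shutdown();
    }

    public static void main(String[] args) {
        System.out.println(threadNameOn(QUICK));        // prints quick-1
        System.out.println(threadNameOn(LONG_RUNNING)); // prints long-1
        shutdownAll();
    }
}
```

With a single pool you would keep only one of the two fields; the trade-off is exactly the one listed above.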
Theoretically speaking, I think you can do that, and according to the Oracle documentation it should improve your performance:
Thread pools address two different problems: they usually provide
improved performance when executing large numbers of asynchronous
tasks, due to reduced per-task invocation overhead, and they provide a
means of bounding and managing the resources, including threads,
consumed when executing a collection of tasks. Each ThreadPoolExecutor
also maintains some basic statistics, such as the number of completed
tasks.
Related
In Java, and more specifically, Android,
are new Threads automatically assigned to a different CPU core than the one I am currently utilizing, or should I take care of that?
Also, does it matter how the new thread is created, using the Thread class or submitting a Runnable to an Executor that maintains a pool of threads?
There is a similar question here, but the answer goes on to explain how the OP should address his particular problem, rather than diving into the more general case:
Threads automatically utilizing multiple CPU cores?
In Java, and more specifically, Android, are new Threads automatically assigned to a different CPU core than the one I am currently utilizing, or should I take care of that?
The decision of what threads run on what cores is handled by the OS itself (in Android, based off of the Linux scheduler). You cannot affect those decisions yourself; the decisions are automatic and dynamic.
does it matter how the new thread is created, using the Thread class or submitting a Runnable to an Executor that maintains a pool of threads?
With respect to what cores a thread runs on, the OS neither knows nor cares whether an Executor is involved, or even if the programming language that app was written in has something called Executor.
In Java, and more specifically, Android, are new Threads automatically
assigned to a different CPU core than the one I am currently
utilizing, or should I take care of that?
In Java, threads are simply separate sequences of execution, but on Android it is a little more complicated than that. Android creates one main thread per application. This main thread is responsible for the UI and for other event-related tasks (the event queue). To do background work you have to create separate worker threads.
Simple threads are handled by the Android OS automatically, and they may or may not run on separate cores. If you are running 10 threads, it is quite possible that they all run on one core leaving all other cores idle.
If you need to run more than one thread and you want to run each thread on a separate core, you should use ThreadPoolExecutor; it handles thread creation, and the pool can be sized to the number of CPU cores available. You can set various parameters according to your requirements. See what the Android documentation says:
A ThreadPoolExecutor will automatically adjust the pool size (see
getPoolSize()) according to the bounds set by corePoolSize (see
getCorePoolSize()) and maximumPoolSize (see getMaximumPoolSize()).
When a new task is submitted in method execute(Runnable), and fewer
than corePoolSize threads are running, a new thread is created to
handle the request, even if other worker threads are idle. If there
are more than corePoolSize but less than maximumPoolSize threads
running, a new thread will be created only if the queue is full.
See ThreadPoolExecutor for detail.
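A small sketch of the sizing rule quoted above, assuming corePoolSize = 1, maximumPoolSize = 2 and a queue that holds one task (values chosen only to make the effect visible). The class name PoolGrowthDemo is made up for the example:

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolGrowthDemo {
    // Returns the pool size observed right after each of three submissions.
    static int[] observePoolSizes() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 30, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        // Every task blocks until released, so the threads stay busy while we measure.
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int[] sizes = new int[3];
        try {
            pool.execute(blocker);           // fewer than corePoolSize threads: starts thread 1
            sizes[0] = pool.getPoolSize();
            pool.execute(blocker);           // core thread is busy: the task waits in the queue
            sizes[1] = pool.getPoolSize();
            pool.execute(blocker);           // queue is full: starts thread 2 (up to maximumPoolSize)
            sizes[2] = pool.getPoolSize();
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observePoolSizes())); // prints [1, 1, 2]
    }
}
```

Note that none of this says anything about which core a thread runs on; it only controls how many threads exist.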
does it matter how the new thread is created, using the Thread class
or submitting a Runnable to an Executor that maintains a pool of
threads?
Yes, see the answer above.
Update
By saying "to run each thread on a separate core use ThreadPoolExecutor", I meant that ThreadPoolExecutor can do that if it is used properly in a careful manner.
Java does not map threads directly onto CPU cores. Java leaves thread scheduling to the OS (Java threads are mapped onto OS-level threads), but how we create threads influences scheduling at the OS level. Java can assign priorities to threads, but again it is up to the OS to honor them.
There are various factors to consider when creating a thread pool; a few are as follows:
i) The work given to each thread should be of similar complexity.
ii) I/O-bound tasks usually need more threads than there are available cores for optimal CPU utilization, unlike CPU-bound tasks.
iii) Dependencies between threads affect performance negatively.
If threads are created with these points in mind, ThreadPoolExecutor can help achieve close to 100% CPU utilization, meaning one thread per core (if the thread pool's size is equal to the number of cores and no other thread is running). A benefit of ThreadPoolExecutor is that it is cheaper than creating threads separately, and keeping the thread count close to the core count also reduces context switching, which wastes a lot of CPU cycles.
Achieving 100% CPU utilization while making things concurrent is not an easy task.
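A rough sketch of a pool sized to the core count for CPU-bound work, assuming the work can be split into independent chunks. CoreSizedPool and parallelSum are hypothetical names, and the summing loop is only a stand-in for a real computation. The OS still decides which core each thread actually runs on:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class CoreSizedPool {
    // Sums 1..n by splitting the range into roughly one chunk per available core.
    static long parallelSum(long n) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            List<Callable<Long>> chunks = new ArrayList<>();
            long chunkSize = (n + cores - 1) / cores; // ceiling division
            for (long start = 1; start <= n; start += chunkSize) {
                final long from = start;
                final long to = Math.min(n, start + chunkSize - 1);
                chunks.add(() -> {
                    long sum = 0;
                    for (long i = from; i <= to; i++) {
                        sum += i;
                    }
                    return sum;
                });
            }
            long total = 0;
            // invokeAll blocks until every chunk has finished.
            for (Future<Long> f : pool.invokeAll(chunks)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(1000)); // prints 500500
    }
}
```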
However threads are created (using the Thread class directly or submitting tasks to a ThreadPoolExecutor), and however tasks are assigned to them, it has no impact on OS scheduling.
An OS component, the scheduler, is involved in this process; it is responsible for distributing threads among the CPU cores (if there is more than one core).
This decision is made by the scheduler.
If there is only one core in the system, the scheduler plays fair by letting each thread run in turn for a few milliseconds.
When I have hundreds of items to iterate through, and I have to do a computation-heavy operation to each one, I would take a "divide and conquer" approach. Essentially, I would take the processor count + 1, and divide those items into the same number of batches. And then I would execute each batch on a runnable in a cached thread pool. It seems to work well. My GUI task went from 20 seconds to 2 seconds, which is a much better experience for the user.
However, I was reading Brian Goetz' fine book on concurrency, and I noticed that for iterating through a list of items, he would take a totally different approach. He would kick off a Runnable for each item! I had always assumed this would be bad, especially on a cached thread pool, which could create tons of threads. However, each runnable would probably finish very quickly in the larger scope, and I understand the cached thread pool is well suited to short tasks.
So which is the more accepted paradigm to iterate through computation-heavy items? Dividing into a fixed number of batches and giving each batch a runnable? Or kicking each item off in its own runnable? If the latter approach is optimal, is it okay to use a cached thread pool or is it better to use a bounded thread pool?
With batches you will always have to wait for the longest running batch (you are as fast as the slowest batch). "Divide and conquer" implies management overhead: doing administration for the dividing and monitoring the conquering.
Creating a task for each item is relatively straightforward (no management), but you are right that it may start hundreds of threads (unlikely, but it could happen), which will only slow things down (context switching) if the tasks do little or no I/O and are mostly CPU-intensive.
If the cached thread pool does not start hundreds of threads (see getLargestPoolSize), then by all means, use the cached thread pool. If too many threads are started then one alternative is to use a bounded thread pool. But a bounded thread pool needs some tuning/decisions: do you use an unbounded task queue or a bounded task queue with a CallerRunsPolicy for example?
On a side note: there is also the ForkJoinPool which is suitable for tasks that start sub-tasks.
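A sketch of the "one task per item" approach on a bounded pool with a bounded queue and CallerRunsPolicy, as one possible answer to the tuning question above. The class PerItemTasks, the method expensive() and the queue capacity of 64 are assumptions made for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PerItemTasks {
    // Stand-in for the computation-heavy per-item work (hypothetical).
    static long expensive(int item) {
        long acc = 0;
        for (int i = 0; i < 10_000; i++) {
            acc += ((long) item * i) % 7;
        }
        return acc;
    }

    // Submits one task per item; returns {sum of results, largest pool size reached}.
    static long[] process(List<Integer> items) {
        int cores = Runtime.getRuntime().availableProcessors();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                cores, cores, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(64),                 // bounded task queue
                new ThreadPoolExecutor.CallerRunsPolicy());   // when full, the submitter runs the task
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int item : items) {
                Callable<Long> task = () -> expensive(item);
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get();
            }
            return new long[] {total, pool.getLargestPoolSize()};
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            items.add(i);
        }
        long[] result = process(items);
        System.out.println("sum=" + result[0] + ", largest pool size=" + result[1]);
    }
}
```

CallerRunsPolicy acts as simple back-pressure here: when the queue is full, the submitting thread does the work itself instead of queuing more. On a GUI thread that would block the UI, so the submitting loop should itself run in the background.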
For a particular action, the application creates two threads (doing different tasks), and the main thread doesn't wait for them. In some cases it may be only one thread.
If I move this to Executors.newFixedThreadPool(), does it make any difference? I understand that Executors handle thread management, which is good for multi-threading scenarios.
But I want to know whether it makes even a small difference when just two threads are changed to use executors. Please help.
Thanks in advance.
This may result in better CPU utilization when you have many threads and want to
execute only a few of them at a time, but if you have only two threads then I think
using Executors is not beneficial.
from docs.oracle
Thread pools address two different problems: they usually provide improved performance when executing large numbers of asynchronous tasks, due to reduced per-task invocation overhead, and they provide a means of bounding and managing the resources, including threads, consumed when executing a collection of tasks. Each ThreadPoolExecutor also maintains some basic statistics, such as the number of completed tasks.
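For comparison, a minimal sketch of what the two-thread case might look like on Executors.newFixedThreadPool(2). The class name TwoTasks and the task bodies are placeholders. Waiting on the Futures is optional; if the main thread should not wait, it can submit and move on:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class TwoTasks {
    // Runs two different tasks on a pool of two threads and collects both results.
    static String runBoth() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<String> first = () -> "first done";   // placeholder for the real work
            Callable<String> second = () -> "second done"; // placeholder for the real work
            Future<String> a = pool.submit(first);
            Future<String> b = pool.submit(second);
            // get() blocks; skip these calls if the caller should not wait.
            return a.get() + ", " + b.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown(); // already submitted tasks still run to completion
        }
    }

    public static void main(String[] args) {
        System.out.println(runBoth()); // prints first done, second done
    }
}
```

For two short-lived threads the gain is mostly in structure (results, exceptions and shutdown in one place) rather than in raw speed.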
Can anybody explain, with examples, why we should use thread pools?
I know about the use of thread pools with Executors in theory.
I have gone through a number of tutorials, but I didn't find any practical examples of why we should use thread pools (whether newFixedThreadPool, newCachedThreadPool or newSingleThreadExecutor)
in terms of scalability and performance.
Could anybody explain this with respect to performance and scalability, with examples?
First off, check this description of thread pools that I wrote yesterday: Android Thread Pool to manage multiple bluetooth handeling threads? (ok, it was about android but it's the same for classic java).
The main use I always seem to find for a thread pool is that it very nicely manages a very common problem: producer-consumer. In this pattern, someone needs to constantly send work items (the producer) to be processed by someone else (the consumers). The work items are obtained from some stream-like source, such as a socket, a database, or a collection of disk files, and need multiple workers in order to be processed efficiently. The main components identifiable here are:
the producer: a thread that keeps posting jobs
a queue where the jobs are posted
the consumers: worker threads that take jobs from the queue and execute them
In addition to this, synchronization needs to be employed to make all this work correctly, since reading and writing to the queue without synchronization can lead to corrupted and inconsistent data. Also, we need to make the system efficient, since the consumers should not waste CPU cycles when there is nothing to do.
Now this pattern is very common, but implementing it from scratch takes considerable effort, is error-prone and needs careful review.
The solution is the thread pool. It very conveniently manages the work queue, the consumer threads and all the synchronization needed. All you need to do is play the role of the producer and feed the pool with tasks!
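A minimal sketch of that division of labor, assuming the jobs are plain strings. ProducerWithPool and processAll are made-up names, and the counter stands in for real processing:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ProducerWithPool {
    // The caller acts as the producer; the pool supplies the queue, the consumers and the synchronization.
    static int processAll(List<String> jobs, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicInteger processed = new AtomicInteger();
        for (String job : jobs) {
            pool.execute(() -> {
                // Placeholder for real work on the job.
                if (!job.isEmpty()) {
                    processed.incrementAndGet();
                }
            });
        }
        pool.shutdown(); // no new jobs; queued jobs still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("jobs did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList("a", "b", "c"), 2)); // prints 3
    }
}
```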
I would start with a problem and only then try to find a solution for it.
If you start the way you have, you can have a solution looking for a problem to solve and you are likely to use it inappropriately.
If you can't think of a use for thread pools, don't use them. ;)
A common mistake people make is to assume that because they have lots of CPUs now, they have to use them all, as if this were a reason in itself. It's like saying: I have lots of disk space, so I must find a way to use all of it.
A good reason to use thread pools is to improve the performance of CPU-bound processes and the simplicity of I/O-bound processes (rather than using non-blocking I/O with one thread).
If you have a busy CPU bound process which performs tasks which can be executed independently you have a good use case for a thread pool.
Note: A thread pool often has just one thread. There are specific static factories for this. If you want a simple background worker, this may be an option.
Note 2: A common mistake is to assume that CPU-bound tasks will run best on hundreds or thousands of threads. The optimal number of threads can be the number of cores or CPUs you have. Once all of these are busy, you may find additional threads just add overhead.
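A sketch of the single-thread case from the first note, using Executors.newSingleThreadExecutor() as the background worker. BackgroundWorker and runInOrder are hypothetical names:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class BackgroundWorker {
    // Submits `count` tasks to one worker thread and records the order in which they ran.
    static List<Integer> runInOrder(int count) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        for (int i = 0; i < count; i++) {
            final int id = i;
            worker.execute(() -> seen.add(id)); // tasks run one at a time, in submission order
        }
        worker.shutdown();
        try {
            if (!worker.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("worker did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        System.out.println(runInOrder(5)); // prints [0, 1, 2, 3, 4]
    }
}
```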
Initializing a new thread (and its own stack) is a costly operation.
Thread pools are used to avoid this cost by reusing threads that have already been created. Thus, with thread pools you get better performance than when creating new threads every time.
Also note that created threads might need to be "deleted" after they have been used, which increases the cost of garbage collection and the frequency it will happen (as the memory fills up faster).
This analysis is just from the performance point of view. I cannot think of an advantage of using thread pools in terms of scalability at the moment.
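To make the reuse visible, here is a small sketch that runs many tiny tasks on a fixed pool and counts how many distinct threads executed them. ThreadReuseDemo and distinctThreads are names invented for the example:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ThreadReuseDemo {
    // Runs `tasks` tiny tasks on `poolSize` threads and counts the distinct worker threads used.
    static int distinctThreads(int poolSize, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> names = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return names.size();
    }

    public static void main(String[] args) {
        // 1000 tasks, but never more than 2 threads are created to run them.
        System.out.println(distinctThreads(2, 1000));
    }
}
```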
I googled "why use java thread pools" and found:
A thread pool offers a solution to both the problem of thread
life-cycle overhead and the problem of resource thrashing.
http://www.ibm.com/developerworks/library/j-jtp0730/index.html
and
The newCachedThreadPool method creates an executor with an expandable
thread pool. This executor is suitable for applications that launch
many short-lived tasks.
The newSingleThreadExecutor method creates an
executor that executes a single task at a time.
http://docs.oracle.com/javase/tutorial/essential/concurrency/pools.html
Can anyone explain, with an example, the difference between a Thread and a ThreadPool? Which is best to use, and what are the drawbacks?
Since a thread can only run once, you'd have to use a thread per task. However, creating and starting threads is somewhat expensive and can lead to a situation where too many threads are waiting for execution (don't remember the exact name for this right now) - which further reduces performance.
A thread pool is - as the name suggests - a pool of worker threads which are always running. Those threads then normally take tasks from a list, execute them, then try to take the next task. If there's no task, the thread will wait.
Using a thread pool has several advantages:
you don't have to create a thread per task
you normally have the optimal number of threads for your system (depending on the JVM too)
you can concentrate on writing tasks and use the thread pool to manage the infrastructure
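The worker loop described above can be sketched by hand. This toy pool (TinyPool is a made-up name) is for illustration only; real code would use the executors in java.util.concurrent:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Toy pool for illustration only; it has no error handling for tasks that throw.
final class TinyPool {
    private static final Runnable STOP = () -> {};   // sentinel telling a worker to exit
    private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<>();
    private final List<Thread> workers = new ArrayList<>();

    TinyPool(int size) {
        for (int i = 0; i < size; i++) {
            Thread t = new Thread(this::workLoop, "tiny-worker-" + i);
            workers.add(t);
            t.start();
        }
    }

    // Each worker takes a task, runs it, then tries to take the next one.
    private void workLoop() {
        try {
            while (true) {
                Runnable task = tasks.take();   // waits while there is no task
                if (task == STOP) {
                    return;
                }
                task.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    void submit(Runnable task) {
        tasks.add(task);
    }

    // Lets queued tasks finish, then stops every worker and waits for it to exit.
    void shutdownAndWait() throws InterruptedException {
        for (int i = 0; i < workers.size(); i++) {
            tasks.add(STOP);
        }
        for (Thread t : workers) {
            t.join();
        }
    }

    // Runs 100 counter increments on 3 workers and returns the final count.
    static int demo() {
        TinyPool pool = new TinyPool(3);
        AtomicInteger counter = new AtomicInteger();
        for (int i = 0; i < 100; i++) {
            pool.submit(counter::incrementAndGet);
        }
        try {
            pool.shutdownAndWait();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 100
    }
}
```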
Edit: Here are some quite good articles on concurrency in general: Sutter's Mill, look at the bottom for more links. Although they're primarily written for C/C++ the general concepts are the same, since it also describes the interdependence between concurrency solutions and hardware. A good article to understand concurrency performance issues is this article on drdobbs.com.
A thread pool is a collection of threads that are assigned to perform uniform tasks.
The advantage of the thread pool pattern is that you can define how many threads are allowed to execute simultaneously. This prevents the server from crashing due to high CPU load or an out-of-memory condition, e.g. when the server's hardware can support only up to 100 requests per second.
Database pooling is a similar concept to thread pooling.
This pattern is widely used in most back-end server applications.
A thread, by contrast, is a unit that executes a task.
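A small sketch of that cap in action, assuming a fixed pool as the limiter. ConcurrencyCap and maxConcurrent are hypothetical names, and the short sleep only simulates handling a request:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ConcurrencyCap {
    // Runs `tasks` simulated requests on `poolSize` threads; returns the peak number running at once.
    static int maxConcurrent(int poolSize, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // simulated request handling
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 30 requests arrive, but at most 3 are ever handled simultaneously.
        System.out.println(maxConcurrent(3, 30));
    }
}
```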