I can create a new threadpool in Java and execute tasks on it using the Executors.newFixedThreadPool and ExecutorService.submit methods.
Is there a 'default' threadpool that I can reuse for all executor services in my Java program? Or do I just have to create a singleton that contains a default threadpool? C# has a default threadpool that runs tasks when the Task.Factory.StartNew method is called.
Since Java 8 there's ForkJoinPool.commonPool(), which is used by default by many methods involving parallel or asynchronous execution. For example, Arrays.parallelSort() and parallel Stream API operations use this pool. You can submit your own tasks to this pool using many methods of the CompletableFuture class, like CompletableFuture.supplyAsync().
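For illustration, here is a minimal sketch of handing work to the common pool via CompletableFuture.supplyAsync(); the class name and the trivial computation are just placeholders.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ForkJoinPool;

public class CommonPoolDemo {
    public static void main(String[] args) {
        // No Executor argument, so this runs on ForkJoinPool.commonPool()
        CompletableFuture<Integer> future = CompletableFuture.supplyAsync(() -> 6 * 7);

        System.out.println(future.join());                              // 42
        System.out.println(ForkJoinPool.commonPool().getParallelism()); // the pool's parallelism level
    }
}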
Using separate threadpools is good, default practice, and sharing threadpools is a (possibly premature) optimization.
Through Java 7 the answer is no, there is not a default threadpool, and the recommendation is to have many threadpools. It's good separation and will prevent blocking behavior on one collection of tasks from interfering with another.
If you share threadpools you should ask questions like:
will the logging framework be able to distinguish tasks? (Thread names are one way to distinguish them.)
If task pool A accidentally requests way too many threads and gets cut off, should task pool B starve? When you notice task pool B is failing will you be able to diagnose the problem in task pool A?
If pool A blocks should B starve?
Maybe you create something like a LightweightThreadpool, and the first 5 tasks you write use it in a lightweight fashion. The 6th task does too, except it also writes errors to disk, and those errors are surprisingly big, sometimes there are many of them, and they're not throttled. Suddenly the first 5 tasks are starved and have no idea what hit them; furthermore, when you wrote those tasks, you really believed they were safe and may not have prepared for this kind of incident.
So sharing threadpools is about as okay as having two different processes run on the same server is okay. You should think about resource management very carefully first and understand that the tasks are resource-coupled now. The lack of a default threadpool is trying to force you to use separate ones by default, and think about these questions carefully before sharing one.
As of Java 8 the answer is "yes" (per Tagir's answer on this question). But you will notice everything will start horribly failing if you submit blocking tasks to that threadpool.
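To illustrate that caveat, one way to keep blocking work off the common pool is to pass an explicit executor to the *Async methods. This is only a sketch; the pool size and the sleep standing in for blocking I/O are made up for the example.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlockingTasksDemo {
    public static void main(String[] args) {
        ExecutorService blockingPool = Executors.newFixedThreadPool(8); // dedicated pool for blocking work

        // CPU-ish work: fine on the common pool
        CompletableFuture<String> fast = CompletableFuture.supplyAsync(() -> "fast computation");

        // Blocking work: routed to the dedicated pool instead of the common pool
        CompletableFuture<String> slow = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(2000); // stands in for a blocking call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "slow I/O result";
        }, blockingPool);

        System.out.println(fast.join());
        System.out.println(slow.join());
        blockingPool.shutdown();
    }
}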
I am trying to mimic single-threaded async programming, as in JavaScript, in Java with the use of the async/await library by EA (ea-async). This is mainly because I do not have long-lasting CPU-bound computations in my program and I want to write single-threaded, lock-free code in Java.
The ea-async library relies heavily on CompletableFuture in Java, and underneath, Java seems to use the ForkJoinPool to run the async callbacks. This puts me into a multi-threaded environment as my CPU is multi-core. It seems that for every CompletableFuture task I can supply my own thread pool executor to the *Async methods. I could supply Executors.newSingleThreadExecutor() for this, but I need a way to set this globally so that all CompletableFutures will use this executor within the single JVM process. How do I do this?
The ea-async library relies heavily on CompletableFuture in Java, and
underneath, Java seems to use the ForkJoinPool to run the async callbacks.
That is the default behavior of CompletableFuture:
All async methods without an explicit Executor argument are performed
using the ForkJoinPool.commonPool() (unless it does not support a
parallelism level of at least two, in which case, a new Thread is
created to run each task). This may be overridden for non-static
methods in subclasses by defining method defaultExecutor().
That's a defined characteristic of the class, so if you're using class CompletableFuture, not a subclass, and generating instances without specifying an Executor explicitly, then a ForkJoinPool is what you're going to get.
Of course, if you are in control of the CompletableFutures provided to ea-async, then you have the option to provide instances of a subclass that defines defaultExecutor() however you like. Alternatively, you can create your CompletableFuture objects via the static factory methods that allow you to explicitly specify the Executor to use, such as runAsync(Runnable, Executor).
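To make that concrete, here is a small sketch of creating stages with an explicit Executor; the executor choice and the printed output are just for illustration.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExplicitExecutorDemo {
    public static void main(String[] args) {
        ExecutorService single = Executors.newSingleThreadExecutor();

        // Both stages run on the supplied executor rather than the common pool
        CompletableFuture<Void> future = CompletableFuture
                .runAsync(() -> System.out.println(Thread.currentThread().getName()), single)
                .thenRunAsync(() -> System.out.println(Thread.currentThread().getName()), single);

        future.join();
        single.shutdown();
    }
}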
But that's probably not what you really want to do.
If you use an executor with only one thread, then your tasks can be executed asynchronously with respect to the thread that submits them, yes, but they will be serialized with respect to each other. You do get only one thread working on them, but at any given time it will be working on one specific task, sticking with it until it finishes, regardless of the order in which the responses actually arrive. If that's satisfactory, then it's unclear why you want async operations at all.
This puts me into a multi-threaded environment as my CPU is multi-core.
It puts you in multiple threads regardless of how many cores your CPU has. That's what Executors do, even Executors.newSingleThreadExecutor(). That's the sense of "asynchronous" they provide.
If I understand correctly, you are instead looking to use one thread to multiplex I/O to multiple remote web applications. That is what java.nio.channels.Selector is for, but using that generally requires either managing the I/O operations yourself or using interfaces designed to interoperate with selectors. If you are locked in to third-party interfaces that do not afford use of a Selector, then multithreading and multiprocessing are your only viable alternatives.
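For what it's worth, a bare-bones sketch of the Selector approach might look like the following; the hosts, port, and buffer handling are placeholders, and a real client would still have to write requests and parse responses itself.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class SingleThreadMultiplexer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();

        // Placeholder endpoints standing in for your remote web applications
        for (String host : new String[] {"example.com", "example.org"}) {
            SocketChannel channel = SocketChannel.open();
            channel.configureBlocking(false); // non-blocking mode is required for use with a Selector
            channel.connect(new InetSocketAddress(host, 80));
            channel.register(selector, SelectionKey.OP_CONNECT);
        }

        ByteBuffer buffer = ByteBuffer.allocate(8192);
        while (selector.select() > 0) { // one thread waits on all channels at once
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                SocketChannel ch = (SocketChannel) key.channel();
                if (key.isConnectable() && ch.finishConnect()) {
                    key.interestOps(SelectionKey.OP_READ); // a real client would write its request here
                } else if (key.isReadable()) {
                    buffer.clear();
                    if (ch.read(buffer) < 0) {
                        ch.close(); // remote end closed the connection
                    }
                    // process buffer contents here
                }
            }
        }
    }
}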
In comments you wrote:
I'm starting to think maybe BlockingQueue might do the job in
consolidating all API responses into one queue as tasks where a single
thread will work on them.
Again, I don't think that you want everything that comes with that, and if in fact you do, then I don't see why it wouldn't be even better and easier to work synchronously instead of asynchronously.
As far as I understand, Executors help handle the execution of Runnables. For example, I would choose an executor when I have several worker threads that do their job and then terminate.
The executor would handle the creation and the termination of the Threads needed to execute the worker runnables.
However, now I am facing another situation: a fixed number of classes/objects shall each encapsulate their own thread. So the thread is started at the creation of those objects, and the thread shall continue running for the whole lifetime of these objects.
The few objects in turn are created at the start of the program and exist for the whole run time.
I guess Threads are preferable over Executors in this situation; however, everything I read online seems to suggest using Executors over Threads in any possible situation.
Can somebody please tell me if I want to choose Executors or Threads here and why?
Thanks
You're somewhat mixing things. Executor is just an interface. Thread is a core class. There's nothing which directly implies that Executor implementations execute tasks in separate threads.
Read the first few lines of the JavaDoc for Executor.
So if you want full control, just use Thread and do things on your own.
Without knowing more about the context, it's hard to give a good answer, but generally speaking I'd say that the situations that call for using Thread are pretty few and far between. If you start trying to synchronize your program "manually" using synchronized, I bet things will get out of hand quickly. (Not to mention how hard it will be to debug the code.)
Last time I used a thread was when I wanted to record some audio in the background. It was a "start"/"stop" kind of thing, and not "task oriented". (I tried long and hard to try to find an audio library that would encapsulate that for me but failed.)
If you choose to go for a thread-based solution, I suggest you try to limit the scope of the thread so it only executes within the associated object. This will, to as large an extent as possible, avoid forcing you to think about happens-before relations, thread-safe publishing of values, etc. throughout the code.
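If it helps, here is one hypothetical way to keep a long-lived thread entirely inside the object that owns it; the class and method names are made up for the example.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical example: an object that owns its thread for its whole lifetime.
// The thread never escapes the class, so all coordination stays in one place.
public class Worker implements AutoCloseable {
    private final BlockingQueue<Runnable> inbox = new LinkedBlockingQueue<>();
    private final Thread thread = new Thread(this::run, "worker");

    public void start() {           // call once, right after construction
        thread.start();
    }

    public void submit(Runnable job) {
        inbox.add(job);
    }

    private void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                inbox.take().run(); // blocks until a job arrives
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // exit cleanly when interrupted
        }
    }

    @Override
    public void close() {
        thread.interrupt();         // wakes the thread if it is blocked in take()
    }
}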
ExecutorService can have a thread pool.
It optimizes performance, because creating a Thread is expensive.
ExecutorService has life cycle control.
shutdown(), shutdownNow(), etc. are provided.
ExecutorService is flexible.
You can invoke a variety of behaviors: customize the ThreadFactory, set the thread pool size, get delayed execution via ScheduledThreadPoolExecutor, etc.
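As a rough sketch of those points (the thread name, delay, and timeout are arbitrary), customizing the factory, scheduling, and shutting down might look like this:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

public class ExecutorLifecycleDemo {
    public static void main(String[] args) throws InterruptedException {
        // Custom ThreadFactory: name the threads so they are easy to find in logs/thread dumps
        ThreadFactory factory = r -> new Thread(r, "scheduler-thread");

        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2, factory);

        // Delayed execution, one of the behaviors a plain Thread does not give you directly
        scheduler.schedule(() -> System.out.println("ran after 1 second"), 1, TimeUnit.SECONDS);

        // Life cycle control: stop accepting new tasks and wait for submitted ones to finish
        scheduler.shutdown();
        scheduler.awaitTermination(5, TimeUnit.SECONDS);
    }
}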
I have some tasks that I need to process concurrently on Android and I would like to use some sort of a thread pool to do so. I couldn't find in the documentation what actually happens "behind the scenes" when executing an AsyncTask with AsyncTask.THREAD_POOL_EXECUTOR.
My question is: What do I lose by using AsyncTasks with AsyncTask.THREAD_POOL_EXECUTOR as opposed to implementing a custom ThreadPool with Runnables? (Let's talk post-honeycomb).
I realize the question is rather general, but I'm fairly new to doing concurrent programming (besides AsyncTask itself). I'm not looking for a tutorial on concurrent programming! I only seek to understand how the Android specific AsyncTask.THREAD_POOL_EXECUTOR is different. I think an explanation would be helpful for others in the future as they weigh the pros and cons of choosing to use AsyncTask vs Thread/Runnable. Thanks in advance!
AsyncTasks provide you with the possibility to execute actions on the UI thread before and after executing the worker task. So, if you don't need to communicate with the UI, then use your own executor; you can always implement this using a Handler. AsyncTasks have been executed serially since API 11 because parallel execution was considered too difficult to implement properly.
If you need more flexibility, then executors are the way to go; they will allow you to freely specify how many tasks to execute in parallel, how many to put in the queue, etc.
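For instance, a hand-configured pool might look roughly like this; the pool sizes, keep-alive, and queue bound are arbitrary numbers for the example.

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CustomPoolDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                2,                               // core pool size
                4,                               // maximum pool size
                30, TimeUnit.SECONDS,            // keep-alive for idle non-core threads
                new LinkedBlockingQueue<>(128)); // bounded queue of waiting tasks

        executor.execute(() -> {
            // background work that does not touch the UI
        });

        executor.shutdown();
    }
}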
If you are interested in details, you can always look into sources:
http://androidxref.com/4.4.3_r1.1/xref/development/samples/training/bitmapfun/BitmapFun/src/main/java/com/example/android/bitmapfun/util/AsyncTask.java
Non-UI work can be handled by anything, including AsyncTasks, HandlerThreads, IntentServices, etc.
The reason AsyncTasks are suggested for UI-related work (work that affects the UI) is that AsyncTask has helper callbacks that let you transfer control to the UI thread.
However, it's not suggested for longer-running operations since, by default, it uses a global executor, and this may cause app-global waiting tasks to be stalled while executing long-running ops. So you can switch to a custom executor and get rid of the global effect.
At the end of the day, HandlerThreads are again just threads, with a Looper that keeps the thread alive. Executions will still be done serially, so what's the real reason to use them? I believe it's the ability to execute Runnables like Executors do, but in a more lightweight fashion.
IntentServices are, again, a way to execute tasks serially, but you have more power and isolation since they're entirely different components with separate lifecycles. They are automatically destroyed, so you don't have to worry about destroying them to reduce your app process priority (off topic, but this can cause memory and performance problems, thrashing, etc.).
Can anybody explain with examples why we should use thread pools?
I know about the use of thread pools with Executors in theory.
I have gone through a number of tutorials, but I didn't get any practical examples of why we should use thread pools, whether newFixedThreadPool, newCachedThreadPool, or newSingleThreadExecutor, in terms of scalability and performance.
Can anybody explain it to me with respect to performance and scalability, with examples?
First off, check this description of thread pools that I wrote yesterday: Android Thread Pool to manage multiple bluetooth handeling threads? (ok, it was about android but it's the same for classic java).
The main use I always seem to find for a threadpool is that it very nicely manages a very common problem: producer-consumer. In this pattern, someone needs to constantly send work items (the producer) to be processed by someone else (the consumers). The work items are obtained from some stream-like source, like a socket, a database, or a collection of disk files, and need multiple workers in order to be processed efficiently. The main components identifiable here are:
the producer: a thread that keeps posting jobs
a queue where the jobs are posted
the consumers: worker threads that take jobs from the queue and execute them
In addition to this, synchronization needs to be employed to make all this work correctly, since reading and writing to the queue without synchronization can lead to corrupted and inconsistent data. Also, we need to make the system efficient, since the consumers should not waste CPU cycles when there is nothing to do.
Now this pattern is very common, but implementing it from scratch takes considerable effort, is error-prone, and needs to be carefully reviewed.
The solution is the thread pool. It very conveniently manages the work queue, the consumer threads and all the synchronization needed. All you need to do is play the role of the producer and feed the pool with tasks!
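To make the "just play the producer" point concrete, here is a small sketch; the fixed pool size and the printed message are placeholders for real consumers and real work items.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ProducerConsumerWithPool {
    public static void main(String[] args) throws InterruptedException {
        // The pool owns the work queue and the consumer threads
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // The main thread plays the producer: it just feeds tasks to the pool
        for (int i = 0; i < 20; i++) {
            final int item = i; // stands in for a socket message, DB row, disk file, ...
            pool.submit(() -> System.out.println(
                    Thread.currentThread().getName() + " processed item " + item));
        }

        pool.shutdown();                          // no more work will be submitted
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}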
I would start with a problem and only then try to find a solution for it.
If you start the way you have, you can have a solution looking for a problem to solve and you are likely to use it inappropriately.
If you can't think of a use for thread pools, don't use them. ;)
A common mistake people make is to assume that because they have lots of CPUs now, they have to use them all, as if this were a reason in itself. It's like saying I have lots of disk space, I must find a way to use all of it.
A good reason to use thread pools is to improve the performance of CPU-bound processes and the simplicity of IO-bound processes (rather than using non-blocking IO with one thread).
If you have a busy CPU-bound process that performs tasks which can be executed independently, you have a good use case for a thread pool.
Note: a thread pool often has just one thread. There are specific static factories for these. If you want a simple background worker, this may be an option.
Note 2: a common mistake is to assume that CPU-bound tasks will run best on hundreds or thousands of threads. The optimal number of threads is often the number of cores or CPUs you have. Once all of these are busy, you may find that additional threads just add overhead.
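A common starting point for CPU-bound work, sketched here with placeholder names, is simply to size the pool to the core count (and the single-thread factory mentioned above covers the background-worker case):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CoreSizedPool {
    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService cpuPool = Executors.newFixedThreadPool(cores);     // one thread per core for CPU-bound tasks
        ExecutorService background = Executors.newSingleThreadExecutor();  // a simple single-threaded background worker

        cpuPool.shutdown();
        background.shutdown();
    }
}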
Initializing a new thread (and its own stack) is a costly operation.
Thread pools are used to avoid this cost by reusing threads already created. Thus, using thread pools, you get better performance than creating new threads every time.
Also note that created threads might need to be "deleted" after they have been used, which increases the cost of garbage collection and the frequency at which it will happen (as the memory fills up faster).
This analysis is just from the performance point of view. I cannot think of an advantage of using thread pools in terms of scalability at the moment.
I googled "why use java thread pools" and found:
A thread pool offers a solution to both the problem of thread
life-cycle overhead and the problem of resource thrashing.
http://www.ibm.com/developerworks/library/j-jtp0730/index.html
and
The newCachedThreadPool method creates an executor with an expandable
thread pool. This executor is suitable for applications that launch
many short-lived tasks.
The newSingleThreadExecutor method creates an
executor that executes a single task at a time.
http://docs.oracle.com/javase/tutorial/essential/concurrency/pools.html
Assume that I have a set of objects that need to be analyzed in two different ways, both of which take a relatively long time and involve IO calls. I am trying to figure out how/if I could go about optimizing this part of my software, especially by utilizing the multiple processors (the machine I am sitting on, for example, is an 8-core i7 which almost never goes above 10% load during execution).
I am quite new to parallel-programming or multi-threading (not sure what the right term is), so I have read some of the prior questions, particularly paying attention to highly voted and informative answers. I am also in the process of going through the Oracle/Sun tutorial on concurrency.
Here's what I've thought out so far:
A thread-safe collection holds the objects to be analyzed
As soon as there are objects in the collection (they come a couple at a time from a series of queries), a thread per object is started
Each specific thread takes care of the initial pre-analysis preparations, and then calls on the analyses.
The two analyses are implemented as Runnables/Callables, and thus called on by the thread when necessary.
And my questions are:
Is this a reasonable scheme, if not, how would you go about doing this?
In order to make sure things don't get out of hand, should I implement a ThreadManager or something of that sort, which starts and stops threads, and re-distributes them when they are complete? For example, if I have 256 objects to be analyzed and 16 threads in total, the ThreadManager assigns the first finished thread to the 17th object to be analyzed, etc.
Is there a dramatic difference between Runnable and Callable other than the fact that Callable can return a result? Otherwise, should I try to implement my own interface, and in that case, why?
Thanks,
You could use a BlockingQueue implementation to hold your objects and spawn your threads from there. This interface is based on the producer-consumer principle. The put() method will block if your queue is full until there is some more space and the take() method will block if the queue is empty until there are some objects again in the queue.
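A tiny sketch of those blocking semantics (the capacity, counts, and printed message are arbitrary):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BlockingQueueDemo {
    public static void main(String[] args) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10); // bounded to 10 waiting objects

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    queue.put("object-" + i); // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    String obj = queue.take(); // blocks while the queue is empty
                    System.out.println("analyzing " + obj);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
    }
}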
An ExecutorService can help you manage your pool of threads.
If you are awaiting a result from your spawned threads, then the Callable interface is a good one to use, since you can start the computation earlier and work in your code assuming the results are available as Futures. As for the differences with the Runnable interface, from the Callable javadoc:
The Callable interface is similar to Runnable, in that both are designed for classes whose instances are potentially executed by another thread. A Runnable, however, does not return a result and cannot throw a checked exception.
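Here is a rough sketch of submitting Callables and collecting the results through Futures; the squaring stands in for one of your analyses, and the counts are arbitrary.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CallableDemo {
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4);

        List<Future<Integer>> results = new ArrayList<>();
        for (int i = 0; i < 8; i++) {
            final int n = i;
            Callable<Integer> analysis = () -> n * n; // stands in for one analysis of one object
            results.add(pool.submit(analysis));       // returns immediately with a Future
        }

        for (Future<Integer> f : results) {
            System.out.println(f.get()); // blocks until that particular result is ready
        }
        pool.shutdown();
    }
}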
Some general things you need to consider in your quest for java concurrency:
Visibility does not come by default. volatile, AtomicReference, and the other classes in the java.util.concurrent.atomic package are your friends.
You need to carefully ensure atomicity of compound actions using synchronization and locks.
Your idea is basically sound. However, rather than creating threads directly, or indirectly through some kind of ThreadManager of your own design, use an Executor from Java's concurrency package. It does everything you need, and other people have already taken the time to write and debug it. An executor manages a queue of tasks, so you don't need to worry about providing the threadsafe queue yourself either.
There's no difference between Callable and Runnable except that the former returns a value. Executors will handle both, and treat them the same way.
It's not clear to me whether you're planning to make the preparation step a separate task to the analyses, or fold it into one of them, with that task spawning the other analysis task halfway through. I can't think of any reason to strongly prefer one to the other, but it's a choice you should think about.
The Executors class provides factory methods for creating thread pools. Specifically, Executors#newFixedThreadPool(int nThreads) creates a thread pool with a fixed size that utilizes an unbounded queue. Also, if a thread terminates due to a failure, then a new thread will be created to take its place. So in your specific example of 256 tasks and 16 threads, you would call:
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// create a pool with 16 worker threads
ExecutorService threadPool = Executors.newFixedThreadPool(16);

// submit a task; the lambda body stands in for your real work
Runnable task = () -> System.out.println("analyzing one object");
threadPool.submit(task);
The important question is determining the proper number of threads for your thread pool. See if this helps: Efficient Number of Threads
Sounds reasonable, but it's not as trivial to implement as it may seem.
Maybe you should check the jsr166y project.
That's probably the easiest solution to your problem.