The Java Concurrency API gives you Executor and ExecutorService interfaces to build from, and ships with several concrete implementations (ThreadPoolExecutor and ScheduledThreadPoolExecutor).
I'm completely new to Java Concurrency, and am having difficulty finding answers to several very-similarly-related questions. Rather than cluttering SO with all these tiny questions I decided to bundle them together, because there's probably a way to answer them all in one fell swoop (probably because I'm not seeing the whole picture here):
Is it common practice to implement your own Executor/ExecutorService? In what cases would you do this instead of using the two concretions I mention above? In what cases are the two concretions preferable over something "homegrown"?
I don't understand how all of the concurrent collections relate to Executors. For instance, does ThreadPoolExecutor use, say, ConcurrentLinkedQueue under the hood to queue up submitted tasks? Or are you (the API developer) supposed to select and use, say, ConcurrentLinkedQueue inside your parallelized run() method? Basically, are the concurrent collections there to be used internally by the Executors, or do you use them to help write non-blocking algorithms?
Can you configure which concurrent collections an Executor uses under the hood (to store submitted tasks), and is this common practice?
Thanks in advance!
Is it common practice to implement your own Executor/ExecutorService?
No. I've never had to do this and I've been using the concurrency package for some time. The complexity of these classes and the performance implications around getting them "wrong" mean that you should really think carefully about it before undertaking such a project.
The only time that I felt the need to implement my own executor service was when I wanted to implement a "self-run" executor service. That was until a friend showed me that there was a way to do it with a RejectedExecutionHandler.
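To make the "self-run" idea concrete: I'm assuming the handler meant here is the JDK's ThreadPoolExecutor.CallerRunsPolicy, which runs a rejected task on the submitting thread. A minimal sketch (Java 8+ for the lambdas; the class and method names are made up for illustration):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class CallerRunsDemo {
    // Returns the name of the thread that ended up running the overflow task.
    static String overflowThreadName() {
        // One worker and a zero-capacity queue: a second task cannot be
        // accepted while the first one is still running.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new SynchronousQueue<Runnable>(),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        String[] ranOn = new String[1];
        try {
            // Occupy the only worker until the latch is released.
            pool.execute(() -> awaitQuietly(release));
            // The pool rejects this task, so the policy runs it on the caller.
            pool.execute(() -> ranOn[0] = Thread.currentThread().getName());
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return ranOn[0];
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("overflow task ran on: " + overflowThreadName());
    }
}
```

The overflow task runs synchronously inside execute(), which also throttles the submitter while the pool is saturated.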
The only reason why I'd wanted to tweak the behavior of the ThreadPoolExecutor was to have it start all of the threads up to the max-threads and then stick the jobs into the queue. By default the ThreadPoolExecutor starts min-threads and then fills the queue before starting another thread. Not what I expect or want. But then I'd just be copying the code from the JDK and changing it -- not implementing it from scratch.
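The queue-before-extra-threads behaviour described above can be observed directly. This is only a sketch with hypothetical names: a pool with 1 core thread, a maximum of 4, and an unbounded LinkedBlockingQueue receives four blocking tasks.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class QueueBeforeMaxDemo {
    // Returns {pool size, queued tasks} after submitting four blocking tasks.
    static int[] poolSizeAndQueued() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 4, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < 4; i++) {
                pool.execute(() -> awaitQuietly(release));
            }
            // The first task started the core thread; the other three were
            // queued because the unbounded queue never reports "full".
            return new int[] { pool.getPoolSize(), pool.getQueue().size() };
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] result = poolSizeAndQueued();
        System.out.println("threads=" + result[0] + ", queued=" + result[1]);
    }
}
```

With an unbounded queue the maximum of 4 is never reached, which is exactly the surprise the answer describes.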
I don't understand how all of the concurrent collections relate to Executors. For instance, does ThreadPoolExecutor use, say, ConcurrentLinkedQueue under the hood to queue up submitted tasks?
If you are using one of the Executors helper methods then you don't have to worry about this. If you are instantiating ThreadPoolExecutor yourself then you provide the BlockingQueue to use.
public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}
Versus:
ExecutorService threadPool =
    new ThreadPoolExecutor(minThreads, maxThreads, 0L, TimeUnit.MILLISECONDS,
                           new SynchronousQueue<Runnable>());
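To tie this back to the question: the work queue must be a BlockingQueue, so ConcurrentLinkedQueue (which only implements Queue) cannot be passed in. A small sketch of choosing the queue yourself; the class name, pool sizes and capacity are arbitrary assumptions:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class QueueChoiceDemo {
    // Builds a pool around whichever BlockingQueue the caller supplies.
    static ThreadPoolExecutor poolWith(BlockingQueue<Runnable> workQueue) {
        return new ThreadPoolExecutor(2, 4, 30L, TimeUnit.SECONDS, workQueue);
    }

    // Checks that the executor really stores tasks in the queue we passed in.
    static boolean usesSuppliedQueue() {
        BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(100);
        ThreadPoolExecutor pool = poolWith(queue);
        try {
            return pool.getQueue() == queue;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("uses supplied queue: " + usesSuppliedQueue());
    }
}
```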
Can you configure which concurrent collections an Executor uses under the hood (to store submitted tasks), and is this common practice?
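See the previous answer: the BlockingQueue is a constructor argument of ThreadPoolExecutor, so you choose it whenever you instantiate the pool yourself.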
Related
Can ThreadPoolExecutor change its BlockingQueue after start? I am using multiple ThreadPoolExecutors in my process, and I don't want the total number of threads in the process to exceed a certain limit. That is why I thought of switching the BlockingQueue of my thread pool to a busier BlockingQueue. But I don't see any method in the ThreadPoolExecutor class that allows switching BlockingQueues. What could be the reason behind this?
Apparently ThreadPoolExecutor gives access to its BlockingQueue. I can achieve the same behavior by transferring tasks from one queue to another queue.
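A rough sketch of that transfer idea, using getQueue() and drainTo(). The Javadoc says access to the queue is intended primarily for debugging and monitoring, so treat this as an illustration rather than a recommendation. The names "busy" and "idle" are mine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the JDK.
class QueueTransferDemo {
    // Moves tasks still waiting in one pool's queue to another pool and
    // returns how many of them completed there.
    static int moveQueuedTasks() {
        ThreadPoolExecutor busy = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        ExecutorService idle = Executors.newFixedThreadPool(2);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger completed = new AtomicInteger();
        try {
            // Keep the single worker of the busy pool occupied.
            busy.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // These three tasks can only wait in the busy pool's queue.
            for (int i = 0; i < 3; i++) {
                busy.execute(completed::incrementAndGet);
            }
            // Remove the waiting tasks and hand them to the other pool.
            List<Runnable> pending = new ArrayList<>();
            busy.getQueue().drainTo(pending);
            for (Runnable task : pending) {
                idle.execute(task);
            }
            idle.shutdown();
            idle.awaitTermination(5, TimeUnit.SECONDS);
            return completed.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            busy.shutdown();
            idle.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("tasks moved and completed: " + moveQueuedTasks());
    }
}
```

Tasks handed in through submit() sit in the queue wrapped in a FutureTask, but they are still Runnables and can be moved the same way.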
Immutable objects are usually favoured in modern programming practices. It usually makes things simpler with regard to object model growth and future enhancements (and no, I don't consider Python's approach of "Let's all be responsible adults" modern, for the sake of the argument).
As for solving your problem, you could perhaps pass a smart "delegating" BlockingQueue implementation that implements the standard interface but backs it with some queue-switching mechanism, controlled internally or externally as your specification requires.
What benefit is there to use Executors over just Threads in a Java program.
Such as
ExecutorService pool = Executors.newFixedThreadPool(2);
void someMethod() {
//Thread
new Thread(new SomeRunnable()).start();
//vs
//Executor
pool.execute(new SomeRunnable());
}
Does an executor just limit the number of threads it allows to have running at once (Thread Pooling)? Does it actually multiplex runnables onto the threads it creates instead? If not is it just a way to avoid having to write new Thread(runnable).start() every time?
Yes, executors will generally multiplex runnables onto the threads they create; they'll constrain and manage the number of threads running at once; they'll make it much easier to customize concurrency levels. Generally, executors should be preferred over just creating bare threads.
Creating new threads is expensive. Because an executor typically uses a thread pool, you get to reuse threads easily, resulting in better performance.
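The reuse is easy to see by recording which threads actually run the tasks. A small sketch (hypothetical class name; Java 8+ for ConcurrentHashMap.newKeySet()):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class ThreadReuseDemo {
    // Runs the given number of tasks on a fixed pool and returns how many
    // distinct threads executed them.
    static int distinctThreadsFor(int tasks, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> threadNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return threadNames.size();
    }

    public static void main(String[] args) {
        System.out.println("10 tasks ran on "
                + distinctThreadsFor(10, 2) + " thread(s)");
    }
}
```

Ten tasks never use more than the two pooled threads, whereas new Thread(...).start() would have created ten.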
Does an executor just limit the number of threads it allows to have running at once (Thread Pooling)?
Executors#newFixedThreadPool(int) and Executors#newSingleThreadExecutor do this, each under different terms (read their javadoc to know more about it).
Does it actually multiplex runnables onto the threads it creates instead?
Yes
If not is it just a way to avoid having to write new Thread(runnable).start() every time?
ExecutorService helps you to control the way you handle threads. Of course, you can do this manually, but there's no need to reinvent the wheel. Also, there are other functionalities that ExecutorService provides you like executing asynchronous tasks through the usage of Future instances.
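For the Future part, a minimal sketch of submitting a Callable and collecting its result later. The summing task and the class name are only placeholders:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class, not part of the JDK.
class FutureDemo {
    // Computes the sum on a pool thread and waits up to five seconds for it.
    static int sumAsync(int[] values) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Integer> task = () -> {
                int sum = 0;
                for (int v : values) {
                    sum += v;
                }
                return sum;
            };
            Future<Integer> future = pool.submit(task);
            // The caller could do other work here before asking for the result.
            return future.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumAsync(new int[] {1, 2, 3}));
    }
}
```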
There are multiple concerns related to threads:
managing threads
resource utilization
creation of threads
Executors provides different kinds of implementations for creating a pool of threads. Thread creation is also a costly affair, and executors create and manage these threads internally. Details can be found at the link below.
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html
As I said over in a related question, Threads are pretty bad. Executors (and the related concurrency classes) are pretty good:
Caveat: Around here, I strongly discourage the use of raw Threads. I
much prefer the use of Callables and FutureTasks (From the javadoc: "A
cancellable asynchronous computation"). The integration of timeouts,
proper cancelling and the thread pooling of the modern concurrency
support are all much more useful to me than piles of raw Threads.
For example, I'm currently replacing a legacy piece of code that used a disjoint Thread running in a loop with a self-timer to determine how long it should Thread.sleep() after each iteration. My replacement will use a very simple Runnable (to hold a single iteration), a ScheduledExecutorService to run the iterations, and the Future returned by the scheduleAtFixedRate method to tune the timing between iterations.
While you could argue that replacement will be effectively equivalent to the legacy code, I'll have replaced an arcane snarl of Thread management and wishful thinking with a compartmentalized set of functionality that separates the concerns of the GUI (are we currently running?) from data processing (playback at 5x speed) and file management (cancel this run and choose another file).
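A stripped-down sketch of that kind of replacement, with a latch countdown standing in for "one iteration". The 10 ms period and the names are my assumptions, not the author's code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class FixedRateDemo {
    // Runs a periodic task until it has executed three times, then cancels it.
    static boolean runThreeIterationsThenCancel() {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        CountDownLatch threeRuns = new CountDownLatch(3);
        try {
            // Each run of the Runnable is a single iteration of the old loop.
            ScheduledFuture<?> handle = scheduler.scheduleAtFixedRate(
                    threeRuns::countDown, 0, 10, TimeUnit.MILLISECONDS);
            boolean ranThreeTimes = threeRuns.await(5, TimeUnit.SECONDS);
            // The returned future is the handle for stopping the iterations.
            handle.cancel(false);
            return ranThreeTimes && handle.isCancelled();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("ran and cancelled: " + runThreeIterationsThenCancel());
    }
}
```

Changing the rate means cancelling the handle and scheduling the same Runnable again with a new period.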
Assume that I have a set of objects that need to be analyzed in two different ways, both of which take a relatively long time and involve IO calls. I am trying to figure out how/if I could go about optimizing this part of my software, especially by utilizing multiple processors (the machine I am sitting at, for example, is an 8-core i7 which almost never goes above 10% load during execution).
I am quite new to parallel-programming or multi-threading (not sure what the right term is), so I have read some of the prior questions, particularly paying attention to highly voted and informative answers. I am also in the process of going through the Oracle/Sun tutorial on concurrency.
Here's what I thought out so far;
A thread-safe collection holds the objects to be analyzed
As soon as there are objects in the collection (they come a couple at a time from a series of queries), a thread per object is started
Each specific thread takes care of the initial pre-analysis preparations; and then calls on the analyses.
The two analyses are implemented as Runnables/Callables, and thus called on by the thread when necessary.
And my questions are:
Is this a reasonable scheme, if not, how would you go about doing this?
In order to make sure things don't get out of hand, should I implement a ThreadManager or something of that sort, which starts and stops threads, and re-distributes them when they are complete? For example, if I have 256 objects to be analyzed, and 16 threads in total, the ThreadManager assigns the first finished thread to the 17th object to be analyzed, etc.
Is there a dramatic difference between Runnable/Callable other than the fact that Callable can return a result? Otherwise should I try to implement my own interface, in that case why?
Thanks,
You could use a BlockingQueue implementation to hold your objects and spawn your threads from there. This interface is based on the producer-consumer principle. The put() method will block if your queue is full until there is some more space and the take() method will block if the queue is empty until there are some objects again in the queue.
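A minimal producer-consumer sketch of the put()/take() behaviour just described. The capacity of 2 is deliberately small so that put() really has to wait, and the names are illustrative.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class, not part of the JDK.
class ProducerConsumerDemo {
    // Produces the numbers 1..items on one thread and sums them on the caller.
    static int consumeAll(int items) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= items; i++) {
                    queue.put(i); // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        int sum = 0;
        try {
            for (int i = 0; i < items; i++) {
                sum += queue.take(); // blocks while the queue is empty
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(consumeAll(10));
    }
}
```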
An ExecutorService can help you manage your pool of threads.
If you are awaiting a result from your spawned threads, then the Callable interface is a good idea, since you can start the computation early and carry on in your code, picking up the results later from Futures. As for the differences from the Runnable interface, from the Callable javadoc:
The Callable interface is similar to Runnable, in that both are designed for classes whose instances are potentially executed by another thread. A Runnable, however, does not return a result and cannot throw a checked exception.
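The checked-exception half of that difference is the less obvious one, so here is a sketch of it. The failing task is invented for illustration. A Callable may throw IOException directly, and the caller sees it as the cause of an ExecutionException from Future.get().

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not part of the JDK.
class CallableVsRunnableDemo {
    // Submits a Callable that throws a checked exception and reports what
    // the caller observes.
    static String describeFailure() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Runnable.run() could not throw IOException without wrapping it.
            Callable<String> failing = () -> {
                throw new IOException("disk unavailable");
            };
            Future<String> future = pool.submit(failing);
            future.get();
            return "no failure";
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            return cause.getClass().getSimpleName() + ": " + cause.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFailure());
    }
}
```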
Some general things you need to consider in your quest for java concurrency:
Visibility is not guaranteed by default. volatile, AtomicReference and the other classes in the java.util.concurrent.atomic package are your friends.
You need to carefully ensure atomicity of compound actions using synchronization and locks.
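As one illustration of both points, a shared counter is the classic compound action (read, add, write). This sketch uses AtomicInteger for it; a plain int field with counter++ would be allowed to lose updates. Names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the JDK.
class AtomicCounterDemo {
    // Increments one shared counter from several pool threads.
    static int countWithAtomic(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.incrementAndGet(); // atomic read-modify-write
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithAtomic(4, 1000));
    }
}
```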
Your idea is basically sound. However, rather than creating threads directly, or indirectly through some kind of ThreadManager of your own design, use an Executor from Java's concurrency package. It does everything you need, and other people have already taken the time to write and debug it. An executor manages a queue of tasks, so you don't need to worry about providing the threadsafe queue yourself either.
The only differences between Callable and Runnable are that the former returns a value and can throw a checked exception. An ExecutorService will handle both, and treat them the same.
It's not clear to me whether you're planning to make the preparation step a separate task to the analyses, or fold it into one of them, with that task spawning the other analysis task halfway through. I can't think of any reason to strongly prefer one to the other, but it's a choice you should think about.
The Executors class provides factory methods for creating thread pools. Specifically, Executors#newFixedThreadPool(int nThreads) creates a thread pool with a fixed size that uses an unbounded queue. Also, if a thread terminates due to a failure, a new thread will take its place. So in your specific example of 256 tasks and 16 threads you would call
// create pool
ExecutorService threadPool = Executors.newFixedThreadPool(16);
// submit task
Runnable task = new Runnable() {
    public void run() {
        // analyze one object here
    }
};
threadPool.submit(task);
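Scaled up to the whole batch, one possible shape is invokeAll(), which submits every task and waits for all of them. This is only a sketch: the "analysis" is a placeholder (string length) and all names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not part of the JDK.
class BatchAnalysisDemo {
    // Analyzes every object on a 16-thread pool; results keep input order.
    static List<Integer> analyzeAll(List<String> objects) {
        ExecutorService pool = Executors.newFixedThreadPool(16);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String object : objects) {
                tasks.add(() -> object.length()); // placeholder analysis
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll blocks until every task has finished.
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> objects = new ArrayList<>();
        for (int i = 0; i < 256; i++) {
            objects.add("object-" + i);
        }
        System.out.println(analyzeAll(objects).size() + " results");
    }
}
```

The pool does the "first free thread takes the next object" bookkeeping that the ThreadManager in the question was meant to do.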
The important question is determining the proper number of threads for your thread pool. See if this helps: Efficient Number of Threads
Sounds reasonable, but it's not as trivial to implement as it may seem.
Maybe you should check the jsr166y project.
That's probably the easiest solution to your problem.
Executors provides newCachedThreadPool() and newScheduledThreadPool(), but not newCachedScheduledThreadPool(), what gives here? I have an application that receives bursty messages and needs to schedule a fairly lengthy processing step after a fixed delay for each. The time constraints aren't super tight, but I would prefer to have more threads created on the fly if I exceed the pool size and then have them trimmed back during periods of inactivity. Is there something I've missed in the concurrent library, or do I need to write my own?
By design the ScheduledThreadPoolExecutor is a fixed size. You can use a single-threaded version that submits to a normal ExecutorService for performing the task. This event thread + worker pool is fairly easy to coordinate, and the flexibility makes up for the dedicated thread. I've used this in the past to replace TimerTasks and other non-critical tasks to utilize a common executor as a system-wide pool.
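One way to read that suggestion, as a sketch: a single scheduler thread handles only the delay and hands the real work to a cached pool that grows and shrinks on demand. The 20 ms delay and the latch standing in for the lengthy step are my assumptions.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class ScheduledHandOffDemo {
    // Schedules one delayed hand-off per message; returns true once all
    // messages have been processed by the worker pool.
    static boolean processBurst(int messages) {
        ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();
        ExecutorService workers = Executors.newCachedThreadPool();
        CountDownLatch processed = new CountDownLatch(messages);
        try {
            for (int i = 0; i < messages; i++) {
                // The timer thread only forwards; the lengthy step (here just
                // a countdown) runs on the cached pool.
                timer.schedule(() -> workers.execute(processed::countDown),
                        20, TimeUnit.MILLISECONDS);
            }
            return processed.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.shutdown();
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("burst processed: " + processBurst(50));
    }
}
```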
A workaround is suggested in "Why does ScheduledThreadPoolExecutor only accept a fixed number of threads?":
ScheduledThreadPoolExecutor scheduledExecutor = new ScheduledThreadPoolExecutor(128); // no more than 128 threads
scheduledExecutor.setKeepAliveTime(10, TimeUnit.SECONDS);
scheduledExecutor.allowCoreThreadTimeOut(true);
java.util.concurrent.Executors is nothing more than a collection of static convenience methods that construct common arrangements of executors.
If you want something specific that isn't offered by Executors, then feel free to construct your own instance of the implementation classes, using the examples in Executors as a guide.
Like skaffman says, Executors is only a collection of factory methods. If you need a particular instance, you can always check all existing Executor implementors. In your case, I think that calling one of the various constructors of ScheduledThreadPoolExecutor would be a good idea.
Executor seems like a clean abstraction. When would you want to use Thread directly rather than rely on the more robust executor?
To give some history, Executors were only added as part of the java standard in Java 1.5. So in some ways Executors can be seen as a new better abstraction for dealing with Runnable tasks.
A bit of an over-simplification coming... - Executors are threads done right so use them in preference.
I use Thread when I need some pull-based message processing, e.g. a Queue is take()-en in a loop in a separate thread. For example, you wrap a queue in an expensive context: let's say a JDBC connection, a JMS connection, files to process from a single disk, etc.
Before I get cursed, do you have some scenario?
Edit:
As stated by others, the Executor (ExecutorService) interface has more potential, as you can use the Executors to select a behavior: scheduled, prioritized, cached etc. in Java 5+ or a j.u.c backport for Java 1.4.
The executor framework protects against crashed runnables and automatically re-creates worker threads. One drawback, in my opinion, is that you have to explicitly shutdown() and awaitTermination() your executors before you exit your application, which is not so easy in GUI apps.
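The shutdown sequence usually looks something like the sketch below, modelled on the two-phase pattern in the ExecutorService javadoc. The helper name and timeouts are mine.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, not part of the JDK.
class GracefulShutdownDemo {
    // Returns true if the pool terminated within the allowed time.
    static boolean shutdownAndAwait(ExecutorService pool,
                                    long timeout, TimeUnit unit) {
        pool.shutdown(); // stop accepting new tasks, let queued ones finish
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // interrupt tasks that are still running
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> System.out.println("working"));
        System.out.println("terminated: "
                + shutdownAndAwait(pool, 5, TimeUnit.SECONDS));
    }
}
```

In a GUI app this would typically be called from a window-closing or shutdown hook, which is where the awkwardness mentioned above comes from.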
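If you use bounded queues you need to specify a RejectedExecutionHandler; otherwise, once the queue is full and the pool is at its maximum size, new runnables are rejected with a RejectedExecutionException (the default AbortPolicy).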
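What happens without a custom handler can be checked with a tiny pool. In this sketch (hypothetical names) there is one worker and a queue of capacity 1, so the third task has nowhere to go and the default AbortPolicy throws.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
class RejectionDemo {
    // Returns true if the third task is rejected by the default handler.
    static boolean thirdTaskIsRejected() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(blocker); // runs on the single worker
            pool.execute(blocker); // waits in the queue (capacity 1)
            pool.execute(blocker); // no thread and no queue slot left
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("third task rejected: " + thirdTaskIsRejected());
    }
}
```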
You might have a look at Brian Goetz et al: Java Concurrency in Practice (2006)
There is no advantage to using raw threads. You can always supply Executors with a Thread factory, so even the option of custom thread creation is covered.
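For example, a ThreadFactory can give pool threads meaningful names and make them daemons. A sketch, with the "analysis" prefix and class name invented for illustration:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the JDK.
class NamedThreadFactoryDemo {
    // Creates daemon threads named prefix-1, prefix-2, ...
    static ThreadFactory named(String prefix) {
        AtomicInteger sequence = new AtomicInteger(1);
        return runnable -> {
            Thread thread = new Thread(runnable,
                    prefix + "-" + sequence.getAndIncrement());
            thread.setDaemon(true);
            return thread;
        };
    }

    // Asks a pool built with the factory which thread runs its tasks.
    static String workerName() {
        ExecutorService pool = Executors.newFixedThreadPool(1, named("analysis"));
        try {
            Callable<String> whoAmI = () -> Thread.currentThread().getName();
            return pool.submit(whoAmI).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(workerName());
    }
}
```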
You don't use Thread unless you need more specific behaviour that is not found in Thread itself. You then extend Thread and add your specifically wanted behaviour.
Else just use Runnable or Executor.
Well, I thought that a ThreadPoolExecutor provided better performance for it manages a pool of threads, minimizing the overhead of instantiating a new thread, allocating memory...
And if you are going to launch thousands of threads, it gives you some queuing functionality you would have to program by yourself...
Threads & Executors are different tools, used on different scenarios... As I see it, is like asking why should I use ArrayList when I can use HashMap? They are different...
The java.util.concurrent package provides the Executor interface, which can be used in place of creating threads directly.
The Executor interface provides a single method, execute, designed to be a drop-in replacement for a common thread-creation idiom. If r is a Runnable object, and e is an Executor object you can replace
(new Thread(r)).start();
with
e.execute(r);
Refer here
It's always better to prefer Executor to Thread, even for a single thread, as below:
ExecutorService fixedThreadPool = Executors.newFixedThreadPool(1);
You can use Thread over Executor in the scenarios below:
Your application needs a limited number of threads and the business logic is simple
A simple multi-threading model meets your requirements without a thread pool
You are confident of managing the thread life cycle and exception handling with the help of low-level APIs in the areas below: inter-thread communication, exception handling, reincarnation of threads after unexpected errors
and one last point:
Your application does not need to customize the various features of ThreadPoolExecutor
ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime,
TimeUnit unit, BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory,
RejectedExecutionHandler handler)
In all other cases, you can go for ThreadPoolExecutor