Should I limit the number of Executors I have? - java

I have a Java project where I need to run things in parallel. I do this with executors. The thing is, I need to use executors in a great many places. Should I favor passing a few executors around to do the work (forget about limiting the global number of threads for a moment) or is it preferable to create the executors where I need them?

What you really need to think about is controlling the number of Threads working off any Executors you create.
The number of threads you create off each executor will be a function of the frequency of arrival and expected duration (processing time) of each task being submitted. Having a queue per logical task type allows you to tune the executor for just that task, so that you don't have more threads than required, and you can always keep up with the expected task throughput.
If you have one monolithic Executor shared across all processing stages in your app it becomes much harder to tune.
SEDA is a typical concurrency pattern that reflects this principle of queue per processing stage.
In some instances it does make sense to have a shared executor, such as for infrequent, ad-hoc or low priority scheduled tasks.
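As a minimal sketch of the executor-per-stage idea, assuming two hypothetical stages ("parse" and "store") with placeholder pool sizes, each stage gets its own executor so it can be sized independently. The class and method names here are illustrative only.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class StagedPipeline {
    // Hypothetical stages; the pool sizes are placeholders to be tuned per stage.
    private final ExecutorService parseStage = Executors.newFixedThreadPool(2);
    private final ExecutorService storeStage = Executors.newFixedThreadPool(4);

    // Runs the first step on the parse pool, then hands the result to the store pool.
    CompletableFuture<String> process(String raw) {
        return CompletableFuture
                .supplyAsync(() -> raw.trim().toUpperCase(), parseStage)
                .thenApplyAsync(parsed -> "stored:" + parsed, storeStage);
    }

    void shutdown() {
        parseStage.shutdown();
        storeStage.shutdown();
    }

    public static void main(String[] args) {
        StagedPipeline pipeline = new StagedPipeline();
        try {
            System.out.println(pipeline.process("  hello ").join()); // prints "stored:HELLO"
        } finally {
            pipeline.shutdown();
        }
    }
}
```

If one stage falls behind, only that stage's pool size (or queue) needs adjusting.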

There's no strict rule that will tell you how many executors should be used. One thing, though, can be recommended: use some dependency injection mechanism or framework to inject executor implementations. This will allow quick and easy replacement and configuration of the executors in use.
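A rough illustration of injecting an executor, using plain constructor injection rather than any particular framework. ReportService and its counter are hypothetical names; the point is only that the class depends on the Executor interface and the caller decides which implementation to supply.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ReportService {
    private final Executor executor;
    final AtomicInteger generated = new AtomicInteger();

    // The executor is supplied from outside, so it can be swapped or reconfigured freely.
    ReportService(Executor executor) {
        this.executor = executor;
    }

    void generateAsync() {
        executor.execute(generated::incrementAndGet);
    }

    public static void main(String[] args) throws InterruptedException {
        // Production-style wiring: a real thread pool.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        ReportService service = new ReportService(pool);
        service.generateAsync();
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.SECONDS);
        System.out.println(service.generated.get());

        // Test-style wiring: an executor that runs each task on the calling thread.
        ReportService inline = new ReportService(Runnable::run);
        inline.generateAsync();
        System.out.println(inline.generated.get());
    }
}
```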

Related

Is there a default thread pool in java

I can create a new thread pool in Java and execute tasks on it using the Executors.newFixedThreadPool and ExecutorService.submit methods.
Is there a 'default' threadpool that I can reuse for all executor services in my java program? Or do I just have to create a singleton that contains a default threadpool? C# has a default threadpool that runs tasks when the Task.Factory.StartNew method is called.
Since Java 8 there's ForkJoinPool.commonPool(), which is used by default by many methods involving parallel or asynchronous execution. For example, Arrays.parallelSort() and parallel Stream API operations use this pool. You can submit your own tasks to this pool using many methods of the CompletableFuture class, such as CompletableFuture.supplyAsync().
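A small sketch of both usages, with a hypothetical sumOfSquares helper: the parallel stream and the single-argument supplyAsync call both rely on the common pool by default. (On machines where the common pool's parallelism is below 2, CompletableFuture falls back to a new thread per async task.)

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ForkJoinPool;

class CommonPoolDemo {
    // Parallel streams run on ForkJoinPool.commonPool() by default.
    static int sumOfSquares(int[] values) {
        return Arrays.stream(values).parallel().map(v -> v * v).sum();
    }

    public static void main(String[] args) {
        // No executor argument: the task is submitted to the default async pool.
        CompletableFuture<Integer> future =
                CompletableFuture.supplyAsync(() -> sumOfSquares(new int[] {1, 2, 3}));
        System.out.println(future.join()); // prints 14

        System.out.println("common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
    }
}
```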
Using separate threadpools is good, default practice, and sharing threadpools is a (possibly premature) optimization.
Through Java 7 the answer is no, there is not a default threadpool, and the recommendation is to have many threadpools. It's good separation and will prevent blocking behavior on one collection of tasks from interfering with another.
If you share threadpools you should ask questions like:
will the logging framework be able to distinguish tasks? (Threads is one way to distinguish.)
If task pool A accidentally requests way too many threads and gets cut off, should task pool B starve? When you notice task pool B is failing will you be able to diagnose the problem in task pool A?
If pool A blocks should B starve?
Maybe you create something like a LightweightThreadpool. And the first 5 tasks you write use it in a lightweight fashion. And the 6th task... does, except it also writes errors to disk, and those errors are surprisingly big, and sometimes there's many of them, and they're not throttled. Suddenly the first 5 tasks are starved and have no idea what hit them, and furthermore, when you wrote those tasks, you really believed they were secure and might not have prepared for this type of incident.
So sharing threadpools is about as okay as having two different processes run on the same server is okay. You should think about resource management very carefully first and understand that the tasks are resource-coupled now. The lack of a default threadpool is trying to force you to use separate ones by default, and think about these questions carefully before sharing one.
As of Java 8 the answer is "yes" (per Tagir's answer on this question). But you will notice everything will start horribly failing if you submit blocking tasks to that threadpool.
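One way to keep blocking work off the common pool, sketched under the assumption that a small dedicated pool is acceptable: pass an explicit executor as the second argument to supplyAsync. slowFetch is a hypothetical stand-in for any blocking call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BlockingOffload {
    // Hypothetical blocking call, simulated with a short sleep.
    static String slowFetch(String key) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "value-for-" + key;
    }

    // The second argument routes the task to ioPool instead of the common pool.
    static String fetchOn(ExecutorService ioPool, String key) {
        return CompletableFuture.supplyAsync(() -> slowFetch(key), ioPool).join();
    }

    public static void main(String[] args) {
        ExecutorService ioPool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(fetchOn(ioPool, "a")); // prints "value-for-a"
        } finally {
            ioPool.shutdown();
        }
    }
}
```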

Best approach: tree set structure vs thread pool executor

I'm in a bit of a dilemma between a TreeSet and a ThreadPoolExecutor.
The scenario is as follows:
First Approach
I need a structure that holds tasks, each with a priority. Using the TreeSet constructor that takes a Comparator,
I can compare tasks by priority so that they are ordered properly.
After that, the tasks should be processed in priority order by iterating over the tree set and executing each task one by one.
Second Approach
The second approach is to build some logic on top of the core functionality of ThreadPoolExecutor. I took inspiration from this link and met my requirements with this approach as well: it picks the highest-priority task, executes it first, and executes all the remaining tasks the same way.
My confusion is which one is better in terms of performance cost, flexibility (increasing/decreasing threads), etc., and why I should opt for it.
Any suggestions and answers are highly appreciated.
There are two different notions of priority embedded in your question:
starting priority: in which order tasks are submitted for execution, (point 1 of your first approach explanation)
runtime priority: in which order threads are considered for scheduling (point 3)
These two properties happen to be equal in your scenario, so the tree set will help you define both of them. The executor will help you enforce them, but you will need an ad-hoc tailored executor (based on thread pooling or not), to start your threads up with a specific priority. Basically, each time a task is pulled out of the priority queue, it should be associated with a thread set at the task's priority level. I assume that this is the feature that the executor implementation found in the article you link is providing, and thus what you do.
As for thread pools, from the documentation:
Using worker threads minimizes the overhead due to thread creation. Thread objects use a significant amount of memory, and in a large-scale application, allocating and deallocating many thread objects creates a significant memory management overhead.
Worker threads are threads managed by thread pools, and are conservatively recycled (as opposed to destroyed and recreated) to handle sequences of tasks. I don't think it matters much with regard to priority handling, but it will optimise your usage of resources.
Regarding the implementation from the article, the code uses a simple blocking deque for handling incoming tasks, hence it's a plain fifo priority scheme. It doesn't reorder tasks.
I finally found the real winner of the two. I should go with the ThreadPoolExecutor for the following reasons:
Performance cost: making maximum use of resources is the main way to get performance under heavy load, so using multiple threads at peak times gives higher performance.
Flexibility: resource usage scales, i.e. during quiet periods we can reduce the number of worker threads in the thread pool executor, and increase it again under load.
Fewer iterations and minimal updates: if we maintain a tree set, each insertion goes through the comparator. That costs only O(log n), but we then have to fetch tasks from it one at a time, so processing becomes a sequential flow from a single source and we lose the advantage of a multi-threaded environment.
Faster processing: with a threaded architecture we can achieve faster output.
These were the reasons I arrived at after heavy brainstorming, googling and, last but not least, searching Stack Overflow. Thank you all for your support, and huge appreciation to @didierc for clearing this up for me.
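You can try a PriorityBlockingQueue in an ordinary thread pool.
ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(size, size, 0, TimeUnit.DAYS, new PriorityBlockingQueue<Runnable>());
threadPoolExecutor.execute(runnable);
The Runnable must implement Comparable, so in this implementation the queue takes care of priority. (A DelayQueue would not work here: it only accepts elements that implement Delayed and orders them by remaining delay.) Use execute() rather than submit(), because submit() wraps the task in a FutureTask, which is not Comparable.
This approach is easier to implement.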
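A self-contained sketch of a priority-ordered pool, using PriorityBlockingQueue as the work queue. PrioritizedTask is a hypothetical wrapper (lower number runs first). A single worker is held behind a latch while tasks are queued, purely so that the resulting order is observable.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PriorityPoolDemo {
    // Hypothetical task wrapper: a lower priority value is executed earlier.
    static final class PrioritizedTask implements Runnable, Comparable<PrioritizedTask> {
        final int priority;
        final Runnable body;

        PrioritizedTask(int priority, Runnable body) {
            this.priority = priority;
            this.body = body;
        }

        @Override public void run() { body.run(); }

        @Override public int compareTo(PrioritizedTask other) {
            return Integer.compare(priority, other.priority);
        }
    }

    static List<Integer> runInPriorityOrder(int... priorities) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new PriorityBlockingQueue<Runnable>());
        CountDownLatch gate = new CountDownLatch(1);
        List<Integer> order = new CopyOnWriteArrayList<>();
        try {
            // Occupy the only worker so the following tasks pile up in the queue.
            pool.execute(new PrioritizedTask(0, () -> {
                try {
                    gate.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
            for (int p : priorities) {
                // execute(), not submit(): submit() would wrap the task in a FutureTask.
                pool.execute(new PrioritizedTask(p, () -> order.add(p)));
            }
            gate.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(runInPriorityOrder(3, 1, 2)); // prints [1, 2, 3]
    }
}
```

Note that the queue only orders tasks that are actually waiting; a task submitted while a worker is idle starts immediately regardless of priority.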

Use Executors.newFixedThreadPool for only two threads?

For a particular action, the application creates two threads (doing different tasks) and the main thread doesn't wait for them. In some cases there may be only one thread.
If I move this to Executors.newFixedThreadPool(), does it make any difference? I understand that Executors do the thread management, which is good for multi-threading scenarios.
But I want to know whether it makes even a small difference when just two threads are changed to use executors. Please help.
Thanks in advance.
This may result in better CPU utilization when you have many threads and want to execute only a few of them at a time, but if you have only two threads I don't think using Executors brings much benefit.
From the Oracle documentation:
Thread pools address two different problems: they usually provide improved performance when executing large numbers of asynchronous tasks, due to reduced per-task invocation overhead, and they provide a means of bounding and managing the resources, including threads, consumed when executing a collection of tasks. Each ThreadPoolExecutor also maintains some basic statistics, such as the number of completed tasks.
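The statistics mentioned in the quoted documentation can be read directly from a ThreadPoolExecutor. A minimal sketch, with a hypothetical completedAfter helper that runs a number of no-op tasks on a two-thread pool and reports the completed count once the pool has terminated:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolStatsDemo {
    static long completedAfter(int tasks) {
        // Equivalent in shape to a fixed pool of two threads.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> { /* no-op task */ });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // While the pool is live this value is only approximate.
        return pool.getCompletedTaskCount();
    }

    public static void main(String[] args) {
        System.out.println(completedAfter(5)); // prints 5
    }
}
```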

Advantages of Executors over new Thread

What benefit is there to using Executors over just Threads in a Java program?
Such as
ExecutorService pool = Executors.newFixedThreadPool(2);

void someMethod() {
    // Thread
    new Thread(new SomeRunnable()).start();
    // vs
    // Executor
    pool.execute(new SomeRunnable());
}
Does an executor just limit the number of threads it allows to have running at once (Thread Pooling)? Does it actually multiplex runnables onto the threads it creates instead? If not is it just a way to avoid having to write new Thread(runnable).start() every time?
Yes, executors will generally multiplex runnables onto the threads they create; they'll constrain and manage the number of threads running at once; they'll make it much easier to customize concurrency levels. Generally, executors should be preferred over just creating bare threads.
Creating new threads is expensive. Because Executors uses a thread pool, you get to easily reuse threads, resulting in better performance.
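To see the multiplexing in practice, here is a small sketch (workerNames is a hypothetical helper) that submits many tasks to a small fixed pool and records which threads actually ran them. The number of distinct thread names never exceeds the pool size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadReuseDemo {
    static Set<String> workerNames(int poolSize, int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Set<String> names = ConcurrentHashMap.newKeySet();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                futures.add(pool.submit(() -> {
                    names.add(Thread.currentThread().getName());
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every task to finish
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return names;
    }

    public static void main(String[] args) {
        // 20 tasks, but at most 2 distinct worker threads.
        System.out.println(workerNames(2, 20));
    }
}
```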
Does an executor just limit the number of threads it allows to have running at once (Thread Pooling)?
Executors#newFixedThreadPool(int), Executors#newSingleThreadExecutor do this, each one under different terms (read the proper javadoc to know more about it).
Does it actually multiplex runnables onto the threads it creates instead?
Yes
If not is it just a way to avoid having to write new Thread(runnable).start() every time?
ExecutorService helps you to control the way you handle threads. Of course, you can do this manually, but there's no need to reinvent the wheel. Also, there are other functionalities that ExecutorService provides you like executing asynchronous tasks through the usage of Future instances.
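As an illustration of the Future-based functionality, this sketch submits a Callable and waits for its result with a timeout, cancelling the task if it takes too long. computeWithTimeout and its -1 sentinel are assumptions made for the example, not a standard API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureResultDemo {
    static int computeWithTimeout(Callable<Integer> task, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<Integer> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the task if it is still running
            return -1;           // hypothetical sentinel for "timed out"
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(computeWithTimeout(() -> 6 * 7, 1000)); // prints 42
    }
}
```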
There are multiple concerns related to threads:
managing threads
resource utilization
creation of threads
Executors provides different kinds of implementations for creating a pool of threads. Thread creation is also costly. Executors creates and manages these threads internally. Details can be found at the link below.
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html
As I said over in a related question, Threads are pretty bad. Executors (and the related concurrency classes) are pretty good:
Caveat: Around here, I strongly discourage the use of raw Threads. I
much prefer the use of Callables and FutureTasks (From the javadoc: "A
cancellable asynchronous computation"). The integration of timeouts,
proper cancelling and the thread pooling of the modern concurrency
support are all much more useful to me than piles of raw Threads.
For example, I'm currently replacing a legacy piece of code that used a disjoint Thread running in a loop with a self-timer to determine how long it should Thread.sleep() after each iteration. My replacement will use a very simple Runnable (to hold a single iteration), a ScheduledExecutorService to run the iterations, and the Future returned by the scheduleAtFixedRate method to control the timing between iterations.
While you could argue that replacement will be effectively equivalent to the legacy code, I'll have replaced an arcane snarl of Thread management and wishful thinking with a compartmentalized set of functionality that separates the concerns of the GUI (are we currently running?) from data processing (playback at 5x speed) and file management (cancel this run and choose another file).
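The replacement described above might look roughly like the following sketch. It is not the author's actual code: runIterations and its parameters are hypothetical, and the "iteration" is just a counter increment. The ScheduledFuture returned by scheduleAtFixedRate is what allows the run to be cancelled.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FixedRateDemo {
    // Runs at least `wanted` iterations at the given period, then cancels the schedule.
    static int runIterations(int wanted, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger count = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(wanted);

        // A single iteration lives in a simple Runnable; the scheduler handles the timing.
        ScheduledFuture<?> handle = scheduler.scheduleAtFixedRate(() -> {
            count.incrementAndGet();
            done.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);

        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            handle.cancel(false); // stop further iterations
            scheduler.shutdownNow();
        }
        return count.get();
    }

    public static void main(String[] args) {
        // May print a value slightly above 3 if one more iteration runs before cancel.
        System.out.println(runIterations(3, 20));
    }
}
```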

Why is there no scheduled cached thread pool provided by the Java Executors class?

Executors provides newCachedThreadPool() and newScheduledThreadPool(), but not newCachedScheduledThreadPool(), what gives here? I have an application that receives bursty messages and needs to schedule a fairly lengthy processing step after a fixed delay for each. The time constraints aren't super tight, but I would prefer to have more threads created on the fly if I exceed the pool size and then have them trimmed back during periods of inactivity. Is there something I've missed in the concurrent library, or do I need to write my own?
By design the ScheduledThreadPoolExecutor is a fixed size. You can use a single-threaded version that submits to a normal ExecutorService for performing the task. This event thread + worker pool arrangement is fairly easy to coordinate, and the flexibility makes up for the dedicated thread. I've used this in the past to replace TimerTasks and other non-critical tasks to utilize a common executor as a system-wide pool.
A workaround suggested in "Why does ScheduledThreadPoolExecutor only accept a fixed number of threads?":
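A minimal sketch of the event thread + worker pool arrangement, assuming a cached thread pool for the workers. DelayedHandoff, scheduleProcessing and process are hypothetical names; the single scheduler thread only waits out the delay and hands the real work to the elastic pool.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DelayedHandoff {
    // One dedicated thread handles timing only.
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
    // The cached pool grows under bursts and trims idle threads afterwards.
    private final ExecutorService workers = Executors.newCachedThreadPool();

    // Hypothetical stand-in for the lengthy processing step.
    static String process(String message) {
        return "processed:" + message;
    }

    CompletableFuture<String> scheduleProcessing(String message, long delayMillis) {
        CompletableFuture<String> result = new CompletableFuture<>();
        timer.schedule(() -> workers.execute(() -> {
            try {
                result.complete(process(message));
            } catch (RuntimeException e) {
                result.completeExceptionally(e);
            }
        }), delayMillis, TimeUnit.MILLISECONDS);
        return result;
    }

    void shutdown() {
        timer.shutdown();
        workers.shutdown();
    }

    public static void main(String[] args) {
        DelayedHandoff handoff = new DelayedHandoff();
        try {
            System.out.println(handoff.scheduleProcessing("m1", 100).join()); // prints "processed:m1"
        } finally {
            handoff.shutdown();
        }
    }
}
```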
scheduledExecutor = new ScheduledThreadPoolExecutor(128); //no more than 128 threads
scheduledExecutor.setKeepAliveTime(10, TimeUnit.SECONDS);
scheduledExecutor.allowCoreThreadTimeOut(true);
java.util.concurrent.Executors is nothing more than a collection of static convenience methods that construct common arrangements of executors.
If you want something specific that isn't offered by Executors, then feel free to construct your own instance of the implementation classes, using the examples in Executors as a guide.
As skaffman says, Executors is only a collection of factory methods. If you need a particular instance, you can always check all the existing Executor implementations. In your case, I think that calling one of the various constructors of ScheduledThreadPoolExecutor would be a good idea.
