Is there an ExecutorService that allows an existing thread to perform the executions instead of spawning new threads? Bonus if it’s a ScheduledExecutor. Most executors spawn worker threads to do the execution, but I want the worker thread to be an existing thread that I’m on. Here's the API that I imagine:
while (!executor.isTerminated()) {
    Runnable r = executor.take();
    r.run();
}
This is similar to the way that SWT and JavaFX allow the main thread to dispatch events, as opposed to Swing, which requires its own event dispatch thread to be spawned to handle events.
Motivation: I currently have lots of places where a thread spawns a new executor and then just calls awaitTermination() to wait for it to finish. I’d like to save some resources and keep the stack traces from being split in two.
Note that I don’t want an executor that runs tasks in execute(Runnable)’s caller threads, which is what this answer and Guava’s MoreExecutors.sameThreadExecutor() do.
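For illustration only, the API imagined above could look roughly like the following sketch. DrainingExecutor, runUntilShutdown() and the stop marker are hypothetical names, not part of the JDK or Guava: execute() only enqueues, and whichever existing thread calls runUntilShutdown() does the actual work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical class for illustration; not part of the JDK or Guava.
final class DrainingExecutor implements Executor {
    // Sentinel task that tells the draining thread to stop.
    private static final Runnable STOP = () -> { };
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    @Override
    public void execute(Runnable task) {
        queue.add(task); // only enqueues; the task runs later on the draining thread
    }

    public void shutdown() {
        queue.add(STOP); // tasks queued before this call still run
    }

    // Runs queued tasks on the calling thread until the stop marker is reached.
    public void runUntilShutdown() throws InterruptedException {
        for (Runnable r = queue.take(); r != STOP; r = queue.take()) {
            r.run();
        }
    }

    // Demo: a producer thread submits tasks, the calling thread executes them.
    static List<String> demo() {
        DrainingExecutor executor = new DrainingExecutor();
        List<String> ranOn = new ArrayList<>(); // only touched by the draining thread
        Thread producer = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                executor.execute(() -> ranOn.add(Thread.currentThread().getName()));
            }
            executor.shutdown();
        }, "producer");
        producer.start();
        try {
            executor.runUntilShutdown();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ranOn;
    }
}
```

In demo(), every recorded name is that of the thread that called runUntilShutdown(), not "producer".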
Most executors from java.util.concurrent behave exactly as you supposed. Some spawn additional threads when there are too many tasks, but usually they can be configured to set a limit.
To exploit this behaviour, do not start a new executor each time; reuse the same executor. To wait for a set of tasks to finish, use invokeAll(), or submit() followed by future.get().
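A minimal sketch of that advice, assuming a pool of four threads and a made-up workload (squaring numbers); the class and method names are only illustrative. Two batches reuse one executor, and invokeAll() returns only after every task in the batch has completed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class InvokeAllExample {
    // Runs one batch on the given pool and blocks until every task has finished.
    static int sumOfSquares(ExecutorService pool, int n) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            final int x = i;
            tasks.add(() -> x * x);
        }
        try {
            int sum = 0;
            for (Future<Integer> f : pool.invokeAll(tasks)) { // returns when all tasks are done
                sum += f.get();
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    static int[] demo() {
        ExecutorService pool = Executors.newFixedThreadPool(4); // shared by both batches
        try {
            return new int[] { sumOfSquares(pool, 3), sumOfSquares(pool, 10) };
        } finally {
            pool.shutdown(); // shut down once, when the application is done with the pool
        }
    }
}
```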
I'm assuming what you want is control over the creation of new threads, such as name, daemon-status, etc. Use a ThreadFactory:
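import java.util.concurrent.ThreadFactory;

public class MyThreadFactory implements ThreadFactory {
    @Override
    public Thread newThread(Runnable runnable) {
        // Every worker thread gets this name and is marked as a daemon
        Thread t = new Thread(runnable, "MyThreadName");
        t.setDaemon(true);
        return t;
    }
}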
This allows you to control thread creation so that the execution happens in threads that you manufacture instead of some default thread from a default ThreadFactory.
Then to use it, all of the methods in Executors take a ThreadFactory:
Executors.newExecutorOfSomeKind(new MyThreadFactory());
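As a concrete instance of the placeholder call above, here is a small sketch using Executors.newSingleThreadExecutor(ThreadFactory). The thread name "my-worker" and the class name are assumptions made for the example; the factory is written as a lambda so the snippet stands alone.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

final class NamedFactoryExample {
    // Submits one task and reports which thread ran it.
    static String workerDescription() {
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(runnable, "my-worker"); // illustrative name
            t.setDaemon(true);
            return t;
        };
        ExecutorService pool = Executors.newSingleThreadExecutor(factory);
        try {
            return pool.submit(() -> {
                Thread t = Thread.currentThread();
                return t.getName() + " daemon=" + t.isDaemon();
            }).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The task still runs on a thread created by the executor; the factory only controls how that thread is built.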
Edit: I see what you mean now. Unfortunately, the behavior of all Executor implementations (as far as I'm aware) is to create new threads to run the task, except the sameThreadExecutor you mentioned. Without going through the Thread objects that are creating executors just to execute one task (which is a horrible design -- see comments for what I mean by this), there's no easy way to accomplish what you want. I would recommend changing the code to use a single Executor with something like an ExecutorCompletionService (see this question) or use a fork/join pattern. Fork/join is made easier in Java 7 (see this Java trail). For pre-Java 7 code, read up on the counting Semaphore in Java (and in general).
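A rough sketch of the single-executor approach with ExecutorCompletionService, assuming a toy workload (string lengths) and illustrative names. The submitting thread blocks in take() until each result is ready, so no extra executor is created per batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class CompletionServiceExample {
    // Submits one task per word and collects results as they complete.
    static List<Integer> lengths(ExecutorService pool, List<String> words) {
        CompletionService<Integer> completion = new ExecutorCompletionService<>(pool);
        for (String w : words) {
            completion.submit(() -> w.length());
        }
        List<Integer> results = new ArrayList<>();
        try {
            for (int i = 0; i < words.size(); i++) {
                results.add(completion.take().get()); // completion order, not submission order
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return results;
    }

    static List<Integer> demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            return lengths(pool, Arrays.asList("a", "bb", "ccc"));
        } finally {
            pool.shutdown();
        }
    }
}
```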
Context: I've read this SO thread discussing the differences between CompletableFuture and Thread.
But I'm trying to understand when should I use new Thread() instead of runAsync().
I understand that runAsync() is more efficient for short/one-time parallel tasks because my program might avoid the overhead of creating a brand new thread, but should I also consider it for long-running operations?
What factors should I be aware of before choosing one over the other?
Thanks everyone.
The difference between using the low-level concurrency APIs (such as Thread) and others is not just about the kind of work that you want to get done, it's also about the programming model and also how easy they make it to configure the environment in which the task runs.
In general, it is advisable to use higher-level APIs, such as CompletableFuture instead of directly using Threads.
I understand that runAsync() is more efficient for short/one-time parallel tasks
Well, maybe, assuming you call runAsync, counting on it to use the fork-join pool. The runAsync method is overloaded with a method that allows one to specify the java.util.concurrent.Executor with which the asynchronous task is executed.
Using an ExecutorService for more control over the thread pool and using CompletableFuture.runAsync or CompletableFuture.supplyAsync with a specified executor service (or not) are both generally preferred to creating and running a Thread object directly.
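A minimal sketch of the overload that takes an Executor, shown here with supplyAsync so the result can be inspected. The pool size and the thread name "custom-pool-thread" are assumptions for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class RunAsyncWithExecutor {
    // Returns the name of the thread that executed the asynchronous task.
    static String demo() {
        ExecutorService pool =
                Executors.newFixedThreadPool(2, r -> new Thread(r, "custom-pool-thread"));
        try {
            return CompletableFuture
                    .supplyAsync(() -> Thread.currentThread().getName(), pool) // runs on our pool
                    .join();
        } finally {
            pool.shutdown();
        }
    }
}
```

Dropping the second argument would hand the task to the default executor instead of the pool created here.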
There's nothing particularly for or against using the CompletableFuture API for long-running tasks. But the choice one makes to use Threads has other implications as well, among which:
The CompletableFuture gives a better API for programming reactively, without forcing us to write explicit synchronization code. (we don't get this when using threads directly)
The Future interface (which is implemented by CompletableFuture) gives other additional, obvious advantages.
So, in short: you can (and probably should, if the alternative being considered is the Thread API) use the CompletableFuture API for your long-running tasks. To better control thread pools, you can combine it with executor services.
The main difference is that CompletableFuture runs your task on ForkJoinPool.commonPool() by default, whereas a thread that you create and start yourself runs as a standalone thread outside any pool. CompletableFuture also makes it easy to run tasks in sequence but asynchronously, as below.
CompletableFuture.runAsync(() -> {
    System.out.println("On first task");
    System.out.println("Thread : " + Thread.currentThread());
}).thenRun(() -> {
    System.out.println("On second task");
});
Output:
On first task
Thread : Thread[ForkJoinPool.commonPool-worker-1,5,main]
On second task
If you run the above code, the output shows which pool CompletableFuture is using.
Note: the threads in ForkJoinPool.commonPool() are daemon threads.
I read a great article about the fork-join framework in Java 7, and the idea is that, with ForkJoinPool and ForkJoinTask, the threads in the pool can pick up the subtasks of other tasks, so it's able to use fewer threads to handle more tasks.
Then I tried to use a normal ExecutorService to do the same work, and found I can't tell the difference, since when I submit a new task to the pool, the task will be run on another available thread.
The only difference I can tell is that with ForkJoinPool I don't need to pass the pool to the tasks, because I can call task.fork() to make it run on another thread. With a normal ExecutorService, I have to pass the pool to the task, or make it static, so that inside the task I can call pool.submit(newTask).
Am I missing something?
(You can view the live code at https://github.com/freewind/fork-join-test/tree/master/src)
Although ForkJoinPool implements ExecutorService, it is conceptually different from 'normal' executors.
You can easily see the difference if your tasks spawn more tasks and wait for them to complete, e.g. by calling
executor.submit(new Task()).get(); // blocks this thread until the new task completes
In a normal executor service, waiting for other tasks to complete will block the current thread. There are two possible outcomes: If your executor service has a fixed number of threads, it might deadlock if the last running thread waits for another task to complete. If your executor dynamically creates new threads on demand, the number of threads might explode and you end up having thousands of threads which might cause starvation.
In contrast, the fork/join framework reuses the waiting thread in the meantime to execute other tasks, so it won't deadlock even though the number of threads is fixed:
new MyForkJoinTask().invoke();
So if you have a problem that you can solve recursively, think of using a ForkJoinPool as you can easily implement one level of recursion as ForkJoinTask.
Just check the number of running threads in your examples.
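To make the recursion point concrete, here is a small sketch of a RecursiveTask that sums a range of numbers. RangeSum, the threshold of 1,000 and the sum() helper are assumptions for the example, not code from the linked repository.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class RangeSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // below this size, sum sequentially
    private final long from; // inclusive
    private final long to;   // exclusive

    RangeSum(long from, long to) {
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (long i = from; i < to; i++) {
                sum += i;
            }
            return sum;
        }
        long mid = from + (to - from) / 2;
        RangeSum left = new RangeSum(from, mid);
        RangeSum right = new RangeSum(mid, to);
        left.fork();                     // make the left half available to any free worker
        long rightSum = right.compute(); // work on the right half in this thread
        return left.join() + rightSum;   // join() runs pending subtasks instead of just blocking
    }

    // Sums [from, to) on a pool with the given number of worker threads.
    static long sum(long from, long to, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new RangeSum(from, to));
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling RangeSum.sum(0, 1_000_001, 1) completes even with a single worker thread, although every level of the recursion waits on a subtask.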
How does java.util.concurrent.Executor create the "real" thread?
Suppose I am implementing Executor or using any executor service (like ThreadPoolExecutor). How does JVM internally work?
It calls a ThreadFactory. Look at the Executors class: each factory method has an overload that accepts a ThreadFactory implementation. The ThreadFactory interface is basically
public Thread newThread(Runnable runnable);
and the default implementation, if none is supplied, is essentially return new Thread(runnable); (Executors.defaultThreadFactory() additionally names each thread pool-N-thread-M and makes it a non-daemon thread).
Why override this? It's very useful for setting the thread name and daemon status, among other things.
Executor is a ready-made thread management interface.
Depending on the type of executor, it creates one or more threads. After a thread finishes its task, the executor either stops it or leaves it running. You can also have an executor that runs scheduled tasks (for example, every minute). This is a good alternative to creating many (often thousands of) threads that are each needed for just five seconds, or plenty of threads that are used only from time to time.
If you specify the number of threads to create and submit more tasks than there are threads, the remaining Runnable objects are queued until their turn comes. There is no JVM magic here, just Java code.
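To see what the default factory produces, a short sketch (the class and method names are illustrative) asks Executors.defaultThreadFactory() for a thread without starting it, so its properties can be inspected.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

final class DefaultFactoryExample {
    // Creates (but does not start) a thread the way executors do by default.
    static Thread newDefaultThread() {
        ThreadFactory factory = Executors.defaultThreadFactory();
        return factory.newThread(() -> { });
    }

    public static void main(String[] args) {
        Thread t = newDefaultThread();
        // The name follows the pattern pool-N-thread-M; the thread is not a daemon.
        System.out.println(t.getName() + " daemon=" + t.isDaemon());
    }
}
```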
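The queuing behaviour can be observed directly. In this sketch (illustrative names, and a latch used only to keep the workers busy) five tasks go to a pool of two threads, and the remaining three sit in the work queue.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class QueueingExample {
    // Submits blocking tasks and reports how many are waiting in the queue.
    static int queuedCount(int threads, int tasks) {
        // Same configuration that Executors.newFixedThreadPool(threads) uses.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getQueue().size();
        } finally {
            release.countDown(); // let all tasks finish
            pool.shutdown();
        }
    }
}
```

Here queuedCount(2, 5) reports 3: the first two tasks each start a worker thread, and the rest wait their turn.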
I'm looking for a Java thread pool that won't run more threads simultaneously than there are cores in the system. This service is normally provided by a ThreadPoolExecutor using a BlockingQueue.
However, if a new thread is scheduled to execute, I want the new thread to pre-empt one of the already running threads, and add the pre-empted thread (in a suspended state) to a task queue, so it can be resumed as soon as the new thread is finished.
Any suggestions?
I would make a subclass of ThreadPoolExecutor.
When you set up your ThreadPoolExecutor, set both the corePoolSize and the maximumPoolSize to Runtime.getRuntime().availableProcessors() (look at Executors.newFixedThreadPool() to see why this works).
Next you want to make sure that your Queue also implements Deque. LinkedBlockingDeque is an example but you should shop around to see which one will work best for you. A Deque allows you to get stack like LIFO behavior which is exactly what you want.
Since everything (submit(), invokeAll()) funnels through execute() you will want to override this method. Basically do what you described above:
Check if all threads are running. If not simply start the new runnable on an available thread. If all the threads are already running then you need to find the one running the oldest runnable, stop the runnable, re-queue the runnable somewhere (maybe at the beginning?), and then start your new runnable.
The idea of a ThreadPoolExecutor is to avoid all of the expensive actions related to creating and destroying a thread. If you absolutely insist on preempting the running tasks, then you won't get that from the default API.
If you are willing to allow the running tasks to complete and instead only preempt the tasks which have not begun execution, then you can use a BlockingQueue implementation which works like a Stack (LIFO).
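A sketch of the LIFO variant, under one assumption worth stating: ThreadPoolExecutor enqueues with offer(), and LinkedBlockingDeque.offer() appends to the tail, so a plain deque still behaves FIFO. The hypothetical LifoQueue below overrides offer() to push to the head instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class LifoPoolExample {
    // Hypothetical helper: offer() pushes to the head, so workers take the newest task first.
    static final class LifoQueue<E> extends LinkedBlockingDeque<E> {
        @Override
        public boolean offer(E e) {
            return offerFirst(e);
        }
    }

    // Queues tasks 1, 2, 3 behind a blocked worker and records the order they run in.
    static List<Integer> executionOrder() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LifoQueue<>());
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch gate = new CountDownLatch(1);
        pool.execute(() -> { // occupies the only worker until the gate opens
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        for (int i = 1; i <= 3; i++) {
            final int id = i;
            pool.execute(() -> order.add(id));
        }
        gate.countDown();
        pool.shutdown(); // queued tasks still run after shutdown()
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }
}
```

With a single worker the queued tasks run newest first (3, 2, 1). Tasks that have already started are never interrupted by this approach.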
You can also have tasks 'preempt' other tasks by using different executors with different thread priorities. Essentially, if the OS supports time-slicing, then the higher priority executor gets the time-slice.
Otherwise, you need a custom implementation which manages execution. You could use a SynchronousQueue and have P worker threads waiting on it. If a client calls execute and SynchronousQueue.offer fails, then you would have to create a special worker Thread which grabs one of the other Threads and flags them to halt before executing and again flags them to resume after executing.
Executor seems like a clean abstraction. When would you want to use Thread directly rather than rely on the more robust executor?
To give some history, Executors were only added as part of the java standard in Java 1.5. So in some ways Executors can be seen as a new better abstraction for dealing with Runnable tasks.
A bit of an over-simplification coming... - Executors are threads done right so use them in preference.
I use Thread when I need some pull-based message processing, e.g. a queue is take()-en in a loop in a separate thread. For example, you wrap a queue around an expensive context: let's say a JDBC connection, a JMS connection, files to process from a single disk, etc.
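A minimal sketch of that pull-based pattern, with the expensive context left as a comment and upper-casing standing in for the real processing. The stop marker and all names are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class QueueConsumerExample {
    // Sentinel message, compared by identity, that tells the consumer to stop.
    private static final String POISON = new String("<stop>");

    // Feeds messages to a dedicated consumer thread and returns what it processed.
    static List<String> consume(List<String> messages) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> processed = new ArrayList<>(); // written by the consumer, read after join()
        Thread consumer = new Thread(() -> {
            // An expensive context (e.g. a connection) would be opened once here.
            try {
                for (String msg = queue.take(); msg != POISON; msg = queue.take()) {
                    processed.add(msg.toUpperCase());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "queue-consumer");
        consumer.start();
        queue.addAll(messages);
        queue.add(POISON);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed;
    }
}
```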
Before I get cursed, do you have some scenario?
Edit:
As stated by others, the Executor (ExecutorService) interface has more potential, as you can use the Executors to select a behavior: scheduled, prioritized, cached etc. in Java 5+ or a j.u.c backport for Java 1.4.
The executor framework protects against crashed runnables and automatically re-creates worker threads. One drawback, in my opinion, is that you have to explicitly call shutdown() and awaitTermination() before you exit your application, which is not so easy in GUI apps.
If you use bounded queues, you need to choose a RejectedExecutionHandler: with ThreadPoolExecutor's default AbortPolicy a new runnable is rejected with a RejectedExecutionException once the queue is full, and with DiscardPolicy it is silently thrown away.
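A small sketch of what happens with a bounded queue when no handler is specified, assuming a one-thread pool with a one-slot ArrayBlockingQueue (names are illustrative). The third task has neither a free thread nor a queue slot.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class RejectionExample {
    // Returns true if the third task is rejected.
    static boolean thirdTaskRejected() {
        // No handler given, so ThreadPoolExecutor falls back to AbortPolicy.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch gate = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(blocker); // runs on the single worker
            pool.execute(blocker); // waits in the one-slot queue
            pool.execute(blocker); // no free thread and no queue slot left
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        } finally {
            gate.countDown();
            pool.shutdown();
        }
    }
}
```

Passing new ThreadPoolExecutor.CallerRunsPolicy() as the handler would instead run the overflow task in the submitting thread.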
You might have a look at Brian Goetz et al: Java Concurrency in Practice (2006)
There is no advantage to using raw threads. You can always supply Executors with a Thread factory, so even the option of custom thread creation is covered.
You don't use Thread unless you need more specific behaviour that is not found in Thread itself. You then extend Thread and add your specifically wanted behaviour.
Else just use Runnable or Executor.
Well, I thought that a ThreadPoolExecutor provided better performance for it manages a pool of threads, minimizing the overhead of instantiating a new thread, allocating memory...
And if you are going to launch thousands of threads, it gives you some queuing functionality you would have to program by yourself...
Threads and Executors are different tools, used in different scenarios... As I see it, it is like asking why I should use ArrayList when I can use HashMap. They are different...
The java.util.concurrent package provides the Executor interface, which can be used to create threads.
The Executor interface provides a single method, execute, designed to be a drop-in replacement for a common thread-creation idiom. If r is a Runnable object, and e is an Executor object you can replace
(new Thread(r)).start();
with
e.execute(r);
Refer here
It's always better to prefer Executor to Thread even for single thread as below
ExecutorService fixedThreadPool = Executors.newFixedThreadPool(1);
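A short sketch of the full lifecycle for such a single-thread pool, including the shutdown() and awaitTermination() calls that a raw Thread does not need. The counter workload, the timeout and the names are assumptions for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SingleThreadLifecycle {
    // Runs three tasks on one worker thread and returns how many completed.
    static int runAndShutDown() {
        ExecutorService fixedThreadPool = Executors.newFixedThreadPool(1);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < 3; i++) {
            fixedThreadPool.execute(() -> completed.incrementAndGet());
        }
        fixedThreadPool.shutdown(); // stop accepting tasks; queued tasks still run
        try {
            fixedThreadPool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```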
You can use Thread over Executor in the scenarios below:
Your application needs only a small number of threads and the business logic is simple
A simple multi-threading model meets your requirements without a thread pool
You are confident of managing thread(s) life cycle + exception handling scenarios with help of low level APIs in below areas : Inter thread communication, Exception handling, reincarnation of threads due to unexpected errors
and one last point
If your application does not need customization of various features of ThreadPoolExecutor
ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime,
TimeUnit unit, BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory,
RejectedExecutionHandler handler)
In all other cases, you can go for ThreadPoolExecutor
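To show how the constructor parameters above interact, here is a sketch with deliberately tiny values (core size 1, maximum size 2, a one-slot queue), all of them assumptions chosen for the demonstration. An extra thread is created only once the queue is full.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolGrowthExample {
    // Returns { pool size, queue size } after submitting three blocking tasks.
    static int[] poolAndQueueSizes() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1,                                // corePoolSize
                2,                                // maximumPoolSize
                30L, TimeUnit.SECONDS,            // keepAliveTime for threads above the core size
                new ArrayBlockingQueue<>(1),      // workQueue with a single slot
                Executors.defaultThreadFactory(), // threadFactory
                new ThreadPoolExecutor.AbortPolicy()); // handler
        CountDownLatch gate = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(blocker); // starts the core thread
            pool.execute(blocker); // core thread busy, so the task is queued
            pool.execute(blocker); // queue full, so a second thread is started
            return new int[] { pool.getPoolSize(), pool.getQueue().size() };
        } finally {
            gate.countDown();
            pool.shutdown();
        }
    }
}
```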