I'm looking for a Java thread pool that won't run more threads simultaneously than there are cores in the system. This service is normally provided by a ThreadPoolExecutor using a BlockingQueue.
However, if a new thread is scheduled to execute, I want the new thread to pre-empt one of the already running threads and add the pre-empted thread (in a suspended state) to a task queue, so that it can be resumed as soon as the new thread has finished.
Any suggestions?
I would make a subclass of ThreadPoolExecutor.
When you set up your ThreadPoolExecutor, set both the corePoolSize and the maximumPoolSize to Runtime.getRuntime().availableProcessors() (look at Executors.newFixedThreadPool() to see why this works).
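As a minimal sketch of that configuration (the class name CoreSizedPool is made up for illustration, and the plain LinkedBlockingQueue is only a placeholder for whichever queue you end up choosing):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the JDK.
final class CoreSizedPool {
    /** Builds a pool whose core and maximum sizes both equal the number of available processors. */
    static ThreadPoolExecutor create() {
        int cores = Runtime.getRuntime().availableProcessors();
        // Equal core and maximum sizes give a fixed-size pool, which is
        // the same shape that Executors.newFixedThreadPool() sets up.
        return new ThreadPoolExecutor(
                cores, cores,
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
    }
}
```

A subclass of ThreadPoolExecutor could pass the same arguments to super(...) in its constructor.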
Next you want to make sure that your Queue also implements Deque. LinkedBlockingDeque is an example but you should shop around to see which one will work best for you. A Deque allows you to get stack like LIFO behavior which is exactly what you want.
Since everything (submit(), invokeAll()) funnels through execute() you will want to override this method. Basically do what you described above:
Check if all threads are running. If not simply start the new runnable on an available thread. If all the threads are already running then you need to find the one running the oldest runnable, stop the runnable, re-queue the runnable somewhere (maybe at the beginning?), and then start your new runnable.
The idea of a ThreadPoolExecutor is to avoid all of the expensive actions related to creating and destroying a thread. If you absolutely insist on preempting the running tasks, then you won't get that from the default API.
If you are willing to allow the running tasks to complete and instead only preempt the tasks which have not begun execution, then you can use a BlockingQueue implementation which works like a Stack (LIFO).
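One way to get such a stack-like queue, sketched under the assumption that the pool enqueues tasks through offer() (which is what ThreadPoolExecutor.execute() does): subclass LinkedBlockingDeque so that offer() inserts at the head. The names LifoBlockingDeque and LifoPoolDemo are made up for this example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: offer() inserts at the head, so take()/poll() return the newest task first.
final class LifoBlockingDeque<E> extends LinkedBlockingDeque<E> {
    @Override
    public boolean offer(E e) {
        return offerFirst(e);
    }
}

final class LifoPoolDemo {
    /** Queues three tasks behind a blocked worker and returns the order in which they ran. */
    static List<Integer> runOrder() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LifoBlockingDeque<>());
        List<Integer> order = new CopyOnWriteArrayList<>();
        CountDownLatch gate = new CountDownLatch(1);
        // The first task occupies the only worker, so the next three must wait in the queue.
        pool.execute(() -> {
            try {
                gate.await();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        for (int i = 1; i <= 3; i++) {
            final int id = i;
            pool.execute(() -> order.add(id));
        }
        gate.countDown();   // release the worker; it now drains the queue, newest task first
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return order;
    }
}
```

Here the task submitted last runs first once the worker is free. Tasks that are already running are never interrupted, so this only reorders work that has not started yet.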
You can also have tasks 'preempt' other tasks by using different executors with different thread priorities. Essentially, if the OS supports time-slicing, then the higher priority executor gets the time-slice.
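A rough sketch of the two-executor idea, assuming a small custom ThreadFactory (PriorityExecutors is a made-up name). Thread priority is only a hint, and whether the operating system honours it depends on the platform.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

// Hypothetical helper, not part of the JDK.
final class PriorityExecutors {
    /** Returns a factory whose threads are created with the given priority. */
    static ThreadFactory withPriority(int priority) {
        return runnable -> {
            Thread t = new Thread(runnable);
            t.setPriority(priority);   // a scheduling hint; the OS may ignore it
            return t;
        };
    }

    /** Builds a fixed-size pool whose worker threads all use the given priority. */
    static ExecutorService newPool(int threads, int priority) {
        return Executors.newFixedThreadPool(threads, withPriority(priority));
    }
}
```

You could then create one pool with Thread.MAX_PRIORITY for urgent tasks and one with Thread.MIN_PRIORITY for background tasks, and submit each task to the pool that matches its urgency.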
Otherwise, you need a custom implementation which manages execution. You could use a SynchronousQueue and have P worker threads waiting on it. If a client calls execute and SynchronousQueue.offer fails, then you would have to create a special worker Thread which grabs one of the other Threads and flags them to halt before executing and again flags them to resume after executing.
In my Java application I have a Runnable such as:
this.runner = new Runnable() {
    @Override
    public void run() {
        // do something that takes roughly 5 seconds.
    }
};
I need to run this roughly every 30 seconds (although this can vary) in a separate thread. The nature of the code is such that I can run it and forget about it (whether it succeeds or fails). I do this as follows as a single line of code in my application:
(new Thread(this.runner)).start()
Now, this works fine. However, I'm wondering if there is any sort of cleanup I should be doing on each of the thread instances after they finish running? I am doing CPU profiling of this application in VisualVM and I can see that, over the course of 1 hour runtime, a lot of threads are being created. Is this concern valid or is everything OK?
N.B. The reason I start a new Thread instead of simply defining this.runner as a Thread, is that I sometimes need to run this.runner twice simultaneously (before the first run call has finished), and I can't do that if I defined this.runner as a Thread since a single Thread object can only be run again once the initial execution has finished.
Java objects that need to be "cleaned up" or "closed" after use conventionally implement the AutoCloseable interface. This makes it easy to do the clean up using try-with-resources. The Thread class does not implement AutoCloseable, and has no "close" or "dispose" method. So, you do not need to do any explicit clean up.
However
(new Thread(this.runner)).start()
is not guaranteed to immediately start computation of the Runnable. You might not care whether it succeeds or fails, but I guess you do care whether it runs at all. And you might want to limit the number of these tasks running concurrently. You might want only one to run at once, for example. So you might want to join() the thread (or, perhaps, join with a timeout). Joining the thread ensures that the thread has completed its computation before the caller continues. Joining the thread with a timeout increases the chance that the thread starts its computation (because the current thread will be suspended, freeing a CPU that might run the other thread).
However, creating multiple threads to perform regular or frequent tasks is not recommended. You should instead submit tasks to a thread pool. That will enable you to control the maximum amount of concurrency, and can provide you with other benefits (such as prioritising different tasks), and amortises the expense of creating threads.
You can configure a thread pool to use a fixed-length (bounded) task queue and to make submitting threads execute the submitted tasks themselves when the queue is full. By doing that you can guarantee that tasks submitted to the thread pool are (eventually) executed. The documentation of ThreadPoolExecutor.execute(Runnable) says it
Executes the given task sometime in the future
which suggests that the implementation guarantees that it will eventually run all submitted tasks, even if you do not take those specific steps to ensure that submitted tasks are executed.
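The bounded-queue configuration described above can be sketched as follows. The pool size of 2, the queue capacity of 4 and the class name CallerRunsDemo are arbitrary choices for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the JDK.
final class CallerRunsDemo {
    /** Submits the given number of tasks to a small bounded pool and returns how many ran. */
    static int runAll(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(4),                  // at most 4 tasks may wait
                new ThreadPoolExecutor.CallerRunsPolicy());   // if full, the submitter runs the task itself
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```

With CallerRunsPolicy no task is rejected while the pool is running; when the queue is full, the submitting thread is slowed down instead because it executes the task itself.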
I recommend that you look at the Concurrency API. There are numerous pre-defined methods for general use. With an ExecutorService you can call the shutdown method after submitting tasks: the executor then stops accepting new tasks, still runs the previously submitted tasks, and terminates once they have finished. Note that shutdown itself does not block; call awaitTermination if you need to wait for the tasks to finish.
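A small sketch of that approach, assuming a pool of two threads (the size and the class name ReusedThreadsDemo are illustrative). It records which threads ran the tasks, to show that a pool reuses its threads rather than creating one per task.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
final class ReusedThreadsDemo {
    /** Runs the given number of tasks on a two-thread pool and returns the names of the threads used. */
    static Set<String> threadsUsed(int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Set<String> names = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdown();                                  // no new tasks; queued ones still run
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);   // wait for the queued tasks to finish
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return names;
    }
}
```

However many tasks are submitted, the returned set never contains more than two thread names. In a long-running application you would keep one pool for the lifetime of the application and shut it down only on exit.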
For a short introduction:
https://www.baeldung.com/java-executor-service-tutorial
I read a great article about the fork-join framework in Java 7. The idea is that, with ForkJoinPool and ForkJoinTask, the threads in the pool can pick up subtasks from other tasks, so the pool can use fewer threads to handle more tasks.
Then I tried to use a normal ExecutorService to do the same work, and found I can't tell the difference, since when I submit a new task to the pool, the task will be run on another available thread.
The only difference I can tell is that if I use ForkJoinPool, I don't need to pass the pool to the tasks, because I can call task.fork() to make it run on another thread. But with a normal ExecutorService, I have to pass the pool to the task, or make it static, so that inside the task I can call pool.submit(newTask).
Am I missing something?
(You can view the living code from https://github.com/freewind/fork-join-test/tree/master/src)
Although ForkJoinPool implements ExecutorService, it is conceptually different from 'normal' executors.
You can easily see the difference if your tasks spawn more tasks and wait for them to complete, e.g. by calling
executor.submit(new Task()).get(); // blocks this thread until the new task completes
In a normal executor service, waiting for other tasks to complete will block the current thread. There are two possible outcomes: If your executor service has a fixed number of threads, it might deadlock if the last running thread waits for another task to complete. If your executor dynamically creates new threads on demand, the number of threads might explode and you end up having thousands of threads which might cause starvation.
In contrast, the fork/join framework reuses the waiting thread in the meantime to execute other tasks, so it won't deadlock even though the number of threads is fixed:
new MyForkJoinTask().invoke();
So if you have a problem that you can solve recursively, think of using a ForkJoinPool as you can easily implement one level of recursion as ForkJoinTask.
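To make the recursive case concrete, here is a minimal sketch of a RecursiveTask that sums a range of integers by splitting it in half. The class name RangeSum and the threshold of 1,000 are assumptions made for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example task: sums the integers in [from, to).
final class RangeSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;   // arbitrary cut-off below which we sum directly
    private final int from;
    private final int to;

    RangeSum(int from, int to) {
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += i;
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        RangeSum left = new RangeSum(from, mid);
        RangeSum right = new RangeSum(mid, to);
        left.fork();                        // schedule the left half in the pool
        long rightSum = right.compute();    // compute the right half in the current thread
        return left.join() + rightSum;      // while joining, the worker can run other pending tasks
    }

    /** Convenience entry point that runs the task in the common pool. */
    static long sum(int from, int to) {
        return ForkJoinPool.commonPool().invoke(new RangeSum(from, to));
    }
}
```

Calling RangeSum.sum(0, 1_000_000) creates far more subtasks than there are worker threads, yet the pool completes them without growing, because a worker that waits in join() keeps executing other subtasks.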
Just check the number of running threads in your examples.
As we create a Thread pool using Java's Executor service and submit threads to this thread pool, what is the order in which those threads get executed?
I want to ensure that threads submitted first, execute first.
For example, in the code below, I want first 5 threads to get executed first, followed by the next 5 threads and so on...
// Create a thread pool of 5 threads.
ScheduledExecutorService exService = Executors.newScheduledThreadPool(5, new ModifiedThreadFactory("ReadThreadPool"));
// Create 100 threads.
MyThread[] threads = createMyThreads(100);
// Submit these 100 threads to thread pool for execution.
for (MyThread thread : threads) {
    exService.submit(thread);
}
Does Java's thread pool provide any API for this purpose, or do we need to implement a FIFO queue at our end to achieve this?
If Java's thread pool does not provide any such functionality, I am really interested to understand the reason behind the non-existence of this functionality as it appears like a very common use-case to me.
Is it technically not possible (which I think is quite unlikely), or is it just a miss?
That's the default behavior. ScheduledThreadPoolExecutor (which you're using although you're not scheduling anything) extends ThreadPoolExecutor. Tasks submitted to a ThreadPoolExecutor are stored in a BlockingQueue until a thread is available to take and execute them. The queues used by the standard executors hand out tasks in FIFO order.
This is described in detail in the javadoc.
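A quick way to observe the FIFO behaviour is to use a single worker thread, so that the start order and the completion order coincide. This is only a sketch, and FifoOrderDemo is a made-up name.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of the JDK.
final class FifoOrderDemo {
    /** Submits tasks 0..count-1 to a single-thread executor and returns the order in which they ran. */
    static List<Integer> runOrder(int count) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        List<Integer> order = new CopyOnWriteArrayList<>();
        for (int i = 0; i < count; i++) {
            final int id = i;
            pool.execute(() -> order.add(id));   // each task records its own number when it runs
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return order;
    }
}
```

With five worker threads, tasks are still taken from the queue in submission order, but they run concurrently, so the order in which they finish is not guaranteed.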
Threads do not get executed. Threads are the entities that run tasks such as Runnable and Callable. Submitting such a task to an executor service puts it in its internal BlockingQueue until it gets picked up by a thread from its thread pool. This still tells you nothing about the order of execution, as different classes can do different things while implementing Runnable.
I'm working on a project where execution time is critical. In one of the algorithms I have, I need to save some data into a database.
What I did is call a method that does that. It fires a new thread every time it's called. I ran into an out-of-memory problem, since more than 20,000 threads were created.
My question now is: I want to start only one thread. When the method is called, it adds the job to a queue and notifies the thread; the thread sleeps when no jobs are available, and so on. Are there any design patterns or examples available online?
Run, do not walk to your friendly Javadocs and look up ExecutorService, especially Executors.newSingleThreadExecutor().
ExecutorService myXS = Executors.newSingleThreadExecutor();
// then, as needed...
myXS.submit(myRunnable);
And it will handle the rest.
Yes, you want a worker thread or thread pool pattern.
http://en.wikipedia.org/wiki/Thread_pool_pattern
See http://www.ibm.com/developerworks/library/j-jtp0730/index.html for Java examples
I believe the pattern you're looking for is called producer-consumer. In Java, you can use the blocking methods on a BlockingQueue to pass tasks from the producers (that create the jobs) to the consumer (the single worker thread). This will make the worker thread automatically sleep when no jobs are available in the queue, and wake up when one is added. The concurrent collections should also handle using multiple worker threads.
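A hand-rolled version of that producer-consumer arrangement might look like the sketch below. The class name SingleWorker, the thread name and the STOP sentinel are assumptions made for illustration; in practice Executors.newSingleThreadExecutor() gives you the same behaviour ready-made.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper: one worker thread that runs queued jobs in order.
final class SingleWorker {
    private static final Runnable STOP = () -> { };   // sentinel that tells the worker to exit
    private final BlockingQueue<Runnable> jobs = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::drain, "queue-worker");

    /** Starts the worker thread. */
    void start() {
        worker.start();
    }

    /** Called by producers; returns immediately because the queue is unbounded. */
    void submit(Runnable job) {
        jobs.add(job);
    }

    /** Lets the worker finish the jobs queued so far, then waits for it to exit. */
    void stop() {
        jobs.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }

    private void drain() {
        try {
            while (true) {
                Runnable job = jobs.take();   // blocks while the queue is empty
                if (job == STOP) {
                    return;
                }
                job.run();
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The caller creates one SingleWorker, calls start() once, and then calls submit(...) from any thread for each database write.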
Are you looking for java.util.concurrent.Executor?
That said, if you have 20,000 concurrent inserts into the database, using a thread pool will probably not save you: if the database can't keep up, the queue will get longer and longer until you run out of memory again. Also, note that an executor's queue is volatile, i.e. if the server crashes, the data in it will be gone.