Do I need to clean up Thread objects in Java? - java

In my Java application I have a Runnable such as:
this.runner = new Runnable() {
    @Override
    public void run() {
        // do something that takes roughly 5 seconds.
    }
};
I need to run this roughly every 30 seconds (although this can vary) in a separate thread. The nature of the code is such that I can run it and forget about it (whether it succeeds or fails). I do this as follows as a single line of code in my application:
(new Thread(this.runner)).start()
Now, this works fine. However, I'm wondering if there is any sort of cleanup I should be doing on each of the thread instances after they finish running? I am doing CPU profiling of this application in VisualVM and I can see that, over the course of 1 hour runtime, a lot of threads are being created. Is this concern valid or is everything OK?
N.B. The reason I start a new Thread instead of simply defining this.runner as a Thread, is that I sometimes need to run this.runner twice simultaneously (before the first run call has finished), and I can't do that if I defined this.runner as a Thread since a single Thread object can only be run again once the initial execution has finished.

Java objects that need to be "cleaned up" or "closed" after use conventionally implement the AutoCloseable interface. This makes it easy to do the clean up using try-with-resources. The Thread class does not implement AutoCloseable, and has no "close" or "dispose" method. So, you do not need to do any explicit clean up.
However
(new Thread(this.runner)).start()
is not guaranteed to immediately start computation of the Runnable. You might not care whether it succeeds or fails, but I guess you do care whether it runs at all. And you might want to limit the number of these tasks running concurrently. You might want only one to run at once, for example. So you might want to join() the thread (or, perhaps, join with a timeout). Joining the thread makes the current thread wait until the other thread has completed its computation. Joining with a timeout increases the chance that the other thread starts its computation (because the current thread is suspended, freeing a CPU that might run the other thread).
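As a minimal sketch of joining with a timeout (the class name JoinSketch and the helper runAndJoin are made up for illustration, not part of any library):

```java
import java.util.concurrent.atomic.AtomicBoolean;

class JoinSketch {
    /**
     * Starts the task on a new thread and waits up to timeoutMillis for it.
     * Returns true if the thread has terminated by the time we stop waiting.
     * Hypothetical helper; timeoutMillis should be > 0 (0 means "wait forever").
     */
    static boolean runAndJoin(Runnable task, long timeoutMillis) {
        Thread worker = new Thread(task);
        worker.start();
        try {
            // Returns when the worker ends or when the timeout elapses.
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        AtomicBoolean ran = new AtomicBoolean(false);
        boolean finished = runAndJoin(() -> ran.set(true), 1000);
        System.out.println("finished=" + finished + ", ran=" + ran.get());
    }
}
```

Nothing has to be done with the Thread object afterwards: once run() returns the thread terminates, and the object is garbage collected like any other when it is no longer referenced.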
However, creating multiple threads to perform regular or frequent tasks is not recommended. You should instead submit tasks to a thread pool. That will enable you to control the maximum amount of concurrency, and can provide you with other benefits (such as prioritising different tasks), and amortises the expense of creating threads.
You can configure a thread pool to use a fixed-length (bounded) task queue and to make submitting threads execute the submitted tasks themselves when the queue is full. By doing that you can guarantee that tasks submitted to the thread pool are (eventually) executed. The documentation of ThreadPoolExecutor.execute(Runnable) says it
Executes the given task sometime in the future
which suggests that the implementation guarantees that it will eventually run all submitted tasks, even if you do not take those specific steps to ensure submitted tasks are executed.
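One way to set up the bounded queue with "the submitter runs the task when the queue is full" is ThreadPoolExecutor with an ArrayBlockingQueue and the built-in CallerRunsPolicy. The sketch below is only an illustration: the pool size of 2, the queue capacity of 4 and the class name BoundedPoolSketch are arbitrary assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedPoolSketch {
    /** Submits taskCount trivial tasks and returns how many actually ran. */
    static int runTasks(int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2,                                   // at most two worker threads
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(4),            // bounded task queue
                new ThreadPoolExecutor.CallerRunsPolicy()); // queue full: caller runs the task
        for (int i = 0; i < taskCount; i++) {
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown(); // no new tasks are accepted; queued tasks still run
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("tasks run: " + runTasks(50));
    }
}
```

Because a rejected task runs in the submitting thread, the submitter is slowed down while the pool is saturated, which acts as simple back-pressure.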

I recommend looking at the Concurrency API. There are numerous pre-defined methods for general use. With an ExecutorService you can call the shutdown method after submitting your tasks: the executor stops accepting new tasks, still executes the previously submitted ones, and then terminates. Note that shutdown itself returns immediately; call awaitTermination if you need to wait for the tasks to finish.
For a short introduction:
https://www.baeldung.com/java-executor-service-tutorial
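For the "run it roughly every 30 seconds" part, the same API offers ScheduledExecutorService. The following is a sketch under assumptions: the class name PeriodicSketch, the counter task and the short period are placeholders chosen so the example finishes quickly; a real job would use a period such as 30 seconds.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PeriodicSketch {
    /**
     * Runs a counter task every periodMillis until it has run at least
     * `runs` times, then shuts the scheduler down and returns the count.
     */
    static int runPeriodically(int runs, long periodMillis) {
        AtomicInteger count = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        scheduler.scheduleAtFixedRate(() -> {
            count.incrementAndGet();   // stand-in for the real job
            done.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        try {
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // With the default policy, shutdown cancels the periodic task.
        scheduler.shutdown();
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println("runs: " + runPeriodically(3, 10));
    }
}
```

Executions of one task scheduled with scheduleAtFixedRate never overlap each other, so an occasional second simultaneous run would still have to be submitted separately, for example with execute on the same scheduler.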

Related

Skip to next task in a single threaded ExecutorService?

I am considering an implementation of an ExecutorService to run a series of tasks. I plan to use the internal queue to have a few tasks waiting for their turn to run. Is there some way to interrupt the task (the Runnable) that is currently running in an ExecutorService thread, and keep the thread alive to run the next task? Or is it only possible to call .shutdown() and then create a new ExecutorService?
I have found this and wanted to know if there are any other solutions.
Instead of interfering with the threads you may want to have a Task class (that extends or wraps the Runnable) which implements an interrupt mechanism (e.g. a boolean flag).
When you execute your task you need to check this flag periodically and, if it is set, the task should stop what it is doing. You might want to return a specific result at this point that tells your code that the task was cancelled successfully.
If a user now decides that they no longer require the results from this task, you will have to set this flag. However, the task might already have completed by then, so you still need to deal with the case where the result already exists but the user no longer cares about it.
An interrupt on a thread level does not guarantee that the thread stops working. This will only work if the thread is in a state where it can receive an interrupt.
Also, you should not interfere with the threads of the ExecutorService directly, as you might unintentionally stop a different task or stop the ExecutorService from working properly.
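A minimal sketch of such a flag-based task follows. CancellableTask and its methods are hypothetical names invented for this example, and the sleep stands in for one unit of real work.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class CancellableTaskSketch {
    /** Hypothetical task wrapper: works in small steps and checks a flag between them. */
    static class CancellableTask implements Runnable {
        private final AtomicBoolean cancelled = new AtomicBoolean(false);
        private final CountDownLatch started = new CountDownLatch(1);
        private volatile boolean stoppedEarly = false;

        void cancel() { cancelled.set(true); }
        boolean stoppedEarly() { return stoppedEarly; }

        @Override
        public void run() {
            started.countDown();
            for (int step = 0; step < 1_000; step++) {
                if (cancelled.get()) {      // check the flag between units of work
                    stoppedEarly = true;
                    return;
                }
                try {
                    Thread.sleep(10);       // placeholder for one unit of work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    /** Cancels the running task and reports whether the queued task still ran. */
    static boolean cancelThenRunNext() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CancellableTask longTask = new CancellableTask();
        AtomicBoolean nextRan = new AtomicBoolean(false);
        executor.execute(longTask);
        executor.execute(() -> nextRan.set(true)); // waits in the queue
        try {
            longTask.started.await(5, TimeUnit.SECONDS);
            longTask.cancel();              // the worker thread itself is left alone
            executor.shutdown();
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return longTask.stoppedEarly() && nextRan.get();
    }

    public static void main(String[] args) {
        System.out.println("next task ran after cancel: " + cancelThenRunNext());
    }
}
```

The executor's only thread survives the cancellation and picks up the next queued task, which is what the question asks for.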
Why would you want to kill that task and continue with the next one? If it is a question of time, you can specify a timeout in the method that executes the tasks, so that tasks taking longer than that are cancelled automatically. E.g.:
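ExecutorService executor = Executors.newSingleThreadExecutor();
// Task must implement Callable; invokeAll can throw InterruptedException.
executor.invokeAll(Arrays.asList(new Task()), 60, TimeUnit.SECONDS); // Timeout of 60 seconds.
executor.shutdown();
If a task has not finished within 60 seconds, invokeAll cancels it. The call itself does not throw for a timeout; instead, calling get() on the Future of a cancelled task throws a CancellationException, which you must catch.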
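To make the timeout behaviour concrete, here is a small self-contained sketch. The class name, the two example tasks and the outcome strings are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class InvokeAllTimeoutSketch {
    /** Runs a fast and a slow task; returns "done" or "cancelled" per task. */
    static List<String> runWithTimeout(long timeoutMillis) {
        Callable<String> fast = () -> "done";
        Callable<String> slow = () -> { Thread.sleep(5_000); return "done"; };
        ExecutorService executor = Executors.newSingleThreadExecutor();
        List<String> outcomes = new ArrayList<>();
        try {
            // Tasks that have not completed when the timeout expires are cancelled.
            List<Future<String>> futures = executor.invokeAll(
                    Arrays.asList(fast, slow), timeoutMillis, TimeUnit.MILLISECONDS);
            for (Future<String> future : futures) {
                try {
                    outcomes.add(future.get());
                } catch (CancellationException e) {
                    outcomes.add("cancelled");  // the task was cut off by the timeout
                } catch (ExecutionException e) {
                    outcomes.add("failed");     // the task itself threw an exception
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return outcomes;
    }

    public static void main(String[] args) {
        System.out.println(runWithTimeout(500));
    }
}
```

The slow task stops promptly here only because Thread.sleep responds to the interrupt sent on cancellation; a task that never checks for interruption keeps occupying the thread even though its Future is marked cancelled.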

Difference between ForkJoinPool and normal ExecutorService?

I read a great article about the fork-join framework in Java 7, and the idea is that, with ForkJoinPool and ForkJoinTask, the threads in the pool can pick up subtasks from other tasks, so it is able to use fewer threads to handle more tasks.
Then I tried to use a normal ExecutorService to do the same work, and found I can't tell the difference, since when I submit a new task to the pool, the task will be run on another available thread.
The only difference I can tell is that if I use ForkJoinPool, I don't need to pass the pool to the tasks, because I can call task.fork() to make it run on another thread. But with a normal ExecutorService, I have to pass the pool to the task, or make it static, so that inside the task I can call pool.submit(newTask).
Am I missing something?
(You can view the living code from https://github.com/freewind/fork-join-test/tree/master/src)
Although ForkJoinPool implements ExecutorService, it is conceptually different from 'normal' executors.
You can easily see the difference if your tasks spawn more tasks and wait for them to complete, e.g. by calling
executor.invoke(new Task()); // blocks this thread until new task completes
In a normal executor service, waiting for other tasks to complete will block the current thread. There are two possible outcomes: If your executor service has a fixed number of threads, it might deadlock if the last running thread waits for another task to complete. If your executor dynamically creates new threads on demand, the number of threads might explode and you end up having thousands of threads which might cause starvation.
In contrast, the fork/join framework reuses the waiting thread in the meantime to execute other tasks, so it won't deadlock even though the number of threads is fixed:
new MyForkJoinTask().invoke();
So if you have a problem that you can solve recursively, think of using a ForkJoinPool as you can easily implement one level of recursion as ForkJoinTask.
Just check the number of running threads in your examples.
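As an illustration of one level of recursion per task, here is a sketch that sums an array with a RecursiveTask. The class names, the threshold of 1,000 elements and the parallelism of 2 are arbitrary choices for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSumSketch {
    /** Sums data[from, to) by splitting the range in half until it is small. */
    static class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000;
        private final long[] data;
        private final int from;
        private final int to;

        SumTask(long[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                      // make the left half available to other workers
            long rightSum = right.compute();  // handle the right half in this thread
            return left.join() + rightSum;    // while joining, the worker can run other tasks
        }
    }

    /** Sums 1..n on a pool with a small, fixed parallelism. */
    static long sumTo(int n) {
        long[] data = new long[n];
        for (int i = 0; i < n; i++) {
            data[i] = i + 1;
        }
        ForkJoinPool pool = new ForkJoinPool(2);
        try {
            return pool.invoke(new SumTask(data, 0, n));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumTo(100_000));
    }
}
```

Every task above the threshold waits for a subtask, yet the computation completes with a parallelism of 2. The same nesting on a two-thread fixed pool, with tasks blocking on Future.get(), would be at risk of the deadlock described above.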

Schedule periodic tasks in Java, avoid creating new threads until necessary (like CachedThreadPool)

I have a number of tasks that I would like to execute periodically at different rates for most tasks. Some of the tasks may be scheduled for simultaneous execution though. Also, a task may need to start executing while another is currently executing.
I would also like to customize each task by setting an object for it, on which the task will operate while it is being executed.
Usually, the tasks will execute in periods of 2 to 30 minutes and will take around 4-5 seconds, sometimes up to 30 seconds when they are executed.
I've found Executors.newSingleThreadScheduledExecutor(ThreadFactory) to be almost exactly what I want, except that it might cause me problems if a new task happens to be scheduled for execution while another is already executing. This is because the executor is backed by a single execution thread.
The alternative is to use Executors.newScheduledThreadPool(corePoolSize, ThreadFactory), but this requires me to create a number of threads in a pool. I would like to avoid creating threads until it is necessary, for instance if I have two or more tasks that happen to need parallel execution due to their colliding execution schedules.
For the case above, the Executors.newCachedThreadPool(ThreadFactory) appears to do what I want, but then I can't schedule my tasks. A combination of both cached and scheduled executors would be best I think, but I am unable to find something like that in Java.
What would be the best way to implement the above do you think?
Isn't ScheduledThreadPoolExecutor.ScheduledThreadPoolExecutor(int):
ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(0);
what you need? 0 is the corePoolSize:
corePoolSize - the number of threads to keep in the pool, even if they are idle, unless allowCoreThreadTimeOut is set
I guess you will not be able to do that with a ScheduledExecutor, because it uses a DelayedWorkQueue, whereas newCachedThreadPool uses a ThreadPoolExecutor with a SynchronousQueue as its work queue.
So you cannot change the implementation of ScheduledThreadPoolExecutor to act like that.
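A different approach from the ones suggested above, offered only as a sketch: combine the two executors yourself, using a single-threaded scheduler purely as a timer that hands each job to a cached thread pool. All names here are made up for the example, and the two jobs are deliberately scheduled for the same moment to show that they can overlap.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ScheduledCachedSketch {
    /** Returns true if two jobs scheduled for the same time ran concurrently. */
    static boolean collidingTasksOverlap() {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        ExecutorService workers = Executors.newCachedThreadPool(); // threads created on demand
        CountDownLatch bothRunning = new CountDownLatch(2);
        AtomicInteger overlapped = new AtomicInteger();
        Runnable job = () -> {
            bothRunning.countDown();
            try {
                // Succeeds only if the other job is running at the same time.
                if (bothRunning.await(5, TimeUnit.SECONDS)) {
                    overlapped.incrementAndGet();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // The timer thread only hands the job over, so it is never blocked by the work.
        timer.schedule(() -> workers.execute(job), 10, TimeUnit.MILLISECONDS);
        timer.schedule(() -> workers.execute(job), 10, TimeUnit.MILLISECONDS);
        try {
            timer.shutdown();   // already scheduled one-shot tasks still fire by default
            timer.awaitTermination(5, TimeUnit.SECONDS);
            workers.shutdown();
            workers.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return overlapped.get() == 2;
    }

    public static void main(String[] args) {
        System.out.println("overlap: " + collidingTasksOverlap());
    }
}
```

For periodic jobs the same hand-off works with scheduleAtFixedRate in place of schedule. The per-task object mentioned in the question can be captured by the lambda or held in a field of the Runnable.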

java executor with pre-emptable thread queue

I'm looking for a java thread-pool, that won't run more threads simultaneously than there are cores in the system. This service is normally provided by a ThreadPoolExecutor using a BlockingQueue.
However, if a new thread is scheduled to execute, I want the new thread to pre-empt one of the already running threads, and add the pre-empted thread (in a suspended state) to a task queue, so it can be resumed as soon as the new thread is finished.
Any suggestions?
I would make a subclass of ThreadPoolExecutor.
When you set up your ThreadPoolExecutor you want to set the corePoolSize and the maximumPoolSize to Runtime.getRuntime().availableProcessors() (look at Executors.newFixedThreadPool() to see why this works).
Next you want to make sure that your queue also implements Deque. LinkedBlockingDeque is an example, but you should shop around to see which one will work best for you. A Deque allows you to get stack-like LIFO behavior, which is exactly what you want.
Since everything (submit(), invokeAll()) funnels through execute() you will want to override this method. Basically do what you described above:
Check if all threads are running. If not simply start the new runnable on an available thread. If all the threads are already running then you need to find the one running the oldest runnable, stop the runnable, re-queue the runnable somewhere (maybe at the beginning?), and then start your new runnable.
The idea of a ThreadPoolExecutor is to avoid all of the expensive actions related to creating and destroying a thread. If you absolutely insist on preempting the running tasks, then you won't get that from the default API.
If you are willing to allow the running tasks to complete and instead only preempt the tasks which have not begun execution, then you can use a BlockingQueue implementation which works like a Stack (LIFO).
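A sketch of that LIFO variant, under one assumption worth checking against your JDK: ThreadPoolExecutor enqueues tasks with offer() and workers dequeue with take() or poll(), so overriding offer() on a LinkedBlockingDeque to insert at the head yields stack order. LifoQueue and LifoPoolSketch are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class LifoPoolSketch {
    /** Deque whose offer() adds at the head, so the newest task is taken first. */
    static class LifoQueue<E> extends LinkedBlockingDeque<E> {
        @Override
        public boolean offer(E e) {
            return offerFirst(e);
        }
    }

    /** Records the order in which tasks 0..3 run on a single worker thread. */
    static List<Integer> executionOrder() {
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch gate = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LifoQueue<>());
        pool.execute(() -> {                    // task 0 occupies the only worker
            order.add(0);
            try {
                gate.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        for (int i = 1; i <= 3; i++) {
            final int id = i;
            pool.execute(() -> order.add(id));  // queued while the worker is busy
        }
        gate.countDown();                       // let task 0 finish
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(executionOrder());
    }
}
```

Only tasks that have not started are reordered; the running task is never interrupted. With a steady stream of new work, the oldest queued tasks may wait for a long time.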
You can also have tasks 'preempt' other tasks by using different executors with different thread priorities. Essentially, if the OS supports time-slicing, then the higher priority executor gets the time-slice.
Otherwise, you need a custom implementation which manages execution. You could use a SynchronousQueue and have P worker threads waiting on it. If a client calls execute and SynchronousQueue.offer fails, then you would have to create a special worker Thread which grabs one of the other Threads and flags them to halt before executing and again flags them to resume after executing.
