Why doesn't this thread pool get garbage collected? - java

In this code example, the ExecutorService is used once and allowed to go out of scope.
public static void main(String[] args)
{
    ExecutorService executorService = Executors.newFixedThreadPool(3);
    executorService.submit(new Runnable()
    {
        public void run()
        {
            System.out.println("hello");
        }
    });
}
Once executorService is out of scope, it should get collected and finalized. The finalize() method in ThreadPoolExecutor calls shutdown().
/**
 * Invokes {@code shutdown} when this executor is no longer
 * referenced and it has no threads.
 */
protected void finalize() {
    shutdown();
}
Once shutdown() is called, the pool threads should terminate and the JVM should be allowed to exit. However, executorService never gets collected, and thus the JVM stays alive. Even calls to System.gc() don't seem to work. Why isn't executorService getting collected even after main() terminates?
Note: I know I should call shutdown() myself and I always do outside of testing. I'm curious why finalization isn't working as a back-up here.

This doesn't really have anything to do with GC being non-deterministic, although it doesn't help! (That is one cause in your example, but even if we 'fixed' it to eat up memory and force a collection, it still wouldn't finalize)
The Worker threads that the executor creates are inner classes that have a reference back to the executor itself. (They need it to be able to see the queue, runstate, etc!) Running threads are not garbage collected, so with each Thread in the pool having that reference, they will keep the executor alive until all threads are dead. If you don't manually do something to stop the threads, they will keep running forever and your JVM will never shut down.
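As a rough sketch of the "manually do something to stop the threads" part, the usual pattern is to call shutdown() in a finally block and optionally wait for termination. The class and method names here (ShutdownDemo, runAndShutDown) and the 5-second timeout are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ShutdownDemo {

    // Runs one task on a fixed pool, then shuts the pool down explicitly.
    // Returns true if all worker threads terminated within the timeout.
    static boolean runAndShutDown() {
        ExecutorService executorService = Executors.newFixedThreadPool(3);
        try {
            executorService.submit(() -> System.out.println("hello"));
        } finally {
            // Without this call the idle worker threads stay alive and,
            // being non-daemon threads, keep the JVM from exiting.
            executorService.shutdown();
        }
        try {
            // Already submitted tasks still run; we just wait for them.
            return executorService.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("terminated: " + runAndShutDown());
    }
}
```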

Affe is correct; the thread pool's threads will keep it from being garbage collected. When you call Executors.newFixedThreadPool(3) you get a ThreadPoolExecutor constructed like so:
ThreadPoolExecutor(3, 3, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
And if you read the JavaDoc for ThreadPoolExecutor it says:
A pool that is no longer referenced in a program AND has no remaining
threads will be shutdown automatically. If you would like to ensure
that unreferenced pools are reclaimed even if users forget to call
shutdown(), then you must arrange that unused threads eventually die,
by setting appropriate keep-alive times, using a lower bound of zero
core threads and/or setting allowCoreThreadTimeOut(boolean).
If you want your thread pool to finalize like you're expecting, you should do one of those things.
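One way to "arrange that unused threads eventually die" is allowCoreThreadTimeOut(true) combined with a non-zero keep-alive time. The sketch below is only an illustration: the class name, the 100 ms keep-alive and the polling loop are assumptions, not part of the question.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CoreTimeoutDemo {

    // Builds a 3-thread pool whose idle core threads are allowed to die,
    // runs one task, and returns the pool size seen once the idle worker
    // has had time to expire (or after a 5 second safety limit).
    static int poolSizeAfterIdle() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                3, 3, 100L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true); // requires a keep-alive time > 0
        pool.execute(() -> System.out.println("hello"));

        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return pool.getPoolSize();
    }

    public static void main(String[] args) {
        // No shutdown() call: once the worker has timed out, no pool thread
        // is left to keep the JVM alive.
        System.out.println("threads left in pool: " + poolSizeAfterIdle());
    }
}
```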

If you want the threads to be finalized when the Executor service is out of scope you should avoid, as mjt suggested, the use of
ExecutorService executorService = Executors.newFixedThreadPool(3);
and use for example:
ExecutorService executorService = new ThreadPoolExecutor(0, 3, 10, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
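One caveat worth checking before copying that constructor call: per the ThreadPoolExecutor documentation, threads beyond the core size are only created when the queue is full, and an unbounded LinkedBlockingQueue never fills up. So with a core size of 0 the pool should end up running tasks on a single thread, not three. The sketch below (class and method names are hypothetical) measures this with getLargestPoolSize().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ZeroCoreDemo {

    // Submits several blocking tasks to a pool configured as above and
    // reports the largest number of threads the pool ever had at once.
    static int largestPoolSize() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, 3, 10, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 6; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep the worker busy
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int largest = pool.getLargestPoolSize();
        release.countDown(); // let the queued tasks finish
        pool.shutdown();
        return largest;
    }

    public static void main(String[] args) {
        System.out.println("largest pool size: " + largestPoolSize());
    }
}
```

If real parallelism is needed, keeping a core size of 3 and calling allowCoreThreadTimeOut(true) is probably the closer equivalent of newFixedThreadPool(3).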

Because garbage collection is non-deterministic (that is, you cannot predict when it will happen), you cannot predict when the finalize() method will run. You can only make objects eligible for GC and suggest a collection with System.gc(), without any guarantee.
Even worse, threads are handled by the JVM in an OS-specific way and are hard to predict.

Finalizers are too unpredictable. Depending on them is usually bad practice.
You can read more about it in "Effective Java" by Joshua Bloch (item 1.7).

Once executorService is out of scope, it should get collected and finalized.
Not really - once it is out of scope, it could get collected and finalized. There are no guarantees made in the VM spec about when objects are finalized, or even if they are finalized:
The Java programming language does not specify how soon a finalizer will be invoked, except to say that it will happen before the storage for the object is reused.

Related

What is the method name or mechanism to override "when thread is released" behavior

I would like to override behaviour so that ExecutorService calls custom method. When a thread is released I would like to clear all ThreadLocal variables. Not very familiar with api or maybe there is something which exists there already.
I'm not sure how a thread pool manages threads once they have finished their job, but I assume it does not destroy them, as that would be expensive. If it does not destroy them, then based on the ThreadLocal description:
Each thread holds an implicit reference to its copy of a thread-local
variable as long as the thread is alive and the {@code ThreadLocal}
instance is accessible; after a thread goes away, all of its copies of
thread-local instances are subject to garbage collection (unless other
references to these copies exist).
I need to clear up ThreadLocal
For an ExecutorService you could make a self cleaning task.
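public class CleanerTask implements Runnable {
    // Disposable is assumed to extend Runnable and add dispose()
    private final Disposable realRunnable;

    public CleanerTask(Disposable d) {
        realRunnable = d;
    }

    public void run() {
        // both calls happen on the same worker thread
        realRunnable.run();
        realRunnable.dispose();
    }
}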
In this example Disposable is an interface extending Runnable and providing a dispose() method that cleans the ThreadLocal variables. The implementation guarantees that run() and dispose() are run in the same thread, so the variables can safely be cleared.
Then you just need to make sure you wrap your tasks in a CleanerTask before submitting them to your executor.
However if you're not tied to ExecutorService you can extend ThreadPoolExecutor which provides an afterExecute method. Then you just call dispose() there (after checking that the Runnable is of the correct type).
(I first thought afterExecute wasn't run in the thread that ran the task, but luckily I thought wrong.)
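A minimal sketch of the afterExecute approach, assuming a single static ThreadLocal named CONTEXT (hypothetical, as are the class names). Instead of checking the Runnable type and calling dispose(), it simply removes the thread-local after every task; note that with submit() the Runnable passed to afterExecute is a wrapper (a FutureTask), not the original task.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CleaningPoolDemo {

    // Hypothetical thread-local used only for this illustration.
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // A pool that clears CONTEXT on the worker thread after every task.
    static class CleaningPool extends ThreadPoolExecutor {
        CleaningPool(int threads) {
            super(threads, threads, 0L, TimeUnit.MILLISECONDS,
                    new LinkedBlockingQueue<Runnable>());
        }

        @Override
        protected void afterExecute(Runnable r, Throwable t) {
            super.afterExecute(r, t);
            CONTEXT.remove(); // runs on the thread that executed the task
        }
    }

    // Returns what a second task sees in CONTEXT after a first task set it.
    static String valueSeenBySecondTask() {
        CleaningPool pool = new CleaningPool(1); // one thread, so it is reused
        try {
            Runnable first = () -> CONTEXT.set("left over from task 1");
            pool.submit(first).get();
            Callable<String> second = () -> CONTEXT.get();
            return pool.submit(second).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("second task sees: " + valueSeenBySecondTask());
    }
}
```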
Not sure if you are thinking about threads and thread pools in the right way. Threads are started with start() and when their execution is finished, they are destroyed. How and when the threads are created depends on your executor service implementation. You might have an executor service that just runs tasks in the current thread. A pooled executor service might start its threads with an infinite loop waiting for submitted tasks. However, even thread pools are usually flexible: the pool keeps only a limited number of waiting threads and, if there are more, lets them die (breaks out of the infinite wait loop). Also, if the execution throws an exception, the thread is usually discarded as well.
Having thread-locals survive a single execution is not a good practice. You should clean up your thread-locals after every execution. Do not wait for thread disposal / destruction.
TL;DR Do not try to hack into "thread destruction", but rather start every execution with try/finally to set-up and clean your thread locals.
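The try/finally advice could look roughly like this. USER, runAs and the class name are hypothetical names chosen for the example; a single-thread executor is used so that the second task is guaranteed to run on the same, reused thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class TryFinallyThreadLocalDemo {

    // Hypothetical thread-local used only for this illustration.
    static final ThreadLocal<String> USER = new ThreadLocal<>();

    // Sets the thread-local for the duration of one unit of work and
    // always removes it, even if the work throws.
    static String runAs(String user) {
        USER.set(user);
        try {
            return "working as " + USER.get();
        } finally {
            USER.remove();
        }
    }

    // Runs runAs() on a pooled thread, then checks from a second task on
    // the same thread whether anything was left behind.
    static String leftoverAfterTask() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> runAs("alice");
            pool.submit(task).get();
            Callable<String> probe = () -> USER.get();
            return pool.submit(probe).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAs("alice"));
        System.out.println("left over: " + leftoverAfterTask());
    }
}
```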
Threads get reused by an ExecutorService that implements a thread pool, so ThreadLocal entries stay with the thread across tasks unless removed. If you know that a task's ThreadLocal value is irrelevant once the task is done, you can clean it up like Kayaman says.
But the point of Threadlocals is that they are available across different components, for cases where the different components can't manage its scope. For instance a web application could put something in a threadlocal in a filter on the way in with a HTTP request, have it available to web controllers and services, etc., over the course of the request, and clean up the threadlocal in the filter on the way back out. So in this example the scope of the threadlocal value is managed by the filter in order to be available to everything participating in the request for that thread, where in a "normal" (meaning not some async non-blocking setup like Play) web application the request is handled by one thread in the application server.
If it's that straightforward for you to identify the scope where the ThreadLocal value isn't needed anymore that the task can clean it up, then it sounds like your code is using ThreadLocals unnecessarily. I'd suggest removing these ThreadLocals and using local variables within the task instead. A ThreadLocal shouldn't be used as an easy alternative to argument-passing.

Why does Java ThreadPoolExecutor override finalize()

I'd like to know why the ThreadPoolExecutor finalize() method invokes its shutdown() method when it is known that the finalize method only gets invoked by the GC AFTER all of its threads have been stopped. So why does ThreadPoolExecutor override finalize() at all?
It seems misleading to me (and was a source of a thread leak in my project) that ThreadPoolExecutor.finalize() invokes shutdown() as this gives the strong (but false) impression that
- ThreadPoolExecutor manages the lifecycle of its threads and will stop the threads when the GC collects the ThreadPoolExecutor object
- it is only necessary to invoke shutdown() or shutdownNow() if you want deterministic results as opposed to relying on the GC to tidy up (obviously, poor practice to do this!)
Notes
In this thread, why-doesnt-this-thread-pool-get-garbage-collected Affe explains why it is still necessary for the client to invoke shutdown()
In this thread, why-threadpoolexecutor-finalize-invokes-shutdown-and-not-shutdownnow the originator is puzzled by this topic but the answers aren't as comprehensive as in 1
The JavaDocs for ThreadPoolExecutor.finalize() do include the words "and it has no threads" but this is easily overlooked.
First, if you think this is a bug, then report it to Oracle.
Second, we cannot definitely answer your question, because you are essentially asking "why did they design it this way" ... and we weren't there when they made the decision.
But I suspect the reason is that the current choice was deemed to be the lesser of two evils:
On the one hand, if a thread pool could be shut down merely because it was no longer directly referenced, then threads that are doing real work could be terminated prematurely.
On the other hand, as you observed, a thread pool that doesn't get automatically shut down on becoming no longer directly reachable could be a memory leak.
Given that there are clearly ways to avoid the storage leak, I think that the 2nd alternative (i.e. the current behaviour) is the lesser of the evils.
Anyway, there is a clear evidence that this behaviour was considered by the designers; i.e. this quotation from the ThreadPoolExecutor javadoc:
Finalization
A pool that is no longer referenced in a program AND has no remaining threads will be shutdown automatically. If you would like to ensure that unreferenced pools are reclaimed even if users forget to call shutdown(), then you must arrange that unused threads eventually die, by setting appropriate keep-alive times, using a lower bound of zero core threads and/or setting allowCoreThreadTimeOut(boolean).
(And that sort of answers @fge's comment. It happens when inactive worker threads are configured to time out.)
I also think there is an implementation-based reason as well. Take a look at the code here.
Each thread in the thread pool has a reference to a Runnable, which is actually an instance of the inner class ThreadPoolExecutor.Worker. This means that there is a strong reference path from the (live, though probably idle) Thread to the ThreadPoolExecutor.Worker object, and from that object to the enclosing ThreadPoolExecutor instance. And since live threads are always reachable, a ThreadPoolExecutor instance remains reachable until all of its threads actually terminate.
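The reference path described here can be observed with a WeakReference: it is only cleared once its referent is no longer strongly reachable, so if it still returns the executor after our own reference is dropped and a GC is requested, something else (the live worker thread) must be holding on to it. This is only a sketch; the class and method names are made up.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ReachabilityDemo {

    // Returns true if the executor is still reachable after the only
    // local reference to it has been dropped and a GC has been requested.
    static boolean stillReachableAfterGc() {
        ExecutorService executorService = Executors.newFixedThreadPool(3);
        try {
            // Running one task makes the pool start a worker thread.
            executorService.submit(() -> System.out.println("hello")).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }

        WeakReference<ExecutorService> ref = new WeakReference<>(executorService);
        executorService = null; // drop our own strong reference
        System.gc();            // only a hint; it cannot reclaim a reachable object

        ExecutorService survivor = ref.get();
        if (survivor == null) {
            return false;
        }
        survivor.shutdown(); // let the worker exit so the JVM can terminate
        return true;
    }

    public static void main(String[] args) {
        System.out.println("still reachable: " + stillReachableAfterGc());
    }
}
```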
Now I/we can't tell you which came first, the behaviour or the javadoc specification. See my "Secondly ..." point above ...
But, like I said above ("Firstly ..."), if you think this is a bug, report it to Oracle. And the same goes if you think that the javadoc is misleading ... though I think your argument there is weak given the material that I quoted.
As to why they overrode finalize() to call shutdown() if shutdown() does nothing in this circumstance:
it may have done something significant in earlier versions, and
the shutdown() method ends by calling a hook method that is apparently overridden in subclasses to do something significant ... according to the comments.

Do threads get automatically garbage collected after run() method exits in Java?

Does a thread self-delete and get garbage collected after it runs or does it continue to exist and consume memory even after the run() method is complete?
For example:
class A {
    public void somemethod()
    {
        while(true)
            new ThreadClass().start();
    }

    public class ThreadClass extends Thread {
        public ThreadClass()
        {}

        @Override
        public void run() {......}
    }
}
I want to clarify whether this thread will be automatically removed from memory, or does it need to be done explicitly.
This will happen automatically i.e. memory will be released automatically once the thread is done with its run method.
Threads only exist until the end of their run method, after that they are made eligible for garbage collection.
If you require a solution where memory is at a premium, you might want to consider an ExecutorService. This will handle the threads for you and allow you to concentrate on the logic rather than handling the threads and the memory.
Threads are automagically garbage collected on completion of the run method, hence you do not have to do it explicitly.
Threads will be garbage collected after their run method has completed. The notable exception to this is when you are using the android debugger. The android debugger will prevent garbage collection on objects that it is aware of, which includes threads that have finished running.
Why do threads leak on Android?

Java "unstopped " executed/finished threads

I've got a question about threads. When I do something like this:
new Thread(new Runnable() {
    @Override
    public void run() {
        // something to do
    }
}).start();
What happens when all the code in run() has executed? Does the system automatically delete the thread, or does the thread remain in memory?
thx & regards
When a thread finishes its run() method, it enters the 'dead' state. Then the next thread in your stack runs after.
Dead state :
"A thread is considered dead when its run() method completes. It may
still be a viable Thread object, but it is no longer a separate thread
of execution. Once a thread is dead, it can never be brought back to
life! (The whole "I see dead threads" thing.) If you invoke start() on
a dead Thread instance, you'll get a runtime (not compiler) exception.
And it probably doesn't take a rocket scientist to tell you that if a
thread is dead, it is no longer considered to be alive."
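The "runtime (not compiler) exception" mentioned in the quote is IllegalThreadStateException. A small sketch (the class and method names are made up):

```java
class DeadThreadDemo {

    // Starts a thread, waits for it to finish, then tries to start it
    // again. Returns the simple name of the exception thrown by the
    // second start(), or "no exception" if none was thrown.
    static String restartDeadThread() {
        Thread t = new Thread(() -> System.out.println("running once"));
        t.start();
        try {
            t.join(); // after this returns, the thread is dead
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        try {
            t.start(); // a dead thread cannot be restarted
            return "no exception";
        } catch (IllegalThreadStateException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("second start(): " + restartDeadThread());
    }
}
```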
Java's Threading Model is a little bit more complicated than that.
Basically, a java.lang.Thread is just a wrapper around some data, not a process by itself. When you call the .start() method, a native thread is created and linked to your java Thread. This work is done by the JVM using internal data structures (JavaThread and OSThread).
Once the .run() method finishes, the JVM performs many operations to delete the native thread that was used. Therefore, you won't see this thread anymore in your process list (using top or ps, for example).
However, the objects that were allocated in the heap and the java.lang.Thread instance itself stay in memory until a GC cycle collects them.
So, to sum up :
Yes, the JVM deletes the native thread that was used
No, the JVM does not delete the java.lang.Thread instance that was used
The GC will eventually collect this instance
For more information, you should read the book "Java Performance" by Charlie Hunt. It contains lots of information on this topic (and many others).
Hope that helps !
When the code in a thread finishes executing, the thread is stopped.
The Thread instance will still exist until it gets GC'd, but the actual system thread will no longer exist.
If you don't use any custom-configured thread pool mechanism, your thread will die and the Thread object itself will be eligible for garbage collection.

is there any way to confirm if the thread is killed at the end of execution?

Is there any way to confirm that a thread is killed at the end of its execution? If the garbage collector takes a long time to destroy threads even when they are eligible for GC, out-of-memory errors may arise. To avoid that kind of issue, it would be good to know whether the threads have been destroyed.
As of now, my understanding is that at the end of run method , the thread gets killed and we need not do anything explicitly to kill the thread instance.
Thanks in advance!
class A
{
public static void main()
{
Thread t = new Thread(new TestA());
t.start();
Thread t1 = new Thread(new TestB());
t1.start();
Thread t2 = new Thread(new TestC());
t2.start();
}
}
class TestA implements Runnable {
Thread t;
public void run() {
for(...){
try{
}catch()
{
....
}
}
}
}
You are absolutely right that "at the end of run method, the thread gets killed and we need not do anything explicitly to kill the thread instance". Simply letting the thread leave its run() method normally is enough.
If you want to make sure that a particular thread has terminated, then Thread.isAlive() will check, and Thread.join() will wait until it happens. If you have a particular set of threads that you're worried about, then keep a reference to them somewhere, and check up on them using these methods.
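For illustration, here is roughly how isAlive() and join() behave around a thread's lifetime. The latch is only there to hold the thread inside run() while the first check is made; the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

class ThreadLifecycleDemo {

    // Returns { isAlive() while the thread is running, isAlive() after join() }.
    static boolean[] aliveBeforeAndAfterJoin() {
        CountDownLatch release = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                release.await(); // stay inside run() until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        boolean aliveWhileRunning = t.isAlive(); // started and not yet finished

        release.countDown(); // let run() return
        try {
            t.join();        // blocks until the thread has terminated
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        boolean aliveAfterJoin = t.isAlive();
        return new boolean[] { aliveWhileRunning, aliveAfterJoin };
    }

    public static void main(String[] args) {
        boolean[] result = aliveBeforeAndAfterJoin();
        System.out.println("alive while running: " + result[0]);
        System.out.println("alive after join:    " + result[1]);
    }
}
```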
Thread.getAllStackTraces()
gets you a current map of threads/stacktraces. However I would normally expect the JVM to clear up the threads upon exit from run(). Obviously if you're using some sort of thread pooling then that's not the case.
You can use a tool like VisualVM to monitor thread states.
Tools of this kind let you profile your application visually.
To check the state of a thread , you can call the getState() method on a thread object to see the state of the thread.
The javadoc of OutOfMemoryError says:
Thrown when the Java Virtual Machine cannot allocate an object because
it is out of memory, and no more memory could be made available by the
garbage collector.
So, if a thread is not running anymore and is eligible for GC, the GC will try to collect it before throwing an OOM, like with any other object.
is there any way to confirm if the thread is killed at the end of execution?
There's no sense confirming something you know to be true. Whenever the JVM process dies, all its threads are automatically killed by the operating system. Any other behavior is a bug in the OS.
If the garbage collector takes long time to destroy the threads even when they are available for GC, out of memory exceptions may arise.
The garbage collector doesn't kill threads - the JVM wraps operating-system-specific thread libraries into a consistent Java-language thread abstraction, so those thread libraries determine when a thread dies.
my understanding is that at the end of run method, the thread gets killed and we need not do anything explicitly to kill the thread instance.
That is correct.
If you look up in the javadoc for the Thread class you will see many methods that might help you check what you want, for example:
activeCount() : Returns the number of active threads in the current thread's thread group.
You can use this as a debug method.
isAlive() : Tests if this thread is alive.
To check if a specific thread is alive.
join() : Waits for this thread to die.
If you call this at the end of your method then it will wait for the thread to join (i.e. to end execution) before advancing. If you call for all threads, then you are sure that all have finished when the main() has finished.
destroy() : Destroys this thread, without any cleanup.
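Despite the description, this method was never actually implemented (it just threw NoSuchMethodError) and it has since been removed from the JDK, so I would never suggest it.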
Hope it helps!
