I'd like to know why the ThreadPoolExecutor finalize() method invokes its shutdown() method when it is known that the finalize method only gets invoked by the GC AFTER all of its threads have been stopped. So why does ThreadPoolExecutor override finalize() at all?
It seems misleading to me (and it was the source of a thread leak in my project) that ThreadPoolExecutor.finalize() invokes shutdown(), as this gives the strong (but false) impression that
- ThreadPoolExecutor manages the lifecycle of its threads and will stop the threads when the GC collects the ThreadPoolExecutor object
- it is only necessary to invoke shutdown() or shutdownNow() if you want deterministic results as opposed to relying on the GC to tidy up (obviously, poor practice to do this!)
Notes
In this thread, why-doesnt-this-thread-pool-get-garbage-collected Affe explains why it is still necessary for the client to invoke shutdown()
In this thread, why-threadpoolexecutor-finalize-invokes-shutdown-and-not-shutdownnow the originator is puzzled by this topic but the answers aren't as comprehensive as in 1
The JavaDocs for ThreadPoolExecutor.finalize() do include the words "and it has no threads" but this is easily overlooked.
First, if you think this is a bug, then report it to Oracle.
Second, we cannot definitely answer your question, because you are essentially asking "why did they design it this way" ... and we weren't there when they made the decision.
But I suspect the reason is that the current choice was deemed to be the lesser of two evils:
On the one hand, if a thread pool could be shut down merely because it was no longer directly referenced, then threads that are doing real work could be terminated prematurely.
On the other hand, as you observed, a thread pool that doesn't get automatically shut down on becoming no longer directly reachable could be a memory leak.
Given that there are clearly ways to avoid the storage leak, I think that the 2nd alternative (i.e. the current behaviour) is the lesser of the evils.
Anyway, there is a clear evidence that this behaviour was considered by the designers; i.e. this quotation from the ThreadPoolExecutor javadoc:
Finalization
A pool that is no longer referenced in a program AND has no remaining threads will be shutdown automatically. If you would like to ensure that unreferenced pools are reclaimed even if users forget to call shutdown(), then you must arrange that unused threads eventually die, by setting appropriate keep-alive times, using a lower bound of zero core threads and/or setting allowCoreThreadTimeOut(boolean).
(And that sort of answers @fge's comment. It happens when inactive worker threads are configured to time out.)
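The configuration that the quoted javadoc describes can be sketched as follows. This is a minimal illustration, not anything taken from the JDK sources: the class name SelfReapingPool, the 50 ms keep-alive and the 5 second polling limit are assumptions chosen for the example. It builds a pool whose core threads are allowed to time out, runs one task, and then waits for the idle worker to exit.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, named for this example only.
final class SelfReapingPool {

    /** Builds a pool whose idle threads, core threads included, exit after keepAliveMillis. */
    static ThreadPoolExecutor create(int threads, long keepAliveMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads,
                keepAliveMillis, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());
        // Requires a keep-alive time greater than zero.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    /** Runs one task, waits for the idle worker to time out, and returns the final pool size. */
    static int poolSizeAfterIdle() {
        ThreadPoolExecutor pool = create(1, 50);
        try {
            pool.submit(() -> { }).get();   // starts the single worker thread
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return pool.getPoolSize();
    }
}
```

Once the worker has timed out, no live thread refers to the pool any more, which is the precondition the javadoc gives for an unreferenced pool to be reclaimed. Calling shutdown() explicitly is still the more predictable option.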
I also think there is an implementation-based reason as well. Take a look at the code here.
Each thread in the thread pool has a reference to a Runnable, which is actually an instance of the inner class ThreadPoolExecutor.Worker. This means that there is a strong reference path from the (live, though probably idle) Thread to the ThreadPoolExecutor.Worker object, and from that object to the enclosing ThreadPoolExecutor instance. And since live threads are always reachable, a ThreadPoolExecutor instance remains reachable until all of its threads actually terminate.
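The reference path described above can be observed with a WeakReference. The sketch below is only an illustration under stated assumptions: the class name ReachabilityDemo is made up, and System.gc() is merely a hint, so the example shows that the pool is not collected while its worker is alive, not that it would be collected otherwise. It uses Executors.newFixedThreadPool, whose core threads do not time out by default.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class, named for this example only.
final class ReachabilityDemo {

    /** Returns true if the pool is still reachable after the caller drops its only reference. */
    static boolean poolSurvivesGc() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            pool.submit(() -> { }).get();   // forces the single worker thread to start
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }

        WeakReference<ExecutorService> weak = new WeakReference<>(pool);
        pool = null;    // drop our only strong reference
        System.gc();    // only a hint to the JVM

        // The idle worker thread still references the pool, so the weak reference is not cleared.
        ExecutorService stillThere = weak.get();
        if (stillThere != null) {
            stillThere.shutdown();   // let the worker exit so the JVM can terminate
        }
        return stillThere != null;
    }
}
```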
Now I/we can't tell you which came first, the behaviour or the javadoc specification. See my "Secondly ..." point above ...
But, like I said above ("Firstly ..."), if you think this is a bug, report it to Oracle. And the same goes if you think that the javadoc is misleading ... though I think your argument there is weak given the material that I quoted.
As to why they overrode finalize() to call shutdown() if shutdown() does nothing in this circumstance:
it may have done something significant in earlier versions, and
the shutdown() method ends by calling a hook method that is apparently overridden in subclasses to do something significant ... according to the comments.
Related
This question is NOT about alternatives to Thread.suspend.
This is about the possibility to implement a bias lock with Thread.suspend, which (I believe) can't be implemented with Thread.interrupt or similar alternatives.
I know Thread.suspend is deprecated.
But I want to know the precise semantics of Thread.suspend.
If I call thread1.suspend(), am I guaranteed to be blocked until thread1 is fully stopped? If I call thread1.resume(), can this call be visible to other threads out of order?
More over, if I successfully suspend a thread, will this thread be suspended at a somewhat safe point? Will I see its intermediate state (because Java forbids out of thin air value even in not properly synchronized program, I don't believe this is allowed) or see something out of order (if suspend is an asynchronous request, then sure I will see that kind of thing)?
I want to know these because I want to implement some toy asymmetric lock within Java (like BiasedLock in HotSpot). Using Thread.suspend you can implement a Dekker like lock without store load barrier (and shift the burden to the rare path). My experimentation shows it works, but since a Thread.sleep is enough to wait for a remote context switch, I am not sure this is guaranteed behavior.
By the way, is there any other way to force (or detect) a remote barrier? For example, I searched the web and found that others use FlushProcessWriteBuffers or change affinity to bind a thread to each core. Can these tricks be done within Java?
EDIT
I came up with an idea. Maybe I can use the GC and a finalizer to implement the biased lock, at least if there are only two threads. Unfortunately the slow path may require an explicit gc() call, which isn't really practical.
If the GC is not precise, I may end up with a deadlock. If the GC is too smart and collects my object before I nullify the reference (maybe the compiler is allowed to reuse stack variables, but is the compiler allowed to do this kind of thing for heap variables, ignoring acquire fences and load fences?), I end up with corrupted data.
EDIT
It seems a so-called "reachability fence" is needed to prevent the optimizer from moving an object's last reference upward. Unfortunately, it's nowhere to be found.
Its semantics consist entirely of what is specified in the Javadoc:
Suspends this thread.
First, the checkAccess method of this thread is called with no arguments. This may result in throwing a SecurityException (in the current thread).
If the thread is alive, it is suspended and makes no further progress unless and until it is resumed.
But as you're not going to use it, because it's deprecated, this is all irrelevant.
I am actually looking for an easier way to kill a thread no matter where the thread is running. But most of the solutions on the internet point me to using a boolean flag to control the execution of the thread: if I want to stop the thread, I set the boolean variable to false.
But what if the task in the runnable is a LONG linear task, meaning the task is not repeating? In that case, it is not so easy to create a 'while' loop to cover the whole block of the task.
It is really tempting to use Thread.stop, but the "Deprecated" warning makes it seem quite dangerous to use. I have read through this article
Why Are Thread.stop, Thread.suspend, Thread.resume and Runtime.runFinalizersOnExit Deprecated?
but I can't understand
If any of the objects previously protected by these monitors were in
an inconsistent state, other threads may now view these objects in an
inconsistent state. Such objects are said to be damaged.
What does the "inconsistent state" mean? I appreciate if anyone can explain about this.
I want to extend my question to a more lower level of view, let say i = i + 1; in JVM (perhaps assembly language alike), maybe this Java statement will be split into few smaller instructions, for example like move i ; add i ; get i into memory 0x0101 (This is an example! I totally don't know assembly language!)
Now, if we call thread.stop, where will it actually stop? Will the thread stop after a COMPLETED Java statement, or could it stop in the middle of the "assembly language"? If the answer is the second, could that be the reason we say
Such objects are said to be damaged.
?
Ok, my question is kind of confused, hope someone can understand and explain. Thanks in advance.
"Damaged object" is a high-level concept, it doesn't happen at the JVM level. A programmer designs his class with thread safety in mind by guarding critical sections with locks. It is an invariant of his class that each critical section either runs in full, or doesn't run at all. When you stop a thread, a critical section may have been interrupted in the middle, so disrupting the invariant. At that moment the object is damaged.
Stopping a thread conceals many more dangers, like no cleanup performed, no acquired resources released, etc. If a thread doesn't give up what it is doing, there is no way to make it stop without compromising the entire application.
In practice, whenever one faces the need to run alien code that may need to be forcefully aborted, this must be done in a separate process because killing a process at least performs OS-level cleanup and does a much better job of containing the damage.
The "inconsistent state" means the state of the data that your application cares about, state that your application logic has carefully produced by making your application thread-safe with locks/monitors etc.
Imagine you have this simple method:
public synchronized void doSomething()
{
    count++;
    average = count/total;
}
This method, along with the other methods, is synchronized, as multiple threads are using this object.
Perhaps there's a
public synchronized AverageAndCount getMeasurement()
{
    return new AverageAndCount(average, count);
}
This assures that a thread can't read an incomplete measurement, i.e. if the current measurement is in the process of being calculated inside e.g. doSomething(), getMeasurement() will block/wait until that's finished.
Now, imagine the doSomething is run in a thread, and you call .stop() on that thread.
So the thread might be stopped right after it performs count++;. The monitor that's held is unlocked and the method terminates, and average = count/total; is never executed.
That means the data is now inconsistent. Anyone calling getMeasurement() afterwards will now get inconsistent data.
Note also that at this point it is not very relevant whether this happens at a java statement level, or at a lower level, the data can be in an inconsistent state that you can't reason about in any case.
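The scenario above can be reproduced without calling the deprecated Thread.stop(). The sketch below is an illustration under assumptions: the class name Measurement, the fixed total of 4 and the abortMidway parameter are invented for the example, and an ordinary exception thrown in the middle of the synchronized method stands in for the asynchronous ThreadDeath. In both cases the method unwinds, the monitor is released, and the second update never runs.

```java
// Hypothetical class, modelled loosely on the doSomething()/getMeasurement() example above.
final class Measurement {
    private int count = 0;
    private final int total = 4;      // assumed constant, for illustration only
    private double average = 0.0;

    /** Invariant while the monitor is free: average == count / total. */
    synchronized void doSomething(boolean abortMidway) {
        count++;
        if (abortMidway) {
            // Stand-in for an abrupt stop: unwinds the method and releases the monitor
            // after the first update but before the second.
            throw new IllegalStateException("simulated abrupt stop");
        }
        average = (double) count / total;
    }

    /** Checks whether the invariant still holds. */
    synchronized boolean isConsistent() {
        return average == (double) count / total;
    }
}
```

After one normal call and one aborted call, count has been incremented twice but average still reflects the first call, so any other thread that now acquires the monitor sees a damaged object.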
I'm no expert but this is what I think.
If you use Thread.stop() you cause the ThreadDeath exception that will cause all monitors to be released.
Since you provoke an exception you are applying an unnatural behaviour to the state of things.
Other threads relying on those monitors could enter in an inconsistent situation because they were not expecting it. And I don't think you can even anticipate the monitors releasing order.
I believe the concern is that the thread may be in the middle of a synchronized block performing multi-step updates to an object's members. If the thread is stopped abruptly, then some updates will have occurred but not others, and now the object's state may render it unusable.
I have my doubts that the ThreadDeath handling will release a Lock backed by the AbstractQueuedSynchronizer which could leave the application on the path to a sort of deadlock.
At any logical point in your long sequence of code you can simply add:
if (Thread.interrupted()) {
    throw new InterruptedException();
}
...this will exit execution at this point if it is determined that Thread.interrupt() was called on the Thread executing the long running task.
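Applied to a long, non-repeating task, the check might be placed between the major steps, roughly as sketched here. The class name LinearTask and the step names "load", "transform" and "store" are placeholders invented for the example; real steps would do actual work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical task, named for this example only.
final class LinearTask implements Runnable {
    private final List<String> completed = new ArrayList<>();
    private boolean cancelled;

    /** Throws if an interrupt has been requested; Thread.interrupted() also clears the flag. */
    private static void checkpoint() throws InterruptedException {
        if (Thread.interrupted()) {
            throw new InterruptedException();
        }
    }

    @Override
    public void run() {
        try {
            completed.add("load");        // placeholder for a long step
            checkpoint();
            completed.add("transform");   // placeholder for a long step
            checkpoint();
            completed.add("store");       // placeholder for a long step
        } catch (InterruptedException e) {
            cancelled = true;             // unwind through normal exception handling
        }
    }

    List<String> completedSteps() { return completed; }
    boolean wasCancelled() { return cancelled; }
}
```

Because the exit happens only at a checkpoint, each step either runs completely or not at all, which is what Thread.stop() cannot promise. How finely to place the checkpoints is a trade-off between responsiveness and clutter.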
There is no clean way to force a thread to stop; the stop() method is deprecated. A thread stops when its run() method completes or when an exception occurs. Otherwise, control it with a boolean flag variable that is "false" by default.
I am currently using JMX to manage and monitor a huge migration process which is executed within a Java class.
I would like to be able to abort and kill the process when I needed, e.g. customer/time required, or some dead loop happens within a single migration.
Here, by abort we mean a graceful way to kill a thread by setting a boolean flag; every iteration of the loop checks the flag first and then decides whether to proceed or not. This has been implemented without any issue.
However, I am having trouble with killing the thread. My colleague suggested that I override the finalize() method and try to kill it from there. However, what I have found online is that this method will not be able to destroy the object, and that it is meant to be called by the GC rather than by the user.
I guess the theory is OK: as long as the object is destroyed, no more processing can happen. I am just not sure whether this can be implemented in Java or not.
Also, I would like to know whether there are any other approaches you can suggest.
I would very much appreciate your help.
P.S.: mentioning JMX doesn't mean this really has to do with JMX; it is just that I would like the kill command to come from the JMX console client.
It's a bit hard to understand what you are saying, but I don't think that finalize is going to be any help.
A live thread (i.e. one that has been started and has not yet terminated) is reachable by definition, and therefore won't be garbage collected. So adding a finalize method to it won't have any effect.
If the object you are talking about is not the thread, adding a finalize probably won't help either:
If the thread's runnable (or whatever) has a reference to the object, that will stop it from being garbage collected.
If it doesn't, and the object does become unreachable, the finalize method won't run until after the GC has decided to collect the object ... and that may never happen.
Even if the finalize method did get called, what could it do? You've already told the thread to shut down ... and nothing has happened.
The real problem here seems to be that the thread is not responding to your "graceful shutdown" flag.
I'd try to fix this by using Thread.interrupt() and Thread.isInterrupted() rather than a custom flag. This has the advantage that an interrupt will also unblock things like Thread.sleep, Object.wait and certain I/O operations.
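A worker written around the interrupt flag could look something like the sketch below. It is a minimal illustration: the class name InterruptibleWorker and the one second sleep, which stands in for a unit of work or a blocking wait, are assumptions made for the example.

```java
// Hypothetical worker, named for this example only.
final class InterruptibleWorker implements Runnable {
    private volatile boolean exitedCleanly;

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Thread.sleep(1_000);   // stands in for work or a blocking wait
            }
        } catch (InterruptedException e) {
            // sleep() was cut short by interrupt(); fall through and finish
        }
        exitedCleanly = true;
    }

    boolean exitedCleanly() { return exitedCleanly; }

    /** Starts a worker, interrupts it, and reports whether it stopped within the timeout. */
    static boolean stopsWhenInterrupted(long timeoutMillis) {
        InterruptibleWorker worker = new InterruptibleWorker();
        Thread t = new Thread(worker, "worker");
        t.start();
        t.interrupt();
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive() && worker.exitedCleanly();
    }
}
```

Unlike a custom boolean flag, the interrupt reaches the worker even while it is inside sleep(), so it does not have to wait out the full second before noticing the request.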
If the thread is blocked trying to talk to some external service via a socket or pipe, you could unblock it by closing the socket and/or stream. This of course assumes that your shutdown code can get its hands on the reference to the Socket or Stream object.
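As a rough sketch of the close-to-unblock idea, the example below parks a thread in ServerSocket.accept() and then closes the socket from another thread. The class name CloseToUnblock, the loopback address, the ephemeral port and the 100 ms pause are assumptions for the illustration; the same approach applies to a thread blocked on a connected Socket's streams.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.SocketException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class, named for this example only.
final class CloseToUnblock {

    /** Blocks a thread in accept(), closes the socket, and reports whether the thread was released. */
    static boolean acceptUnblockedByClose() {
        AtomicBoolean released = new AtomicBoolean(false);
        try {
            ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
            Thread acceptor = new Thread(() -> {
                try {
                    server.accept();          // blocks: nobody ever connects
                } catch (SocketException e) {
                    released.set(true);       // thrown because the socket was closed
                } catch (IOException e) {
                    // some other I/O failure; leave released as false
                }
            }, "acceptor");
            acceptor.start();
            Thread.sleep(100);                // give the thread time to block (not needed for correctness)
            server.close();                   // the shutdown code only needs the socket reference
            acceptor.join(5_000);
            return released.get() && !acceptor.isAlive();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```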
If those approaches failed, I'd consider pulling the plug on the entire application by calling System.exit() ... if that's a reasonable thing to do.
If you are totally desperate (and a little bit insane) you could consider using the deprecated Thread.stop() method. But there is a distinct possibility that that would leave your entire application in a broken and unresponsive state. So I would NOT recommend this approach.
The other possibilities to consider are:
that the Thread has actually responded and exited, but your shutdown code didn't notice,
that the Thread died before you tried to shut it down, and your shutdown code didn't notice,
that the Thread is deadlocked, or
that there is some long running (but not infinite) loop in the runnable that needs to be modified to check the "you die now" flag more often.
Some of these things could be diagnosed by attaching a debugger and taking a thread dump.
I think you said that you saw advice to the effect that it was a BAD IDEA to call System.gc(). This is good advice.
In a finally block you should perform the tasks that must run whenever the method exits, under any condition. The example people most often give for this is closing a database connection.
Yes, it is recommended to leave garbage collection to the JVM.
The JVM takes care of destroying objects.
After a lot of research I believe I understand the JMM quite well, certainly well enough to know that when an object is shared between two threads you must synchronize all access on the same monitor. I understand that if multiple active threads access an object concurrently all bets are off as to what they will observe.
However, if an object is deterministically and actually constructed before some other thread which uses it is started (or that thread is even constructed), does the JMM guarantee that the contents of the object seen by the later thread are the same as was configured by the earlier set-up thread.
IOW, is it possible to reference an object for the first time in a thread and observe dirty memory due to, e.g. CPU caching, instead of the real contents of the object? Or does the JMM guarantee that when first obtaining a reference to any given object, the memory it references is coherent?
I ask because there is one specific pattern I use in a number of places which I am still unsure about. Often I have an object which is constructed and configured in a piecemeal fashion and then subsequently used immutably. Because it's configured piecemeal, none of its members can be final (and I don't want to change these all to a builder pattern unless I have to).
For example, creating an HTTP connection handler, and adding plugin objects to handle specific HTTP requests. The handler is created and configured using mutators, and then installed into a TCP connection processor which uses a thread pool to process connections. Since the connection handler is configured and installed before the connection processor's thread pool is started and never changed once installed into the connection processor I don't use explicit synchronization between the thread which sets everything up and the threads which process connections.
In this specific case, it's probable that the thread configuring is also the same thread which starts the thread pool, and since the thread pool start is synchronized all the threads which run out of it are also synchronized on the same thread pool object, so this might mask any underlying problem (it's not required by my API that the starting thread is the same as the configuring thread).
Generally you should have happens-before relationships when threads interact. For instance, as provided by concurrent queues. There is not necessarily any need for finer synchronisation.
The rare case of passing objects between threads without happens-before relationships is known as unsafe publication. There are rules surrounding the use of final fields that allow this to be made safe. However, you should very rarely find yourself wanting to rely upon that.
There is always a happens-before relationship between invoking start on a thread, and the execution of the thread. So if an object is safely published to the starting thread before starting, the started thread will also see the object coherently.
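The start() guarantee can be sketched with a small example. The Handler class below is hypothetical and merely modelled on the connection handler from the question: its fields are plain, non-final and unsynchronized. Everything is configured before start() is called, and join() supplies the happens-before edge in the other direction so that the caller can read the result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes, named for this example only.
final class ConfigureThenStart {

    /** Piecemeal-configured object with no final fields and no synchronization. */
    static final class Handler {
        private List<String> plugins = new ArrayList<>();
        private int maxConnections;

        void addPlugin(String name) { plugins.add(name); }
        void setMaxConnections(int n) { maxConnections = n; }
        String describe() { return plugins + "/" + maxConnections; }
    }

    /** Configures a handler, then reads it from a thread started afterwards. */
    static String describeFromNewThread() {
        Handler handler = new Handler();
        handler.addPlugin("static-files");
        handler.setMaxConnections(16);

        String[] seen = new String[1];
        Thread worker = new Thread(() -> seen[0] = handler.describe());
        worker.start();   // all writes above happen-before the worker's run()
        try {
            worker.join();   // the worker's write to seen[0] happens-before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

The guarantee covers only what was written before start(). If the handler were mutated afterwards, some other form of synchronization would be needed.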
Doesn't marking a variable 'volatile' prevent threads from seeing 'dirty' values?
I have an application which has to live as a service. I create an object which then spawns off a bunch of threads.
If I set the only reference to that object to null, will all the child threads get cleaned up? Or will I suffer from a memory leak?
Do I have to explicitly terminate all the child threads?
Threads and static references are 'root objects'. They are immune from GCing and anything that can be traced back to them directly or indirectly cannot be collected. Threads will therefore not be collected as long as they are running. Once the run method exits though, the GC can eat up any unreferenced thread objects.
Yes, you need to make sure your other threads stop. The garbage collector is irrelevant to this. You should also do so in an orderly fashion though - don't just abort them.
Here's a pattern in C# for terminating threads co-operatively - it's easy to translate to Java.
As others have mentioned, threads won't be cleaned up until they've been stopped. They are root objects for the GC, so you don't have to keep references to them. Your application won't quit until all threads have exited.
There is one exception to this rule. If you mark a thread as a daemon then it will not prevent your application from exiting, and if there are no other non-daemon threads running then it will be cleaned up automatically.
See the javadoc for Thread for more info.
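Marking a thread as a daemon might look like the following sketch. The class name DaemonDemo and the thread name "housekeeping" are made up for the illustration, and the sleep loop stands in for real background work.

```java
// Hypothetical demo class, named for this example only.
final class DaemonDemo {

    /** Starts a background thread that will not keep the JVM alive on its own. */
    static Thread startBackgroundThread() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(1_000);   // placeholder for periodic background work
                }
            } catch (InterruptedException e) {
                // interrupted: let the thread end
            }
        }, "housekeeping");
        t.setDaemon(true);   // must be set before start()
        t.start();
        return t;
    }
}
```

A daemon thread is simply abandoned when the last non-daemon thread exits, so it gets no chance to clean up. For threads that hold resources, stopping them co-operatively is still the safer choice.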
No matter the theory (or StackOverflow answers), you should also create some test to see if what you intended to do, is really happening. Maybe you have some forgotten pointer preventing garbage collection.