In order to understand weak references in Java, I have had to consult the Java Language Specification. The following part, from section 12.6, puzzles me:
An unfinalized object has never had its finalizer automatically invoked;
a finalized object has had its finalizer automatically invoked. A finalizable
object has never had its finalizer automatically invoked, but the Java virtual
machine may eventually automatically invoke its finalizer.
So what is the formal difference between an unfinalized and a finalizable object? From the quote it seems that if unfinalized and finalizable are to be different, then for an unfinalized object it must not be true that the JVM may eventually invoke its finalizer. A little confusing, or I still have some English semantics to study ;)
Link to the section in the Java spec: Implementing Finalization
The answer seems to lie in this line:
If the Java virtual machine detects that an unfinalized object has become finalizer-reachable or unreachable, it may label the object finalizable (G, H);
Unfinalized objects are not eligible for finalization yet. They are reachable. Finalizable objects are eligible to be finalized, so the JVM may do that when it chooses. In other words, "may" in the sense of "has permission to" not just in the sense of "it might happen."
The difference between an unfinalized and a finalizable object is that the finalizer on the second one could be automatically invoked at any time in the future, while the finalizer on the unfinalized object can't be automatically invoked, unless the object first becomes finalizable.
An unfinalized object will not have its finalizer automatically invoked by the JVM while it is in this state.
A finalizable object can eventually have its finalizer automatically invoked by the JVM.
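To make the distinction concrete, here is a minimal sketch that models the three finalization states as an enum. It is only an illustration of the wording in JLS 12.6: the class name `FinalizationStates` and its methods are hypothetical, and nothing here corresponds to real JVM code.

```java
import java.util.EnumSet;
import java.util.Set;

/** Toy model of the finalization states named in JLS 12.6; not JVM code. */
public class FinalizationStates {

    public enum State {
        UNFINALIZED, FINALIZABLE, FINALIZED;

        /** States the JVM is permitted to move an object into from this one. */
        public Set<State> allowedNext() {
            switch (this) {
                // Allowed only once the object is finalizer-reachable or unreachable.
                case UNFINALIZED: return EnumSet.of(FINALIZABLE);
                // The JVM labels the object finalized and invokes finalize().
                case FINALIZABLE: return EnumSet.of(FINALIZED);
                // finalize() is never invoked automatically a second time.
                default:          return EnumSet.noneOf(State.class);
            }
        }

        /** "May" in the sense of "has permission to": only FINALIZABLE qualifies. */
        public boolean jvmMayInvokeFinalizer() {
            return this == FINALIZABLE;
        }
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + ": JVM may invoke finalizer = "
                    + s.jvmMayInvokeFinalizer() + ", next = " + s.allowedNext());
        }
    }
}
```

In this model an unfinalized object has to pass through the finalizable state before its finalizer can run, which is the reading of "may" given in the answers above.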
There is no guarantee that a GC will ever be performed or that finalize() will ever be called. It is highly likely that it will happen at some point.
When an object no longer has a strong reference to it, it can be garbage collected. Some time later a GC can be performed and the object is added to a finalization queue to have its finalize() method called. Once the method has been called, the object can be removed if there is still no strong reference to it.
The PhantomReference Javadoc for Java 8 and earlier looks like this:
Phantom reference objects, which are enqueued after the collector
determines that their referents may otherwise be reclaimed. Phantom
references are most often used for scheduling pre-mortem cleanup
actions in a more flexible way than is possible with the Java
finalization mechanism. If the garbage collector determines at a
certain point in time that the referent of a phantom reference is
phantom reachable, then at that time or at some later time it will
enqueue the reference.
In order to ensure that a reclaimable object remains so, the referent
of a phantom reference may not be retrieved: The get method of a
phantom reference always returns null.
Unlike soft and weak references, phantom references are not
automatically cleared by the garbage collector as they are enqueued.
An object that is reachable via phantom references will remain so
until all such references are cleared or themselves become unreachable
The PhantomReference Javadoc for Java 9 and later looks like this:
Phantom reference objects, which are enqueued after the collector
determines that their referents may otherwise be reclaimed. Phantom
references are most often used to schedule post-mortem cleanup
actions. Suppose the garbage collector determines at a certain point
in time that an object is phantom reachable. At that time it will
atomically clear all phantom references to that object and all phantom
references to any other phantom-reachable objects from which that
object is reachable. At the same time or at some later time it will
enqueue those newly-cleared phantom references that are registered
with reference queues.
In order to ensure that a reclaimable object remains so, the referent
of a phantom reference may not be retrieved: The get method of a
phantom reference always returns null.
Did something change in PhantomReference behaviour in Java 9, or did the Java designers just rethink the purpose of that class?
Since Java 9, PhantomReferences (PR) are automatically cleared. What you see is the Javadoc change that came as a result of that change.
Before Java 9, the object referenced by PR was kept alive, even though its get() would return null. Therefore, until PR itself is dead, the referent would be technically alive, although you could not acquire the reference to it. The benefits of this behavior are not very clear. Anyhow, PR handling would be the "pre-mortem cleanup".
Since Java 9, PR is cleared right before enqueueing (just like weak and soft references), so the referent itself becomes fully dead before PR is processed by application code, which would be the "post-mortem cleanup".
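As a rough illustration of the API discussed here, the sketch below creates a PhantomReference with a ReferenceQueue. The class name `PhantomDemo` and its methods are hypothetical. The first method relies only on documented behaviour (get() always returns null). The second depends on the collector actually running, so it waits with a timeout and may legitimately return false.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class PhantomDemo {

    /** get() returns null even while the referent is still strongly reachable. */
    public static boolean getAlwaysReturnsNull() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);
        // referent is used after get() so it stays strongly reachable here.
        return ref.get() == null && referent != null;
    }

    /** Bounded wait for the reference to be enqueued; System.gc() is only a hint. */
    public static boolean enqueuedAfterGc(long timeoutMillis) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> ref = new PhantomReference<>(new Object(), queue);
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (System.currentTimeMillis() < deadline) {
                System.gc();
                Reference<?> polled = queue.remove(100);
                if (polled != null) {
                    // Cleanup for the dead referent would go here.
                    return polled == ref;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("get() is null: " + getAlwaysReturnsNull());
        System.out.println("enqueued within 2s: " + enqueuedAfterGc(2000));
    }
}
```

The application code never sees the referent itself, only the enqueued reference object, so any state needed for cleanup has to be kept somewhere other than the referent (for example in a subclass of PhantomReference).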
I'm reading this article and I can't really understand why finalizable objects (objects which override the finalize method) take at least 2 GC cycles before they can be reclaimed.
It takes at least two garbage collection cycles (in the best case) before a finalizeable object can be reclaimed.
Can someone also explain in detail how is it possible for a finalizable object to take more than one GC cycle for reclamation?
My logical argument is that when we override the finalize method, the runtime will have to register this object with the garbage collector (so that the GC can call finalize on this object, which makes me think that the GC will have a reference to all the finalizable objects). And for this, the GC will have to keep a strong reference to the finalizable object. If that is the case, then how did this object become a candidate for reclamation by the GC in the first place? This theory leads me to a contradiction.
PS: I understand that overriding finalize is not the recommended approach and this method is deprecated since Java 9.
You are right in that the garbage collector needs a reference to finalizable objects. Of course, this particular reference must not be considered when deciding whether the object is still reachable before the finalization. This implies special knowledge about the nature of this reference to the garbage collector.
When the garbage collector determines that an object is eligible for finalization, the finalizer will run, which implies that the object becomes strongly reachable again, at least as long as the finalizer is executed. After its finalization, the object must become unreachable again and this must be detected, before the object’s memory can be reclaimed. That’s why it takes at least two garbage collection cycles.
In the case of the widely used HotSpot/OpenJDK environment (and likely also in IBM’s JVM), this is implemented by creating an instance of a special, non-public subclass of Reference, a Finalizer, right when an object whose class has a non-trivial finalize() method is created. As with weak and soft references, these references are enqueued by the garbage collector when no strong reference to the referent exists, but they are not cleared, so the finalizer thread can read the object, making it strongly reachable again for the finalization. At this point, the Finalizer is cleared, but it is also not referenced anymore, so it would get collected like an ordinary object anyway. By the next time the referent becomes unreachable, no special reference to it exists anymore.
For objects whose class has a “trivial finalizer”, i.e. the finalize() method inherited from java.lang.Object or an empty finalize() method, the JVM will take a short-cut and not create the Finalizer instance in the first place, so you could say that these objects, which make up the majority of all objects, behave as if their finalizer had already run, right from the start.
Though you got your answer (which is absolutely correct), I want to add a small-ish addendum here. In general, references are of two types : strong and weak. Weak References are WeakReference/SoftReference/PhantomReference and Finalizer(s).
When a certain GC cycle traverses the heap graph and sees one of these weak references, it treats it in a special way. When it first encounters a dead finalizer reference (let's consider this to be the first GC cycle), it has to resurrect the instance. finalize is an instance method, and it needs an actual instance to be invoked. So a GC first saw that this Object is dead, only to revive it moments later, to be able to call finalize on it. Once it calls that method on it, it marks the fact that it has already been called; so when the next cycle happens, it can actually be GC-ed.
It would be incorrect to call this the second GC.
For example G1GC does partial clean-up of the heap (young and mixed), so it might not even capture this reference in the next cycle. It might not fall under its radar, as simple as that.
Other GCs, like Shenandoah, have flags that control on which iteration to handle these special references (ShenandoahRefProcFrequency, 5 by default).
So indeed there is a need for two cycles, but they do not have to be subsequent.
As far as I understand, GC starts with some set of initial objects (stack, static objects) and recursively traverses it building a graph of reachable objects. Then it marks the memory taken by these objects as occupied and assumes all the rest of the memory free.
But what if this 'free' memory contains an object with finalize method? GC has to call it, but I don't see how it can even know about objects that aren't reachable anymore.
I suppose GC can keep track of all 'finalizable' objects while they are alive. If so, does having finalizable objects make garbage collecting more expensive even when they are still alive?
Consider the Reference API.
It offers some references with special semantics to the GC, i.e Weak, Soft, and Phantom references. There’s simply another non-public type of special reference, for objects needing finalization.
Now, when the garbage collector traverses the object graph and encounters such a special reference object, it will not mark objects reachable through this reference as strongly reachable, but reachable with the special semantics. So if an object is only finalizer-reachable, the reference will be enqueued, so that one (or one of the) finalizer thread(s) can poll the queue and execute the finalize() method (it’s not the garbage collector itself calling this method).
In other words, the garbage collector never processes entirely unreachable objects here. To apply a special semantic to the reachability, the reference object must be reachable, so the referent can be reached through that reference. In case of finalizer-reachability, Finalizer.register is called when an object is created and it creates an instance of Finalizer in turn, a subclass of FinalReference, and right in its constructor, it calls an add() method which will insert the reference into a global linked list. So all these FinalReference instances are reachable through that list until an actual finalization happens.
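The flow described above (register at creation, enqueue when only finalizer-reachable, finalize on a separate thread) can be sketched as a toy model. Everything in it is hypothetical: `ToyFinalizerRegistry` and `ToyFinalReference` merely stand in for the JDK's non-public classes, referents are represented by names, and the "collector" and "finalizer thread" steps are plain method calls rather than real GC activity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Toy model of finalizer registration; names are hypothetical, not JDK internals. */
public class ToyFinalizerRegistry {

    /** Stand-in for the JDK's non-public Finalizer reference object. */
    public static final class ToyFinalReference {
        final String referentName;
        ToyFinalReference(String referentName) { this.referentName = referentName; }
    }

    // Plays the role of the global list that keeps every reference object reachable.
    private final Set<ToyFinalReference> registered = new LinkedHashSet<>();
    // Filled by the "collector", drained by the "finalizer thread".
    private final Queue<ToyFinalReference> pending = new ArrayDeque<>();
    private final List<String> finalized = new ArrayList<>();

    /** Models registration at object creation for a non-trivial finalize(). */
    public ToyFinalReference register(String referentName) {
        ToyFinalReference ref = new ToyFinalReference(referentName);
        registered.add(ref);
        return ref;
    }

    /** "Collector" step: the referent is reachable only through ref, so enqueue it. */
    public void enqueueIfOnlyFinalizerReachable(ToyFinalReference ref) {
        if (registered.contains(ref) && !pending.contains(ref)) {
            pending.add(ref);
        }
    }

    /** "Finalizer thread" step: unlink each pending reference and finalize its referent. */
    public void runPendingFinalizers() {
        ToyFinalReference ref;
        while ((ref = pending.poll()) != null) {
            registered.remove(ref);           // no special reference remains afterwards
            finalized.add(ref.referentName);  // stands in for calling finalize()
        }
    }

    public int registeredCount() { return registered.size(); }

    public List<String> finalizedNames() { return finalized; }

    public static void main(String[] args) {
        ToyFinalizerRegistry registry = new ToyFinalizerRegistry();
        ToyFinalReference ref = registry.register("resource-1");
        registry.enqueueIfOnlyFinalizerReachable(ref);
        registry.runPendingFinalizers();
        System.out.println("finalized: " + registry.finalizedNames());
    }
}
```

The point of the model is that the collector only ever follows reference objects that are themselves reachable; it never has to discover finalizable objects inside memory it already considers free.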
Since this FinalReference will be created right on the instantiation of the object, if its class declares a non-trivial finalize() method, there is already some overhead due to having a finalization requirement, even if the object has not been collected yet.
The other issue is that an object processed by a finalizer thread is reachable by that thread and might even escape, depending on what the finalize() method does. But the next time this object becomes unreachable, the special reference object does not exist anymore, so it can be treated like any other unreachable object.
This would only be a performance issue if memory is very low and the next garbage collection had to be performed earlier to eventually reclaim that object. But this doesn’t happen in the reference implementation (aka “HotSpot” or “OpenJDK”). In fact, there could be an OutOfMemoryError while objects are pending in the finalizer queue, whose processing could make more memory reclaimable. There is no guarantee that finalization runs fast enough for your purposes. That’s why you should not rely on it.
But what if this 'free' memory contains an object with finalize
method? GC has to call it, but I don't see how it can even know about
objects that aren't reachable anymore.
Let's say we use CMS garbage collector. After it successfully marked all live objects in a first phase, it will then scan memory again and remove all dead objects. GC thread does not call finalize method directly for these objects.
During creation, they are wrapped and added to finalizer queue by JVM (see java.lang.ref.Finalizer.register(Object)). This queue is processed in another thread (java.lang.ref.Finalizer.FinalizerThread), finalize method will be called when there are no references to the object. More details are covered in this blog post.
If so, does having finalizable objects make garbage collecting more
expensive even when they are still alive?
As you can now see, most of the time it does not.
The finalize method is called when an object is about to get garbage collected. That means, when the GC determines that the object is no longer being referenced, it can call the finalize method on it. It doesn't have to keep track of objects to be finalized.
According to javadoc, finalize
Called by the garbage collector on an object when garbage collection determines that there are no more references to the object.
So the decision is based on reference counter or something like that.
Actually it is possible not to have this method called at all. So it may be not a good idea to use it as destructor.
Hi, I have a doubt about phantom references. As I understand it, the finalize method is called just before an object is garbage collected. But if the object is not eligible for garbage collection, the finalize method will not execute.
Now, regarding phantom references: when will the finalize method be called?
Is finalize always called for an object that has a phantom reference?
I am very confused about this. Please help me.
Finalizers are never guaranteed to be called, whether there is a phantom reference or not. Don't rely on finalizers for any critical part of your code because there is no guarantee that they will be called in a timely manner or in fact at all.
Many people advocate that you simply should never use finalizers at all because they are incredibly difficult to use correctly.
When an object becomes reachable only through a phantom reference, its finalize() method is invoked after the first GC, and the reference is enqueued after the second GC. If the phantom reference is then cleared (or becomes unreachable itself), the memory is reclaimed after the third GC.
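The ordering described here (finalizer first, phantom reference enqueued afterwards) can be probed with a small experiment. This is only a sketch: `FinalizeThenPhantom` and `Resource` are hypothetical names, System.gc() is merely a hint, and the result depends on the JVM and on whether finalization is enabled at all, so the method reports what it observed rather than promising a value.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class FinalizeThenPhantom {

    static final AtomicInteger finalizeCalls = new AtomicInteger();

    /** Hypothetical class with a non-trivial finalizer. */
    static class Resource {
        @Override
        @SuppressWarnings({"deprecation", "removal"})
        protected void finalize() {
            finalizeCalls.incrementAndGet();
        }
    }

    /**
     * Returns how many finalize() calls had been seen at the moment the phantom
     * reference was dequeued, or -1 if it was not enqueued before the timeout.
     */
    public static int finalizeCallsWhenEnqueued(long timeoutMillis) {
        finalizeCalls.set(0);
        ReferenceQueue<Resource> queue = new ReferenceQueue<>();
        PhantomReference<Resource> ref = new PhantomReference<>(new Resource(), queue);
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (System.currentTimeMillis() < deadline) {
                System.gc(); // a hint only; several cycles may be needed
                if (queue.remove(100) == ref) {
                    return finalizeCalls.get();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("finalize() calls seen at enqueue time: "
                + finalizeCallsWhenEnqueued(3000));
    }
}
```

An object is phantom reachable only once it has been finalized, so on a JVM with finalization enabled the reference should not show up in the queue before finalize() has run.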
Finalize will always be called, but not necessarily when you expect it. It may happen that the call is only made at JVM shutdown (assuming you don't simply kill the program). You should not rely on finalize() to do significant work. But it is also good practice to implement a useful finalize() and include a call to super.finalize() too.
In Java, finalize is called on an object (that overrides it) when it's about to be garbage collected, that is, when it's unreachable. But what if the finalizer makes the object reachable again, what happens then?
Then the object doesn't get garbage collected, basically. This is called object resurrection. Perform a search for that term, and you should get a bunch of interesting articles. As Jim mentioned, one important point is that the finalizer will only be run once.
The object will not be collected until it gets unreachable again.
According to the JavaDoc, finalize() will not be called again.
If you read the API description carefully, you'll see that the finalizer can make the object reachable again. The object won't be discarded until it is unreachable (again), but finalize() won't be called more than once.
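A small experiment can illustrate resurrection and the "at most once" rule. This is a sketch under assumptions: `ResurrectionDemo` is a hypothetical class, System.gc() is only a hint, and the finalizer may not run at all within the waiting period. What the specification does promise is that the count never exceeds one.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ResurrectionDemo {

    /** Static field through which the finalizer makes the object reachable again. */
    static volatile ResurrectionDemo survivor;
    static final AtomicInteger finalizeCalls = new AtomicInteger();

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        finalizeCalls.incrementAndGet();
        survivor = this; // resurrect: the object is strongly reachable again
    }

    private static void gcAndWait(long millis) {
        System.gc(); // a hint only
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns the number of finalize() calls observed for a single object. */
    public static int run() {
        finalizeCalls.set(0);
        survivor = null;
        new ResurrectionDemo(); // unreachable as soon as it is created

        // Give the finalizer a chance to run and resurrect the object.
        for (int i = 0; i < 10 && survivor == null; i++) {
            gcAndWait(50);
        }

        // Drop the resurrected object; finalize() must not be invoked a second time.
        survivor = null;
        for (int i = 0; i < 5; i++) {
            gcAndWait(50);
        }
        return finalizeCalls.get();
    }

    public static void main(String[] args) {
        System.out.println("finalize() calls: " + run());
    }
}
```

After the second drop the object is reclaimed like any other unreachable object, without another finalize() call.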
Yeah, this is why you don't use finalizers (Well, one of the many reasons).
There is a reference collection that is made to do this stuff. I'll look it up and post it here in a sec, but I think it's PhantomReference.
Yep, PhantomReference:
Phantom reference objects, which are enqueued after the collector determines that their referents may otherwise be reclaimed. Phantom references are most often used for scheduling pre-mortem cleanup actions in a more flexible way than is possible with the Java finalization mechanism.
It actually does another pass to check and make sure there are no more references to the object. Since it will fail that test on its second pass, you'll end up not freeing the memory for the object.
Because finalize is only called a single time for any given object, the next time through when it has no references, it will just free the memory without calling finalize. Some good information here on finalization.