Java allows you to write:
new PhantomReference(new Object(), null)
In this case, will the new Object() be collected?
As I understand it, a phantom reference is an alternative to using the finalize() method.
After the reference appears in the queue, I need to perform some additional actions and then call clear().
The Javadoc states:
It is possible to create a phantom reference with a null queue, but
such a reference is completely useless: Its get method will always
return null and, since it does not have a queue, it will never be
enqueued
What does it mean that it will never be enqueued?
As I understand it, this means that after the finalize method has been invoked, the reference will not be added to the ReferenceQueue. This may lead to one of two outcomes:
1. The object's memory will be reclaimed at once
2. The object's memory will not be reclaimed
Which case is correct?
Well, as you noticed yourself, a PhantomReference is not automatically cleared. This implies that as long as you keep a strong reference to the PhantomReference, the referent will stay phantom reachable. As the documentation says: “An object that is reachable via phantom references will remain so until all such references are cleared or themselves become unreachable.”
However, considering when an object is unreachable (now I’m talking about the “phantom references themselves”) can lead to many surprises. Especially as it’s very likely that the reference object, not providing useful operations, will not be subsequently touched anymore.
Since the PhantomReference without a queue will never be enqueued and its get() method will always return null, it is indeed not useful.
So why does the constructor allow constructing such a useless object? Well, the documentation of the very first version (1.2) states that it will throw a NullPointerException if the queue is null. This statement persists until 1.4, then Java 5 is the first version containing the statement that you can construct a PhantomReference without a queue, despite it being useless. My guess is that it always inherited the superclass’ behavior of allowing a null queue, contradicting the documentation, and that this was noticed so late that the decision was made to stay compatible and adapt the documentation rather than change the behavior.
The question, even harder to answer, is why a PhantomReference isn’t automatically cleared. The documentation only says that a phantom reachable object will remain so, which is the consequence of not being cleared, but doesn’t explain why this has any relevance.
This question has been brought up on SO, but the answer isn’t really satisfying. It says “to allow performing cleanup before an object is garbage collected”, which might even match the mindset of whoever made that design decision, but since the cleanup code can’t access the object, it has no relevance whether it is executed before or after the object is reclaimed. As said above, since this rule depends on the reachability of the PhantomReference object, which is subject to optimizing code transformations, it might be even the case that the object is reclaimed together with the PhantomReference instance before the cleanup code completes, without anyone noticing.
I also found a similar question on the HotSpot developer mailing list back in 2013 which also lacks an answer.
There is the enhancement request JDK-8071507 to change that behavior and clear PhantomReferences just like the others, which has the status “fixed” for Java 9, and indeed, its documentation now states that they are cleared like any other reference.
This unfortunately implies that the answer at the beginning of my post will be wrong starting with Java 9. From then on, new PhantomReference(new Object(), null) will make the newly created Object instance immediately eligible for garbage collection, regardless of whether you keep a strong reference to the PhantomReference instance or not.
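A minimal sketch of the deterministic parts of the above, using only java.lang.ref classes. The class and method names (PhantomQueueDemo and so on) are made up for illustration, and Reference.reachabilityFence assumes Java 9 or later. It checks that get() returns null even while the referent is strongly reachable, and that a reference created with a null queue cannot be enqueued, not even manually. When the referent is actually reclaimed is up to the GC and deliberately not shown.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class PhantomQueueDemo {

    // get() of a PhantomReference returns null even while the referent is strongly reachable.
    static boolean getIsAlwaysNull() {
        Object referent = new Object();
        PhantomReference<Object> ref =
                new PhantomReference<>(referent, new ReferenceQueue<>());
        boolean result = ref.get() == null;
        Reference.reachabilityFence(referent); // referent is certainly alive up to here
        return result;
    }

    // With a null queue there is nothing to enqueue into, so enqueue() reports failure.
    static boolean enqueueWithoutQueue() {
        PhantomReference<Object> ref = new PhantomReference<>(new Object(), null);
        return ref.enqueue();
    }

    // With a real queue, a manual enqueue() succeeds and the reference shows up in the queue.
    static boolean enqueueWithQueue() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);
        boolean result = ref.enqueue() && queue.poll() == ref;
        Reference.reachabilityFence(referent); // keeps the GC from enqueuing it first
        return result;
    }

    public static void main(String[] args) {
        System.out.println("get() is null:        " + getIsAlwaysNull());
        System.out.println("enqueue, null queue:  " + enqueueWithoutQueue());
        System.out.println("enqueue, real queue:  " + enqueueWithQueue());
    }
}
```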
I am developing some concurrent algorithms which deal with Reference objects. I am using Java 17.
The thing is, I don't know what the memory semantics of operations like get, clear or refersTo are. They aren't documented in the Javadoc.
Looking into the source code of OpenJDK, the referent field has no modifier such as volatile (while the next pointer for reference queues is volatile).
Also, the implementation of get is trivial, but it is an intrinsic candidate. clear and refersTo are native, so I don't know what they really do.
When the GC clears a reference, I have to assume that all threads will see it cleared, or otherwise they would see a reference to an object (in process of being) garbage collected, but it's just an informal guess.
Is there any guarantee about the memory semantics of all these operations?
If there isn't, is there a way to obtain the same guarantees as a volatile access by invoking, for instance, a fence operation before and/or after calling one of these operations?
When you invoke clear() on a reference object, it will only clear this particular Reference object without any impact on the rest of your application and no special memory semantics. It’s exactly like you have seen in the code, an assignment of null to a field which has no volatile modifier.
Mind the documentation of clear():
This method is invoked only by Java code; when the garbage collector clears references it does so directly, without invoking this method.
So this is not related to the event of the GC clearing a reference. Your assumption “that all threads will see it cleared” when the GC clears a reference is correct. The documentation of WeakReference states:
Suppose that the garbage collector determines at a certain point in time that an object is weakly reachable. At that time it will atomically clear all weak references to that object and all weak references to any other weakly-reachable objects from which that object is reachable through a chain of strong and soft references.
So at this point, not only all threads will agree that a weak reference has been cleared, they will also agree that all weak references to the same object have been cleared. A similar statement can be found at SoftReference and PhantomReference.
The Java Language Specification, §12.6.2. Interaction with the Memory Model refers to points where such an atomic clear may happen as reachability decision points. It specifies interactions between these points and other program actions, in terms of “comes-before di” and “comes-after di” relationships, the most important ones being:
If r is a read that sees a write w and r comes-before di, then w must come-before di.
If x and y are synchronization actions on the same variable or monitor such that so(x, y) (§17.4.4) and y comes-before di, then x must come-before di.
So, the GC action will be inserted into the synchronization order and even a racy read could not subvert it, but it’s important to keep in mind that the exact location of the reachability decision point is not known to the application. It’s obviously somewhere between the last point where get() returned a non-null reference or refersTo(null) returned false and the first point where get() returned null or refersTo(null) returned true.
For practical applications, the fact that once the reference reports the object to be garbage collected you can be sure that it won’t reappear anywhere¹ is enough. Just keep the reference object private, so you can be sure that nobody else has invoked clear() on it.
¹ Leaving things like “finalizer resurrection” aside
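The advice about keeping the reference object private might look like the sketch below. WeakHolder and its methods are hypothetical names, not part of any library. Because the WeakReference never escapes the class, no other code can call clear() on it, so a null result can only mean that the garbage collector cleared it.

```java
import java.lang.ref.WeakReference;
import java.util.Objects;

public final class WeakHolder<T> {
    // Never handed out, so no outside code can call clear() on it.
    private final WeakReference<T> ref;

    public WeakHolder(T value) {
        this.ref = new WeakReference<>(Objects.requireNonNull(value, "value"));
    }

    // Returns the referent, or null once the GC has cleared the reference.
    public T getOrNull() {
        return ref.get();
    }

    // True once the GC has cleared the reference; it never becomes false again.
    public boolean isCollected() {
        return ref.get() == null;
    }

    public static void main(String[] args) {
        String value = new String("payload");
        WeakHolder<String> holder = new WeakHolder<>(value);
        System.out.println("collected: " + holder.isCollected());
        System.out.println("same object: " + (holder.getOrNull() == value));
    }
}
```

Callers that need the object should read getOrNull() once into a local variable and null-check that, rather than calling isCollected() first, since the reference may be cleared between the two calls.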
Just trying to understand something from GC viewpoint
public Set<Something> returnFromDb(String id) {
    LookupService service = fromSomewhere();
    Map<String,Object> where = new WeakHashMap<>();
    where.put("id",id);
    return service.doLookupByKVPair(where); // where doesn't need to be serializable
}
What I understand is that once this method call leaves the stack, there is no reference to where, regardless of whether HashMap or WeakHashMap is used. But since a weak reference is only weakly reachable, wouldn't this be GC'd faster? Then again, if the method call leaves the stack, there is no reachable reference anyway.
I guess the real question I have is: "Would using WeakHashMap<> here actually matter at all?" I think the answer is "No, because the impact is insignificant", but a second opinion wouldn't hurt my knowledge.
When you use a statement like where.put("id",id); you’re associating a value with a String instance created from a literal, which is permanently referenced by the code containing it. So the weak semantics of the association are pointless: as long as the code is reachable, this specific key object will never get garbage collected.
When the entire WeakHashMap becomes unreachable, the weak nature of the references has no impact on the garbage collection, just as unreachable objects in general have none. As discussed in this answer, the garbage collection performance mainly depends on the reachable objects, not the unreachable ones.
Keep in mind the documentation:
The relationship between a registered reference object and its queue is one-sided. That is, a queue does not keep track of the references that are registered with it. If a registered reference becomes unreachable itself, then it will never be enqueued. It is the responsibility of the program using reference objects to ensure that the objects remain reachable for as long as the program is interested in their referents.
In other words, a WeakReference has no impact when it is unreachable, as it will be treated like any other garbage, i.e. not treated at all.
When you have a strong reference to a WeakHashMap while a garbage collection is in progress, it will reduce the performance, as the garbage collector has to keep track of the encountered reachable WeakReference instances, to clear and enqueue them if their referent has not been encountered and marked as strongly reachable. This additional effort is the price you have to pay for allowing the earlier collection of the keys and the subsequent cleanup, which is needed to remove the strongly referenced value.
As said, when the key will never be garbage collected, as in your example, this additional effort is wasted. But if no garbage collection happens while the WeakHashMap is in use, there will be no impact, because the collection of an entire object graph happens at once, regardless of what kind of objects are in the garbage.
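To make the difference between a literal key and a freshly created key concrete, here is a small sketch. WeakKeyDemo is a made-up name, and System.gc() is only a hint to the JVM, so the second method makes no promise about its result.

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakKeyDemo {

    // The literal "id" is referenced by the loaded class, so it stays strongly
    // reachable and its entry cannot be dropped, however often the GC runs.
    static boolean literalKeySurvivesGc() {
        Map<String, Object> map = new WeakHashMap<>();
        map.put("id", "some value");
        System.gc(); // only a hint; the key cannot be collected either way
        return map.containsKey("id");
    }

    // A key without any other strong reference MAY be removed after a collection.
    // Whether that has happened by the time size() is called is up to the JVM.
    static int sizeAfterDroppingKey() {
        Map<Object, Object> map = new WeakHashMap<>();
        map.put(new Object(), "some value");
        System.gc();
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("literal key still present: " + literalKeySurvivesGc());
        System.out.println("entries left for dropped key: " + sizeAfterDroppingKey());
    }
}
```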
ReferenceQueue q = new ReferenceQueue();
Reference r = q.remove();
r.clear();
I see that the Javadoc says that the clear method clears this reference object. I don't understand what this means. Does it clear the object from memory, in other words, has the object been garbage collected?
java.lang.Reference is the base class for a few special references which are treated in a special way by the garbage collector.
Under certain circumstances, the garbage collector may push a reference object onto its reference queue (a reference may be enqueued only once in its lifetime).
The clear() method can be used to suppress this special handling (and thus additional work for the garbage collector). If the reference object is already in the queue, it doesn't make sense to clear it; it has already been cleared by the garbage collector.
This project on GitHub has an implementation of resource management using PhantomReferences, made for educational purposes. clear() is used if a resource is disposed of explicitly, to avoid extra work for the GC in that case.
clear() simply sets the internal reference to null. Since references are automatically cleared when being enqueued by the garbage collector (with the exception of phantom references, but this oddity can be ignored, it will be eliminated in Java 9), there is usually no need to call clear() on a reference received via ReferenceQueue.remove().
In principle, there is the possibility to enqueue references manually via enqueue() without clearing them, but there is little sense in that, as the primary purpose of the reference queue is to learn about references being enqueued by the garbage collector which will be cleared.
When you call clear() on a Reference object that has not been enqueued yet, it may allow the referent to get collected without enqueuing the Reference object. On the other hand, when you don’t need the Reference object anymore, you can let the JVM collect it like an ordinary object, together with the referent if there are no other references left, as in that case, it won’t get enqueued as well, making clear() unnecessary.
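A short sketch of what clear() does and does not do. ClearDemo is a hypothetical name, and Reference.reachabilityFence assumes Java 9 or later. Calling clear() on one Reference leaves the referent and any other references to it untouched, and it does not put anything into the queue.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class ClearDemo {

    static boolean clearIsLocal() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        WeakReference<Object> first = new WeakReference<>(referent, queue);
        WeakReference<Object> second = new WeakReference<>(referent, queue);

        first.clear(); // only sets the referent field of 'first' to null

        boolean result = first.get() == null   // cleared by us
                && second.get() == referent    // the other reference is untouched
                && queue.poll() == null;       // clear() did not enqueue anything
        Reference.reachabilityFence(referent); // keep the referent alive until here
        return result;
    }

    public static void main(String[] args) {
        System.out.println("clear() is local to one Reference: " + clearIsLocal());
    }
}
```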
If I have a reference pointing to some Java object, and do something like:
myObject=null;
Will the "lost data" of the old object be correctly freed by the JVM Garbage Collector? Something similar in C (with a pointer) would result in garbage and a possible memory leak.
I am using null assignment in a Java program and would like to know if it is "safe".
If myObject only holds memory (say, a large internal array), then setting this reference to null is enough.
If, on the other hand, it holds some other kind of resource that you've allocated (Closeable, Thread, ExecutorService, etc.), you must take care to properly shut down these resources.
Even though some of them may have a finalize method, it may be called too late (or even never) to have the desired effect on your system.
This is a very common mistake for somebody switching from C++ to Java, and I am guilty as charged here. In my first real Java project I would periodically run out of file handles, because I was not calling close after being done with them. Needless to say, with a 512MB heap, the GC would never feel the need to start finalizing my IO objects before it was too late.
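For resources like these, the usual idiom is try-with-resources, which calls close() as soon as the block is left, independent of the GC. A minimal sketch, with TrackedResource as a hypothetical stand-in for a file handle or socket:

```java
public class CloseDemo {

    // Hypothetical stand-in for a file handle, socket, or similar non-memory resource.
    static final class TrackedResource implements AutoCloseable {
        private boolean closed;

        String read() {
            if (closed) {
                throw new IllegalStateException("already closed");
            }
            return "data";
        }

        @Override
        public void close() {
            closed = true; // a real resource would release its handle here
        }

        boolean isClosed() {
            return closed;
        }
    }

    // Uses the resource and reports whether it was closed when the block was left.
    static boolean useAndRelease() {
        TrackedResource seenAfterBlock;
        try (TrackedResource resource = new TrackedResource()) {
            resource.read();
            seenAfterBlock = resource;
        } // close() is called here, even if read() had thrown
        return seenAfterBlock.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("closed after block: " + useAndRelease());
    }
}
```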
Assuming that there are no other references to the object, this is a good way to free memory up for the GC. (Actually, aside from weak references and the like, it's basically the only way: make the object unreachable from any live variables.) Note that there is no schedule for when an object might get garbage collected once it becomes unreachable.
EDIT: As others have pointed out, setting myObject to null is unnecessary if myObject is going out of scope anyway. When the variable itself is no longer available as a path to reach the object it references, then it doesn't matter to the GC system whether or not it contains a reference or null.
Your assumption is correct, but you don't usually need to specifically do that.
Let's say your "myObject" is used in another object. At some point in the lifetime of your application's execution, this object will stop being referenced by any other object, and thus will be marked for deletion by the GC. Then myObject will be marked for deletion as well. Once all references to a given object have disappeared, the GC will eventually reclaim the memory.
There are (rare) exceptions, like event handling, where the dependency between two objects cannot be ended automatically, and you may end up with a memory leak: when you subscribe to an event on another class, the subscriber cannot be collected even when there are no "direct" references to it. In that specific case, it might be worthwhile to clear the link manually.
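The event-handling case could look roughly like this. EventSource and its methods are invented for illustration and do not belong to any particular framework. The point is that the source's listener list holds a strong reference to the subscriber until it is removed explicitly.

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerLeakDemo {

    // Hypothetical event source that keeps strong references to its listeners.
    static final class EventSource {
        private final List<Runnable> listeners = new ArrayList<>();

        void subscribe(Runnable listener) {
            listeners.add(listener);
        }

        void unsubscribe(Runnable listener) {
            listeners.remove(listener);
        }

        int listenerCount() {
            return listeners.size();
        }
    }

    static int countAfterUnsubscribe() {
        EventSource source = new EventSource();
        Runnable subscriber = () -> { };
        source.subscribe(subscriber);   // source now strongly references subscriber
        source.unsubscribe(subscriber); // without this, subscriber lives as long as source
        return source.listenerCount();
    }

    public static void main(String[] args) {
        System.out.println("listeners left: " + countAfterUnsubscribe());
    }
}
```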
Yes, that is the purpose of the garbage collector in the JVM. The JVM may at some later time call the finalize method of the object, and then it may discard the associated storage.
Yes, it's sometimes a GOOD idea to set Java object references (pointers) to null. This may (if there are no other references to the object) "free" the object sooner than would otherwise occur. This is especially helpful when you have large "networks" of intertwined objects.
In the worst case, you're paying for one additional memory store.
Yes, the object the reference pointed to is eligible for garbage collection (if there are no other live references to the object) when:
The method returns - if it was initially created with method local scope
Immediately - if it is an instance or class variable
In Java, finalize is called on an object (that overrides it) when it's about to be garbage collected, that is, when it's unreachable. But what if the finalizer makes the object reachable again? What happens then?
Then the object doesn't get garbage collected, basically. This is called object resurrection. Perform a search for that term, and you should get a bunch of interesting articles. As Jim mentioned, one important point is that the finalizer will only be run once.
The object will not be collected until it gets unreachable again.
According to the JavaDoc, finalize() will not be called again.
If you read the API description carefully, you'll see that the finalizer can make the object reachable again. The object won't be discarded until it is unreachable (again), but finalize() won't be called more than once.
Yeah, this is why you don't use finalizers (Well, one of the many reasons).
There is a reference collection that is made to do this stuff. I'll look it up and post it here in a sec, but I think it's PhantomReference.
Yep, PhantomReference:
Phantom reference objects, which are enqueued after the collector determines that their referents may otherwise be reclaimed. Phantom references are most often used for scheduling pre-mortem cleanup actions in a more flexible way than is possible with the Java finalization mechanism.
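As a rough illustration of such cleanup without finalize(), the sketch below uses java.lang.ref.Cleaner, which is not mentioned above: it is available from Java 9 on and packages the phantom-reference-plus-queue pattern. CleanerDemo, Resource and release are hypothetical names. Since the cleanup action never gets access to the object, resurrection is not possible, and the action runs at most once whether it is triggered explicitly or by the GC.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

public class CleanerDemo {
    private static final Cleaner CLEANER = Cleaner.create();

    // Counts how often the cleanup action ran; stands in for releasing a real resource.
    static final AtomicInteger CLEANUPS = new AtomicInteger();

    static final class Resource implements AutoCloseable {
        private final Cleaner.Cleanable cleanable;

        Resource() {
            // The action must not reference 'this', or the Resource could never
            // become phantom reachable. A static method reference captures nothing.
            this.cleanable = CLEANER.register(this, CleanerDemo::release);
        }

        @Override
        public void close() {
            cleanable.clean(); // runs the action now, unless it has already run
        }
    }

    private static void release() {
        CLEANUPS.incrementAndGet();
    }

    // Closes the same resource twice and returns how often the action ran.
    static int closeTwice() {
        int before = CLEANUPS.get();
        Resource resource = new Resource();
        resource.close();
        resource.close();
        return CLEANUPS.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("cleanup runs for two close() calls: " + closeTwice());
    }
}
```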
It actually does another pass to check and make sure there are no more references to the object. Since it will fail that test on its second pass, you'll end up not freeing the memory for the object.
Because finalize is only called a single time for any given object, the next time through when it has no references, it will just free the memory without calling finalize. Some good information here on finalization.