Does circular GC work in a map? - java

I have a User object which strongly refers to a Data object.
If I create a Map<Data, User> (with Guava MapMaker) with weak keys, such a key would only be removed if it's not referenced anywhere else. However, it is always referred to by the User object that it maps to, which is in turn only removed from the map when the Data key is removed, i.e. never, unless the GC's circular reference detection also works when crossing a map (I hope you understand what I mean :P)
Will Users+Datas be garbage collected if they're no longer used elsewhere in the application, or do I need to specify weak values as well?

The GC doesn't detect circular references because it doesn't need to.
The approach it takes is to keep every object that is strongly reachable from a root, e.g. a thread stack or a static field. Anything not strongly reachable this way (with circular references or not) is collected.
EDIT: This may help explain the "myth"
http://www.javacoffeebreak.com/articles/thinkinginjava/abitaboutgarbagecollection.html
Reference counting is commonly used to explain one kind of garbage collection but it doesn't seem to be used in any JVM implementations.
This is an interesting link http://www.ibm.com/developerworks/library/j-jtp10283/
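To make the reachability idea concrete, here is a minimal sketch of a mark phase over a toy object graph. ToyNode and ToyTracer are made-up names for illustration only; a real JVM collector is far more involved, but the principle (start at the roots, follow strong references, discard whatever was never reached) is the same.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical toy model of a heap object; a real JVM does not represent objects this way.
class ToyNode {
    final String name;
    final List<ToyNode> refs = new ArrayList<>();   // outgoing strong references

    ToyNode(String name) {
        this.name = name;
    }
}

class ToyTracer {
    // Returns every node reachable from the roots by following strong references.
    static Set<ToyNode> reachable(List<ToyNode> roots) {
        Set<ToyNode> marked = Collections.newSetFromMap(new IdentityHashMap<ToyNode, Boolean>());
        Deque<ToyNode> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            ToyNode node = pending.pop();
            if (marked.add(node)) {            // first visit: mark it, then follow its references
                pending.addAll(node.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        ToyNode root = new ToyNode("root");
        ToyNode live = new ToyNode("live");
        root.refs.add(live);

        // a and b refer to each other, but no root leads to either of them
        ToyNode a = new ToyNode("a");
        ToyNode b = new ToyNode("b");
        a.refs.add(b);
        b.refs.add(a);

        Set<ToyNode> marked = reachable(Arrays.asList(root));
        System.out.println("live marked: " + marked.contains(live));
        System.out.println("cycle marked: " + (marked.contains(a) || marked.contains(b)));
    }
}
```

The cycle between a and b never needs to be "detected": the trace simply never arrives at it, so both nodes stay unmarked.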

In documentation you see:
weakKeys()
Specifies that each key (not value) stored in the map should be wrapped in a WeakReference (by default, strong references are used).
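Because the map holds the key only through a WeakReference, the map's own reference to the key will not keep it alive. Values, however, are held strongly by default, so a value that strongly refers to its own key (as User does here) keeps that key reachable for as long as the map itself is reachable. The WeakHashMap documentation warns about exactly this case; weak values (or values wrapped in WeakReferences) avoid it.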
http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/MapMaker.html
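For the Data/User case in the question, one workaround is to hold the values weakly as well. The sketch below uses the JDK's WeakHashMap rather than Guava's MapMaker and wraps each value in a WeakReference, which is the approach the WeakHashMap Javadoc itself suggests. Data, User and UserRegistry are hypothetical stand-ins, not the asker's real classes.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-ins for the classes in the question.
class Data {
    final String id;

    Data(String id) {
        this.id = id;
    }
}

class User {
    final Data data;   // strong reference from the value back to the map key

    User(Data data) {
        this.data = data;
    }
}

class UserRegistry {
    // Weak keys AND weakly wrapped values: the map on its own keeps neither object alive.
    private final Map<Data, WeakReference<User>> users = new WeakHashMap<>();

    void register(User user) {
        users.put(user.data, new WeakReference<>(user));
    }

    // Returns the registered user, or null if there is none or it has been collected.
    User lookup(Data data) {
        WeakReference<User> ref = users.get(data);
        return ref == null ? null : ref.get();
    }

    public static void main(String[] args) {
        UserRegistry registry = new UserRegistry();
        Data data = new Data("d1");
        User user = new User(data);   // held strongly here, so it cannot be collected yet
        registry.register(user);
        System.out.println(registry.lookup(data) == user);
    }
}
```

With a plain WeakHashMap<Data, User> the chain map -> entry -> User -> Data would keep every key strongly reachable. The price of the wrapper is that lookup can return null at any time once nothing else holds the User.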

Related

What are WeakReferences, Weakhashmaps, softreferences used for?

Please explain what WeakReferences are used for. I usually do understand Java concepts, but this one is giving me trouble.
I do understand what WeakReferences are, but their usage and nature is a little vague inside my head. I am not able to visualize a correct scenario wherein using WeakReferences becomes a necessity.
I also know that a WeakHashMap is related to WeakReferences, in that an entry whose key has become null gets removed automatically. I can't visualize how this can be: I have a WeakHashMap somewhere, some other process nullifies a key, and then the WeakHashMap saves the day by removing that entry.
Also this article that everyone refers to, does not provide a case study that would help me understand.
If anyone out there can come up with a scenario and give me some understanding into this, I would be really grateful.
Weak references are basically used when you don't want the object to "stick" around if no one else is pointing to it. One very common use case which I believe helps when thinking of weak references is the use of a weak hash map for maintaining a canonical mapping.
Consider a case wherein you need to maintain a mapping between a Class<?> instance and the list of all methods it holds. Given that the JVM is perfectly capable of dynamic class loading and unloading, it's quite possible that the class you have in your map as a key is no longer needed (doesn't have anything else pointing to it). Now, if you had used a "strong" reference to maintain the class-to-method mapping, your class would stick around as long as your map is reachable, which isn't a good position to be in. What you really want is that once there are no live references to your "class", it is let go by the map. This is exactly what a weak hash map is used for.
EDIT: I would recommend giving this thread a read.
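A minimal sketch of such a class-to-methods cache follows. MethodNameCache is a hypothetical name, and the sketch deliberately stores method names rather than java.lang.reflect.Method objects: a Method strongly references its declaring class, so Method values would pin the very key the map is trying to hold weakly.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

class MethodNameCache {
    // Keys are held weakly, so the map alone does not prevent a Class from being unloaded.
    private final Map<Class<?>, List<String>> cache = new WeakHashMap<>();

    // Returns the names of the methods declared by the given class, computing them once.
    List<String> methodNames(Class<?> type) {
        return cache.computeIfAbsent(type, t -> {
            List<String> names = new ArrayList<>();
            for (Method m : t.getDeclaredMethods()) {
                names.add(m.getName());   // plain Strings do not point back at the Class
            }
            return Collections.unmodifiableList(names);
        });
    }

    public static void main(String[] args) {
        MethodNameCache cache = new MethodNameCache();
        System.out.println(cache.methodNames(Runnable.class));
    }
}
```

This version is not thread-safe; wrapping the map with Collections.synchronizedMap would be one way to share it between threads.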

Keeping track of already seen objects

I'm trying to implement an interceptor for my application, that would be able to keep track of objects it has seen. I need to be able to tell, whether the object I'm seeing now is something new, or a reused one.
Assuming I have an interface like this one:
public interface Interceptor {
    void process(Object o);
}
I've been thinking about adding a Set that would keep track of those objects. But since I don't want to cause memory leaks with that kind of behavior, perhaps I should devise some other pattern? In the end, those objects may be destroyed in other layers.
Possible solutions seem to be:
- putting the hashCode of each object into the Set
- using a WeakHashSet instead of a HashSet
The first option seems not 100% reliable, because hashCode may not be unique. As for the second option, I'm not sure it will prevent memory leaks.
And one more note, I'm not able to modify the objects, I can't add fields, methods. Wrapping is also not an option.
Any ideas?
WeakReferences are the way to go. From here:
A weak reference, simply put, is a reference that isn't strong enough
to force an object to remain in memory. Weak references allow you to
leverage the garbage collector's ability to determine reachability for
you, so you don't have to do it yourself.
i.e. keeping a WeakReference to an object won't force the JVM to keep that object alive.
Of course the weak reference isn't strong enough to prevent garbage
collection, so you may find (if there are no strong references to the
widget) that weakWidget.get() suddenly starts returning null.
Just an addition to Brian Agnew's correct answer. There is no WeakHashSet class in the Java API; you'll need to create one from a WeakHashMap like this:
Set<Object> weakHashSet = Collections.newSetFromMap(
        new WeakHashMap<Object, Boolean>());
See the Collections.newSetFromMap Javadoc.
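Putting the pieces together, an interceptor along these lines might look like the sketch below. TrackingInterceptor and its two counters are hypothetical additions for illustration; only the Interceptor interface comes from the question. One caveat to be aware of: a WeakHashMap compares keys with equals/hashCode, so two distinct but equal objects count as the same object here.

```java
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;

interface Interceptor {
    void process(Object o);
}

// Hypothetical implementation; the counters exist only to make the behaviour observable.
class TrackingInterceptor implements Interceptor {
    // Weakly held set: an entry disappears once its object is no longer strongly reachable.
    private final Set<Object> seen =
            Collections.newSetFromMap(new WeakHashMap<Object, Boolean>());
    private int newCount;
    private int reusedCount;

    @Override
    public void process(Object o) {
        if (seen.add(o)) {      // add returns true only if the object was not in the set yet
            newCount++;
        } else {
            reusedCount++;
        }
    }

    int newCount() {
        return newCount;
    }

    int reusedCount() {
        return reusedCount;
    }

    public static void main(String[] args) {
        TrackingInterceptor interceptor = new TrackingInterceptor();
        Object first = new Object();
        Object second = new Object();
        interceptor.process(first);
        interceptor.process(second);
        interceptor.process(first);
        System.out.println(interceptor.newCount() + " new, " + interceptor.reusedCount() + " reused");
    }
}
```

The set is not thread-safe as written, so an interceptor called from several threads would need external synchronization.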

Java efficiency - child object referencing parent object

I'm new to java/garbage collected languages and I still am getting my head around what it means to have an object reference (because I'm told it's not a pointer?) so I'm pondering this question:
I have a parent/child object structure where the parent will have several lists of several children each... is there any inefficiency or any other reason not to have a pointer in each child back to its parent? In my prior language (Delphi) it was a simple pointer, so not a problem at all. Are there any considerations with this practice in Java?
There shouldn't be any issue here. Technically yes, Java references are not pointers, but for most purposes you can think of them similarly. In a typical JVM an object reference is a pointer-sized value (4 or 8 bytes, depending on the platform and settings) that locates an object on the heap, so each additional place it's stored costs one more such value. Reasonably small, generally speaking.
You can (generally!) trust Java to do the right thing when it comes to object management, and shouldn't have to worry too much about garbage collection or the intricacies of how object references work.
From what I know I'd say you'd be fine doing that. Java does a good job of cleaning up your garbage and I usually have a 'parent' field in children classes.
As previous answers have stated, the GC is generally pretty good at cleaning things up. Your primary concern will be things that persist after you leave an Activity and hold onto its Context. These keep your Activity in memory, because something outside its parent/child tree still references it.
More on this here
I think it would be helpful if you read up on reference types as well: strong, soft, weak and phantom. Also, read up on how GC works (the different generations: young/survivor spaces and old generation), which garbage collectors are available, and the GC parameters you can specify.

Is there a practical use for weak references? [duplicate]

Since weak references can be claimed by the garbage collector at any time, is there any practical reason for using them?
If you want to keep a reference to something as long as it is used elsewhere e.g. a Listener, you can use a weak reference.
WeakHashMap can be used as a short-lived cache of keys to derived data. It can also be used to keep information about objects that are used elsewhere when you don't know when those objects are discarded.
BTW Soft References are like Weak references, but they will not always be cleaned up immediately. The GC will always discard weak references when it can and retain Soft References when it can.
There is another kind of reference called a Phantom Reference. This is used in the GC clean up process and refers to an object which isn't accessible to "normal" code because it's in the process of being cleaned up.
Since weak reference can be claimed by garbage collector at any time, is there any practical reason to use it?
Of course there are practical reasons to use it. It would be awfully strange if the framework designers went to the enormous expense of building a weak reference system that was impractical, don't you think?
I think the question you intended to ask was:
What are realistic situations in which people use weak references?
There are many. A common one is to achieve a performance goal. When performance tuning an application one often must make a tradeoff between more memory usage and more time usage. Suppose for example there is a complex calculation that you must perform many times, but the computation is "pure" -- the answer depends only on the arguments, not upon exogenous state. You can build a cache -- a map from the arguments to the result -- but that then uses memory. You might never ask the question again, and that memory would then be wasted.
Weak references possibly solve this problem; the cache can get quite large, and therefore time is saved if the same question is asked many times. But if the cache gets large enough that the garbage collector needs to reclaim space, it can do so safely.
The downside is of course that the cleanup policy of the garbage collector is tuned to meet the goals of the whole system, not your specific cache problem. If the GC policy and your desired cache policy are sufficiently aligned then weak references are a highly pragmatic solution to this problem.
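As a rough Java illustration of such a cache, the sketch below memoizes a pure function and holds each result through a WeakReference. WeakMemo and the factorial helper are hypothetical names. In Java a SoftReference may be the closer match to the "reclaim only when space is needed" behaviour described above, and swapping it in only changes the reference type.

```java
import java.lang.ref.WeakReference;
import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical memoizing cache; results may be collected once nobody else holds them.
class WeakMemo<K, V> {
    private final Map<K, WeakReference<V>> cache = new HashMap<>();
    private final Function<K, V> compute;   // assumed to be pure and to never return null
    private int computations;

    WeakMemo(Function<K, V> compute) {
        this.compute = compute;
    }

    V get(K key) {
        WeakReference<V> ref = cache.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {                 // never cached, or collected since it was cached
            value = compute.apply(key);
            computations++;
            cache.put(key, new WeakReference<>(value));
        }
        return value;
    }

    int computations() {
        return computations;
    }

    // Example of a pure, moderately expensive computation.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        WeakMemo<Integer, BigInteger> memo = new WeakMemo<>(WeakMemo::factorial);
        BigInteger first = memo.get(20);
        BigInteger second = memo.get(20);    // served from the cache while first is still held
        System.out.println(first == second);
        System.out.println(memo.computations());
    }
}
```

Entries whose reference has been cleared stay in the map until the same key is requested again; a more complete version would purge them, for example with a ReferenceQueue.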
If a WeakReference is the only reference to an object, and you want the object to hang around, you should probably be using a SoftReference instead.
WeakReferences are best used in cases where there will be other references to the object, but you can't (or don't want to have to) detect when those other references are no longer used. Then, the other reference will prevent the object from being garbage collected, and the WeakReference will just be another way of getting to the same object.
Two common use cases are:
For holding additional (often expensively calculated but reproducible) information about specific objects that you cannot modify directly, and whose lifecycle you have little control over. WeakHashMap is a perfect way of holding these references: the key in the WeakHashMap is only weakly held, and so when the key is garbage collected, the value can be removed from the Map too, and hence be garbage collected.
For implementing some kind of eventing or notification system, where "listeners" are registered with some kind of coordinator, so they can be informed when something occurs – but where you don't want to prevent these listeners from being garbage collected when they come to the end of their life. A WeakReference will point to the object while it is still alive, but point to "null" once the original object has been garbage collected.
We use it for that reason - in our example, we have a variety of listeners that must register with a service. The service keeps weak references to the listeners, while the instantiated classes keep strong references. If the classes at any time get GC'ed, the weak reference is all that remains of the listeners, which will then be GC'ed as well. It makes keeping track of the intermediary classes much easier.
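The listener scenario can be sketched roughly as follows. MessageListener and WeakListenerRegistry are hypothetical names, not part of any library; the registry holds each listener only through a WeakReference and drops references that have been cleared.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical listener type for illustration.
interface MessageListener {
    void onMessage(String message);
}

class WeakListenerRegistry {
    private final List<WeakReference<MessageListener>> listeners = new ArrayList<>();

    void register(MessageListener listener) {
        listeners.add(new WeakReference<>(listener));
    }

    // Notifies every listener that is still alive, prunes the rest,
    // and returns how many listeners were notified.
    int publish(String message) {
        int notified = 0;
        Iterator<WeakReference<MessageListener>> it = listeners.iterator();
        while (it.hasNext()) {
            MessageListener listener = it.next().get();
            if (listener == null) {
                it.remove();                 // the listener has been garbage collected
            } else {
                listener.onMessage(message);
                notified++;
            }
        }
        return notified;
    }

    public static void main(String[] args) {
        WeakListenerRegistry registry = new WeakListenerRegistry();
        List<String> received = new ArrayList<>();
        MessageListener listener = m -> received.add(m);   // the caller keeps the strong reference
        registry.register(listener);
        registry.publish("hello");
        System.out.println(received);
        System.out.println(listener != null);              // listener is still in use here
    }
}
```

A pitfall with this design: registering a lambda or anonymous listener without keeping a strong reference to it somewhere else means it can silently stop receiving events after any GC cycle.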
The most common usage of weak references is for values in "lookup" Maps.
With normal (hard) value references, the map keeps a value alive even when nothing else refers to it, although at that point you often don't need the lookup any more. With weakly referenced map values, once there are no other references to a value, it becomes a candidate for garbage collection.
The fact that the map itself has a (the only) reference to the object does not stop it from being garbage collected, because that reference is a weak reference.
To prevent memory leaks, see this article for details.
A weak reference is a reference that does not protect the referent object from collection by a garbage collector.
An object referenced only by weak references is considered
unreachable (or "weakly reachable") and so may be collected at any
time.
Weak references are used to avoid keeping memory referenced by
unneeded objects. Some garbage-collected languages feature or support
various levels of weak references, such as Java, C#, Python, Perl, PHP or
Lisp.
Garbage collection is used to reduce the potential for memory leaks
and data corruption. There are two main types of garbage collection:
tracing and reference counting. Reference counting schemes record the
number of references to a given object and collect the object when
the reference count becomes zero. Reference-counting cannot collect
cyclic (or circular) references because only one object may be
collected at a time. Groups of mutually referencing objects which are
not directly referenced by other objects and are unreachable can thus
become permanently resident; if an application continually generates
such unreachable groups of unreachable objects this will have the
effect of a memory leak. Weak references may be used to solve the
problem of circular references if the reference cycles are avoided by
using weak references for some of the references within the group.
Weak references are also used to minimize the number of unnecessary
objects in memory by allowing the program to indicate which objects
are not critical by only weakly referencing them.
I use it generally for some type of cache. Recently accessed items are available immediately and in the case of cache miss you reload the item (DB, FS, whatever).
I use WeakSet to encode links in a graph. If a node is deleted, the links automatically disappear.

How are weak references implemented?

I wonder how weak references work internally, for example in .NET or in Java. My general ideas are:
"Intrusive" - add a list of weak references to the topmost class (the object class). Then, when an object is destroyed, all its weak references can be iterated over and set to null.
"Non-intrusive" - maintain a hashtable mapping objects' pointers to lists of weak references. When a weak reference A to an object B is created, an entry whose key is the pointer to B is created or modified in the hashtable.
"Dirty" - store a special hash value with each object, which is zeroed when the object is destroyed. Weak references would copy that hash value and compare it with the object's value to check whether the object is alive. This would however cause access violations when used directly, so I think there would need to be an additional object holding that hash value.
None of these solutions seems clean or efficient. Does anyone know how it is actually done?
In .NET, when a WeakReference is created, the GC is asked for a handle/opaque token representing the reference. Then, when needed, WeakReference uses this handle to ask the GC if that handle is still valid (i.e. the original object still exists) - and if so, it can get the actual object reference.
So the GC is building a list of tokens/handles against object addresses (and presumably maintaining that list during defragmentation etc.).
I'm not sure I 100% understand the three bullets, so I hesitate to guess which (if any) that is closest to.
Not sure I understood your question, but you can have a look at the implementation for the class WeakReference and its superclass Reference in Java. It is well commented and you can see it has a field treated specially by the GC and another one used directly by the VM.
Python's PEP 205 has a decent explanation of how weak references should behave in Python, and this gives some insight into how they can be implemented. Since a weak reference is immutable, you could have just one for each object, to which you pass out references as needed. Thus, when the object is destroyed, only one weak reference needs to be invalidated.
It seems that the implementation of weak references is a well-kept secret in the industry ;-). For example, as of now, the Wikipedia article lacks any implementation details. And look at the answers above (including the accepted one): "go look at the source" or "I think" ;-\ .
Of all the answers, only the one referencing Python's PEP 205 is insightful. As it says, for any single object there needs to be at most one weak reference, if we treat the weakref as an entity in itself.
The rest describes the Squirrel language implementation. A weakref is itself an object: when you put a weak reference to an object into some container, you actually put in a reference to the weakref object. Each ref-countable object has a field that stores a pointer to its weakref, which is NULL until a weakref to that object is actually requested. Each object has a method to request its weakref, which either returns the existing (singleton) weakref from the field, or creates it and caches it in the field.
Of course, the weakref points to the original object. So you then just need to go through all the places where references to objects are handled and add transparent handling of weakrefs (i.e. automatically dereference them). (A "transparent" alternative is to add a virtual "access" method, which is the identity for most objects and an actual dereference for a weakref.)
And since the object has a pointer to its weakref, the object can NULLify the weakref in its own destructor.
This implementation is pretty clean (no magic "calls into GC" and stuff) and has O(1) runtime cost. Of course, it's pretty greedy with memory: you need to add one more pointer field to each object, even though for 90+% of objects it would typically be NULL. Then again, VHLLs already have a large memory overhead per object, and there may be a chance to compact the various "extra" fields. For example, the object type is typically a small enumeration, so it may be possible to merge the type and some kind of weakref reference into a single machine word (say, keep weakref objects in a separate arena, and use an index into that).
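The scheme described in this answer can be modelled in a few lines. The sketch below is a toy with manual reference counting written in Java purely for illustration; ToyObject and ToyWeakRef are made-up names and this is not Squirrel's actual source.

```java
// Hypothetical model of the "one weakref object per target" scheme.
class ToyWeakRef {
    private ToyObject target;

    ToyWeakRef(ToyObject target) {
        this.target = target;
    }

    // Returns the target, or null once the target has been destroyed.
    ToyObject get() {
        return target;
    }

    void invalidate() {
        target = null;
    }
}

class ToyObject {
    private ToyWeakRef weakRef;   // stays null until a weakref is first requested
    private int refCount = 1;     // the creator holds the first strong reference

    // Returns the single weakref for this object, creating and caching it on first use.
    ToyWeakRef weakRef() {
        if (weakRef == null) {
            weakRef = new ToyWeakRef(this);
        }
        return weakRef;
    }

    void retain() {
        refCount++;
    }

    void release() {
        if (--refCount == 0) {
            destroy();
        }
    }

    // Stands in for the destructor: nulls the weakref so holders see a dead target.
    private void destroy() {
        if (weakRef != null) {
            weakRef.invalidate();
        }
    }

    public static void main(String[] args) {
        ToyObject object = new ToyObject();
        ToyWeakRef weak = object.weakRef();
        System.out.println(weak.get() == object);
        object.release();         // last strong reference gone, so the object is destroyed
        System.out.println(weak.get() == null);
    }
}
```

Invalidation is O(1) because there is exactly one weakref object to clear, however many places hold a reference to it.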
The normal approach, I think, is for the system to maintain some sort of list of weak references. When the garbage collector executes, before dead objects are removed, the system iterates through the list of weak references and invalidates any reference whose target has not been tagged live. Depending upon the system, this may occur before or after the system temporarily resurrects objects which are eligible for immediate finalization (in the case of .NET, there are two kinds of WeakReference: one is effectively processed before the system scans for finalizers, meaning that it will become invalid when its target becomes eligible for finalization, and the other is processed after).
Incidentally, if I were designing a gc-based framework, I would add a couple of other goodies: (1) a means of declaring a reference-type storage location as holding a reference that's primarily of interest to someone else, and (2) a variety of WeakReference which could indicate that the only references to an object are in "of interest to someone else" storage locations. Although WeakReference is a useful type, the act of turning a weak reference into a strong reference may prevent the system from ever recognizing that nobody would mind if its target disappeared.
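The "list of weak references processed during collection" idea from the first paragraph of this answer can be simulated as below. This is only a toy model under the answer's own assumptions; SimHeap, SimObject and SimWeakRef are invented names, finalization is ignored entirely, and neither the JVM nor the CLR is claimed to work exactly like this.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class SimObject {
    final List<SimObject> strongRefs = new ArrayList<>();
}

class SimWeakRef {
    private SimObject target;

    SimWeakRef(SimObject target) {
        this.target = target;
    }

    SimObject get() {
        return target;
    }

    void clear() {
        target = null;
    }
}

class SimHeap {
    final List<SimObject> roots = new ArrayList<>();
    private final List<SimWeakRef> weakRefs = new ArrayList<>();   // system-wide list of weak references

    SimWeakRef weakRefTo(SimObject target) {
        SimWeakRef ref = new SimWeakRef(target);
        weakRefs.add(ref);
        return ref;
    }

    void collect() {
        // Mark phase: tag everything strongly reachable from the roots as live.
        Set<SimObject> live = Collections.newSetFromMap(new IdentityHashMap<SimObject, Boolean>());
        Deque<SimObject> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            SimObject object = pending.pop();
            if (live.add(object)) {
                pending.addAll(object.strongRefs);
            }
        }
        // Weak pass: invalidate every weak reference whose target was not tagged live.
        for (SimWeakRef ref : weakRefs) {
            if (ref.get() != null && !live.contains(ref.get())) {
                ref.clear();
            }
        }
        weakRefs.removeIf(ref -> ref.get() == null);   // cleared references need no further tracking
    }

    public static void main(String[] args) {
        SimHeap heap = new SimHeap();
        SimObject kept = new SimObject();
        SimObject dropped = new SimObject();
        heap.roots.add(kept);
        SimWeakRef toKept = heap.weakRefTo(kept);
        SimWeakRef toDropped = heap.weakRefTo(dropped);
        heap.collect();
        System.out.println(toKept.get() == kept);
        System.out.println(toDropped.get() == null);
    }
}
```

Note that the weak references themselves never take part in the mark phase, which is what makes them "weak" in this model.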
