Avoiding objects garbage collection - java

TLDR: How can I force the JVM not to garbage collect my objects, even if I don't want to use them in any meaningful way?
Longer story:
I have some Items which are loaded from a permanent storage and held as weak references in a cache. The weak reference usage means that, unless someone is actually using a specific Item, there are no strong references to them and unused ones are eventually garbage collected. This is all desired behaviour and works just fine. Additionally, sometimes it is necessary to propagate changes of an Item into the permanent storage. This is done asynchronously in a dedicated writer thread. And here comes the problem, because I obviously cannot allow the Item to be garbage collected before the update is finished. The solution I currently have is to include a strong reference to the Item inside the update object (the Item is never actually used during the update process, just held).
public class Item {
public final String name;
public String value;
}
public class PendingUpdate {
public final Item strongRef; // not actually necessary, just to avoid GC
public final String name;
public final String newValue;
}
But after some thinking and digging I found this paragraph in JavaSE specs (12.6.1):
Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.
Which, if I understand it correctly, means that java can just decide that the Item is garbage anyway. One solution would be to do some unnecessary operation on the Item like item.hashCode(); at the end of the storage update code. But I expect that a JVM might be smart enough to remove such unnecessary code anyway and I cannot think of any reasonable solution that a sufficiently smart JVM wouldn't be able to release sooner than needed.
public void performStorageUpdate(PendingUpdate update) {
final Transaction transaction = this.getDataManager().beginTransaction();
try {
// ... some permanent storage update code
} catch (final Throwable t) {
transaction.abort();
}
transaction.commit();
// The Item should never be garbage collected before this point
update.item.hashCode(); // Trying to avoid GC of the item, is probably not enough
}
Has anyone encounter a similar problem with weak references? Are there some language guarantees that I can use to avoid GC for such objects? (Ideally causing as small performance hit as possible.) Or am I overthinking it and the specification paragraph mean something different?
Edit: Why I cannot allow the Item to be garbage collected before the storage update finishes:
Problematic event sequence:
Item is loaded into cache and is used (held as a strong reference)
An update to the item is enqueued
Strong reference to the Item is dropped and there are no other strong references to the item (besides the one in the PendingUpdate, but as I explained, I think that that one can be optimized away by JVM).
Item is garbage collected
Item is requested again and is loaded from the permanent storage and a new strong reference to it is created
Update to the storage is performed
Result state: There are inconsistent data inside the cache and the permanent storage. Therefore, I need to held the strong reference to the Item until the storage update finishes, but I just need to hold it I don't actually need to do anything with it (so JVM is probably free to think that it is safe to get rid off).

TL;DR How can I force the JVM not to garbage collect my objects, even if I don't want to use them in any meaningful way?
Make them strongly reachable; e.g. by adding them to a strongly reachable data structure. If objects are strongly reachable then the garbage collector won't break weak references to them.
When you finish have finished the processing where the objects need to remain in the cache you can clear the data structure to break the above strong references. The next GC run then will be able to break the weak references.
Which, if I understand it correctly, means that java can just decide that the Item is garbage anyway.
That's not what it means.
What it really means that the infrastructure may be able to determine that an object is effectively unreachable, even though there is still a reference to it in a variable. For example:
public void example() {
int[] big = new int[1000000];
// long computation that doesn't use 'big'
}
If the compiler / runtime can determine that the object that big refers to cannot be used1 during the long computation, it is permitted to garbage collect it ... during the long computation.
But here's the thing. It can only do this if the object cannot be used. And if it cannot be used, there is no reason not to garbage collect it.
1 - ... without traversing a reference object.
For what it is worth, the definition of strongly reachable isn't just that there is a reference in a local variable. The definition (in the javadocs) is:
"An object is strongly reachable if it can be reached by some thread without traversing any reference objects. A newly-created object is strongly reachable by the thread that created it."
It doesn't specify how the object can be reached by the thread. Or how the runtime could / might deduce that no thread can reach it.
But the implication is clear that if threads can only access the object via a reference object, then it is not strongly reachable.
Ergo ... make the object strongly reachable.

Related

Java - HashMap and WeakHashMap references used in Application

Just trying to understand something from GC viewpoint
public Set<Something> returnFromDb(String id) {
LookupService service = fromSomewhere();
Map<String,Object> where = new WeakHashMap<>() {}
where.put("id",id);
return service.doLookupByKVPair(where); // where doesn't need to be serializable
}
what I understand is that once this method call leaves the stack, there is no reference to where regardless of using HashMap or WeakHashMap - but since weak reference is weakly reachable wouldn't this be GCd faster? But if the method call leaves the stack, then there is no reachable reference anyway.
I guess the real question that I have is - "Would using WeakHashMap<> here actually matters at all" - I think it's a "No, because the impact is insignificant" - but a second answer wouldn't hurt my knowledge.
When you use a statement like where.put("id",id); you’re associating a value with a String instance created from a literal, permanently referenced by the code containing it. So the weak semantic of the association is pointless, as long as the code is reachable, this specific key object will never get garbage collected.
When the entire WeakHashMap becomes unreachable, the weak nature of the references has no impact on the garbage collection, as unreachable objects have in general. As discussed in this answer, the garbage collection performance mainly depends on the reachable objects, not the unreachable ones.
Keep in mind the documentation:
The relationship between a registered reference object and its queue is one-sided. That is, a queue does not keep track of the references that are registered with it. If a registered reference becomes unreachable itself, then it will never be enqueued. It is the responsibility of the program using reference objects to ensure that the objects remain reachable for as long as the program is interested in their referents.
In other words, a WeakReference has no impact when it is unreachable, as it will be treated like any other garbage, i.e. not treated at all.
When you have a strong reference to a WeakHashMap while a garbage collection is in progress, it will reduce the performance, as the garbage collector has to keep track of the encountered reachable WeakReference instances, to clear and enqueue them if their referent has not been encountered and marked as strongly reachable. This additional effort is the price you have to pay for allowing the earlier collection of the keys and the subsequent cleanup, which is needed to remove the strongly referenced value.
As said, when, like in your example, the key will never become garbage collected, this additional effort is wasted. But if no garbage collection happens while the WeakHashMap is used, there will be no impact, as said, as the collection of an entire object graph happens at once, regardless of what kind of objects are in the garbage.

java - Memory management - Destroying a list when we don't need it [duplicate]

Does assigning an unused object reference to null in Java improve the garbage collection process in any measurable way?
My experience with Java (and C#) has taught me that is often counter intuitive to try and outsmart the virtual machine or JIT compiler, but I've seen co-workers use this method and I am curious if this is a good practice to pick up or one of those voodoo programming superstitions?
Typically, no.
But like all things: it depends. The GC in Java these days is VERY good and everything should be cleaned up very shortly after it is no longer reachable. This is just after leaving a method for local variables, and when a class instance is no longer referenced for fields.
You only need to explicitly null if you know it would remain referenced otherwise. For example an array which is kept around. You may want to null the individual elements of the array when they are no longer needed.
For example, this code from ArrayList:
public E remove(int index) {
RangeCheck(index);
modCount++;
E oldValue = (E) elementData[index];
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // Let gc do its work
return oldValue;
}
Also, explicitly nulling an object will not cause an object to be collected any sooner than if it just went out of scope naturally as long as no references remain.
Both:
void foo() {
Object o = new Object();
/// do stuff with o
}
and:
void foo() {
Object o = new Object();
/// do stuff with o
o = null;
}
Are functionally equivalent.
In my experience, more often than not, people null out references out of paranoia not out of necessity. Here is a quick guideline:
If object A references object B and you no longer need this reference and object A is not eligible for garbage collection then you should explicitly null out the field. There is no need to null out a field if the enclosing object is getting garbage collected anyway. Nulling out fields in a dispose() method is almost always useless.
There is no need to null out object references created in a method. They will get cleared automatically once the method terminates. The exception to this rule is if you're running in a very long method or some massive loop and you need to ensure that some references get cleared before the end of the method. Again, these cases are extremely rare.
I would say that the vast majority of the time you will not need to null out references. Trying to outsmart the garbage collector is useless. You will just end up with inefficient, unreadable code.
Good article is today's coding horror.
The way GC's work is by looking for objects that do not have any pointers to them, the area of their search is heap/stack and any other spaces they have. So if you set a variable to null, the actual object is now not pointed by anyone, and hence could be GC'd.
But since the GC might not run at that exact instant, you might not actually be buying yourself anything. But if your method is fairly long (in terms of execution time) it might be worth it since you will be increasing your chances of GC collecting that object.
The problem can also be complicated with code optimizations, if you never use the variable after you set it to null, it would be a safe optimization to remove the line that sets the value to null (one less instruction to execute). So you might not actually be getting any improvement.
So in summary, yes it can help, but it will not be deterministic.
At least in java, it's not voodoo programming at all. When you create an object in java using something like
Foo bar = new Foo();
you do two things: first, you create a reference to an object, and second, you create the Foo object itself. So long as that reference or another exists, the specific object can't be gc'd. however, when you assign null to that reference...
bar = null ;
and assuming nothing else has a reference to the object, it's freed and available for gc the next time the garbage collector passes by.
It depends.
Generally speaking shorter you keep references to your objects, faster they'll get collected.
If your method takes say 2 seconds to execute and you don't need an object anymore after one second of method execution, it makes sense to clear any references to it. If GC sees that after one second, your object is still referenced, next time it might check it in a minute or so.
Anyway, setting all references to null by default is to me premature optimization and nobody should do it unless in specific rare cases where it measurably decreases memory consuption.
Explicitly setting a reference to null instead of just letting the variable go out of scope, does not help the garbage collector, unless the object held is very large, where setting it to null as soon as you are done with is a good idea.
Generally setting references to null, mean to the READER of the code that this object is completely done with and should not be concerned about any more.
A similar effect can be achieved by introducing a narrower scope by putting in an extra set of braces
{
int l;
{ // <- here
String bigThing = ....;
l = bigThing.length();
} // <- and here
}
this allows the bigThing to be garbage collected right after leaving the nested braces.
public class JavaMemory {
private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6);
public void f() {
{
byte[] data = new byte[dataSize];
//data = null;
}
byte[] data2 = new byte[dataSize];
}
public static void main(String[] args) {
JavaMemory jmp = new JavaMemory();
jmp.f();
}
}
Above program throws OutOfMemoryError. If you uncomment data = null;, the OutOfMemoryError is solved. It is always good practice to set the unused variable to null
I was working on a video conferencing application one time and noticed a huge huge huge difference in performance when I took the time to null references as soon as I didn't need the object anymore. This was in 2003-2004 and I can only imagine the GC has gotten even smarter since. In my case I had hundreds of objects coming and going out of scope every second, so I noticed the GC when it kicked in periodically. However after I made it a point to null objects the GC stopped pausing my application.
So it depends on what your doing...
Yes.
From "The Pragmatic Programmer" p.292:
By setting a reference to NULL you reduce the number of pointers to the object by one ... (which will allow the garbage collector to remove it)
I assume the OP is referring to things like this:
private void Blah()
{
MyObj a;
MyObj b;
try {
a = new MyObj();
b = new MyObj;
// do real work
} finally {
a = null;
b = null;
}
}
In this case, wouldn't the VM mark them for GC as soon as they leave scope anyway?
Or, from another perspective, would explicitly setting the items to null cause them to get GC'd before they would if they just went out of scope? If so, the VM may spend time GC'ing the object when the memory isn't needed anyway, which would actually cause worse performance CPU usage wise because it would be GC'ing more earlier.
Even if nullifying the reference were marginally more efficient, would it be worth the ugliness of having to pepper your code with these ugly nullifications? They would only be clutter and obscure the intent code that contains them.
Its a rare codebase that has no better candidate for optimisation than trying to outsmart the Garbage collector (rarer still are developers who succeed in outsmarting it). Your efforts will most likely be better spent elsewhere instead, ditching that crufty Xml parser or finding some opportunity to cache computation. These optimisations will be easier to quantify and don't require you dirty up your codebase with noise.
Oracle doc point out "Assign null to Variables That Are No Longer Needed" https://docs.oracle.com/cd/E19159-01/819-3681/abebi/index.html
"It depends"
I do not know about Java but in .net (C#, VB.net...) it is usually not required to assign a null when you no longer require a object.
However note that it is "usually not required".
By analyzing your code the .net compiler makes a good valuation of the life time of the variable...to accurately tell when the object is not being used anymore. So if you write obj=null it might actually look as if the obj is still being used...in this case it is counter productive to assign a null.
There are a few cases where it might actually help to assign a null. One example is you have a huge code that runs for long time or a method that is running in a different thread, or some loop. In such cases it might help to assign null so that it is easy for the GC to know its not being used anymore.
There is no hard & fast rule for this. Going by the above place null-assigns in your code and do run a profiler to see if it helps in any way. Most probably you might not see a benefit.
If it is .net code you are trying to optimize, then my experience has been that taking good care with Dispose and Finalize methods is actually more beneficial than bothering about nulls.
Some references on the topic:
http://blogs.msdn.com/csharpfaq/archive/2004/03/26/97229.aspx
http://weblogs.asp.net/pwilson/archive/2004/02/20/77422.aspx
In the future execution of your program, the values of some data members will be used to computer an output visible external to the program. Others might or might not be used, depending on future (And impossible to predict) inputs to the program. Other data members might be guaranteed not to be used. All resources, including memory, allocated to those unused data are wasted. The job of the garbage collector (GC) is to eliminate that wasted memory. It would be disastrous for the GC to eliminate something that was needed, so the algorithm used might be conservative, retaining more than the strict minimum. It might use heuristic optimizations to improve its speed, at the cost of retaining some items that are not actually needed. There are many potential algorithms the GC might use. Therefore it is possible that changes you make to your program, and which do not affect the correctness of your program, might nevertheless affect the operation of the GC, either making it run faster to do the same job, or to sooner identify unused items. So this kind of change, setting an unusdd object reference to null, in theory is not always voodoo.
Is it voodoo? There are reportedly parts of the Java library code that do this. The writers of that code are much better than average programmers and either know, or cooperate with, programmers who know details of the garbage collector implementations. So that suggests there is sometimes a benefit.
As you said there are optimizations, i.e. JVM knows the place when the variable was last used and the object referenced by it can be GCed right after this last point (still executing in current scope). So nulling out references in most cases does not help GC.
But it can be useful to avoid "nepotism" (or "floating garbage") problem (read more here or watch video). The problem exists because heap is split into Old and Young generations and there are different GC mechanisms applied: Minor GC (which is fast and happens often to clean young gen) and Major Gc (which causes longer pause to clean Old gen). "Nepotism" does not allow for garbage in Young gen to be collected if it is referenced by garbage which was already tenured to an Old gen.
This is 'pathological' because ANY promoted node will result in the promotion of ALL following nodes until a GC resolves the issue.
To avoid nepotism it's a good idea to null out references from an object which is supposed to be removed. You can see this technique applied in JDK classes: LinkedList and LinkedHashMap
private E unlinkFirst(Node<E> f) {
final E element = f.item;
final Node<E> next = f.next;
f.item = null;
f.next = null; // help GC
// ...
}

Does assigning objects to null in Java impact garbage collection?

Does assigning an unused object reference to null in Java improve the garbage collection process in any measurable way?
My experience with Java (and C#) has taught me that is often counter intuitive to try and outsmart the virtual machine or JIT compiler, but I've seen co-workers use this method and I am curious if this is a good practice to pick up or one of those voodoo programming superstitions?
Typically, no.
But like all things: it depends. The GC in Java these days is VERY good and everything should be cleaned up very shortly after it is no longer reachable. This is just after leaving a method for local variables, and when a class instance is no longer referenced for fields.
You only need to explicitly null if you know it would remain referenced otherwise. For example an array which is kept around. You may want to null the individual elements of the array when they are no longer needed.
For example, this code from ArrayList:
public E remove(int index) {
RangeCheck(index);
modCount++;
E oldValue = (E) elementData[index];
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // Let gc do its work
return oldValue;
}
Also, explicitly nulling an object will not cause an object to be collected any sooner than if it just went out of scope naturally as long as no references remain.
Both:
void foo() {
Object o = new Object();
/// do stuff with o
}
and:
void foo() {
Object o = new Object();
/// do stuff with o
o = null;
}
Are functionally equivalent.
In my experience, more often than not, people null out references out of paranoia not out of necessity. Here is a quick guideline:
If object A references object B and you no longer need this reference and object A is not eligible for garbage collection then you should explicitly null out the field. There is no need to null out a field if the enclosing object is getting garbage collected anyway. Nulling out fields in a dispose() method is almost always useless.
There is no need to null out object references created in a method. They will get cleared automatically once the method terminates. The exception to this rule is if you're running in a very long method or some massive loop and you need to ensure that some references get cleared before the end of the method. Again, these cases are extremely rare.
I would say that the vast majority of the time you will not need to null out references. Trying to outsmart the garbage collector is useless. You will just end up with inefficient, unreadable code.
Good article is today's coding horror.
The way GC's work is by looking for objects that do not have any pointers to them, the area of their search is heap/stack and any other spaces they have. So if you set a variable to null, the actual object is now not pointed by anyone, and hence could be GC'd.
But since the GC might not run at that exact instant, you might not actually be buying yourself anything. But if your method is fairly long (in terms of execution time) it might be worth it since you will be increasing your chances of GC collecting that object.
The problem can also be complicated with code optimizations, if you never use the variable after you set it to null, it would be a safe optimization to remove the line that sets the value to null (one less instruction to execute). So you might not actually be getting any improvement.
So in summary, yes it can help, but it will not be deterministic.
At least in java, it's not voodoo programming at all. When you create an object in java using something like
Foo bar = new Foo();
you do two things: first, you create a reference to an object, and second, you create the Foo object itself. So long as that reference or another exists, the specific object can't be gc'd. however, when you assign null to that reference...
bar = null ;
and assuming nothing else has a reference to the object, it's freed and available for gc the next time the garbage collector passes by.
It depends.
Generally speaking shorter you keep references to your objects, faster they'll get collected.
If your method takes say 2 seconds to execute and you don't need an object anymore after one second of method execution, it makes sense to clear any references to it. If GC sees that after one second, your object is still referenced, next time it might check it in a minute or so.
Anyway, setting all references to null by default is to me premature optimization and nobody should do it unless in specific rare cases where it measurably decreases memory consuption.
Explicitly setting a reference to null instead of just letting the variable go out of scope, does not help the garbage collector, unless the object held is very large, where setting it to null as soon as you are done with is a good idea.
Generally setting references to null, mean to the READER of the code that this object is completely done with and should not be concerned about any more.
A similar effect can be achieved by introducing a narrower scope by putting in an extra set of braces
{
int l;
{ // <- here
String bigThing = ....;
l = bigThing.length();
} // <- and here
}
this allows the bigThing to be garbage collected right after leaving the nested braces.
public class JavaMemory {
private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6);
public void f() {
{
byte[] data = new byte[dataSize];
//data = null;
}
byte[] data2 = new byte[dataSize];
}
public static void main(String[] args) {
JavaMemory jmp = new JavaMemory();
jmp.f();
}
}
Above program throws OutOfMemoryError. If you uncomment data = null;, the OutOfMemoryError is solved. It is always good practice to set the unused variable to null
I was working on a video conferencing application one time and noticed a huge huge huge difference in performance when I took the time to null references as soon as I didn't need the object anymore. This was in 2003-2004 and I can only imagine the GC has gotten even smarter since. In my case I had hundreds of objects coming and going out of scope every second, so I noticed the GC when it kicked in periodically. However after I made it a point to null objects the GC stopped pausing my application.
So it depends on what your doing...
Yes.
From "The Pragmatic Programmer" p.292:
By setting a reference to NULL you reduce the number of pointers to the object by one ... (which will allow the garbage collector to remove it)
I assume the OP is referring to things like this:
private void Blah()
{
MyObj a;
MyObj b;
try {
a = new MyObj();
b = new MyObj;
// do real work
} finally {
a = null;
b = null;
}
}
In this case, wouldn't the VM mark them for GC as soon as they leave scope anyway?
Or, from another perspective, would explicitly setting the items to null cause them to get GC'd before they would if they just went out of scope? If so, the VM may spend time GC'ing the object when the memory isn't needed anyway, which would actually cause worse performance CPU usage wise because it would be GC'ing more earlier.
Even if nullifying the reference were marginally more efficient, would it be worth the ugliness of having to pepper your code with these ugly nullifications? They would only be clutter and obscure the intent code that contains them.
Its a rare codebase that has no better candidate for optimisation than trying to outsmart the Garbage collector (rarer still are developers who succeed in outsmarting it). Your efforts will most likely be better spent elsewhere instead, ditching that crufty Xml parser or finding some opportunity to cache computation. These optimisations will be easier to quantify and don't require you dirty up your codebase with noise.
Oracle doc point out "Assign null to Variables That Are No Longer Needed" https://docs.oracle.com/cd/E19159-01/819-3681/abebi/index.html
"It depends"
I do not know about Java but in .net (C#, VB.net...) it is usually not required to assign a null when you no longer require a object.
However note that it is "usually not required".
By analyzing your code the .net compiler makes a good valuation of the life time of the variable...to accurately tell when the object is not being used anymore. So if you write obj=null it might actually look as if the obj is still being used...in this case it is counter productive to assign a null.
There are a few cases where it might actually help to assign a null. One example is you have a huge code that runs for long time or a method that is running in a different thread, or some loop. In such cases it might help to assign null so that it is easy for the GC to know its not being used anymore.
There is no hard & fast rule for this. Going by the above place null-assigns in your code and do run a profiler to see if it helps in any way. Most probably you might not see a benefit.
If it is .net code you are trying to optimize, then my experience has been that taking good care with Dispose and Finalize methods is actually more beneficial than bothering about nulls.
Some references on the topic:
http://blogs.msdn.com/csharpfaq/archive/2004/03/26/97229.aspx
http://weblogs.asp.net/pwilson/archive/2004/02/20/77422.aspx
In the future execution of your program, the values of some data members will be used to computer an output visible external to the program. Others might or might not be used, depending on future (And impossible to predict) inputs to the program. Other data members might be guaranteed not to be used. All resources, including memory, allocated to those unused data are wasted. The job of the garbage collector (GC) is to eliminate that wasted memory. It would be disastrous for the GC to eliminate something that was needed, so the algorithm used might be conservative, retaining more than the strict minimum. It might use heuristic optimizations to improve its speed, at the cost of retaining some items that are not actually needed. There are many potential algorithms the GC might use. Therefore it is possible that changes you make to your program, and which do not affect the correctness of your program, might nevertheless affect the operation of the GC, either making it run faster to do the same job, or to sooner identify unused items. So this kind of change, setting an unusdd object reference to null, in theory is not always voodoo.
Is it voodoo? There are reportedly parts of the Java library code that do this. The writers of that code are much better than average programmers and either know, or cooperate with, programmers who know details of the garbage collector implementations. So that suggests there is sometimes a benefit.
As you said there are optimizations, i.e. JVM knows the place when the variable was last used and the object referenced by it can be GCed right after this last point (still executing in current scope). So nulling out references in most cases does not help GC.
But it can be useful to avoid "nepotism" (or "floating garbage") problem (read more here or watch video). The problem exists because heap is split into Old and Young generations and there are different GC mechanisms applied: Minor GC (which is fast and happens often to clean young gen) and Major Gc (which causes longer pause to clean Old gen). "Nepotism" does not allow for garbage in Young gen to be collected if it is referenced by garbage which was already tenured to an Old gen.
This is 'pathological' because ANY promoted node will result in the promotion of ALL following nodes until a GC resolves the issue.
To avoid nepotism it's a good idea to null out references from an object which is supposed to be removed. You can see this technique applied in JDK classes: LinkedList and LinkedHashMap
private E unlinkFirst(Node<E> f) {
final E element = f.item;
final Node<E> next = f.next;
f.item = null;
f.next = null; // help GC
// ...
}

How does Java Garbage collector handle self-reference?

Hopefully a simple question. Take for instance a Circularly-linked list:
class ListContainer
{
private listContainer next;
<..>
public void setNext(listContainer next)
{
this.next = next;
}
}
class List
{
private listContainer entry;
<..>
}
Now since it's a circularly-linked list, when a single elemnt is added, it has a reference to itself in it's next variable. When deleting the only element in the list, entry is set to null. Is there a need to set ListContainer.next to null as well for Garbage Collector to free it's memory or does it handle such self-references automagically?
Garbage collectors which rely solely on reference counting are generally vulnerable to failing to collection self-referential structures such as this. These GCs rely on a count of the number of references to the object in order to calculate whether a given object is reachable.
Non-reference counting approaches apply a more comprehensive reachability test to determine whether an object is eligible to be collected. These systems define an object (or set of objects) which are always assumed to be reachable. Any object for which references are available from this object graph is considered ineligible for collection. Any object not directly accessible from this object is not. Thus, cycles do not end up affecting reachability, and can be collected.
See also, the Wikipedia page on tracing garbage collectors.
Circular references is a (solvable) problem if you rely on counting the references in order to decide whether an object is dead. No java implementation uses reference counting, AFAIK. Newer Sun JREs uses a mix of several types of GC, all mark-and-sweep or copying I think.
You can read more about garbage collection in general at Wikipedia, and some articles about java GC here and here, for example.
The actual answer to this is implementation dependent. The Sun JVM keeps track of some set of root objects (threads and the like), and when it needs to do a garbage collection, traces out which objects are reachable from those and saves them, discarding the rest. It's actually more complicated than that to allow for some optimizations, but that is the basic principle. This version does not care about circular references: as long as no live object holds a reference to a dead one, it can be GCed.
Other JVMs can use a method known as reference counting. When a reference is created to the object, some counter is incremented, and when the reference goes out of scope, the counter is decremented. If the counter reaches zero, the object is finalized and garbage collected. This version, however, does allow for the possibility of circular references that would never be garbage collected. As a safeguard, many such JVMs include a backup method to determine which objects actually are dead which it runs periodically to resolve self-references and defrag the heap.
As a non-answer aside (the existing answers more than suffice), you might want to check out a whitepaper on the JVM garbage collection system if you are at all interested in GC. (Any, just google JVM Garbage Collection)
I was amazed at some of the techniques used, and when reading through some of the concepts like "Eden" I really realized for the first time that Java and the JVM actually could beat C/C++ in speed. (Whenever C/C++ frees an object/block of memory, code is involved... When Java frees an object, it actually doesn't do anything at all; since in good OO code, most objects are created and freed almost immediately, this is amazingly efficient.)
Modern GC's tend to be very efficient, managing older objects much differently than new objects, being able to control GCs to be short and half-assed or long and thorough, and a lot of GC options can be managed by command line switches so it's actually useful to know what all the terms actually refer to.
Note: I just realized this was misleading. C++'s STACK allocation is very fast--my point was about allocating objects that are able to exist after the current routine has finished (which I believe SHOULD be all objects--it's something you shouldn't have to think about if you are going to think in OO, but in C++ speed may make this impractical).
If you are only allocating C++ classes on the stack, it's allocation will be at least as fast as Java's.
Java collects any objects that are not reachable. If nothing else has a reference to the entry, then it will be collected, even though it has a reference to itself.
yes Java Garbage collector handle self-reference!
How?
There are special objects called called garbage-collection roots (GC roots). These are always reachable and so is any object that has them at its own root.
A simple Java application has the following GC roots:
Local variables in the main method
The main thread
Static variables of the main class
To determine which objects are no longer in use, the JVM intermittently runs what is very aptly called a mark-and-sweep algorithm. It works as follows
The algorithm traverses all object references, starting with the GC
roots, and marks every object found as alive.
All of the heap memory that is not occupied by marked objects is
reclaimed. It is simply marked as free, essentially swept free of
unused objects.
So if any object is not reachable from the GC roots(even if it is self-referenced or cyclic-referenced) it will be subjected to garbage collection.
Simply, Yes. :)
Check out http://www.ibm.com/developerworks/java/library/j-jtp10283/
All JDKs (from Sun) have a concept of "reach-ability". If the GC cannot "reach" an object, it goes away.
This isn't any "new" info (your first to respondents are great) but the link is useful, and brevity is something sweet. :)

Does using final for variables in Java improve garbage collection?

Today my colleagues and me have a discussion about the usage of the final keyword in Java to improve the garbage collection.
For example, if you write a method like:
public Double doCalc(final Double value)
{
final Double maxWeight = 1000.0;
final Double totalWeight = maxWeight * value;
return totalWeight;
}
Declaring the variables in the method final would help the garbage collection to clean up the memory from the unused variables in the method after the method exits.
Is this true?
Here's a slightly different example, one with final reference-type fields rather than final value-type local variables:
public class MyClass {
public final MyOtherObject obj;
}
Every time you create an instance of MyClass, you'll be creating an outgoing reference to a MyOtherObject instance, and the GC will have to follow that link to look for live objects.
The JVM uses a mark-sweep GC algorithm, which has to examine all the live refereces in the GC "root" locations (like all the objects in the current call stack). Each live object is "marked" as being alive, and any object referred to by a live object is also marked as being alive.
After the completion of the mark phase, the GC sweeps through the heap, freeing memory for all unmarked objects (and compacting the memory for the remaining live objects).
Also, it's important to recognize that the Java heap memory is partitioned into a "young generation" and an "old generation". All objects are initially allocated in the young generation (sometimes referred to as "the nursery"). Since most objects are short-lived, the GC is more aggressive about freeing recent garbage from the young generation. If an object survives a collection cycle of the young generation, it gets moved into the old generation (sometimes referred to as the "tenured generation"), which is processed less frequently.
So, off the top of my head, I'm going to say "no, the 'final' modifer doesn't help the GC reduce its workload".
In my opinion, the best strategy for optimizing your memory-management in Java is to eliminate spurious references as quickly as possible. You could do that by assigning "null" to an object reference as soon as you're done using it.
Or, better yet, minimize the size of each declaration scope. For example, if you declare an object at the beginning of a 1000-line method, and if the object stays alive until the close of that method's scope (the last closing curly brace), then the object might stay alive for much longer that actually necessary.
If you use small methods, with only a dozen or so lines of code, then the objects declared within that method will fall out of scope more quickly, and the GC will be able to do most of its work within the much-more-efficient young generation. You don't want objects being moved into the older generation unless absolutely necessary.
Declaring a local variable final will not affect garbage collection, it only means you can not modify the variable. Your example above should not compile as you are modifying the variable totalWeight which has been marked final. On the other hand, declaring a primitive (double instead of Double) final will allows that variable to be inlined into the calling code, so that could cause some memory and performance improvement. This is used when you have a number of public static final Strings in a class.
In general, the compiler and runtime will optimize where it can. It is best to write the code appropriately and not try to be too tricky. Use final when you do not want the variable to be modified. Assume that any easy optimizations will be performed by the compiler, and if you are worried about performance or memory use, use a profiler to determine the real problem.
No, it is emphatically not true.
Remember that final does not mean constant, it just means you can't change the reference.
final MyObject o = new MyObject();
o.setValue("foo"); // Works just fine
o = new MyObject(); // Doesn't work.
There may be some small optimisation based around the knowledge that the JVM will never have to modify the reference (such as not having check to see if it has changed) but it would be so minor as to not worry about.
Final should be thought of as useful meta-data to the developer and not as a compiler optimisation.
Some points to clear up:
Nulling out reference should not help GC. If it did, it would indicate that your variables are over scoped. One exception is the case of object nepotism.
There is no on-stack allocation as of yet in Java.
Declaring a variable final means you can't (under normal conditions) assign a new value to that variable. Since final says nothing about scope, it doesn't say anything about it's effect on GC.
Well, I don't know about the use of the "final" modifier in this case, or its effect on the GC.
But I can tell you this: your use of Boxed values rather than primitives (e.g., Double instead of double) will allocate those objects on the heap rather than the stack, and will produce unnecessary garbage that the GC will have to clean up.
I only use boxed primitives when required by an existing API, or when I need nullable primatives.
Final variables cannot be changed after initial assignment (enforced by the compiler).
This does not change the behaviour of the garbage collection as such. Only thing is that these variables cannot be nulled when not being used any more (which may help the garbage collection in memory tight situations).
You should know that final allows the compiler to make assumptions about what to optimize. Inlining code and not including code known not to be reachable.
final boolean debug = false;
......
if (debug) {
System.out.println("DEBUG INFO!");
}
The println will not be included in the byte code.
There is a not so well known corner case with generational garbage collectors. (For a brief description read the answer by benjismith for a deeper insight read the articles at the end).
The idea in generational GCs is that most of the time only young generations need to be considered. The root location is scanned for references, and then the young generation objects are scanned. During this more frequent sweeps no object in the old generation are checked.
Now, the problem comes from the fact that an object is not allowed to have references to younger objects. When a long lived (old generation) object gets a reference to a new object, that reference must be explicitly tracked by the garbage collector (see article from IBM on the hotspot JVM collector), actually affecting the GC performance.
The reason why an old object cannot refer to a younger one is that, as the old object is not checked in minor collections, if the only reference to the object is kept in the old object, it will not get marked, and would be wrongly deallocated during the sweep stage.
Of course, as pointed by many, the final keyword does not reallly affect the garbage collector, but it does guarantee that the reference will never be changed into a younger object if this object survives the minor collections and makes it to the older heap.
Articles:
IBM on garbage collection: history, in the hotspot JVM and performance. These may no longer be fully valid, as it dates back in 2003/04, but they give some easy to read insight into GCs.
Sun on Tuning garbage collection
GC acts on unreachable refs. This has nothing to do with "final", which is merely an assertion of one-time assignment. Is it possible that some VM's GC can make use of "final"? I don't see how or why.
final on local variables and parameters makes no difference to the class files produced, so cannot affect runtime performance. If a class has no subclasses, HotSpot treats that class as if it is final anyway (it can undo later if a class that breaks that assumption is loaded). I believe final on methods is much the same as classes. final on static field may allow the variable to be interpreted as a "compile-time constant" and optimisation to be done by javac on that basis. final on fields allows the JVM some freedom to ignore happens-before relations.
There seems to be a lot of answers that are wandering conjectures. The truth is, there is no final modifier for local variables at the bytecode level. The virtual machine will never know that your local variables were defined as final or not.
The answer to your question is an emphatic no.
All method and variable can be overridden bydefault in subclasses.If we want to save the subclasses from overridig the members of superclass,we can declare them as final using the keyword final.
For e.g-
final int a=10;
final void display(){......}
Making a method final ensures that the functionality defined in the superclass will never be changed anyway. Similarly the value of a final variable can never be changed. Final variables behaves like class variables.
Strictly speaking about instance fields, final might improve performance slightly if a particular GC wants to exploit that. When a concurrent GC happens (that means that your application is still running, while GC is in progress), see this for a broader explanation, GCs have to employ certain barriers when writes and/or reads are done. The link I gave you pretty much explains that, but to make it really short: when a GC does some concurrent work, all read and writes to the heap (while that GC is in progress), are "intercepted" and applied later in time; so that the concurrent GC phase can finish it's work.
For final instance fields, since they can not be modified (unless reflection), these barriers can be omitted. And this is not just pure theory.
Shenandoah GC has them in practice (though not for long), and you can do, for example:
-XX:+UnlockExperimentalVMOptions
-XX:+UseShenandoahGC
-XX:+ShenandoahOptimizeInstanceFinals
And there will be optimizations in the GC algorithm that will make it slightly faster. This is because there will be no barriers intercepting final, since no one should modify them, ever. Not even via reflection or JNI.
The only thing that I can think of is that the compiler might optimize away the final variables and inline them as constants into the code, thus you end up with no memory allocated.
absolutely, as long as make object's life shorter which yield great benefit of memory management, recently we examined export functionality having instance variables on one test and another test having method level local variable. during load testing, JVM throws outofmemoryerror on first test and JVM got halted. but in second test, successfully able to get the report due to better memory management.
The only time I prefer declaring local variables as final is when:
I have to make them final so that they can be shared with some anonymous class (for example: creating daemon thread and let it access some value from enclosing method)
I want to make them final (for example: some value that shouldn't/doesn't get overridden by mistake)
Does they help in fast garbage collection?
AFAIK a object becomes a candidate of GC collection if it has zero strong references to it and in that case as well there is no guarantee that they will be immediately garbage collected . In general, a strong reference is said to die when it goes out of scope or user explicitly reassign it to null reference, thus, declaring them final means that reference will continue to exists till the method exists (unless its scope is explicitly narrowed down to a specific inner block {}) because you can't reassign final variables (i.e. can't reassign to null). So I think w.r.t Garbage Collection 'final' may introduce a unwanted possible delay so one must be little careful in defining there scope as that controls when they will become candidate for GC.

Categories