Today my colleagues and I had a discussion about using the final keyword in Java to improve garbage collection.
For example, if you write a method like:
public Double doCalc(final Double value)
{
final Double maxWeight = 1000.0;
final Double totalWeight = maxWeight * value;
return totalWeight;
}
Declaring the variables in the method final would help the garbage collector clean up the memory used by the method's unused variables after the method exits.
Is this true?
Here's a slightly different example, one with final reference-type fields rather than final value-type local variables:
public class MyClass {
public final MyOtherObject obj;
}
Every time you create an instance of MyClass, you'll be creating an outgoing reference to a MyOtherObject instance, and the GC will have to follow that link to look for live objects.
The JVM uses a mark-sweep GC algorithm, which has to examine all the live references in the GC "root" locations (like all the objects in the current call stack). Each live object is "marked" as being alive, and any object referred to by a live object is also marked as being alive.
After the completion of the mark phase, the GC sweeps through the heap, freeing memory for all unmarked objects (and compacting the memory for the remaining live objects).
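To make the mark-and-sweep idea concrete, here is a toy sketch in plain Java. It is an illustration only: the Node class and the explicit "heap" list are inventions of this example, and real collectors work on raw heap memory rather than application-level objects.

import java.util.*;

class MarkSweepSketch {
    static class Node {
        boolean marked;
        List<Node> refs = new ArrayList<>();
    }

    // Mark phase: flag everything reachable from the roots as alive.
    static void mark(Node node) {
        if (node == null || node.marked) return;
        node.marked = true;
        for (Node ref : node.refs) mark(ref);
    }

    // Sweep phase: reclaim anything unmarked, then reset the marks.
    static void sweep(List<Node> heap) {
        heap.removeIf(n -> !n.marked);
        heap.forEach(n -> n.marked = false);
    }

    public static void main(String[] args) {
        Node root = new Node();
        Node live = new Node();
        Node dead = new Node();
        root.refs.add(live);                  // 'live' is reachable from a root
        List<Node> heap = new ArrayList<>(Arrays.asList(root, live, dead));

        mark(root);
        sweep(heap);
        System.out.println(heap.size());      // prints 2: 'dead' was swept
    }
}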
Also, it's important to recognize that the Java heap memory is partitioned into a "young generation" and an "old generation". All objects are initially allocated in the young generation (sometimes referred to as "the nursery"). Since most objects are short-lived, the GC is more aggressive about freeing recent garbage from the young generation. If an object survives a collection cycle of the young generation, it gets moved into the old generation (sometimes referred to as the "tenured generation"), which is processed less frequently.
So, off the top of my head, I'm going to say "no, the 'final' modifier doesn't help the GC reduce its workload".
In my opinion, the best strategy for optimizing your memory-management in Java is to eliminate spurious references as quickly as possible. You could do that by assigning "null" to an object reference as soon as you're done using it.
Or, better yet, minimize the size of each declaration scope. For example, if you declare an object at the beginning of a 1000-line method, and the object stays alive until the close of that method's scope (the last closing curly brace), then the object might stay alive for much longer than is actually necessary.
If you use small methods, with only a dozen or so lines of code, then the objects declared within that method will fall out of scope more quickly, and the GC will be able to do most of its work within the much-more-efficient young generation. You don't want objects being moved into the older generation unless absolutely necessary.
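For illustration, here is a hypothetical sketch of that advice: the same work split into small methods, so each temporary leaves scope (and the GC root set) as soon as its method returns. The class and method names are invented for this example.

class ReportBuilder {
    String build(byte[] rawData) {
        String parsed = parse(rawData);   // temporaries inside parse() are already dead here
        return format(parsed);            // 'parsed' becomes collectible after this call
    }

    private String parse(byte[] rawData) {
        // anything allocated here is unreachable as soon as we return
        return new String(rawData).trim();
    }

    private String format(String parsed) {
        return "Report: " + parsed;
    }
}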
Declaring a local variable final will not affect garbage collection; it only means you cannot modify the variable. (Note that the example above compiles fine as written, since totalWeight is assigned exactly once; it would only fail to compile if a variable marked final were reassigned.) On the other hand, declaring a final primitive (double instead of Double) lets the compiler treat it as a compile-time constant and inline its value into the code that uses it, which can yield a small memory and performance improvement. This is what happens when you have a number of public static final Strings in a class.
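As a minimal sketch of that last point (names invented for the example), a static final primitive with a constant initializer is a compile-time constant, and javac folds its value into the using code:

class Constants {
    static final int MAX_WEIGHT = 1000;   // compile-time constant
}

class Caller {
    int limit() {
        // javac compiles this to 'return 1000;' -- no field access to
        // Constants.MAX_WEIGHT remains in the bytecode.
        return Constants.MAX_WEIGHT;
    }
}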
In general, the compiler and runtime will optimize where it can. It is best to write the code appropriately and not try to be too tricky. Use final when you do not want the variable to be modified. Assume that any easy optimizations will be performed by the compiler, and if you are worried about performance or memory use, use a profiler to determine the real problem.
No, it is emphatically not true.
Remember that final does not mean constant, it just means you can't change the reference.
final MyObject o = new MyObject();
o.setValue("foo"); // Works just fine
o = new MyObject(); // Compile error: cannot reassign a final variable.
There may be some small optimisation based on the knowledge that the JVM will never have to modify the reference (such as not having to check whether it has changed), but it would be so minor as to not be worth worrying about.
Final should be thought of as useful meta-data to the developer and not as a compiler optimisation.
Some points to clear up:
Nulling out references should not help the GC. If it did, it would indicate that your variables are over-scoped. One exception is the case of object nepotism.
There is no on-stack allocation as of yet in Java.
Declaring a variable final means you can't (under normal conditions) assign a new value to that variable. Since final says nothing about scope, it says nothing about its effect on GC.
Well, I don't know about the use of the "final" modifier in this case, or its effect on the GC.
But I can tell you this: your use of boxed values rather than primitives (e.g., Double instead of double) means those values are allocated as objects on the heap rather than held on the stack, and will produce unnecessary garbage that the GC will have to clean up.
I only use boxed primitives when required by an existing API, or when I need nullable primitives.
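To make the boxing cost visible, here is a small hypothetical sketch: the boxed version allocates a new Double on every iteration of the loop, while the primitive version allocates nothing.

class BoxingCost {
    static Double sumBoxed(Double[] values) {
        Double total = 0.0;        // one wrapper object...
        for (Double v : values) {
            total += v;            // ...unboxed, added, and re-boxed each time
        }
        return total;
    }

    static double sumPrimitive(double[] values) {
        double total = 0.0;        // lives in a stack slot or register
        for (double v : values) {
            total += v;            // pure arithmetic, no garbage produced
        }
        return total;
    }
}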
Final variables cannot be changed after initial assignment (enforced by the compiler).
This does not change the behaviour of garbage collection as such. The only difference is that these variables cannot be nulled when they are no longer being used (which might otherwise help the garbage collector in memory-tight situations).
You should know that final allows the compiler to make assumptions about what to optimize, such as inlining values and omitting code it knows to be unreachable.
final boolean debug = false;
......
if (debug) {
System.out.println("DEBUG INFO!");
}
The println will not be included in the bytecode.
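For contrast, here is a hedged sketch of why the constant matters: only a compile-time constant condition lets javac drop the branch. The app.debug property name is invented for this example.

class DebugFlags {
    static final boolean DEBUG = false;              // compile-time constant

    void log() {
        if (DEBUG) {                                 // javac removes this branch entirely
            System.out.println("DEBUG INFO!");
        }
        if (Boolean.getBoolean("app.debug")) {       // runtime value: branch stays in the bytecode
            System.out.println("DEBUG INFO!");
        }
    }
}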
There is a not-so-well-known corner case with generational garbage collectors. (For a brief description, read the answer by benjismith; for deeper insight, read the articles at the end.)
The idea in generational GCs is that most of the time only the young generation needs to be considered. The root locations are scanned for references, and then the young-generation objects are scanned. During these more frequent sweeps, no objects in the old generation are checked.
Now, the problem comes from the fact that minor collections never scan old-generation objects. So when a long-lived (old generation) object gets a reference to a new object, that reference must be explicitly tracked by the garbage collector (see the article from IBM on the HotSpot JVM collector), which actually affects GC performance.
The reason such old-to-young references need tracking is that, since old objects are not checked in minor collections, if the only reference to a young object were held by an old object, the young object would not get marked and would be wrongly deallocated during the sweep stage.
Of course, as pointed out by many, the final keyword doesn't really affect the garbage collector, but it does guarantee that the reference can never later be repointed at a younger object if the referencing object survives the minor collections and makes it to the old heap.
Articles:
IBM on garbage collection: history, in the HotSpot JVM, and performance. These may no longer be fully valid, as they date back to 2003/04, but they give some easy-to-read insight into GCs.
Sun on Tuning garbage collection
GC acts on unreachable refs. This has nothing to do with "final", which is merely an assertion of one-time assignment. Is it possible that some VM's GC can make use of "final"? I don't see how or why.
final on local variables and parameters makes no difference to the class files produced, so it cannot affect runtime performance. If a class has no subclasses, HotSpot treats that class as if it were final anyway (and can undo that later if a class that breaks the assumption is loaded). I believe final on methods is much the same as on classes. final on a static field may allow the variable to be treated as a "compile-time constant", with optimisation done by javac on that basis. final on instance fields allows the JVM some freedom to ignore happens-before relations.
There seem to be a lot of answers here that are wandering conjecture. The truth is, there is no final modifier for local variables at the bytecode level. The virtual machine will never know whether your local variables were defined as final or not.
The answer to your question is an emphatic no.
All methods and variables can be overridden by default in subclasses. If we want to prevent subclasses from overriding the members of the superclass, we can declare them final using the keyword final.
For example:
final int a=10;
final void display(){......}
Making a method final ensures that the functionality defined in the superclass can never be overridden. Similarly, the value of a final variable can never be changed; final variables behave like constants.
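A minimal example of that subclassing point (class names invented):

class Vehicle {
    final void startEngine() {        // subclasses may call, but never override, this
        System.out.println("starting engine");
    }
}

class Car extends Vehicle {
    // void startEngine() {}         // uncommenting this is a compile error
}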
Strictly speaking about instance fields, final might improve performance slightly if a particular GC wants to exploit it. When a concurrent GC happens (meaning your application keeps running while the GC is in progress; see this for a broader explanation), GCs have to employ certain barriers when reads and/or writes are done. The link I gave pretty much explains that, but to make it really short: while a GC does concurrent work, all reads and writes to the heap are "intercepted" and applied later in time, so that the concurrent GC phase can finish its work.
For final instance fields, since they cannot be modified (reflection aside), these barriers can be omitted. And this is not just pure theory.
Shenandoah GC has them in practice (though not for long), and you can do, for example:
-XX:+UnlockExperimentalVMOptions
-XX:+UseShenandoahGC
-XX:+ShenandoahOptimizeInstanceFinals
And there will be optimizations in the GC algorithm that make it slightly faster, because no barriers intercept final fields: no one should ever modify them, not even via reflection or JNI.
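A usage sketch, assuming a JDK build that ships Shenandoah with this experimental flag (MyApp is a placeholder for your main class):

java -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:+ShenandoahOptimizeInstanceFinals MyApp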
The only thing I can think of is that the compiler might optimize away the final variables and inline them as constants into the code, so you end up with no memory allocated for them.
Absolutely, insofar as it makes an object's life shorter, which yields a great memory-management benefit. We recently examined an export feature with two tests: one holding the data in instance variables, the other in method-level local variables. During load testing, the JVM threw an OutOfMemoryError on the first test and halted, but the second test successfully produced the report thanks to better memory management.
The only time I prefer declaring local variables as final is when:
I have to make them final so that they can be shared with some anonymous class (for example: creating daemon thread and let it access some value from enclosing method)
I want to make them final (for example: some value that shouldn't/doesn't get overridden by mistake)
Do they help speed up garbage collection?
AFAIK an object becomes a candidate for GC collection when it has zero strong references to it, and even then there is no guarantee that it will be immediately garbage collected. In general, a strong reference dies when it goes out of scope or the user explicitly reassigns it to null. Declaring a variable final means the reference will continue to exist until the method exits (unless its scope is explicitly narrowed to a specific inner block {}), because you can't reassign final variables (i.e., you can't reassign them to null). So I think that, with respect to garbage collection, final may introduce an unwanted delay, so one must be a little careful in defining scope, as that controls when variables become candidates for GC.
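Here is a small hypothetical sketch of that scope-narrowing idea: the inner block ends the final variable's scope early, so final never pins the reference for the rest of the method.

import java.util.ArrayList;
import java.util.List;

class NarrowScope {
    void process() {
        {
            final List<byte[]> batch = new ArrayList<>();
            batch.add(new byte[8 * 1024 * 1024]);
            System.out.println("batch entries: " + batch.size());
        }   // 'batch' leaves scope here; final only forbade reassignment, not collection
        longTail();   // the 8 MB above is collectible while this runs
    }

    void longTail() {
        // long-running work that should not be charged for 'batch'
    }
}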
Related
I have code like:
public void foo()
{
Object x = new LongObject();
doSomething(x);
//More Code
// x is never used again
// would x = null help the GC?
Object x2 = new LongObject();
doSomething(x2);
}
I would like the memory allocated for x to be freeable by the GC if it's needed. But I don't know whether setting the variable to null is necessary or whether the compiler takes care of it.
In point of fact, the JIT does liveness analysis on references (which at bytecode level are stored as slots in the current frame). If a reference is never again read from, its slot can be reused, and the JIT will know that. It is completely possible for an object to be garbage collected while a variable that refers to it is still in lexical scope, so long as the compiler and JIT are able to prove that the variable will never again be dereferenced.
The point is: scope is a construct of the language, and specifies what a name like x means at any point in the text of the program code that it occurs. Lifetime is a property of objects, and the JIT and GC manage that -- often in non-obvious ways.
Remember that the JIT can recompile your code while it's running, and will optimize your code as it sees what happens when it executes. Unless you're really certain you know what you're doing, don't try to outsmart the JIT. Write code that is correct and let the JIT do its job, and only worry about it if you have evidence that the JIT hasn't done its job well enough.
To answer your question literally: the compiler (meaning the source-to-bytecode compiler) never inserts null assignments, but assigning null to a variable is usually not necessary anyway.
As this answer explains, scope is a compile-time thing, and formally an object is eligible for garbage collection if it cannot "be accessed in any potential continuing computation from any live thread". But there is no guarantee about which eligible objects will be identified by a particular implementation. As the linked answer also explains, JIT-compiled code will only keep references to objects which will subsequently be accessed. This may go even further than you expect, allowing garbage collection of objects that look like they are still in use in the source code, since runtime optimization may transform the code and reduce the actual memory accesses.
But in interpreted mode, the analysis will not go so far, and there might be object references in the current stack frame preventing collection of the referent, even though the variable is not used afterwards or is even out of scope in the source code. There is no guarantee that switching from interpreted to compiled code while the method is executing can get rid of such dangling references. It's even unlikely that the HotSpot optimizer considers compiling foo() when the actual heavy computation happens within doSomething.
Still, this is rarely an issue. Running interpreted happens only during initialization or first-time execution, and even if these objects are large, there's rarely a problem if such an object gets collected a bit later than it could be. An average application consists of millions of objects.
However, if you ever think there could be an issue, you can easily fix this, without assigning null to the variable. Limit the scope:
public void foo()
{
{
Object x = new LongObject();
doSomething(x);
//More Code
}
{
Object x2 = new LongObject();
doSomething(x2);
}
}
Other than assigning null, limiting the scope of variables to their actual use improves source code quality, even in cases where it has no impact on the compiled code. While scope is purely a source-code construct, it can still have an impact on the bytecode: in the code above, compilers will reuse the location of x within the stack frame to store x2, so no dangling reference to the first LongObject exists during the second doSomething execution.
As said, this is rarely needed for memory management; improving source code quality should drive your decisions, not attempts to help the garbage collector.
Does assigning an unused object reference to null in Java improve the garbage collection process in any measurable way?
My experience with Java (and C#) has taught me that it is often counterintuitive to try to outsmart the virtual machine or JIT compiler, but I've seen co-workers use this method, and I am curious whether this is a good practice to pick up or one of those voodoo programming superstitions.
Typically, no.
But like all things: it depends. The GC in Java these days is VERY good, and everything should be cleaned up very shortly after it is no longer reachable. For local variables, that is just after leaving the method; for fields, it is when the class instance is no longer referenced.
You only need to explicitly null if you know it would remain referenced otherwise. For example an array which is kept around. You may want to null the individual elements of the array when they are no longer needed.
For example, this code from ArrayList:
public E remove(int index) {
RangeCheck(index);
modCount++;
E oldValue = (E) elementData[index];
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // Let gc do its work
return oldValue;
}
Also, explicitly nulling a reference will not cause the object to be collected any sooner than if it had just gone out of scope naturally, as long as no other references remain.
Both:
void foo() {
Object o = new Object();
/// do stuff with o
}
and:
void foo() {
Object o = new Object();
/// do stuff with o
o = null;
}
Are functionally equivalent.
In my experience, more often than not, people null out references out of paranoia not out of necessity. Here is a quick guideline:
If object A references object B and you no longer need this reference and object A is not eligible for garbage collection then you should explicitly null out the field. There is no need to null out a field if the enclosing object is getting garbage collected anyway. Nulling out fields in a dispose() method is almost always useless.
There is no need to null out object references created in a method. They will get cleared automatically once the method terminates. The exception to this rule is if you're running in a very long method or some massive loop and you need to ensure that some references get cleared before the end of the method. Again, these cases are extremely rare.
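As a hypothetical sketch of that rare long-loop case (all names invented): each iteration produces a large temporary that is cleared before a lengthy pause, so the reference does not pin the memory across the wait.

import java.util.List;

class LongLoop {
    void run(List<String> files) throws InterruptedException {
        for (String file : files) {
            byte[] chunk = load(file);   // large per-iteration allocation
            consume(chunk);
            chunk = null;                // drop it before the slow wait below
            Thread.sleep(60_000);        // long pause; 'chunk' must not keep the bytes alive
        }
    }

    private byte[] load(String file) { return new byte[32 * 1024 * 1024]; }

    private void consume(byte[] data) { System.out.println(data.length + " bytes"); }
}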
I would say that the vast majority of the time you will not need to null out references. Trying to outsmart the garbage collector is useless. You will just end up with inefficient, unreadable code.
A good article on this is today's Coding Horror post.
GCs work by looking for objects that have no pointers to them; their search area is the heap/stack and any other spaces they manage. So if you set a variable to null, the actual object is no longer pointed to by anything and can be GC'd.
But since the GC might not run at that exact instant, you might not actually be buying yourself anything. If your method is fairly long (in terms of execution time), though, it might be worth it, since you increase the chances of the GC collecting that object.
The picture can also be complicated by code optimizations: if you never use the variable after setting it to null, it would be a safe optimization to remove the line that sets it to null (one fewer instruction to execute). So you might not actually get any improvement.
So in summary, yes it can help, but it will not be deterministic.
At least in Java, it's not voodoo programming at all. When you create an object in Java using something like
Foo bar = new Foo();
you do two things: first, you create a reference to an object, and second, you create the Foo object itself. As long as that reference or another one exists, the specific object can't be GC'd. However, when you assign null to that reference...
bar = null;
...and assuming nothing else holds a reference to the object, it is freed and available for GC the next time the garbage collector passes by.
It depends.
Generally speaking, the shorter you keep references to your objects, the faster they'll get collected.
If your method takes, say, 2 seconds to execute and you don't need an object after the first second of execution, it makes sense to clear any references to it. If the GC sees that your object is still referenced after one second, the next time it might not check it again for a minute or so.
Anyway, setting all references to null by default is, to me, premature optimization, and nobody should do it except in specific rare cases where it measurably decreases memory consumption.
Explicitly setting a reference to null, instead of just letting the variable go out of scope, does not help the garbage collector, unless the object held is very large, in which case setting it to null as soon as you are done with it is a good idea.
Generally, setting references to null signals to the READER of the code that this object is completely done with and should not be a concern any more.
A similar effect can be achieved by introducing a narrower scope with an extra set of braces:
{
int l;
{ // <- here
String bigThing = ....;
l = bigThing.length();
} // <- and here
}
This allows bigThing to be garbage collected right after leaving the nested braces.
public class JavaMemory {
private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6);
public void f() {
{
byte[] data = new byte[dataSize];
//data = null;
}
byte[] data2 = new byte[dataSize];
}
public static void main(String[] args) {
JavaMemory jmp = new JavaMemory();
jmp.f();
}
}
The above program throws an OutOfMemoryError. If you uncomment data = null;, the OutOfMemoryError goes away. It is always good practice to set unused variables to null.
I was working on a video conferencing application one time and noticed a huge difference in performance when I took the time to null references as soon as I no longer needed the object. This was in 2003-2004, and I can only imagine the GC has gotten even smarter since. In my case I had hundreds of objects coming into and going out of scope every second, so I noticed the GC when it kicked in periodically. However, after I made it a point to null objects, the GC stopped pausing my application.
So it depends on what you're doing...
Yes.
From "The Pragmatic Programmer" p.292:
By setting a reference to NULL you reduce the number of pointers to the object by one ... (which will allow the garbage collector to remove it)
I assume the OP is referring to things like this:
private void Blah()
{
MyObj a;
MyObj b;
try {
a = new MyObj();
b = new MyObj();
// do real work
} finally {
a = null;
b = null;
}
}
In this case, wouldn't the VM mark them for GC as soon as they leave scope anyway?
Or, from another perspective, would explicitly setting the items to null cause them to be GC'd before they would be if they just went out of scope? If so, the VM may spend time GC'ing the objects when the memory isn't needed yet, which would actually cause worse CPU performance, because it would be GC'ing more, earlier.
Even if nullifying the references were marginally more efficient, would it be worth the ugliness of having to pepper your code with these nullifications? They would only be clutter and obscure the intent of the code that contains them.
It's a rare codebase that has no better candidate for optimisation than trying to outsmart the garbage collector (and rarer still are the developers who succeed in outsmarting it). Your efforts will most likely be better spent elsewhere, ditching that crufty XML parser or finding some opportunity to cache computation. Those optimisations will be easier to quantify and don't require you to dirty up your codebase with noise.
The Oracle docs point out "Assign null to Variables That Are No Longer Needed": https://docs.oracle.com/cd/E19159-01/819-3681/abebi/index.html
"It depends"
I do not know about Java, but in .NET (C#, VB.NET...) it is usually not required to assign null when you no longer require an object.
However, note that it is "usually not required".
By analyzing your code, the .NET compiler makes a good estimate of the variable's lifetime and can accurately tell when the object is no longer being used. If you write obj = null, it might actually look as if obj is still being used, in which case assigning null is counterproductive.
There are a few cases where it might actually help to assign null. One example is a huge piece of code that runs for a long time, a method running in a different thread, or some loop. In such cases it might help to assign null so that it is easy for the GC to know the object is no longer being used.
There is no hard-and-fast rule for this. Going by the above, place null assignments in your code and run a profiler to see if they help in any way. Most probably you will not see a benefit.
If it is .NET code you are trying to optimize, then my experience has been that taking good care with Dispose and Finalize methods is actually more beneficial than bothering about nulls.
Some references on the topic:
http://blogs.msdn.com/csharpfaq/archive/2004/03/26/97229.aspx
http://weblogs.asp.net/pwilson/archive/2004/02/20/77422.aspx
In the future execution of your program, the values of some data members will be used to compute an output visible outside the program. Others might or might not be used, depending on future (and impossible-to-predict) inputs. Still other data members might be guaranteed never to be used. All resources, including memory, allocated to that unused data are wasted. The job of the garbage collector (GC) is to eliminate that wasted memory. It would be disastrous for the GC to eliminate something that was needed, so the algorithm used may be conservative, retaining more than the strict minimum. It may use heuristic optimizations to improve its speed, at the cost of retaining some items that are not actually needed. There are many potential algorithms the GC might use. Therefore it is possible that changes you make to your program, even ones that do not affect its correctness, might nevertheless affect the operation of the GC, either making it run faster to do the same job or identify unused items sooner. So this kind of change, setting an unused object reference to null, is in theory not always voodoo.
Is it voodoo? There are reportedly parts of the Java library code that do this. The writers of that code are much better than average programmers and either know, or cooperate with, programmers who know details of the garbage collector implementations. So that suggests there is sometimes a benefit.
As you said, there are optimizations: the JVM knows the place where a variable was last used, and the object referenced by it can be GC'd right after this last point (while still executing in the current scope). So nulling out references in most cases does not help the GC.
But it can be useful to avoid the "nepotism" (or "floating garbage") problem (read more here or watch the video). The problem exists because the heap is split into old and young generations, and different GC mechanisms are applied: minor GC (which is fast and happens often, to clean the young gen) and major GC (which causes a longer pause, to clean the old gen). "Nepotism" prevents garbage in the young gen from being collected when it is referenced by garbage that was already tenured into the old gen.
This is 'pathological' because ANY promoted node will result in the promotion of ALL following nodes until a GC resolves the issue.
To avoid nepotism, it's a good idea to null out references from an object that is about to be removed. You can see this technique applied in JDK classes such as LinkedList and LinkedHashMap:
private E unlinkFirst(Node<E> f) {
final E element = f.item;
final Node<E> next = f.next;
f.item = null;
f.next = null; // help GC
// ...
}
Do you always assign null to an object reference after its scope ends?
Or do you rely on the JVM for garbage collection?
Do you do it for all sort of applications regardless of their length?
If so, is it always a good practice?
It's not necessary to explicitly mark objects as null unless you have a very specific reason. Furthermore, I've never seen an application that marks all objects as null when they are no longer needed. The main benefit of garbage collection is the intrinsic memory management.
no, don't do that, except for specific cases such as static fields or when you know a variable/field lives a lot longer than the code referencing it
yes, but with a working knowledge of your VM's limits (and how to cause blocks of memory to be held accidentally)
n/a
I declare almost all of my variables as "final". I also make my methods small and declare most variables local to methods.
Since they are final, I cannot assign them null after use... but that is fine: since the methods are small, the objects become eligible for garbage collection once the method returns. And since most of the variables are local, there is less chance of accidentally holding onto a reference for longer than needed (a memory leak).
Assigning null to a variable does not mean it will be garbage collected right away. In fact, it most likely won't be. Whether you make a practice of setting variables to null is usually only cosmetic (with the exception of static variables).
We don't practice assigning "null". If a variable's scope has reached its end, it should already be ready for GC. There may be some edge cases in which the scope lasts a while longer due to a long-running operation, in which case it might make sense to set it to null, but I would imagine they are rare.
It also goes without saying that if the variable is an object's member variable or a static variable, and hence never really goes out of scope, then setting it to null so the object can be GC'd is mandatory.
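A tiny sketch of the static-field case (names invented): the class itself keeps the field reachable forever, so clearing it is the only way to release the object.

class Cache {
    static byte[] cached;      // reachable via the class, never goes out of scope

    static void load() {
        cached = new byte[64 * 1024 * 1024];
    }

    static void clear() {
        cached = null;         // without this, the 64 MB stays live for the JVM's lifetime
    }
}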
Garbage collection is not as magical as you might expect. As long as an object is referenced from any reachable object it simply can't be collected. So it might be absolutely necessary to null a reference in order to avoid memory leaks. I don't say you should do this always, but always when it's necessary.
As the others have mentioned, it's not usually necessary.
Not only that, but it clutters up your code and increases the data someone needs to read and understand when revisiting your code.
Assigning is not done to objects; it is done to variables, and it means that the variable then holds a reference to some object. Assigning null to a variable is not a way to destroy an object; it just clears one reference. If the variable you are clearing will leave its scope afterwards anyway, assigning null is just useless noise, because that happens on leaving the scope in any case.
The one time I tend to use this practice is if I need to transform a large Collection in some early part of a method.
For example:
public void foo() {
List<? extends Trade> trades = loadTrades();
Map<Date, List<? extends Trade>> tradesByDate = groupTradesByDate(trades);
trades = null; // trades no longer required.
// Apply business logic to tradesByDate map.
}
Obviously I could reduce the need for this by refactoring it into another method, Map<Date, List<? extends Trade>> loadTradesAndGroupByDate(), so it really depends on the circumstances / clarity of the code.
I only assign a reference to null when:
The code really lies in a memory-critical part.
The reference has a wide scope (and must be reused later). If it is not the case I just declare it in the smallest possible code block. It will be available for collection automatically.
That means I only use this technique in an iterative process where I use the reference to store huge incoming collections of objects. After processing, I do not need the collection any more, but I want to reuse the reference for the next collection.
In that case (and only in that case), I then call System.gc() to give a hint to the garbage collector. I have monitored this technique through a heap visualizer, and it works very well for big collections (more than 500 MB of data).
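A sketch of that pattern, with invented names, and with System.gc() understood as a hint rather than a guarantee:

import java.util.ArrayList;
import java.util.List;

class BatchProcessor {
    void processAll(int batches) {
        List<byte[]> batch;                  // wide scope, reused for each pass
        for (int i = 0; i < batches; i++) {
            batch = loadBatch();             // huge incoming collection
            process(batch);
            batch = null;                    // done with it; drop the reference
            System.gc();                     // hint the collector before the next big load
        }
    }

    private List<byte[]> loadBatch() {
        List<byte[]> data = new ArrayList<>();
        data.add(new byte[128 * 1024 * 1024]);
        return data;
    }

    private void process(List<byte[]> data) {
        System.out.println("processed " + data.size() + " blocks");
    }
}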
When using .NET, I don't think there's a need to set the object to null. Just let the garbage collection happen.
- Do you always assign null to an object reference after its scope ends?
No
- Or do you rely on the JVM for garbage collection?
Yes
- Do you do it for all sort of applications regardless of their length?
Yes
- If so, is it always a good practice?
N/A
I assume you're asking this question because you've seen code with variables being assigned to null at the point where they will never be accessed again.
I dislike this style, but another programmer used it extensively, and said he was taught to do so in a programming course at his university. The reasoning he gave is that it would prevent undetectable bugs: if he tried to reuse the variable later on, instead of indeterminate behavior he'd get a null pointer exception.
So if you're prone to using variables where you shouldn't be using variables, it might make your code more easy to debug.
There was a class of memory-leak bugs that happened regardless of whether I set the reference to null: if the library I was using was written in a language like C without memory management, then simply setting the object to null would not necessarily free the memory. We had to call the object's close() method to release the memory (which, of course, we couldn't do after setting it to null).
It thus seems to me that the de facto method of memory management in Java is to rely on the garbage collector, unless the object/library you're using has a close() method (or something similar).
Hopefully a simple question. Take for instance a Circularly-linked list:
class ListContainer
{
    private ListContainer next;
    <..>
    public void setNext(ListContainer next)
    {
        this.next = next;
    }
}

class List
{
    private ListContainer entry;
    <..>
}
Now, since it's a circularly-linked list, when a single element is added, it has a reference to itself in its next variable. When deleting the only element in the list, entry is set to null. Is there a need to set ListContainer.next to null as well for the garbage collector to free its memory, or does it handle such self-references automagically?
Garbage collectors that rely solely on reference counting are generally vulnerable to failing to collect self-referential structures such as this. These GCs rely on a count of the number of references to an object in order to decide whether it is reachable.
Non-reference-counting approaches apply a more comprehensive reachability test to determine whether an object is eligible for collection. These systems define a set of objects (the roots) which are always assumed to be reachable; any object reachable from that root set is considered ineligible for collection, and any object not reachable from it is eligible. Thus, cycles do not end up affecting reachability and can be collected.
See also, the Wikipedia page on tracing garbage collectors.
Circular references are a (solvable) problem if you rely on counting references in order to decide whether an object is dead. No Java implementation uses reference counting, AFAIK. Newer Sun JREs use a mix of several types of GC, all mark-and-sweep or copying, I think.
You can read more about garbage collection in general at Wikipedia, and some articles about java GC here and here, for example.
The actual answer to this is implementation dependent. The Sun JVM keeps track of some set of root objects (threads and the like), and when it needs to do a garbage collection, traces out which objects are reachable from those and saves them, discarding the rest. It's actually more complicated than that to allow for some optimizations, but that is the basic principle. This version does not care about circular references: as long as no live object holds a reference to a dead one, it can be GCed.
Other JVMs can use a method known as reference counting. When a reference to an object is created, a counter is incremented, and when the reference goes out of scope, the counter is decremented. If the counter reaches zero, the object is finalized and garbage collected. This version, however, does allow for the possibility of circular references that would never be garbage collected. As a safeguard, many such JVMs include a backup method for determining which objects actually are dead, which they run periodically to resolve self-references and defrag the heap.
As a non-answer aside (the existing answers more than suffice), you might want to check out a whitepaper on the JVM garbage collection system if you are at all interested in GC. (Any will do; just google "JVM garbage collection".)
I was amazed at some of the techniques used, and when reading through some of the concepts, like "Eden", I really realized for the first time that Java and the JVM could actually beat C/C++ in speed. (Whenever C/C++ frees an object or block of memory, code is involved; when Java frees an object, it actually doesn't do anything at all. Since in good OO code most objects are created and freed almost immediately, this is amazingly efficient.)
Modern GC's tend to be very efficient, managing older objects much differently than new objects, being able to control GCs to be short and half-assed or long and thorough, and a lot of GC options can be managed by command line switches so it's actually useful to know what all the terms actually refer to.
Note: I just realized this was misleading. C++'s stack allocation is very fast; my point was about allocating objects that are able to exist after the current routine has finished (which I believe SHOULD be all objects; it's something you shouldn't have to think about if you are going to think in OO, but in C++ speed may make this impractical).
If you are only allocating C++ classes on the stack, their allocation will be at least as fast as Java's.
Java collects any objects that are not reachable. If nothing else has a reference to the entry, then it will be collected, even though it has a reference to itself.
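A small demonstration of that claim: the node below references itself, yet once the only external reference is dropped, a tracing collector will typically reclaim it. The WeakReference lets us observe the collection; System.gc() is only a hint, so the final print is typical rather than guaranteed.

import java.lang.ref.WeakReference;

class SelfRefDemo {
    static class Node {
        Node next;
    }

    public static void main(String[] args) throws InterruptedException {
        Node node = new Node();
        node.next = node;                  // self-reference
        WeakReference<Node> ref = new WeakReference<>(node);

        node = null;                       // drop the only external reference
        System.gc();                       // hint; usually honored for a demo this small
        Thread.sleep(100);

        System.out.println(ref.get());     // typically prints "null": the cycle was collected
    }
}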
Yes, the Java garbage collector handles self-references!
How?
There are special objects called garbage-collection roots (GC roots). These are always reachable, and so is any object that is reachable from them.
A simple Java application has the following GC roots:
Local variables in the main method
The main thread
Static variables of the main class
To determine which objects are no longer in use, the JVM intermittently runs what is very aptly called a mark-and-sweep algorithm. It works as follows:
The algorithm traverses all object references, starting with the GC roots, and marks every object found as alive.
All of the heap memory that is not occupied by marked objects is reclaimed. It is simply marked as free, essentially swept free of unused objects.
So if an object is not reachable from the GC roots (even if it is self-referenced or cyclically referenced), it will be subject to garbage collection.
Simply, Yes. :)
Check out http://www.ibm.com/developerworks/java/library/j-jtp10283/
All JDKs (from Sun) have a concept of "reachability". If the GC cannot "reach" an object, it goes away.
This isn't any "new" info (your first two respondents are great), but the link is useful, and brevity is something sweet. :)