I was browsing some old books and found a copy of "Practical Java" by Peter Hagger. In the performance section, there is a recommendation to set object references to null when no longer needed.
In Java, does setting object references to null improve performance or garbage collection efficiency? If so, in what cases is this an issue? Container classes? Object composition? Anonymous inner classes?
I see this in code pretty often. Is this now obsolete programming advice or is it still useful?
It depends a bit on when you were thinking of nulling the reference.
If you have an object chain A->B->C, then once A is not reachable, A, B and C will all be eligible for garbage collection (assuming nothing else is referring to either B or C). There's no need, and never has been any need, to explicitly set references A->B or B->C to null, for example.
Apart from that, most of the time the issue doesn't really arise, because in reality you're dealing with objects in collections. You should generally always be thinking of removing objects from lists, maps etc by calling the appropiate remove() method.
The case where there used to be some advice to set references to null was specifically in a long scope where a memory-intensive object ceased to be used partway through the scope. For example:
{
BigObject obj = ...
doSomethingWith(obj);
obj = null; <-- explicitly set to null
doSomethingElse();
}
The rationale here was that because obj is still in scope, then without the explicit nulling of the reference, it does not become garbage collectable until after the doSomethingElse() method completes. And this is the advice that probably no longer holds on modern JVMs: it turns out that the JIT compiler can work out at what point a given local object reference is no longer used.
No, it's not obsolete advice. Dangling references are still a problem, especially if you're, say, implementing an expandable array container (ArrayList or the like) using a pre-allocated array. Elements beyond the "logical" size of the list should be nulled out, or else they won't be freed.
See Effective Java 2nd ed, Item 6: Eliminate Obsolete Object References.
Instance fields, array elements
If there is a reference to an object, it cannot be garbage collected. Especially if that object (and the whole graph behind it) is big, there is only one reference that is stopping garbage collection, and that reference is not really needed anymore, that is an unfortunate situation.
Pathological cases are the object that retains an unnessary instance to the whole XML DOM tree that was used to configure it, the MBean that was not unregistered, or the single reference to an object from an undeployed web application that prevents a whole classloader from being unloaded.
So unless you are sure that the object that holds the reference itself will be garbage collected anyway (or even then), you should null out everything that you no longer need.
Scoped variables:
If you are considering setting a local variable to null before the end of its scope , so that it can be reclaimed by the garbage collector and to mark it as "unusable from now on", you should consider putting it in a more limited scope instead.
{
BigObject obj = ...
doSomethingWith(obj);
obj = null; // <-- explicitly set to null
doSomethingElse();
}
becomes
{
{
BigObject obj = ...
doSomethingWith(obj);
} // <-- obj goes out of scope
doSomethingElse();
}
Long, flat scopes are generally bad for legibility of the code, too. Introducing private methods to break things up just for that purpose is not unheard of, too.
In memory restrictive environments (e.g. cellphones) this can be useful. By setting null, the objetc don't need to wait the variable to get out of scope to be gc'd.
For the everyday programming, however, this shouldn't be the rule, except in special cases like the one Chris Jester-Young cited.
Firstly, It does not mean anything that you are setting a object to null. I explain it below:
List list1 = new ArrayList();
List list2 = list1;
In above code segment we are creating the object reference variable name list1 of ArrayList object that is stored in the memory. So list1 is referring that object and it nothing more than a variable. And in the second line of code we are copying the reference of list1 to list2. So now going back to your question if I do:
list1 = null;
that means list1 is no longer referring any object that is stored in the memory so list2 will also having nothing to refer. So if you check the size of list2:
list2.size(); //it gives you 0
So here the concept of garbage collector arrives which says «you nothing to worry about freeing the memory that is hold by the object, I will do that when I find that it will no longer used in program and JVM will manage me.»
I hope it clear the concept.
One of the reasons to do so is to eliminate obsolete object references.
You can read the text here.
Related
Lets assume, there is a Tree object, with a root TreeNode object, and each TreeNode has leftNode and rightNode objects (e.g a BinaryTree object)
If i call:
myTree = null;
what really happens with the related TreeNode objects inside the tree? Will be garbage collected as well, or i have to set null all the related objects inside the tree object??
Garbage collection in Java is performed on the basis of "reachability". The JLS defines the term as follows:
"A reachable object is any object that can be accessed in any potential continuing computation from any live thread."
So long as an object is reachable1, it is not eligible for garbage collection.
The JLS leaves it up to the Java implementation to figure out how to determine whether an object could be accessible. If the implementation cannot be sure, it is free to treat a theoretically unreachable object as reachable ... and not collect it. (Indeed, the JLS allows an implementation to not collect anything, ever! No practical implementation would do that though2.)
In practice, (conservative) reachability is calculated by tracing; looking at what can be reached by following references starting with the class (static) variables, and local variables on thread stacks.
Here's what this means for your question:
If i call: myTree = null; what really happens with the related TreeNode objects inside the tree? Will be garbage collected as well, or i have to set null all the related objects inside the tree object??
Let's assume that myTree contains the last remaining reachable reference to the tree root.
Nothing happens immediately.
If the internal nodes were previously only reachable via the root node, then they are now unreachable, and eligible for garbage collection. (In this case, assigning null to references to internal nodes is unnecessary.)
However, if the internal nodes were reachable via other paths, they are presumably still reachable, and therefore NOT eligible for garbage collection. (In this case, assigning null to references to internal nodes is a mistake. You are dismantling a data structure that something else might later try to use.)
If myTree does not contain the last remaining reachable reference to the tree root, then nulling the internal reference is a mistake for the same reason as in 3. above.
So when should you null things to help the garbage collector?
The cases where you need to worry are when you can figure out that that the reference in some cell (local, instance or class variable, or array element) won't be used again, but the compiler and runtime can't! The cases fall into roughly three categories:
Object references in class variables ... which (by definition) never go out of scope.
Object references in local variables that are still in scope ... but won't be used. For example:
public List<Pig> pigSquadron(boolean pigsMightFly) {
List<Pig> airbornePigs = new ArrayList<Pig>();
while (...) {
Pig piggy = new Pig();
...
if (pigsMightFly) {
airbornePigs.add(piggy);
}
...
}
return airbornePigs.size() > 0 ? airbornePigs : null;
}
In the above, we know that if pigsMightFly is false, that the list object won't be used. But no mainstream Java compiler could be expected to figure this out.
Object references in instance variables or in array cells where the data structure invariants mean that they won't be used. #edalorzo's stack example is an example of this.
It should be noted that the compiler / runtime can sometimes figure out that an in-scope variable is effectively dead. For example:
public void method(...) {
Object o = ...
Object p = ...
while (...) {
// Do things to 'o' and 'p'
}
// No further references to 'o'
// Do lots more things to 'p'
}
Some Java compilers / runtimes may be able to detect that 'o' is not needed after the loop ends, and treat the variable as dead.
1 - In fact, what we are talking about here is strong reachability. The GC reachability model is more complicated when you consider soft, weak and phantom references. However, these are not relevant to the OP's use-case.
2 - In Java 11 there is an experimental GC called the Epsilon GC that explicitly doesn't collect anything.
They will be garbage collected unless you have other references to them (probably manual). If you just have a reference to the tree, then yes, they will be garbage collected.
You can't set an object to null, only a variable which might contain an pointer/reference to this object. The object itself is not affected by this. But if now no paths from any living thread (i.e. local variable of any running method) to your object exist, it will be garbage-collected, if and when the memory is needed. This applies to any objects, also the ones which are referred to from your original tree object.
Note that for local variables you normally not have to set them to null if the method (or block) will finish soon anyway.
myTree is just a reference variable that previously pointed to an object in the heap. Now you are setting that to null. If you don't have any other reference to that object, then that object will be eligible for garbage collection.
To let the garbage collector remove the object myTree just make a call to gc() after you've set it to null
myTree=null;
System.gc();
Note that the object is removed only when there is no other reference pointing to it.
In Java, you do not need to explicitly set objects to null to allow them to be GC'd. Objects are eligible for GC when there are no references to it (ignoring the java.lang.ref.* classes).
An object gets collected when there are no more references to it.
In your case, the nodes referred to directly by the object formally referenced by myTree (the root node) will be collected, and so on.
This of course is not the case if you have outstanding references to nodes outside of the tree. Those will get GC'd once those references go out of scope (along with anything only they refer to)
I am reading Effective Java and I came across this term, "Obsolete Reference".
When is a reference obsolete reference? I am assuming that all the objects that don't fall out of scope and remain unused are obsolete references. Correct me if I am wrong.
An obsolete reference (as used in the book, though it's not a widely used technical term) is one that is kept around but will never be used, preventing the object it refers to from being eligible for garbage collection, thus causing a memory leak.
An obsolete reference is simply a reference that will never be dereferenced again.
From Effective Java,
Holding onto obsolete references constitutes memory leaks in Java.
This is also termed as unintentional object retention.
Nulling out a reference to remove obsolete references to an object is
good, but one must not overdo it. The best way to eliminate an
obsolete reference is to reuse the variable in which it was contained
or to let it fall out of scope.
E.g for removing obsolete reference,
public Object pop() {
if (size == 0)
throw new EmptyStackException();
Object result = elements[--size];
elements[size] = null; // Eliminate obsolete reference
return result;
}
You are right. Basically, an obsolete reference is something which does-not affect the later flow of the program and should be set to null to aid garbage collection.
For example ;
String a="some value";
. . .
. . . //some processing here
//once done do this
a=null; //a is obsolete reference
unused objects , which are still having the references (may be not intentionally) and those reference are not refereed by your application/program/code , then that reference are obsolete reference. since the reference is still their for these unused objects , GC , is not possible for these objects and the objects which are inside those objects , and these leads memory leak issues.
A simple explanation is that even if the value popped from the method, that value is still being pointed inside the array by its index. So even if the program really is done with the passed-by reference copy, it will still be in the heap because of the elements array used in the stack.
In Java, to unload an object from the heap, is it sufficient to simply write myObject = null; and the GC will take care of it from there?
EDIT : Ok let me explain my use case, since everyone is assuming that I shouldn't explicitly null objects, I shouldn't worry about it, etc. That's missing the point. I am serializing an object, and am "consuming" a field of this object before I serialize it in order to save disk space. And before you jump down my throat for this, too, I cannot declare this field transient because I am including this field in the object sometimes, but not others.
Does setting an object to null have any effect on the GC?
In some modern VMs, actively setting a reference to null hinders the garbage collector. You should just forget about that.
For knowing when an object is garbage collected, look at the java.lang.ref package - although I can honestly say that in 16 years of Java programming, I've never needed to know when an object is garbage collected.
Can you elaborate on why you think you need this?
No; all references to that object must be lost/nulled. In practice this is something you shouldn't worry about.
Your object will be de-allocated when it is no longer used. Just be aware that any references left to the object will keep the object on the heap and simply assigning null to any single reference will not cause the underlying object to magically go away.
No, and no. myObject = null; will only help if there are no other references to the object, and in most cases it's superfluous because local objects go out of scope at the end of each method.
As for when objects are actually deallocated, that's completely up to the GC. What you can do is add a finalize method that will be called just before the object is deallocated, but this is problematic as well and should not be relied on.
Do you always assign null to an object after its scope has been reached?
Or do you rely on the JVM for garbage collection?
Do you do it for all sort of applications regardless of their length?
If so, is it always a good practice?
It's not necessary to explicitly mark objects as null unless you have a very specific reason. Furthermore, I've never seen an application that marks all objects as null when they are no longer needed. The main benefit of garbage collection is the intrinsic memory management.
no, don't do that, except for specific cases such as static fields or when you know a variable/field lives a lot longer than the code referencing it
yes, but with a working knowledge of your VM's limits (and how to cause blocks of memory to be held accidentally)
n/a
I declare almost all of my variables as "final". I also make my methods small and declare most variables local to methods.
Since they are final I cannot assign them null after use... but that is fine since the methods are small the objects are eligible for garbage collection once they return. Since most of the variables are local there is less chance of accidentally holding onto a reference for longer than needed (memory leak).
Assignin null to a variable does not implicitly mean it will be garbage collected right away. In fact it most likely won't be. Whether you practice setting variables to null is usually only cosmetic (with the exception of static variables)
We don't practice this assigning "null". If a variable's scope has reached it's end it should already be ready for GC. There may be some edge cases in which the scope lasts for a while longer due to a long running operation in which case it might make sense to set it to null, but I would imagine they would be rare.
It also goes without saying that if the variable is an object's member variable or a static variable and hence never really goes out of scope then setting it to null to GC is mandatory.
Garbage collection is not as magical as you might expect. As long as an object is referenced from any reachable object it simply can't be collected. So it might be absolutely necessary to null a reference in order to avoid memory leaks. I don't say you should do this always, but always when it's necessary.
As the others have mentioned, it's not usually necessary.
Not only that, but it clutters up your code and increases the data someone needs to read and understand when revisiting your code.
Assigning is not done to objects, it is done to variables, and it means that this variable then holds a reference to some object. Assigning NULL to a variable is not a way to destroy an object, it just clears one reference. If the variable you are clearing will leave its scope afterwards anyway, assigning NULL is just useless noise, because that happens on leaving scope in any case.
The one time I tend to use this practice is if I need to transform a large Collection in some early part of a method.
For example:
public void foo() {
List<? extends Trade> trades = loadTrades();
Map<Date, List<? extends Trade>> tradesByDate = groupTradesByDate(trades);
trades = null; // trades no longer required.
// Apply business logic to tradesByDate map.
}
Obviously I could reduce the need for this by refactoring this into another method: Map<Date, List<? extends Trade>>> loadTradesAndGroupByDate() so it really depends on circumstances / clarity of code.
I only assign a reference to null when:
The code really lies in a memory-critical part.
The reference has a wide scope (and must be reused later). If it is not the case I just declare it in the smallest possible code block. It will be available for collection automatically.
That means that I only use this technique in iterative process where I use the reference to store incoming huge collection of objects. After processing, I do not need the collection any more but I want to reuse the reference for the next collection.
In that case (and only in that case), I then call System.gc() to give a hint to the Garbage Collector. I monitored this technique through heap visualizer and it works very well for big collections (more then 500Mb of data).
When using the .Net I don't think there's a need to set the object to null. Just let the garbage collection happen.
- Do you always assign null to an object after its scope has been reached?
No
- Or do you rely on the JVM for garbage collection?
Yes
- Do you do it for all sort of applications regardless of their length?
Yes
- If so, is it always a good practice?
N/A
I assume you're asking this question because you've seen code with variables being assigned to null at the point where they will never be accessed again.
I dislike this style, but another programmer used it extensively, and said he was taught to do so at a programming course at his university. The reasoning he gave is that it would prevent undetectable bugs if he tried to reuse the variable later on, instead of indeterminate behavior, he'd get a null pointer exception.
So if you're prone to using variables where you shouldn't be using variables, it might make your code more easy to debug.
There was a class of memory leak bugs that happened regardless of whether I set the reference to null - if the library I was using was written in a language like C without memory management, then simply setting the object to null would not necessarily free the memory. We had to call the object's close() method to release the memory (which, of course, we couldn't do after setting it to null.)
It thus seems to me that the de facto method of memory management in java is to rely on the garbage collector unless the object/library you're using has a close() method (or something similar.)
Does assigning an unused object reference to null in Java improve the garbage collection process in any measurable way?
My experience with Java (and C#) has taught me that is often counter intuitive to try and outsmart the virtual machine or JIT compiler, but I've seen co-workers use this method and I am curious if this is a good practice to pick up or one of those voodoo programming superstitions?
Typically, no.
But like all things: it depends. The GC in Java these days is VERY good and everything should be cleaned up very shortly after it is no longer reachable. This is just after leaving a method for local variables, and when a class instance is no longer referenced for fields.
You only need to explicitly null if you know it would remain referenced otherwise. For example an array which is kept around. You may want to null the individual elements of the array when they are no longer needed.
For example, this code from ArrayList:
public E remove(int index) {
RangeCheck(index);
modCount++;
E oldValue = (E) elementData[index];
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // Let gc do its work
return oldValue;
}
Also, explicitly nulling an object will not cause an object to be collected any sooner than if it just went out of scope naturally as long as no references remain.
Both:
void foo() {
Object o = new Object();
/// do stuff with o
}
and:
void foo() {
Object o = new Object();
/// do stuff with o
o = null;
}
Are functionally equivalent.
In my experience, more often than not, people null out references out of paranoia not out of necessity. Here is a quick guideline:
If object A references object B and you no longer need this reference and object A is not eligible for garbage collection then you should explicitly null out the field. There is no need to null out a field if the enclosing object is getting garbage collected anyway. Nulling out fields in a dispose() method is almost always useless.
There is no need to null out object references created in a method. They will get cleared automatically once the method terminates. The exception to this rule is if you're running in a very long method or some massive loop and you need to ensure that some references get cleared before the end of the method. Again, these cases are extremely rare.
I would say that the vast majority of the time you will not need to null out references. Trying to outsmart the garbage collector is useless. You will just end up with inefficient, unreadable code.
Good article is today's coding horror.
The way GC's work is by looking for objects that do not have any pointers to them, the area of their search is heap/stack and any other spaces they have. So if you set a variable to null, the actual object is now not pointed by anyone, and hence could be GC'd.
But since the GC might not run at that exact instant, you might not actually be buying yourself anything. But if your method is fairly long (in terms of execution time) it might be worth it since you will be increasing your chances of GC collecting that object.
The problem can also be complicated with code optimizations, if you never use the variable after you set it to null, it would be a safe optimization to remove the line that sets the value to null (one less instruction to execute). So you might not actually be getting any improvement.
So in summary, yes it can help, but it will not be deterministic.
At least in java, it's not voodoo programming at all. When you create an object in java using something like
Foo bar = new Foo();
you do two things: first, you create a reference to an object, and second, you create the Foo object itself. So long as that reference or another exists, the specific object can't be gc'd. however, when you assign null to that reference...
bar = null ;
and assuming nothing else has a reference to the object, it's freed and available for gc the next time the garbage collector passes by.
It depends.
Generally speaking shorter you keep references to your objects, faster they'll get collected.
If your method takes say 2 seconds to execute and you don't need an object anymore after one second of method execution, it makes sense to clear any references to it. If GC sees that after one second, your object is still referenced, next time it might check it in a minute or so.
Anyway, setting all references to null by default is to me premature optimization and nobody should do it unless in specific rare cases where it measurably decreases memory consuption.
Explicitly setting a reference to null instead of just letting the variable go out of scope, does not help the garbage collector, unless the object held is very large, where setting it to null as soon as you are done with is a good idea.
Generally setting references to null, mean to the READER of the code that this object is completely done with and should not be concerned about any more.
A similar effect can be achieved by introducing a narrower scope by putting in an extra set of braces
{
int l;
{ // <- here
String bigThing = ....;
l = bigThing.length();
} // <- and here
}
this allows the bigThing to be garbage collected right after leaving the nested braces.
public class JavaMemory {
private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6);
public void f() {
{
byte[] data = new byte[dataSize];
//data = null;
}
byte[] data2 = new byte[dataSize];
}
public static void main(String[] args) {
JavaMemory jmp = new JavaMemory();
jmp.f();
}
}
Above program throws OutOfMemoryError. If you uncomment data = null;, the OutOfMemoryError is solved. It is always good practice to set the unused variable to null
I was working on a video conferencing application one time and noticed a huge huge huge difference in performance when I took the time to null references as soon as I didn't need the object anymore. This was in 2003-2004 and I can only imagine the GC has gotten even smarter since. In my case I had hundreds of objects coming and going out of scope every second, so I noticed the GC when it kicked in periodically. However after I made it a point to null objects the GC stopped pausing my application.
So it depends on what your doing...
Yes.
From "The Pragmatic Programmer" p.292:
By setting a reference to NULL you reduce the number of pointers to the object by one ... (which will allow the garbage collector to remove it)
I assume the OP is referring to things like this:
private void Blah()
{
MyObj a;
MyObj b;
try {
a = new MyObj();
b = new MyObj;
// do real work
} finally {
a = null;
b = null;
}
}
In this case, wouldn't the VM mark them for GC as soon as they leave scope anyway?
Or, from another perspective, would explicitly setting the items to null cause them to get GC'd before they would if they just went out of scope? If so, the VM may spend time GC'ing the object when the memory isn't needed anyway, which would actually cause worse performance CPU usage wise because it would be GC'ing more earlier.
Even if nullifying the reference were marginally more efficient, would it be worth the ugliness of having to pepper your code with these ugly nullifications? They would only be clutter and obscure the intent code that contains them.
Its a rare codebase that has no better candidate for optimisation than trying to outsmart the Garbage collector (rarer still are developers who succeed in outsmarting it). Your efforts will most likely be better spent elsewhere instead, ditching that crufty Xml parser or finding some opportunity to cache computation. These optimisations will be easier to quantify and don't require you dirty up your codebase with noise.
Oracle doc point out "Assign null to Variables That Are No Longer Needed" https://docs.oracle.com/cd/E19159-01/819-3681/abebi/index.html
"It depends"
I do not know about Java but in .net (C#, VB.net...) it is usually not required to assign a null when you no longer require a object.
However note that it is "usually not required".
By analyzing your code the .net compiler makes a good valuation of the life time of the variable...to accurately tell when the object is not being used anymore. So if you write obj=null it might actually look as if the obj is still being used...in this case it is counter productive to assign a null.
There are a few cases where it might actually help to assign a null. One example is you have a huge code that runs for long time or a method that is running in a different thread, or some loop. In such cases it might help to assign null so that it is easy for the GC to know its not being used anymore.
There is no hard & fast rule for this. Going by the above place null-assigns in your code and do run a profiler to see if it helps in any way. Most probably you might not see a benefit.
If it is .net code you are trying to optimize, then my experience has been that taking good care with Dispose and Finalize methods is actually more beneficial than bothering about nulls.
Some references on the topic:
http://blogs.msdn.com/csharpfaq/archive/2004/03/26/97229.aspx
http://weblogs.asp.net/pwilson/archive/2004/02/20/77422.aspx
In the future execution of your program, the values of some data members will be used to computer an output visible external to the program. Others might or might not be used, depending on future (And impossible to predict) inputs to the program. Other data members might be guaranteed not to be used. All resources, including memory, allocated to those unused data are wasted. The job of the garbage collector (GC) is to eliminate that wasted memory. It would be disastrous for the GC to eliminate something that was needed, so the algorithm used might be conservative, retaining more than the strict minimum. It might use heuristic optimizations to improve its speed, at the cost of retaining some items that are not actually needed. There are many potential algorithms the GC might use. Therefore it is possible that changes you make to your program, and which do not affect the correctness of your program, might nevertheless affect the operation of the GC, either making it run faster to do the same job, or to sooner identify unused items. So this kind of change, setting an unusdd object reference to null, in theory is not always voodoo.
Is it voodoo? There are reportedly parts of the Java library code that do this. The writers of that code are much better than average programmers and either know, or cooperate with, programmers who know details of the garbage collector implementations. So that suggests there is sometimes a benefit.
As you said there are optimizations, i.e. JVM knows the place when the variable was last used and the object referenced by it can be GCed right after this last point (still executing in current scope). So nulling out references in most cases does not help GC.
But it can be useful to avoid "nepotism" (or "floating garbage") problem (read more here or watch video). The problem exists because heap is split into Old and Young generations and there are different GC mechanisms applied: Minor GC (which is fast and happens often to clean young gen) and Major Gc (which causes longer pause to clean Old gen). "Nepotism" does not allow for garbage in Young gen to be collected if it is referenced by garbage which was already tenured to an Old gen.
This is 'pathological' because ANY promoted node will result in the promotion of ALL following nodes until a GC resolves the issue.
To avoid nepotism it's a good idea to null out references from an object which is supposed to be removed. You can see this technique applied in JDK classes: LinkedList and LinkedHashMap
private E unlinkFirst(Node<E> f) {
final E element = f.item;
final Node<E> next = f.next;
f.item = null;
f.next = null; // help GC
// ...
}