Why does this code sample produce a memory leak? - java

In the university we were given the following code sample and we were being told, that there is a memory leak when running this code. The sample should demonstrate that this is a situation where the garbage collector can't work.
As far as my object oriented programming goes, the only codeline able to create a memory leak would be
items=Arrays.copyOf(items,2 * size+1);
The documentation says, that the elements are copied. Does that mean the reference is copied (and therefore another entry on the heap is created) or the object itself is being copied? As far as I know, Object and therefore Object[] are implemented as a reference type. So assigning a new value to 'items' would allow the garbage collector to find that the old 'item' is no longer referenced and can therefore be collected.
In my eyes, this the codesample does not produce a memory leak. Could somebody prove me wrong? =)
import java.util.Arrays;
public class Foo
{
private Object[] items;
private int size=0;
private static final int ISIZE=10;
public Foo()
{
items= new Object[ISIZE];
}
public void push(final Object o){
checkSize();
items[size++]=o;
}
public Object pop(){
if (size==0)
throw new ///...
return items[--size];
}
private void checkSize(){
if (items.length==size){
items=Arrays.copyOf(items,2 * size+1);
}
}
}

The pop method produces the memory leak.
The reason is that you only reduce the number of items that are in the queue, but you don't actually remove them from the queue.The references remain in the array. If you don't remove them, the garbage collector, won't destruct the objects, even if the code that produced the object is executed.
Imagine this:
{
Object o = new Object();
myQueue.add(o);
}
Now you have only one reference for this object - the one in the array.
Later you do:
{
myQueue.pop();
}
This pop doesn't delete the reference. If you don't remove the reference the Garbage collector will think that you are still thinking of using this reference and that this object is useful.
So if you fill the Queue with n number of objects then you will hold reference for these n objects.
This is a the memory leak your teachers told you about.

Hint: the leak is in the pop method. Consider what happens to the references to a popped object ...

It's not a priori true that there's a memory leak here.
I think the prof has in mind that you're not nulling out popped items (in other words, after you return items[--size], you probably ought to set items[size] = null). But when the Foo instance goes out of scope, then everything will get collected. So it's a pretty weak exercise.

This example is discussed in Effective Java by Joshua Bloch. The leak is when popping elements. The references keep pointing to objects you don't use.

The code sample doesn't produce a leak. It's true that when you call pop(), the memory isn't freed for the appropriate object - but it will be when you next call push().
It's true that the sample never releases memory. However, the unreleased memory is always re-used. In this case, it doesn't really fit the definition of memory leak.
for(int i = 0; i < 1000; i++)
foo.push(new Object());
for(int i = 0; i < 1000; i++)
foo.pop();
This will produce memory that isn't freed. However, if you ran the loop again, or a hundred thousand million times, you wouldn't produce more memory that isn't freed. Therefore, memory is never leaked.
You can actually see this behaviour in many malloc and free (C) implementations- when you free memory, it isn't actually returned to the OS, but added to a list to be given back next time you call malloc. But we still don't suggest that free leaks memory.

Memory leaks are defined as unbounded growth in allocation caused by ongoing execution.
The explanations provided explain how objects could continue to be held active through references in the stack after popping, and can certainly result in all kinds of misbehaviour (for example when the caller releases what they think is the last reference and expects finalisation and memory recovery), but can hardly be called leaks.
As the stack is used to store other object references the previous orphaned objects will become truly inaccessible and be returned to the memory pool.
Your initial skepticism is valid. The code presented would provide bounded memory use growth with convergence to a long-term state.

Hint: Imagine what happens if you use a Foo object, insert into it 10000 "heavy" items, and then remove all of them using pop() because you don't need them anymore in your program.

I'm not going to flat out give you the answer, but look at what push(Object o) does that pop() doesn't do.

In the pop() method, the item on the size (i.e, items[size-1]) is not set to NULL. As a result, there still exists reference from objects items to items[size-1], although size has been reduced by one. During GC, items[size-1] won't be collected even if there is no other object pointing to it, which leads to memory leak.

Consider this demo:
Foo f = new Foo();
{
Object o1 = new Object();
Object o2 = new Object();
f.push(o1);
f.push(o2);
}
f.pop();
f.pop();
// #1. o1 and o2 still are refered in f.items, thus not deleted
f = null;
// #2. o1 and o2 will be deleted now
Several things should be improved in Foo which will fix this:
In pop, you should set the items entry to null.
You should introduce the opposite to checkSize, something like shrinkSize, which will make the array smaller (maybe in a similar way to checkSize).

Related

java - Memory management - Destroying a list when we don't need it [duplicate]

Does assigning an unused object reference to null in Java improve the garbage collection process in any measurable way?
My experience with Java (and C#) has taught me that is often counter intuitive to try and outsmart the virtual machine or JIT compiler, but I've seen co-workers use this method and I am curious if this is a good practice to pick up or one of those voodoo programming superstitions?
Typically, no.
But like all things: it depends. The GC in Java these days is VERY good and everything should be cleaned up very shortly after it is no longer reachable. This is just after leaving a method for local variables, and when a class instance is no longer referenced for fields.
You only need to explicitly null if you know it would remain referenced otherwise. For example an array which is kept around. You may want to null the individual elements of the array when they are no longer needed.
For example, this code from ArrayList:
public E remove(int index) {
RangeCheck(index);
modCount++;
E oldValue = (E) elementData[index];
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // Let gc do its work
return oldValue;
}
Also, explicitly nulling an object will not cause an object to be collected any sooner than if it just went out of scope naturally as long as no references remain.
Both:
void foo() {
Object o = new Object();
/// do stuff with o
}
and:
void foo() {
Object o = new Object();
/// do stuff with o
o = null;
}
Are functionally equivalent.
In my experience, more often than not, people null out references out of paranoia not out of necessity. Here is a quick guideline:
If object A references object B and you no longer need this reference and object A is not eligible for garbage collection then you should explicitly null out the field. There is no need to null out a field if the enclosing object is getting garbage collected anyway. Nulling out fields in a dispose() method is almost always useless.
There is no need to null out object references created in a method. They will get cleared automatically once the method terminates. The exception to this rule is if you're running in a very long method or some massive loop and you need to ensure that some references get cleared before the end of the method. Again, these cases are extremely rare.
I would say that the vast majority of the time you will not need to null out references. Trying to outsmart the garbage collector is useless. You will just end up with inefficient, unreadable code.
Good article is today's coding horror.
The way GC's work is by looking for objects that do not have any pointers to them, the area of their search is heap/stack and any other spaces they have. So if you set a variable to null, the actual object is now not pointed by anyone, and hence could be GC'd.
But since the GC might not run at that exact instant, you might not actually be buying yourself anything. But if your method is fairly long (in terms of execution time) it might be worth it since you will be increasing your chances of GC collecting that object.
The problem can also be complicated with code optimizations, if you never use the variable after you set it to null, it would be a safe optimization to remove the line that sets the value to null (one less instruction to execute). So you might not actually be getting any improvement.
So in summary, yes it can help, but it will not be deterministic.
At least in java, it's not voodoo programming at all. When you create an object in java using something like
Foo bar = new Foo();
you do two things: first, you create a reference to an object, and second, you create the Foo object itself. So long as that reference or another exists, the specific object can't be gc'd. however, when you assign null to that reference...
bar = null ;
and assuming nothing else has a reference to the object, it's freed and available for gc the next time the garbage collector passes by.
It depends.
Generally speaking shorter you keep references to your objects, faster they'll get collected.
If your method takes say 2 seconds to execute and you don't need an object anymore after one second of method execution, it makes sense to clear any references to it. If GC sees that after one second, your object is still referenced, next time it might check it in a minute or so.
Anyway, setting all references to null by default is to me premature optimization and nobody should do it unless in specific rare cases where it measurably decreases memory consuption.
Explicitly setting a reference to null instead of just letting the variable go out of scope, does not help the garbage collector, unless the object held is very large, where setting it to null as soon as you are done with is a good idea.
Generally setting references to null, mean to the READER of the code that this object is completely done with and should not be concerned about any more.
A similar effect can be achieved by introducing a narrower scope by putting in an extra set of braces
{
int l;
{ // <- here
String bigThing = ....;
l = bigThing.length();
} // <- and here
}
this allows the bigThing to be garbage collected right after leaving the nested braces.
public class JavaMemory {
private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6);
public void f() {
{
byte[] data = new byte[dataSize];
//data = null;
}
byte[] data2 = new byte[dataSize];
}
public static void main(String[] args) {
JavaMemory jmp = new JavaMemory();
jmp.f();
}
}
Above program throws OutOfMemoryError. If you uncomment data = null;, the OutOfMemoryError is solved. It is always good practice to set the unused variable to null
I was working on a video conferencing application one time and noticed a huge huge huge difference in performance when I took the time to null references as soon as I didn't need the object anymore. This was in 2003-2004 and I can only imagine the GC has gotten even smarter since. In my case I had hundreds of objects coming and going out of scope every second, so I noticed the GC when it kicked in periodically. However after I made it a point to null objects the GC stopped pausing my application.
So it depends on what your doing...
Yes.
From "The Pragmatic Programmer" p.292:
By setting a reference to NULL you reduce the number of pointers to the object by one ... (which will allow the garbage collector to remove it)
I assume the OP is referring to things like this:
private void Blah()
{
MyObj a;
MyObj b;
try {
a = new MyObj();
b = new MyObj;
// do real work
} finally {
a = null;
b = null;
}
}
In this case, wouldn't the VM mark them for GC as soon as they leave scope anyway?
Or, from another perspective, would explicitly setting the items to null cause them to get GC'd before they would if they just went out of scope? If so, the VM may spend time GC'ing the object when the memory isn't needed anyway, which would actually cause worse performance CPU usage wise because it would be GC'ing more earlier.
Even if nullifying the reference were marginally more efficient, would it be worth the ugliness of having to pepper your code with these ugly nullifications? They would only be clutter and obscure the intent code that contains them.
Its a rare codebase that has no better candidate for optimisation than trying to outsmart the Garbage collector (rarer still are developers who succeed in outsmarting it). Your efforts will most likely be better spent elsewhere instead, ditching that crufty Xml parser or finding some opportunity to cache computation. These optimisations will be easier to quantify and don't require you dirty up your codebase with noise.
Oracle doc point out "Assign null to Variables That Are No Longer Needed" https://docs.oracle.com/cd/E19159-01/819-3681/abebi/index.html
"It depends"
I do not know about Java but in .net (C#, VB.net...) it is usually not required to assign a null when you no longer require a object.
However note that it is "usually not required".
By analyzing your code the .net compiler makes a good valuation of the life time of the variable...to accurately tell when the object is not being used anymore. So if you write obj=null it might actually look as if the obj is still being used...in this case it is counter productive to assign a null.
There are a few cases where it might actually help to assign a null. One example is you have a huge code that runs for long time or a method that is running in a different thread, or some loop. In such cases it might help to assign null so that it is easy for the GC to know its not being used anymore.
There is no hard & fast rule for this. Going by the above place null-assigns in your code and do run a profiler to see if it helps in any way. Most probably you might not see a benefit.
If it is .net code you are trying to optimize, then my experience has been that taking good care with Dispose and Finalize methods is actually more beneficial than bothering about nulls.
Some references on the topic:
http://blogs.msdn.com/csharpfaq/archive/2004/03/26/97229.aspx
http://weblogs.asp.net/pwilson/archive/2004/02/20/77422.aspx
In the future execution of your program, the values of some data members will be used to computer an output visible external to the program. Others might or might not be used, depending on future (And impossible to predict) inputs to the program. Other data members might be guaranteed not to be used. All resources, including memory, allocated to those unused data are wasted. The job of the garbage collector (GC) is to eliminate that wasted memory. It would be disastrous for the GC to eliminate something that was needed, so the algorithm used might be conservative, retaining more than the strict minimum. It might use heuristic optimizations to improve its speed, at the cost of retaining some items that are not actually needed. There are many potential algorithms the GC might use. Therefore it is possible that changes you make to your program, and which do not affect the correctness of your program, might nevertheless affect the operation of the GC, either making it run faster to do the same job, or to sooner identify unused items. So this kind of change, setting an unusdd object reference to null, in theory is not always voodoo.
Is it voodoo? There are reportedly parts of the Java library code that do this. The writers of that code are much better than average programmers and either know, or cooperate with, programmers who know details of the garbage collector implementations. So that suggests there is sometimes a benefit.
As you said there are optimizations, i.e. JVM knows the place when the variable was last used and the object referenced by it can be GCed right after this last point (still executing in current scope). So nulling out references in most cases does not help GC.
But it can be useful to avoid "nepotism" (or "floating garbage") problem (read more here or watch video). The problem exists because heap is split into Old and Young generations and there are different GC mechanisms applied: Minor GC (which is fast and happens often to clean young gen) and Major Gc (which causes longer pause to clean Old gen). "Nepotism" does not allow for garbage in Young gen to be collected if it is referenced by garbage which was already tenured to an Old gen.
This is 'pathological' because ANY promoted node will result in the promotion of ALL following nodes until a GC resolves the issue.
To avoid nepotism it's a good idea to null out references from an object which is supposed to be removed. You can see this technique applied in JDK classes: LinkedList and LinkedHashMap
private E unlinkFirst(Node<E> f) {
final E element = f.item;
final Node<E> next = f.next;
f.item = null;
f.next = null; // help GC
// ...
}

Garbage collection vs manual memory management

This is a very basic question. I will formulate it using C++ and Java, but it's really language-independent.
Consider a well-known problem in C++:
struct Obj
{
boost::shared_ptr<Obj> m_field;
};
{
boost::shared_ptr<Obj> obj1(new Obj);
boost::shared_ptr<Obj> obj2(new Obj);
obj1->m_field = obj2;
obj2->m_field = obj1;
}
This is a memory leak, and everybody knows it :). The solution is also well-known: one should use weak pointers to break the "refcount interlocking". It is also known that this problem cannot be resolved automatically in principle. It's solely programmer's responsibility to resolve it.
But there's a positive thing: a programmer has full control on refcount values. I can pause my program in debugger and examine refcount for obj1, obj2 and understand that there's a problem. I also can set a breakpoint in destructor of an object and observe a destruction moment (or find out that object has not been destroyed).
My question is about Java, C#, ActionScript and other "Garbage Collection" languages. I might be missing something, but in my opinion they
Do not let me examine refcount of objects
Do not let me know when object is destroyed (okay, when object is exposed to GC)
I often hear that these languages just do not allow a programmer to leak a memory and that's why they are great. As far as I understand, they just hide memory management problems and make it hard to solve them.
Finally, the questions themselves:
Java:
public class Obj
{
public Obj m_field;
}
{
Obj obj1 = new Obj();
Obj obj2 = new Obj();
obj1.m_field = obj2;
obj2.m_field = obj1;
}
Is it memory leak?
If yes: how do I detect and fix it?
If no: why?
Managed memory systems are built on the assumption that you don't want to be tracing memory leak issue in the first place. Instead of making them easier to solve you try to make sure they never happen in the first place.
Java does have a lose term for "Memory Leak" which means any growth in memory which could impact your application, but there is never a point that the managed memory cannot clean up all the memory.
JVM don't use reference counting for a number of reasons
it cannot handled circular references as you have observed.
it has significant memory and threading overhead to maintain accurately.
there are much better, simpler ways of handling such situations for managed memory.
While the JLS doesn't ban the use of reference counts, it is not used in any JVM AFAIK.
Instead Java keeps track of a number of root contexts (e.g. each thread stack) and can trace which objects need to be keeps and which can be discarded based on whether those objects are strongly reachable. It also provides the facility for weak references (which are retained as long as the objects are not cleaned up) and soft references (which are not generally cleaned up but can be at the garbage collectors discretion)
AFAIK, Java GC works by starting from a set of well-defined initial references and computing a transitive closure of objects which can be reached from these references. Anything not reachable is "leaked" and can be GC-ed.
Java has a unique memory management strategy. Everything (except a few specific things) are allocated on the heap, and isn't freed until the GC gets to work.
For example:
public class Obj {
public Object example;
public Obj m_field;
}
public static void main(String[] args) {
int lastPrime = 2;
while (true) {
Obj obj1 = new Obj();
Obj obj2 = new Obj();
obj1.example = new Object();
obj1.m_field = obj2;
obj2.m_field = obj1;
int prime = lastPrime++;
while (!isPrime(prime)) {
prime++;
}
lastPrime = prime;
System.out.println("Found a prime: " + prime);
}
}
C handles this situation by requiring you to manually free the memory of both 'obj', and C++ counts references to 'obj' and automatically destroys them when they go out of scope.
Java does not free this memory, at least not at first.
The Java runtime waits a while until it feels like there is too much memory being used. After that the Garbage collector kicks in.
Let's say the java garbage collector decides to clean up after the 10,000th iteration of the outer loop. By this time, 10,000 objects have been created (which would have already been freed in C/C++).
Although there are 10,000 iterations of the outer loop, only the newly created obj1 and obj2 could possibly be referenced by the code.
These are the GC 'roots', which java uses to find all objects which could possibly be referenced. The garbage collector then recursively iterates down the object tree, marking 'example' as active in addiction to the garbage collector roots.
All those other objects are then destroyed by the garbage collector.
This does come with a performance penalty, but this process has been heavily optimized, and isn't significant for most applications.
Unlike in C++, you don't have to worry about reference cycles at all, since only objects reachable from the GC roots will live.
With java applications you do have to worry about memory (Think lists holding onto the objects from all iterations), but it isn't as significant as other languages.
As for debugging: Java's idea of debugging high memory values are using a special 'memory-analyzer' to find out what objects are still on the heap, not worrying about what is referencing what.
The critical difference is that in Java etc you are not involved in the disposal problem at all. This may feel like a pretty scary position to be but it is surprisingly empowering. All the decisions you used to have to make as to who is responsible for disposing a created object are gone.
It does actually make sense. The system knows much more about what is reachable and what is not than you. It can also make much more flexible and intelligent decisions about when to tear down structures etc.
Essentially - in this environment you can juggle objects in a much more complex way without worrying about dropping one. The only thing you now need to worry about is if you accidentally glue one to the ceiling.
As an ex C programmer having moved to Java I feel your pain.
Re - your final question - it is not a memory leak. When GC kicks in everything is discarded except what is reachable. In this case, assuming you have released obj1 and obj2 neither is reachable so they will both be discarded.
Garbage collection is not simple ref counting.
The circular reference example which you demonstrate will not occur in a garbage collected managed language because the garbage collector will want to trace allocation references all the way back to something on the stack. If there isn't a stack reference somewhere it's garbage. Ref counting systems like shared_ptr are not that smart and it's possible (like you demonstrate) to have two objects somewhere in the heap which keep each other from being deleted.
Garbage collected languages don't let you inspect refcounter because they have no-one. Garbage collection is an entirely different thing from refcounted memory management. The real difference is in determinism.
{
std::fstream file( "example.txt" );
// do something with file
}
// ... later on
{
std::fstream file( "example.txt" );
// do something else with file
}
in C++ you have the guarantee that example.txt has been closed after the first block is closed, or if an exception is thrown. Caomparing it with Java
{
try
{
FileInputStream file = new FileInputStream( "example.txt" );
// do something with file
}
finally
{
if( file != null )
file.close();
}
}
// ..later on
{
try
{
FileInputStream file = new FileInputStream( "example.txt" );
// do something with file
}
finally
{
if( file != null )
file.close();
}
}
As you see, you have traded memory management for all other resources management. That is the real diffence, refcounted objects still keep deterministic destruction. In garbage collection languages you must manually release resources, and check for exception. One may argue that explicit memory management can be tedious and error prone, but in modern C++ you it is mitigated by smart pointers and standard containers. You still have some responsibilities (circular references, for example), but think at how many catch/finally block you can avoid using deterministic destruction and how much typing a Java/C#/etc. programmer must do instead (as they have to manually close/release resources other than memory). And I know that there's using syntax in C# (and something similar in the newest Java) but it covers only the block scope lifetime and not the more general problem of shared ownership.

Why this program can cause memory Leak?

Well , i was going through Memory Leaks in Java .
I saw this simple below program where the author says that
Memory Leaks are possible with this below program
But could please tell me whats wrong with this program and why it can
produce a Memory Leak ??
package com.code.revisited.memoryleaks;
public class StackTest {
public static void main(String[] args) {
Stack<Integer> s = new Stack<>(10000);
for (int i = 0; i < 10000; i++) {
s.push(i);
}
while (!s.isEmpty()) {
s.pop();
}
while(true){
//do something
}
}
}
pop method is removing Integer objects from the Stack. But Integer objects are not de-referenced; this means that they will occupy memory.
Update:
This point is explained in Item 6 of Effective Java : Eliminate obsolete object references
If a stack grows and then shrinks, the objects
that were popped off the stack will not be garbage collected, even if the program
using the stack has no more references to them. This is because the stack maintains
obsolete references to these objects. An obsolete reference is simply a reference
that will never be dereferenced again.
The fix for this sort of problem is simple: null out references or remove object from Stack once they become obsolete. In given case pop method will decrement the top reference.
It really depends how the stack is implemented.
If this is Java's stack (java.util.Stack), then it should not happen. The underlying array is dynamic and may have unused slots, but they are explicitly set to null when popping items.
I guess that the stack in your example is not the standard one; It's probably a part of the example, and it illustrates this kind of memory leak. For example, if the pop() method decreases the underlying array index, but doesn't set the array item to null, then the code above should leave 1000 live objects in the heap, although they are probably not needed anymore by the program.
-- Edit --
Did you take the example from http://coderevisited.com/memory-leaks-in-java/?
If so, note that it also includes a stack implementation, just as I suspected.

Does assigning objects to null in Java impact garbage collection?

Does assigning an unused object reference to null in Java improve the garbage collection process in any measurable way?
My experience with Java (and C#) has taught me that is often counter intuitive to try and outsmart the virtual machine or JIT compiler, but I've seen co-workers use this method and I am curious if this is a good practice to pick up or one of those voodoo programming superstitions?
Typically, no.
But like all things: it depends. The GC in Java these days is VERY good and everything should be cleaned up very shortly after it is no longer reachable. This is just after leaving a method for local variables, and when a class instance is no longer referenced for fields.
You only need to explicitly null if you know it would remain referenced otherwise. For example an array which is kept around. You may want to null the individual elements of the array when they are no longer needed.
For example, this code from ArrayList:
public E remove(int index) {
RangeCheck(index);
modCount++;
E oldValue = (E) elementData[index];
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // Let gc do its work
return oldValue;
}
Also, explicitly nulling an object will not cause an object to be collected any sooner than if it just went out of scope naturally as long as no references remain.
Both:
void foo() {
Object o = new Object();
/// do stuff with o
}
and:
void foo() {
Object o = new Object();
/// do stuff with o
o = null;
}
Are functionally equivalent.
In my experience, more often than not, people null out references out of paranoia not out of necessity. Here is a quick guideline:
If object A references object B and you no longer need this reference and object A is not eligible for garbage collection then you should explicitly null out the field. There is no need to null out a field if the enclosing object is getting garbage collected anyway. Nulling out fields in a dispose() method is almost always useless.
There is no need to null out object references created in a method. They will get cleared automatically once the method terminates. The exception to this rule is if you're running in a very long method or some massive loop and you need to ensure that some references get cleared before the end of the method. Again, these cases are extremely rare.
I would say that the vast majority of the time you will not need to null out references. Trying to outsmart the garbage collector is useless. You will just end up with inefficient, unreadable code.
Good article is today's coding horror.
The way GC's work is by looking for objects that do not have any pointers to them, the area of their search is heap/stack and any other spaces they have. So if you set a variable to null, the actual object is now not pointed by anyone, and hence could be GC'd.
But since the GC might not run at that exact instant, you might not actually be buying yourself anything. But if your method is fairly long (in terms of execution time) it might be worth it since you will be increasing your chances of GC collecting that object.
The problem can also be complicated with code optimizations, if you never use the variable after you set it to null, it would be a safe optimization to remove the line that sets the value to null (one less instruction to execute). So you might not actually be getting any improvement.
So in summary, yes it can help, but it will not be deterministic.
At least in java, it's not voodoo programming at all. When you create an object in java using something like
Foo bar = new Foo();
you do two things: first, you create a reference to an object, and second, you create the Foo object itself. So long as that reference or another exists, the specific object can't be gc'd. however, when you assign null to that reference...
bar = null ;
and assuming nothing else has a reference to the object, it's freed and available for gc the next time the garbage collector passes by.
It depends.
Generally speaking shorter you keep references to your objects, faster they'll get collected.
If your method takes say 2 seconds to execute and you don't need an object anymore after one second of method execution, it makes sense to clear any references to it. If GC sees that after one second, your object is still referenced, next time it might check it in a minute or so.
Anyway, setting all references to null by default is to me premature optimization and nobody should do it unless in specific rare cases where it measurably decreases memory consuption.
Explicitly setting a reference to null instead of just letting the variable go out of scope, does not help the garbage collector, unless the object held is very large, where setting it to null as soon as you are done with is a good idea.
Generally setting references to null, mean to the READER of the code that this object is completely done with and should not be concerned about any more.
A similar effect can be achieved by introducing a narrower scope by putting in an extra set of braces
{
int l;
{ // <- here
String bigThing = ....;
l = bigThing.length();
} // <- and here
}
this allows the bigThing to be garbage collected right after leaving the nested braces.
public class JavaMemory {
private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6);
public void f() {
{
byte[] data = new byte[dataSize];
//data = null;
}
byte[] data2 = new byte[dataSize];
}
public static void main(String[] args) {
JavaMemory jmp = new JavaMemory();
jmp.f();
}
}
Above program throws OutOfMemoryError. If you uncomment data = null;, the OutOfMemoryError is solved. It is always good practice to set the unused variable to null
I was working on a video conferencing application one time and noticed a huge huge huge difference in performance when I took the time to null references as soon as I didn't need the object anymore. This was in 2003-2004 and I can only imagine the GC has gotten even smarter since. In my case I had hundreds of objects coming and going out of scope every second, so I noticed the GC when it kicked in periodically. However after I made it a point to null objects the GC stopped pausing my application.
So it depends on what your doing...
Yes.
From "The Pragmatic Programmer" p.292:
By setting a reference to NULL you reduce the number of pointers to the object by one ... (which will allow the garbage collector to remove it)
I assume the OP is referring to things like this:
private void Blah()
{
MyObj a;
MyObj b;
try {
a = new MyObj();
b = new MyObj;
// do real work
} finally {
a = null;
b = null;
}
}
In this case, wouldn't the VM mark them for GC as soon as they leave scope anyway?
Or, from another perspective, would explicitly setting the items to null cause them to get GC'd before they would if they just went out of scope? If so, the VM may spend time GC'ing the object when the memory isn't needed anyway, which would actually cause worse performance CPU usage wise because it would be GC'ing more earlier.
Even if nullifying the reference were marginally more efficient, would it be worth the ugliness of having to pepper your code with these ugly nullifications? They would only be clutter and obscure the intent code that contains them.
Its a rare codebase that has no better candidate for optimisation than trying to outsmart the Garbage collector (rarer still are developers who succeed in outsmarting it). Your efforts will most likely be better spent elsewhere instead, ditching that crufty Xml parser or finding some opportunity to cache computation. These optimisations will be easier to quantify and don't require you dirty up your codebase with noise.
Oracle doc point out "Assign null to Variables That Are No Longer Needed" https://docs.oracle.com/cd/E19159-01/819-3681/abebi/index.html
"It depends"
I do not know about Java but in .net (C#, VB.net...) it is usually not required to assign a null when you no longer require a object.
However note that it is "usually not required".
By analyzing your code the .net compiler makes a good valuation of the life time of the variable...to accurately tell when the object is not being used anymore. So if you write obj=null it might actually look as if the obj is still being used...in this case it is counter productive to assign a null.
There are a few cases where it might actually help to assign a null. One example is you have a huge code that runs for long time or a method that is running in a different thread, or some loop. In such cases it might help to assign null so that it is easy for the GC to know its not being used anymore.
There is no hard & fast rule for this. Going by the above place null-assigns in your code and do run a profiler to see if it helps in any way. Most probably you might not see a benefit.
If it is .net code you are trying to optimize, then my experience has been that taking good care with Dispose and Finalize methods is actually more beneficial than bothering about nulls.
Some references on the topic:
http://blogs.msdn.com/csharpfaq/archive/2004/03/26/97229.aspx
http://weblogs.asp.net/pwilson/archive/2004/02/20/77422.aspx
In the future execution of your program, the values of some data members will be used to computer an output visible external to the program. Others might or might not be used, depending on future (And impossible to predict) inputs to the program. Other data members might be guaranteed not to be used. All resources, including memory, allocated to those unused data are wasted. The job of the garbage collector (GC) is to eliminate that wasted memory. It would be disastrous for the GC to eliminate something that was needed, so the algorithm used might be conservative, retaining more than the strict minimum. It might use heuristic optimizations to improve its speed, at the cost of retaining some items that are not actually needed. There are many potential algorithms the GC might use. Therefore it is possible that changes you make to your program, and which do not affect the correctness of your program, might nevertheless affect the operation of the GC, either making it run faster to do the same job, or to sooner identify unused items. So this kind of change, setting an unusdd object reference to null, in theory is not always voodoo.
Is it voodoo? There are reportedly parts of the Java library code that do this. The writers of that code are much better than average programmers and either know, or cooperate with, programmers who know details of the garbage collector implementations. So that suggests there is sometimes a benefit.
As you said there are optimizations, i.e. JVM knows the place when the variable was last used and the object referenced by it can be GCed right after this last point (still executing in current scope). So nulling out references in most cases does not help GC.
But it can be useful to avoid "nepotism" (or "floating garbage") problem (read more here or watch video). The problem exists because heap is split into Old and Young generations and there are different GC mechanisms applied: Minor GC (which is fast and happens often to clean young gen) and Major Gc (which causes longer pause to clean Old gen). "Nepotism" does not allow for garbage in Young gen to be collected if it is referenced by garbage which was already tenured to an Old gen.
This is 'pathological' because ANY promoted node will result in the promotion of ALL following nodes until a GC resolves the issue.
To avoid nepotism it's a good idea to null out references from an object which is supposed to be removed. You can see this technique applied in JDK classes: LinkedList and LinkedHashMap
private E unlinkFirst(Node<E> f) {
final E element = f.item;
final Node<E> next = f.next;
f.item = null;
f.next = null; // help GC
// ...
}

Does setting Java objects to null do anything anymore?

I was browsing some old books and found a copy of "Practical Java" by Peter Hagger. In the performance section, there is a recommendation to set object references to null when no longer needed.
In Java, does setting object references to null improve performance or garbage collection efficiency? If so, in what cases is this an issue? Container classes? Object composition? Anonymous inner classes?
I see this in code pretty often. Is this now obsolete programming advice or is it still useful?
It depends a bit on when you were thinking of nulling the reference.
If you have an object chain A->B->C, then once A is not reachable, A, B and C will all be eligible for garbage collection (assuming nothing else is referring to either B or C). There's no need, and never has been any need, to explicitly set references A->B or B->C to null, for example.
Apart from that, most of the time the issue doesn't really arise, because in reality you're dealing with objects in collections. You should generally always be thinking of removing objects from lists, maps etc by calling the appropiate remove() method.
The case where there used to be some advice to set references to null was specifically in a long scope where a memory-intensive object ceased to be used partway through the scope. For example:
{
BigObject obj = ...
doSomethingWith(obj);
obj = null; <-- explicitly set to null
doSomethingElse();
}
The rationale here was that because obj is still in scope, then without the explicit nulling of the reference, it does not become garbage collectable until after the doSomethingElse() method completes. And this is the advice that probably no longer holds on modern JVMs: it turns out that the JIT compiler can work out at what point a given local object reference is no longer used.
No, it's not obsolete advice. Dangling references are still a problem, especially if you're, say, implementing an expandable array container (ArrayList or the like) using a pre-allocated array. Elements beyond the "logical" size of the list should be nulled out, or else they won't be freed.
See Effective Java 2nd ed, Item 6: Eliminate Obsolete Object References.
Instance fields, array elements
If there is a reference to an object, it cannot be garbage collected. Especially if that object (and the whole graph behind it) is big, there is only one reference that is stopping garbage collection, and that reference is not really needed anymore, that is an unfortunate situation.
Pathological cases are the object that retains an unnessary instance to the whole XML DOM tree that was used to configure it, the MBean that was not unregistered, or the single reference to an object from an undeployed web application that prevents a whole classloader from being unloaded.
So unless you are sure that the object that holds the reference itself will be garbage collected anyway (or even then), you should null out everything that you no longer need.
Scoped variables:
If you are considering setting a local variable to null before the end of its scope , so that it can be reclaimed by the garbage collector and to mark it as "unusable from now on", you should consider putting it in a more limited scope instead.
{
BigObject obj = ...
doSomethingWith(obj);
obj = null; // <-- explicitly set to null
doSomethingElse();
}
becomes
{
{
BigObject obj = ...
doSomethingWith(obj);
} // <-- obj goes out of scope
doSomethingElse();
}
Long, flat scopes are generally bad for legibility of the code, too. Introducing private methods to break things up just for that purpose is not unheard of, too.
In memory restrictive environments (e.g. cellphones) this can be useful. By setting null, the objetc don't need to wait the variable to get out of scope to be gc'd.
For the everyday programming, however, this shouldn't be the rule, except in special cases like the one Chris Jester-Young cited.
Firstly, It does not mean anything that you are setting a object to null. I explain it below:
List list1 = new ArrayList();
List list2 = list1;
In above code segment we are creating the object reference variable name list1 of ArrayList object that is stored in the memory. So list1 is referring that object and it nothing more than a variable. And in the second line of code we are copying the reference of list1 to list2. So now going back to your question if I do:
list1 = null;
that means list1 is no longer referring any object that is stored in the memory so list2 will also having nothing to refer. So if you check the size of list2:
list2.size(); //it gives you 0
So here the concept of garbage collector arrives which says «you nothing to worry about freeing the memory that is hold by the object, I will do that when I find that it will no longer used in program and JVM will manage me.»
I hope it clear the concept.
One of the reasons to do so is to eliminate obsolete object references.
You can read the text here.

Categories