Related
In my code there are many objects created and deleted during runtime which can have long reference chains (i.e. be reachable in terms of garbage collection from many places).
In my unit test I want to verify that when I "delete" an object, e.g. remove it from the list that primarily contains it, all references to that object are deleted. To be clear: I want to verify that no reference to this object exists anymore in my application.
How can I achieve this?
So far I came only up with this (testing that a PhantomReference is enqueued):
#Test
public void test_objectX_can_be_garbage_collected() throws Exception {
ReferenceQueue<MyClass> refsToBeDeleted = new ReferenceQueue<>();
MyClass obj = new MyClass();
PhantomReference<MyClass> phantomRef = new PhantomReference<MyClass>(obj, refsToBeDeleted);
obj = null; // delete last strong reference to obj
System.out.println("Reference is enqueued:" + phantomRef.isEnqueued()); // "false"
System.gc(); // invoke garbage collection
Thread.sleep(1000L); // required - see text
System.gc(); // invoke garbage collection again
System.out.println("Reference is enqueued:" + phantomRef.isEnqueued()); // true
}
with
public static class MyClass {
}
The doc says:
public static void gc()
Runs the garbage collector.
Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse. When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all discarded objects.
(emphasis mine)
This is quite vague ("best effort") and seems quite non-deterministic, e.g. I have to either wait 1 second and call gc() again, or call it e.g. four times in a row for the reference to be enqueued.
Is there any way to do this properly? I want to test that objects don't persist and will eventually flood my memory after months of uptime.
Remarks:
1.) I know you should not call gc() because it's bad code. True, but I am calling it from test code to verify production code.
2.) I know that gc() is actually not deterministic from a local point of view. I'm aware of this and asking for a better alternative.
Edit:
I realized that this can be a kind of "special" unit test: If it succeeds once for a given code base, I know that this reference is garbage collected; if the unit test fails it might be because gc() didn't want to delete it, or it could not be deleted. So this test can verify that the memory is released, but it might need several attempts for that verification. However, this is "special" (read: "bad") because I would have to accept a randomly failing unit test.
AFAIK there is no way to make sure that an object has been garbage collected. Calling System.gc() will usually result in garbage collection, but not always.
I also have a infinitely running Java process - I call System.gc() at the beginning of each iteration and record the resulting amount of used memory. If this amount does not significantly change over time, I can be reasonably sure not to "leak" memory.
I've observed that when a heap dump is triggered, it usually does a full GC. If that is a policy of JVM to force GC during a Heap collection, then you can trigger a heap collection by
kill -3 pid
and see if it works.
You might need to add jvm arguments for getting a heap dump.
I do not see a problem using reference and System.gc() for testing (though I prefer weak references instead phantom ones). I do not remember any stability problems for such approach.
Reference based approach work only if you can get reference to object which may be leaked.
I have forked version of head dump analysis library from NetBeans, which I use to automate certain types of memory usage reports.
It is possible, for test code, to take heap dump of itself and inspect this dump assert specific number of objects of given type to be present.
Here you can find example of such approach.
Does assigning an unused object reference to null in Java improve the garbage collection process in any measurable way?
My experience with Java (and C#) has taught me that is often counter intuitive to try and outsmart the virtual machine or JIT compiler, but I've seen co-workers use this method and I am curious if this is a good practice to pick up or one of those voodoo programming superstitions?
Typically, no.
But like all things: it depends. The GC in Java these days is VERY good and everything should be cleaned up very shortly after it is no longer reachable. This is just after leaving a method for local variables, and when a class instance is no longer referenced for fields.
You only need to explicitly null if you know it would remain referenced otherwise. For example an array which is kept around. You may want to null the individual elements of the array when they are no longer needed.
For example, this code from ArrayList:
public E remove(int index) {
RangeCheck(index);
modCount++;
E oldValue = (E) elementData[index];
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // Let gc do its work
return oldValue;
}
Also, explicitly nulling an object will not cause an object to be collected any sooner than if it just went out of scope naturally as long as no references remain.
Both:
void foo() {
Object o = new Object();
/// do stuff with o
}
and:
void foo() {
Object o = new Object();
/// do stuff with o
o = null;
}
Are functionally equivalent.
In my experience, more often than not, people null out references out of paranoia not out of necessity. Here is a quick guideline:
If object A references object B and you no longer need this reference and object A is not eligible for garbage collection then you should explicitly null out the field. There is no need to null out a field if the enclosing object is getting garbage collected anyway. Nulling out fields in a dispose() method is almost always useless.
There is no need to null out object references created in a method. They will get cleared automatically once the method terminates. The exception to this rule is if you're running in a very long method or some massive loop and you need to ensure that some references get cleared before the end of the method. Again, these cases are extremely rare.
I would say that the vast majority of the time you will not need to null out references. Trying to outsmart the garbage collector is useless. You will just end up with inefficient, unreadable code.
Good article is today's coding horror.
The way GC's work is by looking for objects that do not have any pointers to them, the area of their search is heap/stack and any other spaces they have. So if you set a variable to null, the actual object is now not pointed by anyone, and hence could be GC'd.
But since the GC might not run at that exact instant, you might not actually be buying yourself anything. But if your method is fairly long (in terms of execution time) it might be worth it since you will be increasing your chances of GC collecting that object.
The problem can also be complicated with code optimizations, if you never use the variable after you set it to null, it would be a safe optimization to remove the line that sets the value to null (one less instruction to execute). So you might not actually be getting any improvement.
So in summary, yes it can help, but it will not be deterministic.
At least in java, it's not voodoo programming at all. When you create an object in java using something like
Foo bar = new Foo();
you do two things: first, you create a reference to an object, and second, you create the Foo object itself. So long as that reference or another exists, the specific object can't be gc'd. however, when you assign null to that reference...
bar = null ;
and assuming nothing else has a reference to the object, it's freed and available for gc the next time the garbage collector passes by.
It depends.
Generally speaking shorter you keep references to your objects, faster they'll get collected.
If your method takes say 2 seconds to execute and you don't need an object anymore after one second of method execution, it makes sense to clear any references to it. If GC sees that after one second, your object is still referenced, next time it might check it in a minute or so.
Anyway, setting all references to null by default is to me premature optimization and nobody should do it unless in specific rare cases where it measurably decreases memory consuption.
Explicitly setting a reference to null instead of just letting the variable go out of scope, does not help the garbage collector, unless the object held is very large, where setting it to null as soon as you are done with is a good idea.
Generally setting references to null, mean to the READER of the code that this object is completely done with and should not be concerned about any more.
A similar effect can be achieved by introducing a narrower scope by putting in an extra set of braces
{
int l;
{ // <- here
String bigThing = ....;
l = bigThing.length();
} // <- and here
}
this allows the bigThing to be garbage collected right after leaving the nested braces.
public class JavaMemory {
private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6);
public void f() {
{
byte[] data = new byte[dataSize];
//data = null;
}
byte[] data2 = new byte[dataSize];
}
public static void main(String[] args) {
JavaMemory jmp = new JavaMemory();
jmp.f();
}
}
Above program throws OutOfMemoryError. If you uncomment data = null;, the OutOfMemoryError is solved. It is always good practice to set the unused variable to null
I was working on a video conferencing application one time and noticed a huge huge huge difference in performance when I took the time to null references as soon as I didn't need the object anymore. This was in 2003-2004 and I can only imagine the GC has gotten even smarter since. In my case I had hundreds of objects coming and going out of scope every second, so I noticed the GC when it kicked in periodically. However after I made it a point to null objects the GC stopped pausing my application.
So it depends on what your doing...
Yes.
From "The Pragmatic Programmer" p.292:
By setting a reference to NULL you reduce the number of pointers to the object by one ... (which will allow the garbage collector to remove it)
I assume the OP is referring to things like this:
private void Blah()
{
MyObj a;
MyObj b;
try {
a = new MyObj();
b = new MyObj;
// do real work
} finally {
a = null;
b = null;
}
}
In this case, wouldn't the VM mark them for GC as soon as they leave scope anyway?
Or, from another perspective, would explicitly setting the items to null cause them to get GC'd before they would if they just went out of scope? If so, the VM may spend time GC'ing the object when the memory isn't needed anyway, which would actually cause worse performance CPU usage wise because it would be GC'ing more earlier.
Even if nullifying the reference were marginally more efficient, would it be worth the ugliness of having to pepper your code with these ugly nullifications? They would only be clutter and obscure the intent code that contains them.
Its a rare codebase that has no better candidate for optimisation than trying to outsmart the Garbage collector (rarer still are developers who succeed in outsmarting it). Your efforts will most likely be better spent elsewhere instead, ditching that crufty Xml parser or finding some opportunity to cache computation. These optimisations will be easier to quantify and don't require you dirty up your codebase with noise.
Oracle doc point out "Assign null to Variables That Are No Longer Needed" https://docs.oracle.com/cd/E19159-01/819-3681/abebi/index.html
"It depends"
I do not know about Java but in .net (C#, VB.net...) it is usually not required to assign a null when you no longer require a object.
However note that it is "usually not required".
By analyzing your code the .net compiler makes a good valuation of the life time of the variable...to accurately tell when the object is not being used anymore. So if you write obj=null it might actually look as if the obj is still being used...in this case it is counter productive to assign a null.
There are a few cases where it might actually help to assign a null. One example is you have a huge code that runs for long time or a method that is running in a different thread, or some loop. In such cases it might help to assign null so that it is easy for the GC to know its not being used anymore.
There is no hard & fast rule for this. Going by the above place null-assigns in your code and do run a profiler to see if it helps in any way. Most probably you might not see a benefit.
If it is .net code you are trying to optimize, then my experience has been that taking good care with Dispose and Finalize methods is actually more beneficial than bothering about nulls.
Some references on the topic:
http://blogs.msdn.com/csharpfaq/archive/2004/03/26/97229.aspx
http://weblogs.asp.net/pwilson/archive/2004/02/20/77422.aspx
In the future execution of your program, the values of some data members will be used to computer an output visible external to the program. Others might or might not be used, depending on future (And impossible to predict) inputs to the program. Other data members might be guaranteed not to be used. All resources, including memory, allocated to those unused data are wasted. The job of the garbage collector (GC) is to eliminate that wasted memory. It would be disastrous for the GC to eliminate something that was needed, so the algorithm used might be conservative, retaining more than the strict minimum. It might use heuristic optimizations to improve its speed, at the cost of retaining some items that are not actually needed. There are many potential algorithms the GC might use. Therefore it is possible that changes you make to your program, and which do not affect the correctness of your program, might nevertheless affect the operation of the GC, either making it run faster to do the same job, or to sooner identify unused items. So this kind of change, setting an unusdd object reference to null, in theory is not always voodoo.
Is it voodoo? There are reportedly parts of the Java library code that do this. The writers of that code are much better than average programmers and either know, or cooperate with, programmers who know details of the garbage collector implementations. So that suggests there is sometimes a benefit.
As you said there are optimizations, i.e. JVM knows the place when the variable was last used and the object referenced by it can be GCed right after this last point (still executing in current scope). So nulling out references in most cases does not help GC.
But it can be useful to avoid "nepotism" (or "floating garbage") problem (read more here or watch video). The problem exists because heap is split into Old and Young generations and there are different GC mechanisms applied: Minor GC (which is fast and happens often to clean young gen) and Major Gc (which causes longer pause to clean Old gen). "Nepotism" does not allow for garbage in Young gen to be collected if it is referenced by garbage which was already tenured to an Old gen.
This is 'pathological' because ANY promoted node will result in the promotion of ALL following nodes until a GC resolves the issue.
To avoid nepotism it's a good idea to null out references from an object which is supposed to be removed. You can see this technique applied in JDK classes: LinkedList and LinkedHashMap
private E unlinkFirst(Node<E> f) {
final E element = f.item;
final Node<E> next = f.next;
f.item = null;
f.next = null; // help GC
// ...
}
Java programmers know that JVM runs a Garbage Collector, and System.gc() would just be a suggestion to JVM to run a Garbage Collector. It is not necessarily that if we use System.gc(), it would immediately run the GC.Please correct me if I misunderstand Java's Garbage Collector.
Is/are there any other way/s doing memory management other than relying on Java's Garbage Collector?If you intend to answer the question by some sort of programming practice that would help managing the memory, please do so.
The most important thing to remember about Java memory management is "nullify" your reference.
Only objects that are not referenced are to be garbage collected.
For example, objects in the following code is never get collected and your memory will be full just to do nothing.
List objs = new ArrayList();
for (int i = 0; i < Integer.MAX_VALUE; i++) objs.add(new Object());
But if you don't reference those object ... you can loop as much as you like without memory problem.
List objs = new ArrayList();
for (int i = 0; i < Integer.MAX_VALUE; i++) new Object();
So what ever you do, make sure you remove reference to object to no longer used (set reference to null or clear collection).
When the garbage collector will run is best left to JVM to decide. Well unless your program is about to start doing things that use a lot of memory and is speed critical so you may suggest JVM to run GC before going in as you may likely get the garbaged collected and extra memory to go on. Other wise, I personally see no reason to run System.gc().
Hope this helps.
Below is little summary I wrote back in the days (I stole it from some blog, but I can't remember where from - so no reference, sorry)
There is no manual way of doing garbage collection in Java.
Java Heap is divided into three generation for the sake of garbage collection. These are the young generation, tenured or old generation, and Perm area.
New objects are created in the young generation and subsequently moved to the old generation.
String pool is created in Perm area of Heap, Garbage collection can occur in perm space but depends on upon JVM to JVM.
Minor garbage collection is used to move an object from Eden space to Survivor 1 and Survivor 2 space, and Major collection is used to move an object from young to tenured generation.
Whenever Major garbage collection occurs application, threads stops during that period which will reduce application’s performance and throughput.
There are few performance improvements has been applied in garbage collection in Java 6 and we usually use JRE 1.6.20 for running our application.
JVM command line options -Xms and -Xmx is used to setup starting and max size for Java Heap. The ideal ratio of this parameter is either 1:1 or 1:1.5 based on my experience, for example, you can have either both –Xmx and –Xms as 1GB or –Xms 1.2 GB and 1.8 GB.
Command line options: -Xms:<min size> -Xmx:<max size>
Just to add to the discussion: Garbage Collection is not the only form of Memory Management in Java.
In the past, there have been efforts to avoid the GC in Java when implementing the memory management (see Real-time Specification for Java (RTSJ)). These efforts were mainly dedicated to real-time and embedded programming in Java for which GC was not suitable - due to performance overhead or GC-introduced latency.
The RTSJ characteristics
Immortal and Scoped Memory Management - see below for examples.
GC and Immortal/Scoped Memory can coexist withing one application
RTSJ requires a specially modified JVM.
RTSJ advantages:
low latency, no GC pauses
delivers predictable performance that is able to meet real-time system requirements
Why RTSJ failed/Did not make a big impact:
Scoped Memory concept is hard to program with, error-prone and difficult to learn.
Advance in Real-time GC algoritms reduced the GC pause-time in such way that Real-time GCs replaced the RTSJ in most of the real-time apps. However, Scoped Memories are still used in places where no latencies are tolerated.
Scoped Memory Code Example (take from An Example of Scoped Memory Usage):
import javax.realtime.*;
public class ScopedMemoryExample{
private LTMemory myMem;
public ScopedMemoryExample(int Size) {
// initialize memory
myMem = new LTMemory(1000, 5000);
}
public void periodicTask() {
while (true)) {
myMem.enter(new Runnable() {
public void run() {
// do some work in the SCOPED MEMORY
new Object();
...
// end of the enter() method, the scoped Memory is emptied.
}
});
}
}
}
Here, a ScopedMemory implementation called LTMemory is preallocated. Then a thread enters the scoped memory, allocates the temporary data that are needed only during the time of the computation. After the end of the computation, the thread leaves the scoped memory which immediately makes the whole content of the specific ScopedMemory to be emptied. No latency introduced, done in constant time e.g. predictable time, no GC is triggered.
From my experience, in java you should rely on the memory management that is provided by JVM itself.
The point I'd focus on in this topic is to configure it in a way acceptable for your use case. Maybe checking/understanding JVM tuning options would be useful: http://docs.oracle.com/cd/E15523_01/web.1111/e13814/jvm_tuning.htm
You cannot avoid garbage collection if you use Java. Maybe there are some obscure JVM implementations that do, but I don't know of any.
A properly tuned JVM shouldn't require any System.gc() hints to operate smoothly. The exact tuning you would need depends heavily on what your application does, but in my experience, I always turn on the concurrent-mark-and-sweep option with the following flag: -XX:+UseConcMarkSweepGC. This flag allows the JVM to take advantage of the extra cores in your CPU to clean up dead memory on a background thread. It helps to drastically reduce the amount of time your program is forcefully paused when doing garbage collections.
Well, the GC is always there -- you can't create objects that are outside its grasp (unless you use native calls or allocate a direct byte buffer, but in the latter case you don't really have an object, just a bunch of bytes). That said, it's definitely possible to circumvent the GC by reusing objects. For instance, if you need a bunch of ArrayList objects, you could just create each one as you need it and let the GC handle memory management; or you could call list.clear() on each one after you finish with it, and put it onto some queue where somebody else can use it.
Standard best practices are to not do that sort of reuse unless you have good reason to (ie, you've profiled and seen that the allocations + GC are a problem, and that reusing objects fixes that problem). It leads to more complicated code, and if you get it wrong it can actually make the GC's job harder (because of how the GC tracks objects).
Basically the idea in Java is that you should not deal with memory except using "new" to allocate new objects and ensure that there is no references left to objects when you are done with them.
All the rest is deliberately left to the Java Runtime and is - also deliberately - defined as vaguely as possible to allow the JVM designers the most freedom in doing so efficiently.
To use an analogy: Your operating system manages named areas of harddisk space (called "files") for you. Including deleting and reusing areas you do not want to use any more. You do not circumvent that mechanism but leave it to the operating system
You should focus on writing clear, simple code and ensure that your objects are properly done with. This will give the JVM the best possible working conditions.
You are correct in saying that System.gc() is a request to the compiler and not a command. But using below program you can make sure it happens.
import java.lang.ref.WeakReference;
public class GCRun {
public static void main(String[] args) {
String str = new String("TEMP");
WeakReference<String> wr = new WeakReference<String>(str);
str = null;
String temp = wr.get();
System.out.println("temp -- " + temp);
while(wr.get() != null) {
System.gc();
}
}
}
I would suggest to take a look at the following tutorials and its contents
This is a four part tutorial series to know about the basics of garbage collection in Java:
Java Garbage Collection Introduction
How Java Garbage Collection Works?
Types of Java Garbage Collectors
Monitoring and Analyzing Java Garbage Collection
I found This tutorial very helpful.
"Nullify"ing the reference when not required is the best way to make an object eligible for Garbage collection.
There are 4 ways in which an object can be Garbage collected.
Point the reference to null, once it is no longer required.
String s = new String("Java");
Once this String is not required, you can point it to null.
s = null;
Hence, s will be eligible for Garbage collection.
Point one object to another, so that both reference points to same object and one of the object is eligible for GC.
String s1 = new String("Java");
String s2 = new String("C++");
In future if s2 also needs to pointed to s1 then;
s1 = s2;
Then the object having "Java" will be eligible for GC.
All the objects created within a method are eligible for GC once the method is completed. Hence, once the method is destroyed from the stack of the thread then the corresponding objects in that method will be destroyed.
Island of Isolation is another concept where the objects with internal links and no extrinsic link to reference is eligible for Garbage collection.
"Island of isolation" of Garbage Collection
Examples:
Below is a method of Camera class in android. See how the developer has pointed mCameraSource to null once it is not required. This is expert level code.
public void release() {
if (mCameraSource != null) {
mCameraSource.release();
mCameraSource = null;
}
}
How Garbage Collector works?
Garbage collection is performed by the daemon thread called Garbage Collector. When there is sufficient memory available that time this demon thread has low priority and it runs in background. But when JVM finds that the heap is full and JVM wants to reclaim some memory then it increases the priority of Garbage collector thread and calls Runtime.getRuntime.gc() method which searches for all the objects which are not having reference or null reference and destroys those objects.
Does assigning an unused object reference to null in Java improve the garbage collection process in any measurable way?
My experience with Java (and C#) has taught me that is often counter intuitive to try and outsmart the virtual machine or JIT compiler, but I've seen co-workers use this method and I am curious if this is a good practice to pick up or one of those voodoo programming superstitions?
Typically, no.
But like all things: it depends. The GC in Java these days is VERY good and everything should be cleaned up very shortly after it is no longer reachable. This is just after leaving a method for local variables, and when a class instance is no longer referenced for fields.
You only need to explicitly null if you know it would remain referenced otherwise. For example an array which is kept around. You may want to null the individual elements of the array when they are no longer needed.
For example, this code from ArrayList:
public E remove(int index) {
RangeCheck(index);
modCount++;
E oldValue = (E) elementData[index];
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null; // Let gc do its work
return oldValue;
}
Also, explicitly nulling an object will not cause an object to be collected any sooner than if it just went out of scope naturally as long as no references remain.
Both:
void foo() {
Object o = new Object();
/// do stuff with o
}
and:
void foo() {
Object o = new Object();
/// do stuff with o
o = null;
}
Are functionally equivalent.
In my experience, more often than not, people null out references out of paranoia not out of necessity. Here is a quick guideline:
If object A references object B and you no longer need this reference and object A is not eligible for garbage collection then you should explicitly null out the field. There is no need to null out a field if the enclosing object is getting garbage collected anyway. Nulling out fields in a dispose() method is almost always useless.
There is no need to null out object references created in a method. They will get cleared automatically once the method terminates. The exception to this rule is if you're running in a very long method or some massive loop and you need to ensure that some references get cleared before the end of the method. Again, these cases are extremely rare.
I would say that the vast majority of the time you will not need to null out references. Trying to outsmart the garbage collector is useless. You will just end up with inefficient, unreadable code.
Good article is today's coding horror.
The way GC's work is by looking for objects that do not have any pointers to them, the area of their search is heap/stack and any other spaces they have. So if you set a variable to null, the actual object is now not pointed by anyone, and hence could be GC'd.
But since the GC might not run at that exact instant, you might not actually be buying yourself anything. But if your method is fairly long (in terms of execution time) it might be worth it since you will be increasing your chances of GC collecting that object.
The problem can also be complicated with code optimizations, if you never use the variable after you set it to null, it would be a safe optimization to remove the line that sets the value to null (one less instruction to execute). So you might not actually be getting any improvement.
So in summary, yes it can help, but it will not be deterministic.
At least in java, it's not voodoo programming at all. When you create an object in java using something like
Foo bar = new Foo();
you do two things: first, you create a reference to an object, and second, you create the Foo object itself. So long as that reference or another exists, the specific object can't be gc'd. however, when you assign null to that reference...
bar = null ;
and assuming nothing else has a reference to the object, it's freed and available for gc the next time the garbage collector passes by.
It depends.
Generally speaking shorter you keep references to your objects, faster they'll get collected.
If your method takes say 2 seconds to execute and you don't need an object anymore after one second of method execution, it makes sense to clear any references to it. If GC sees that after one second, your object is still referenced, next time it might check it in a minute or so.
Anyway, setting all references to null by default is to me premature optimization and nobody should do it unless in specific rare cases where it measurably decreases memory consuption.
Explicitly setting a reference to null instead of just letting the variable go out of scope, does not help the garbage collector, unless the object held is very large, where setting it to null as soon as you are done with is a good idea.
Generally setting references to null, mean to the READER of the code that this object is completely done with and should not be concerned about any more.
A similar effect can be achieved by introducing a narrower scope by putting in an extra set of braces
{
int l;
{ // <- here
String bigThing = ....;
l = bigThing.length();
} // <- and here
}
this allows the bigThing to be garbage collected right after leaving the nested braces.
public class JavaMemory {
private final int dataSize = (int) (Runtime.getRuntime().maxMemory() * 0.6);
public void f() {
{
byte[] data = new byte[dataSize];
//data = null;
}
byte[] data2 = new byte[dataSize];
}
public static void main(String[] args) {
JavaMemory jmp = new JavaMemory();
jmp.f();
}
}
Above program throws OutOfMemoryError. If you uncomment data = null;, the OutOfMemoryError is solved. It is always good practice to set the unused variable to null
I was working on a video conferencing application one time and noticed a huge huge huge difference in performance when I took the time to null references as soon as I didn't need the object anymore. This was in 2003-2004 and I can only imagine the GC has gotten even smarter since. In my case I had hundreds of objects coming and going out of scope every second, so I noticed the GC when it kicked in periodically. However after I made it a point to null objects the GC stopped pausing my application.
So it depends on what your doing...
Yes.
From "The Pragmatic Programmer" p.292:
By setting a reference to NULL you reduce the number of pointers to the object by one ... (which will allow the garbage collector to remove it)
I assume the OP is referring to things like this:
private void Blah()
{
MyObj a;
MyObj b;
try {
a = new MyObj();
b = new MyObj;
// do real work
} finally {
a = null;
b = null;
}
}
In this case, wouldn't the VM mark them for GC as soon as they leave scope anyway?
Or, from another perspective, would explicitly setting the items to null cause them to get GC'd before they would if they just went out of scope? If so, the VM may spend time GC'ing the object when the memory isn't needed anyway, which would actually cause worse performance CPU usage wise because it would be GC'ing more earlier.
Even if nullifying the reference were marginally more efficient, would it be worth the ugliness of having to pepper your code with these ugly nullifications? They would only be clutter and obscure the intent code that contains them.
Its a rare codebase that has no better candidate for optimisation than trying to outsmart the Garbage collector (rarer still are developers who succeed in outsmarting it). Your efforts will most likely be better spent elsewhere instead, ditching that crufty Xml parser or finding some opportunity to cache computation. These optimisations will be easier to quantify and don't require you dirty up your codebase with noise.
Oracle doc point out "Assign null to Variables That Are No Longer Needed" https://docs.oracle.com/cd/E19159-01/819-3681/abebi/index.html
"It depends"
I do not know about Java but in .net (C#, VB.net...) it is usually not required to assign a null when you no longer require a object.
However note that it is "usually not required".
By analyzing your code the .net compiler makes a good valuation of the life time of the variable...to accurately tell when the object is not being used anymore. So if you write obj=null it might actually look as if the obj is still being used...in this case it is counter productive to assign a null.
There are a few cases where it might actually help to assign a null. One example is you have a huge code that runs for long time or a method that is running in a different thread, or some loop. In such cases it might help to assign null so that it is easy for the GC to know its not being used anymore.
There is no hard & fast rule for this. Going by the above place null-assigns in your code and do run a profiler to see if it helps in any way. Most probably you might not see a benefit.
If it is .net code you are trying to optimize, then my experience has been that taking good care with Dispose and Finalize methods is actually more beneficial than bothering about nulls.
Some references on the topic:
http://blogs.msdn.com/csharpfaq/archive/2004/03/26/97229.aspx
http://weblogs.asp.net/pwilson/archive/2004/02/20/77422.aspx
In the future execution of your program, the values of some data members will be used to computer an output visible external to the program. Others might or might not be used, depending on future (And impossible to predict) inputs to the program. Other data members might be guaranteed not to be used. All resources, including memory, allocated to those unused data are wasted. The job of the garbage collector (GC) is to eliminate that wasted memory. It would be disastrous for the GC to eliminate something that was needed, so the algorithm used might be conservative, retaining more than the strict minimum. It might use heuristic optimizations to improve its speed, at the cost of retaining some items that are not actually needed. There are many potential algorithms the GC might use. Therefore it is possible that changes you make to your program, and which do not affect the correctness of your program, might nevertheless affect the operation of the GC, either making it run faster to do the same job, or to sooner identify unused items. So this kind of change, setting an unusdd object reference to null, in theory is not always voodoo.
Is it voodoo? There are reportedly parts of the Java library code that do this. The writers of that code are much better than average programmers and either know, or cooperate with, programmers who know details of the garbage collector implementations. So that suggests there is sometimes a benefit.
As you said there are optimizations, i.e. JVM knows the place when the variable was last used and the object referenced by it can be GCed right after this last point (still executing in current scope). So nulling out references in most cases does not help GC.
But it can be useful to avoid "nepotism" (or "floating garbage") problem (read more here or watch video). The problem exists because heap is split into Old and Young generations and there are different GC mechanisms applied: Minor GC (which is fast and happens often to clean young gen) and Major Gc (which causes longer pause to clean Old gen). "Nepotism" does not allow for garbage in Young gen to be collected if it is referenced by garbage which was already tenured to an Old gen.
This is 'pathological' because ANY promoted node will result in the promotion of ALL following nodes until a GC resolves the issue.
To avoid nepotism it's a good idea to null out references from an object which is supposed to be removed. You can see this technique applied in JDK classes: LinkedList and LinkedHashMap
private E unlinkFirst(Node<E> f) {
final E element = f.item;
final Node<E> next = f.next;
f.item = null;
f.next = null; // help GC
// ...
}
Hopefully a simple question. Take for instance a Circularly-linked list:
class ListContainer
{
private listContainer next;
<..>
public void setNext(listContainer next)
{
this.next = next;
}
}
class List
{
private listContainer entry;
<..>
}
Now since it's a circularly-linked list, when a single elemnt is added, it has a reference to itself in it's next variable. When deleting the only element in the list, entry is set to null. Is there a need to set ListContainer.next to null as well for Garbage Collector to free it's memory or does it handle such self-references automagically?
Garbage collectors which rely solely on reference counting are generally vulnerable to failing to collection self-referential structures such as this. These GCs rely on a count of the number of references to the object in order to calculate whether a given object is reachable.
Non-reference counting approaches apply a more comprehensive reachability test to determine whether an object is eligible to be collected. These systems define an object (or set of objects) which are always assumed to be reachable. Any object for which references are available from this object graph is considered ineligible for collection. Any object not directly accessible from this object is not. Thus, cycles do not end up affecting reachability, and can be collected.
See also, the Wikipedia page on tracing garbage collectors.
Circular references is a (solvable) problem if you rely on counting the references in order to decide whether an object is dead. No java implementation uses reference counting, AFAIK. Newer Sun JREs uses a mix of several types of GC, all mark-and-sweep or copying I think.
You can read more about garbage collection in general at Wikipedia, and some articles about java GC here and here, for example.
The actual answer to this is implementation dependent. The Sun JVM keeps track of some set of root objects (threads and the like), and when it needs to do a garbage collection, traces out which objects are reachable from those and saves them, discarding the rest. It's actually more complicated than that to allow for some optimizations, but that is the basic principle. This version does not care about circular references: as long as no live object holds a reference to a dead one, it can be GCed.
Other JVMs can use a method known as reference counting. When a reference is created to the object, some counter is incremented, and when the reference goes out of scope, the counter is decremented. If the counter reaches zero, the object is finalized and garbage collected. This version, however, does allow for the possibility of circular references that would never be garbage collected. As a safeguard, many such JVMs include a backup method to determine which objects actually are dead which it runs periodically to resolve self-references and defrag the heap.
As a non-answer aside (the existing answers more than suffice), you might want to check out a whitepaper on the JVM garbage collection system if you are at all interested in GC. (Any, just google JVM Garbage Collection)
I was amazed at some of the techniques used, and when reading through some of the concepts like "Eden" I really realized for the first time that Java and the JVM actually could beat C/C++ in speed. (Whenever C/C++ frees an object/block of memory, code is involved... When Java frees an object, it actually doesn't do anything at all; since in good OO code, most objects are created and freed almost immediately, this is amazingly efficient.)
Modern GC's tend to be very efficient, managing older objects much differently than new objects, being able to control GCs to be short and half-assed or long and thorough, and a lot of GC options can be managed by command line switches so it's actually useful to know what all the terms actually refer to.
Note: I just realized this was misleading. C++'s STACK allocation is very fast--my point was about allocating objects that are able to exist after the current routine has finished (which I believe SHOULD be all objects--it's something you shouldn't have to think about if you are going to think in OO, but in C++ speed may make this impractical).
If you are only allocating C++ classes on the stack, it's allocation will be at least as fast as Java's.
Java collects any objects that are not reachable. If nothing else has a reference to the entry, then it will be collected, even though it has a reference to itself.
yes Java Garbage collector handle self-reference!
How?
There are special objects called called garbage-collection roots (GC roots). These are always reachable and so is any object that has them at its own root.
A simple Java application has the following GC roots:
Local variables in the main method
The main thread
Static variables of the main class
To determine which objects are no longer in use, the JVM intermittently runs what is very aptly called a mark-and-sweep algorithm. It works as follows
The algorithm traverses all object references, starting with the GC
roots, and marks every object found as alive.
All of the heap memory that is not occupied by marked objects is
reclaimed. It is simply marked as free, essentially swept free of
unused objects.
So if any object is not reachable from the GC roots(even if it is self-referenced or cyclic-referenced) it will be subjected to garbage collection.
Simply, Yes. :)
Check out http://www.ibm.com/developerworks/java/library/j-jtp10283/
All JDKs (from Sun) have a concept of "reach-ability". If the GC cannot "reach" an object, it goes away.
This isn't any "new" info (your first to respondents are great) but the link is useful, and brevity is something sweet. :)