I'm battling out of memory issues with my application and am trying to get my head around garbage collection. If I have the following code:
public void someMethod() {
MyObject myObject = new MyObject();
myObject.doSomething(); //last use of myObject in this scope
doAnotherThing();
andEvenMoreThings();
}
So my question is, will myObject be available for garbage collection after myObject.doSomething() which is the last use of this object, or after the completion of someMethod() where it comes out of scope? I.e. is the garbage collection smart enough to see that though a local variable is still in scope, it won't be used by the rest of the code?
"Where it comes out of scope"
public void someMethod() {
MyObject myObject = new MyObject();
myObject.doSomething(); //last use of myObject in this scope
myObject = null; //Now available for gc
doAnotherThing();
andEvenMoreThings();
}
The best thing you can do is to take your code put it in a loop with a delay and hook up a profiler to it.
If you are using a later version of Java then JVisualVm comes as standard.
If you are on windows and have JAVA_HOME set
%JAVA_HOME%/bin/jvisualvm
This will launch a profiler and you can see what objects are being collected and what are not. In my opinion this is an essential part of being a programmer and its fun to find the memory leaks.
Hope this helps
Btw in later java 6, there is type of escape analysis where JVM can find out that your instance of MyObject doesn't leave the method, so it even can place it entirely on stack and you will not need any GC for it at all.
So my question is, will myObject be available for garbage collection after myObject.doSomething() which is the last use of this object, or after the completion of someMethod() where it comes out of scope?
The former.
I.e. is the garbage collection smart enough to see that though a local variable is still in scope, it won't be used by the rest of the code?
Scope is not visible to the GC which sees only registers, stacks, global variables and references from heap blocks to other heap blocks. So scope is irrelevant.
After local scope is out. as object can be reused and can live in the local scope. ie. it is marked for gc. but not actually gc'ed .
Code optimization will probably notice where the last usage of myObject is and make it available for garbage collection there, however technically it will be until the variable no longer refers to it (by being assigned to something else) or goes out of scope.
Related
This question already has answers here:
Declaring variables inside or outside of a loop
(20 answers)
Closed 8 years ago.
I know similar question has been asked many times previously but I am still not convinced about when objects become eligible for GC and which approach is more efficient.
Approach one:
for (Item item : items) {
MyObject myObject = new MyObject();
//use myObject.
}
Approach Two:
MyObject myObject = null;
for (Item item : items) {
myObject = new MyObject();
//use myObject.
}
I understand: "By minimizing the scope of local variables, you increase the readability and maintainability of your code and reduce the likelihood of error". (Joshua Bloch).
But How about performance/memory consumption? In Java Objects are Garbage collected when there is no reference left to the object. If there are e.g. 100000 items then 100000 objects will be created. In Approach One each object will have a reference (myObject) to it so they are not eligible for GC?
Where as in Approach Two with every loop iteration you are removing reference from the object created in previous iteration. so surely objects start becoming eligible after the first loop iteration.
Or is it a trade off between performance and code readability & maintainability?
What have I misunderstood?
Note:
Assuming I care about performance and myObject is not needed after the loop.
Thanks In Advance
If there are e.g. 100000 items then 100000 objects will be created in Approach One and each object will have a reference (myObject) to it so they are not eligible for GC?
No, from Garbage Collector's point of view both the approaches work the same i.e. no memory is leaked. With approach two, as soon as the following statement runs
myObject = new MyObject();
the previous MyObject that was being referenced becomes an orphan (unless while using that Object you passed it around, say, to another method where that reference was saved) and is eligible for garbage collection.
The difference is that once the loop runs out you would have the last instance of MyObject still reachable through the myObject reference originally created outside the loop.
Does GC know when references go out of scope during the loop execution or it can only know at the end of method?
First of all there's only one reference, not references. It's the objects that are getting unreferenced in the loop. Secondly, the garbage collection doesn't kick in spontaneously. So forget the loop, it may not even happen when the method exits.
Notice that I said, orphan objects become eligible for gc, not that they get collected immediately. Garbage collection never happens in real time, it happens in phases. In the mark phase, all the objects that are not reachable through a live thread anymore are marked for deletion. Then in the sweep phase, memory is reclaimed and additionally compacted much like defragmenting a hard drive. So, it works more like a batch rather than piecemeal operations.
GC isn't bothered about scopes or methods as such. It only looks for unreferenced objects and it does so when it feels like doing it. You can't force it. The only thing that you can be sure of is that GC would run if the JVM is running out of memory but you can't pin exactly when it would do so.
But, all this does not mean that GC can't kick in while the method executes or even while the loop is running. If you had, say, a Message Processor that processed 10,000 messages every 10 mins or so and then slept in between i.e. the bean waits within the loop, does 10,000 iterations and then waits again; GC would definitely kick into action to reclaim memory even though the method hasn't run to completion yet.
You have misunderstood when objects become eligible for GC - they do this when they are no longer reachable from an active thread. In this context that means:
When the only reference to them goes out of scope (approach 1).
When the only reference to them is assigned another value (approach 2).
So, the instance of MyObject would be eligible for GC at the end of each loop iteration whichever approach was used. The difference (theoretically) between the two approaches is that the JVM would have to allocate memory for a new object reference each iteration in approach 1 but not in approach 2. However, this assumes the Java compiler and/or Just-In-Time compiler is not smart to optimise approach 1 to actually act like approach 2.
In any case, I would go for the more readable and less error prone approach 1 on the grounds that:
The performance overhead for a single object reference allocation is tiny.
It will probably get optimised away anyway.
In both approaches objects will get Garbage collected.
In Approach 1: As and when for loop exits , all the local variable inside for loop get Garbage collected , as the loop ends.
In Approach 2 : As when new new reference is assigned to myObject variable the earlier has no proper reference .So that earlier get garbage collected and so on until loop runs.
So in both approaches there is no performance bottle neck.
I would not expect declaring the variable inside a block to have a detrimental impact on performance.
At least notionally the JVM allocates the stack frame at the start of the method and destroys it at the end. By implication will have the cumulative size to accommodate all the local variables.
See section 2.6 in here:
http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-2.html
That is consistent with other languages such as C where resizing the stack frame as the function/method executes is an overhead with no apparent return.
So wherever you declare it shouldn't make a difference.
Indeed declaring variables in blocks may help the compiler realize that the effective size of the stack frame can be smaller:
void foo() {
int x=6;
int y=7;
int z=8;
//.....
}
Versus
void bar() {
{
int x=6;
//....
}
{
int y=7;
//....
}
{
int z=8;
//....
}
}
Notice that bar() clearly only needs one local variable not 3.
Though making the stack frame smaller is unlikely to have any real influence on performance!
However when a reference goes out of the scope may make the object it references available for garbage collection. You would otherwise need to set references to null which is an untidy and unnecessary bother (and tinsy weenie overhead).
Without question you should declare variables inside a loop if (and only if) you don't need to access them outside the loop.
IMHO blocked statements (like bar has above) are under used.
If a method proceeds in stages you can protect the later stages from variable pollution using blocks.
With suitable (short) comments it can often be more readable (and efficient) way of structuring code than breaking it down it lost of private methods.
I have a chunky algorithm (Hashlife) where making earlier artifacts available for garbage collection during the method can make the difference between getting to the end and getting OutOfMemoryError.
I am currently trying to avoid GC_CONCURRENT calls, so i'm running through my main loop.
I've noticed that i often create a complex object to do calculations.
So my question is would declaring that object as a field of the class opposed to declaring it in the methods that use it help performance?
Or because my english has probably hurt you brain, here's the code example as field
class myclass{
private MyObject myObject;
...
public void myLoopedMethod(...){
myObject = new MyObject(...);
myObject.dostuff;
}
Example in method
class myclass{
...
public void myLoopedMethod(...){
MyObject myObject = new MyObject(...);
myObject.dostuff;
}
The right scope would be the method, but my doubt is that by making it a field, the memory is always freed and allocated in the same spot. Is this true and does this help avoiding GC calls?
Also, i should probably do something like this, but I'm interested if the above logic makes sense.
class myclass{
private MyObject myObject;
...
public void myMethod(...){
myObject.setNewValues(...);
myObject.dostuff;
}
}
but my doubt is that by making it a field, the memory is always freed
and allocated in the same spot. Is this true and does this help
avoiding GC calls?
There is no guarantee that memory allocated in the same spot. It is implementation detail.
In your example case, if instance variable, all objects referenced by this instance variable will be eligible for GC except last object which still has a reference from the instance variable (last one will become eligible for GC when it has no reachable references).
In case of defining inside method, all objects referenced by this reference will become eligible for GC as soon as loop done.
So, better go with defining inside method unless you need reference to the object identified in loop.
Coming to question avoiding GC calls, I think both approaches will have almost same amount of GC activity. Unless you have real issue with memory I would suggest don't worry about memory allocation and GC, VMs are intelligent enough to take care that stuff.
Yes, if the object creation is inside the often-called method, it will result in more work for the garbage collector.
Always measure instead of speculating... The theoretical advantages might be insignificant.
I'm trying to monitor the gc activity in my app, using -verbosegc flag. I can see there are full and minor collections, but is there a way to determine (subscrbing to events/vm flags/whatever) which objects actually collected?
Thanks!
For general information about objects in memory I would suggest you look into jvisualvm (it is in the bin folder of your JDK). It has alot of useful information about what the VM is doing as your program runs, including information about the various objects, and memory state.
If you want something more specific you can use WeakReferences and ReferenceQueues. This option might be viable if you are only interested in objects of a few type. You can create a WeakReference to the objects as they are created with a common ReferenceQueue, and then have another thread check the Queue periodically (note that the queue only says the objects are reachable, not that they are actually collected):
static ReferenceQueue<MyObject> MY_QUEUE = new ReferenceQueue<MyObject>();
static class MyReference extends WeakReference<MyObject>{
public final String name;
public MyReference(MyObject o, ReferenceQueue<MyObject> q){
super(o, q);
name = o.toString();
}
}
static{
Thread t = new Thread(){
public void run(){
while(true){
MyReference r = (MyReference)MY_QUEUE.remove();
System.out.println(r.name+" eligible for collection");
}
}
}
t.setDaemon(true);
t.start();
}
public MyObject(){
//normal init...
new MyReference(this, MY_QUEUE);
}
I haven't used this flag myself -XX:-TraceClassUnloading. It's meant to Trace unloading of classes.
finalize method of the Object class is called just before the GC collects the object. Override the method in your class as follows:
#Override
protected void finalize() throws Throwable {
System.out.println(this+" collected");
super.finalize();
}
Note that you can only monitor your own classes with this method. So, since String is a final class, you cannot monitor a String object with this way.
I know this isn't how you'd like to solve your problem, but it might be useful anyway.
You can capture the event on your own objects by overriding the finalize method. It doesn't 100% guarantee that the object will be Garbage Collected since it can create references to itself, but it's a start.
Take a look at this article it's a pretty good GC Tutorial.
I'm not entirely sure why you need to know this. Most people want to know this to determine if they have memory leaks. (In java that means that you keep an object alive by keeping a reference to an object).
Netbeans has great tools to look at the memory use of any java application (also the ones not running from netbeans!) They can tell you how many objects have been collected and where your memory usage is going, and many more useful statistics.
To analyze memory problems you need to check which objects are not GCed, instead of checking which objects got GCed.
To check which objects are GCed you can always use any profiler like Jprofiler etc.
In general you cannot determine which object is reclaimed. You can however find get a subset of it, but you have to reference them using Weak, Soft and Phantom reference. In general what you do is create an object then reference it using one of those reference. See this article.
Today my colleagues and me have a discussion about the usage of the final keyword in Java to improve the garbage collection.
For example, if you write a method like:
public Double doCalc(final Double value)
{
final Double maxWeight = 1000.0;
final Double totalWeight = maxWeight * value;
return totalWeight;
}
Declaring the variables in the method final would help the garbage collection to clean up the memory from the unused variables in the method after the method exits.
Is this true?
Here's a slightly different example, one with final reference-type fields rather than final value-type local variables:
public class MyClass {
public final MyOtherObject obj;
}
Every time you create an instance of MyClass, you'll be creating an outgoing reference to a MyOtherObject instance, and the GC will have to follow that link to look for live objects.
The JVM uses a mark-sweep GC algorithm, which has to examine all the live refereces in the GC "root" locations (like all the objects in the current call stack). Each live object is "marked" as being alive, and any object referred to by a live object is also marked as being alive.
After the completion of the mark phase, the GC sweeps through the heap, freeing memory for all unmarked objects (and compacting the memory for the remaining live objects).
Also, it's important to recognize that the Java heap memory is partitioned into a "young generation" and an "old generation". All objects are initially allocated in the young generation (sometimes referred to as "the nursery"). Since most objects are short-lived, the GC is more aggressive about freeing recent garbage from the young generation. If an object survives a collection cycle of the young generation, it gets moved into the old generation (sometimes referred to as the "tenured generation"), which is processed less frequently.
So, off the top of my head, I'm going to say "no, the 'final' modifer doesn't help the GC reduce its workload".
In my opinion, the best strategy for optimizing your memory-management in Java is to eliminate spurious references as quickly as possible. You could do that by assigning "null" to an object reference as soon as you're done using it.
Or, better yet, minimize the size of each declaration scope. For example, if you declare an object at the beginning of a 1000-line method, and if the object stays alive until the close of that method's scope (the last closing curly brace), then the object might stay alive for much longer that actually necessary.
If you use small methods, with only a dozen or so lines of code, then the objects declared within that method will fall out of scope more quickly, and the GC will be able to do most of its work within the much-more-efficient young generation. You don't want objects being moved into the older generation unless absolutely necessary.
Declaring a local variable final will not affect garbage collection, it only means you can not modify the variable. Your example above should not compile as you are modifying the variable totalWeight which has been marked final. On the other hand, declaring a primitive (double instead of Double) final will allows that variable to be inlined into the calling code, so that could cause some memory and performance improvement. This is used when you have a number of public static final Strings in a class.
In general, the compiler and runtime will optimize where it can. It is best to write the code appropriately and not try to be too tricky. Use final when you do not want the variable to be modified. Assume that any easy optimizations will be performed by the compiler, and if you are worried about performance or memory use, use a profiler to determine the real problem.
No, it is emphatically not true.
Remember that final does not mean constant, it just means you can't change the reference.
final MyObject o = new MyObject();
o.setValue("foo"); // Works just fine
o = new MyObject(); // Doesn't work.
There may be some small optimisation based around the knowledge that the JVM will never have to modify the reference (such as not having check to see if it has changed) but it would be so minor as to not worry about.
Final should be thought of as useful meta-data to the developer and not as a compiler optimisation.
Some points to clear up:
Nulling out reference should not help GC. If it did, it would indicate that your variables are over scoped. One exception is the case of object nepotism.
There is no on-stack allocation as of yet in Java.
Declaring a variable final means you can't (under normal conditions) assign a new value to that variable. Since final says nothing about scope, it doesn't say anything about it's effect on GC.
Well, I don't know about the use of the "final" modifier in this case, or its effect on the GC.
But I can tell you this: your use of Boxed values rather than primitives (e.g., Double instead of double) will allocate those objects on the heap rather than the stack, and will produce unnecessary garbage that the GC will have to clean up.
I only use boxed primitives when required by an existing API, or when I need nullable primatives.
Final variables cannot be changed after initial assignment (enforced by the compiler).
This does not change the behaviour of the garbage collection as such. Only thing is that these variables cannot be nulled when not being used any more (which may help the garbage collection in memory tight situations).
You should know that final allows the compiler to make assumptions about what to optimize. Inlining code and not including code known not to be reachable.
final boolean debug = false;
......
if (debug) {
System.out.println("DEBUG INFO!");
}
The println will not be included in the byte code.
There is a not so well known corner case with generational garbage collectors. (For a brief description read the answer by benjismith for a deeper insight read the articles at the end).
The idea in generational GCs is that most of the time only young generations need to be considered. The root location is scanned for references, and then the young generation objects are scanned. During this more frequent sweeps no object in the old generation are checked.
Now, the problem comes from the fact that an object is not allowed to have references to younger objects. When a long lived (old generation) object gets a reference to a new object, that reference must be explicitly tracked by the garbage collector (see article from IBM on the hotspot JVM collector), actually affecting the GC performance.
The reason why an old object cannot refer to a younger one is that, as the old object is not checked in minor collections, if the only reference to the object is kept in the old object, it will not get marked, and would be wrongly deallocated during the sweep stage.
Of course, as pointed by many, the final keyword does not reallly affect the garbage collector, but it does guarantee that the reference will never be changed into a younger object if this object survives the minor collections and makes it to the older heap.
Articles:
IBM on garbage collection: history, in the hotspot JVM and performance. These may no longer be fully valid, as it dates back in 2003/04, but they give some easy to read insight into GCs.
Sun on Tuning garbage collection
GC acts on unreachable refs. This has nothing to do with "final", which is merely an assertion of one-time assignment. Is it possible that some VM's GC can make use of "final"? I don't see how or why.
final on local variables and parameters makes no difference to the class files produced, so cannot affect runtime performance. If a class has no subclasses, HotSpot treats that class as if it is final anyway (it can undo later if a class that breaks that assumption is loaded). I believe final on methods is much the same as classes. final on static field may allow the variable to be interpreted as a "compile-time constant" and optimisation to be done by javac on that basis. final on fields allows the JVM some freedom to ignore happens-before relations.
There seems to be a lot of answers that are wandering conjectures. The truth is, there is no final modifier for local variables at the bytecode level. The virtual machine will never know that your local variables were defined as final or not.
The answer to your question is an emphatic no.
All method and variable can be overridden bydefault in subclasses.If we want to save the subclasses from overridig the members of superclass,we can declare them as final using the keyword final.
For e.g-
final int a=10;
final void display(){......}
Making a method final ensures that the functionality defined in the superclass will never be changed anyway. Similarly the value of a final variable can never be changed. Final variables behaves like class variables.
Strictly speaking about instance fields, final might improve performance slightly if a particular GC wants to exploit that. When a concurrent GC happens (that means that your application is still running, while GC is in progress), see this for a broader explanation, GCs have to employ certain barriers when writes and/or reads are done. The link I gave you pretty much explains that, but to make it really short: when a GC does some concurrent work, all read and writes to the heap (while that GC is in progress), are "intercepted" and applied later in time; so that the concurrent GC phase can finish it's work.
For final instance fields, since they can not be modified (unless reflection), these barriers can be omitted. And this is not just pure theory.
Shenandoah GC has them in practice (though not for long), and you can do, for example:
-XX:+UnlockExperimentalVMOptions
-XX:+UseShenandoahGC
-XX:+ShenandoahOptimizeInstanceFinals
And there will be optimizations in the GC algorithm that will make it slightly faster. This is because there will be no barriers intercepting final, since no one should modify them, ever. Not even via reflection or JNI.
The only thing that I can think of is that the compiler might optimize away the final variables and inline them as constants into the code, thus you end up with no memory allocated.
absolutely, as long as make object's life shorter which yield great benefit of memory management, recently we examined export functionality having instance variables on one test and another test having method level local variable. during load testing, JVM throws outofmemoryerror on first test and JVM got halted. but in second test, successfully able to get the report due to better memory management.
The only time I prefer declaring local variables as final is when:
I have to make them final so that they can be shared with some anonymous class (for example: creating daemon thread and let it access some value from enclosing method)
I want to make them final (for example: some value that shouldn't/doesn't get overridden by mistake)
Does they help in fast garbage collection?
AFAIK a object becomes a candidate of GC collection if it has zero strong references to it and in that case as well there is no guarantee that they will be immediately garbage collected . In general, a strong reference is said to die when it goes out of scope or user explicitly reassign it to null reference, thus, declaring them final means that reference will continue to exists till the method exists (unless its scope is explicitly narrowed down to a specific inner block {}) because you can't reassign final variables (i.e. can't reassign to null). So I think w.r.t Garbage Collection 'final' may introduce a unwanted possible delay so one must be little careful in defining there scope as that controls when they will become candidate for GC.
I was browsing some old books and found a copy of "Practical Java" by Peter Hagger. In the performance section, there is a recommendation to set object references to null when no longer needed.
In Java, does setting object references to null improve performance or garbage collection efficiency? If so, in what cases is this an issue? Container classes? Object composition? Anonymous inner classes?
I see this in code pretty often. Is this now obsolete programming advice or is it still useful?
It depends a bit on when you were thinking of nulling the reference.
If you have an object chain A->B->C, then once A is not reachable, A, B and C will all be eligible for garbage collection (assuming nothing else is referring to either B or C). There's no need, and never has been any need, to explicitly set references A->B or B->C to null, for example.
Apart from that, most of the time the issue doesn't really arise, because in reality you're dealing with objects in collections. You should generally always be thinking of removing objects from lists, maps etc by calling the appropiate remove() method.
The case where there used to be some advice to set references to null was specifically in a long scope where a memory-intensive object ceased to be used partway through the scope. For example:
{
BigObject obj = ...
doSomethingWith(obj);
obj = null; <-- explicitly set to null
doSomethingElse();
}
The rationale here was that because obj is still in scope, then without the explicit nulling of the reference, it does not become garbage collectable until after the doSomethingElse() method completes. And this is the advice that probably no longer holds on modern JVMs: it turns out that the JIT compiler can work out at what point a given local object reference is no longer used.
No, it's not obsolete advice. Dangling references are still a problem, especially if you're, say, implementing an expandable array container (ArrayList or the like) using a pre-allocated array. Elements beyond the "logical" size of the list should be nulled out, or else they won't be freed.
See Effective Java 2nd ed, Item 6: Eliminate Obsolete Object References.
Instance fields, array elements
If there is a reference to an object, it cannot be garbage collected. Especially if that object (and the whole graph behind it) is big, there is only one reference that is stopping garbage collection, and that reference is not really needed anymore, that is an unfortunate situation.
Pathological cases are the object that retains an unnessary instance to the whole XML DOM tree that was used to configure it, the MBean that was not unregistered, or the single reference to an object from an undeployed web application that prevents a whole classloader from being unloaded.
So unless you are sure that the object that holds the reference itself will be garbage collected anyway (or even then), you should null out everything that you no longer need.
Scoped variables:
If you are considering setting a local variable to null before the end of its scope , so that it can be reclaimed by the garbage collector and to mark it as "unusable from now on", you should consider putting it in a more limited scope instead.
{
BigObject obj = ...
doSomethingWith(obj);
obj = null; // <-- explicitly set to null
doSomethingElse();
}
becomes
{
{
BigObject obj = ...
doSomethingWith(obj);
} // <-- obj goes out of scope
doSomethingElse();
}
Long, flat scopes are generally bad for legibility of the code, too. Introducing private methods to break things up just for that purpose is not unheard of, too.
In memory restrictive environments (e.g. cellphones) this can be useful. By setting null, the objetc don't need to wait the variable to get out of scope to be gc'd.
For the everyday programming, however, this shouldn't be the rule, except in special cases like the one Chris Jester-Young cited.
Firstly, It does not mean anything that you are setting a object to null. I explain it below:
List list1 = new ArrayList();
List list2 = list1;
In above code segment we are creating the object reference variable name list1 of ArrayList object that is stored in the memory. So list1 is referring that object and it nothing more than a variable. And in the second line of code we are copying the reference of list1 to list2. So now going back to your question if I do:
list1 = null;
that means list1 is no longer referring any object that is stored in the memory so list2 will also having nothing to refer. So if you check the size of list2:
list2.size(); //it gives you 0
So here the concept of garbage collector arrives which says «you nothing to worry about freeing the memory that is hold by the object, I will do that when I find that it will no longer used in program and JVM will manage me.»
I hope it clear the concept.
One of the reasons to do so is to eliminate obsolete object references.
You can read the text here.