GC optimization: declaring object as field opposed to declaring it locally - java

I am currently trying to avoid GC_CONCURRENT calls, so I'm working through my main loop.
I've noticed that I often create a complex object to do calculations.
So my question is: would declaring that object as a field of the class, as opposed to declaring it in the methods that use it, help performance?
Or, because my English has probably hurt your brain, here's the code. Example as a field:
class myclass {
    private MyObject myObject;
    ...
    public void myLoopedMethod(...) {
        myObject = new MyObject(...);
        myObject.doStuff();
    }
}
Example as a local variable in the method:
class myclass {
    ...
    public void myLoopedMethod(...) {
        MyObject myObject = new MyObject(...);
        myObject.doStuff();
    }
}
The right scope would be the method, but my doubt is that by making it a field, the memory is always freed and allocated in the same spot. Is this true, and does it help avoid GC calls?
Also, I should probably do something like this instead, but I'm interested in whether the above logic makes sense.
class myclass {
    private MyObject myObject;
    ...
    public void myMethod(...) {
        myObject.setNewValues(...);
        myObject.doStuff();
    }
}

but my doubt is that by making it a field, the memory is always freed
and allocated in the same spot. Is this true and does this help
avoiding GC calls?
There is no guarantee that the memory is allocated in the same spot; that is an implementation detail.
In your example, with an instance variable, each object it referenced becomes eligible for GC as soon as the variable is reassigned; only the last object keeps a reference from the instance variable (it becomes eligible once no reachable reference to it remains).
With a variable defined inside the method, every object it referenced becomes eligible for GC as soon as the loop is done.
So it is better to define the variable inside the method, unless you need a reference to the object after the loop.
Coming to the question of avoiding GC calls: I think both approaches will produce almost the same amount of GC activity. Unless you have a real memory problem, I would suggest not worrying about allocation and GC; VMs are intelligent enough to take care of that.
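If profiling really does show the per-call allocation hurting, the reuse pattern from the question can be sketched as below. This is only an illustration under assumed names: MyObject, setNewValues, doStuff, and myLoopedMethod are the hypothetical identifiers from the question, and the int field stands in for whatever state the real object holds.

```java
// Sketch of the reuse pattern from the question: allocate once, reset per call.
// All names (MyObject, setNewValues, doStuff) are hypothetical, from the question.
class MyObject {
    private int value;

    void setNewValues(int value) { this.value = value; } // reset state instead of reallocating
    int doStuff() { return value * 2; }                  // stand-in for the real calculation
}

class MyClass {
    private final MyObject myObject = new MyObject();    // one allocation for the lifetime of MyClass

    int myLoopedMethod(int input) {
        myObject.setNewValues(input);                    // no per-call allocation, so no extra GC pressure
        return myObject.doStuff();
    }
}

public class ReuseSketch {
    public static void main(String[] args) {
        MyClass c = new MyClass();
        int total = 0;
        for (int i = 0; i < 3; i++) {
            total += c.myLoopedMethod(i);                // 0 + 2 + 4
        }
        System.out.println(total);                       // prints 6
    }
}
```

The trade-off is exactly the one discussed above: the reused instance is a field, so it stays reachable (and uncollectable) for as long as the enclosing object lives.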

Yes, if the object is created inside the often-called method, it will result in more work for the garbage collector.
Always measure instead of speculating: the theoretical advantages might be insignificant.

Related

Does not storing a newly declared object cause a memory leak?

What I mean by the post title is: doing this:
public static void makeNewObjectAndDoTask() {
    new SomeClass().doYourTask();
}
I have myself written such code in languages such as Java and JavaScript: declaring a new object without storing it in a variable, JUST to call one of its methods. Does this cause memory leaks? Or does the object get cleared at the end of the method call / get freed by the Java garbage collector?
Should I just do this instead, for safety?
public static void makeNewObjectAndDoTask() {
    SomeClass obj = new SomeClass();
    obj.doYourTask();
    obj = null;
    //System.gc(); // Perhaps also call the collector manually?
}
As the commenters already noted, there is no memory leak in code like
public static void makeNewObjectAndDoTask() {
    new SomeClass().doYourTask();
}
at least not in itself, assuming that the SomeClass() constructor and the doYourTask() method don't create memory leaks.
Definitely, the garbage collector will clean up the SomeClass instance at some time in the future.
How does it work?
Instances that are no longer accessible from program code will be garbage collected. Accessibility means being reachable through a variable, field, array element, method argument, and so on.
As soon as the new SomeClass().doYourTask(); statement has finished, there is no way to access this individual SomeClass instance any more. So, it fulfills the garbage collection criteria.
The next time the garbage collector runs, it can reclaim the memory occupied by the instance (and its fields, recursively, as long as they aren't referenced elsewhere).
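One way to observe this eligibility in practice (a sketch only, since System.gc() is merely a hint and the spec makes no timing guarantees) is with a WeakReference, which the collector clears once its referent is no longer strongly reachable:

```java
import java.lang.ref.WeakReference;

// Sketch: observing GC eligibility with a WeakReference.
// System.gc() is only a hint, so the check is retried; on typical JVMs the
// weak reference is cleared quickly once no strong reference remains.
public class EligibilityDemo {
    static WeakReference<Object> makeAndDrop() {
        Object o = new Object();        // strongly reachable here
        return new WeakReference<>(o);  // after return, only weakly reachable
    }

    public static void main(String[] args) throws InterruptedException {
        WeakReference<Object> ref = makeAndDrop();
        for (int i = 0; i < 50 && ref.get() != null; i++) {
            System.gc();                // request a collection
            Thread.sleep(10);
        }
        System.out.println("collected = " + (ref.get() == null));
    }
}
```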
The alternative code
public static void makeNewObjectAndDoTask() {
    SomeClass obj = new SomeClass();
    obj.doYourTask();
    obj = null;
}
only delays the garbage collection opportunity, as it stores a reference in obj, thus making the instance accessible for at least a tiny additional period of time, until you assign obj = null;.
Manually calling the garbage collector with System.gc(); is rarely a good idea. It requests a GC run (and spends execution time cleaning up memory) instead of relying on the JVM's highly optimized GC scheduling strategies. Don't do it unless you have a thorough understanding of the garbage collector that has led you to the conclusion that the GC strategy fails in your case.
We don't want OutOfMemoryErrors, and we don't want excessive time wasted for garbage collection, and the standard GC system does a very good job in both aspects.

Change a Non Static Variable to Static Variable

I am reading about the GC and read that when an object becomes eligible for garbage collection, the GC has to run the finalize method on it. The finalize method is guaranteed to run only once, thus the GC marks the object as finalized and gives it a rest until the next cycle.
In the finalize method, you can technically “resurrect” an object, for example, by assigning it to a static field. The object would become alive again and not eligible for garbage collection, so the GC would not collect it during the next cycle.
The object, however, would be marked as finalized, so when it would become eligible again, the finalize method would not be called. In essence, you can turn this “resurrection” trick only once for the lifetime of the object.
I find this fascinating. However, if my variable is non-static, how do I change it to static inside the finalize method?
Remember:
An object becomes eligible for garbage collection when it is not reachable from any live thread or any static reference. So the hack is to add the object to a static resource inside the finalize method, which prevents its collection, but only once. The finalize method is protected, so it can be overridden by subclasses, whether or not they are in the same package.
This is a dangerous practice, and there is no need for it in application code.
Changing a variable definition at runtime isn't easy and in some cases next to impossible. There are some nasty reflection tricks that would involve inline compiling, classloading, etc., but you shouldn't do that. Changing a variable from static to non-static or vice versa would also involve moving the data around in storage and dealing with potential collisions - so don't do that.
Anyway, variables are just references, and to resurrect an object you just need to create a new reference from a live thread. That can be done with a collection that is referenced by a static variable and to which the this reference is added.
Example (for illustration purposes only, do not use it unless you really know what you are doing and have a good reason to):
class NastyResurrector {
    public static Collection<Object> RESURRECTED_OBJECTS = ...; // use whatever collection implementation you like
}
Then in finalize() you'd call NastyResurrector.RESURRECTED_OBJECTS.add(this) and there you have your reference.
However, I'll quote from the source of your question (question Q11):
Beware that this ugly hack should be used only if you really know what you’re doing
That's the most important takeaway in my opinion.
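The mechanism can be sketched as a complete program like the one below. Note the hedges: finalize() has been deprecated since Java 9, and in reality it is invoked by the GC at an unpredictable time; here it is called directly, purely to show what the override does when it runs. The Resurrectable class name is made up for the example.

```java
import java.util.ArrayList;
import java.util.Collection;

// Sketch of the resurrection mechanism. finalize() is deprecated and is
// normally invoked by the GC at an unpredictable time; we call it directly
// here only to demonstrate the bookkeeping the override performs.
class NastyResurrector {
    public static final Collection<Object> RESURRECTED_OBJECTS = new ArrayList<>();
}

class Resurrectable {
    @Override
    protected void finalize() {
        // Create a new strong reference from a GC root (the static collection),
        // making `this` reachable again.
        NastyResurrector.RESURRECTED_OBJECTS.add(this);
    }
}

public class ResurrectionSketch {
    public static void main(String[] args) {
        Resurrectable r = new Resurrectable();
        r.finalize(); // direct call, for illustration only; normally the GC does this
        System.out.println(NastyResurrector.RESURRECTED_OBJECTS.contains(r)); // prints true
    }
}
```

After a real GC-driven finalization, the object would be reachable again through the static collection, but its finalizer would never run a second time.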

Should variables be declared inside the loop or outside the loop in java [duplicate]

This question already has answers here:
Declaring variables inside or outside of a loop
(20 answers)
Closed 8 years ago.
I know a similar question has been asked many times before, but I am still not convinced about when objects become eligible for GC and which approach is more efficient.
Approach one:
for (Item item : items) {
    MyObject myObject = new MyObject();
    //use myObject.
}
Approach Two:
MyObject myObject = null;
for (Item item : items) {
    myObject = new MyObject();
    //use myObject.
}
I understand: "By minimizing the scope of local variables, you increase the readability and maintainability of your code and reduce the likelihood of error". (Joshua Bloch).
But how about performance/memory consumption? In Java, objects are garbage collected when no reference to them is left. If there are e.g. 100000 items, then 100000 objects will be created. In Approach One each object has a reference (myObject) to it, so are they not eligible for GC?
Whereas in Approach Two, with every loop iteration you remove the reference from the object created in the previous iteration, so surely those objects start becoming eligible after the first iteration.
Or is it a trade off between performance and code readability & maintainability?
What have I misunderstood?
Note:
Assuming I care about performance and myObject is not needed after the loop.
Thanks In Advance
If there are e.g. 100000 items then 100000 objects will be created in Approach One and each object will have a reference (myObject) to it so they are not eligible for GC?
No, from Garbage Collector's point of view both the approaches work the same i.e. no memory is leaked. With approach two, as soon as the following statement runs
myObject = new MyObject();
the previous MyObject that was being referenced becomes an orphan (unless while using that Object you passed it around, say, to another method where that reference was saved) and is eligible for garbage collection.
The difference is that once the loop runs out you would have the last instance of MyObject still reachable through the myObject reference originally created outside the loop.
Does GC know when references go out of scope during the loop execution or it can only know at the end of method?
First of all, there is only one reference, not many; it is the objects that get unreferenced in the loop. Secondly, garbage collection doesn't kick in at predictable moments, so forget the loop: it may not even happen when the method exits.
Notice that I said, orphan objects become eligible for gc, not that they get collected immediately. Garbage collection never happens in real time, it happens in phases. In the mark phase, all the objects that are not reachable through a live thread anymore are marked for deletion. Then in the sweep phase, memory is reclaimed and additionally compacted much like defragmenting a hard drive. So, it works more like a batch rather than piecemeal operations.
GC isn't bothered about scopes or methods as such. It only looks for unreferenced objects and it does so when it feels like doing it. You can't force it. The only thing that you can be sure of is that GC would run if the JVM is running out of memory but you can't pin exactly when it would do so.
But, all this does not mean that GC can't kick in while the method executes or even while the loop is running. If you had, say, a Message Processor that processed 10,000 messages every 10 mins or so and then slept in between i.e. the bean waits within the loop, does 10,000 iterations and then waits again; GC would definitely kick into action to reclaim memory even though the method hasn't run to completion yet.
You have misunderstood when objects become eligible for GC - they do this when they are no longer reachable from an active thread. In this context that means:
When the only reference to them goes out of scope (approach 1).
When the only reference to them is assigned another value (approach 2).
So, the instance of MyObject becomes eligible for GC at the end of each loop iteration with either approach. The difference (theoretically) between the two approaches is that the JVM would have to allocate memory for a new object reference each iteration in approach 1 but not in approach 2. However, this assumes the Java compiler and/or Just-In-Time compiler is not smart enough to optimise approach 1 to actually act like approach 2.
In any case, I would go for the more readable and less error prone approach 1 on the grounds that:
The performance overhead for a single object reference allocation is tiny.
It will probably get optimised away anyway.
In both approaches the objects will get garbage collected.
In Approach 1: each object becomes eligible for garbage collection when its loop iteration ends, since the local variable then goes out of scope.
In Approach 2: each time a new reference is assigned to the myObject variable, the previous object loses its last reference and becomes eligible for garbage collection, and so on while the loop runs.
So neither approach creates a performance bottleneck.
I would not expect declaring the variable inside a block to have a detrimental impact on performance.
At least notionally, the JVM allocates the stack frame at the start of the method and destroys it at the end; by implication it will have the cumulative size to accommodate all the local variables.
See section 2.6 in here:
http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-2.html
That is consistent with other languages such as C where resizing the stack frame as the function/method executes is an overhead with no apparent return.
So wherever you declare it shouldn't make a difference.
Indeed declaring variables in blocks may help the compiler realize that the effective size of the stack frame can be smaller:
void foo() {
    int x = 6;
    int y = 7;
    int z = 8;
    //.....
}
Versus
void bar() {
    {
        int x = 6;
        //....
    }
    {
        int y = 7;
        //....
    }
    {
        int z = 8;
        //....
    }
}
Notice that bar() clearly only needs one local variable slot, not three.
Though making the stack frame smaller is unlikely to have any real influence on performance!
However, when a reference goes out of scope it may make the object it references available for garbage collection. Otherwise you would need to set references to null, which is an untidy and unnecessary bother (and a tiny overhead).
Without question you should declare variables inside a loop if (and only if) you don't need to access them outside the loop.
IMHO, blocked statements (like those in bar() above) are underused.
If a method proceeds in stages, you can protect the later stages from variable pollution using blocks.
With suitable (short) comments it can often be a more readable (and efficient) way of structuring code than breaking it down into lots of private methods.
I have a chunky algorithm (Hashlife) where making earlier artifacts available for garbage collection during the method can make the difference between getting to the end and getting OutOfMemoryError.

At what point exactly is an object available for garbage collection?

I'm battling out of memory issues with my application and am trying to get my head around garbage collection. If I have the following code:
public void someMethod() {
    MyObject myObject = new MyObject();
    myObject.doSomething(); //last use of myObject in this scope
    doAnotherThing();
    andEvenMoreThings();
}
So my question is, will myObject be available for garbage collection after myObject.doSomething() which is the last use of this object, or after the completion of someMethod() where it comes out of scope? I.e. is the garbage collection smart enough to see that though a local variable is still in scope, it won't be used by the rest of the code?
"Where it comes out of scope"
public void someMethod() {
    MyObject myObject = new MyObject();
    myObject.doSomething(); //last use of myObject in this scope
    myObject = null; //Now available for gc
    doAnotherThing();
    andEvenMoreThings();
}
The best thing you can do is to take your code, put it in a loop with a delay, and hook up a profiler to it.
If you are using a recent version of Java, then JVisualVM comes as standard.
If you are on Windows and have JAVA_HOME set:
%JAVA_HOME%/bin/jvisualvm
This will launch a profiler, and you can see which objects are being collected and which are not. In my opinion this is an essential part of being a programmer, and it's fun to find the memory leaks.
Hope this helps.
By the way, in later Java 6 releases there is a form of escape analysis whereby the JVM can work out that your instance of MyObject doesn't leave the method, so it can even place it entirely on the stack and you will not need any GC for it at all.
So my question is, will myObject be available for garbage collection after myObject.doSomething() which is the last use of this object, or after the completion of someMethod() where it comes out of scope?
The former.
I.e. is the garbage collection smart enough to see that though a local variable is still in scope, it won't be used by the rest of the code?
Scope is not visible to the GC which sees only registers, stacks, global variables and references from heap blocks to other heap blocks. So scope is irrelevant.
After the local scope ends, because the object could still be used while it is reachable through the local variable. Even then it is only eligible for GC, not actually collected right away.
Code optimization will probably notice where the last usage of myObject is and make it available for garbage collection there; technically, however, it remains reachable until the variable no longer refers to it (by being assigned to something else) or goes out of scope.

Can using too many static variables cause a memory leak in Java?

If my application has too many static variables or methods, then as per my understanding they will be stored on the heap. Please correct me if I am wrong.
1) Will these variables be on the heap until the application is closed?
2) Will they be available for GC at any time? If not, can I say it is a memory leak?
Static methods are just methods, they are not stored on the heap, they just don't get to use a "this" parameter.
Static variables serve as "roots" to the GC. As a result, unless you explicitly set them to null, they will live as long as the program lives, and so is everything reachable from them.
A situation is only considered a memory leak if you intend for the memory to become free and it doesn't become free. If you intend for your static variable to contain a reference to an object for part of the time, and you forget to set it to null when you're done with that object, you would likely end up with a leak. However, if you put it in the static variable and intend for it to be there for as long as the program is running, then it is most definitely not a leak, it is more likely a "permanent singleton". If the object got reclaimed while you wanted it to still exist, that would have been very bad.
As for your question about the heap: objects in Java are created on the heap with the new operator; local variables, including the references that point at those objects, live on the stack. If a reference becomes null or falls out of scope (e.g., at the end of a block), the GC realizes that there is no way to reach that object ever again and reclaims it. If your reference is in a static variable, it never falls out of scope, but you can still set it to null or point it at another object.
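The "static variables serve as roots" point can be observed directly with a WeakReference: while a static field strongly references an object, the collector is not allowed to clear a weak reference to it; once the static field is nulled, it can. This is a sketch (the class and field names are invented, and System.gc() is only a hint, hence the retry loop):

```java
import java.lang.ref.WeakReference;

// Sketch: a static field acts as a GC root. While `holder` points at the
// object it cannot be collected; after `holder` is set to null it becomes
// eligible. System.gc() is only a hint, so the second check is retried.
public class StaticRootDemo {
    static Object holder = new Object();

    public static void main(String[] args) throws InterruptedException {
        WeakReference<Object> ref = new WeakReference<>(holder);

        System.gc();
        // Still strongly reachable through the static field, so the weak
        // reference cannot have been cleared.
        System.out.println("reachable while static set = " + (ref.get() != null)); // prints true

        holder = null; // drop the GC root; the object is now eligible
        for (int i = 0; i < 50 && ref.get() != null; i++) {
            System.gc();
            Thread.sleep(10);
        }
        System.out.println("collected after nulling = " + (ref.get() == null));
    }
}
```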
If you have a static hashmap and you add data to it... the data will never disappear and you have a leak - in case you do not need the data anymore. If you need the data, it is not a leak, but a huge pile of memory hanging around.
Objects directly or indirectly referenced by statics will remain on the heap until the appropriate class loader can be collected. There are cases (ThreadLocal, for instance) where other objects indirectly reference the class loader causing it to remain uncollected.
If you have a static List, say, and add references to it dynamically, then you can very easily end up with "object lifetime contention issues". Avoid mutable statics for many reasons.
As long as you can reference these variables from somewhere in the code, they can't be GCed, which means they will be there until the end of the application.
Can you call it a memory leak? I wouldn't. Usually a memory leak is memory that you expect to recover but never do, or only partially recover. Memory leaks also usually get worse over time (e.g., every time you call a method, more memory is "leaked"), whereas here the memory usage for those variables is more or less static.
It won't cause a memory leak in the classic C sense. For example:
class A {
    static B foo;
    ...
    static void makeFoo() {
        foo = new B();
        foo = new B();
    }
}
In this case, a call to makeFoo() won't result in a memory leak, as the first instance can be garbage collected.
