I have developed simulation software for an upcoming robotics competition. Its purpose is to learn how to play a game using NEAT.
To do this, the simulation must be run a very large number of times. However, I recently noticed a bad memory leak in the program: roughly every 10 seconds, another 1 MB of memory is allocated.
I believe that the memory leak lies within my Game class because this class is actually responsible for running through the simulation.
My question is:
If I were to set game to null before starting another game, would that allow the garbage collector to deallocate every child object within game, or do I also have to set those to null?
Would this do the trick?
{
    Game game = new Game(someParams);
    while (!game.isFinished())
    {
        game.run();
        game.draw();
    }
    // do some stuff for NEAT
    // remove the memory
    game = null;
    System.gc();
}
At the beginning of this method you assign a brand new instance to game. Setting it to null at the end might help, depending on what you do afterwards. If this is the end of the method, then setting it to null changes nothing, because the game reference is destroyed immediately afterwards anyway (as if you had assigned null). If you continue doing things in the same method after setting it to null, it might help, but it won't be the final solution.
Memory leaks in Java usually happen because you forget to release a reference to an object, for instance by adding it to a list and forgetting to remove it when you are done with it.
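A minimal sketch of that kind of leak, assuming a long-lived list is what keeps old games alive. The names Game, LeakyRunner and history are hypothetical; the asker's real classes are not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the asker's Game class.
class Game {
    private final byte[] state = new byte[1024]; // pretend simulation data
}

class LeakyRunner {
    // Long-lived list: every Game added here stays reachable, so the garbage
    // collector cannot reclaim it, whatever happens to the local variable.
    private static final List<Game> history = new ArrayList<>();

    static void playOnce() {
        Game game = new Game();
        history.add(game); // the "forgotten" reference
        game = null;       // has no effect on the entry kept in history
    }

    static int retainedGames() {
        return history.size();
    }

    static void forgetFinishedGames() {
        history.clear();   // dropping the references makes the games collectible
    }
}
```

Nulling the local variable does not help here; only removing the entries from the list does.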
The JDK provides some tools for tracking this down:
$ jmap -dump:format=b,live,file=/tmp/dump <pid>
$ jhat /tmp/dump
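jhat creates an HTTP server listening on port 7000. Open http://localhost:7000 in your browser, and at the end of the first page you'll find the option "Show instance counts for all classes (excluding platform)." (jhat was removed in JDK 9; on newer JDKs the same dump can be opened in a heap analyzer such as Eclipse MAT or VisualVM.)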
Click on it and you will see a list of all the loaded classes, ordered by the number of instances. One or two of the classes will have an abnormally-high number of instances. Click on "instances" and you will have the list of all the instances for that class.
Clicking on one of the instances you'll see the actual object, and under "References to this object" a list of objects keeping a reference to it.
Some referencing objects will hold valid references to it. Others might hold a forgotten reference. Try to identify which object (a List, a Map, a Set, etc.) keeps the forgotten reference by checking several instances of the class that is not being released.
I read about how methods are executed, and this is what I understand:
1) Methods are allocated memory in the method area, and only a single copy is maintained, which is used across all instances of the class.
2) When a method is called on an instance, the current thread (in a single-threaded environment, say main) is loaded, and the method being called is then pushed onto that thread's stack. E.g.:
public static void main(String[] args)
{
    A a = new A();
    a.method();
}
// code of method (declared in class A)
void method()
{
    for (int i = 0; i < 25; i++)
        System.out.println(i);
}
So this thread has its own call stack, and on the method call the method body with its local variables gets pushed onto that same stack, above the main method.
Based on this understanding, what I don't understand is how the same code will behave in a multi-threaded environment if I run two threads
that both share the same instance. E.g.:
//My run method for myRunnable
run()
{
a.method();
}
Thread one = new Thread(new myRunnable(a)); // object from above
Thread two= new Thread(new myRunnable(a));
Now, when the two threads start executing, they will each have their own call stack.
How will the method of the shared object execute in this case?
Thanks
"1) Methods are allocated memory in the method area, and only a single copy is maintained, which is used across all instances of the class" means that there is only one copy of the method's bytecode, shared by all instances. That bytecode lives in a memory region separate from the heap where the objects are stored.
Each thread has its own stack of course, just like you explain it.
If you have multiple threads running the same method on the same object concurrently, you have the following situation:
Local variables are stored on each thread's stack. They are not shared and do not conflict.
The object instance (this) is stored on the heap, as are all its fields (such as this.foo). The heap is shared. To ensure that this works properly, you have to apply thread synchronization mechanisms as appropriate.
Static fields are also shared, and access to them must be coordinated too.
In your example, the i in the loop is a local variable. Both threads will print all of the numbers in sequence (but the output of the two threads is interleaved in an undefined order).
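The distinction can be sketched as follows, assuming the shared object also had a field. SharedCounter, total and TwoThreadsDemo are hypothetical names added for illustration, and synchronized is just one of several possible coordination mechanisms.

```java
class SharedCounter {
    private int total = 0; // field: lives on the heap, shared by all threads

    // synchronized so that concurrent updates of the shared field are not lost
    synchronized void add(int n) {
        total += n;
    }

    synchronized int total() {
        return total;
    }

    void method() {
        // i is a local variable: each thread has its own copy on its own stack
        for (int i = 0; i < 25; i++) {
            add(1);
        }
    }
}

class TwoThreadsDemo {
    static int runTwoThreads() {
        SharedCounter a = new SharedCounter(); // one instance, shared by both threads
        Thread one = new Thread(a::method);
        Thread two = new Thread(a::method);
        one.start();
        two.start();
        try {
            one.join();
            two.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return a.total();
    }
}
```

Each thread runs its own 25 iterations because i is per-thread, while both sets of increments land in the single shared total.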
OK, you walk into a room.
Somebody hands you a clipboard and a pencil and a whiteboard marker, and then
tells you to start following the instructions that are written on a certain
poster on the wall.
There's a whiteboard on another wall: It looks like a spreadsheet with rows
and columns, and numbers and words written in the cells. Your clipboard has a
sheet of paper with more rows and columns, and some numbers are written in
some of the cells in pencil.
The instructions tell you, step-by-step, how to perform some complex
calculation. They say things like,
...
Step 37: Copy the number from B5 on the whiteboard into J2 on your
clipboard.
Step 38: Add J2 through J7 on your clipboard, and write the result in J9.
Step 39: If the result in J9 is greater than the value in whiteboard-C9,
then go back to step 22, otherwise, go on to step 40.
Step 40: Erase whiteboard-C9, and then copy the value from clipboard-J9
into that location.
...
There's a space on your clipboard where you can write your own notes. You can
use it, for example, to keep track of what step you're on, or whatever else
you need to remember in order to get the job done.
There are other posters on the wall, and there are other people, each with
his/her own clipboard. Some of the people are following instructions from the
same poster as you, and some of them are reading from other posters.
Everybody is reading from/writing to the same whiteboard.
Everybody is going at her/his own pace. The ones who are reading the same
poster as you are not necessarily on the same step as you, and because each of
you had different initial numbers written on your clipboards, you may not even
be performing the instructions in the same sequence.
This is a simplistic model of multi-threaded computing: the posters on the
wall are the methods, the whiteboard is the heap, the people are threads, and
your clipboards are your stacks.
It's also roughly similar to how scientific/engineering calculations were
done during the industrial age. The people who did that kind of work were
called "computers".
If you're coordinating the whole thing, and it's time to add a new "thread" (i.e., when a new volunteer walks into the room), then you'll need to give that person his/her own clipboard (stack), with its own initial values (parameters), but you don't give the new person her/his own poster (methods): You just point her/him at one of the posters that already is up on the wall.
Let us say that I have a chess-website. People log in and play chess against others, and my Java-program is doing all the calculations. I see only two options on how I would do it:
Run a new "instance" of the Java-program for every chess-game. Meaning that I essentially write java chess in the terminal every time a new chess-game starts.
Run one instance of the Java-program, but create a new Board() with two Player whenever a new game is started. But in this case I need to pay attention to memory leaks, since I will never be terminating the Java-program.
I am assuming that the first option is bad. This assumption is not really based on any knowledge, so I could very well be wrong. But for the sake of this post I am going with the second option. If I am wrong let me know.
Going with the second option, I could do something like this every time a new chess-game is started:
Player p1 = new Player(white);
Player p2 = new Player(black);
Board b1 = new Board(p1,p2);
startMatch(b1);
When the game is over, these three objects are no longer needed and should be removed from memory. What I have heard is that the Java garbage collector collects all objects that are unreachable.
So if I do this:
p1 = null;
p2 = null;
b1 = null;
have I accomplished the task? If yes, have I done it in a good way, or is this incredibly cringe-worthy and disgusting?
It's sufficient for the values (or instances) to go out of scope (once unreachable they're eligible for garbage collection), there is no need to explicitly null your references (unless the instance(s) containing them will never go out of scope).
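As a rough sketch of the second option from the question, assuming the server keeps running matches in a long-lived map (MatchServer, activeMatches and endMatch are hypothetical names, and Player and Board are reduced to stubs): the locals need no nulling, but the entry in the long-lived map must be removed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical minimal versions of the asker's classes.
class Player {
    final String color;
    Player(String color) { this.color = color; }
}

class Board {
    final Player white;
    final Player black;
    Board(Player white, Player black) { this.white = white; this.black = black; }
}

class MatchServer {
    // Lives as long as the server does, so entries must be removed explicitly.
    private final Map<Integer, Board> activeMatches = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    int startMatch() {
        Player p1 = new Player("white");
        Player p2 = new Player("black");
        Board b1 = new Board(p1, p2);
        int id = nextId.incrementAndGet();
        activeMatches.put(id, b1);
        return id;
        // p1, p2 and b1 go out of scope here; setting them to null is unnecessary.
    }

    void endMatch(int id) {
        // After this, nothing references the Board or its Players any more,
        // so they become eligible for garbage collection.
        activeMatches.remove(id);
    }

    int activeCount() {
        return activeMatches.size();
    }
}
```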
I know that threads keep the values of variables in the cache of the CPU they are running on, so that the CPU doesn't have to wait as long when it needs those values.
But for example, if I have this object
public class MyObject {
int a = 2;
}
and now the thread does something like this:
MyObject obj = new MyObject();
obj.a=3;
My question is:
What will be saved in the CPU cache?
The whole MyObject structure, or just the reference?
I think the whole structure (that makes more sense), but I prefer to ask because I would like to be sure about that.
I'm new to multithreading and I'm sure the way a CPU cache works is more complex, but at the moment I just need basic information.
In your example, only one thread is acting. For this thread, the cache is transparent: there is no way to determine whether a value is in the cache, in main memory, or both. Values are typically brought into the cache first and are then evicted at some later, unpredictable moment.
"i would like to be sure about that" - why? Your program behaviour does not depend on this.
This question has two sides:
What the CPU is doing: the CPU is designed to keep data that is needed very often in the cache. If you change a value, the change stays in the cache until it needs to be written to main memory (the details depend on the CPU's strategy, write-back vs. write-through). The "need" to write it to main memory is either triggered programmatically or arises because the CPU decides it needs the space for other data. To answer one part of your question: to the CPU everything is data, both the value you set in Java and the internal object data structures. To access your value, the CPU needs the object's address first, so that is very probably in the cache too :)
The second point is what a Java programmer should and should not expect: this is defined precisely in the Java Memory Model. Just start here: http://en.wikipedia.org/wiki/Java_Memory_Model
So for your lines:
MyObject obj = new MyObject();
obj.a=3;
There is no guarantee that another thread running after this code sees the new value. It may also see null instead of your new object reference. To get that guarantee you need a synchronized block or a volatile variable.
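A minimal sketch of the volatile variant, assuming the object is handed to the other thread through a shared field. Publisher, shared and publishAndRead are hypothetical names; MyObject is the class from the question.

```java
class MyObject {
    int a = 2;
}

class Publisher {
    // volatile: a write to this field happens-before every later read that sees it
    private static volatile MyObject shared;

    static int publishAndRead() {
        shared = null;
        int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            MyObject local;
            while ((local = shared) == null) {
                Thread.onSpinWait(); // busy-wait until the writer publishes
            }
            // Guaranteed to observe a == 3, because obj.a was written
            // before the volatile write to shared.
            seen[0] = local.a;
        });
        reader.start();

        MyObject obj = new MyObject();
        obj.a = 3;    // ordinary write...
        shared = obj; // ...made visible by the volatile write that follows it

        try {
            reader.join(); // join() also makes the reader's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Without volatile on shared, the reader would be allowed to spin forever or to see the reference without the updated field.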
I have some objects that are created/destroyed very often and that can exist in many lists at the same time. To ensure I have no references left to them the objects have a flag isDestroyed, and if this is set, each list is responsible for removing the object from the list.
However, this is of course a breeding ground for memory leaks. What if I forget to remove objects from one of the lists? To visually monitor that the program behaves correctly, I override finalize and increase a global variable to track destructions (not a formal test, only to get an idea). However, as I have no control over the GC, I could in theory wait forever until something is destroyed.
So the question is two-fold: when objects are in multiple lists, is an "isDestroyed" flag considered a good way to control the object's lifetime? It forces everyone who uses the object to take care to remove it from their lists, which seems bad.
And is there any good way to see when the reference count on an object reaches zero, i.e. when it's scheduled for destruction?
EDIT: To be more specific, in my case the objects represent physical entities in a room. I have one manager class that draws each object, so each object is in that list. Another list contains all the objects that are clickable. Having all objects in one list and using polymorphism or instanceof is not an option in this case. When an object is "destroyed", it should be neither shown nor clickable in any way, so I want to remove it from both lists.
You should have a look at the java.lang.ref package.
And is there any good way to see when the reference count on an object
reaches zero, i.e. when it's scheduled for destruction?
You can use a ReferenceQueue object.
From the Javadoc of java.lang.ref.ReferenceQueue:
Reference queues, to which registered reference objects are appended
by the garbage collector after the appropriate reachability changes
are detected.
I think this is what WeakReference and ReferenceQueue are for: you create a WeakReference for the object you are tracking and associate it with a ReferenceQueue. Then you have another thread that processes each WeakReference as it is returned from ReferenceQueue.remove(). A WeakReference is added to the ReferenceQueue once the garbage collector has determined that the referenced object is no longer strongly reachable. But can you give an example of what these lists are that you are trying to clean up when the referenced objects are dead?
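A hedged sketch of that arrangement, using a non-blocking poll() instead of a dedicated thread blocking in remove(). DeathWatcher, watch and drainCollected are hypothetical names, and the class is not thread-safe as written. When, or whether, a reference shows up in the queue depends entirely on the garbage collector.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Set;

class DeathWatcher {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The WeakReference objects themselves must stay reachable,
    // otherwise they are never enqueued.
    private final Set<Reference<?>> pending = new HashSet<>();

    void watch(Object tracked) {
        pending.add(new WeakReference<Object>(tracked, queue));
    }

    // Non-blocking: returns how many watched objects the garbage collector
    // has reported as no longer strongly reachable since the last call.
    int drainCollected() {
        int count = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            pending.remove(ref);
            count++;
        }
        return count;
    }

    int stillTracked() {
        return pending.size();
    }
}
```

As long as a strong reference to a watched object exists somewhere, for example in a forgotten list, drainCollected() will never report it, which is what makes this usable as a leak indicator.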
The way this is usually handled is through the Observer pattern. Each list attaches a destroy-listener that gets notified upon destruction. How this meshes with your architecture, I have no details to judge from.
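One possible shape for this, as a sketch only: Entity, EntityList and onDestroy are hypothetical names, and the asker's real manager classes are not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class Entity {
    private final List<Consumer<Entity>> destroyListeners = new ArrayList<>();
    private boolean destroyed = false;

    void onDestroy(Consumer<Entity> listener) {
        destroyListeners.add(listener);
    }

    void destroy() {
        if (destroyed) {
            return; // notify only once
        }
        destroyed = true;
        // Iterate over a copy so listeners may safely modify the original list.
        for (Consumer<Entity> listener : new ArrayList<>(destroyListeners)) {
            listener.accept(this);
        }
        destroyListeners.clear(); // drop references back to the lists
    }
}

class EntityList {
    private final List<Entity> items = new ArrayList<>();

    void add(Entity entity) {
        items.add(entity);
        // The list registers its own clean-up, so nobody has to remember it later.
        entity.onDestroy(dead -> items.remove(dead));
    }

    int size() {
        return items.size();
    }
}
```

With a "drawable" list and a "clickable" list both built this way, a single destroy() call removes the entity from each of them, so the isDestroyed polling is no longer needed.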
If you want to be notified I'm almost sure you need PhantomReference, read here:
http://weblogs.java.net/blog/2006/05/04/understanding-weak-references
Is there a way to register a handler that will be called exactly at the time when the last reference to a certain object is released?
An example would be an object that is backed by a physical data file: once the object becomes unreferenced, the file should be closed and then renamed. It would be nice if this were possible without having to explicitly call a "close" method on that object.
All the notification mechanisms I am aware of from the Weak/Phantom reference area only state that notification will occur at some point in time, but there is no guarantee as to when this will happen...
In short, no.
The Java specification explicitly denies you the ability to know when the last reference is released. JVM implementations (and optimizations) depend on this. There is no hook.
From my understanding, and I've looked for some time to find a "destructor" for Java objects, there is no way to know when you lose the last reference. Java does not keep a running count of references; for performance reasons, reachability is only worked out during garbage collection.
The closest thing is the finalize method which should be called during garbage collection but there's no guarantee that it will be called even then.
I think WeakReference does what you want. A WeakReference gets put into the ReferenceQueue as soon as it's weakly reachable (i.e. all strong references are gone).
See this article by Ethan Nicholas.
If you are worried about some references not reaching the ReferenceQueue at shutdown, then keep a list of all objects created (using WeakReferences or PhantomReferences). Add a shutdown hook that checks the list for any outstanding references and performs whatever action you need.
Problem is, "How do you implement this, without something holding a reference to the object?"
Even if you could get past that problem, say with a service we'll call the HandleManager, the HandleManager would then have to create a new reference to the object to pass to your handler. Then your handler could either (a) store a reference to it, which would confuse the HandleManager, which was expecting to destroy the unreferenced object; or (b) release the reference, which means that the final reference was once again released, which means the handler has to be called again....
If you need to manage external resources like files, the best you can do in Java is a close() method (whatever name you choose). You can use finalize() as a "belt and suspenders" insurance policy, but that has unpredictable timing. So your main line of defense needs to be the close() method.
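A small sketch of the close() approach, assuming the method is named close() so that the class can implement AutoCloseable and be used with try-with-resources. DataFileHandle and CloseDemo are hypothetical names, and the actual file handling is left out.

```java
class DataFileHandle implements AutoCloseable {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    @Override
    public void close() {
        if (!open) {
            return; // idempotent: a second close() does nothing
        }
        open = false;
        // A real implementation would close and rename the backing file here.
    }
}

class CloseDemo {
    static boolean closedAfterUse() {
        DataFileHandle handle = new DataFileHandle();
        try (DataFileHandle h = handle) {
            // Work with h; close() runs when this block exits,
            // whether normally or because of an exception.
            if (!h.isOpen()) {
                throw new IllegalStateException("handle should be open here");
            }
        }
        return !handle.isOpen();
    }
}
```

The caller still has to scope the object explicitly, but the clean-up then happens at a known point rather than whenever the garbage collector runs.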
See my answer Why would you ever implement finalize()?
This cannot be done with Java -- it needs a reference-counting garbage collector as far as I can tell. Have you considered opening and closing your object's physical data file as needed, rather than keeping it open for the lifetime of the object?
You could override finalize() in your object, but that is problematic for reasons others have mentioned.
For your specific example, you could take a look at using something like File.deleteOnExit(), which would remove the file once the VM exits.