Possible Memory Leak with JOGL using VBOs


We are currently developing an application which visualizes huge vector fields (> 250'000) on a sphere/plane in 4D. To speed up the process we are using VBOs for the vertices, normals and colors. To prepare the data before sending it down to the GPU we are using Buffers (FloatBuffer, ByteBuffer, etc.).
Some data about the cylinders:
Each cylinder uses 16 * 9 + 16 * 3 = 192 floats -> 192 * 4 Bytes = 768 bytes.
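To put the per-cylinder figure in context, here is a small back-of-the-envelope sketch. It only counts the floats quoted above and uses the 250'000 lower bound from the question; the class and method names are made up for illustration. Under those assumptions the float data alone comes to roughly 183 MiB.

```java
// Rough size estimate for the cylinder data described in the question.
// Assumption: only the 192 floats per cylinder are counted, nothing else.
final class CylinderMemoryEstimate {
    static final int FLOATS_PER_CYLINDER = 16 * 9 + 16 * 3; // 192, as in the post
    static final int BYTES_PER_FLOAT = Float.BYTES;          // 4

    /** Bytes needed for the float data of one cylinder. */
    static long bytesPerCylinder() {
        return (long) FLOATS_PER_CYLINDER * BYTES_PER_FLOAT;
    }

    /** Bytes needed for the float data of cylinderCount cylinders. */
    static long totalBytes(long cylinderCount) {
        return cylinderCount * bytesPerCylinder();
    }

    public static void main(String[] args) {
        long count = 250_000L; // lower bound mentioned in the question
        long total = totalBytes(count);
        System.out.printf("%d bytes per cylinder%n", bytesPerCylinder());
        System.out.printf("%d cylinders -> %d bytes (about %.1f MiB)%n",
                count, total, total / (1024.0 * 1024.0));
    }
}
```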
After sending down the vertices we are doing the following cleanup:
// clear all buffers
vertexBufferShell.clear();
indexBufferShell.clear();
vertexBufferShell = null;
indexBufferShell = null;
We have monitored it with JConsole and found that the garbage collector does not run "correctly". Even if we reduce the cylinder count, the memory does not get freed. JConsole has a button to run the GC, and if we press it manually the memory is freed (if we have loaded a huge number of cylinders and then decrease it a lot, sometimes over 600 MB gets cleaned up by the GC).
Here is an image of the JConsole:
Now the question is: how can we clean up these Buffers ourselves in code? Calling the clear method and setting the reference to null is not enough. We have also tried calling System.gc(), but with no effect. Do you have any idea?

There are any number of reasons the memory usage could increase. I would say it's not a memory leak unless the memory increases every time you perform this operation. If it only occurs the first time, it may be that the library needs some memory to load.
I suggest you take a heap dump or at least jmap -histo:live before and after to see where the memory increase is.
If you use a memory profiler like VisualVM or YourKit it will show you where and why memory is being retained.
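If you want a quick check from inside the application before reaching for a heap dump, something along these lines can show whether the used heap keeps growing each time the operation runs. This is only a sketch: the class name is invented, the operation in main is a stand-in that deliberately retains memory, and the GC call is a request that the JVM may ignore.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.ArrayList;
import java.util.List;

final class HeapDelta {
    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    /** Heap bytes in use after asking (not forcing) the JVM to collect. */
    static long usedHeapAfterGc() {
        MEMORY.gc(); // effectively the same as System.gc(); only a request
        return MEMORY.getHeapMemoryUsage().getUsed();
    }

    /** Runs the operation and returns how much the used heap changed. */
    static long measure(Runnable operation) {
        long before = usedHeapAfterGc();
        operation.run();
        return usedHeapAfterGc() - before;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for "the operation": it keeps every array alive.
        List<float[]> retained = new ArrayList<>();
        for (int round = 1; round <= 5; round++) {
            long delta = measure(() -> retained.add(new float[1_000_000]));
            System.out.printf("round %d: used heap changed by %d bytes%n", round, delta);
        }
    }
}
```

A delta that stays positive round after round points at retained objects; a delta that hovers around zero after the first round suggests one-time loading costs.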

It's not really a memory leak if the GC is able to clean it up. It might be a waste of memory, but your app seems to be configured to allow it to use over 800MB of heap. This is a trade-off between garbage collection performance and memory usage. You could also try to simply run your application with a smaller heap size.

There might not be a memory leak, just objects moving to the Tenured generation (the area that holds objects which survived a minor GC).
The big steps you see might be the young generation's Eden space filling up, after which a minor GC moves the live objects to Tenured.
You can also try to tune the garbage collector and the memory.
You might have plenty of medium-lived objects that are constantly promoted to Tenured and only released in a full GC. If you size the generations well, those objects are collected in a minor GC instead.
There are plenty of JVM arguments to do this.
A good place to look at is here.
This one is suitable for you:
-XX:NewSize=2.125m
Default size of new generation (in bytes)
[5.0 and newer: 64 bit VMs are scaled 30% larger; x86: 1m; x86, 5.0 and older: 640k]
Regards.

The JVM will not free any objects until it has to (e.g. -Xmx reached). That's one of the main concepts behind all the GCs you can find in the current JVM. They are all optimised for throughput, even the concurrent one. I see nothing unusual on the GC graph.
It would be a leak if the used heap after a full GC constantly grew over time. If it doesn't, all good.
In short: foo=null; will not release the object, only the reference. The GC can free the memory whenever it likes.
also:
buffer.clear() does not clear the buffer, it sets pos=0 and limit=capacity, that's all. Please refer to the javadoc for more info.
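A tiny sketch of what clear() really does, using a plain heap FloatBuffer (the class name is just for illustration): the position and limit are reset, but the stored values are still there.

```java
import java.nio.FloatBuffer;

final class BufferClearDemo {
    /** Returns the first element after clear(), showing the contents survive. */
    static float firstElementAfterClear() {
        FloatBuffer buffer = FloatBuffer.allocate(4);
        buffer.put(1.5f).put(2.5f);
        buffer.clear();       // position = 0, limit = capacity; data untouched
        return buffer.get(0); // absolute read of the value written earlier
    }

    public static void main(String[] args) {
        FloatBuffer buffer = FloatBuffer.allocate(4);
        buffer.put(1.5f).put(2.5f);
        System.out.println("before clear: position=" + buffer.position()
                + " limit=" + buffer.limit());
        buffer.clear();
        System.out.println("after clear:  position=" + buffer.position()
                + " limit=" + buffer.limit());
        System.out.println("first element is still " + buffer.get(0));
    }
}
```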
VisualVM +1
have fun :)
(offtopic: if the buffers are large and static you should allocate them as direct buffers, which live in native memory outside the Java heap rather than in the permgen. Buffers.newDirectFloatBuffer() would be one of those utility methods in latest gluegen-rt)
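For reference, a direct float buffer can also be created with plain NIO, without any JOGL helper. The sketch below is my own illustration (class and method names invented), not the gluegen-rt implementation. The storage of a direct buffer is native memory outside the Java heap, and it is only released once the buffer object itself has been garbage collected.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

final class DirectBufferSketch {
    /** Allocates a native-ordered direct FloatBuffer holding floatCount floats. */
    static FloatBuffer newDirectFloatBuffer(int floatCount) {
        return ByteBuffer.allocateDirect(floatCount * Float.BYTES)
                .order(ByteOrder.nativeOrder()) // byte order expected by native code
                .asFloatBuffer();
    }

    public static void main(String[] args) {
        FloatBuffer vertices = newDirectFloatBuffer(192); // one cylinder's worth of floats
        System.out.println("direct=" + vertices.isDirect()
                + " capacity=" + vertices.capacity());
    }
}
```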

Related

JVM garbage collection suddenly takes a lot of CPU

I have an application which reads XML-responses from a server.
This works nicely until I try to read ~200,000 XML responses. When I reach that magic number, processing slows down by a factor of 10.
When I let it run, at some point the JVM would say that the GC is taking 90% of CPU-time. So I first tried to optimize my code - using fields instead of local variables, using intern on my strings (Since I have a lot of copies) and so on.
This helped a bit, but it still went slow after approx 100k XML-files. I then tried using Visual VM to see what was going on, and what I saw was:
Up until 18:02, everything works fine. Then suddenly the garbage collector goes bananas and steals CPU time, which in turn stabilizes memory consumption. I would understand this if we were hitting the maximum memory of the heap, but I've set the max heap size to 8 GB.
There is nothing different happening at that point, it's basically a giant loop doing the same thing over and over.
What is happening and what can I do in this situation?

Your heap size is insufficient for your workflow. You may have a memory leak, or it may just be specific to your application.
Normal pattern for parallel GC algorithm (which you have enabled)
Young GC
Young GC
...
Full GC
Though, once old space is full (~5.6 GiB for your setup), pattern would switch to
Full GC
Full GC
Full GC
...
A full GC takes an order of magnitude longer, so the application would stay in GC pauses (with high CPU consumption) almost all the time. VisualVM charts GC CPU usage incorrectly; in reality the blue spikes are as high as the orange line on the CPU chart.
If memory usage grows due to a memory leak, you should address that.
If it is specific to the application design, you need to increase the old space by
either increasing the total heap size
or reducing the young space (-XmnSIZE option) to leave more memory for the old space
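To see which of the two patterns the application is in without a profiler attached, the collection counters can be read from inside the JVM. A minimal sketch (class name invented); with the parallel collector the young and old collectors are typically reported as "PS Scavenge" and "PS MarkSweep", but the names depend on the collector and JVM version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

final class GcCounters {
    /** Snapshot of collector name -> number of collections so far. */
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime() is the accumulated collection time in milliseconds.
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Logging this periodically from the main loop would show the old-generation counter starting to climb on every sample once the old space is full.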

Is it good to set the max and min JVM heap size the same?

Currently in our testing environment the max and min JVM heap size are set to the same value, basically as much as the dedicated server machine will allow for our application. Is this the best configuration for performance or would giving the JVM a range be better?

Peter's answer is correct in that -Xms is allocated at startup and it will grow up to -Xmx (max heap size) but it's a little misleading in how he has worded his answer. (Sorry Peter I know you know this stuff cold).
Setting ms == mx effectively turns off this behavior. While this used to be a good idea in older JVMs, it is no longer the case. Growing and shrinking the heap allows the JVM to adapt to increases in pressure on memory yet reduce pause time by shrinking the heap when memory pressure is reduced. Sometimes this behavior doesn't give you the performance benefits you'd expect and in those cases it's best to set mx == ms.
An OOME is thrown when more than 98% of the time is spent collecting and the collections recover less than 2% of the heap. If you are not at the max heap size then the JVM will simply grow the heap so that you're beyond those boundaries. You cannot have an OutOfMemoryError on startup unless your heap hits the max heap size and meets the other conditions that define an OutOfMemoryError.
For the comments that have come in since I posted. I don't know what the JMonitor blog entry is showing but this is from the PSYoung collector.
size_t desired_size = MAX2(MIN2(eden_plus_survivors, gen_size_limit()),
min_gen_size());
I could do more digging about but I'd bet I'd find code that serves the same purpose in the ParNew and PSOldGen and CMS Tenured implementations. In fact it's unlikely that CMS would be able to return memory unless there has been a Concurrent Mode Failure. In the case of a CMF the serial collector will run and that should include a compaction after which top of heap would most likely be clean and therefore eligible to be deallocated.

The main reason to set -Xms is if you need a certain heap size on startup (it prevents OutOfMemoryErrors from happening on startup). As mentioned above, you would match them when you need the startup heap to match the max heap. Otherwise you don't really need it; it just asks the application to take up more memory than it may ultimately need. Watching your memory use over time (profiling) while load testing and using your application should give you a good feel for what to set them to. But it isn't the worst thing to set them to the same value on startup. For a lot of our apps, I actually start out with something like 128, 256, or 512 for min (startup) and one gigabyte for max (this is for non application server applications).
Just found this question on stack overflow which may also be helpful side-effect-for-increasing-maxpermsize-and-max-heap-size. Worth the look.

AFAIK, setting both to the same size does away with the additional step of heap resizing, which might be in your favour if you pretty much know how much heap you are going to use. Also, having a large heap size reduces GC invocations to the point that they happen very rarely. In my current project (risk analysis of trades), our risk engines have both Xmx and Xms set to the same value, which is pretty large (around 8 GiB). This ensures that even after an entire day of invoking the engines, almost no GC takes place.
Also, I found an interesting discussion here.

Definitely yes for a server app. What's the point of having so much memory but not using it?
(No it doesn't save electricity if you don't use a memory cell)
JVM loves memory. For a given app, the more memory JVM has, the less GC it performs. The best part is more objects will die young and less will tenure.
Especially during a server startup, the load is even higher than normal. It's brain dead to give server a small memory to work with at this stage.

From what I see here at http://java-monitor.com/forum/showthread.php?t=427
the JVM under test begins with the Xms setting, but WILL deallocate memory it doesn't need, and it will take it up to the Xmx mark when it needs it.
Unless you need a chunk of memory dedicated to a big memory consumer initially, there's not much point in setting a high Xms=Xmx. It looks like deallocation and allocation occur even with Xms=Xmx.

Too Many Garbage Problem on Java

I have an application that basically creates a new byte array (less than 1K), stores some data in it, and after a few seconds (generally less than 1 minute, but some data is kept for up to 1 hour) writes it to disk, after which the data becomes garbage. Approximately 400 packets per second are created. I have read some articles that say not to worry about GC, especially for quickly created and released memory (on Java 6).
GC runs for too long, which causes problems in my application.
I set some GC parameters (a bigger Xmx and ParallelGC); this decreased the full GC time, but not enough yet. I have 2 ideas:
Should I focus on GC parameters or create a byte array memory pool mechanism? Which one is better?

The frequency of GCs is dependent on the object size, but the cost (the clean-up time) is more dependent on the number of objects. I suspect the long-lived arrays are being copied between the spaces until they end up in the old space and are finally discarded. Cleaning the old gen is relatively expensive.
I suggest you try using ByteBuffer to store data. These are like byte[] but have an adjustable position and limit (the capacity itself is fixed) and can be slightly more efficient if you can use direct byte buffers with NIO. Pre-allocating your buffers can be more efficient (though it can waste virtual memory).
BTW: The direct byte buffers use little heap space as they use memory in the "C" space.
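A minimal sketch of what pre-allocating could look like: a small pool of direct buffers that are handed out, filled, written to disk and then returned instead of becoming garbage. The class name, the pool size and the 1 KB buffer size are assumptions for illustration, loosely based on the numbers in the question.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

final class PacketBufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int bufferSize;

    PacketBufferPool(int bufferCount, int bufferSize) {
        this.bufferSize = bufferSize;
        for (int i = 0; i < bufferCount; i++) {
            free.push(ByteBuffer.allocateDirect(bufferSize));
        }
    }

    /** Hands out a reset buffer; allocates a new one if the pool is empty. */
    synchronized ByteBuffer acquire() {
        ByteBuffer buffer = free.poll();
        if (buffer == null) {
            buffer = ByteBuffer.allocateDirect(bufferSize);
        }
        buffer.clear(); // reset position/limit; old bytes get overwritten on reuse
        return buffer;
    }

    /** Returns a buffer to the pool once its data has been written to disk. */
    synchronized void release(ByteBuffer buffer) {
        free.push(buffer);
    }

    /** Number of buffers currently waiting in the pool. */
    synchronized int available() {
        return free.size();
    }

    public static void main(String[] args) {
        PacketBufferPool pool = new PacketBufferPool(1024, 1024);
        ByteBuffer packet = pool.acquire();
        packet.put((byte) 42); // fill with packet data
        packet.flip();         // ready to be written out
        pool.release(packet);  // reuse instead of leaving it to the GC
        System.out.println("buffers available: " + pool.available());
    }
}
```

Whether this beats plain byte[] allocation depends on the workload, so it is worth measuring both before committing to a pool.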

I suggest you do some analysis into why GC is not working well enough for you. You can use jmap to dump out the heap and then use jhat or Eclipse Memory Analyser to see what objects are living in it. You might find that you are holding on to references that you no longer need.
The GC is very clever and you could actually make things worse by trying to outsmart it with your own memory management code. Try tuning the parameters and maybe you can try out the new G1 Garbage Collector too.
Also, remember, that GC loves short-lived, immutable objects.

Use a profiler to identify the code snippet
Try WeakReferences.
Suggest a GC algorithm to the VM
-Xgc:parallel
Set a big heap and shared memory
-XX:+UseISM -XX:+AggressiveHeap
Set the following for garbage collection.
-XX:SurvivorRatio=8
This may help
http://download.oracle.com/docs/cd/E12840_01/wls/docs103/perform/JVMTuning.html#wp1130305

How to ensure JVM starts with value of Xms

When I run a Java program with a starting heap size of 3G (set by the -Xms3072m VM argument), the JVM doesn't start with that size. It starts with 400m or so and then keeps acquiring more memory as required.
This is a serious problem for me. I know the JVM is going to need that amount after some time. And when the JVM increases its memory as needed, it slows down. While the JVM acquires more memory, a considerable amount of time is spent in garbage collection. And I suppose memory acquisition is an expensive task.
How do I ensure that JVM actually respects the start heap size parameter?
Update: This application creates lots of objects, most of which die quickly. Some resulting objects are required to stay in memory (which get transferred out of young heap.) During this operation, all these objects need to be in memory. After the operation, I can see that all the objects in young heap are claimed successfully. So there are no memory leaks.
The same operation runs smoothly when the heap size reaches 3G. That clearly indicates the extra time required is spent in acquiring memory.
This is Sun JDK 5.

If I am not mistaken, Java tries to get the reservation for the memory from the OS. So if you ask for 3 GB as Xms, Java will ask the OS, if this is available but not start with all the memory right away... it might even reserve it (not allocate it). But these are details.
Normally, the JVM runs up to the Xms size before it starts serious old generation garbage collection. Young generation GC runs all the time. Normally GC is only noticeable when old gen GC is running and the VM is in between Xms and Xmx or, in case you set it to the same value, hit roughly Xmx.
If you need a lot of memory for short lived objects, increase that memory area by setting the young area to... let's say 1 GB -XX:NewSize=1g because it is costly to move the "trash" from the young "buckets" into the old gen. Because in case it has not turned into real trash yet, the JVM checks for garbage, does not find any, copies it between the survivor spaces, and finally moves into the old gen. So try to suppress the check for the garbage in the young gen, when you know that you do not have any and postpone this somehow...
Give it a try!
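To check how the young and old areas are actually sized after changing such options, the heap memory pools can be listed from inside the JVM. A sketch (class name invented); the pool names printed depend on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

final class HeapPools {
    /** Number of memory pools that belong to the Java heap. */
    static int heapPoolCount() {
        int count = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (pool.getType() != MemoryType.HEAP || usage == null) {
                continue; // skip non-heap pools and pools that are no longer valid
            }
            // getMax() returns -1 when the pool has no defined maximum.
            System.out.printf("%-24s used=%,d committed=%,d max=%,d%n",
                    pool.getName(), usage.getUsed(), usage.getCommitted(), usage.getMax());
        }
    }
}
```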

I believe your problem is not coming from where you think.
It looks like what's costing you the most are the GC cycles, and not the allocation of heap size, if you are indeed creating and deleting lots of objects.
You should be focusing your effort on profiling, to find out exactly what is costing you so much, and work on refactoring that.
My hunch - object creation and deletion, and GC cycles.
In any case, -Xms should be setting minimum heap size (check this with your JVM if it is not Sun). Double-check to see exactly why you think it's not the case.
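One way to double-check from inside the program rather than from an OS tool is to print what the JVM itself reports right at startup. A minimal sketch (class name invented); the value from totalMemory() is approximately the committed heap and may be slightly below the -Xms value depending on the collector.

```java
final class HeapSizeCheck {
    /** Heap memory the JVM has currently committed, in bytes. */
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Upper limit the heap may grow to, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("committed heap: " + committedHeapBytes() / mib + " MiB");
        System.out.println("free in heap:   " + Runtime.getRuntime().freeMemory() / mib + " MiB");
        System.out.println("max heap:       " + maxHeapBytes() / mib + " MiB");
    }
}
```

Running it with the same -Xms and -Xmx arguments as the real application should show whether the 400m figure comes from the JVM or from how the OS tool counts memory.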

I have used Sun's VM and started with the minimum set to 14 gigs, and it does start off with that.
Maybe you should try setting both the Xms and Xmx values to the same amount, i.e. try this:
-Xms3072m -Xmx3072m

Why do you think the heap allocation is not right? Taking any operating system tool that shows only 400m does not mean it isn't allocated.
I don't really get what you are after. Is the 400m and above already a problem, or is your program supposed to need that much? If you really need to deal with that much memory and it seems you need a lot of objects, then you can do several things:
If the memory consumption doesn't match your gut feeling of what the right amount is, then you are probably leaking memory. That would explain why it "slows down" over time. Maybe you forgot to remove objects from some structure, so they don't get garbage collected and slow down lookups and such.
Your memory settings are maybe the trouble in itself. Garbage collection is not run per se. It is only called if there is some threshold reached. If you give it a big heap setting and your operating system has plenty of memory the garbage collection runs not often.
The characteristics you mentioned would be a scenario where a lot of objects are created and shortly after they would be deleted again. Otherwise the garbage collection wouldn't be a problem (some sort of generational gc). That means you have only "young" objects. Consider using an object pool if you are needing objects only a short period of time. That would eliminate the garbage collection at all.
If you know there are good times in your code for running gc you can consider running it manually to be able to see if it changes anything. This is what you would need
Runtime r = Runtime.getRuntime();
r.gc();
This is just for debugging purposes. The gc is doing a great job most of the time so there shouldn't be the need to invoke the gc on your own.

Java 6 Excessive Memory Usage

Does Java 6 consume more memory than you expect for largish applications?
I have an application I have been developing for years, which has, until now, taken about 30-40 MB in my particular test configuration; now with Java 6u10 and 11 it is taking several hundred while active. It bounces around a lot, anywhere between 50M and 200M, and when it idles, it does GC and drops the memory right down. In addition it generates millions of page faults. All of this is observed via Windows Task Manager.
So, I ran it up under my profiler (jProfiler) and using jVisualVM, and both of them indicate the usual moderate heap and perm-gen usages of around 30M combined, even when fully active doing my load-test cycle.
So I am mystified! And it is not just requesting more memory from the Windows Virtual Memory pool - this is showing up as 200M "Mem Usage".
CLARIFICATION: I want to be perfectly clear on this - observed over an 18 hour period with Java VisualVM the class heap and perm gen heap have been perfectly stable. The allocated volatile heap (eden and tenured) sits unmoved at 16MB (which it reaches in the first few minutes), and the use of this memory fluctuates in a perfect pattern of growing evenly from 8MB to 16MB, at which point GC kicks in and drops it back to 8MB. Over this 18 hour period, the system was under constant maximum load since I was running a stress test. This behavior is perfectly and consistently reproducible, seen over numerous runs. The only anomaly is that while this is going on the memory taken from Windows, observed via Task Manager, fluctuates all over the place from 64MB up to 900+MB.
UPDATE 2008-12-18: I have run the program with -Xms16M -Xmx16M without any apparent adverse effect - performance is fine, total run time is about the same. But memory use in a short run still peaked at about 180M.
Update 2009-01-21: It seems the answer may be in the number of threads - see my answer below.
EDIT: And I mean millions of page faults literally - in the region of 30M+.
EDIT: I have a 4G machine, so the 200M is not significant in that regard.

In response to a discussion in the comments to Ran's answer, here's a test case that proves that the JVM will release memory back to the OS under certain circumstances:
public class FreeTest
{
    public static void main(String[] args) throws Exception
    {
        byte[][] blob = new byte[60][1024*1024];
        for(int i=0; i<blob.length; i++)
        {
            Thread.sleep(500);
            System.out.println("freeing block "+i);
            blob[i] = null;
            System.gc();
        }
    }
}
I see the JVM process' size decrease when the count reaches around 40, on both Java 1.4 and Java 6 JVMs (from Sun).
You can even tune the exact behaviour with the -XX:MaxHeapFreeRatio and -XX:MinHeapFreeRatio options -- some of the options on that page may also help with answering the original question.

I don't know about the page faults. but about the huge memory allocated for Java:
1. Sun's JVM deallocates memory only after the ratio between internal memory needs and allocated memory drops beneath a (tunable) value. The JVM starts with the amount specified in -Xms and can be extended up to the amount specified in -Xmx. I'm not sure what the defaults are. Whenever the JVM needs more memory (new objects / primitives / arrays) it allocates an entire chunk from the OS. However, when the need subsides (a momentary need, see 2 as well) it doesn't deallocate the memory back to the OS immediately, but keeps it to itself until that ratio has been reached. I was once told that JRockit behaves better, but I can't verify it.
2. Sun's JVM runs a full GC based on several triggers. One of them is the amount of available memory - when it falls too low the JVM tries to perform a full GC to free some more. So, when more memory is allocated from the OS (momentary need) the chance of a full GC is lowered. This means that while you may see 30MB of "live" objects, there might be a lot more "dead" objects (not reachable), just waiting for a GC to happen. I know YourKit has a great view called "dead objects" where you may see these "left-overs".
3. In "-server" mode, Sun's JVM runs GC in parallel mode (as opposed to the older serial "stop the world" GC). This means that while there may be garbage to collect, it might not be collected immediately because of other threads taking all available CPU time. It will be collected before running out of memory (well, kinda, see http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html); if more memory can be allocated from the OS, that might happen before the GC runs.
Combined, a large initial memory configuration and short bursts creating a lot of short-lived objects might create a scenario as described.
edit: changed "never deallocates" to "only after ratio reached".

Excessive thread creation explains your problem perfectly:
Each Thread gets its own stack, which is separate from heap memory and therefore not registered by profilers
The default thread stack size is quite large, IIRC 256KB (at least it was for Java 1.3)
Thread stack memory is probably not reused, so if you create and destroy lots of threads, you'll get lots of page faults
If you ever really need to have hundreds of threads around, the thread stack size can be configured via the -Xss command line parameter.
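Besides the global -Xss setting, a stack size can also be requested per thread through the four-argument Thread constructor. A small sketch (class and thread names invented); the JVM is free to round the value up or ignore it, so treat it as a hint only.

```java
final class StackSizeSketch {
    /**
     * Runs the task on a thread created with the requested stack size and
     * waits for it to finish. Returns false if the wait was interrupted.
     */
    static boolean runWithStackSize(Runnable task, long stackSizeBytes) {
        // A null ThreadGroup means "use the group of the current thread".
        Thread thread = new Thread(null, task, "custom-stack-worker", stackSizeBytes);
        thread.start();
        try {
            thread.join();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        runWithStackSize(
                () -> System.out.println("running in " + Thread.currentThread().getName()),
                256 * 1024); // request 256 KB, the figure mentioned above
    }
}
```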

Garbage collection is a rather arcane science. As the state of the art develops, un-tuned behaviour will change in response.
Java 6 has different default GC behaviour and different "ergonomics" to earlier JVM versions. If you tell it that it can use more memory (either explicitly on the command line, or implicitly by failing to specify anything more explicit), it will use more memory if it believes that this is likely to improve performance.
In this case, Java 6 appears to believe that reserving the extra space which the heap could grow into will give it better performance - presumably because it believes that this will cause more objects to die in Eden space, and limit the number of objects promoted to the tenured generation space. And from the specifications of your hardware, the JVM doesn't think that this extra reserved heap space will cause any problems. Note that many (though not all) of the assumptions the JVM makes in reaching its conclusion are based on "typical" applications, rather than your specific application. It also makes assumptions based on your hardware and OS profile.
If the JVM has made the wrong assumptions, you can influence its behaviour through the command line, though it is easy to get things wrong...
Information about performance changes in java 6 can be found here.
There is a discussion about memory management and performance implications in the Memory Management White Paper.

Over the last few weeks I had cause to investigate and correct a problem with a thread pooling object (a pre-Java 6 multi-threaded execution pool), where it was launching far more threads than required. In the jobs in question there could be up to 200 unnecessary threads. And the threads were continually dying and new ones replacing them.
Having corrected that problem, I thought to run a test again, and now it seems the memory consumption is stable (though 20 or so MB higher than with older JVMs).
So my conclusion is that the spikes in memory were related to the number of threads running (several hundred). Unfortunately I don't have time to experiment.
If someone would like to experiment and answer this with their conclusions, I will accept that answer; otherwise I will accept this one (after the 2 day waiting period).
Also, the page fault rate is way down (by a factor of 10).
Also, the fixes to the thread pool corrected some contention issues.
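For anyone hitting the same thing with a home-grown pool: on a current JDK a fixed-size executor from java.util.concurrent keeps a bounded set of threads alive and reuses them, so thread stacks are not constantly created and torn down. This is a generic sketch with invented names, not the pool described above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class BoundedPoolSketch {
    /** Runs taskCount small jobs on a fixed pool and returns how many distinct threads ran them. */
    static int distinctThreadsUsed(int poolSize, int taskCount) {
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> threadNames.add(Thread.currentThread().getName()));
        }
        pool.shutdown(); // no new tasks; already queued tasks still run
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return threadNames.size();
    }

    public static void main(String[] args) {
        int used = distinctThreadsUsed(4, 1000);
        System.out.println("1000 tasks were handled by " + used + " thread(s)");
    }
}
```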

Lots of memory allocated outside Java's heap after upgrading to Java 6u10? Can only be one thing:
Java6 u10 Release Notes: "New Direct3D Accelerated Rendering Pipeline (...) Enabled by Default"
Sun enabled Direct 3D accelerations by default in Java 6u10. This option creates lots of (temporary?) native memory buffers, which are allocated outside the Java Heap. Add the following vm argument to disable it again:
-Dsun.java2d.d3d=false
Note that this will NOT disable 2D hardware acceleration, just some features that can make use of 3D hardware acceleration. You will see that your Java heap usage will increase by up to 7MB, but that's a good trade-off because you'll save ~100MB(+) of this temporary volatile memory.
I did a fair amount of testing with 2 Swing desktop applications, on two platforms:
a high-end Intel-i7 with nVidia GTX 260 graphics card,
a 3-year laptop with Intel graphics.
On both hardware platforms the option made practically zero subjective difference. (Tests included: scrolling tables, zooming graphical flowsheets, charts, etc.). On the few tests where something was subtly different, disabling d3d counter-intuitively increased performance. I suspect that memory management/bandwidth problems counteracted whatever benefits the d3d accelerated functions were supposed to achieve. (Your mileage may vary!)
If you need to do some performance tuning, here's an excellent reference (e.g. "Troubleshooting Java 2D")

Are you using the ConcMarkSweep collector? It can increase the amount of memory required for your application due to increased memory fragmentation, and "floating garbage" - objects that become unreachable only after the collector has examined them, and therefore are not collected until the next pass.
