Too Many Garbage Problem on Java - java

I have an application, basically, create a new byte array (less than 1K) store some data after few seconds (generally less than 1 minute, but some data stored up to 1 hour) write to disk and data will goes to garbage. Approximatelly 400 packets per second created. I read some articles that say don't worry about GC especially quickly created and released memory parts (on Java 6).
GC runs too long cause some problem about on my application.
I set some GC parameters(Bigger XMX and ParalelGC),this decrease Full GC time decrease but not enough yet. I have 2 idea,
Am I focus GC parameters or create Byte array memory pool mechanism? Which one is better?

The frequency of performing a GC is dependant on the object size, but the cost (the clean up time) is more dependant on the number of objects. I suspect the long living arrays are being copied between the spaces until it end up in the old space and finally discarded. Cleaning the old gen is relatively expensive.
I suggest you try using ByteBuffer to store data. These are like byte[] but have a variable size and can be slightly more efficient if you can use direct byte buffers with NIO. Pre-allocating your buffers can be more efficient to preallocate your buffers. (though can waste virtual memory)
BTW: The direct byte buffers use little heap space as they use memory in the "C" space.

I suggest you do some analysis into why GC is not working well enough for you. You can use jmap to dump out the heap and then use jhat or Eclipse Memory Analyser to see what objects are living in it. You might find that you are holding on to references that you no longer need.
The GC is very clever and you could actually make things worse by trying to outsmart it with your own memory management code. Try tuning the parameters and maybe you can try out the new G1 Garbage Collector too.
Also, remember, that GC loves short-lived, immutable objects.

Use profiler to identify the code snippet
Try with WeakReferences.
Suggest an GC algo to the VM
-Xgc: parallel
Set a big Heap and shared mem
-XX:+UseISM -XX:+AggressiveHeap
set below for garbage collection.
-XX:SurvivorRatio 8
This may help
http://download.oracle.com/docs/cd/E12840_01/wls/docs103/perform/JVMTuning.html#wp1130305

Related

How to gracefully tell Java about total memory limits?

I have troubles with Java memory consumption.
I'd like to say to Java something like this: "you have 8GB of memory, please use it, and only it. Only if you really can't put all your resources in this memory pool, then fail with OOM".
I know, there are default parameters like -Xmx - they limit only the heap. There are also plenty of other parameters, I know. The problems with these parameters are:
They aren't relevant. I don't want to limit the heap size to 6GB (and trust that native memory won't take more than 2GB). I do want to limit all the memory (heap, native, whatever). And do that effectively, not just saying "-Xmx1GB" - to be safe.
There is too many different parameters related to memory, and I don't know how to configure all of them to achieve the goal.
So, I don't want to go there and care about heap, perm and whatever types of memory. My high-level expectation is: since there is only 8GB, and some static memory is needed - take the static memory from the 8GB, and carefully split the remaining memory between other dynamic memory entities.
Also, ulimit and similar things don't work. I don't want to kill the java process once it consumes more memory than expected. I want Java does its best to not reach the limit firstly, and only if it really, really can't - kill the process.
And I'm OK to define even 100 java parameters, why not. :) But then I need assistance with the full list of needed parameters (for, say, Java 8).
Have you tried -XX:MetaspaceSize?
Is this what you need?
Please, read this article: http://karunsubramanian.com/websphere/one-important-change-in-memory-management-in-java-8/
Keep in mind that this is only valid to Java 8.
AFAIK, there is no java command line parameter or set of parameters that will do that.
Your best bet (IMO) is to set the max heap size and the max metaspace size and hope that other things are going to be pretty static / predictable for your application. (It won't cover the size of the JVM binary and it probably won't cover native libraries, memory mapped files, stacks and so on.)
In a comment you said:
So I'm forced to have a significant amount of memory unused to be safe.
I think you are worrying about the wrong thing here. Assuming that you are not constrained by address space or swap space limitations, memory that is never used doesn't matter.
If a page of your address space is not used, the OS will (in the long term) swap it out, and give the physical RAM page to something else.
Pages in the heap won't be in that situation in a typical Java application. (Address space pages will cycle between in-use and free as the GC moves objects within and between "spaces".)
However, the flip-side is that a GC needs the total heap size to be significantly larger than the sum of the live objects. If too much of the heap is occupied with reachable objects, the interval between garbage collection runs decreases, and your GC ergonomics suffer. In the worst case, a JVM can grind to a halt as the time spent in the GC tends to 100%. Ugly. The GC overhead limit mechanism prevents this, but that just means that your JVM gets an OOME sooner.
So, in the normal heap case, a better way to think about it is that you need to keep a portion of memory "unused" so that the GC can operate efficiently.

java full gc taking too long

I have a Java client which consumes a large amount of data from a server. If the client does not keep up with the data stream at a fast enough rate, the server disconnects the socket connection. My client gets disconnected a few times per day. I ran jconsole to see the memory usage, and the heap space graph looks like a fairly well defined sawtooth pattern, oscillating between about 0.5GB and 1.8GB (2GB of heap space is allocated). But every time I get disconnected is during a full GC (but not on every full GC). I see the full GC takes a bit over 1 second on average. Depending on the time of day, full GC happens as often as every 5 minutes when busy, or up to 30 minutes can go by in between full GCs during the slow periods.
I suspect if I can reduce the full GC time, the client will be able to better keep up with the incoming data, but I do not have much experience with GC tuning. Does anyone have some insight on if this might be a good idea, and how to do it? Or is there an alternative idea which may work as well?
** UPDATE **
I used -XX:+UseConcMarkSweepGC and it improved, but I still got disconnected during the very busy moments. So I increased the heap allocation to 3GB to help weather through the busy moments and it seems to be chugging along pretty well now, but it's only been 1 day without a disconnection. Maybe if I get some time I will go through and try to reduce the amount of garbage created which I'm confident will help as well. Thanks for all the suggestions.
Full GC could take very long to complete, and is not that easy to tune.
One way to (easily) tune it is to increase the heap space - generally speaking, double the heap space can double the interval between two GCs, but will double the time consumed by a GC. If the program you are running has very clear usage patterns, maybe you can consider increase the heap space to make the interval so large that you can guarantee to have some idle time to try to make the system perform a GC. On the other hand, following this logic, if the heap is small a full garbage collection will finish in a instant, but that seems like inviting more troubles than helping.
Also, -XX:+UseConcMarkSweepGC might help since it will try to perform the GC operations concurrently (not stopping your program; see here).
Here's a very nice talk by Til Gene (CTO of Azul systems, maker of high performance JVM, and published several GC algos), about GC in JVM in general.
It is not easy to tune away the Full GC. A much better approach is to produce less garbage. Producing less garbage reduces pressure on the collection to pass objects into the tenured space where they are more expensive to collect.
I suggest you use a memory profiler to
reduce the amount of garbage produced. In many applications this can be reduce by a factor of 2 - 10x relatively easily.
reduce the size of the objects you are creating e.g. use primitive and smaller datatypes like double instead of BigDecimal.
recycle mutable object instead of discarding them.
retain less data on the client if you can.
By reducing the amount of garbage you create, objects are more likely to die in the eden, or survivor spaces meaning you have far less Full collections, which can be shorter as well.
Don't take it for granted you have to live with lots of collections, in extreme cases you can avoid it almost completely http://vanillajava.blogspot.ro/2011/06/how-to-avoid-garbage-collection.html
Take out calls to Runtime.getRuntime().gc() - When garbage collection is triggered manually it either does nothing or it does a full stop-the-world garbage collection. You want incremental GC to happen.
Have you tried using the server jvm from a jdk install? It changes a bunch of the default configuration settings (including garbage collection) and is easy to try - just add -server to your java command.
java -server
What is all the garbage that gets created? Can you generate less of it? Where possible, try to use the valueOf methods. By using less memory you'll save yourself time in gc AND in memory allocation.

MAT space vs. TaskManager space

after searching the web for a while I decided to ask you for help with my problem.
My program should analyze logfiles, which are really big. They are about 100mb up to 2gb. I want to read the files using NIO-classes like FileChannel.
I don't want to save the files in memory, but I want to process the lines immediately. The code works.
Now my problem: I analyzed the Memory usage with the Eclipse MAT plugin and it says about 18mb of data is saved (that fits). But TaskManager in Windows says that about 180mb are used by the JVM.
Can you tell me WHY this is?
I don't want to save the data reading with the FileChannel, i just want to process it. I am closing the Channel afterwards - I thought every data would be deleted then?
I hope you guys can help me with the difference between the used space is shown in MAT and the used space is shown in TaskManager.
MAT will only show objects that are actively references by your program. The JVM uses more memory than that:
Its own code
Non-object data (classes, compiled bytecode e.t.c.)
Heap space that is not currently in use, but has already been allocated.
The last case is probably the most major one. Depending on how much physical memory there is on a computer, the JVM will set a default maximum size for its heap. To improve performance it will keep using up to that amount of memory with minimal garbage collection activity. That means that objects that are no longer referenced will remain in memory, rather than be garbage collected immediately, thus increasing the total amount of memory used.
As a result, the JVM will generally not free any memory it has allocated as part of its heap back to the system. This will show-up as an inordinate amount of used memory in the OS monitoring utilities.
Applications with high object allocation/de-allocation rates will be worse - I have an application that uses 1.8GB of memory, while actually requiring less than 100MB. Reducing the maximum heap size to 120 MB, though, increases the execution time by almost a full order of magnitude.

Possible Memory Leak with JOGL using VBOs

We are currently developing an application which visualizes huge vector fields (> 250'000) on a sphere/plane in 4D. To speed up the process we are using VBOs for the vertices, normals and colors. To prepare the data before sending down to the GPU we are using Buffers (FloatBuffer, ByteBuffer, etc..).
Some data to the cylinders:
Each cylinder uses 16 * 9 + 16 * 3 = 192 floats -> 192 * 4 Bytes = 768 bytes.
After sending down the vertices we are doing the following cleanup:
// clear all buffers
vertexBufferShell.clear();
indexBufferShell.clear();
vertexBufferShell = null;
indexBufferShell = null;
We have monitored it with JConsole and we found out that the GarbageCollector is not run "correctly". Even if we switch down the cylinder count, the memory does not get freed up. In the JConsole Monitoring Tool there is a button to Run the GC and if we do that manually it frees up the memory (If we have loaded a huge amount of cylinders and decrease it a lot, sometimes over 600mb gets cleaned by the GC).
Here an image of the JConsole:
Now the question is how can we clean up this Buffers by ourself in the code? Calling the clear method and set the reference to null is not enough. We have also tried to call System.gc() but with no effect. Do you have any idea?
There is any number of reason the memory usage could increase. I would say its not a memory leak unless the memory increases every time you perform this operation. If it only occurs the first time, it may be that this library needs some memory to load.
I suggest you take a heap dump or at least jmap -histo:live before and after to see where the memory increase is.
If you use a memory profiler like VisualVM or YourKit it will show you where and why memory is being retained.
Its not really a memory leak if the gc is able to clean it up. It might be a waste of memory, but your app seems to be configured to allow it to use over 800MB of heap. This is a trade-off between garbage collection performance and memory usage. You could also try to simply run your application with a smaller heap size.
There might not be a memory leak, but objects going to the Ternured (area where the objects that passed alive in a minor gc goes).
These big step you see might be the Young Eden that is full and after a minor gc is moving alive objects to Ternure.
You can also try to tune up the garbage collector and the memory.
you might have plenty middle length live objects that are constantly passing to the Ternured releasing them in full gc. If you dimension them well those objects go minor gc.
There are plenty jvm arguments to do this.
A good place to look at is here.
This one is suitable for you:
-XX:NewSize=2.125m
Default size of new generation (in bytes)
[5.0 and newer: 64 bit VMs are scaled 30% larger; x86: 1m; x86, 5.0 and older: 640k]
Regards.
The JVM will not free any objects until it has to (e.g.-Xmx reached). Thats one of the main concepts behind all GCs which you can find in the current JVM. They are all optimised for throughput, even the concurrent one. I see nothing unusual on the GC graph.
It would be a leak if the used heap after a full GC would constantly grow over time - if it doesn't -> all good.
in short: foo=null; will not release the object only the reference. The GC can free the memory whenever it likes to.
also:
buffer.clear() does not clear the buffer, it sets pos=0 and limit=capacity - thats all. Please refer to the javadoc for more info.
VisualVM +1
have fun :)
(offtopic: if the buffers are large and static you should allocate them in the permgen. Buffers.newDirectFloatBuffer() would be one of those utility methods in latest gluegen-rt)

Does allocation speed depend on the garbage collector being used?

My app is allocating a ton of objects (>1mln per second; most objects are byte arrays of size ~80-100 and strings of the same size) and I think it might be the source of its poor performance.
The app's working set is only tens of megabytes. Profiling the app shows that GC time is negligibly small.
However, I suspect that perhaps the allocation procedure depends on which GC is being used, and some settings might make allocation faster or perhaps make a positive influence on cache hit rate, etc.
Is that so? Or is allocation performance independent on GC settings under the assumption that garbage collection itself takes little time?
Of course your performance depends on the allocator used. But you have profiled GC and saw that it is not much of an issue. Also, one of the strengths of the GC is fast allocation at the expense of slower collection.
I think you are having issues with resulting fragmentation which makes memory access pattern problematic for the cpu, since it may need to invalidate its cache too often. Most GC algorithms doesn't reclaim space in an optimum way.
Since your working set is limited and predictable, you might want to use an object pool which is allocated beforehand. You may also want to use reference counting to avoid much of the manual memory management. Technically it is still GC but not in the common sense of the GC.
Still, I don't think the performance is much affected by how you manage memory but how you actually use, access it. Most likely your profiler has the definite answer.
There are two distinct aspects to object allocation. The first is finding a suitable area of memory - with todays generational garbarge collectors, this is usually very fast (in the order of a few 10ths of machine cycles).
The second is the initialization of the objects you allocate. Since everything you allocate in Java is initialized, the cost for initialization can easily outweight the cost of allocation (except for the most simple, smallest objects). There is more. Since initialization requires writing the entire memory area the new object occupies (if you allocate a "new byte[1<<20]" for example, the entire megabyte needs to be set to zeros), this also usually pulls that memory into the cpu's cache, evicting other, older cache lines (which may or may not belong to your current "hot" working set).
If you do comparatively little processing on each of your arrays, those effects can severly affect the performance of your code. This can be partially avoided by re-using the same arrays over and over, but it usually makes the program logic more complex. It is also often not easy to determine if cache trashing is really the culprit. Its impossible to say from what little information is given in your question.
Does your VM try to pool strings? I had heard once, that IBM's VM did something like string interning but dynamically (no idea if its true) perhaps your VM is trying doing extra work to build an internal data structure of String internals.
Are you doing something like byte b[] = new byte[100]; String s = new String(b); by any chance? You might try not to allocate the String objects, and instead allocate some random object which has a reference to the byte[] (for comparison).

Categories