Quick question
Does the jmap heap dump include only the old generation, or also the young generation?
Long explanation
I have two heap dumps (jmap -heap:format=b 9999):
one from my server under no load (no HTTP requests)
one taken while it was working at 50% CPU under high load (benchmarking)
Now, the first dump shows a bigger heap size than the second (which I thought was weird).
Could that be because the young generation (under high load) changes often, because the garbage collector is running often (yes, the JVM is almost full)? The old generation is 99% full, and I've noticed the young generation space usage varies a lot.
So that would mean I made the second dump right after the GC did its job, which is why its size is smaller. Am I right?
Additional information:
Java args:
-XX:+UseParallelGC -XX:+AggressiveHeap
-Xms2048m -Xmx4096m -XX:NewSize=64m
-XX:PermSize=64m -XX:MaxPermSize=512m
Quick Answer
Both. The heap is made up of the young and old generations, so when you take a heap dump, it contains both. The stats in the heap dump should be separated by generation. Try removing the binary portion of your command and look at the output in plain text: you will see a summary of your configuration, then a breakdown of each generation. On the flip side, a -histo would just show all objects on the heap with no distinction.
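For example, with the JDK 6/7-era jmap this question is using (exact options vary by JDK version):
jmap -heap 9999
prints the plain-text summary of the heap configuration and the usage of each generation, while
jmap -histo 9999
prints a histogram of all objects on the heap with no generation breakdown.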
Long Answer
It could be that a garbage collection had just finished for the second process. Or the opposite: the first process may not have had a full collection in a while and was sitting at higher memory. Was this application/server just restarted when you took the capture? If you look at an idle process using a tool like jvisualvm, you will see the memory allocation graphs move up and down even though your process isn't doing any work. This is just the JVM doing its own thing.
Typically your full GC should kick off well before the old gen reaches a 99% mark; the JVM decides when to run it. Your young gen will fluctuate a lot, as this is where objects are created and removed the fastest. There will be many partial GCs to clean out the young gen before a full GC gets run. The difference between the two is the pause: a partial GC will not hurt your application, but a full GC stops your application while it runs, so you will want to minimize those as best you can.
If you are looking for memory leaks, or are just profiling to see how your application's GC is working, I would recommend using startup flags to print the garbage collection stats.
-XX:+PrintGCDetails -verbose:gc -Xloggc:/log/path/gc.log
Run your program for a while and then load the captured log into a tool to help visualize the results. I personally use the Garbage Collection and Memory Visualizer offered in the IBM Support Assistant Workbench. It will provide you with a summary of the captured garbage collection stats as well as a dynamic graph you can use to see how the memory in your application has been behaving. This will not show you what objects were on your heap.
At problem times in your application, if you can live with a pause, I would modify your jmap command to the following:
jmap -dump:format=b,file=/file/location/dump.hprof <pid>
Using a tool like MAT (the Eclipse Memory Analyzer), you will be able to see all of the objects, leak suspects, and various other stats about the generations and the references on the heap.
Edit For Tuning discussion
Based on your start-up parameters, there are a few things you can try:
1. Set your -Xms to the same value as your -Xmx. By doing this the JVM doesn't have to spend time allocating more heap. If you expect your application to take 4 GB, then give it all right away.
2. Depending on the number of processors on the system you are running this application on, you can set the flag -XX:ParallelGCThreads=##.
3. I haven't tried this one, but the documentation shows a parameter, -XX:+UseParallelOldGC, which shows some promise for reducing the time of old gen collection.
4. Try changing your new generation size to 1/4 of the heap (1024m instead of 64m). Your current setting could be forcing too much into the old generation. The default size is around 30%, while you have your application configured to use around 2% for the young gen. Although you have no max size configured, I worry that too much data is being moved to the old gen because the new gen is too small to handle all of the requests, so you have to perform more full (paused) GCs to clean up the memory.
Overall, I do believe that if you are allocating this much memory this fast, it could be a generational problem where memory is promoted to the old generation prematurely. If a tweak to your new gen size doesn't work, then you need to either add more heap via the -Xmx setting, or take a step back and find exactly what is holding onto the memory. The MAT tool we discussed earlier can show you the references holding onto the objects in memory. I would recommend trying points 1 and 4 first. This will be trial and error for you to get the right values.
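As a starting point, combining points 1 and 4 might look like this (the sizes are illustrative guesses, not values tuned for your workload):
-XX:+UseParallelGC -Xms4096m -Xmx4096m -XX:NewSize=1024m
-XX:PermSize=64m -XX:MaxPermSize=512m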
Related
We are having frequent outages in our app: basically, the heap grows over time to the point where the GC takes a lot of CPU time and executes for several minutes, degrading the app's performance drastically. The app is JSF on a Tomcat server.
In the meantime, we have:
Increased the heap size from 15G to 26G (-Xms27917287424 -Xmx27917287424)
Taken several heap dumps (we are trying to determine the problem using these)
Activated GC logs
With the heap size increase, the GC does not execute for as long, but it still takes a lot of CPU and freezes the app.
So the question is:
Is this normal? When the GC executes it frees memory, so I think this probably isn't a memory leak (am I right?)
Is there a way to optimize the GC, or is this behavior just a symptom of something wrong in the app itself?
How can I monitor and analyze this without taking a heap dump?
UPDATE:
I changed JSF from 2.2 to 2.3 because some heap dumps indicated that JSF was using a lot of memory.
That didn't work out, and yesterday we had an outage again, but this time a little different (from my point of view). This time we also had to restart Tomcat, because the app stopped working after a while.
In this case, the garbage collector is running even when the old gen heap is not full, and the new generation GC is running all the time.
What could be the cause of this?
As has been said in the comments, the behaviour of the application does not look unreasonable. Your code is continually allocating objects that leads to heap space filling up, causing the GC to run. There does not appear to be a memory leak since GC reclaims a lot of space and the overall used space is not continually increasing.
What does appear to be an issue is that a significant number of objects are being promoted to the old-gen before being collected. Major GC cycles are more expensive in terms of CPU due to the relocation and remapping of objects (assuming you're using a compacting algorithm).
To reduce this, you could try increasing the size of the young generation. It will have grown when you increased the overall heap size, but apparently not by enough. Ideally, you want the majority of objects to be collected during a minor GC cycle, since this is effectively free (the GC does nothing to the objects in Eden space as they are collected). You can do this with the -XX:NewRatio= or -XX:NewSize= flags. You could also try changing the survivor space sizes, again to increase the number of objects collected before tenuring (use the -XX:SurvivorRatio= flag for this).
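For instance (illustrative values, to be validated against your GC logs):
-XX:NewSize=8g -XX:MaxNewSize=8g
would pin the young generation at 8 GB of your 26 GB heap, while
-XX:NewRatio=2
would keep it at a third of the heap. Similarly, -XX:SurvivorRatio=6 would make each survivor space an eighth of the young generation.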
For monitoring, I find Flight Recorder and Mission Control very useful as you can drill down into details of how many objects of specific types are allocated. It's also easy to connect to a running JVM or take dumps for later analysis.
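If your JDK includes Flight Recorder, a recording can be started on a running JVM with something along these lines (on older Oracle Java 8 builds JFR is a commercial feature and requires starting the JVM with -XX:+UnlockCommercialFeatures -XX:+FlightRecorder):
jcmd <pid> JFR.start duration=120s filename=recording.jfr
The resulting .jfr file can then be opened in Mission Control.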
We have a few containers running java processes with docker. One thing we've been noticing is a huge amount of memory that is taken up just by running a simple spring-boot app without even including our own code (just to try and get some kind of memory profile independent of any issues we might introduce).
What I saw was that the memory consumed by Docker/the JVM was hovering around 2.5 GB. We did have a decent amount of extra deps included (Camel, Hibernate, some spring-boot deps), but that wasn't what really threw me off. What threw me off was that despite Docker saying the app consumed 2.5 GB of memory, running JConsole against it showed it consuming up to 1 GB (down to ~200 MB after a GC and slowly climbing). The memory footprint reported by Docker remained where it was after the GC as well (2.5 GB).
Furthermore, when I dumped the heap to see what kinds of objects were taking up that space, the heap was only 33 MB after I loaded the .hprof file into MAT. None of this makes much sense to me. Currently, I'm looking at the non-heap space in JConsole reported at 115 MB while the heap space is at 331 MB.
I've already read a ton (on SO and other sites) about the JVM memory regions, including some things specifically reporting that heap dumps can come out smaller, but none of them were this far off as far as I could tell. Beyond that, many of the suggestions were to watch for the GC running whenever a heap dump is taken, and for MAT's setting to show or hide unreachable objects. All of this was taken into account before posting here, and now I just feel like something else is at play that I can't capture myself and haven't found online.
I fully expect the numbers to be a little off, but it seems extreme that they're off by a factor of 10 in the best-case scenario, and by nearly a factor of 100 when looking at the Docker-reported memory usage.
Does anyone know what I might be missing here?
EDIT: This is also an app running on Java 8, not yet on Java 11. It's on the JIRA board to do, but not yet planned.
EDIT2: Adding screenshots. The spike in the JConsole screenshot is from running a GC.
JConsole gives you the amount of committed memory: 3311616 KiB ~= 3 GiB.
This is how much memory your Java process consumes, as seen by the OS.
It is unrelated to how much heap is currently in use to hold Java objects, which JConsole also reports: 130237 KiB ~= 130 MiB.
This is also unrelated to how many objects are actually alive: by default, MAT removes unreachable objects when you load the heap dump. You can keep them by going to Preferences -> Memory Analyzer -> Keep Unreachable Objects (see the MAT documentation). So if you have a lot of short-lived objects, the difference can be quite massive.
I see that it also reports a max heap of about 9 GiB. That means you have set the -Xmx parameter to a large value.
HotSpot GCs are not very good at reclaiming unused memory. They tend to use all the space available to them (the max heap size, set by -Xmx) and then never decommit the heap, effectively keeping it reserved for the Java process instead of releasing it to the OS.
If you want to minimize the memory footprint of your process from the OS perspective, I recommend that you set a lower -Xmx, maybe -Xmx1g, so as to not allow Java to grow too much (of course, -Xmx will also need to be high enough to accommodate your application workload!).
If you really want an adaptive heap, you can also switch to G1 (-XX:+UseG1GC) and a more recent Java, as the HotSpot team has delivered some improvements there recently.
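Putting those two suggestions together, the launch command might look something like this (the jar name is a placeholder, and the 1 GB cap must be sized to your actual workload):
java -Xmx1g -XX:+UseG1GC -jar your-app.jar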
Dave
OS monitoring tools will show you the amount of memory allocated by a process. So the figure you posted means that your Java process has 2.664G of memory allocated (Java heap + metaspace).
JConsole shows you the memory that your code is "consuming" (ignoring the metaspace).
I see two possible explanations:
You have set -Xms to a huge value.
You have a lot of static code (or other content) loaded into your metaspace.
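To check which of these applies, the effective flag values on the live process can be inspected with standard JDK tools (assuming jcmd/jinfo from the same JDK are on the PATH):
jcmd <pid> VM.flags
jinfo -flag InitialHeapSize <pid>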
I have an instance of ZooKeeper that has been running for some time (Java 1.7.0_131, ZK 3.5.1-1), with -Xmx10G -XX:+UseParallelGC.
Recently there was a leadership change, and the memory usage on most instances in the quorum went from ~200MB to 2GB+. I took a jmap dump, and what I found interesting was that there was a lot of byte[] serialization data (>1GB) that had no GC root but hadn't been collected.
(The classes involved were ByteArrayOutputStream, DataOutputStream, org.apache.jute.BinaryOutputArchive, and HeapByteBuffer.)
Looking at the GC log, shortly before the election change the full GC was running every 4-5 minutes. After the election, the tenuring threshold increased from 1 to 15 (the max) and the full GC ran less and less often; eventually it didn't even run on some days.
After several days, suddenly, and mysteriously to me, something changed, and the memory plummeted back to ~200MB with the full GC again running every 4-5 minutes.
What I'm confused about here is how so much memory can have no GC root and still not get collected by a full GC. I even tried triggering a GC.run from jcmd a few times.
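(For reference, that jcmd diagnostic command has the form
jcmd <pid> GC.run
and asks the JVM to perform the equivalent of System.gc().)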
I wondered if something in ZK native land was holding onto this memory, or leaking this memory... which could explain it.
I'm looking for any debugging suggestions. I'm planning on upgrading to Java 1.8 and maybe ZK 3.5.4, but would really like to root-cause this before moving on.
So far I've used visualvm, GCviewer and Eclipse MAT.
(Solid vertical black lines are full GC. Yellow is young generation).
I am not an expert on ZK. However, I have been tuning JVMs on WebLogic for a while, and based on this information I feel that your configuration (-Xmx10G -XX:+UseParallelGC) is causing the heap to expand and shrink. Perhaps you should try using -Xms10G together with -Xmx10G to avoid this resizing. Importantly, each time the heap is resized a full GC is executed, so avoiding resizing is a good way to minimize the number of full garbage collections.
Please read this:
"When a Hotspot JVM starts, the heap, the young generation and the perm generation space are allocated to their initial sizes determined by the -Xms, -XX:NewSize, and -XX:PermSize parameters respectively, and increment as-needed to the maximum reserved size, which are -Xmx, -XX:MaxNewSize, and -XX:MaxPermSize. The JVM may also shrink the real size at runtime if the memory is not needed as much as originally specified. However, each resizing activity triggers a Full Garbage Collection (GC), and therefore impacts performance. As a best practice, we recommend that you make the initial and maximum sizes identical."
Source: http://www.oracle.com/us/products/applications/aia-11g-performance-tuning-1915233.pdf
If you could provide your gc.log, it would be useful to analyse this case thoroughly.
Best regards,
RCC
I am running a server application on JVM Sun Java 1.6.0_21.
My application is data heavy and acts as a cache server, so it stores a lot of long-lived data that we do not expect to be GC'd for as long as the application is running.
I am setting the following JVM parameters: -Xmx16384M and -Xms16384M.
After the required data has been loaded, the memory usage of the application is as follows:
Total heap space: 13969522688
Maximum heap space: 15271002112
Free heap space: 3031718040
Long-term (old gen) heap storage: Used=10426MB Max=10922MB Used/Max=95%
Old gen usage: I have confirmed that it is due to actual data and is not expected to be freed.
My question is that the JVM's default sizing of the heap (it is allocating 10922MB to the old gen) leaves very little free space in the old gen section.
Can less free space in old gen impact the application?
If yes, how should I handle this? Should I experiment with JVM tuning parameters like NewRatio and try to increase the space available for the old gen, or should I tune the application some other way?
Can less free space in old gen impact the application?
If your tenured gen gets full, a major collection will happen, and this type of collection is costly.
You can use the options -verbose:gc and -XX:+PrintGCDetails to find out whether full GCs happen too often. If so, then yes, it can impact the performance of your application.
If yes, how should I handle this? Should I experiment with JVM tuning parameters like NewRatio and try to increase the space available for the old gen, or should I tune the application some other way?
You can give NewRatio a try, but keep in mind that if your eden is too small, your tenured gen will probably fill up faster.
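For example (an illustrative value, not a recommendation for this exact workload):
-XX:NewRatio=7
would keep the young generation at 1/8 of the heap, leaving about 14 GB of the 16 GB heap for the old gen where the cached data lives.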
To conclude, you should use a monitoring tool to get a better idea of the VM options you need. It will show you easily how your generations fill up during the app's execution; it's easier to read and understand than GC logs ;)
If you know that your objects live that long, play with the parameters that set the sizes of the regions relative to each other.
You can set the ratios between the young and old generations (eden and tenured spaces) as well as between the survivor spaces.
The goal is to minimize full garbage collections by allowing the minor garbage collections to free all the memory they can.
You prevent the garbage collector from freeing objects by keeping them reachable in your application; that is, you should only care about the objects that minor garbage collections are able to delete.
Enable the parameters
-verbose:gc -Xloggc:/opt/tomcat/logs/gc.out -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
Then, with a tool like GCViewer, you can see the time spent in GC and the number (and size) of objects removed, among other useful metrics.
I am running an application server on Linux 64bit with 8 core CPUs and 6 GB memory.
The server must be highly responsive.
After some inspection I found that the application running on the server creates a rather huge number of short-lived objects, and has only about 200~400 MB of long-lived objects (as long as there is no memory leak).
After reading http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html
I use these JVM options
-server -Xms2g -Xmx2g -XX:MaxPermSize=256m -XX:NewRatio=1 -XX:+UseConcMarkSweepGC
Result: the minor GC takes 0.01~0.02 sec, the major GC takes 1~3 sec, and the minor GC happens constantly.
How can I further improve or tune the JVM?
A larger heap size? But will GC then take more time?
A larger NewSize and MaxNewSize (for the young generation)?
A different collector? The parallel GC?
Is it a good idea to let major GCs take place more often? And how?
Result: the minor GC takes 0.01~0.02 sec, the major GC takes 1~3 sec, and the minor GC happens constantly.
Unless you are reporting pauses, I would say that the CMS collector is doing what you have asked it to do. By definition, CMS will use a larger percentage of the CPU than the Serial and Parallel collectors. This is the penalty you pay for low pause times.
If you are seeing 1 to 3 second pause times, I'd say that you need to do some tuning. I'm no expert, but it looks like you should start by reducing the value of CMSInitiatingOccupancyFraction from the default value of 92.
Increasing the heap size will improve the "throughput" of the GC. But if your problem is long pauses, increasing the heap size is likely to make the problem worse.
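If you go the CMSInitiatingOccupancyFraction route, the relevant flags look like this (70 is a commonly tried starting point, not a value measured for this app):
-XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly
The second flag tells CMS to always honor the configured fraction rather than its own adaptive heuristic.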
Careful... GC can be a hairy subject if you are not cautious. Within any runtime (the JVM for Java, the CLR for .NET), several processes take place. Generally there are two stages of memory cleanup: young generational garbage collection (young gen GC) and old generational garbage collection (old gen GC). Young gen GC happens on a regular basis and accounts for your smaller pauses/hiccups. Old gen GC is normally what is going on when you see the long "stop the world" pauses.
Why, you might ask? The reason you get pauses with your runtime/JVM is that when the runtime cleans up the heap, it has to go through what is called a phase change. It stops the threads running your application in order to mark and swap pointers to optimize the available memory. Young gen is faster, as it is mainly releasing objects that are only temporary. Old gen, however, evaluates all the objects on the heap, and when you run out of memory it will kick in to free up much-needed memory.
Why the caution? Old gen GC pause times get exponentially worse the more heap you use. At 2-4 GB of total heap size you should be fine on modern runtimes like Java 6 (JDK 1.6+). Once you go beyond that threshold, you will see exponential increases in pause times. I have run into clients that have to restart servers; in some rare cases where a heap is large, GC pause times can take longer than a full restart.
There are some new tools out there that are pretty cool and can give you a leading edge in evaluating whether GC is your pain point. JHiccup is one, and it is free from the Azul Systems website. At this time I think it is only for Linux, though. Azul also has a JVM with a rebuilt GC algorithm that runs pauseless... but if you are on a single-server deployment with a non-critical app, it may not be cost effective (that one is not free).
To sum up: if your runtime/JVM/CLR heap is less than 2 GB, adding more memory will help. Be sure to give yourself some headroom; you never want to hit 100% of your heap size if possible, as that is when the long pauses are longest. Give yourself an extra 20%+ memory over what you think you will need, so the GC algorithms have room to move objects around for optimization. If you ever plan to go large, there is one tool that fixes the circa-1990 JVM technology (the Azul Systems Zing JVM), but it is not free. They do offer an open-source tool to diagnose GC issues, and the JVM (as I have tried it) also has a really cool thread-level visibility tool that lets you report on any leaks, bugs, or locks in production without overhead (some trick with offloading data the JVM already deals with, plus time stamping). That has saved tons of dev test time... but again, it is not for small apps.
Stay below 4 GB. Give extra headroom. And if you want, you can turn on these flags to monitor GC for Java/the JVM:
java -verbose:gc myProgram
java -Xloggc:D:/log/myLogFile.log -XX:+PrintGCDetails myProgram
You may try some of the other collectors HotSpot offers; there is more than one.
If you are on Linux, go ahead and try the JHiccup tool as well. It is free.
You may be interested in trying the low-pause Garbage-First collector instead of concurrent mark-sweep (although it's not necessarily more performant for all collections, it's supposed to have a better worst case). It's enabled by -XX:+UseG1GC and is supposed to be really awesomesauce, but you may want to give it a thorough evaluation before using it in production. It has probably improved since, but it seemed to be a bit buggy a year ago, as seen in Experience with JDK 1.6.x G1 ("Garbage First").
It is perfectly fine for the garbage collector to run in parallel with your program if you have plenty of CPU, which you do.
What you want is to make absolutely certain that you will not run into a scenario where garbage collection PAUSES your main program.
Have you tried simply not setting any flags except the one saying you want the server VM (for the Sun JVM), and then putting your server under heavy load to see how it behaves? Only then can you see whether you get any improvements from tinkering with options.
This actually sounds like a throughput app, so it should probably use the throughput collector. I would balance the size of the new gen, making it big enough not to GC too often and small enough to prevent long pauses. 20 ms sounds like a long minor GC to me. I also suspect your survivor space is set too large and is just being wasted. If you don't have much surviving to the old gen, you shouldn't have that much surviving your minor GCs.
In the end, you should use jvmstat and VisualGC to really get a feel for how your app is using memory.
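(jvmstat's command-line front end is jstat; a typical invocation for watching generation usage is
jstat -gcutil <pid> 1000
which prints the percentage utilization of each generation once per second.)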
For a highly responsive server application, I think you want to see the major GC happen less frequently. Here is a list of parameters that would help:
-XX:+CMSParallelRemarkEnabled
-XX:+CMSScavengeBeforeRemark
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=50
-XX:CMSWaitDuration=300000
-XX:GCTimeRatio=40
A larger heap size may not help with low pause times, as long as your app doesn't run out of memory.
Larger NewSize and MaxNewSize values would help throughput, but may not help with low pause times. If you choose to take this approach, you may consider giving the GC threads more execution time by setting -XX:GCTimeRatio lower. The point is to remember to take a holistic view when tuning the JVM.
I think previous posters have missed something very obvious: the perm generation size is too low. If the system uses 200 to 400 MB as permanent generation, then it may be best to set the max perm gen to 400 MB. The PermGen size should also be set to the same value. You will then never run out of permanent generation space.
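In flag form, that suggestion would be (taking the 400 MB figure measured in the question):
-XX:PermSize=400m -XX:MaxPermSize=400m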
Currently, it looks like the JVM has to spend a lot of time moving objects into and out of the permanent generation, and this can take time. The JVM tries to allocate contiguous memory areas to Java objects, which speeds up memory access thanks to hardware-level features. To do that, it is very helpful to have plenty of buffer in memory. If the permanent generation is almost full, newly discovered permanent objects must be split or existing objects must be shuffled. This is what triggers a full GC, and what causes long full GC pauses.
The question states that the permanent generation size has already been measured; if it has not, it should be measured using a tool. Such tools process the logs generated by the JVM with the verbose GC option turned on.
All the mark-and-sweep options listed above may not be needed after this basic improvement.
People throw out GC options as solutions without evaluating how mature they have proven to be in actual use.