Explain observed JVM Garbage Collection on a JBoss Server

With VisualVM I am observing the following heap usage on a JBoss server:
The server is started with the following (relevant) JVM options:
-Xrs -Xms3072m -Xmx3072m -XX:MaxPermSize=512m -XX:+UseParallelOldGC -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000
And we currently also have enabled GC logging:
-XX:+PrintGC -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:log\gc.log
Basically I am happy with the observed pattern, since it looks like we don't have any memory leaks (the pattern repeats itself over days).
However I am wondering if there is room for optimization?
First of all, I don't understand why garbage collection already kicks in when the heap usage reaches about 2GB. It looks to me like it could kick in later, since the heap has 3GB available.
Furthermore, I would be interested in tips regarding the observed heap usage pattern and the JVM options used:
Does the observed pattern allow me to draw conclusions about the GC strategy in use (UseParallelOldGC)? Is this strategy the right one, or should I try another one given the observed heap usage?
Can I optimize the GC process, so that the full heap size (3GB) is used?
Right now it looks like the full 3GB are never used; should I reduce Xms/Xmx to 2.5GB?
Are there any obvious GC optimizations that I am missing? Like tuning -XX:NewSize or -XX:NewRatio?
Any other tips that come to mind?
Thanks!

I'd say the GC behaviour in your screenshot looks 'normal'.
You'd usually want major collections to trigger before the heap space gets too full, or it would be very easy to run into OutOfMemoryErrors in a number of scenarios.
Also, are you aware that Java's heap space is divided into distinct areas for new (eden), current (survivor) and old (tenured) objects?
This answer provides some excellent information on the subject, so I won't repeat it here:
How is the java memory pool divided?
Very basically, each area of the heap triggers its own collections. The eden space is normally collected often and 'quickly'; the survivor and tenured spaces are usually larger and take longer to collect.
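If you want to see those areas (and which collector manages each) on your own JVM, below is a minimal sketch using the standard java.lang.management API; the pool names it prints will vary with the collector in use:
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.Arrays;

public class HeapAreas {
    public static void main(String[] args) {
        // Each heap area (eden, survivor, old/tenured, perm) is exposed as a memory pool
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.printf("%-25s used=%,d max=%,d%n",
                    pool.getName(), pool.getUsage().getUsed(), pool.getUsage().getMax());
        }
        // Each collector reports the pools it manages and how often / how long it has run
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d time=%dms pools=%s%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime(),
                    Arrays.toString(gc.getMemoryPoolNames()));
        }
    }
}
With -XX:+UseParallelOldGC you should see pools named along the lines of 'PS Eden Space', 'PS Survivor Space' and 'PS Old Gen'.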
Could you reduce your heap size based on the above graph?
Yes. However, your current configuration allows your application some breathing room, if it's ever likely to encounter busier periods or spikes in load.
Can you optimize GC?
Yes, but there are no magic settings. The first question is do you really need to? If your application is just a non-interactive 'processor', I really wouldn't bother. If you have a genuine need for a low pause application, then there are some tweaks available. The trade off is generally that you'll need more resources to achieve the same result.
My experience is that low-pause JVM configurations have a very noticeable fall-off point when load increases. If your application is usually fairly idle, but you expect a 'quick' response when it is called, low pause may be appropriate. On a busier system, with peaks in traffic / load, you may prefer a more traditional approach.
Summary
In any case, don't be tempted to make arbitrary changes to 'improve' your configuration. Be scientific and professional about your approach.
If you don't have production metrics available, consider using tools like Apache JMeter to build load test scenarios that simulate the typical live load on your application, increased load (by say, 10%, 20% or 50%) and intermittent peak load.
Use metrics for both the GC and the application, measuring at least:
Average throughput.
Peak throughput.
Average load (CPU and memory).
Peak load.
Application pause times (total and individual pauses).
Time spent performing collections.
Reliability (OOMEs, etc.).
Once you're happy that you've recorded an accurate benchmark of your application's performance with its current configuration, only then should you start making any changes.
Obviously, record your configuration and its metrics. Document any changes and then perform the same benchmark tests. Then you'll be able to see any performance gain (or loss) and any trade-off that may apply.
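For 'time spent performing collections' in particular, here is a minimal sketch that samples the standard collector MXBeans and reports the percentage of wall-clock time spent in GC between samples (the one-minute interval is arbitrary; in practice you would feed the numbers into your monitoring tool rather than print them):
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverheadSampler {
    public static void main(String[] args) throws InterruptedException {
        long lastGcTime = totalGcTimeMs();
        long lastWallTime = System.currentTimeMillis();
        while (true) {
            Thread.sleep(60000);                         // sample once a minute
            long gcTime = totalGcTimeMs();
            long wallTime = System.currentTimeMillis();
            double overhead = 100.0 * (gcTime - lastGcTime) / (wallTime - lastWallTime);
            System.out.printf("GC time over the last interval: %.2f%%%n", overhead);
            lastGcTime = gcTime;
            lastWallTime = wallTime;
        }
    }

    private static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += gc.getCollectionTime();             // cumulative ms this collector has run
        }
        return total;
    }
}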
Here's some further reading from Oracle on the subject to get you started:
Java SE 6 Virtual Machine Garbage Collection Tuning

Related

Make ZGC run often

ZGC runs not often enough. GC logs show that it runs once every 2-3 minutes for my application and because of this, my memory usage goes high between GC cycles (as high as 90%). After GC, it drops to as low as 20%.
How can I increase the GC frequency so that it runs more often?
-XX:ZCollectionInterval=N - set maximum gap between collections to N seconds.
-XX:ZUncommitDelay=M - set the delay until unused memory is returned to the OS to M seconds.
Before tuning the GC, I would recommend investigating why this is happening. You might have an issue/bug in your application.
[Some notes about GC]
-XX:ZUncommitDelay=M (check whether it is supported by your Linux kernel)
-XX:+ZProactive: Enables proactive GC cycles when using ZGC. By default, this option is enabled. ZGC will start a proactive GC cycle if doing so is expected to have minimal impact on the running application. This is useful if the application is mostly idle or allocates very few objects, but you still want to keep the heap size down and allow reference processing to happen even when there is a lot of free space on the heap.
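For example, a command line combining these options might look like the following (the values, heap size and app.jar are purely illustrative; on JDK releases where ZGC was still experimental you would also need -XX:+UnlockExperimentalVMOptions):
java -XX:+UseZGC -XX:ZCollectionInterval=30 -XX:ZUncommitDelay=300 -Xmx8g -jar app.jar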
More details about ZGC config. options can be found:
ZGC Home Page.
Oracle Documentation
Presently (as of JDK 17), ZGC's primary strategy is to wait until the last possible moment before the heap fills up and then do a collection. Its goals are:
Avoid unnecessary CPU load by running GC only when it's necessary.
Start the GC early enough so that it will finish before the heap actually fills up (since the heap filling up would be bad, leading to a temporary application stall).
It does this by measuring how fast your app is allocating memory, how long the GC takes to run, and predicting at what point it should start the GC. You can find the exact algorithm in the source code.
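As a very rough, simplified illustration of that idea (this is not ZGC's actual heuristic, and all numbers are hypothetical):
public class GcStartEstimate {
    public static void main(String[] args) {
        double freeHeapMb = 2048;          // free heap right now (hypothetical)
        double allocRateMbPerSec = 300;    // measured allocation rate (hypothetical)
        double gcDurationSec = 4;          // typical length of one GC cycle (hypothetical)
        // While the cycle runs, the application keeps allocating this much:
        double allocatedDuringGcMb = allocRateMbPerSec * gcDurationSec;
        // So the cycle has to start while at least that much heap is still free:
        double secondsUntilStart = (freeHeapMb - allocatedDuringGcMb) / allocRateMbPerSec;
        System.out.printf("Latest safe start of the next cycle: in ~%.1f seconds%n", secondsUntilStart);
    }
}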
ZGC also exposes some knobs for running GC more often (i.e., proactively), but honestly I don't find them terribly effective. You can find more info in my other answer. G1 does a better job of being proactive, but whether that's good or not depends on your use case. (It sounds like you care more about throughput than memory usage, so I think you should prefer ZGC's behavior.)
However, if you find that ZGC is making mistakes in predicting when the heap will fill up and that your application really is hitting stalls, please share that info here or on the ZGC mailing list.

Java: how to trace/monitor GC times for the CMS garbage collector?

I'm having trouble figuring out a way to monitor the JVM GC for memory exhaustion issues.
With the serial GC, we could just look at the full GC pause times and have a pretty good notion if the JVM was in trouble (if it took more than a few seconds, for example).
CMS seems to behave differently.
When querying lastGcInfo from the java.lang:type=GarbageCollector,name=ConcurrentMarkSweep MXBean (via JMX), the reported duration is the sum of all GC steps and is usually several seconds long. This does not indicate an issue with GC; on the contrary, I've found that too-short GC times are usually more of an indicator of trouble (which happens, for example, if the JVM goes into a CMS-concurrent-mark-start -> concurrent mode failure loop).
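For reference, a minimal in-process sketch of reading that attribute (the com.sun.management classes are HotSpot-specific, so this is not portable):
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import com.sun.management.GcInfo;

public class LastGcDuration {
    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // On HotSpot the platform beans also implement the com.sun.management extension
            com.sun.management.GarbageCollectorMXBean gc =
                    (com.sun.management.GarbageCollectorMXBean) bean;
            GcInfo info = gc.getLastGcInfo();   // null until the collector has run at least once
            if (info != null) {
                // For CMS this duration spans the whole (mostly concurrent) cycle,
                // which is why the value is typically several seconds
                System.out.printf("%s: last GC took %d ms%n", bean.getName(), info.getDuration());
            }
        }
    }
}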
I've tried jstat as well, which gives the cumulative time spent garbage collecting (unsure if it's for old-gen or new-gen GC). This can be graphed, but it's not trivial to use for monitoring purposes. For example, I could parse jstat -gccause output and calculate differences over time, and trace+monitor that (e.g. the amount of time spent GC'ing over the last X minutes).
I'm using the following JVM arguments for GC logging:
-Xloggc:/xxx/gc.log
-XX:+PrintGCDetails
-verbose:gc
-XX:+PrintGCDateStamps
-XX:+PrintReferenceGC
-XX:+PrintPromotionFailure
Parsing gc.log is also an option if nothing else is available, but the optimal solution would be to have a java-native way to get at the relevant information.
The information must be machine-readable (to send to monitoring platforms) so visual tools are not an option. I'm running a production environment with a mix of JDK 6/7/8 instances, so version-agnostic solutions are better.
Is there a simple(r) way to monitor CMS garbage collection? What indicators should I be looking at?
Fundamentally one wants two things from the CMS concurrent collector
the throughput of the concurrent cycle to keep up with the promotion rate, i.e. the objects surviving into the old gen per unit of time
enough room in the old generation for objects promoted during a concurrent cycle
So let's say the IHOP is fixed at 70%; then you are probably approaching a problem when occupancy reaches >90% at some point. Maybe even earlier if you do some large allocations that don't fit into the young generation or outlive it (that's entirely application-specific).
Additionally, you usually want it to spend more time outside the concurrent cycle than in it, although that depends on how tightly you tune the collector. In principle you could have the concurrent cycle running almost all the time, but then you have very little throughput margin and burn a lot of CPU time on concurrent collections.
If you really, really want to avoid even the occasional full GC then you'll need even more safety margin due to fragmentation (CMS is non-compacting). I think this can't be monitored via MX beans; you'll have to enable some CMS-specific GC logging to get fragmentation info.
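A minimal sketch of watching old-gen occupancy in-process via the memory pool MXBeans (the pool-name matching and the 90% alert threshold are just examples; in practice you would export the value to your monitoring platform instead of printing it):
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class OldGenWatcher {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // CMS reports the old generation as "CMS Old Gen"; other collectors use other names
            if (pool.getName().contains("Old Gen") || pool.getName().contains("Tenured")) {
                MemoryUsage usage = pool.getUsage();
                double occupancy = 100.0 * usage.getUsed() / usage.getMax();
                System.out.printf("%s occupancy: %.1f%%%n", pool.getName(), occupancy);
                if (occupancy > 90.0) {
                    // With an IHOP of 70%, sitting well above it suggests CMS is not keeping up
                    System.out.println("WARNING: old gen occupancy above alert threshold");
                }
            }
        }
    }
}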
For viewing GC logs:
If you have already enabled GC logging, I suggest GCViewer - this is an open source tool that can be used to view GC logs and look at parameters like throughput, pause times etc.
For profiling:
I don't see a JDK version mentioned in the question. For JDK 6, I would recommend VisualVM to profile the application. For JDK 7/8 I would suggest Mission Control. You can find these in the JDK's bin folder. These tools can be used to see how the application performs over a period of time and during GC (you can trigger a GC via the VisualVM UI).

Garbage collection tuning a production application

I've been tasked with tuning a production application that consists of a Spring MVC REST interface serving large (~0mb - 100mb) JSON documents from a GemFire in-memory cache backend. The application runs on a CentOS server inside Tomcat 7 on JDK 1.6. We realized that the application needed to be tuned because we were seeing frequent stop-the-world old-generation garbage collections, which would eventually lead to java.lang.OutOfMemoryError: GC overhead limit exceeded errors if left unattended.
Through some trial and error and monitoring I've managed to tune the application with these parameters:
-Xms20g
-Xmx20g
-XX:PermSize=256m
-XX:MaxPermSize=256m
-XX:NewSize=8g
-XX:MaxNewSize=8g
-XX:SurvivorRatio=8
-XX:+DisableExplicitGC
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=70
The garbage collection behavior that I'm seeing now (48 hours under heavy test load) is that eden space collection is happening about once every 10 seconds and lasting about 0.04 seconds. The old generation is not growing at all after 48 hours and there have been 0 collections in that space.
My question is should I be concerned about not having the old generation garbage collected? Overall does this look like a healthy tuning?
Edit:
For anyone who cares my GC log is available here http://filebin.ca/2U8awo1KTS1D/udf-gc.log.0
My question is should I be concerned about not having the old generation garbage collected?
The logs look fine. Given the trend, the old-gen occupancy grows very slowly. So it will take several days until it becomes full enough for a concurrent marking cycle to be initiated.
Overall does this look like a healthy tuning?
It seems like you're giving it much more memory than it needs.
Old-gen occupancy is around 2G / 12G. This means you could probably shrink it to 4G and still take many hours before a concurrent cycle gets started.
Most young objects only live to age 1 (out of 15) in the young generation. This means the young generation could be shrunk too, without increasing object promotion too much.
-XX:CMSInitiatingOccupancyFraction=70
That should be combined with -XX:+UseCMSInitiatingOccupancyOnly.
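In other words, something like the following; as far as I know, without the second flag HotSpot only honours the fraction for the first cycle and then falls back to its own heuristics:
-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly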
Tuning garbage collection is no different from generic performance tuning in the sense that, without requirements, you can (for non-trivial applications at least) effectively keep improving forever. At some point the improvements no longer matter for the practical use case. That is why you should have goals in place.
The goals regarding GC should be derived from the generic performance requirements. These in turn usually describe three dimensions:
Latency. Or more precisely, the acceptable latency distribution per service published by the application. For example: 99% of login() operations must complete under 500ms and the worst case cannot exceed 2,500ms.
Throughput. How many operations per time unit must be completed. Tougher to measure for large monoliths, but if running microservices you can express this as "1,000 login operations processed per second".
Capacity. Adding more resources & scaling out will improve the situation, but for practical matters, things such as the monthly AWS bill will set limits in this regard.
Having these requirements in place, you can start building/deriving from them and, if necessary, optimizing further. The company I am affiliated with recently published a rather thorough handbook about GC tuning, so you can read more in the GC tuning sections of the handbook.

JVM consumes 100% CPU with a lot of GC

After running a few days the CPU load of my JVM is about 100% with about 10% of GC (screenshot).
The memory consumption is near the maximum (about 6 GB).
Tomcat is extremely slow in that state.
Since it's too much for a comment, I'll write it up as an answer:
Looking at your charts, it seems to be using CPU for non-GC tasks; peak "GC activity" seems to stay within 10%.
So on first impression it would seem that your task is simply CPU-bound, so if that's unexpected maybe you should do some CPU-profiling on your java application to see if something pops out.
Apart from that, based on comments I suspect that physical memory filling up might evict file caches and memory-mapped things, leading to increased page faults which forces the CPU to wait for IO.
Freeing up 500MB on a manual GC out of a 4GB heap does not seem all that much. Most GCs try to keep pause times low as their primary goal and the total time spent in GC within some bound as a secondary goal; only when those goals are met do they try to reduce memory footprint as a tertiary goal.
Before recommending further steps you should gather more statistics/provide more information since it's hard to even discern what your actual problem is from your description.
monitor page faults
figure out which GC algorithm is used in your setup and how it is tuned (-XX:+PrintFlagsFinal); see the example after this list
log GC activity - I suspect it's pretty busy with minor GCs and thus eating up its pause time or CPU load goals
perform allocation profiling of your application (anything creating excessive garbage?)
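For the second point, one quick way to dump the flags the VM actually ends up with is to run it with your usual options plus the print flag, and then look for the Use...GC and heap-sizing entries in the output, for example:
java -XX:+PrintFlagsFinal -version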
You also have to be careful to distinguish problems caused by the Java heap reaching its sizing limit vs. problems caused by the OS exhausting its physical memory.
TL;DR: Unclear problem, more information required.
Or, if you're lazy or can afford it, just plug in more RAM / remove other services from the machine and see if the problem goes away.
I learned to check the following on GC problems:
Give the JVM enough memory, e.g. -Xmx2G.
If memory is not sufficient and no more RAM is available on the host, analyze the heap dump (e.g. with jvisualvm).
Turn on Concurrent Mark and Sweep:
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
Check the garbage collection log: -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:gc.log
My Solution:
But I finally solved that problem by tuning the cache sizes.
The cache sizes were too big, so memory got scarce.
If you want to keep the memory of your server free, you can simply try the VM parameter
-Xmx2G //or any different value
This ensures your program never takes more than 2 gigabytes of RAM. But be aware that in case of high workload the server may get an OutOfMemoryError.
Since an old-generation (full) GC may block your whole server from working for some seconds, Java will try to avoid a full garbage collection.
The RAM limitation may trigger a full GC more easily (or even allow more objects to be collected by the young-generation GC).
In my opinion (more guessing than actually knowing): I don't think another algorithm can help much here.

Change in GC behaviour after move from Java5 to 6

We've recently migrated our systems from Sun Java 5 to the Java 6 server VM (specifically, 1.6.0_16 on 32-bit Linux). We've noticed that the garbage collection behaviour has changed in such a way as to trigger our heap-warning monitoring system.
The heap usage graphs indicate a much "spikier" memory usage profile than we saw with Java5, with the VM letting heap usage get very high before running a big GC. It doesn't appear to be a problem with the application system itself (it never actually runs out of memory), but it's giving the monitoring system the occasional spurious "hair on fire" signals whenever the usage spike approaches the threshold.
We could increase the heap max and hope the spike doesn't simply get bigger, but I'd much rather find out if there's a way we can tune the JVM parameters so that we get a smoother profile, even if we lose a bit of performance.
I'm guessing there might be some -XX option we can set to achieve this, but I can't see any such thing in the docs. Does anyone know of such an option?
It sounds like you would really like to have something more like a concurrent collection (as opposed to standard big-bang collections):
The concurrent collector is designed for applications that prefer shorter garbage collection pauses and that can afford to share processor resources with the garbage collector while the application is running.
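On the Java 6 VM that would mean switching collectors with options along these lines (ParNew is, as far as I recall, the default young-gen pairing for CMS anyway):
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC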
Perhaps even more important, you should ensure that you're using the correct VM with the right options, over and above the specific garbage collection options. For example, I've tripped over the client vs. server VM issue multiple times in my own life.
Have fun reading and playing (Java 6 GC tuning :-)
Can you confirm the same GC scheme/mechanism is employed? Do you calculate higher GC overhead in 1.6 or are pause times greater over any given duration?
Max and min heap free directives may help with some of your heap ergonomics too.
-XX:MinHeapFreeRatio and -XX:MaxHeapFreeRatio
http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#generation_sizing.total_heap
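For example (illustrative values only; the long-standing defaults are 40 and 70, and tightening the range keeps the committed heap closer to actual usage):
-XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=40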
