Why is CMS full GC single-threaded?

Why is CMS full GC single-threaded? - java

Whenever there's concurrent mode failure or promotion failure using CMS it does full GC using single thread. Why it couldn't do full GC using parallel collector to decrease the full GC penalty?

There is no particular reason other than that it hasn't been implemented that way and engineering effort is focused on G1. Most users of CMS just try to tune it in a way that it never happens, "never" meaning at an interval greater than whatever requires JVM restarts anyway. The parallel old collector can't be reused by simply calling its code since internal data structurs between the collectors differ, so it would involve non-trivial implementation effort.
Google devs have proposed a patch to contribute parallel full GC to CMS, but i wouldn't count on it becoming available in any openjdk builds anytime soon.

Related

In Java 1.8, how do I choose a garbage collector？

Suppose my environment is Java 1.8, my application is a batch application and there is no requirement for latency, I don't know whether I should choose Parallel GC or G1 GC?
I understand that Parallel GC is optimized for throughput and is more suitable for batch applications like mine, but I find that all Java applications around me are using G1 garbage collector, so I am not sure if I don't need Parallel GC if I have G1, or if I am looking for throughput, Parallel GC is the best choice. better choice?

I went through the first chapter of the Java® Performance Companion book and there is a passage in it that describes this:
As of this writing, G1 primarily targets the use case of large Java heaps with reasonably low pauses, and also those applications that are using CMS GC. There are plans to use G1 to also target the throughput use case, but for applications looking for high throughput that can tolerate longer GC pauses, Parallel GC is currently the better choice.
This exactly answers my question, if 1 project is purely for completing timed tasks, or consuming MQ, it usually has no requirement for pause time, which would be more appropriate with Parallel GC.

Controlling Java garbage collection in real-time system

We are running a RT system in Java. It often uses relatively large heaps (100+GB) and serves requests coming from message queue. Each request must be handled fast (<100ms) to meet the SLAs.
We are experiencing serious GC-related problems, because it often happens that GC causes stop-the-world collection during a request (200+ms), resulting in failure.
One of our developers with reasonable knowledge of GCs spent quite some time with tuning GC parameters and trying different GCs. After several days, he came up with some parametrization that we jokingly call "evolved by genetic algorithm". It lowers the GC pauses, but is still far from meeting the SLA requirements.
The solution I am looking for is to protect some critical parts of code from GC, and after a request is finished, let the GC do as much work as it needs, before taking next request. Occasional pauses outside the requests would be OK, because we have several workers and garbage-collecting workers would just not ask for requests for a while.
I have some ideas which are silly, ugly, and most probably not working, but hopefully they illustrate the problem:
Occasionally call Thread.sleep() in the receiving thread, praying for the GC to do some work in the meantime,
Invoke System.gc() or Runtime.gc() between requests, again hopelessly praying for it to help,
Mess the code entirely with hacky patterns like https://stackoverflow.com/a/6915221/1137187.
The last important note is that we are a low-budget startup and commercial solutions such as Zing® are not an option for us, we are looking for a non-commercial solution.
Any ideas? We would rewrite our code entirely to C++ (we didn't know that GC might be a problem rather than solution at the beginning), but the code-base is too large already to do that.

Any ideas?
Use a different JVM? Azul claims to be able to handle such cases. Redhat and Oracle are contributing shenandoah and zgc to openjdk, respectively, with similar goals in mind, so maybe you could try experimental builds if you don't want a commercial solution.
There also are other JVMs focused on realtime applications, but as I understand it they focus on harder realtime requirements on smaller systems, yours sounds more like soft realtime requirements.
Another thing you can attempt is significantly reducing object allocations (profile your application!) by using pre-allocated objects or more compact data representations where applicable. Reducing allocation pressure while keeping the new gen size the same means increased mortality rate per collection, which should speed up young collections.
Choosing hardware to maximize memory bandwidth might help too.
Invoke System.gc() or Runtime.gc() between requests, again hopelessly praying for it to help,
This might work when combined with -XX:+ExplicitGCInvokesConcurrent, otherwise it would trigger a single-threaded STW collection with CMS or G1 (I'm assuming you're using one of those). But that approach is seems brittle and requires lots of tuning and monitoring.

How to monitor memory after major garbage collection via JMX or code

Many monitoring tools, like the otherwise phantastic JavaMelody, just monitor the current memory usage. If you want to check for memory leaks or impending out of memory situations, this is not particularily helpful, if you have an application that generates loads of garbage which gets collected immediately. Not perfect, but IMHO much more interesting, would it be to monitor the memory usage immediately after a major garbage collection. If that's high, a crash is looming over you.
So: can you find out the memory usage immediately after the last major garbage collection - either from Java code or via JMX? I know there are some tools like VisualVM which do this (which is no option for production use), and it can be written in the garbage collection log, but I'm looking for a more straightforward solution than parsing the garbage collection logfile. :-) To be clear: I'm looking for something that can easily be used in any application in production, not any expensive tool for debugging.
In case that matters: JDK 7 with -XX:+UseConcMarkSweepGC , but I am interested in general answers, too.

Information about memory available right after gc (youg or old) is available via JMX.
Garbage collector MBean has attribute LastGcInfo which is composite data object including information about memory pool sizes before and after GC.
In addition, starting with Java 7 JMX notification subscription could be used to receive GC events without polling.
You can find example of code working with GC MBean here.

Probably 'Dynatrace' is an option... it's a very powerful monitoring tool (not only for memory).
http://www.dynatrace.com/en/index.html

A very crude way would be to monitor the minima of Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory() for some time. At least that would not require you to know intimate details about memory pools, as monitoring LastGcInfo in Alexey Ragozin's answer does. This might require you to get notifications about garbage collections.

Is garbage collection detrimental to the performance of this type of program

I'm building a program that will live on an AWS EC2 instance (probably) be invoked periodically via a cron job. The program will 'crawl'/'poll' specific websites that we've partnered with and index/aggregate their content and update our database. I'm thinking java is a perfect fit for a language to program this application in. Some members of our engineering team are concerned about the performance detriment of java's garbage collection feature, and are suggesting using C++.
Are these valid concerns? This is an application that will be invoked possible once every 30 minutes via cron job, and as long as it finishes its task within that time frame the performance is acceptable I would assume. I'm not sure if garbage collection would be a performance issue, since I would assume the server will have plenty of memory and the actual act of tracking how many objects point to an area of memory and then declaring that memory free when it reaches 0 doesn't seem too detrimental to me.

No, your concerns are most likely unfounded.
GC can be a concern, when dealing with large heaps & fractured memory (requires a stop the world collection) or medium lived objects that are promoted to old generation but then quickly de-referenced (requires excessive GC, but can be fixed by resizing ratio of new:old space).
A web crawler is very unlikely to fit either of the above two profiles - you probably don't need a massive old generation and should have relatively short lived objects (page representation in memory while you parse out data) and this will be efficiently dealt with in the young generation collector.
We have an in-house crawler (Java) that can happily handle 2 million pages per day, including some additional post-processing per page, on commodity hardware (2G RAM), the main constraint is bandwidth. GC is a non-issue.
As others have mentioned, GC is rarely an issue for throughput sensitive applications (such as a crawler) but it can (if one is not careful) be an issue for latency sensitive apps (such as a trading platform).

The typical concern C++ programmers have for GC is one of latency. That is, as you run a program, periodic GCs interrupt the mutator and cause spikes in latency. Back when I used to run Java web applications for a living, I had a couple customers who would see latency spikes in the logs and complain about it — and my job was to tune the GC to minimize the impact of those spikes. There are some relatively complicated advances in GC over the years to make monstrous Java applications run with consistently low latency, and I'm impressed with the work of the engineers at Sun (now Oracle) who made that possible.
However, GC has always been very good at handling tasks with high throughput, where latency is not a concern. This includes cron jobs. Your engineers have unfounded concerns.
Note: A simple experimental GC reduced the cost of memory allocation / freeing to less than two instructions on average, which improved throughput, but this design is fairly esoteric and requires a lot of memory, which you don't have on EC2.
The simplest GCs around offer a tradeoff between large heap (high latency, high throughput) and small heap (lower latency, lower throughput). It takes some profiling to get it right for a particular application and workload, but these simple GCs are very forgiving in a large heap / high throughput / high latency configuration.

Fetching and parsing websites will take way more time than the garbage collector, its impact will be probably neliglible. Moreover, the automatic memory management is often more efficient when dealing with a lot of small objects (such as strings) than a manual memory management via new/delete. Not talking about the fact that the garbage collected memory is easier to use.

I don't have any hard numbers to back this up, but code that does a lot of small string manipulations (lots of small allocations and deallocations in a short period of time) should be much faster in a garbage-collected environment.
The reason is that modern GC's "re-pack" the heap on a regular basis, by moving objects from an "eden" space to survivor spaces and then to a tenured object heap, and modern GC's are heavily optimized for the case where many small objects are allocated and then deallocated quickly.
For example, constructing a new string in Java (on any modern JVM) is as fast as a stack allocation in C++. By contrast, unless you're doing fancy string-pooling stuff in C++, you'll be really taxing your allocator with lots of small and quick allocations.
Plus, there are several other good reasons to consider Java for this sort of app: it has better out-of-the-box support for network protocols, which you'll need for fetching website data, and it is much more robust against the possibility of buffer overflows in the face of malicious content.

Garbage collection (GC) is fundamentally a space-time tradeoff. The more memory you have, the less time your program will need to spend performing garbage collection. As long as you have a lot of memory available relative to the maximum live size (total memory in use), the main performance hit of GC -- whole-heap collections -- should be a rare event. Java's other advantages (notably robustness, security, portability, and an excellent networking library) make this a no-brainer.
For some hard data to share with your colleagues showing that GC performs as well as malloc/free with plenty of available RAM, see:
"Quantifying the Performance of Garbage Collection vs. Explicit Memory Management", Matthew Hertz and Emery D. Berger, OOPSLA 2005.
This paper provides empirical answers to an age-old question: is
garbage collection faster/slower/the same speed as malloc/free? We
introduce oracular memory management, an approach that lets us measure
unaltered Java programs as if they used malloc and free. The result: a
good GC can match the performance of a good allocator, but it takes 5X
more space. If physical memory is tight, however, conventional garbage
collectors suffer an order-of-magnitude performance penalty.
Paper: PDF
Presentation slides: PPT, PDF

Java without gc - io

I would like to run a Java program with garbage collection switched off. Managing memory in my own code is not so difficult.
However the program needs quite a lot of I/O.
Is there any way (short of using JNI for all I/O operations) that I could achieve this using pure Java?
Thanks
Daniel

What you are trying to achieve is frequently done in investment banking to develop low-latency real-time systems.
To avoid GC you simply need to make sure not to allocate memory after the startup and warm-up phase of your application.
As you seem to have noticed Java NIO internally does unwanted memory allocation.
Unfortunately, you have no choice but write JNI replacements for the problematic calls.
You need at least to write a replacement for the NIO Selector.
You will have to avoid using most of the Java libraries due to similar unwanted memory allocations.
For example you will have to avoid using immutable object like String, avoid Boxing, re-implement Collections that preallocate enough entries for the whole lifetime of your program.
Writing Java code this way is not easy, but certainly possible.
I am developing a platform to do just so.

Managing memory in my own code is not
so difficult.
It's not difficult - It's impossible. For example:
public void foo() {
Object o = new Object();
// free(o); // Doh! No "free" keyword in Java.
}
Without the aid of the garbage collector how can the memory consumed by o be reclaimed?
I'm assuming from your question that you might want to avoid the sporadic pauses caused by garbage collection due to the high level of I/O being performed by your app. If this is the case there are techniques for minimising the number of objects created (e.g. re-using objects from a pool). You could also consider enabling the Concurrent Mark Sweep Collector.
The concurrent mark sweep collector,
also known as the concurrent collector
or CMS, is targeted at applications
that are sensitive to garbage
collection pauses.

It's very hard (but not impossible) to disable GC in a JVM.
Look at the JNI "critical" functions for hints.
You can also essentially ensure you don't GC by not allocating any more objects (write a JVMTI agent that slaps you if you do, and instrument your code).
Finally, you can force a fatal OutOfMemoryError by ensuring that every object you allocate is never freed, thus when you hit -Xmx memory used, you'll fall over as GC won't be able to reclaim anything (mind you, you'll GC one or more times at this point before you fall over in a heap).
The real question is why you'd want to? What upside do you see in doing it? Is it for realtime? If so, I'd consider looking at one of the several realtime JVMs available on the market (Oracle, IBM, & others all sell them). I can't honestly think of another reason to do this while still using Java.

The only way you are going to be able to turn off garbage collection is to modify the JVM. This is should be feasible with OpenJDK 6 codebase.
However, the what you will get at the end is a JVM that leaks memory like crazy, with no reasonable hope of fixing the leaks. The Java class library APIs are designed and implemented on the assumption that there is a GC taking care of memory management. This is so fundamental that any serious attempt to "fix" it would lead to a language / library that is not recognizable as Java.
If you want a non-garbage collected language, use C or C++.

Modern JVM's are so good at handling short-lived objects that any scheme you devise on your own will be slower.
This is because the objects you handle yourself will become long-lived and receive extra deluxe treatment from the JVM in terms of being moved around etc. Of course, this is by the garbage collector, which you want to turn off, but you can do very little without any gc.
So, before you start considering what optimization to use, then establish a baseline where you have a large unoptimized, program and profile it. Then do your tweaks, and see if it helps, but you will never know if you do not have a baseline.

As other people have mentioned you can't disable the GC. However, you can choose to use the experimental 'Epsilon' garbage collector which will never actually perform any garbage collections. Warning: it will crash if your JVM runs out of memory (because it's not doing any garbage collections).
There's more info (including the command-line switch to use) at:
http://openjdk.java.net/jeps/318
Good luck!

GarbageCollection is automated memory management in java.So you can not disable GC

Since you say, "its all about predictability not straight line speed," you should look at using a realtime Java system with deterministic garbage collection.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.