Is G1GC still not officially production ready? - java

I wonder what the official status of the "garbage first" (G1) collector in the JDK 7 release is. I would like to use G1 as a low-pause gc alternative to CMS, but only if I can really trust its robustness.
Before JDK 7 was out, G1 was advertised as the shiny new gc that was going to replace the CMS collector and even become the default gc in JDK 7. However, now with Oracle JDK 7u1, G1 is not the default gc on any machine I have tried.
Even though one does not need to specify -XX:+UnlockExperimentalVMOptions anymore when using -XX:+UseG1GC in JDK 7, it's a JVM feature that's officially completely undocumented:
Java 7 (JDK 7) garbage collection and documentation on G1
The only official document I could find that mentions G1 is seriously outdated and was written long before JDK 7 was out:
http://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html
For example, the official "Java HotSpot VM Options" documentation ( http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html ) documents how to enable and tune the other collectors but does not even mention the existence of G1. As if it didn't exist!
This is quite confusing and I wonder what the real status of the G1 and what its future is. Is it really stable yet? Have the remaining issues (like leaks, spurious crashes and missing instrumentation support) been resolved? And if so, why does Oracle treat the G1GC as an undocumented (embarrassing?) secret? Is G1 perhaps a failed project that's now silently discontinued? Or do I need to pay for documentation and support? Or is it just still beta? Can someone enlighten me on what's going on here?

The place to ask this question is on the hotspot-gc-dev mailing list.
If you look through the archives you'll find that there is a lot of work being done. A lot of the mail appears to be commits and review requests/comments so they're busy working on it.
I haven't found any official news announcements, but that is how Oracle works. You might be able to ask on that mailing list how they think they're going, if you're happy with an unofficial and non-binding comment from one of the devs.
EDIT: @scravy sent an email to the mailing list; this is the response he received:
I don't think there is a simple answer to this question, though
probably not. G1's initial focus was to provide reasonable pauses
for extremely large heaps. Which means today it might not be the
best choice for everyone. We think the technology has 'a lot of
legs' though, meaning that with adaptation, it can address many
different kinds of garbage collection demands. So one day, it
might effectively be the default collector, but it is too soon to
know for sure.
Considering that GC behavior changes can be very disruptive to
existing deployments, we are reluctant to make shifts like this
even in major releases without considerable advanced notice. So in current releases, if you
don't specify a collector, we attempt to make some simple
automated choices, but I doubt we'd make radical changes to that
behavior in the near term.
For the bigger question regarding is G1 supported, the current
answer is no. But keep in mind that the support commitment that
Oracle makes to its paid customers for supported products is
fairly significant, and there is much more to it than just meeting
the functionality and reliability requirements.
We continue to encourage everyone to test and evaluate G1, and of
course, deliver feedback to us, as we continue significant
development on G1.
-John
EDIT: According to this link on Oracle's site it looks like G1GC is now fully supported.

We have been using G1GC for almost a year and a half. It is doing great in our mission-critical transaction processing system, and it has proved to be a great help with respect to high throughput, low pauses, concurrency, multi-threading and optimized heavy memory management.
We are using the following JVM settings:
-server -d64 -Xms512m -Xmx3072m -XX:+HeapDumpOnOutOfMemoryError -XX:+UseG1GC
-XX:+UnlockExperimentalVMOptions -XX:+AggressiveOpts -XX:+DoEscapeAnalysis
-XX:MaxGCPauseMillis=400 -XX:GCPauseIntervalMillis=8000
-XX:+UseCompressedOops -XX:NewRatio=50
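For readers unfamiliar with these flags, a rough annotation (based on the general HotSpot documentation; exact behaviour can vary between JDK builds):
-Xms512m / -Xmx3072m : initial and maximum heap size
-XX:+UseG1GC : select the G1 collector
-XX:MaxGCPauseMillis=400 : soft target for the maximum length of a GC pause
-XX:GCPauseIntervalMillis=8000 : soft target for the time slice over which that pause goal applies
-XX:NewRatio=50 : ratio of old-generation to young-generation size
-XX:+UseCompressedOops : use compressed 64-bit object pointers
-XX:+HeapDumpOnOutOfMemoryError : write a heap dump if an OutOfMemoryError occurs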

According to this: http://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html,
G1 development is now focused primarily on resolving any remaining
reliability issues and improving performance
Also,
In terms of GC pause times, G1 is sometimes better and sometimes worse
than CMS. Work is ongoing to make G1 consistently as good as, if not
better than, CMS.
So G1 is supposedly going to replace CMS when the official JDK SE 7 is out.

AFAIK, G1 is not secret -- it has been open for experimental use long enough, at least a year or two. Every JavaOne comes with some talk about how good G1 will be :)
From unofficial sources: making G1 production ready at last is one of the current focuses for the Java engineers. They just were not ready to open it up for JDK 7. Just keep waiting :)

It looks like the page linked in the question has been updated:
The Garbage-First (G1) garbage collector is fully supported in Oracle
JDK 7 update 4 and later releases.
(Note, however, that for embedded platforms like ARM, it is not yet supported at all in 7u4.)

G1 GC has been production ready since the Java 7 update 4 release.
In the Oracle article (under The G1 Garbage Collector), you can find real use cases for G1 GC:
Applications running today with either the CMS or the ParallelOldGC garbage collector would benefit from switching to G1 if the application has one or more of the following traits.
Full GC durations are too long or too frequent.
The rate of object allocation or promotion varies significantly.
Undesired long garbage collection or compaction pauses (longer than 0.5 to 1 second)
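In practice, switching is mostly a matter of replacing the collector flag. A minimal sketch (the jar name and the pause target value are just illustrative assumptions, not recommendations): drop -XX:+UseConcMarkSweepGC or -XX:+UseParallelOldGC and run something like
java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xms2g -Xmx2g -jar myapp.jar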
Have a look at related question for more details about G1GC and key parameters to be fine tuned:
Java 7 (JDK 7) garbage collection and documentation on G1
Regarding your other queries:
Is it really stable yet? Have the remaining issues (like leaks, spurious crashes and missing instrumentation support) been resolved? And if so, why does Oracle treat the G1GC as an undocumented (embarrassing?) secret? Is G1 perhaps a failed project that's now silently discontinued? Or do I need to pay for documentation and support? Or is it just still beta? Can someone enlighten me on what's going on here?
G1GC is stable.
I have not found any leaks in this algorithm.
Oracle did not keep it undocumented. You can find more info about G1GC here and here
G1 is not a failed project, and G1GC is going to be the default GC algorithm in newer versions of Java (Java 9).
You don't need to pay for support. It's not beta.

Related

Does Java G1 garbage collector respect MaxHeapFreeRatio parameter?

Does the Java G1 garbage collector (as implemented in Open JDK) respect the -XX:MaxHeapFreeRatio=n JVM parameter?
Does it respect it in Java 8?
I found JEP 346: Promptly Return Unused Committed Memory from G1
delivered in Java 12, but it's not clear to me what the state was before it.
A non-authoritative answer that I found is based on https://bugs.openjdk.java.net/browse/JDK-8078039?focusedCommentId=13632717&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13632717
Most notably, some GCs support it, some do not. This RFE doesn't say
which GC is used, so I assume we are talking about the default GC
which supports MaxHeapFreeRatio. G1 also supports this option.
To shrink the heap (and release memory) a full GC is required. If the
application doesn't trigger a full GC manually, it may take a while
before one is triggered by the JVM.
So the answer would be (for Java 8): yes... but no.
yes, it supports the option
but only when full GC happens, which may be never
(I remain curious whether there is a more authoritative source and what's the current state, in more modern JVMs)
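A minimal sketch of how you might observe this yourself on Java 8 with G1 (the class name and allocation sizes are illustrative assumptions); with -XX:MaxHeapFreeRatio set, the committed heap generally only shrinks once a full GC has run, which the explicit System.gc() call below requests:

import java.util.ArrayList;
import java.util.List;

public class ShrinkDemo {
    public static void main(String[] args) {
        // grow the heap with roughly 1 GB of temporary data
        List<byte[]> blocks = new ArrayList<>();
        for (int i = 0; i < 1024; i++) {
            blocks.add(new byte[1024 * 1024]);
        }
        blocks = null; // make it all unreachable
        System.gc();   // request a full GC; on Java 8 + G1 this is typically when
                       // committed memory can actually be returned to the OS
        long committedMb = Runtime.getRuntime().totalMemory() / (1024 * 1024);
        System.out.println("Committed heap after full GC: " + committedMb + " MB");
    }
}

Run it with something like -XX:+UseG1GC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=30 -Xms64m -Xmx2g and compare the committed heap before and after the System.gc() call (e.g. in the GC logs or with jstat).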
I do not know of a more authoritative answer than the source code, and yes, you are correct in your answer: the memory will be released only after a full GC (at least before that JEP).
For java-8:
That argument does matter; look here for example.
That code is not very complicated to understand, and here is the actual shrink that happens.
For a much more extensive answer (regarding java-11, but it still applies to java-8), read this.
The bottom line is that the flag does matter, but exactly how is implementation dependent. There is no simple answer to your question.

How to add a garbage collector of an older version to a JRE of newer version

I was asked a question as below in a recent interview.
How to add a garbage collector of an older version to a JRE of newer version
Couldn't get a proper answer from the internet. Can anyone explain this to me?
Thanks in advance
You can't. The garbage collector is part of the JVM, and can't just be moved around. If you have the source code for both you could consider trying to patch the collector back in - but it would probably be a very large task, and I'd be incredibly wary of the reliability of the result... other bits of the JVM may well depend on assumptions about the garbage collectors available.
Assuming these are different versions of the same VM, I'd expect a garbage collector only to be retired if the available ones were superior in most ways, so I don't expect you'd get significant benefit anyway - at least outside some very specialized situations.
(I'm glad that this was just an interview question rather than a real life situation. Although as Jigar says, maybe they weren't actually thinking of adding a particular collector to a VM that didn't really support it. Either way, it seems like a pretty bizarre interview question.)
Maybe the interviewer was looking for VM flags that you can pass to activate a particular garbage collector; for example, with JRE 7 you can still say -XX:+UseSerialGC to make it use the serial GC.

What versions of java are slow for gc logging?

I've been told by my company's support team that some versions of java have a significant performance impact when we turn on -verbose:gc. However I can't figure out if this is the case or not.
Was this logging slow(ish) at some point, and when did it stop?
The reason I ask is that there's some hesitation about applying this to a production environment to investigate potential memory leaks (and whether we can stop doing periodic restarts of the system...).
Specifically I'm talking about Java 1.4.2 which I think introduced the argument, and what service pack it applies up to.
I know you asked about the impact of verbose:gc (Amir is correct), but based on the comments I see you are investigating a memory leak.
Is it possible for you to get a histogram of your environment? Verbose GC will only show you that there is a memory leak, not where the memory is sitting.
You mention Java 1.4.2; is that your current version? If you are using 1.5 or higher you can use:
jmap -histo <pid> > file.txt
This will give you a breakdown of all the objects in memory. It will freeze your JVM for a time that depends on the amount of memory in the system (2GB can freeze for a minute or so even on good hardware), so test this on a development system first. I know you don't want to impact your production environment, but this is a necessary evil to find the source of the problem. Do a capture right before the periodic restart to lessen your impact.
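On later JDKs you can also use jmap -histo:live <pid>, which counts only reachable objects; note that it forces a full GC first, so the same freeze caveat applies.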
I suggest that you do the following:
Write some benchmark that is likely to stress the garbage collection. (Create large linked data structures with weak references, etc, etc).
Install a copy of the same version of the JVM as you are using in production on some test box.
Run the benchmark with various GC logging settings, including the settings that you want to run in production, measuring the performance impact on the benchmark.
If you do this right, it will give you some solid evidence about what the likely performance impact will be for your production server.
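As a starting point, here is a minimal sketch of such a benchmark, deliberately written against 1.4-era APIs so it will also run on an old production JVM (the class name and the iteration counts are illustrative assumptions; tune them until a run takes at least a few minutes):

import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

public class GcStress {
    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        List refs = new ArrayList();
        for (int i = 0; i < 5000000; i++) {
            // churn lots of short-lived objects held via weak references
            refs.add(new WeakReference(new byte[256]));
            if (refs.size() > 100000) {
                refs.subList(0, 50000).clear(); // periodically drop half of them
            }
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("Elapsed: " + elapsed + " ms");
    }
}

Run it several times with identical heap settings, once with and once without -verbose:gc (plus whatever other GC logging flags you plan to use in production), and compare the elapsed times.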

Why is memory management so visible in Java VM?

I'm playing around with writing some simple Spring-based web apps and deploying them to Tomcat. Almost immediately, I run into the need to customize the Tomcat's JVM settings with -XX:MaxPermSize (and -Xmx and -Xms); without this, the server easily runs out of PermGen space.
Why is this such an issue for Java VMs compared to other garbage collected languages? Comparing counts of "tune X memory usage" for X in Java, Ruby, Perl and Python, shows that Java has easily an order of magnitude more hits in Google than the other languages combined.
I'd also be interested in references to technical papers/blog-posts/etc explaining design choices behind JVM GC implementations, across different JVMs or compared to other interpreted language VMs (e.g. comparing Sun or IBM JVM to Parrot). Are there technical reasons why JVM users still have to deal with non-auto-tuning heap/permgen sizes?
The title of your question is misleading (not on purpose, I know): PermSize issues (and there are a lot of them; I was one of the first to diagnose a Tomcat/Sun PermGen issue years ago, when there wasn't any knowledge of the issue yet) are not specific to Java but to the Sun VM.
If you use a VM that doesn't use a permanent generation (like, say, an IBM VM if I'm not mistaken) you cannot have permgen issues.
So it is not a "Java" problem, but a Sun VM implementation problem.
Java gives you a bit more control over memory -- strike one for people wanting to apply that control, vs Ruby, Perl, and Python, which give you less control there. Java's typical implementation is also very memory hungry (because it has a more advanced garbage collection approach) compared with the typical implementations of the dynamic languages... but if you look at JRuby or Jython you'll find it's not a language issue (when these different languages use the same underlying VM, memory issues are pretty much equalized). I don't know of a widespread "Perl on JVM" implementation, but if there's one I'm willing to bet it wouldn't be measurably different in terms of footprint from JRuby or Jython!
Python/Perl/Ruby allocate their memory with malloc() or an optimization thereof. The limit to the heap space is determined by the operating system rather than the VM, so there's no need for options like -Xmxn. Also, the garbage collection is simpler, based mostly on reference counting. So there's a lot less to fine-tune.
Furthermore, dynamic languages tend to be implemented with bytecode interpreters rather than JIT compilers, so they aren't used for performance-critical code anyway.
The essence of @WizardOfOdds' and @Alex Martelli's answers appears to be correct: Java has an advanced set of GC options, and sometimes you need to tune them. However, I'm still not entirely clear on why you might design a JVM with or without a permanent generation. I have found a bunch of useful links about garbage collection in Java, though not necessarily in comparison to other languages with GC. Briefly:
The Sun GC evolves very slowly due to the fact that it is deployed everywhere and people may rely on quirks in its implementation.
Sun has detailed white papers on GC design and options, such as Tuning Garbage Collection with the 5.0 Java[tm] Virtual Machine.
There is a new GC in the wings, called the G1 GC. Alex Miller has a good summary of relevant blog posts and a link to the technical paper. But it still has a permanent generation (and doesn't necessarily do a great job with it).
Jon Masamitsu has (had?) an interesting blog at Sun covering various details of garbage collection.
Happy to update this answer with more details if anyone has them.
This is because Tomcat is running in the Java Virtual Machine, while other languages are either compiled or interpreted and run against your actual machine. When you set -Xmx and -Xms you are saying that you want the JVM to run like a computer with an amount of RAM somewhere in the set range.
I think the reason so many people run into this is that the default values are relatively low and people end up hitting the default ceiling pretty quickly (instead of waiting until you run out of actual RAM as you would with other languages).

Killer facility or scenario that would make another JVM a better choice than the Sun JVM?

For Java SE there are several JVM's available for running in production on x86:
IBM J9
Oracle JRockit - http://www.oracle.com/technology/products/jrockit/index.html
Apache Harmony - http://harmony.apache.org/
The one in OS X (if a Mac) which appears to be Sun with Aqua Swing.
OpenJDK
plus some custom offerings for running on a server:
Azul - http://www.azulsystems.com/
Google App Engine Java - http://code.google.com/intl/da/appengine/docs/java/overview.html
Other platforms:
Sun Solaris JVM - better scalability than x86?
(edit) GNU compiler for Java - http://gcc.gnu.org/java/ - can compile to native code on multiple platforms.
The Sun JVM has a distinct advantage with the jvisualvm program, which allows runtime inspection of running code. Are there any technical advantages of any other JVM that might make it a better choice for development and/or production?
In other words, is there a killer facility or scenario that would make any investment of time/effort/money worth it in another JVM?
(Please also suggest additional JVM's if they would be a good choice).
JRockit comes with JRockit Mission Control, which is a tools suite you can use to monitor the JVM and your application. You can download it here, it's free to use for development.
Mission Control has a lot of features that VisualVM is missing, for instance an online memory leak detector, a latency analyzer, Eclipse integration, JMX logging to file, etc. If you want to compare VisualVM with Mission Control, here are the release notes and the documentation for the latest version.
IBM J9
This is the kind of sales speech you can read or hear about J9:
IBM has released an SDK for Java 6. Product binaries are available for Linux on x86 and 64-bit AMD, and AIX for PPC for 32- and 64-bit. In addition to supporting the Java SE 6 Platform specification, the new SDK also focuses on: data sharing between Java Virtual Machines, enhanced diagnostics information, operating system stack backtraces, an updated jdmpview tool, platform stability, and performance.
Some would say that the IBM SDK has some advantages beyond speed, that the use and expansion of PermGenSpace is much better than in the Sun SDK or GCJ (not a big deal for client applications, but heavy lifting J2EE servers, especially portal servers, can really cause the Sun JDK heartburn). But, according to this paper comparing Sun vs IBM JVM GC, it turns out that memory performance depends mostly on the application and not so much on the VM.
So, while it's true that the IBM JVM is well known for its troubleshooting features (more advanced than Sun's JVM), I'm not convinced by the differences at the GC level.
And Sun's JVM has a big advantage over IBM, at least on Solaris: DTrace providers. Actually, I've been mainly working with Weblogic on Solaris so Sun' JVM has always been the natural choice.
Oracle JRockit
I did some benchmarks of BEA/Oracle JRockit some years ago and it was indeed a fast VM and it was then supporting bigger heaps than Sun's VM at this time. But it has some stability problems which is not really good for production. Things might have changed since then though.
Apache Harmony
I might be wrong but, to me, Harmony is made of code donations from IBM (benefits: the community is doing maintenance) and I don't really see why I should consider Harmony rather than IBM J9.
Apple's JDK
I never had to use Mac for production so I can't really answer. I just remember Apple needed some time to bundle Java 6, and I don't know why. This is maybe not rational but this makes me suspicious.
OpenJDK
I know that some vendors are offering production support (e.g. RedHat with RHEL 5.3+, see this blog entry) for OpenJDK, so it might be an option for platforms not supported by Sun. However, unless someone can tell me what makes OpenJDK work better than Sun's, I think I'll install the Sun JVM on supported platforms.
So to me, the choice is actually: Sun's JVM, unless I have to run some WebSphere stuff, in which case I'd choose IBM J9. But to be honest, I've never faced a situation that I couldn't solve on Sun's JVM and that could have justified (temporarily) swapping to IBM's, so I can't actually tell whether the troubleshooting features are that nice. But I admit that I may suffer from a lack of knowledge of IBM's JVM.
Some applications, like financial and computational science would benefit greatly from hardware implementations of decimal floating point. IBM has rolled out a series of processors (POWER6, the z9 and z10 mainframes) which implement the IEEE 754-2008 decimal floating point standard. The latest IBM JDK can use hardware acceleration for BigDecimals.
To allow developers to easily take advantage of the dedicated DFP hardware, IBM
Developer Kit for Java 6 has built-in support for 64-bit DFP through the
BigDecimal class library. The JVM seamlessly uses the DFP hardware when
available to remove the need for computationally expensive software-based decimal
arithmetic, thus improving application performance.
It's a very fringe case, but if you have a z10 mainframe handy and want to use its decimal floating point unit in Java, then the IBM JVM would be a better choice than the Sun JVM.
-- Flaviu Cipcigan
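For a rough sense of the kind of code this affects (not tied to any particular JVM; the class name and figures are purely illustrative), anything dominated by BigDecimal arithmetic is a candidate:

import java.math.BigDecimal;

public class DecimalHeavy {
    public static void main(String[] args) {
        BigDecimal rate = new BigDecimal("0.0725");
        BigDecimal total = BigDecimal.ZERO;
        for (int i = 0; i < 1000000; i++) {
            // a typical monetary calculation: price * tax rate, accumulated
            total = total.add(new BigDecimal("19.99").multiply(rate));
        }
        System.out.println("Total: " + total);
    }
}

On an IBM JVM with DFP hardware, loops like this can reportedly be executed on the decimal unit; elsewhere the same code runs in software.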
The typical feature/scenario that you should look at is performance and speed.
No matter what the white papers say, ultimately you need to benchmark and decide for yourself.
I am biased towards IBM, because I worked there a few years ago. I didn't personally deal with jvm development, but I remember that the emphasis of the jvm development group was on the following points:
Proprietary garbage collection optimizations. This includes not only faster GC, but also more configuration options, like GC for server and client. This was before Sun offered similar options.
Much faster (x10) native performance with the JNI interface. This was particularly important at the time when Eclipse/WSAD began to gain traction and SWT was heavily used. If your app uses JNI a lot, then I think it's worthwhile for you to benchmark the IBM JDK against the Sun JDK.
Stability and reliability. I think this is only relevant if you buy commercial support from IBM, like an SLA for a service tier (WebSphere and DB2, clustered environment, etc.). In this case, IBM will guarantee the stability of their offering only if you use their JVM.
Regarding OpenJDK, I recommend that you look at this history of OpenJDK. My understanding is that OpenJDK 7 will be almost identical to Sun's jdk 7, so the performance is very likely to be identical. The primary differences will be licensing, and small components like webstart.
Oracle's jvm is useful if you want to run java code from within your database (via stored-procedure). It includes some optimizations that help the db run faster in this scenario.
As others have said, Sun has been catching up on their competitors. I think that in the 1.4 days the differences were much more noticeable, but not so much today. Regarding jvisualvm, other vendors also offer similar tools, so I don't think that is an issue.
Finally, there is one other metric (albeit a bit controversial) to indicate how serious those vendors are about their VMs. That is the number of related patents that they issue. It might be useful if you need to convince your boss, or if you like to read patents :)
Patent search: ibm and java - 4559 patents.
Patent search: oracle and java - 323.
Not strictly a JVM, but still a Java implementation: gcj. It has the advantage of supporting many processors, so if you target one of the embedded processors, gcj may be your only choice. Plus, since it is a true compiler (not just a JIT), you save the overhead of JIT compilation (both in memory and cycles) on the embedded target.
Back in the Java 1.4 days my team used the IBM JVM for a high-volume, message-based system running on Linux. Why did we do this? Because we benchmarked the different JVMs! JRockit was actually the fastest, but it would occasionally crash, so not so great for production.
As always with these things, measure it!
A few years back (JDK1.4), different JVMs had different advantages:
the IBM JVM was able to do heap dumps (programmatically, on signals, or on OOM), and the heaproot utility was very useful to track memory leaks (less intrusive than profilers). No other JVM had this.
JRockit had many useful options that the Sun JVM didn't have, such as parallel collection. It was also (much) faster (and more stable) than the Sun VM.
Today, the Sun one has these features, but I'm sure there are others.
Speed could be one. Garbage collection strategies another.
If you're using WebLogic, some bugs in the Sun JVM may lead to bugs in WebLogic. These bugs are more likely to be solved faster in JRockit.
From what I've been told, the main difference between Sun's JVM and IBM's JVM is in the actual garbage collectors: IBM's garbage collector(s?) are much more configurable than Sun's and are designed with only the business world in mind. Additionally, IBM's JVM can tell a lot more than "I just crashed, here's my heapdump" in error situations, which is obviously important in the business world -- the main living space of IBM's JVM.
So assuming I haven't been lied to, I'd say that IBM's JVM should be used when doing memory-intensive things in business software which relies on aggressive or otherwise highly tunable garbage collection.
In my own experience and at face value, I see simplicity in the IBM GC design. Sun's various newer GCs are excellent, no doubt, and offer a host of tuning options at minute levels, but even in some of the most active web apps I know of, which allocate new objects heavily and keep a lot in the heap for cache, I rarely see GC overhead exceed 1%, even while trying to keep the footprint low. Sure, we could probably tune it better, but there's a diminishing return.
I have had much more of a challenge in the exact same applications running IBM's JDK. In particular having issues with pinned clusters and having to tune -Xk.
Now I could mention about a dozen items that both IBM and Sun should implement for the killer JVM but not the scope of your question I presume :)
Incremental garbage collection and a very small runtime size for realtime embedded systems are the only things that would really matter enough to warrant chancing less stability or platform support. There was some JVM designed with this in mind, but I forget its name now.
