Analysing large Java heap dumps - memory error

Analysing large Java heap dumps - memory error - java

I have a very peculiar problem. I have a heap dump of 30 GB and I want to analyze the same on my laptop (which has 8 GB of RAM). I tried doing that with MAT and IBM Heap analyzer, but as per their recommendation the Xmx size should be more than the dump size. I also tried to analyze the heap dump with the heapDumpParser.bat file of MAT but received memory error.
Any suggestions on how I can analyze the dump on my laptop successfully?
Thanks in advance!

Memory Analyzer is probably the best tool for analysing out of memory issues but it does require a lot of memory.
If you are unable to find a machine large enough to run to handle your dump you could try using the jdmpview command line tool that ships with the IBM SDK to perform some basic investigation.
It will work best with the core dumps generated on out of memory rather than the phd files as it does not need to load the contents into memory.
You can find it in jre/bin and need to run:
jdmpview -core core_file_name
You should probably start by running the command:
info class
as that will generate a basic list of object types, instance counts and sizes.
There are full docs here:
http://www-01.ibm.com/support/knowledgecenter/SSYKE2_8.0.0/com.ibm.java.win.80.doc/diag/tools/dump_viewer_dtfjview/dump_viewer.html

Related

How do I analyze a Java heap dump when local memory is less than the size of the dumped heap? [duplicate]

I have a HotSpot JVM heap dump that I would like to analyze. The VM ran with -Xmx31g, and the heap dump file is 48 GB large.
I won't even try jhat, as it requires about five times the heap memory (that would be 240 GB in my case) and is awfully slow.
Eclipse MAT crashes with an ArrayIndexOutOfBoundsException after analyzing the heap dump for several hours.
What other tools are available for that task? A suite of command line tools would be best, consisting of one program that transforms the heap dump into efficient data structures for analysis, combined with several other tools that work on the pre-structured data.

Normally, what I use is ParseHeapDump.sh included within Eclipse Memory Analyzer and described here, and I do that onto one our more beefed up servers (download and copy over the linux .zip distro, unzip there). The shell script needs less resources than parsing the heap from the GUI, plus you can run it on your beefy server with more resources (you can allocate more resources by adding something like -vmargs -Xmx40g -XX:-UseGCOverheadLimit to the end of the last line of the script.
For instance, the last line of that file might look like this after modification
./MemoryAnalyzer -consolelog -application org.eclipse.mat.api.parse "$#" -vmargs -Xmx40g -XX:-UseGCOverheadLimit
Run it like ./path/to/ParseHeapDump.sh ../today_heap_dump/jvm.hprof
After that succeeds, it creates a number of "index" files next to the .hprof file.
After creating the indices, I try to generate reports from that and scp those reports to my local machines and try to see if I can find the culprit just by that (not just the reports, not the indices). Here's a tutorial on creating the reports.
Example report:
./ParseHeapDump.sh ../today_heap_dump/jvm.hprof org.eclipse.mat.api:suspects
Other report options:
org.eclipse.mat.api:overview and org.eclipse.mat.api:top_components
If those reports are not enough and if I need some more digging (i.e. let's say via oql), I scp the indices as well as hprof file to my local machine, and then open the heap dump (with the indices in the same directory as the heap dump) with my Eclipse MAT GUI. From there, it does not need too much memory to run.
EDIT:
I just liked to add two notes :
As far as I know, only the generation of the indices is the memory intensive part of Eclipse MAT. After you have the indices, most of your processing from Eclipse MAT would not need that much memory.
Doing this on a shell script means I can do it on a headless server (and I normally do it on a headless server as well, because they're normally the most powerful ones). And if you have a server that can generate a heap dump of that size, chances are, you have another server out there that can process that much of a heap dump as well.

First step: increase the amount of RAM you are allocating to MAT. By default it's not very much and it can't open large files.
In case of using MAT on MAC (OSX) you'll have file MemoryAnalyzer.ini file in MemoryAnalyzer.app/Contents/MacOS. It wasn't working for me to make adjustments to that file and have them "take". You can instead create a modified startup command/shell script based on content of this file and run it from that directory. In my case I wanted 20 GB heap:
./MemoryAnalyzer -vmargs -Xmx20g --XX:-UseGCOverheadLimit ... other params desired
Just run this command/script from Contents/MacOS directory via terminal, to start the GUI with more RAM available.

I suggest trying YourKit. It usually needs a little less memory than the heap dump size (it indexes it and uses that information to retrieve what you want)

The accepted answer to this related question should provide a good start for you (if you have access to the running process, generates live jmap histograms instead of heap dumps, it's very fast):
Method for finding memory leak in large Java heap dumps
Most other heap analysers (I use IBM http://www.alphaworks.ibm.com/tech/heapanalyzer) require at least a percentage of RAM more than the heap if you're expecting a nice GUI tool.
Other than that, many developers use alternative approaches, like live stack analysis to get an idea of what's going on.
Although I must question why your heaps are so large? The effect on allocation and garbage collection must be massive. I'd bet a large percentage of what's in your heap should actually be stored in a database / a persistent cache etc etc.

This person http://blog.ragozin.info/2015/02/programatic-heapdump-analysis.html
wrote a custom "heap analyzer" that just exposes a "query style" interface through the heap dump file, instead of actually loading the file into memory.
https://github.com/aragozin/heaplib
Though I don't know if "query language" is better than the eclipse OQL mentioned in the accepted answer here.

The latest snapshot build of Eclipse Memory Analyzer has a facility to randomly discard a certain percentage of objects to reduce memory consumption and allow the remaining objects to be analyzed. See Bug 563960 and the nightly snapshot build to test this facility before it is included in the next release of MAT. Update: it is now included in released version 1.11.0.

A not so well known tool - http://dr-brenschede.de/bheapsampler/ works well for large heaps. It works by sampling so it doesn't have to read the entire thing, though a bit finicky.

This is not a command line solution, however I like the tools:
Copy the heap dump to a server large enough to host it. It is very well possible that the original server can be used.
Enter the server via ssh -X to run the graphical tool remotely and use jvisualvm from the Java binary directory to load the .hprof file of the heap dump.
The tool does not load the complete heap dump into memory at once, but loads parts when they are required. Of course, if you look around enough in the file the required memory will finally reach the size of the heap dump.

I came across an interesting tool called JXray. It provides limited evaluation trial license. Found it very useful to find memory leaks. You may give it a shot.

Try using jprofiler , its works good in analyzing large .hprof, I have tried with file sized around 22 GB.
https://www.ej-technologies.com/products/jprofiler/overview.html
$499/dev license but has a free 10 day evaluation

When the problem can be "easily" reproduced, one unmentioned alternative is to take heap dumps before memory grows that big (e.g., jmap -dump:format=b,file=heap.bin <pid>).
In many cases you will already get an idea of what's going on without waiting for an OOM.
In addition, MAT provides a feature to compare different snapshots, which can come handy (see https://stackoverflow.com/a/55926302/898154 for instructions and a description).

Memory Leak Suspects

In our team, we are using a service which has spill over problem. It is caused by long API latency in which GC time took most of the parts. Then I found that the heap memory usage is very high. I got the heap dump using jmap for the service which is about 4.4 GB. I used the Eclipse Memory Analyzer to parse the heap dump. I found that 2.8GB of the heap dump is unreachable objects.
Anyone has the suggestions that what should I do to further debug this problem?
Thank you.

If you have a heap dump from when it run out of memory, I suggest use MAT to find any suspicious dominator trees which is narrow reference path to a large retained set size.
It could be same classes are ending up in different class loaders or could be HTTP session retention if web application or bad cache problem.
I suggest you, start with simple things first.
Quick look at what and where jars are being loaded.
Make sure class unloading is enabled (with CMS).
Use memory analyser tool on the heap dump to see exactly what is being retained and by whom.

Usage of XX:HeapDumpSegmentSize and XX:SegmentedHeapDumpThreshold

I've encountered a problem that dumping a heap fails due to the following message:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /usr/local/webapp/logs/java_pid<MY_PID>.hprof ...
Dump file is incomplete: file size limit
I've found a similar QA here: XX:+HeapDumpOnOutOfMemoryError Max file size limit, but it just refers to options we can use to solve this problem, not how to use them. Searching on the web doesn't help me either because there's little information on these. To make matters worse, I cannot easily test them because the problem seldom occurs.
I found that my application reached its max heap size(8GB) and tried to dump the heap, but failed due to the error above and the dumped file size is only 2.2GB, which seems incomplete and any tools (like jhat, jconsole, jvirtualvm) can't open it.
How can I avoid the problem by using options XX:HeapDumpSegmentSize and XX:SegmentedHeapDumpThreshold?
Environment
Java 1.8.0_71
Options that seem to be relevant:
-Xmx8196m
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=$LOG_DIR
-XX:OnOutOfMemoryError="kill -9 %p"

HeapDumpSegmentSize and SegmentedHeapDumpThreshold are developer options, they cannot be changed in production JVM. These options will not help you anyway, because they are not about splitting heap dump into multiple files, but rather about creating multiple segments inside one file.
Make sure that ulimit -f is large enough to create 8G files.

Java : Get heap dump without jmap or without hanging the application

In few circumstance, our application is using around 12 GB of memory.
We tried to get the heap dump using jmap utility. Since the application is using some GB of memory it causes the application to stop responding and causes problem in production.
In our case the heap usage suddenly increases from 2-3 GB to 12GB in 6 hours. In an attempt to find teh memory usage trend we tried to collect the heap dump every one hour after restarting the application. But as said since using the jmap causes the application to hang we need to restart it and we are not able to get the trend of memory usage.
Is there a way to get the heap dump without hanging the application or is there a utility other than jmap to collect heap dump.
Thoughts on this highly appreciated, since without getting the trend of memory usage it is highly difficult to fix the issue.
Note: Our application runs in CentOS.
Thanks,
Arun

Try the following. It comes with JDK >= 7:
/usr/lib/jvm/jdk-YOUR-VERSION/bin/jcmd PID GC.heap_dump FILE-PATH-TO-SAVE
Example:
/usr/lib/jvm/jdk1.8.0_91/bin/jcmd 25092 GC.heap_dump /opt/hd/3-19.11-jcmd.hprof
This dumping process is much faster than dumping with jmap! Dumpfiles are much smaller, but it's enough to give your the idea, where the leaks are.
At the time of writing this answer, there are bugs with Memory Analyzer and IBM HeapAnalyzer, that they cannot read dumpfiles from jmap (jdk8, big files). You can use Yourkit to read those files.

First of all, it is (AFAIK) essential to freeze the JVM while a thread dump / snapshot is being taken. If JVM was able to continue running while the snapshot was created, it would be next to impossible to get a coherent snapshot.
So are there other ways to get a heap dump?
You can get a heap dump using VisualVM as described here.
You can get a heap dump using jconsole or Eclipse Memory Analyser as described here.
But all of these are bound to cause the JVM to (at least) pause.
If your application is actually hanging (permanently!) that sounds like a problem with your application itself. My suggestion would be to see if you can track down that problem before looking for the storage leak.
My other suggestion is that you look at a single heap dump, and use the stats to figure out what kind(s) of object are using all of the space ... and why they are reachable. There is a good chance that you don't need the "trend" information at all.

You can use GDB to get the heap dump without running jmap on the target VM however this will still hang the application for the amount of time required to write the heap dump to disk. Assuming a disk speed of 100MB/s (a basic mirrored array or single disk) this is still 2 minutes of downtime.
http://blogs.atlassian.com/2013/03/so-you-want-your-jvms-heap/
The only true way to avoid stopping the JVM is transactional memory and a kernel that takes advantage of it to provide a process snapshot facility. This is one of the dreams of the proponents of STM but it's not available yet. VMWare's hot-migration comes close but depends on your allocation rate not exceeding network bandwidth and it doesn't save snapshots. Petition them to add it for you, it'd be a neat feature.

A heap dump analyzed with the right tool will tell you exactly what is consuming the heap. It is the best tool for tracking down memory leaks. However, collecting a heap dump is slow let alone analyzing it.
With knowledge of the workings of your application, sometimes a histogram is enough to give you a clue of where to look for the problem. For example, if MyClass$Inner is at the top of the histogram and MyClass$Inner is only used in MyClass, then you know exactly which file to look for a problem.
Here's the command for collecting a histogram.
jcmdpidGC.class_histogram filename=histogram.txt

To add to Stephen's answers, you can also trigger a heap dump via API for the most common JVM implementations:
example for the Oracle JVM
API for the IBM JVM

Running out of memory while analyzing a Java Heap Dump

I have a curious problem, I need to analyze a Java heap dump (from an IBM JRE) which has 1.5GB in size, the problem is that while analyzing the dump (I've tried HeapAnalyzer and the IBM Memory Analyzer 0.5) the tools runs out of memory I can't really analyze the dump. I have 3GB of RAM in my machine, but seems like it's not enough to analyze the 1.5 GB dump,
My question is, do you know a specific tool for heap dump analysis (supporting IBM JRE dumps) that I could run with the amount of memory I have?
Thanks.

Try the SAP memory analyzer tool, which also has an eclipse plugin. This tool creates index files on disk as it processes the dump file and requires much less memory than your other options. I'm pretty sure it supports the newer IBM JRE's. That being said - with a 1.5 GB dump file, you might have no other option but to run a 64-bit JVM to analyze this file - I usually estimate that a heap dump file of size n takes 5*n memory to open using standard tools, and 3*n memory to open using MAT, but your milage will vary depending on what the dump actually contains.

It's going to difficult to analyze 1.5GB heap dump on a 3GB RAM. Because in that 3GB your OS, other processes, services,... easily would occupy 0.5 GB. So you are left with only 2.5GB. heapHero tool is efficient in analyzing heap dumps. It should take only 0.5GB more than the size of heap dump to analyze. You can give it try. But best recommendation is to analyze heap dump on a machine which has adequate memory OR you can get an AWS ec2 instance just for the period of analyzing heap dumps. After analyzing heap dumps, you can terminate the instance.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.