How to debug Java OutOfMemory exceptions?

What is the best way to debug java.lang.OutOfMemoryError exceptions?
When this happens to our application, our app server (Weblogic) generates a heap dump file. Should we use the heap dump file? Should we generate a Java thread dump? What exactly is the difference?
Update: What is the best way to generate thread dumps? Is kill -3 (our app runs on Solaris) the best way to kill the app and generate a thread dump? Is there a way to generate the thread dump but not kill the app?

Analyzing and fixing out-of-memory errors in Java is very simple.
In Java the objects that occupy memory are all linked to some other objects, forming a giant tree. The idea is to find the largest branches of the tree, which will usually point to a memory leak situation (in Java, you leak memory not when you forget to delete an object, but when you forget to forget the object, i.e. you keep a reference to it somewhere).
Step 1. Enable heap dumps at run time
Run your process with -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp
(It is safe to have these options always enabled. Adjust the path as needed, it must be writable by the java user)
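For example, a full launch command with these flags might look like the following (the jar name and heap size are placeholders for illustration, not from the original answer):
java -Xmx2g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -jar yourapp.jar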
Step 2. Reproduce the error
Let the application run until the OutOfMemoryError occurs.
The JVM will automatically write a file like java_pid12345.hprof.
Step 3. Fetch the dump
Copy java_pid12345.hprof to your PC (it will be at least as big as your maximum heap size, so can get quite big - gzip it if necessary).
Step 4. Open the dump file with IBM's Heap Analyzer or Eclipse's Memory Analyzer
The Heap Analyzer will present you with a tree of all objects that were alive at the time of the error.
Chances are it will point you directly at the problem when it opens.
Note: give HeapAnalyzer enough memory, since it needs to load your entire dump!
java -Xmx10g -jar ha456.jar
Step 5. Identify areas of largest heap use
Browse through the tree of objects and identify objects that are kept around unnecessarily.
Note it can also happen that all of the objects are necessary, which would mean you need a larger heap. Size and tune the heap appropriately.
Step 6. Fix your code
Make sure to only keep objects around that you actually need. Remove items from collections in a timely manner. Make sure to not keep references to objects that are no longer needed, only then can they be garbage-collected.
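To make Step 6 concrete, here is a minimal sketch of the most common "forgot to forget" pattern and its fix; the SessionCache class and its names are hypothetical, not taken from any answer above:

// A static map that only ever grows keeps every value reachable forever.
class SessionCache {
    private static final java.util.Map<String, byte[]> CACHE = new java.util.HashMap<>();

    static void store(String sessionId, byte[] data) {
        CACHE.put(sessionId, data);      // entries are added on every request...
    }

    // ...so the fix is to remove them once they are no longer needed
    // (or use a bounded/expiring cache).
    static void invalidate(String sessionId) {
        CACHE.remove(sessionId);
    }
}

In a heap dump from such a leak, the CACHE map (and the byte[] values it holds) would show up as one of the largest branches of the object tree described above.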

I've had success using a combination of Eclipse Memory Analyzer (MAT) and Java VisualVM to analyze heap dumps. MAT has some reports that you can run that give you a general idea of where to focus your efforts within your code. VisualVM has a better interface (in my opinion) for actually inspecting the contents of the various objects that you are interested in examining. It has a filter where you can have it display all instances of a particular class and see where they are referenced and what they reference themselves. It has been a while since I've used either tool for this, so they may have a closer feature set now. At the time, using both worked well for me.

What is the best way to debug java.lang.OutOfMemoryError exceptions?
The OutOfMemoryError's detail message describes the kind of failure, so check that message first; each cause calls for a different fix.
There are various root causes for out-of-memory errors. Refer to the Oracle documentation page for more details.
java.lang.OutOfMemoryError: Java heap space:
Cause: The detail message "Java heap space" indicates that an object could not be allocated in the Java heap.
java.lang.OutOfMemoryError: GC Overhead limit exceeded:
Cause: The detail message "GC overhead limit exceeded" indicates that the garbage collector is running all the time and Java program is making very slow progress
java.lang.OutOfMemoryError: Requested array size exceeds VM limit:
Cause: The detail message "Requested array size exceeds VM limit" indicates that the application (or APIs used by that application) attempted to allocate an array that is larger than the heap size.
java.lang.OutOfMemoryError: Metaspace:
Cause: Java class metadata (the virtual machine's internal representation of Java classes) is allocated in native memory (referred to here as metaspace). If the metaspace for class metadata is exhausted, this error is thrown.
java.lang.OutOfMemoryError: request <size> bytes for <reason>. Out of swap space?:
Cause: The detail message "request <size> bytes for <reason>. Out of swap space?" appears to be an OutOfMemoryError exception. However, the Java HotSpot VM code reports this apparent exception when an allocation from the native heap failed and the native heap might be close to exhaustion.
java.lang.OutOfMemoryError: Compressed class space
Cause: On 64-bit platforms a pointer to class metadata can be represented by a 32-bit offset (with UseCompressedOops). This is controlled by the command line flag UseCompressedClassPointers (on by default).
If UseCompressedClassPointers is enabled, the amount of space available for class metadata is fixed at CompressedClassSpaceSize. If the space needed for class metadata exceeds CompressedClassSpaceSize, a java.lang.OutOfMemoryError with detail "Compressed class space" is thrown.
Note: There is more than one kind of class metadata - klass metadata and other metadata. Only klass metadata is stored in the space bounded by CompressedClassSpaceSize. The other metadata is stored in Metaspace.
Should we use the heap dump file? Should we generate a Java thread dump? What exactly is the difference?
Yes. You can use the heap dump file to debug the issue with profiling tools like VisualVM or MAT.
You can use a thread dump to get further insight into the status of the threads.
Refer to this SE question to understand the differences:
Difference between javacore, thread dump and heap dump in Websphere
What is the best way to generate thread dumps? Is kill -3 (our app runs on Solaris) the best way to kill the app and generate a thread dump? Is there a way to generate the thread dump but not kill the app?
kill -3 <process_id> generates a thread dump, and this command does not kill the Java process.
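If you would rather use a JDK tool than a signal, jstack and jcmd (both shipped with the JDK) produce the same kind of thread dump without stopping the process; the PID 12345 and output paths below are placeholders:
jstack 12345 > /tmp/threaddump.txt
jcmd 12345 Thread.print > /tmp/threaddump.txt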

It is generally very difficult to debug OutOfMemoryError problems. I'd recommend using a profiling tool. JProfiler works pretty well. I've used it in the past and it can be very helpful, but I'm sure there are others that are at least as good.
To answer your specific questions:
A heap dump is a complete view of the entire heap, i.e. all objects that have been created with new. If you're running out of memory then this will be rather large. It shows you how many of each type of object you have.
A thread dump shows you the stack for each thread, showing you where in the code each thread is at the time of the dump. Remember that any thread could have caused the JVM to run out of memory but it could be a different thread that actually throws the error. For example, thread 1 allocates a byte array that fills up all available heap space, then thread 2 tries to allocate a 1-byte array and throws an error.
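A hedged sketch of that scenario (the class and thread names are invented for illustration; which thread actually gets the OutOfMemoryError depends on timing, so treat it as a demonstration rather than a guaranteed reproduction):

// "hog" consumes nearly all heap; the error may then be thrown in "victim",
// whose own allocation is tiny.
import java.util.ArrayList;
import java.util.List;

public class OomVictimDemo {
    static final List<byte[]> HOLD = new ArrayList<>();

    public static void main(String[] args) {
        Thread hog = new Thread(() -> {
            while (true) {
                HOLD.add(new byte[1024 * 1024]);   // fills the heap 1 MB at a time
            }
        }, "hog");

        Thread victim = new Thread(() -> {
            while (true) {
                byte[] tiny = new byte[16];        // a tiny allocation can still be the one that fails
                try { Thread.sleep(5); } catch (InterruptedException e) { return; }
            }
        }, "victim");

        hog.start();
        victim.start();
    }
}

Run it with a small heap (for example -Xmx64m) and compare the thread named in the stack trace with the objects that actually dominate the heap dump.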

You can also use jmap/jhat to attach to a running Java process. This family of tools is really useful if you have to debug a live, running application.
You can also leave jmap running as a cron task, logging to a file which you can analyse later (this is something we have found useful for debugging a live memory leak):
jmap -histo:live <pid> | head -n <top N things to look for> > <output.log>
jmap can also be used to generate a heap dump using the -dump option, which can then be read with jhat.
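For example, assuming a process id of 12345 (the PID and file path are placeholders; jhat then serves its analysis on http://localhost:7000 by default):
jmap -dump:format=b,file=/tmp/heap.hprof 12345
jhat -J-Xmx4g /tmp/heap.hprof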
See the following link for more details
http://www.lshift.net/blog/2006/03/08/java-memory-profiling-with-jmap-and-jhat
Here is another link to bookmark
http://java.sun.com/developer/technicalArticles/J2SE/monitoring/

It looks like IBM provides a tool for analyzing those heap dumps: http://www.alphaworks.ibm.com/tech/heaproots ; more at http://www-01.ibm.com/support/docview.wss?uid=swg21190476 .

Once you get a tool to look at the heap dump, look at any thread that was in the Running state in the thread stack. It's probably one of those that got the error. Sometimes the heap dump will tell you which thread had the error right at the top.
That should point you in the right direction. Then employ standard debugging techniques (logging, debugger, etc.) to home in on the problem. Use the Runtime class to get the current memory usage and log it as the method or process in question executes.
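A minimal sketch of that Runtime-based logging (the HeapLogger class and method name are made up for illustration):

public class HeapLogger {
    // Call this at interesting points in the suspect code path.
    public static void logHeapUsage(String where) {
        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
        long maxMb  = rt.maxMemory() / (1024 * 1024);
        System.out.println(where + ": using " + usedMb + " MB of " + maxMb + " MB max heap");
    }
}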

I generally use Eclipse Memory Analyzer. It displays the suspected culprits (the objects occupying most of the heap dump) and the different call hierarchies that are generating those objects. Once that mapping is there, we can go back to the code and try to understand whether there is a possible memory leak anywhere in the code path.
However, an OOM doesn't always mean there is a memory leak. It's always possible that the memory needed by an application during the stable state or under load simply isn't available on the hardware/VM. For example, there could be a 32-bit Java process (max memory used ~4 GB) whereas the VM has just 3 GB. In such a case, the application may initially run fine, but an OOM may be encountered as the memory requirement approaches 3 GB.
As mentioned by others, capturing a thread dump is not costly, but capturing a heap dump is. I have observed that while a heap dump is being captured the application (generally) freezes, and only a kill followed by a restart helps it recover.

Related

WebSphere out of memory error

We use WebSphere Application Server for our application and we regularly get an out-of-memory error. To debug this we added logging to check used memory at certain places, and below is the observation.
The used memory does not decrease until it reaches the threshold limit. We use the following memory configuration:
InitialHeapSize="1024" maximumHeapSize="2048"
So until it crosses 1024 the memory is not released. In the case of the OOM error, the memory is not released even though some threads are no longer in use.
I assumed that the heap size was not being released. But the Java Runtime API shows that there is memory available. Java operations like method calls and string operations are working, but a JNDI lookup fails with an out-of-memory exception. As a result, the system is failing because of the unavailability of a connection.
Stack trace:
com.ibm.websphere.naming.CannotInstantiateObjectException: Exception occurred while the JNDI NamingManager was processing a javax.naming.Reference object. [Root exception is java.lang.OutOfMemoryError]
at com.ibm.ws.naming.util.Helpers.processSerializedObjectForLookupExt(Helpers.java:1033)
at com.ibm.ws.naming.util.Helpers.processSerializedObjectForLookup(Helpers.java:730)
Dynamo, you will have to perform a heap analysis to find out what is causing the OOM for you. It is free tooling that allows you to find out what is causing the issue on the server. Maybe it is a rogue application that is holding too much memory, or a resource that is leaking memory, etc.
You can look at this for more information. Your initial heap and maximum heap settings are something you will want to tune (if the heap is too large, CPU will spike during long GC pauses; if it is too small, GC will run too frequently and add constant overhead).
https://www.ibm.com/developerworks/community/groups/service/html/communityview?communityUuid=4544bafe-c7a2-455f-9d43-eb866ea60091
You need to generate a heap dump and a thread dump via wsadmin and analyze them for root causes.
There will be some differences depending on the platform and edition you are using, but there is built-in support for generating heap dumps.
See, for example:
http://www.ibm.com/support/knowledgecenter/SSAW57_8.5.5/com.ibm.websphere.nd.doc/ae/tprf_enablingheapdump.html
Generally, you will either want to enable generation of heap dumps, force an OOM, and then use HeapAnalyzer to analyze the resulting heap dump; or you can manually generate heap dumps when large memory usage is observed.
Some caution: What may look like a memory leak may be a very large but transient memory use. A view of memory usage over time will be needed to conclude that there is an actual leak.
Regardless, the path for handling this sort of problem inevitably leads to generating a heap dump and doing analysis.

Java : Get heap dump without jmap or without hanging the application

In some circumstances, our application uses around 12 GB of memory.
We tried to get a heap dump using the jmap utility. Since the application uses several GB of memory, taking the dump causes the application to stop responding, which creates problems in production.
In our case the heap usage suddenly increases from 2-3 GB to 12 GB within 6 hours. In an attempt to find the memory usage trend, we tried to collect a heap dump every hour after restarting the application. But, as said, since using jmap causes the application to hang, we need to restart it, and we are not able to get the trend of memory usage.
Is there a way to get a heap dump without hanging the application, or is there a utility other than jmap to collect the heap dump?
Thoughts on this are highly appreciated, since without the trend of memory usage it is very difficult to fix the issue.
Note: Our application runs in CentOS.
Thanks,
Arun
Try the following. It comes with JDK >= 7:
/usr/lib/jvm/jdk-YOUR-VERSION/bin/jcmd PID GC.heap_dump FILE-PATH-TO-SAVE
Example:
/usr/lib/jvm/jdk1.8.0_91/bin/jcmd 25092 GC.heap_dump /opt/hd/3-19.11-jcmd.hprof
This dumping process is much faster than dumping with jmap! The dump files are much smaller, but they are enough to give you an idea of where the leaks are.
At the time of writing this answer, there are bugs in Memory Analyzer and IBM HeapAnalyzer that prevent them from reading dump files produced by jmap (JDK 8, big files). You can use YourKit to read those files.
First of all, it is (AFAIK) essential to freeze the JVM while a heap dump / snapshot is being taken. If the JVM were able to continue running while the snapshot was created, it would be next to impossible to get a coherent snapshot.
So are there other ways to get a heap dump?
You can get a heap dump using VisualVM as described here.
You can get a heap dump using jconsole or Eclipse Memory Analyser as described here.
But all of these are bound to cause the JVM to (at least) pause.
If your application is actually hanging (permanently!) that sounds like a problem with your application itself. My suggestion would be to see if you can track down that problem before looking for the storage leak.
My other suggestion is that you look at a single heap dump, and use the stats to figure out what kind(s) of object are using all of the space ... and why they are reachable. There is a good chance that you don't need the "trend" information at all.
You can use GDB to get the heap dump without running jmap on the target VM however this will still hang the application for the amount of time required to write the heap dump to disk. Assuming a disk speed of 100MB/s (a basic mirrored array or single disk) this is still 2 minutes of downtime.
http://blogs.atlassian.com/2013/03/so-you-want-your-jvms-heap/
The only true way to avoid stopping the JVM is transactional memory and a kernel that takes advantage of it to provide a process snapshot facility. This is one of the dreams of the proponents of STM but it's not available yet. VMWare's hot-migration comes close but depends on your allocation rate not exceeding network bandwidth and it doesn't save snapshots. Petition them to add it for you, it'd be a neat feature.
A heap dump analyzed with the right tool will tell you exactly what is consuming the heap. It is the best tool for tracking down memory leaks. However, collecting a heap dump is slow let alone analyzing it.
With knowledge of the workings of your application, sometimes a histogram is enough to give you a clue of where to look for the problem. For example, if MyClass$Inner is at the top of the histogram and MyClass$Inner is only used in MyClass, then you know exactly which file to look for a problem.
Here's the command for collecting a histogram:
jcmd <pid> GC.class_histogram filename=histogram.txt
To add to Stephen's answers, you can also trigger a heap dump via API for the most common JVM implementations:
example for the Oracle JVM
API for the IBM JVM
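On HotSpot/Oracle JVMs that API is the HotSpotDiagnostic MXBean; here is a minimal sketch (the HeapDumper class name is made up, and error handling is omitted for brevity):

import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    // Writes an .hprof file, e.g. dump("/tmp/app.hprof", true);
    // passing true restricts the dump to live (reachable) objects.
    public static void dump(String filePath, boolean liveObjectsOnly) throws IOException {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(filePath, liveObjectsOnly);
    }
}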

Exception in thread "http-8080-10" java.lang.OutOfMemoryError: Java

I have a Web application running on my 64-bit Windows Server 2003, Oracle 11G database and Apache Tomcat 6.0 Web Server.
The application is in a live environment with around 3000 users, and I encountered a Java heap out-of-memory error. After increasing the heap space it was resolved.
Now I am facing the same issue again; below is the error stack trace:
Exception in thread "http-8080-10" java.lang.OutOfMemoryError: Java heap space
Aug 23, 2013 8:48:00 PM com.SessionClunter getActiveSessions
Exception in thread "http-8080-11" java.lang.OutOfMemoryError: Java heap space
Exception in thread "http-8080-4" Exception in thread "http-8080-7" java.lang.OutOfMemoryError: Java heap space
Your problem could be caused by a few things (at a conceptual level):
You could simply have too many simultaneous users or user sessions.
You could be attempting to process too many user requests simultaneously.
You could be attempting to process requests that are too large (in some sense).
You could have a memory leak ... which could be related to some of the above issues, or could be unrelated.
There is no simple solution. (You've tried the only easy solution ... increasing the heap size ... and it hasn't worked.)
The first step in solving this is to change your JVM options to get it to take a heap dump when a OOME occurs. Then you use a memory dump analyser to examine the dump, and figure out what objects are using too much memory. That should give you some evidence that will allow you to narrow down the possible causes ...
If you keep getting OutOfMemoryError no matter how much you increase the max heap, then your application probably has a memory leak, which you must solve by getting into the code and optimizing it. Short of that, you have no choice but to keep increasing the max heap for as long as you can.
You can look for memory leaks and optimize using completely free tools like this:
Create a heap dump of your application when it uses a lot of memory, but before it would crash, using jmap that is part of the Java installation used by your JVM container (= tomcat in your case):
# if your process id is 1234
jmap -dump:format=b,file=/var/tmp/dump.hprof 1234
Open the heap dump using the Eclipse Memory Analyzer (MAT)
MAT gives suggestions about potential memory leaks. Try to follow those.
Look at the histogram tab. It shows all the objects that were in memory at the time of the dump, grouped by their class. You can order by memory use and number of objects. When you have a memory leak, there are usually shockingly many instances of some object that clearly don't make sense at all. I have often tracked down memory leaks based on that info alone.
Another useful free JVM monitoring tool is VisualVM. A non-free but very powerful tool is JProfiler.

Access Memory Usage of JVM from within my Application?

I have a Grails/Spring application which runs in a servlet container on a web server like Tomcat. Sometimes my app crashes because the JVM reaches its maximum allowed memory (Xmx).
The error which follows is a "java.lang.OutOfMemoryError" because Java heap space is full.
To prevent this error I want to check from within my app how much memory is in use and how much memory the current JVM has remaining.
How can I access these parameters from within my application?
Try to understand when the OOM is thrown instead of trying to manipulate it from within the application. Also, even if you are able to capture those values from within your application, how would you prevent the error? By calling GC explicitly? Know that:
The Java virtual machine specification says that:
OutOfMemoryError: The Java virtual machine implementation has run out of either virtual or physical memory, and the automatic storage manager was unable to reclaim enough memory to satisfy an object creation request.
Therefore, GC is guaranteed to run before an OOM is thrown. Your application throws an OOME only after it has just run a full garbage collection and discovered that it still doesn't have enough free heap to proceed.
This would be either a memory leak or, in general, an application with a high memory requirement. If the OOM is thrown within a short span of starting the application, it usually means the application needs more memory; if your server runs fine for some time and then throws an OOM, it is most likely a memory leak.
To discover the memory leak, use the tools mentioned by others above. I use New Relic to monitor my application and check the frequency of GC runs.
PS Scavenge (aka minor GC, the parallel object collector) runs for the young generation only, and PS MarkSweep (aka major GC, the parallel mark-and-sweep collector) is for the old generation. When both are run, it's considered a full GC. Minor GC runs are pretty frequent; a full GC is comparatively less frequent. Note the consumption of the different heap spaces to analyze your application.
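If you want to see those collector names and their cumulative counts from inside the JVM, here is a minimal sketch using the standard GarbageCollectorMXBean API (the GcStats class name is made up):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    public static void main(String[] args) {
        // On the parallel collector this typically lists "PS Scavenge" (minor GC)
        // and "PS MarkSweep" (major GC).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms total");
        }
    }
}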
You can also try the following option -
If you get OOM too often, then start java with correct options, get a heap dump and analyze it with jhat or with memory analyzer from eclipse (http://www.eclipse.org/mat/)
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=path to dump file
You can try the Grails Melody plugin, which displays the info at the URL /monitoring relative to your context.
To prevent this error I want to check from within my app how much
memory is in use and how much memory the current JVM has remaining.
I think that it is not the best idea to proceed this way. It is much better to investigate what actually breaks your app and eliminate the error or put some limit in place there. There could be many different scenarios, and your app can become unpredictable. To sum up: capturing the memory level for monitoring purposes is OK (though there are many dedicated tools for that), but in my opinion depending on these values in application logic is not recommended and is bad practice.
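That said, if you only want to log these numbers for monitoring, a minimal sketch using the standard MemoryMXBean looks like this (values are in bytes; getMax() can return -1 if undefined; the MemoryCheck class name is made up):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryCheck {
    public static void main(String[] args) {
        MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memoryBean.getHeapMemoryUsage();
        System.out.println("heap used=" + heap.getUsed()
                + " committed=" + heap.getCommitted()
                + " max=" + heap.getMax());
    }
}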
To do this you would use a profiler to profile your application and JVM, rather than having code to monitor such metrics inside your application.
Profiling is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or frequency and duration of function calls
Here are some good java profilers:
http://visualvm.java.net/ (Free)
http://www.ej-technologies.com/products/jprofiler/overview.html (Paid)

-XX:+HeapDumpOnOutOfMemoryError not creating hprof file in OOM

I start my java code (1.6.0_16 in Vista) with the following params (among others) -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=../logs. I run the code and I can see in the logs there are two OOM.
The first one I know cause I can see in the stdout that the hprof file is being created:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to ../logs\java_pid4604.hprof ...
Heap dump file created [37351818 bytes in 1.635 secs]
And then, towards the end of the run, I get another OOM. I catch this one, but a second hprof file is not created. Does anybody know why that is? Is it because I have caught the OOM exception?
I wouldn't try to recover from an OutOfMemoryError, as some objects might end up in an undefined state (just think of an ArrayList that couldn't allocate its backing array to store data, for instance).
Regarding your question, I'd suspect that -XX:+HeapDumpOnOutOfMemoryError is only creating a single dump intentionally to prevent multiple heap dumps: just think about several threads throwing an OOME at the same time, causing a heap dump for each thrown exception.
As a summary: don't try to recover from OOME and don't expect the JVM to write more than a single heap dump. However, if you still feel the need to generate a heap dump, you could try to manually handle an OOME exception and call jmap to create a dump or use "-XX:+HeapDumpOnCtrlBreak" (not sure though, how to simulate CtrlBreak programmatically).
Out of memory generates only one dump file, on the first error. If you want more, you can try jmap, or keep jconsole attached to the JVM (version 6); then, after everything has crashed (e.g. in the morning), you can create your own dump from jconsole (or your analyser tool of choice).
More on the dumping subject can be read in Eclipse MemoryAnalyser.
