How to troubleshoot an unresponsive Java application/process in Linux

Say your application is unresponsive and you cannot attach a debugger to it, because it rejects every connection attempt. All you have is a Linux Bash shell and the process id. How would you investigate the issue? What tools would you use? My goal is to improve my Java troubleshooting skills.
We hit this particular issue in production, at a customer site.

You could take a thread dump from the application by issuing:
kill -3 <pid>
That would give you some information about the current state of the threads and hopefully help diagnose the issue. However, the trick is not in taking the thread dump but in reading the thread dump produced, since they can be a little overwhelming to look at. See this link for more info on reading a thread dump:
http://manikandakumar.blogspot.com/2006/12/reading-thread-dumps.html
You could also take a look at jstack, which is part of the JDK. I've not used it specifically; see:
http://java.sun.com/j2se/1.5.0/docs/tooldocs/share/jstack.html

I agree with Jon that you should use kill -3 to get a thread dump. I have found Thread Dump Analyzer useful for viewing thread dumps.
You should also take a look at the memory usage of the process using top. Does it look like the app has run out of heap space? If so, you could try using the jmap tool to obtain a heap dump and/or a histogram count of the objects on the heap. You may need the -F option if the app has really hung; I have also experienced cases where jmap simply would not work against a hung Java process. Once you have a heap dump you can use Eclipse Memory Analyzer to investigate it.
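For reference, the usual jmap invocations look something like this (the dump path is only an example):
jmap -histo <pid>                                    # class histogram of the heap
jmap -dump:live,format=b,file=/tmp/heap.hprof <pid>  # binary heap dump of live objects
jmap -F -dump:format=b,file=/tmp/heap.hprof <pid>    # forced mode, for a process that no longer responds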
You don't mention whether your application has any logging. If not, you should look into adding logging that could help you debug production issues.

jstack <pid>

Sounds like an interview question.
You could also try attaching jconsole to see what it is doing.

If you have Java 6 you can try VisualVM (https://visualvm.dev.java.net/), which ships with current JDKs, to connect to the VM. With this tool you are able to create a complete memory dump (not only a thread dump) of your VM process. You can load this memory dump into VisualVM or into Eclipse with the MAT plugin (Memory Analyzer Tool, http://www.eclipse.org/mat/).
After some time spent loading and computing, you can browse the complete heap of your application, search for memory leaks, etc.
Analysing heap dumps is a great way to improve your troubleshooting skills.

I agree with others that thread dumps are the way to go.
I would like to add that you should get lots of thread dumps.
You can do very simple profiling with just a few Unix commands (see the sketch below).
Check my post here
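A minimal sketch of that approach, assuming jstack is available (interval, count, and file names are arbitrary): take several snapshots a few seconds apart, then look for stacks that keep reappearing among the runnable threads.
for i in 1 2 3 4 5; do
  jstack <pid> > /tmp/threads.$i.txt  # one thread-dump snapshot per iteration
  sleep 5                             # sampling interval, arbitrary
done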

I know this is an old question, but I would like to share this for anyone facing the same issue who comes across this post.
You can capture a thread dump and use tools like fastThread or Samurai to analyze it.
You can check out the following blog to see 8 different options for taking a thread dump: How to capture thread dump?

Related

Java: Get heap dump without jmap or without hanging the application

In some circumstances, our application uses around 12 GB of memory.
We tried to get a heap dump using the jmap utility. Since the application is using several GB of memory, jmap causes it to stop responding, which is a problem in production.
In our case the heap usage suddenly increases from 2-3 GB to 12 GB within 6 hours. In an attempt to find the memory usage trend, we tried to collect a heap dump every hour after restarting the application. But, as said, since jmap causes the application to hang, we need to restart it and are not able to get the trend of memory usage.
Is there a way to get a heap dump without hanging the application, or is there a utility other than jmap to collect heap dumps?
Any thoughts are highly appreciated, since without the memory usage trend it is very difficult to fix the issue.
Note: Our application runs on CentOS.
Thanks,
Arun
Try the following. It comes with JDK >= 7:
/usr/lib/jvm/jdk-YOUR-VERSION/bin/jcmd PID GC.heap_dump FILE-PATH-TO-SAVE
Example:
/usr/lib/jvm/jdk1.8.0_91/bin/jcmd 25092 GC.heap_dump /opt/hd/3-19.11-jcmd.hprof
This dumping process is much faster than dumping with jmap! The dump files are also much smaller, but they are enough to give you an idea of where the leaks are.
At the time of writing this answer, Memory Analyzer and IBM HeapAnalyzer have bugs that prevent them from reading dump files produced by jmap (JDK 8, big files). You can use YourKit to read those files.
First of all, it is (AFAIK) essential to freeze the JVM while a heap dump / snapshot is being taken. If the JVM were able to continue running while the snapshot was created, it would be next to impossible to get a coherent snapshot.
So are there other ways to get a heap dump?
You can get a heap dump using VisualVM as described here.
You can get a heap dump using jconsole or Eclipse Memory Analyser as described here.
But all of these are bound to cause the JVM to (at least) pause.
If your application is actually hanging (permanently!) that sounds like a problem with your application itself. My suggestion would be to see if you can track down that problem before looking for the storage leak.
My other suggestion is that you look at a single heap dump, and use the stats to figure out what kind(s) of object are using all of the space ... and why they are reachable. There is a good chance that you don't need the "trend" information at all.
You can use GDB to get a heap dump without running jmap on the target VM; however, this will still pause the application for the time required to write the heap dump to disk. Assuming a disk speed of 100 MB/s (a basic mirrored array or single disk), that is still about 2 minutes of downtime for a 12 GB heap.
http://blogs.atlassian.com/2013/03/so-you-want-your-jvms-heap/
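A rough sketch of the core-file technique described in that post (paths are illustrative; the jmap binary must match the JVM that produced the core, and support for reading core files varies between JDK builds):
gcore -o /tmp/core <pid>                                                       # gdb's gcore: pauses the process only while the core file is written
jmap -dump:format=b,file=/tmp/heap.hprof $JAVA_HOME/bin/java /tmp/core.<pid>   # extract an hprof heap dump from the core, offline
This way the live JVM is only stopped for the core write; the slow heap-dump extraction then runs against the core file without touching the application again.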
The only true way to avoid stopping the JVM is transactional memory and a kernel that takes advantage of it to provide a process snapshot facility. This is one of the dreams of the proponents of STM, but it's not available yet. VMware's hot migration comes close, but it depends on your allocation rate not exceeding network bandwidth, and it doesn't save snapshots. Petition them to add it for you; it'd be a neat feature.
A heap dump analyzed with the right tool will tell you exactly what is consuming the heap. It is the best tool for tracking down memory leaks. However, collecting a heap dump is slow let alone analyzing it.
With knowledge of the workings of your application, sometimes a histogram is enough to give you a clue of where to look for the problem. For example, if MyClass$Inner is at the top of the histogram and MyClass$Inner is only used in MyClass, then you know exactly which file to look in for the problem.
Here's the command for collecting a histogram:
jcmd <pid> GC.class_histogram filename=histogram.txt
To add to Stephen's answers, you can also trigger a heap dump via API for the most common JVM implementations:
example for the Oracle JVM
API for the IBM JVM
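For the HotSpot/Oracle JVM, that API boils down to the com.sun.management.HotSpotDiagnosticMXBean. A minimal sketch (the output path and class name are illustrative, not taken from the linked example):
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    public static void main(String[] args) throws Exception {
        // Look up the HotSpot-specific diagnostic MXBean on the platform MBean server
        HotSpotDiagnosticMXBean diagnostic = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        // Write a binary heap dump; the second argument restricts the dump to live objects
        diagnostic.dumpHeap("/tmp/heap.hprof", true);
    }
}
Note that dumping this way still pauses the JVM for the duration of the dump, as discussed above.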

How to see what my Java process is doing right now?

I have an app server process that's constantly at 100% CPU. By constantly I mean hours, or even days.
I know how to generate a heap/thread dump, but I'm looking for more dynamic information. I would like to know what is using so much CPU in there. There are tens (probably 100+) of threads. I know what those threads are, but I need to know which of them are using so much CPU.
How can I obtain this information?
Use a profiler. There is one included in VisualVM which comes with the Oracle JDK.
An advanced commercial one (trial licenses available) is YourKit.
By creating a thread dump. You can use jstack to connect to a running Java process and get a thread dump. If you take two or more thread dumps over a period of time, you can figure out by comparing them which threads are actively using CPU. Typically the threads in the RUNNABLE state are the ones you need to focus on.
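If you would rather measure than eyeball the dumps, the JVM also exposes per-thread CPU time through ThreadMXBean. A minimal sketch that samples it twice inside the process and prints the busiest threads (the interval is arbitrary; for an external process you would reach the same MXBean over a JMX connection instead):
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

public class BusyThreads {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // First sample of per-thread CPU time, in nanoseconds (-1 means unsupported or dead thread)
        Map<Long, Long> before = new HashMap<Long, Long>();
        for (long id : threads.getAllThreadIds()) {
            before.put(id, threads.getThreadCpuTime(id));
        }
        Thread.sleep(5000); // sampling interval, arbitrary
        // Second sample: report how much CPU each surviving thread burned in between
        for (long id : threads.getAllThreadIds()) {
            Long start = before.get(id);
            long end = threads.getThreadCpuTime(id);
            ThreadInfo info = threads.getThreadInfo(id);
            if (start != null && start >= 0 && end >= 0 && info != null) {
                long cpuMillis = (end - start) / 1000000L;
                if (cpuMillis > 0) {
                    System.out.println(info.getThreadName() + ": " + cpuMillis + " ms CPU");
                }
            }
        }
    }
}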
I personally use YourKit for this.
VisualVM also has some profiling capabilities, but I haven't used them.
In Linux, try kill -3 <pid>; it will generate a thread dump. You can analyze it to see what is happening in the Java process.

Java Visual VM skewing CPU

I am trying to analyze the CPU usage of a Java UI application running on Windows. I connected it to VisualVM, but it looks like the highest percentage of CPU usage is attributed to
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run()
I believe this is being used to supply information to VisualVM, and hence VisualVM is skewing the results I'm trying to investigate. Does anyone have a way to get a better indication of what is occurring, or a better method to determine what in a running Java application is taking up so much CPU?
Try using the sampler first.
For detailed information, use the profiler and set root methods. See Profiling With VisualVM, Part 1 and Profiling With VisualVM, Part 2 for more information about CPU and memory profiling.
That sounds awfully suspicious. Try cross-referencing the data with results from HPROF. You won't need any external applications running, and the data will simply be dumped to a text file from your own process. Are you connecting to your process remotely?
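A minimal sketch of an HPROF CPU-sampling run on the older JDKs that still ship the agent (it was removed in JDK 9; MyApp and the option values are placeholders):
java -agentlib:hprof=cpu=samples,interval=10,depth=10 MyApp
On exit, the results land in java.hprof.txt, listing the most frequently sampled stack traces, which you can compare against what VisualVM reports.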

How do I create a thread dump via JMX?

I have a Tomcat running as a Windows Service, and those are known not to work well with jstack. jconsole is working well, on the other hand, and I can see stacks of individual threads (I'm connecting to "localhost:port" to access it).
How can I use jconsole or a similar tool to dump all the thread stacks into a file? (similar to jstack)
You can use the ThreadMXBean management interface.
This FullThreadDump class demonstrates the capability to get a full thread dump and also detect deadlock remotely using JMX.
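As a quick local illustration of what that interface provides (for a remote process you would obtain the same ThreadMXBean through a JMX connector, as the linked FullThreadDump class does; redirect the output to a file to get your dump):
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDump {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Dump all threads, including locked monitors and ownable synchronizers
        for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
            System.out.print(info); // ThreadInfo.toString() renders a stack-trace-like block
        }
    }
}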
Nowadays you can use the jvisualvm tool to connect to your remote JVM through JMX and create a thread dump. I don't know whether it was available when the question was asked.
Here's another code sample that will write a stack dump to a file:
http://pastebin.com/zwcKC0hz
We use this over JMX to give us an approximation of the stack dump you get when you make a JMX request or if the process detects high, unexpected load.
It would be helpful to take a flight recording to get a deeper view of the JVM's behavior, especially focusing on the hot methods.
Usually, a recording of half an hour is enough. To trigger a recording, you must be logged in to the machine and issue the following commands:
If using Java HotSpot 1.8.x:
$JAVA_HOME/bin/jcmd <pid> VM.unlock_commercial_features
$JAVA_HOME/bin/jcmd <pid> JFR.start duration=1800s settings=profile filename=/tmp/recording.jfr
If using Java HotSpot 1.7.x:
Edit your $HOME/conf/wrapper.conf file by adding the following parameters to the JVM startup:
wrapper.java.additional.<N>=-XX:+UnlockCommercialFeatures
wrapper.java.additional.<N>=-XX:+FlightRecorder
(replace <N> with the corresponding positional number)
Then restart your instances. Once done, issue the following command:
$JAVA_HOME/bin/jcmd <pid> JFR.start duration=1800s settings=profile filename=/tmp/recording.jfr
The flight recording will produce a file at /tmp/recording.jfr upon termination.

Why does Tomcat 5.5 (with Java 1.4, running on Windows XP 32-bit) suddenly hang?

I've been running Tomcat 5.5 with Java 1.4 for a while now with a huge webapp. Most of the time it runs fine, but sometimes it will just hang, with no exception generated and no apparent way of getting it to run again other than restarting Tomcat. The Tomcat instance is allowed a gigabyte of heap memory, but rarely exceeds 300 MB. Has anyone else run into this issue, and is there a solution for it?
For clarification: I determined how much memory it is using via Task Manager and via Eclipse (I've also tried running it outside of Eclipse, but I get the same problem eventually, though it takes a little longer). With Eclipse, I look at the memory allocated via its little (optional) memory pane and at the amount allocated to javaw.exe via Task Manager. I use the Sysdeo(?) Tomcat plugin for Eclipse.
For any JVM process, force a thread dump. On Windows, this can be done with CTRL-BREAK, I believe, in the console window.
On *nix, it is almost always "kill -3 <jvm-pid>".
This may show whether you have threads waiting on a DB connection pool, thread pool, etc.
Another thing to check is how many connections you currently have to the JVM -- either use netstat or a SysInternals utility such as tcpconn/tcpview (Google it).
Also, try running with the verbose GC JVM flag. For Sun's JVM, run it like "java -verbose:gc". This will show your garbage collections. If it is collecting a lot (full collections especially), then you probably have a memory leak. Full collections are costly, especially on large heaps like that.
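If plain -verbose:gc is not detailed enough, the Sun JVMs of that era also accept flags along these lines (the log path is illustrative; append your usual application arguments):
java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/path/to/gc.log ...
This writes timestamped, per-collection details to a log file you can review after a hang instead of having to watch the console.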
How are you determining that only 300 MB are being used?
It sounds like you're hitting a deadlock.
If you can reproduce it in a dev environment then try attaching a debugger once it's happened. Take a look at your threads and see if you have any deadlocks.
If you can't get a debugger to attach you should be able to generate a thread dump, as Dustin pointed out.
Try increasing the logging verbosity for the Tomcat application server.
http://tomcat.apache.org/tomcat-5.5-doc/logging.html
You can raise the level to FINEST or ALL for most of the loggers for a few days and see if that helps you catch anything.
I agree with creating multiple thread dumps and viewing them through this: Thread Dump Analyzer
