How to debug a JVM that occasionally hangs on exit - java

I have a server that occasionally hangs when it exits. The hang only occurs about 1/10 of the time or less, and so far we can't figure out a way to reliably recreate the issue. I've walked through my code and thought that I am closing all resources and killing my threads, but obviously some of the time I don't shut down cleanly.
Can anyone suggest debugging tips to help me test this when I can't reliably recreate it? I've tried running JVisualVM once it hangs, but it doesn't help much other than showing me the SIGTERM threads are still running and everything is at 0% CPU, which I assume means a deadlock somewhere.

When the process hangs, you can send SIGQUIT (kill -3) to the process and it will generate a thread dump. The output goes to stderr, so make sure that is being captured.
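The most common reason for a JVM that refuses to exit is a leftover non-daemon thread, which is exactly what the thread dump will show. A minimal sketch of the situation (class and thread names are illustrative, not from the question):

```java
// A non-daemon thread keeps the JVM alive after main() returns.
// Here the worker is interrupted so the demo terminates; without the
// interrupt (or without setDaemon(true)), the process would hang on exit.
public class HangOnExit {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // simulates a thread stuck waiting
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore interrupt status
            }
        }, "stuck-worker");
        // worker.setDaemon(true); // a daemon thread would not block exit
        worker.start();
        System.out.println("worker daemon=" + worker.isDaemon());
        worker.interrupt(); // stop it cleanly so this demo can exit
        worker.join();
        System.out.println("worker stopped, JVM can exit");
    }
}
```

In the thread dump, any surviving non-daemon thread like this one appears with its full stack, which tells you which resource was never closed.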

You could try using JConsole to monitor your server. You can visually monitor the memory, CPU usage, number of threads, etc. It can also detect deadlocks if there are any.

Related

Thread starts and fails to stop with Tomcat. What's happening?

I have a multi-threaded Java program running on a Tomcat server. While the threads are still running, some executing tasks, some still waiting for something to return, assume I stop the server all of a sudden. When I do, I get a warning on the Tomcat terminal saying a thread named x is still running and the server is being stopped, so this might lead to a memory leak. What is the OS actually trying to tell me here? Can someone help me understand this? I have run this program on my system several times, I have stopped the server abruptly three times, and I see this message whenever I do that. Have I ruined my server? (I mean my system.) Did I do something very dangerous?
Please help.
Thanks in advance!
When I do, I get a warning on the Tomcat terminal saying a thread named x is still running and the server is being stopped, so this might lead to a memory leak. What is the OS actually trying to tell me here?
Tomcat (not the OS) is surmising from this extra thread that some part of your code forked a thread that may not be cleaning itself up properly. It is thinking that maybe this thread is forked more than once, and if your process runs for a long time, it could fill up usable memory, which would cause the JVM to lock up or at least get very slow.
Have I ruined my server? (I mean my system.) Did I do something very dangerous?
No, no. This is about the Tomcat process itself. It is worried that this memory leak may stop its ability to do its job as software, nothing more. Unless you see more than one leftover thread, or you start seeing memory problems with your server (use jconsole for this), I would treat it only as a warning and a caution.
It sounds like your web server is forking processes which are not terminated when you stop the server. Those could lead to a memory leak because they represent processes that will never die unless you reboot or manually terminate them with the kill command.
I doubt that you will permanently damage your system, unless those orphaned processes are doing something bad, but that would be unrelated to your stopping the server. You should probably do something like ps aux | grep tomcat to find the leftover processes and then:
1. Kill them so they don't take up more system resources.
2. Figure out why they persist when the server is stopped. This sounds like a misbehaving server.
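The underlying fix for Tomcat's warning is to make every thread the webapp starts stoppable, and to stop it during shutdown (typically from a ServletContextListener's contextDestroyed()). A minimal sketch of one such pattern (class and method names are illustrative):

```java
// A worker thread that can be stopped cleanly at shutdown: a volatile
// flag ends the loop, and interrupt() wakes the thread if it is blocked.
public class StoppableWorker {
    private final Thread thread;
    private volatile boolean running = true;

    public StoppableWorker() {
        thread = new Thread(() -> {
            while (running) {
                try {
                    doUnitOfWork();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // exit promptly when interrupted
                }
            }
        }, "app-worker");
    }

    private void doUnitOfWork() throws InterruptedException {
        Thread.sleep(100); // placeholder for real work
    }

    public void start() { thread.start(); }

    public boolean isAlive() { return thread.isAlive(); }

    // Call this from your shutdown path (e.g. contextDestroyed()).
    public void stop() throws InterruptedException {
        running = false;
        thread.interrupt();  // wake the thread if sleeping or blocked
        thread.join(5_000);  // wait for it, but not forever
    }
}
```

If every thread follows this pattern (or uses an ExecutorService that gets shutdownNow() at context destruction), Tomcat has nothing left to warn about.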

how do I know what's stopping a spring webapp from shutting down

We've got a somewhat grown Spring webapp (on Tomcat 7) that is very slow to shut down (which has a negative impact on the performance of our continuous delivery).
My suspicion is that there must be some bean that is blocking (or taking very long) in its @PreDestroy method.
So far I've ensured that it's not related to a thread(pool) that is not shut down correctly by giving distinct names to every pool, thread and timer and ensuring that they are either daemon threads or being shut down correctly.
Has anybody ever solved a situation like this and can give me a hint on how to cope with it?
BTW: killing the tomcat process is not an option - we really need a clean shutdown for our production system.
Profiling would be the nuclear option. It's probably easy to get a picture of what's happening (especially if it is just blocked threads since that state will be long lived) just using thread dumps. If you take 2 dumps separated by a few seconds and they show the same or similar output for one or more threads then that is probably the bottleneck. You can get a thread dump using jstack or "kill -3" (on a sensible operating system).
And if you're on Windows, selecting the Java console window and hitting Ctrl + Pause will dump to that window; just hit Enter to resume execution.
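The two-dumps-a-few-seconds-apart technique can also be done in-process, which is handy when attaching jstack is awkward. A sketch using the standard Thread.getAllStackTraces() (class name is illustrative):

```java
import java.util.Map;

// In-process equivalent of a thread dump: print every live thread with
// its state and stack. Take two snapshots a few seconds apart; a thread
// showing the same BLOCKED/WAITING stack in both is your bottleneck.
public class MiniThreadDump {
    public static void main(String[] args) {
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            System.out.printf("\"%s\" daemon=%b state=%s%n",
                    t.getName(), t.isDaemon(), t.getState());
            for (StackTraceElement frame : e.getValue()) {
                System.out.println("    at " + frame);
            }
        }
    }
}
```

Running this from a shutdown hook is one way to see exactly which @PreDestroy method the container is stuck in.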

Waking a sleeping detached Java process

I wrote a Java program which analyses other programs. The execution may take very long (days). Now (after three days), I have the problem that my program/process is sleeping (S). It still has 50% of the memory allocated and sometimes it prints new output, but top shows 0% CPU most of the time.
I used jstack to make sure that there are still runnable threads. Hence, it seems not to be a deadlock problem. I do not know why the process does not get more CPU time. I changed the niceness of the Java process from 0 to -10, but nothing happened.
More details:
The process runs on a linux server: Ubuntu 10.04.4 LTS.
I started my process with screen. So, I do not have to be logged in all the time.
screen -S analyse ant myParameters
The server has almost nothing to do.
Thanks for your help.
Start your program in debug mode. Then you can attach to it with any Java debugger and inspect what it is doing.

Standalone Java App dies after a few days

We have a Java App that connects via RMI to another Java app.
There are multiple instances of this app running at the same time, and after a few days an instance just stops processing... the CPU is at 0% and I have an extra thread listening on a specific port that helps to shut down the app.
I can connect to the specific port but the app doesn't do anything.
We're using Log4j to keep a log and nothing is written, so there aren't any exceptions thrown.
We also use c3p0 for the DB connections.
Anyone have ideas?
Thanks,
I would suggest starting with a thread dump of the affected application.
You need to see what is going on on a thread-by-thread basis. It could be that you have a long-running thread, or some other process, that is blocking other work from being done.
Since you are running linux, you can get your thread dump with the following command
kill -3 <pid>
If you need help reading the output, please post it in your original question.
If nothing is shown from the thread dump, other alternatives can be looked at.
Hmm... I would suggest using JMeter to stress the application and taking note of anything weird that might be happening (such as memory leaks, deadlocks and the like). Also review the code for any exceptions that might interrupt the program (or System.exit() calls). Finally, if other people have access to the computer, it makes sense to check whether the process was killed manually somehow.

Java process is hanging for no apparent reason

I am running a Java process with -Xmx2000m. The host OS is Linux (CentOS), JDK 1.6 update 22. Lately I have been experiencing weird behavior in the process: it becomes totally unresponsive for no apparent reason, no logs, no errors, nothing. I am using jconsole to monitor the process; heap and perm memory are not full, threads and loaded classes are not leaking.
Explanation anyone?
I doubt anyone can give you an explanation since there are lots of possible reasons and not nearly enough information. However, I suggest that you jstack the process once it's hung to figure out what the threads are doing, and take it from there. It sounds like a deadlock or thrashing of some sort.
Do a thread dump. If you have access to the foreground process on Linux, use ctrl-\. Or use jstack to dump stack remotely. Or you can actually poke it through JMX via jconsole at MBeans/java.lang/Threading/Operations/dumpAllThreads.
Without knowing more about your app, it's hard to speculate about the cause. Presumably your threads are either a) blocked or b) exited. If they are blocked, they could be waiting for I/O on a database or other operation OR they could be waiting on a lock or monitor (deadlocked). If a deadlock exists, the thread dump will tell you which threads are deadlocked, which lock, and (in Java 6) annotate the stack with where locks have been taken. You can also search for deadlocks with the JMX method, available through jconsole at MBeans/java.lang/Threading/Operations/find[Monitor]DeadlockedThreads().
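The same deadlock check that jconsole exposes is available programmatically through ThreadMXBean. A self-contained sketch that deliberately manufactures a lock-ordering deadlock and then detects it (class, method, and thread names are illustrative):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class DeadlockDetect {
    // Builds a deliberate lock-ordering deadlock between two daemon
    // threads, then asks the JVM for the IDs of deadlocked threads.
    static long[] makeAndFindDeadlock() throws InterruptedException {
        final Object a = new Object(), b = new Object();
        Thread t1 = new Thread(() -> {
            synchronized (a) { pause(); synchronized (b) { } }
        }, "deadlock-1");
        Thread t2 = new Thread(() -> {
            synchronized (b) { pause(); synchronized (a) { } }
        }, "deadlock-2");
        t1.setDaemon(true); // daemon, so the stuck threads don't block JVM exit
        t2.setDaemon(true);
        t1.start();
        t2.start();
        Thread.sleep(500); // let both threads take their first lock and block

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        return mx.findDeadlockedThreads(); // null when no deadlock exists
    }

    private static void pause() {
        try { Thread.sleep(200); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) throws InterruptedException {
        long[] ids = makeAndFindDeadlock();
        System.out.println("deadlocked threads: "
                + (ids == null ? 0 : ids.length));
    }
}
```

Running such a check from a monitoring thread (or over JMX, as described above) turns "the app went quiet" into an actionable list of thread IDs to look up in the dump.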
Or your threads may have received unhandled exceptions and exited. Check out Thread's uncaughtExceptionHandlers or (better) use Executors in java.util.concurrent.
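A minimal sketch of the handler idea mentioned above: without one, a thread that dies from an unhandled exception can leave an app silent with nothing in the logs (class and thread names are illustrative):

```java
import java.util.concurrent.atomic.AtomicReference;

// Install a default uncaught-exception handler so thread deaths are at
// least logged instead of vanishing silently.
public class CatchSilentDeaths {
    static Throwable runDemo() throws InterruptedException {
        AtomicReference<Throwable> caught = new AtomicReference<>();
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> {
            System.err.println("thread " + t.getName() + " died: " + e);
            caught.set(e);
        });
        Thread t = new Thread(() -> {
            throw new IllegalStateException("boom"); // simulated silent death
        }, "worker-1");
        t.start();
        t.join();
        return caught.get(); // the exception the handler observed
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("captured: " + runDemo());
    }
}
```

With Executors, the equivalent is to inspect the Future returned by submit(), since exceptions in pool threads are captured there rather than reaching the uncaught-exception handler.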
And finally, the other classic source of pauses in Java is GC. Run with -verbose:gc and other GC flags to see if it's doing a full GC collection. You can also turn this on dynamically in jconsole by flipping the flag at MBeans/java.lang/Memory/Attributes/Verbose.
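GC activity can also be checked from inside the process via the standard management beans, which complements -verbose:gc when you can't restart with new flags (class name is illustrative):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Print per-collector statistics. Long GC pauses show up as a large
// accumulated collection time; compare two readings taken over a pause.
public class GcStats {
    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d timeMs=%d%n",
                    gc.getName(),
                    gc.getCollectionCount(),
                    gc.getCollectionTime());
        }
    }
}
```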
Agree with aix, but I would like to add a couple of recommendations.
1. Check your system. Run top to see whether the system itself is healthy, the CPU is not at 100%, and memory is available. If not, fix this.
2. The application may freeze as a result of a deadlock. Check for this.
OK, here are some updates I wanted to share:
There is an incompatibility between NPTL (Linux's Native POSIX Thread Library) and the Java 1.6+ JVM. A random bug causes the JVM to hang and eat up 100% CPU.
To work around it, set LD_ASSUME_KERNEL=2.4.1 before running the JVM: export LD_ASSUME_KERNEL=2.4.1. This disables NPTL: problem solved!
But for compatibility reasons, I'm still looking for a solution that uses NPTL.
Threads can be traced using jvisualvm and jconsole, and deadlocks can be checked for as well. Note that there are several network services, each with separate thread pools, and they all become unreachable.
Here is the jvisualvm view of the process right before the crash:
http://www.jadyounan.com/wp-content/uploads/2010/12/process.png
Could you elaborate more on what you are doing? 2000m of heap is rather a lot.
