No threaddump generated on kill -3 - java

Is there a possibility that kill -3 / quit PID prints nothing, i.e. an empty thread dump? We heard a story from a support engineer and were wondering if some experts could validate it.
This is on Java 6 update 26 (1.6.0_26) on RHEL 5.

I have only seen this when the server (JBoss, for example) writes the dump to stdout and stdout has been redirected to /dev/null, because whoever set up the server thought that everything going to stdout was already going to a named log file.

On some servers the JVM's console output, which is where the thread dump goes, is redirected to a log file. For Tomcat this is usually catalina.out.

I have seen the behavior you describe in a standalone Java application (Oracle JDK 1.6.20+, Linux), but I can't tell how to reproduce this behavior consistently. It may have been after an OutOfMemoryError in one of the threads but I'm not sure any more.
I also think that what I got was not just an empty dump, but that the command actually froze and didn't return me to the shell until I pressed ctrl+C after waiting for a while. Either way, I'm sure that the behavior of jstack was exactly the same as of kill -3. When it happened, the app was in such bad shape that it didn't react to normal kill and only kill -9 worked on it. There were no redirections and under normal circumstances the app reacted to kill -3 as it should.

Related

java -XX:OnError for post mortem?

I have a java tool running on a raspi. I ssh into the raspi and start that tool using
java -jar name.jar &
After a while (hours or days), the process is no longer running. I have pretty extensive logging in my code, but my log doesn't show any error. So the question is how I can analyze the situation. I thought of using the -XX:OnError option, but what would be best to specify? Any other ideas about what I can do?
Update:
I am not able to find an hs_err_pid file. What is the working directory when I start the program like that? I have searched the directory from which I started the Java app, as well as /var, /tmp and /home/pi.
Update 2:
The working directory is shown as /home/pi; there was no hs_err_pid file. I am now running it as
java -XX:OnError="/home/pi/Server/deah.sh" -XX:ErrorFile=/home/pi/Server/hs_err_pid%p.log -jar /home/pi/Server/myjavatool-0.1.2.jar &
Can I "simulate" a crash so that I can see whether the err file is being created? A kill -9 doesn't do the trick.
So I can think of a number of possibilities.
1) The JVM is panicking / crashing. If this happens, then a crash file should be created somewhere. So ... wait until it happens again. (I am not sure if the OnError handler gets called in that case.)
2) The application itself is causing the JVM to exit, e.g. by calling System.exit(...) somewhere. If that is the case, then you shouldn't get a crash file and (I think) the OnError handler won't be run.
3) Something external to the application is killing it. If it is a SIGKILL (-9) signal, then the JVM won't get a chance to do anything. There won't be a crash dump, the OnError handler won't be run, there will be nothing in the application logs, and JVM shutdown hooks won't be run.
What could cause a SIGKILL? One possibility is some other application. A second possibility is the OOM killer. That is a built-in function of (many) Linux-based systems that reacts to the system running critically short of memory by sending a SIGKILL to the process that appears to be the cause. If the OOM killer is doing this, then you should see a log entry in the system log files.
At this point, the OOM killer seems the most likely culprit.
More information:
Linux's OOM Process Killer
Can I "simulate" a crash so that I can see whether the err file is being created? A kill -9 doesn't do the trick.
I don't think so. A kill -9 sends a SIGKILL, which can't be caught by the target process. The JVM would not have a chance to generate a crash file.
If you want to be sure that no err file is being created, use the find command to search the entire file system; e.g.
$ sudo find / -name hs_err_\*

Java application being abruptly terminated on shutdown without having a signal sent

To be clear, this is an Upstart/Linux debugging problem, not a Java problem per se.
I have a Java application that installs a shutdown hook. It shows some funny behaviour in Ubuntu GNOME, namely that the shutdown hook never runs if a restart or power-off is scheduled. At first I thought it was a problem with my shutdown hook, so I simplified it until it was only writing a line to a file (yes, I know about the log4j2 logger problem with shutdown hooks, so I disabled theirs too). Then, when that didn't work, I started hacking /etc/init/sendsigs.
I added this to the beginning of the do_stop function:
app="$(jps | grep appname.jar | cut -d' ' -f1)"
echo "$app" >> "/home/i30817/output.txt"
Sure enough, that showed that it was no longer running, so the shutdown hooks were never activated by sendsigs.
Then I used lastcomm, replacing my edit of sendsigs with:
echo `/home/i30817/lastcomm` > "/home/i30817/output.txt"
And it told me that the java process exited with 1 and was not signaled:
java X i30817 ?? 10.26 secs Sun Mar 2 12:44 E 1
But this still didn't help me find what actually killed it and why. This problem is not reproducible with a smaller example, so it's probably something in the larger application (but not the shutdown hook, since it was minimized) that doesn't like the shutdown process and manages to kill the process, but I can't figure it out. Redirecting the process output to a file doesn't reveal anything either, e.g.:
java -jar /home/i30817/Documents/projects/app/dist/app.jar > allout.text 2>&1
contains nothing but normal app output.
Can you help me figure this out? There are a lot of duplicate questions about the same thing too (but they think it's the shutdown hooks malfunctioning).
Edit: more detail, now that I understand the problem a bit better. I now think that processes not being there in sendsigs is normal. Java applications, and maybe others, use a protocol from the window manager where SIGHUP, SIGHUP and SIGCONT are sent on shutdown/logout. The JVM hooks SIGHUP to launch the shutdown hooks. I tested this with a very small example that only adds a shutdown hook and has an infinite loop in main, and ran it with a SystemTap script in the background:
java -jar app.jar
and in another shell
sudo stap -o process.txt sigkill.stp
However, when I tried that with my application, I think I found the culprit:
PROCESS: SIGSEGV java.signal_generate sent to java 2280
But I don't really know what to do about it, considering there is no thread dump or anything and this is hard to reproduce (only my app, only during shutdown).
Edit 2: now I suspect the reason for the 'abrupt' termination without a core dump is the ulimit during shutdown. So I'm trying to solve that in preparation for a bug report. I edited /etc/security/limits.conf to add this and rebooted:
* soft core unlimited
root hard core unlimited
* hard rss 10000
(fs.suid_dumpable=2 was set by Ubuntu; no AppArmor, I think.)
But during shutdown I edited /etc/init.d/sendsigs again to print ulimit -a and sleep for 30 seconds before killing the processes, and it seems that during reboot the ulimit gets reset again. Moreover, it had a different output, as if it were using another executable version (for instance, instead of saying 'core file size' it had 'core(dump)' or something like that).
Edit 3: ah, I need to have fs.suid_dumpable=1 instead. Going to try that now.
Maybe the init ulimit doesn't matter for triggering a core dump at shutdown. After all, the JVM was started from the user environment, so it should be using the user ulimit.
Edit 4: eh. After commenting out much of the code, I reached the following conclusions, which I could have reached by RTFM:
the sigsegvs are harmless.
the non-zero exit code is not.
If AWT is still up, the exit code is always non-zero and the shutdown hooks never run. Even a small example still prevents execution of shutdown hooks on Linux reboot if a JFrame is up (unlike Windows, where they will start). Looking at the source, the application shutdown hooks are run in a slot by themselves, slot 1. I bet slot 0 is AWT and that it is halting the system somehow.
I guess it's time to check the package-private signal handling libs to see if I can get SIGHUP before the JVM decides to terminate everything without even giving cleanup code the opportunity to run.
According to docs
http://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread)
In rare circumstances the virtual machine may abort, that is, stop running without shutting down cleanly....
If the virtual machine aborts then no guarantee can be made about whether or not any shutdown hooks will be run.
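For anyone who wants to check hook behaviour in isolation, here is a minimal sketch, assuming Java 8+ (for the lambda) and no AWT or other libraries involved. The class name HookDemo and the helper install are made up for illustration. On a normal SIGHUP, SIGINT or SIGTERM the message should be printed; with kill -9 it will not be, since the JVM never gets a chance to run anything.

```java
public class HookDemo {
    /** Registers a cleanup task as a JVM shutdown hook and returns the hook thread. */
    public static Thread install(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "cleanup-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical cleanup action: just report that the hook was reached.
        install(() -> System.err.println("shutdown hook ran"));
        System.err.println("running; send SIGHUP, SIGINT or SIGTERM to this process");
        // Block the main thread until a signal terminates the JVM.
        Thread.sleep(Long.MAX_VALUE);
    }
}
```

Run it with java HookDemo in one shell and send the signal from another; comparing the result with the full application can help narrow down whether the hook itself or something else in the process is at fault.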

Does "kill -QUIT" ever actually kill the JVM?

Running kill -QUIT on a Unix system will trigger a thread dump. I know this because I have done this hundreds of times.
However, another developer tells me he has seen this "crash the JVM" and using twiddle or the JMX API is "safer".
I'm struggling to find any references online to kill -QUIT behaving this way.
Can anyone confirm that it could actually kill the java process/cause the JVM to quit?
(Obviously one way for it to do this would be if someone didn't correctly type "-QUIT" :-))
In 12 years I have never seen kill -QUIT crash a JVM. But as Disco 3 says, if you're doing a thread dump while the JVM is in distress (which is when you usually do thread dumps), it may (possibly?) crash with an OutOfMemoryError. But anything could crash a JVM in that situation. I wouldn't hesitate to use kill -QUIT, but you may find jstack more useful because it writes the thread dump to your stdout rather than the JVM's.
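As for the JMX route mentioned in the question, a thread dump can also be taken from inside the JVM through the standard java.lang.management API. The following is only a sketch of that idea, not what twiddle does internally; the class name ThreadDumper is made up, and the output format is much simpler than a real kill -QUIT dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumper {
    /** Builds a plain-text dump of all live threads in the current JVM. */
    public static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip monitor and synchronizer details to keep the call cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump());
    }
}
```

Because the result is a String, it can be written to whatever log the application already uses, which sidesteps the question of where the JVM's stdout ends up.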

Tomcat stopped without any log or any stack

We have trouble with Tomcat 5.5, which stops at night on our production servers (Linux CentOS 4.8), and we have no idea why it stops...
Nothing is logged in catalina.out or in any application log.
We tried different things to find why the server stops:
configure Tomcat to be able to generate a core dump
instrument System.exit() method with javassist to find if the method was called
add a shutdown hook to the JVM (with Runtime.getRuntime().addShutdownHook())
None of them worked: we have no core dump, and neither the exit method nor the shutdown hook is called.
My conclusion is:
The VM is not terminated properly but crashes without any log.
Any idea or log to read to find out why Tomcat stops?
1) Make sure you know where stderr is redirected and check if anything got printed there.
2) Check the memory limits on Tomcat and how much free memory the system has. Review the Linux system logs under /var/log to see if anything suspicious happened during that time. For example, the kernel can kill a process (almost) without a trace if the system is running low on memory.
We've run 5.5 in production for years and never had any unexplained shutdowns, FWIW.
This worked for me.
As suggested in other answers here, I tried to check the system logs in /var/log/messages but got permission denied. So I used the dmesg command instead and found this in the output:
"Out of memory: Kill process 14606 (java) score 106 or sacrifice child".
In the output I also noticed that free swap memory was 0 K. I ran the top command to confirm this. So, somehow there was high memory usage which caused the OS to kill my Tomcat process.
After spending hours on it, I finally found the reason.
ps -ef | grep tomcat showed that there were several Tomcat processes running for the same application. It seems that earlier Tomcat shutdowns had not completed successfully and for some reason the processes were not killed even after the shutdown, which was causing the high memory usage.
So I killed all running Tomcat processes using kill. The swap memory got freed.
Started tomcat again, worked fine. :)
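If you need to check for this regularly, the dmesg output can be filtered for OOM-killer lines. Below is a rough sketch that matches the message quoted above; the class name OomLogScan is made up, and the "Killed process" variant in the pattern is an assumption about how other kernel versions word the message, so adjust the regex to what your kernel actually prints.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OomLogScan {
    // "Kill process" is the wording quoted above; "Killed process" is an assumed variant.
    private static final Pattern OOM_KILL =
            Pattern.compile("Out of memory: Kill(?:ed)? process (\\d+) \\(([^)]+)\\)");

    /** Returns "pid name" if the line records an OOM kill, otherwise null. */
    public static String parse(String line) {
        Matcher m = OOM_KILL.matcher(line);
        return m.find() ? m.group(1) + " " + m.group(2) : null;
    }

    public static void main(String[] args) throws IOException {
        // Reads kernel log lines from stdin, e.g. piped in from dmesg.
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            String hit = parse(line);
            if (hit != null) {
                System.out.println("OOM killer victim: " + hit);
            }
        }
    }
}
```

Usage would be something like dmesg | java OomLogScan, run by a user who is allowed to read the kernel log.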
Tomcat 7 has an option inside catalina to prevent System.exit() calls, or something similar: http://ci.apache.org/projects/tomcat/tomcat7/docs/security-manager-howto.html
Maybe there's a similar option for the 5.5 version. Try the documentation.
There are options to redirect the output to the same console that you use to start Tomcat. This information is redirected to logs when you execute on Unix based systems, on Windows, it remains with the console if not redirected.
Most probably there is a stack overflow (StackOverflowError). This is typical behavior of Tomcat when it happens. For example, you're trying to serialize beans with cyclic dependencies to JSON or XML (without handling the cycles).
Every time I had this issue (several times), it was always this. All other stops are usually logged properly (like OutOfMemoryError etc.).
This type of stop leaves no trace anywhere.

Restoring/Restarting a java daemon from crash

I am running a Java app as a daemon on a Linux machine using a customized shell script.
Since I am new to Java and Linux, I want to know whether it is possible for the app to resurrect itself (like a restart) and recover from cases like the app crashing, unhandled exceptions, running out of memory, etc.
thanks in advance
Ashish Sharma
The JVM is designed to die when there is an unrecoverable error. The ones you described fall in this category.
You could, however, easily write a shell script or a Python script that checks if the process is alive and, if it is dead, waits a few seconds and revives it. As a hint for doing this, the Unix command "pgrep" is your friend, as you can match against the exact command line used to start a JVM (and thus including the starting class file). This way, you can determine whether that specific JVM instance is running, and restart it.
All that being said, you may want to add some reporting or logging capability and check it often, because it is too easy to assume that things are OK when in fact the daemon is dying every few minutes. Make sure you've done what you can to prevent it from dying before resurrecting it.
There are wrappers that can handle that, like Java Service Wrapper (be aware that the Community Edition is under the GPL), and some alternatives.
To be honest, relaunching the daemon without question after a crash is probably not a good idea. It depends greatly on the type of processing your daemon does, but if, for example, it processes files from a given directory or requests coming from a queue manager, and a file / message contains unexpected data that causes the crash, relaunching the daemon would make it crash again immediately (unless the file / message is removed regardless of whether it was processed correctly, which does not seem like a good idea either).
In short, it's probably better to track down the possible crash reasons and fix them where possible (or at least log the problem and carry on, provided that the log is actually scanned so that a human being is eventually warned and some action can be taken on such "failures").
Anyway, if you have very good reasons to do this, there is a solution even simpler than "checking that the process is alive" (which would probably involve some "ps -blahblah" stuff): just put the launch of the Java program in a shell "while true" loop, as follows:
while true
do
# launch the java program here, no background
# when crashing, the shell will be given hand back
java -classpath blahblah...
echo "program crashed, relaunching it..."
done
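The same relaunch loop can be sketched in Java with ProcessBuilder, which also lets you log how the child died. This is only an illustration under some assumptions: the class name Relauncher and the backoff values are made up, Java 7+ is needed for inheritIO, and the "128 + signal number" reading of exit codes is the usual Unix convention (a JVM can also exit with such a code on its own, so treat it as a hint, not proof).

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class Relauncher {
    /** Describes an exit code, assuming the Unix "128 + signal number" convention. */
    public static String describeExit(int code) {
        if (code == 0) return "clean exit";
        if (code > 128) return "killed by signal " + (code - 128);
        return "exited with error code " + code;
    }

    /** Starts the command in the foreground and waits for it to end. */
    public static int runOnce(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java Relauncher <command> [args...]");
            return;
        }
        List<String> command = Arrays.asList(args);
        long backoffMillis = 1000; // hypothetical starting delay
        while (true) {
            int code = runOnce(command);
            System.err.println("child " + describeExit(code));
            if (code == 0) break; // do not relaunch after a clean exit
            // Back off so that a child crashing at startup does not spin the CPU.
            Thread.sleep(backoffMillis);
            backoffMillis = Math.min(backoffMillis * 2, 60000);
        }
    }
}
```

For example, an exit reported as "killed by signal 9" (code 137) would point towards kill -9 or the OOM killer, while a small non-zero code suggests the application exited by itself. The growing delay is there because of the concern raised above about a daemon that crashes again immediately after every relaunch.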
On Unix-based systems, you may use "inittab" to specify the program with the respawn action. If the process dies, it is restarted by the OS.
I am not sure if the app itself can handle such crashes. You could write a shell script on Linux that runs as a cron job to manage the app, checking at scheduled intervals whether the Java app is running and restarting it automatically if it is not.
