We have trouble with Tomcat 5.5 which stops at night on our production servers (Linux CentOS 4.8) and we have no idea why it stops...
Nothing is written to catalina.out or to any application log.
We tried different things to find why the server stops:
configure Tomcat to be able to generate a core dump
instrument System.exit() method with javassist to find if the method was called
add a shutdown hook to the JVM (with Runtime.getRuntime().addShutdownHook())
None of them helped: there is no core dump, and neither the instrumented exit method nor the shutdown hook is called.
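For reference, the shutdown-hook check mentioned above can be sketched roughly as follows. This is only a minimal sketch, assuming Java 8 or newer (on the older JVMs typically paired with Tomcat 5.5 the lambda would have to become an anonymous Runnable). The class name ShutdownMarker and the file name shutdown-marker.log are made up for illustration. If the marker line never appears after a stop, the JVM most likely did not go through an orderly shutdown.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Date;

// Hypothetical helper (not part of Tomcat): records that the JVM began an orderly shutdown.
final class ShutdownMarker {

    // Registers a hook that appends one timestamped line to markerFile and returns the hook thread.
    static Thread register(Path markerFile) {
        Thread hook = new Thread(() -> {
            String line = "JVM shutdown hook ran at " + new Date() + System.lineSeparator();
            try {
                Files.write(markerFile, line.getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            } catch (IOException e) {
                e.printStackTrace(); // last resort: report on stderr
            }
        }, "shutdown-marker");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        // The file name is an assumption; the hook runs when main returns and the JVM exits.
        register(Paths.get("shutdown-marker.log"));
        System.out.println("Hook registered, exiting normally.");
    }
}
```

The hook runs on a normal exit, on System.exit() and on signals such as SIGTERM, but not when the process receives SIGKILL or the JVM aborts.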
My conclusions are:
The VM is not shutting down normally; it crashes without leaving any log.
Any idea or log to read to find why Tomcat stops?
1) Make sure you know where stderr is redirected and check if anything got printed there.
2) Check Tomcat's memory limits and how much free memory the system has. Review the Linux system logs under /var/log to see whether anything suspicious happened around that time. For example, when the system runs low on memory, the kernel can kill a process (almost) without a trace.
We've run 5.5 in production for years and never had any unexplained shutdowns, FWIW.
This worked for me.
As suggested in other answers here, I tried to check the system log in /var/log/messages, but permission was denied. So I used the dmesg command instead and found this in its output:
"Out of memory: Kill process 14606 (java) score 106 or sacrifice child".
The output also showed 0 kB of free swap, and top confirmed it. So high memory usage had caused the OS to kill my Tomcat process.
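If you want to scan saved kernel log output for such kills automatically, a small parser along these lines could help. It is only a sketch that matches the exact message format quoted above; other kernel versions may word the message differently, and the class name OomKill is hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: recognizes the OOM-killer message format quoted above.
final class OomKill {
    // Assumption: the line contains "Out of memory: Kill process <pid> (<command>)".
    private static final Pattern KILL_LINE =
            Pattern.compile("Out of memory: Kill process (\\d+) \\(([^)]+)\\)");

    final int pid;
    final String command;

    private OomKill(int pid, String command) {
        this.pid = pid;
        this.command = command;
    }

    // Returns the killed process if the line is an OOM-killer message, otherwise empty.
    static Optional<OomKill> parse(String line) {
        Matcher m = KILL_LINE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new OomKill(Integer.parseInt(m.group(1)), m.group(2)));
    }

    public static void main(String[] args) {
        // Sample line taken from the dmesg output quoted above.
        String sample = "Out of memory: Kill process 14606 (java) score 106 or sacrifice child";
        parse(sample).ifPresent(k ->
                System.out.println("OOM killer hit pid " + k.pid + " (" + k.command + ")"));
    }
}
```

In practice parse would be applied to every line of the dmesg output; a hit whose command is java around the time Tomcat disappeared points at the OOM killer.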
After spending hours on it, I finally found the reason.
ps -ef | grep tomcat showed several Tomcat processes running for the same application. It seems that earlier Tomcat shutdowns had not completed successfully, and for some reason the processes were still alive after the shutdown, which caused the high memory usage.
So I killed all the running Tomcat processes using kill, and the swap space was freed.
Started tomcat again, worked fine. :)
Tomcat 7 can be run under a security manager, which can prevent calls to System.exit() or something similar: http://ci.apache.org/projects/tomcat/tomcat7/docs/security-manager-howto.html .
Maybe there's a similar option for the 5.5 version. Try the documentation.
There are options to redirect the output to the same console that you use to start Tomcat. This information is redirected to logs when you execute on Unix based systems, on Windows, it remains with the console if not redirected.
Most probably this is a stack overflow (StackOverflowError). This is typical Tomcat behavior when one occurs. For example, you may be serializing beans with cyclic references to JSON or XML without handling the cycles.
Every time I have had this issue (several times), this has been the cause. All other stops are usually logged properly (OutOfMemoryError etc.).
This type of stop leaves no trace anywhere.
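To illustrate the cyclic-serialization case, here is a minimal sketch with a made-up Bean class and a hand-written serializer (no real JSON library is involved). It shows how unbounded recursion over a cycle ends in StackOverflowError, and one possible guard that tracks visited objects by identity. Whether such an error brings a whole server down depends on where it is thrown and how it is handled, so treat this only as a demonstration of the mechanism.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical demo: naive recursive serialization of beans that reference each other.
final class CycleDemo {

    static final class Bean {
        final String name;
        Bean friend; // may point back to an earlier bean, forming a cycle

        Bean(String name) {
            this.name = name;
        }
    }

    // Naive serializer: recurses forever if the object graph contains a cycle.
    static String naiveJson(Bean b) {
        if (b == null) {
            return "null";
        }
        return "{\"name\":\"" + b.name + "\",\"friend\":" + naiveJson(b.friend) + "}";
    }

    // Cycle-aware variant: remembers visited beans by identity and stops on a repeat.
    static String safeJson(Bean b) {
        Set<Bean> seen = Collections.newSetFromMap(new IdentityHashMap<Bean, Boolean>());
        return safeJson(b, seen);
    }

    private static String safeJson(Bean b, Set<Bean> seen) {
        if (b == null) {
            return "null";
        }
        if (!seen.add(b)) {
            return "\"<cycle:" + b.name + ">\"";
        }
        return "{\"name\":\"" + b.name + "\",\"friend\":" + safeJson(b.friend, seen) + "}";
    }

    // Returns true if the naive serializer blows the stack for this bean.
    static boolean overflows(Bean b) {
        try {
            naiveJson(b);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Bean a = new Bean("a");
        Bean b = new Bean("b");
        a.friend = b;
        b.friend = a; // cycle: a -> b -> a
        System.out.println("naive serializer overflows: " + overflows(a));
        System.out.println("safe serializer output: " + safeJson(a));
    }
}
```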
I have a server with 4 CPUs and 16 GB of RAM.
A WebLogic admin server, 2 managed servers and a Tomcat server are running on this Ubuntu machine.
Resource utilization explodes at times, which is very unusual. This has never happened before, and I think it has something to do with the Java parameters that I used.
Have a look at this:
Weblogic Cluster:
Admin Server : qaas-01
Managed Servers : qams-01, qams-02
In the image below you can see that the Java processes associated with the above are multiplying and consuming too much memory.
I figured out that this is more general and not specific to WebLogic.
A lot of processes are behaving the same way.
In the picture below it's the Apache Tomcat and Jenkins slave processes that are replicating and consuming memory.
Can anyone help me identify the real issue?
This question is quite broad, so start by looking into why it may be happening. Also post your JVM flags and anything you changed that may be causing this.
First you need to figure out what is taking up your CPU time.
Check the WebLogic config console to generate a stack trace and see what is going on. You may need to sit and watch the CPU so that you can run it when the CPU spikes. You can also force a stack trace using jstack. To get a Java stack trace you may need to use sudo and run jstack as the user running the server; otherwise you get an OS thread dump, which may not be as useful. Read about jstack.
If the above does not give enough information about why the CPU spiked, then since this is Ubuntu you can run:
timeout 20 strace -cvf -p {SERVER PID HERE} -o strace_digest.txt
This will run strace for 20 seconds and report on which OS calls are being made most frequently. This can give you a hint as to what is going on.
Enable and check the garbage collection log to see how often GC runs; the JVM may not have enough memory. See whether GC activity correlates with the CPU spikes.
I don't think there is a definitive way to diagnose a CPU spike just by looking at top, but the above is a start for debugging.
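Besides the GC log, the garbage collection check suggested above can also be approximated from inside the JVM with the standard java.lang.management API. The sketch below is only an illustration: the class name GcSnapshot is made up, and the code reports on the JVM it runs in, so to observe a server it would have to run inside that server's JVM (for example from a background thread) rather than as a separate program.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: sums GC activity of the current JVM across all collectors.
final class GcSnapshot {

    // Total number of collections so far; collectors reporting -1 (undefined) are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long lastCount = totalCollections();
        long lastMillis = totalCollectionMillis();
        // Take three one-second samples and print the change between samples.
        for (int i = 0; i < 3; i++) {
            Thread.sleep(1000);
            long count = totalCollections();
            long millis = totalCollectionMillis();
            System.out.println("GC runs: +" + (count - lastCount)
                    + ", GC time: +" + (millis - lastMillis) + " ms");
            lastCount = count;
            lastMillis = millis;
        }
    }
}
```

If the GC time per interval climbs at the same moments the CPU spikes, memory pressure is a plausible suspect.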
To be clear, this is an Upstart/Linux debugging problem, not a Java problem per se.
I have a Java application that installs a shutdown hook. It shows some funny behaviour on Ubuntu GNOME: the shutdown hook never runs when a restart or power-off is scheduled. At first I thought it was a problem with my shutdown hook, so I simplified it until it only wrote a line to a file (yes, I know about the log4j2 logger problem with shutdown hooks, so I disabled theirs too). When that didn't work, I started hacking /etc/init.d/sendsigs.
I added this to the beginning of the do_stop function:
app="$(jps | grep appname.jar | cut -d' ' -f1)"
echo "$app" >> "/home/i30817/output.txt"
Sure enough, that showed that the application was no longer running, so the shutdown hook was never triggered by sendsigs.
Then I used lastcomm, replacing my edit of sendsigs with:
echo `/home/i30817/lastcomm` > "/home/i30817/output.txt"
And it told me that the java process exited with 1 and was not signaled:
java X i30817 ?? 10.26 secs Sun Mar 2 12:44 E 1
But this still didn't help me find what actually killed it and why. The problem is not reproducible with a smaller example, so it's probably something in the larger application (but not the shutdown hook, since it was minimized) that doesn't like the shutdown process and manages to kill the process, but I can't figure it out. Redirecting the process output to a file doesn't reveal anything either, e.g.:
java -jar /home/i30817/Documents/projects/app/dist/app.jar > allout.text 2>&1
The file contains nothing but normal application output.
Can you help me figure this out? There are a lot of duplicate questions about the same thing too (but they assume it's the shutdown hooks malfunctioning).
edit: more detail, now that i understand the problem a bit better. I think now that processes not being there on sendsigs is normal. Java applications, and maybe others use a protocol from the window manager where SIGHUP, SIGHUP and SIGCONT is sent on shutdown/logout. The JVM hooks SIGHUP to launch the shutdown hooks. I tested this with a very small example that only adds a shutdownhook and has a infinite cycle on main, and ran it with a system tap script in the background:
java -jar app.jar
and in another shell
sudo stap -o process.txt sigkill.stp
However when i tried that with my application i think i found the culprit:
PROCESS: SIGSEGV java.signal_generate sent to java 2280
but don't really know what to do about it considering there is no thread dump or anything and this is strange to reproduce (only my app, only during shutdown).
edit2: now i suspect the reason for the 'abrupt' termination without core dump is the ulimit during shutdown. So i'm trying to solve that in preparation for a bug report. I edited /etc/security/limits.conf to add this and rebooted
* soft core unlimited
root hard core unlimited
* hard rss 10000
(fs.suid_dumpable=2 was set by Ubuntu; no AppArmor, I think)
But I edited /etc/init.d/sendsigs again to print ulimit -a and sleep for 30 seconds before killing the processes, and it seems that during reboot the ulimit gets reset. Moreover, the output looked different, as if another executable version were being used: for instance, instead of 'core file size' it said 'core(dump)' or something like that.
edit3: ah, i need to have fs.suid_dumpable=1 instead - gonna try now.
Maybe the init ulimit doesn't matter for shutdown core dump triggering. After all the jvm was executed from the user env so it should be using the user ulimit.
edit4: eh. After much commenting of code i reached the following conclusion that i could have reached from RTFM:
the sigsegvs are harmless.
the non-zero exit code is not.
If the AWT is still up, the signal is always non zero and the shutdown hooks never run. Even a small example still prevents execution of shutdown hooks in linux reboot if a JFrame is up (unlike windows, where they will start). Looking at the source, the application shutdownhooks are run on a slot by themselves, slot 1. I bet slot '0' is the AWT and that is halting the system somehow.
I guess it's time to check the package private signal handling libs to see if i can get SIGHUP before the JVM decides to terminate everything without even giving the opportunity for cleanup code to run.
According to docs
http://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread)
In rare circumstances the virtual machine may abort, that is, stop running without shutting down cleanly....
If the virtual machine aborts then no guarantee can be made about whether or not any shutdown hooks will be run.
Bash has a useful command, trap. It intercepts signals such as SIGHUP or SIGTERM so that a script can handle them (SIGKILL cannot be trapped).
So... we have a problem: Tomcat sometimes dies for no visible reason, and without any helpful information in the log files.
My idea is to add the Java analog of trap to collect a jstack dump before the JVM running Tomcat dies.
How can I do this in Java? Please note that I'm not a Java programmer.
Thanks for any tips.
The JVM (Oracle's at least) already installs signal handlers and translates signals into exceptions and other useful things. See http://docs.oracle.com/javase/7/docs/webnotes/tsg/TSG-VM/html/signals.html.
Usually when Tomcat dies without logs, it's a symptom of running out of stack or heap space (Tomcat runs in the same JVM as the web apps, and a misbehaved app can crash the server before the logs are flushed).
What version of Tomcat are you using? If you are using Tomcat 6+, you can disable log buffering completely so that the final messages are flushed as they are written. See http://tomcat.apache.org/tomcat-6.0-doc/logging.html.
For JULI, a negative bufferSize will force flushes after each write.
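As for the closest Java analog of trap asked about above, a shutdown hook that prints a jstack-like dump is one option. The following is a minimal sketch, assuming Java 8 or newer; the class name ThreadDumper is hypothetical and the output format only loosely resembles jstack. In line with the Runtime documentation quoted earlier, the hook only gets a chance to run on an orderly shutdown, not when the JVM aborts or is killed with SIGKILL.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

// Hypothetical helper: produces a jstack-like dump of all threads in the current JVM.
final class ThreadDumper {

    // Builds a text dump with one block per thread: name, state and stack frames.
    static String dump() {
        StringBuilder sb = new StringBuilder();
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : infos) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    // Registers a shutdown hook that writes the dump to stderr and returns the hook thread.
    static Thread installHook() {
        Thread hook = new Thread(() -> System.err.print(dump()), "thread-dump-on-exit");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        installHook();
        System.out.println("Exiting; the hook prints a thread dump to stderr.");
    }
}
```

Writing to stderr assumes stderr is redirected somewhere you can read afterwards, such as catalina.out.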
I was playing around with the Java heap allocation and I think I did something that set it not just for a specific Tomcat folder but for the entire system, because I can no longer run my application in Spring or with the custom Tomcat folder I had also been using for testing. When I try to run the application, it just hangs at "INFO: Initializing Spring root WebApplicationContext".
I am not sure how I could have set this on the Linux command line; I ran something like export CATALINA_OPTS="-Xms2000m -Xmx4500m" or JAVA_OPTS="-Xms2000m -Xmx4500m -XX:MaxPermSize=4500m".
I think I accidentally used the settings meant for my cloud server (which has more memory), so I gave the JVM more memory than my entire system has.
I would appreciate it if anyone could tell me whether I could have done this, whether it could be causing the issue, and how I can get my system to report the current allocation so I can check or change it. I have tried export CATALINA_OPTS="-Xms2000m -Xmx4500m" again, but it still doesn't work.
I would like to restore everything to the default settings. I normally set the heap allocation in the bin/startup.sh file in the Tomcat folder, but I think I had forgotten this and was experimenting on the command line.
Thanks
Check all of the log files for Tomcat. It's possible that the startup of the Spring WebApplicationContext is encountering errors, but logging it elsewhere.
Setting the max heap size too large for Tomcat sounds like a red herring to me. If the initial heap size argument is larger than the maximum size of memory on the system, the Java process will fail immediately - Tomcat will not get to the point of initializing your Spring application.
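To see which heap settings a JVM actually picked up, it can report them itself. The sketch below is a hedged illustration with a made-up class name, HeapSettings; it prints the JAVA_OPTS and CATALINA_OPTS environment variables as seen by the process, the memory-related flags the JVM was started with, and the effective maximum heap. Note that a variable set with export only lives in the shell session where it was set, so it is not a system-wide change.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports the heap-related settings of the JVM it runs in.
final class HeapSettings {

    // Effective maximum heap in MiB, as reported by the running JVM.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // Heap-related startup flags (-Xms, -Xmx, -Xmn, -XX:MaxPermSize) passed to this JVM.
    static List<String> heapFlags() {
        List<String> flags = new ArrayList<>();
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.startsWith("-Xm") || arg.startsWith("-XX:MaxPermSize")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        // Environment variables are null when not set in the launching shell.
        System.out.println("JAVA_OPTS     = " + System.getenv("JAVA_OPTS"));
        System.out.println("CATALINA_OPTS = " + System.getenv("CATALINA_OPTS"));
        System.out.println("heap flags    = " + heapFlags());
        System.out.println("max heap      = " + maxHeapMiB() + " MiB");
    }
}
```

Run standalone, this only describes its own JVM; to inspect Tomcat the same calls would need to run inside the Tomcat process, or you can look at the java command line shown by ps.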
See if you can increase the server timeout if you are running it from an IDE. From command line, you'll have to wait.
If the problem persists, try cleaning/reloading the application.
Isn't finding bugs what debuggers are for? Why not simply start your Tomcat in debug mode, wait until it "hangs", and find out what it is doing by suspending the relevant thread and looking at the stack trace?
Or even simpler, do a "thread dump" once the app hangs?
For me, the cause was another service that was not reachable.
Spring kept retrying to reach it and waited for each attempt to time out.
Once I started the other service, it worked automatically.
I'm having problems with Jetty crashing intermittently. I'm using Jetty 6.1.24.
I'm running a neo4j Spring MVC webapp. Jetty stays up for approximately 1 hour and then I have to restart it. It runs on a small Amazon EC2 instance (Debian, 1.7 GB of RAM).
I start Jetty using java -Xmx900m -server -jar start.jar
I connect to the server using PuTTY. When Jetty crashes, the PuTTY session disconnects, so I cannot see what error caused the crash.
I would like to be able to see whether it is an error generated by Spring, but I'm not sure how to log the output from the Spring app with Jetty. Or, if it is Jetty or a memory issue, what would be the best way to monitor Jetty? I cannot reproduce this on my local machine running Windows. What do you think would be the best way to approach this? Thanks
This isn't really a programmer question; perhaps it'll be moved over to ServerFault.
You didn't specifically state which operating system you're using, but I'm hazarding a guess at some Linux distribution. You have two options of figuring out what's wrong:
Start your session in screen. Screen will live for as long as the actual machine is powered on, until you reboot the operating system (or you exit screen).
You start screen like this:
screen
and you get a new prompt where you can start your program (cd foo, jetty, etc). When you're happy and need to go somewhere, you can detach from the screen session by hitting CTRL+A and then CTRL+D. You'll drop back to where you were before you invoked screen.
To get back to the session, type screen -R, which resumes an existing screen. You should see Jetty again.
The nice thing is that if you lose the connection (or close PuTTY by accident or whatever), you can use screen -list to get a list of running screens, then forcibly detach one with -D and reattach it to the current PuTTY session with -R, no harm done!
Use nohup. Nohup more or less detaches the process you're running from the console, so none of its output comes to the terminal. You start your program in the normal fashion, but you add the word nohup to your command.
For example:
nohup ls -l &
After ls -l is complete, your output is stored in nohup.out.
When you say crash, do you mean the JVM segfaults and disappears? If so, I'd check that you aren't exhausting the machine's available memory. Java on Linux will crash when system memory gets so low that the JVM cannot allocate up to its maximum memory. For example, say you've set the max JVM memory to 500 MB, of which it's using 250 MB at the moment, but the Linux OS only has 128 MB available. This produces unstable results and the JVM will segfault.
On Windows the JVM is better behaved in this scenario and throws OutOfMemoryError when the system is running low on memory.
Validate how much system memory is available around the time of your crashes.
Verify if other processes on your box are eating up a lot of memory. Turn off anything that could be competing with the JVM.
Run jconsole and connect it to your JVM. That will tell you how memory is being used in your JVM process and give you a history to look back through when it does crash.
Eliminate any native code you might be loading into the JVM when doing this type of testing.
I believe Jetty has some native code to do high volume request processing. Make sure that's not being used. You want to isolate the crashes to Java and NOT some strange native lib. If you take out the native stuff and find it works then you have your answer as to what's causing it. If it continues to crash then it very well could be what I'm describing.
You can force the JVM to allocate all its memory at startup with -Xms900m, which ensures the JVM doesn't compete with other processes for memory later. Once it has the full Xmx amount allocated, it won't crash. Not a solution, but an easy way to test this.
When you start java, redirect both outputs (stdout and stderr) to a file:
Using Bash:
java -Xmx900m -server -jar start.jar > stdout.txt 2> stderr.txt
After the crash, inspect those files.
If the crash is due to a signal (like SEGV=segmentation fault), there should be a file dump by the JVM at the location you've started java. For Sun VM (hotspot), it's something like hs_err_pid12121.log (here 12121 is the process ID).
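Looking for those crash files can be automated with a few lines. The sketch below is only an illustration: the class name CrashLogFinder is made up, and it assumes the HotSpot naming pattern mentioned above (hs_err_pid followed by the process ID and .log) and that you know the directory Java was started from.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists HotSpot fatal error logs in a directory.
final class CrashLogFinder {

    // Returns all files in dir whose names match hs_err_pid*.log, sorted by path.
    static List<Path> find(Path dir) {
        List<Path> hits = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "hs_err_pid*.log")) {
            for (Path p : stream) {
                hits.add(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) {
        // Defaults to the current directory; pass the server's start directory as an argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> hits = find(dir);
        if (hits.isEmpty()) {
            System.out.println("No hs_err_pid*.log files in " + dir.toAbsolutePath());
        }
        for (Path p : hits) {
            System.out.println(p);
        }
    }
}
```

An empty result does not rule out a crash; it may simply mean the process was killed from outside, for example by the kernel, before the JVM could write anything.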
PuTTY disconnecting STRONGLY hints that the server is running out of memory and starts shutting down processes left and right. It is probably your Jetty instance growing too big.
The easiest thing to do now is to add 1-2 GB more swap space and try again. Also note that you can use jvisualvm to attach to the Jetty instance to get runtime information directly.