Hanging java Process on solaris - java

We have a problem where our java process is hanging forever,
Unless a Kill -9 is issued against it.
The same Process is running successfully in the Other Solaris Envs,
Java process consist single thread which starts and end after doing some processing On the Data ,Though from the logs and data we can see that the code is completely executed and all the data is processed.
but if we do JPS we will always see that process is running.
we are Using EHcache with spring for caching purpose and UCP for the connection pool.
On The dB side we Have ORACLE RAC Structure.
took several Jstacks and can never See the Process sticking in the my code.
though from thread dump can see there are lot of UCP threads hanging there.
Also Adding a Shutdown hook and removing It in the end,but some reason seems the shutdownhook is never called.
Due to project restrictions ,cant paste the code.
can Anyone Please help

my customer is facing same problem with our installer hanging on Solaris. when installer ran in debug mode, we realized that java which is embedded with installer is hanging. Please post in case any of you found answer for it.

Related

java process finishes after some time with no particular reason

I have a java .jar file that i launch on an AWS instance in detached mode. So when i exit the ssh session, it still runs.
The app does some network stuff, and is expected to run for days until it finishes it task.
I have made logs all over the app, made log in the end of main method. I also made a global try/catch and added logging to the catch section.
Still, after some days i enter into ssh and see that the app just stopped running. No exceptions, main method did not complete because the log in the end did not trigger. It seems that the process was just killed in the middle of working. Sometimes it works for 5 hours, sometimes for 3-4 days without stopping.
I have no idea what could be the cause of this. I expect the java process to run until it finished, or until it crashes. Am i missing something?
upd:
it is an aws t2.micro, i think, the free tier one. It runs ubuntu 18.04.3 LTS
You need to monitor the server and application. The first thing to look at is your instance cloudwatch statistics for any CPU or memory spikes. If you find one, you know what you need to fix if you want to run your application on micro instance. For further reading
Monitoring Your Instances Using CloudWatch
Alternatively, you can collect and dump the java process statistics regularly when you are running the application. This can give insight of how heap,stack and cpu usage. Check this SO post for further details :
How do I monitor the computer's CPU, memory, and disk usage in Java?

Thread starts and fails to stops with Tomcat. What's happening?

i have a java multi-threaded program that is running. i am running it on a tomcat server. when the threads are still running, some executing tasks, some still waiting for some thing to return and all kinds of things, assume i stop the server all of a sudden in this scenario.. when i do i get a warning on the tomcat terminal saying a thread named x is still running and the server is being stopped so this might lead to a memory leakage. what is the OS actually trying to tell me here? can someone help me understand this?? i am running this program on my system several times and i have stopped the server abruptly 3 times and i have seen this message when ever i do that. have i runined my server? (i mean my system). did i do something very dangerous????
please help.
Thanks in advance!
when i do i get a warning on the tomcat terminal saying a thread named x is still running and the server is being stopped so this might lead to a memory leakage. what is the OS actually trying to tell me here?
Tomcat (not the OS) is surmising from this extra thread that some part of your code forked a thread that may not be properly cleaning itself up. It is thinking that maybe this thread is forked more than once and if your process runs for a long time, it could fill up usable memory which would cause the JVM to lock up or at least get very slow.
have i ruined my server? (i mean my system). did i do something very dangerous????
No, no. This is about the tomcat process itself. It is worried that this memory leak may stop its ability to do its job as software -- nothing more. Unless you see more than one thread or until you see memory problems with your server (use jconsole for this) then I would only take it as a warning and a caution.
It sounds like your web server is forking processes which are not terminated when you stop the server. Those could lead to a memory leak because they represent processes that will never die unless you reboot or manually terminate them with the kill command.
I doubt that you will permanently damage your system, unless those orphaned processes are doing something bad, but that would be unrelated to you stopping the server. You should probably do something like ps aux | grep tomcat to find the leftover processes and do the following
Kill them so they don't take up more system resoures.
Figure out why they are persisting when the server is stopped. This sounds like a misbehaving server.

java application being abrutply terminated on shutdown without having signal sent

To be clear, this is a upstart/linux debugging problem not a java problem per-se.
I have a java application installing a shutdownhook. It shows some funny behaviour in ubuntu-GNOME, namely that the shutdown hook never run if a restart or power off were scheduled. At first i thought it was a problem with my shutdown hook, so i simplified it until it was only writing a line to file (yes i know about the log4j2 logger problem with shutdownhooks so i disabled theirs too). Then when that didn't work i started hacking /etc/init/sendsigs
I added this to the beginning of the do_stop function:
app="$(jps | grep appname.jar | cut -d' ' -f1)"
echo "$app" >> "/home/i30817/output.txt"
sure enough, that showed that it was no longer running, so the shutdown hooks were never activated by sendsigs
Then i used lastcomm from here replacing my edit of sendsigs by:
echo `/home/i30817/lastcomm` > "/home/i30817/output.txt"
And it told me that the java process exited with 1 and was not signaled:
java X i30817 ?? 10.26 secs Sun Mar 2 12:44 E 1
but this still didn't help me find what actually killed it and why. This problem is not reproducible with a smaller example, so it's probably something in the larger application (but not the shutdown hook, since it was minimized) that doesn't like the shutdown process and manages to kill the process, but i can't figure it out... redirecting the process output to a file doesn't say anything either eg:
java -jar /home/i30817/Documents/projects/app/dist/app.jar > allout.text 2>&1
is empty of everything but normal app output
Can you help me figure out this? There are a lot of duplicate questions about the same thing too (but they think it's the shutdownhooks malfunctioning).
edit: more detail, now that i understand the problem a bit better. I think now that processes not being there on sendsigs is normal. Java applications, and maybe others use a protocol from the window manager where SIGHUP, SIGHUP and SIGCONT is sent on shutdown/logout. The JVM hooks SIGHUP to launch the shutdown hooks. I tested this with a very small example that only adds a shutdownhook and has a infinite cycle on main, and ran it with a system tap script in the background:
java -jar app.jar
and in another shell
sudo stap -o process.txt sigkill.stp
However when i tried that with my application i think i found the culprit:
PROCESS: SIGSEGV java.signal_generate sent to java 2280
but don't really know what to do about it considering there is no thread dump or anything and this is strange to reproduce (only my app, only during shutdown).
edit2: now i suspect the reason for the 'abrupt' termination without core dump is the ulimit during shutdown. So i'm trying to solve that in preparation for a bug report. I edited /etc/security/limits.conf to add this and rebooted
* soft core unlimited
root hard core unlimited
* hard rss 10000
(fs.suid_dumpable=2 was set by ubuntu, no apparmor i think)
but during shutdown i edited /etc/init.d/sendsigs again to print ulimit -a and sleep for 30 seconds before killing the processes, and it seems that during reboot the ulimit gets reset again? And moreover, it had a different output like it was using another executable version, for instance instead of saying 'core file size' it had 'core(dump)' or something like that).
edit3: ah, i need to have fs.suid_dumpable=1 instead - gonna try now.
Maybe the init ulimit doesn't matter for shutdown core dump triggering. After all the jvm was executed from the user env so it should be using the user ulimit.
edit4: eh. After much commenting of code i reached the following conclusion that i could have reached from RTFM:
the sigsegvs are harmless.
the non-zero exit code is not.
If the AWT is still up, the signal is always non zero and the shutdown hooks never run. Even a small example still prevents execution of shutdown hooks in linux reboot if a JFrame is up (unlike windows, where they will start). Looking at the source, the application shutdownhooks are run on a slot by themselves, slot 1. I bet slot '0' is the AWT and that is halting the system somehow.
I guess it's time to check the package private signal handling libs to see if i can get SIGHUP before the JVM decides to terminate everything without even giving the opportunity for cleanup code to run.
According to docs
http://docs.oracle.com/javase/7/docs/api/java/lang/Runtime.html#addShutdownHook(java.lang.Thread)
In rare circumstances the virtual machine may abort, that is, stop running without shutting down cleanly....
If the virtual machine aborts then no guarantee can be made about whether or not any shutdown hooks will be run.

Tomcat stopped without any log or any stack

We have trouble with Tomcat 5.5 which stops at night on our production servers (Linux CentOS 4.8) and we have no idea why it stops...
There is no Tomcat's log in catalina.out or any application's log.
We tried different things to find why the server stops:
configure Tomcat to be able to generate a core dump
instrument System.exit() method with javassist to find if the method was called
add a shutdown hook to the JVM (with Runtime.getRuntime().addShutdownHook())
None of them worked, we have no core dump, the Exit method and the shutdown hook are not called.
My conclusions are:
The VM is not terminated properly but crash without any log.
Any idea or log to read to find why Tomcat stops?
1) Make sure you know where stderr is redirected and check if anything got printed there.
2) Check the memory limits on Tomcat and how much free memory does the system have. Review the Linux system logs under /var/log to see if anything suspicious happened during the time. For example, kernel can randomly kill a process (almost) without a trace if the system is running low on memory.
We've ran 5.5 in production for years and never had any unexplained shutdowns, FWIW.
This worked for me.
As suggested here in other answers checked system logs in /var/log/messages but permission denied for me. So, I used dmesg command instead and got this in the logs
"Out of memory: Kill process 14606 (java) score 106 or sacrifice child".
In the output I also noticed Swap Memory free 0 K. Ran top command to confirm the same. So, somehow there was a high memory usage which caused the OS to kill my tomcat process.
After spending hours finally got the reason.
ps -ef | grep tomcat showed that there were several tomcat processes running for the same application. It seems that, earlier tomcat shutdowns might not have been completed successfully and due to some reason the processes were not killed even after the shutdown, which was causing the high memory usage.
So, killed all running tomcat processes using kill. SWAP memory got freed.
Started tomcat again, worked fine. :)
Tomcat 7 has an option inside catalina to prevent the System.exit class call or something similar: http://ci.apache.org/projects/tomcat/tomcat7/docs/security-manager-howto.html .
Maybe there's a similar option for the 5.5 version. Try the documentation.
There are options to redirect the output to the same console that you use to start Tomcat. This information is redirected to logs when you execute on Unix based systems, on Windows, it remains with the console if not redirected.
Most probably there is a stack-overflow exception. This is typical behavior of Tomcat when it happens. For example, you're trying to serialize to JSON or XML beans with cyclic dependencies (but without handling of the cycles).
Everytime I had this issue (several times) it always has been this one. All other stops are usually logged properly (like OutOfMemory etc).
This type of stops leaves no trace anywhere.

Standalone Java App dies after a few days

We have a Java App that connects via RMI to another Java app.
There are multiple instances of this app running at the same time, and after a few days an instance just stops processing... the CPU is in 0 and I have an extra thread listening to an specific port that helps to shutdown the App.
I can connect to the specific port but the app doesn't do anything.
We're using Log4j to keep a log and nothing is written, so there aren't any exceptions thrown.
We also use c3p0 for the DB connections.
Anyone have ideas?
Thanks,
I would suggest starting with a thread dump of the affected application.
You need to see what is going on on a thread by thread basis. It could be that you have a long running thread, or other process which is blocking other work from being done.
Since you are running linux, you can get your thread dump with the following command
kill -3 <pid>
If you need help reading the output, please post it in your original question.
If nothing is shown from the thread dump, other alternatives can be looked at.
Hum... I would suggest using JMetter to stress the Application and take note of anything weird that might be happening (such as Memory Leaks, Deadlocks and such). Also review the code for any Exceptions that might interrupt the program (or System.exit() calls). Finally, if other people have access to the computer, makes sense to check if the process wasn't killed manually somehow.

Categories