How can we take historical thread dumps in WebLogic servers - java

A Java application process running on a WebLogic server was reported to have had some stuck threads in the past.
Can we get information about those stuck threads from that past time?

If those threads were marked as stuck by WebLogic, they were logged in the WebLogic server's log file, stack traces included (look for the [STUCK] marker, message ID BEA-000337). Have a look at these log files to see where your threads were stuck.

No, you can't. Gone is gone, end of story.
The only thing you can do is figure out whether you can somehow detect such situations while they occur, then collect thread dumps and log the results.
Worst case, you have to give the customers/operators clear instructions on how to gather dumps themselves and make them available to you.
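A minimal sketch of one such detector, assuming deadlock is the kind of "stuck" you care about (WebLogic's own notion of stuck, busy longer than StuckThreadMaxTime, would need a different heuristic); the class name and polling interval here are made up:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    // Hypothetical in-JVM watchdog: polls for deadlocked threads and logs
    // their full stacks the moment the problem occurs, instead of after the fact.
    public class StuckThreadWatchdog implements Runnable {
        private static final long CHECK_INTERVAL_MS = 30000; // arbitrary

        public void run() {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            while (!Thread.currentThread().isInterrupted()) {
                long[] ids = mx.findDeadlockedThreads(); // null when no deadlock
                if (ids != null) {
                    for (ThreadInfo info : mx.getThreadInfo(ids, Integer.MAX_VALUE)) {
                        if (info != null) {
                            System.err.println(info); // replace with your logger
                        }
                    }
                }
                try {
                    Thread.sleep(CHECK_INTERVAL_MS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }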

Related

ColdFusion JVM: strange memory behaviour

Since last month, we have had a problem on our company's server (Win2008ServerStd + IIS7 + CF Enterprise 9.0.1 (hotfix 2)).
I used JConsole to monitor the ColdFusion JVM (1.6.0_24) activity, and here's what I see:
Notice that strange "curve" between 14:10 and 14:15! What is that?
Obviously it's not standard behaviour; when it happens, my applications hang for 30 to 70 seconds!
Do you know what can cause that memory issue? It seems like the GC does not run correctly, or hangs itself.
I don't expect an instant answer; I imagine there could be many root causes. But where can I start investigating?
Using cfstat, perfmon, FusionReactor, or the CF Performance Monitor, take a look at running and queued requests during your problem. What you will likely see is the number of running requests climbing past the simultaneous-request setting (in the CF admin). Then the requests will start to queue. Eventually the queue will clear out (if your server is recovering on its own).
This sort of thing can be caused by a number of things: for example, your DB server slowing down or having an issue, a network problem, network ports resyncing, disk I/O problems, etc.
My guess is that you will drive yourself batty trying to figure this out by monitoring your heap. See if you can use one of those monitors to watch for the specific scripts that might be the culprit.
The other comment (about some indexing agents) is also a possibility. A flurry of indexing can definitely cause this behavior. If that's the case, you might take a look at the simultaneous-request setting; if it is at the default, you might have enough headroom to increase it.
It could have been a spider creating lots and lots of sessions as it crawled the site, which would eat up memory for a period of time. Once the spider stopped crawling, those sessions would time out and be garbage collected.
I would compare your HTTP server logs with the JVM logs: look at that time frame and see whether there are a lot of requests from a search-engine spider (Googlebot, msnbot, etc.).
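If you want to check the spider theory quickly, here is a rough sketch; the user-agent list is illustrative, and the log file path is whatever your IIS logs use for the day of the spike:

    import java.io.BufferedReader;
    import java.io.FileReader;

    // Counts access-log lines mentioning common crawler user-agents.
    // Pass the log file for the day of the spike as the first argument.
    public class SpiderCount {
        public static void main(String[] args) throws Exception {
            String[] bots = { "Googlebot", "msnbot", "Slurp", "bingbot" };
            int[] hits = new int[bots.length];
            BufferedReader in = new BufferedReader(new FileReader(args[0]));
            try {
                String line;
                while ((line = in.readLine()) != null) {
                    for (int i = 0; i < bots.length; i++) {
                        if (line.contains(bots[i])) hits[i]++;
                    }
                }
            } finally {
                in.close();
            }
            for (int i = 0; i < bots.length; i++) {
                System.out.println(bots[i] + ": " + hits[i]);
            }
        }
    }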
Fabio,
I had the same kind of issue a couple of months ago, where I was getting spikes at regular intervals and the server was eating up around 50% of CPU. I wrote up the full story at the URL below, which may help you (sorry it is so long).
http://www.isummation.com/blog/strange-coldfusion-issue-jrun-eating-up-to-50-of-cpu/
I found that client variables stored in the registry were causing the issue. I was able to catch it with the help of VisualVM, where I first found the thread causing the issue and then looked at its stack trace to pin down the solution.
The only thing that's really odd, IMO, is the sudden spike to so many threads. Capture thread dumps on a regular basis (jstack etc. are your friends) and then correlate those thread dumps with the point in your monitoring where the spike shows up.
The root problem will become more obvious once you understand what all the extra threads are doing. Perhaps it's more threads handling transactions, but it might be something else entirely.
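A sketch of the "regular thread dumps" idea, assuming you can run it inside the affected JVM (otherwise loop jstack <pid> from the outside); the file prefix and interval are arbitrary:

    import java.io.PrintWriter;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.Map;

    // Writes a timestamped dump of all thread stacks once a minute so the
    // dumps can be lined up against the monitoring graph later.
    public class PeriodicDumper implements Runnable {
        public void run() {
            SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd-HHmmss");
            try {
                while (true) {
                    PrintWriter out = new PrintWriter("threads-" + fmt.format(new Date()) + ".txt");
                    try {
                        for (Map.Entry<Thread, StackTraceElement[]> e
                                : Thread.getAllStackTraces().entrySet()) {
                            out.println(e.getKey());
                            for (StackTraceElement frame : e.getValue()) {
                                out.println("    at " + frame);
                            }
                            out.println();
                        }
                    } finally {
                        out.close();
                    }
                    Thread.sleep(60000);
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }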

Standalone Java App dies after a few days

We have a Java App that connects via RMI to another Java app.
There are multiple instances of this app running at the same time, and after a few days an instance just stops processing... the CPU is at 0, and I have an extra thread listening on a specific port that helps to shut down the app.
I can connect to that port, but the app doesn't do anything.
We're using Log4j to keep a log, and nothing is written, so there aren't any exceptions being thrown.
We also use c3p0 for the DB connections.
Anyone have ideas?
Thanks,
I would suggest starting with a thread dump of the affected application.
You need to see what is going on on a thread-by-thread basis. It could be that you have a long-running thread, or some other process that is blocking other work from being done.
Since you are running Linux, you can get your thread dump with the following command:
kill -3 <pid>
Note that the dump is written to the target JVM's standard output, not to the shell you run kill from. If you need help reading the output, please post it in your original question.
If nothing is shown from the thread dump, other alternatives can be looked at.
Hmm... I would suggest using JMeter to stress the application and taking note of anything weird that might happen (such as memory leaks, deadlocks and the like). Also review the code for any exceptions that might interrupt the program (or System.exit() calls). Finally, if other people have access to the machine, it makes sense to check whether the process was killed manually somehow.
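On the System.exit() point: one old trick, sketched below, is to install a SecurityManager whose checkExit logs who called exit. (SecurityManager is deprecated in recent JDKs, but it fits a JVM of this vintage; the class name is made up.)

    import java.security.Permission;

    // Logs the stack of any System.exit() call before letting it proceed.
    // Install once, early in startup: System.setSecurityManager(new ExitTracer());
    public class ExitTracer extends SecurityManager {
        public void checkExit(int status) {
            new Throwable("System.exit(" + status + ") called from").printStackTrace();
            // not throwing anything: the exit is allowed, we only wanted the stack
        }
        public void checkPermission(Permission perm) { /* allow everything else */ }
        public void checkPermission(Permission perm, Object context) { /* allow */ }
    }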

How to find problematic thread in Eclipse remote debugger?

I have a web application running in a JBoss application server (but it is not JBoss-specific, so we could also assume it is Tomcat or any other server). Now I have the problem that one thread seems to be in a deadlock situation: it uses 100% CPU all the time. I have started the server with the debug port enabled, and I can connect Eclipse to it. But the problem is: there are a lot of threads running. How can I find the right thread? I know the process id (from the Linux "top" command) but I think this will not help. Do I really have to open each thread separately and check what it is currently doing? Or is there a way to filter the threads for "most active" or something like that in Eclipse?
You can try to generate a thread dump (Ctrl+Break, as shown in this thread).
Or you could attach JConsole to the remote session (leaving Eclipse aside for now), monitor the threads, and generate a thread dump.
(Screenshot of generating a thread dump in JConsole: http://www.jroller.com/dumpster/resource/tdajconsole.png)
It seems you need to narrow things down to the code that has the bug: first identify which thread is eating the CPU, then see which code is being executed by that thread, and at that point you can remote debug.
I would suggest using something like JProfiler, jvisualvm, jconsole, or something similar. Using one of these tools will allow you to get some insight into what the thread is doing, and should let you sort the threads by CPU cycles used so you can find the offending thread quickly.
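If you can run code inside the affected JVM (a debug servlet, say), here is a bare-bones sketch of the "sort threads by CPU" idea using the standard ThreadMXBean:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    // Prints cumulative CPU time per thread; the 100%-CPU thread will dominate.
    // Pipe the output through "sort -rn" to rank the threads.
    public class CpuHogFinder {
        public static void main(String[] args) {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            if (mx.isThreadCpuTimeSupported() && !mx.isThreadCpuTimeEnabled()) {
                mx.setThreadCpuTimeEnabled(true);
            }
            for (long id : mx.getAllThreadIds()) {
                long cpuNanos = mx.getThreadCpuTime(id); // -1 if unsupported
                ThreadInfo info = mx.getThreadInfo(id);  // null if thread died
                if (info != null && cpuNanos > 0) {
                    System.out.println((cpuNanos / 1000000) + " ms\t" + info.getThreadName());
                }
            }
        }
    }

Outside the JVM, on Linux, "top -H" shows per-thread CPU; convert the hottest thread's ID to hex and match it against the nid=0x... field in a kill -3 thread dump.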

SQL Server 2000 blocking prevented by running profiler?

We are working on a large Java program that was converted from a Forte application. During the day we are getting blocking SPIDs in the server. We had a DBA visit yesterday, and he set up a profiler template to run to catch the locking/blocking action. When we run this profile, the blocking problem goes away. Why?
This application is distributed using RMI and has around 70 users. We are using SQL Server 2000 and Windows 2000 servers to keep compatibility with a bunch of old VB helper applications.
We have traced the blocking down to a specific screen and stored procedure, but now we can't get the errors to happen with Profiler running.
Thanks for any help!
Theo
The good old Heisenberg debugger problem.
Any profiler does two things: it adds instrumentation code, and it stores data. The first can thwart optimizers, and the second can change the timing of something, causing a race condition to go away.
This blocking-SPID problem seems to show up on Google a lot; it occurs when some resource is locked while another session wants it, so a timing change sounds like the likely explanation.
Microsoft has an article on how to deal with the problem.
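If Profiler itself changes the timing, a much lighter-weight way to catch the blocking is to poll for it. A sketch against SQL Server 2000's master..sysprocesses (the JDBC URL and credentials are placeholders; you also need a SQL Server JDBC driver on the classpath):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Polls sysprocesses every few seconds and prints any blocked SPIDs,
    // along with the SPID that is blocking them. Ctrl+C to stop.
    public class BlockingPoller {
        public static void main(String[] args) throws Exception {
            Connection con = DriverManager.getConnection(
                    "jdbc:sqlserver://dbhost:1433", "user", "password"); // placeholders
            Statement st = con.createStatement();
            while (true) {
                ResultSet rs = st.executeQuery(
                        "SELECT spid, blocked, waittime, lastwaittype "
                        + "FROM master..sysprocesses WHERE blocked <> 0");
                while (rs.next()) {
                    System.out.println("spid " + rs.getInt("spid")
                            + " blocked by " + rs.getInt("blocked")
                            + ", waiting " + rs.getLong("waittime") + " ms ("
                            + rs.getString("lastwaittype") + ")");
                }
                rs.close();
                Thread.sleep(5000);
            }
        }
    }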
Just a collection of random thoughts... I've seen traces take a server down, but never make things better.
What trace template are you using? (These names are taken from the SQL Server 2005 tools, sorry.)
The "Standard (default)" one tracks high-level calls and logon/logout.
The "TSQL_SPs" one tracks statement calls, which would be a lot more intrusive.
Is it binary and guaranteed, too? Trace on = no blocks, trace off = blocks? Or is it an unlucky coincidence? While you're all watching the DBA, does someone stop clicking in the client and come to watch?
Is something else being switched off as part of the trace? That is, are you using Profiler or a scripted trace (lots of sp_trace_set% statements)? In a scripted trace, there may be something that switches something else off.

Zombie threads eating my brainz (J2EE, Tomcat, Hibernate, Quartz)

It is Hallowe'en after all.
Here's the problem: I'm maintaining some old-ish J2EE code, using Quartz, in which I'm running out of threads. jconsole tells me that there are just short of 60K threads when it goes pear-shaped, of which about 100 (!!) are actually running. Intuition and some googling (see also here) suggest that something (I'm betting Quartz) is creating unmanaged threads that never get cleaned up.
Several subquestions:
Is there a tool that I can easily use to trace thread creation, so I can be certain the issue is really Quartz?
Almost everything I've found about similar problems references WebLogic; is this a false lead for Tomcat?
Does anyone have a known solution?
It's been years since I did J2EE, so I wouldn't be too surprised if this is something that can be solved simply.
Update: It's clearly creating threads without bound; see this plot from jconsole.
Try increasing the logging level of org.quartz.simpl.SimpleThreadPool to DEBUG to get more information.
If that does not work, try a logging listener: Quartz has a JobListener interface, which is described in its tutorial. A listener can help you trace job execution; maybe jobs just don't finish and get deadlocked.
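A minimal sketch of such a listener (the method names are from the real org.quartz.JobListener interface; how you register it, scheduler.addGlobalJobListener(...) in 1.x versus the ListenerManager in 2.x, depends on your Quartz version):

    import org.quartz.JobExecutionContext;
    import org.quartz.JobExecutionException;
    import org.quartz.JobListener;

    // Logs every job start and finish; jobs that start but never finish
    // (candidates for the leaked threads) stand out in the log.
    public class TracingJobListener implements JobListener {
        public String getName() {
            return "tracingJobListener";
        }
        public void jobToBeExecuted(JobExecutionContext ctx) {
            System.out.println("START " + ctx.getJobDetail()
                    + " on " + Thread.currentThread().getName());
        }
        public void jobExecutionVetoed(JobExecutionContext ctx) {
            System.out.println("VETOED " + ctx.getJobDetail());
        }
        public void jobWasExecuted(JobExecutionContext ctx, JobExecutionException ex) {
            System.out.println("END " + ctx.getJobDetail()
                    + (ex != null ? " with error: " + ex : ""));
        }
    }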
Configure org.quartz.threadPool.threadCount to stop running out of threads.
update:
Also, you might want to take a thread dump and look at the thread stats. VisualVM has a plugin called TDA, or you can use the Thread Dump Analyzer directly.
Just in case, check the Quartz version to see whether there is a known bug.
Have you had a look with jvisualvm? It gives some more information.
Also, get stack traces to see what the threads are actually waiting on. You might have an aha moment right there.
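To make sense of ~60K threads without opening each one, a sketch that groups live threads by name with trailing digits stripped (run inside the affected JVM); the leaking pool usually jumps out as one family with a huge count:

    import java.util.Map;
    import java.util.TreeMap;

    // Counts live threads per name "family" (e.g. "Timer-", "Quartz Scheduler
    // Worker-") so an unbounded pool shows up as one entry with a huge count.
    public class ThreadCensus {
        public static void main(String[] args) {
            Map<String, Integer> counts = new TreeMap<String, Integer>();
            for (Thread t : Thread.getAllStackTraces().keySet()) {
                String family = t.getName().replaceAll("\\d+$", "");
                Integer n = counts.get(family);
                counts.put(family, n == null ? 1 : n.intValue() + 1);
            }
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                System.out.println(e.getValue() + "\t" + e.getKey());
            }
        }
    }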
