When profiling Java application I note interesting fact. When JVM is in GC spiral of death thread dump is looks like:
"1304802943#qtp-393978767-9985" prio=10 tid=0x00007f3ed02dd000 nid=0x74e7 in Object.wait() [0x000000004febb000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)
- locked <0x00000007aed40048> (a org.mortbay.thread.QueuedThreadPool$PoolThread)
"26774405#qtp-393978767-9984" prio=10 tid=0x00007f3ee4b37000 nid=0x74e6 in Object.wait() [0x0000000045d1a000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)
- locked <0x00000007aed83aa0> (a org.mortbay.thread.QueuedThreadPool$PoolThread)
"764808089#qtp-393978767-9983" prio=10 tid=0x00007f3ee4c50000 nid=0x74e5 in Object.wait() [0x000000004ad6a000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)
- locked <0x00000007aed5c448> (a org.mortbay.thread.QueuedThreadPool$PoolThread)
So, there are a lot of threads in TIMED_WAITING state. Theoretically this situation could be easily found in normally functioning application (application simply doesn't have any incoming requests at the moment), but I can't find even single request dispatching thread doing something useful (nominal hit rate is about 100 hps).
Does this behavior have something to di with GC, or it's just coincidence?
Answering just the question's title:
What does thread dump look like when JVM spent time in GC?
The answer is: you have no means to obtain such dump (in a usual way).
JVM processes the request for thread dump only after reaching safepoint which just can't happen while in GC.
But there is a cheat way to obtain active GC's thread dump with help of undocumented JVMTI function AsyncGetCallTrace which is mentioned in this post:
http://jeremymanson.blogspot.com/2010/07/why-many-profilers-have-serious.html
It also hints that Oracle Solaris Studio can be used to take such mixed native/java thread dumps.
Try a jmap -histo:live over time, you can compare output, see which object types are growing.
You need to have the JDK installed for jmap.
http://docs.oracle.com/javase/6/docs/technotes/tools/share/jmap.html
A warning, jmap is intensive, it will pause all threads while it's running, which should only be a few seconds. Processes can core dump because it's intensive, generally it's quick and safe, but I have see it lock up or kill large applications, multi-gig heaps.
My guess is you have a thread pool which is waiting for something to do. If your process is efficient and you have even 100 requests per second you may have trouble catching even one thread doing something. I suggest you look at the CPU load of your process. If its 50%, you have a 50% chance of finding one thread (possibly not a request thread) doing something.
If you want to see what your server spends its time doing, I would try a profiler like VisualVM, or a commercial profiler like YourKit.
Doing a google search for you code, I found a different version http://grepcode.com/file/repo1.maven.org/maven2/org.mortbay.jetty/jetty-util/7.0.0.pre5/org/mortbay/thread/QueuedThreadPool.java however I suspect your threads are TIMED_WAIT in this block int he run() method
// We are idle
// wait for a dispatched job
synchronized (this)
{
if (_job==null)
this.wait(getMaxIdleTimeMs());
job=_job;
_job=null;
}
Related
Recently I often encounter situations where the CPU reaches 100%, So I checked the CPU situation:
It seems that the memory is full, so I dumped heap, and try to analyze it using MAT.
I first noticed that the memory of Finalizer is very abnormal, used more than 2G of memory.
For comparison, it is about 500MB under normal circumstances.
I initially thought it was because the Finalizer thread was blocked, But Java thread dump: BLOCKED thread without "waiting to lock ..." seems to mean that this is not the source of the problem
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f83c008c000 nid=0x1c190 waiting for monitor entry [0x00007f83a1bba000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.zip.Deflater.deflateBytes(Native Method)
at java.util.zip.Deflater.deflate(Deflater.java:444)
- locked <0x00000006d1b327a0> (a java.util.zip.ZStreamRef)
at java.util.zip.Deflater.deflate(Deflater.java:366)
at java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:251)
at java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:211)
at java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:145)
- locked <0x00000006d1b32748> (a java.util.zip.GZIPOutputStream)
at org.apache.coyote.http11.filters.GzipOutputFilter.doWrite(GzipOutputFilter.java:72)
at org.apache.coyote.http11.Http11OutputBuffer.doWrite(Http11OutputBuffer.java:199)
at org.apache.coyote.Response.doWrite(Response.java:538)
at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:328)
at org.apache.catalina.connector.OutputBuffer.flushByteBuffer(OutputBuffer.java:748)
at org.apache.catalina.connector.OutputBuffer.realWriteChars(OutputBuffer.java:433)
at org.apache.catalina.connector.OutputBuffer.flushCharBuffer(OutputBuffer.java:753)
at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:284)
at org.apache.catalina.connector.OutputBuffer.flush(OutputBuffer.java:261)
at org.apache.catalina.connector.CoyoteOutputStream.flush(CoyoteOutputStream.java:118)
at javax.imageio.stream..close(FileCacheImageOutputStream.java:238)
at javax.imageio.stream.ImageFileCacheImageOutputStreamInputStreamImpl.finalize(ImageInputStreamImpl.java:874)
at java.lang.System$2.invokeFinalize(System.java:1270)
at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:98)
at java.lang.ref.Finalizer.access$100(Finalizer.java:34)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:210)
Following all the way, I think I found the object that takes up memory:
So it seems that URLJarFile creates a lot of unused buffers, but I don’t know how to continue to find the source of the problem after I get here. Maybe I need to monitor the stack calling here?
Add: I think I found some useful information, most of which are loaded with jstl-1.2.jar
My application is using Gson 2.2 for converting POJOs to JSON. When I was making a load test I stumbled upon a lot of threads blocked in Gson constructor:
"http-apr-28201-exec-28" #370 daemon prio=5 os_prio=0 tid=0x0000000001ee7800 nid=0x62cb waiting for monitor entry [0x00007fe64df9a000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.google.gson.Gson.<init>(Gson.java:200)
at com.google.gson.Gson.<init>(Gson.java:179)
Thread dump does NOT show any threads holding [0x00007fe64df9a000] monitor.
How can I find out who holds it?
Gson code at line 200 looks pretty innocent:
// built-in type adapters that cannot be overridden
factories.add(TypeAdapters.STRING_FACTORY);
factories.add(TypeAdapters.INTEGER_FACTORY);
I'm using JRE 1.8.0_91 on Linux
tl;dr I think you are running into GC-related behavior, where threads are being put in waiting state to allow for garbage collection.
I do not have the whole truth but I hope to provide some pieces of insight.
First thing to realize is that the number in brackets, [0x00007fe64df9a000], is not the address of a monitor. The number in brackets can be seen for all threads in a dump, even threads that are in running state. The number also does not change. Example from my test dump:
main" #1 prio=5 os_prio=0 tid=0x00007fe27c009000 nid=0x27e5c runnable [0x00007fe283bc2000]
java.lang.Thread.State: RUNNABLE
at Foo.main(Foo.java:12)
I am not sure what the number means, but this page hints that it is:
... the pointer to the Java VM internal thread structure. It is generally of no interest unless you are debugging a live Java VM or core file.
Although the format of the trace explained there is a bit different so I am not sure I am correct.
The way a dump looks when the address of the actual monitor is shown:
"qtp48612937-70" #70 prio=5 os_prio=0 tid=0x00007fbb845b4800 nid=0x133c waiting for monitor entry [0x00007fbad69e8000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:233)
- waiting to lock <0x00000005b8d68e90> (a java.lang.Object)
Notice the waiting to lock line in the trace and that the address of the monitor is different from the number in brackets.
The fact that we cannot see the address of the monitor involved indicates that the monitor exists only in native code.
Secondly, the Gson code involved does not contain any synchronization at all. The code just adds an element to an ArrayList (assuming no bytecode manipulation has been done and nothing fishy is being done at low level). I.e., it would not make sense to see the thread waiting for a standard synchronization monitor at this call.
I found some, indications that threads can be shown as waiting for a monitor entry when there is a lot of GC going on.
I wrote a simple test program to try to reproduce it by just adding a lot of elements to an array list:
List<String> l = new ArrayList<>();
while (true) {
for (int i = 0; i < 100_100; i++) {
l.add("" + i);
}
l = new ArrayList<>();
}
Then I took thread dumps of this program. Occasionally I ran into the following trace:
"main" #1 prio=5 os_prio=0 tid=0x00007f35a8009000 nid=0x12448 waiting on condition [0x00007f35ac335000]
java.lang.Thread.State: RUNNABLE
at Foo.main(Foo.java:10) <--- Line of l.add()
While not identical to the OP's trace, it is interesting to have a thread waiting on condition when no synchronization is involved. I experienced it more frequently with a smaller heap, indicating that it might be GC related.
Another possibility could be that code that contains synchronization has been JIT compiled and that prevents you from seeing the actual address of the monitor. However, I think that is less likely since you experience it on ArrayList.add. If that is the case, I know of no way to find out the actual holder of the monitor.
If you don't have GC issues then may be actually there is some thread which has acquired lock on an object and stuck thread is waiting to acquire lock on the same object. The way to figure out is look for
- waiting to lock <some_hex_address> (a <java_class>)
example would be
- waiting to lock <0x00000000f139bb98> (a java.util.concurrent.ConcurrentHashMap)
in the thread dump for entry which says waiting for monitor entry. Once you have found it, you can search for thread that has already acquired lock on the object with address <some_hex_address>, it would look something like this for the example -
- locked <0x00000000f139bb98> (a java.util.concurrent.ConcurrentHashMap)
Now you can see the stacktrace of that thread to figure out which line of code has acquired it.
In Java, thread can have different state:
NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED
However, when the thread is blocked by IO, its state is "RUNNABLE". How can I tell if it is blocked by IO?
NEW: The thread is created but has not been processed yet.
RUNNABLE:
The thread is occupying the CPU and processing a task. (It may be in WAITING status due to the OS's resource distribution.)
BLOCKED: The thread is waiting for a different thread to release its lock in order to get the monitor lock. JVISULVM shows thta as Monitoring
WAITING: The thread is waiting by using a wait, join or park method.
TIMED_WAITING: The thread is waiting by using a sleep, wait, join or park method. (The difference from WAITING is that the maximum waiting time is specified by the method parameter, and WAITING can be relieved by time as well as external changes.)
TERMINATED: A thread that has exited is in this state.
see also http://architects.dzone.com/articles/how-analyze-java-thread-dumps
Thread Dump
Dumping java thread stack you can find something like that
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
or
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
and you can understand that java is waiting response.
I suggest this tool Java Thread Dump Analyser or this plug-in TDA
ThreadMXBean
Yiu can obtain more information using the ThreadMXBean
http://docs.oracle.com/javase/7/docs/api/java/lang/management/ThreadMXBean.html
You can check the statckTraces of the thread then find if the last stack is in some specific method associated with i/o blocking (eg: java.net.SocketInputStream.socketRead0)
This is not a clever way but it works.
JProfiler supports the feature you need, details show at: WHAT'S NEW IN JPROFILER 3.1
I have Java EE based application running on tomcat and I am seeing that all of a sudden the application hangs after running for couple of hours.
I collected the thread dump from the application just before it hangs and put it in TDA for analysis:
TDA (Thread Dump Analyzer) gives the following message for the above monitor:
A lot of threads are waiting for this monitor to become available again.
This might indicate a congestion. You also should analyze other locks
blocked by threads waiting for this monitor as there might be much more
threads waiting for it.
And here is the stacktrace of the thread highlighted above:
"MY_THREAD" prio=10 tid=0x00007f97f1918800 nid=0x776a
waiting for monitor entry [0x00007f9819560000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.Hashtable.get(Hashtable.java:356)
- locked <0x0000000680038b68> (a java.util.Properties)
at java.util.Properties.getProperty(Properties.java:951)
at java.lang.System.getProperty(System.java:709)
at com.MyClass.myMethod(MyClass.java:344)
I want to know what does the "waiting for monitor entry" state means? And also would appreciate any pointers to help me debug this issue.
One of your threads acquired a monitor object (an exclusive lock on a object). That means the thread is executing synchronized code and for whatever reason stuck there, possibly waiting for other threads. But the other threads cannot continue their execution because they encountered a synchronized block and asked for a lock (monitor object), however they cannot get it until it is released by other thread. So... probably deadlock.
Please look for this string from the whole thread dump
- locked <0x00007f9819560000>
If you can find it, the thread is deadlock with thread "tid=0x00007f97f1918800"
Monitor = synchronized. You have lots of threads trying to get the lock on the same object.
Maybe you should switch from using a Hashtable and use a HashMap
This means that your thread is trying to set a lock (on the Hashtable), but some other thread is already accessing it and has set a lock. So it's waiting for the lock to release. Check what your other threads are doing. Especially thread with tid="0x00007f9819560000"
In a Java threaddump I found the following:
"TP-Processor184" daemon prio=10 tid=0x00007f2a7c056800 nid=0x47e7 waiting for monitor entry [0x00007f2a21278000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getNonVirtualItemState(SharedItemStateManager.java:1725)
- locked <0x0000000682f99d98> (a org.apache.jackrabbit.core.state.SharedItemStateManager)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getItemState(SharedItemStateManager.java:257)
"TP-Processor137" daemon prio=10 tid=0x00007f2a7c00f800 nid=0x4131 waiting for monitor entry [0x00007f2a1ace7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getNonVirtualItemState(SharedItemStateManager.java:1725)
- locked <0x0000000682f99d98> (a org.apache.jackrabbit.core.state.SharedItemStateManager)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getItemState(SharedItemStateManager.java:257)
The point here being that both threads have locked monitor <0x0000000682f99d98> (regardless of them now waiting for two different other monitors).
When looking at Thread Dump Analyzer, with that monitor being selected, it really says "Threads locking monitor: 2" at the bottom, and "2 Thread(s) locking". Please see https://lh4.googleusercontent.com/-fCmlnohVqE0/T1D5lcPerZI/AAAAAAAAD2c/vAHcDiGOoMo/s971/locked_by_two_threads_3.png for the screenshot, I'm not allowed to paste images here.
Does this mean threaddumps aren't atomic with respect to monitor lock information? I can't imagine this really being a locking bug of the JVM (1.6.0_26-b03).
A similar question has already been asked in Can several threads hold a lock on the same monitor in Java?, but the answer to me didn't see the real point of multiple threads locking the same monitor, even though they may be waiting for some other.
Update May 13th 2014:
Newer question Multiple threads hold the same lock? has code to reproduce the behaviour, and #rsxg has filed an according bug report https://bugs.openjdk.java.net/browse/JDK-8036823 along the lines of his answer here.
I don't think that your thread dump is saying that your two threads are "waiting for two different other monitors". I think it is saying that they are both waiting on the same monitor but at two different code points. That may be a stack location or an object instance location or something. This is a great document about analyzing the stack dumps.
Can several threads hold a lock on the same monitor in Java?
No. Your stack dump is showing two threads locked on the same monitor at the same code location but in different stack frames -- or whatever that value is which seems OS dependent.
Edit:
I'm not sure why the thread dump seems to be saying that both threads have a line locked since that seems to only be allowed if they are in a wait() method. I noticed that you are linking to version 1.6.5. Is that really the version you are using? In version 2.3.6 (which may be the latest), the 1725 line actually is a wait.
1722 synchronized (this) {
1723 while (currentlyLoading.contains(id)) {
1724 try {
1725 wait();
1726 } catch (InterruptedException e) {
You could also see this sort of stack trace even if it was an exclusive synchronized lock. For example, the following stack dump under Linux is for two threads locked on the same object from the same code line but in two different instances of the Runnable.run() method. Here's my stupid little test program. Notice that the monitor entry numbers are different, even thought it is the same lock and same code line number.
"Thread-1" prio=10 tid=0x00002aab34055c00 nid=0x4874
waiting for monitor entry [0x0000000041017000..0x0000000041017d90]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aab072a1318> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <0x00002aab072a1318> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:619)
"Thread-0" prio=10 tid=0x00002aab34054c00 nid=0x4873
waiting for monitor entry [0x0000000040f16000..0x0000000040f16d10]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aab072a1318> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <0x00002aab072a1318> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:619)
On my Mac, the format is different but again the number after the "monitor entry" is not the same for the same line number.
"Thread-2" prio=5 tid=7f8b9c00d000 nid=0x109622000
waiting for monitor entry [109621000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <7f3192fb0> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <7f3192fb0> (a java.lang.Object)
"Thread-1" prio=5 tid=7f8b9f80d800 nid=0x10951f000
waiting for monitor entry [10951e000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <7f3192fb0> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <7f3192fb0> (a java.lang.Object)
This Oracle document describe that value as the following:
Address range, which gives an estimate of the valid stack region for the thread
You are probably running into a cosmetic bug in the stack trace routines in the JVM when analyzing heavily contended locks - it may or may not be the same as this bug.
The fact is that neither of your two threads have actually managed to acquire the lock on the SharedItemStateManager, as you can see from the fact that they are reporting waiting for monitor entry. The bug is that further up in the stack trace in both cases they should report waiting to lock instead of locked.
The workaround when analyzing strange stack traces like this is to always check that a thread claiming to have locked an object is not also waiting to acquire a lock on the same object.
(Unfortunately this analysis requires cross-referencing the line numbers in the stack trace with the source, code since there is no relationship between the figures in the waiting for monitor entry header and the locked line in the stack trace. As per this Oracle document, the number 0x00007f2a21278000 in the line TP-Processor184" daemon prio=10 tid=0x00007f2a7c056800 nid=0x47e7 waiting for monitor entry [0x00007f2a21278000] refers to an estimate of the valid stack region for the thread. So it looks like a monitor ID but it isn't - and you can see that the two threads you gave are at different addresses in the stack).
When a thread locks an object but wait()s another thread can lock the same object. You should be able to see a number of threads "holding" the same lock all waiting.
AFAIK, the only other occasion is when multiple threads have locked and waited and are ready to re-acquire the the lock e.g. on a notifyAll(). They are not waiting any more but cannot continue until they have obtained the lock again. (only one thread at a time time can do this)
"http-0.0.0.0-8080-96" daemon prio=10 tid=0x00002abc000a8800 nid=0x3bc4 waiting for monitor entry [0x0000000050823000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:195)
- locked <0x00002aadae12c048> (a java.util.WeakHashMap)
"http-0.0.0.0-8080-289" daemon prio=10 tid=0x00002abc00376800 nid=0x2688 waiting for monitor entry [0x000000005c8e3000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:195)
- locked <0x00002aadae12c048> (a java.util.WeakHashMap
"http-0.0.0.0-8080-295" daemon prio=10 tid=0x00002abc00382800 nid=0x268e runnable [0x000000005cee9000]
java.lang.Thread.State: RUNNABLE
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:195)
- locked <0x00002aadae12c048> (a java.util.WeakHashMap)
In our thread dump, we have several threads lock same monitor, but only one thread is runnable. It probably because of lock competition, we have 284 other threads waiting for the lock. Multiple threads hold the same lock? said this only exists in the thread dump, for thread dump is not atomic operation.