Multiple Java threads seemingly locking same monitor? - java

In a Java threaddump I found the following:
"TP-Processor184" daemon prio=10 tid=0x00007f2a7c056800 nid=0x47e7 waiting for monitor entry [0x00007f2a21278000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getNonVirtualItemState(SharedItemStateManager.java:1725)
- locked <0x0000000682f99d98> (a org.apache.jackrabbit.core.state.SharedItemStateManager)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getItemState(SharedItemStateManager.java:257)
"TP-Processor137" daemon prio=10 tid=0x00007f2a7c00f800 nid=0x4131 waiting for monitor entry [0x00007f2a1ace7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getNonVirtualItemState(SharedItemStateManager.java:1725)
- locked <0x0000000682f99d98> (a org.apache.jackrabbit.core.state.SharedItemStateManager)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getItemState(SharedItemStateManager.java:257)
The point is that both threads report having locked monitor <0x0000000682f99d98> (regardless of the fact that they now appear to be waiting for two different other monitors).
In Thread Dump Analyzer, with that monitor selected, it really does say "Threads locking monitor: 2" at the bottom, and "2 Thread(s) locking". See https://lh4.googleusercontent.com/-fCmlnohVqE0/T1D5lcPerZI/AAAAAAAAD2c/vAHcDiGOoMo/s971/locked_by_two_threads_3.png for the screenshot; I'm not allowed to paste images here.
Does this mean thread dumps aren't atomic with respect to monitor lock information? I can't imagine this really being a locking bug in the JVM (1.6.0_26-b03).
A similar question has already been asked in Can several threads hold a lock on the same monitor in Java?, but to me the answer there missed the real point: multiple threads appear to have locked the same monitor, even though each may be waiting for some other one.
Update May 13th 2014:
The newer question Multiple threads hold the same lock? has code to reproduce the behaviour, and #rsxg has filed a corresponding bug report, https://bugs.openjdk.java.net/browse/JDK-8036823, along the lines of his answer here.

I don't think that your thread dump is saying that your two threads are "waiting for two different other monitors". I think it is saying that they are both waiting on the same monitor but at two different code points. That may be a stack location or an object instance location or something. This is a great document about analyzing the stack dumps.
Can several threads hold a lock on the same monitor in Java?
No. Your stack dump is showing two threads locked on the same monitor at the same code location but in different stack frames -- or whatever that value is which seems OS dependent.
Edit:
I'm not sure why the thread dump seems to be saying that both threads have a line locked since that seems to only be allowed if they are in a wait() method. I noticed that you are linking to version 1.6.5. Is that really the version you are using? In version 2.3.6 (which may be the latest), the 1725 line actually is a wait.
1722 synchronized (this) {
1723 while (currentlyLoading.contains(id)) {
1724 try {
1725 wait();
1726 } catch (InterruptedException e) {
You could also see this sort of stack trace even if it was an exclusive synchronized lock. For example, the following stack dump under Linux is for two threads locked on the same object from the same code line but in two different instances of the Runnable.run() method. Here's my stupid little test program. Notice that the monitor entry numbers are different, even though it is the same lock and the same code line number.
"Thread-1" prio=10 tid=0x00002aab34055c00 nid=0x4874
waiting for monitor entry [0x0000000041017000..0x0000000041017d90]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aab072a1318> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <0x00002aab072a1318> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:619)
"Thread-0" prio=10 tid=0x00002aab34054c00 nid=0x4873
waiting for monitor entry [0x0000000040f16000..0x0000000040f16d10]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aab072a1318> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <0x00002aab072a1318> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:619)
On my Mac, the format is different but again the number after the "monitor entry" is not the same for the same line number.
"Thread-2" prio=5 tid=7f8b9c00d000 nid=0x109622000
waiting for monitor entry [109621000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <7f3192fb0> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <7f3192fb0> (a java.lang.Object)
"Thread-1" prio=5 tid=7f8b9f80d800 nid=0x10951f000
waiting for monitor entry [10951e000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <7f3192fb0> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <7f3192fb0> (a java.lang.Object)
This Oracle document describes that value as follows:
Address range, which gives an estimate of the valid stack region for the thread

You are probably running into a cosmetic bug in the stack trace routines in the JVM when analyzing heavily contended locks - it may or may not be the same as this bug.
The fact is that neither of your two threads has actually managed to acquire the lock on the SharedItemStateManager, as you can see from the fact that they are reporting waiting for monitor entry. The bug is that further up in the stack trace, in both cases, they should report waiting to lock instead of locked.
The workaround when analyzing strange stack traces like this is to always check that a thread claiming to have locked an object is not also waiting to acquire a lock on the same object.
(Unfortunately this analysis requires cross-referencing the line numbers in the stack trace with the source code, since there is no relationship between the figures in the waiting for monitor entry header and the locked line in the stack trace. As per this Oracle document, the number 0x00007f2a21278000 in the line "TP-Processor184" daemon prio=10 tid=0x00007f2a7c056800 nid=0x47e7 waiting for monitor entry [0x00007f2a21278000] refers to an estimate of the valid stack region for the thread. So it looks like a monitor ID but it isn't, and you can see that the two threads you gave are at different addresses in the stack.)
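If code can be run inside the affected JVM, the ownership question can also be cross-checked without reading the dump text at all. The following is a minimal sketch, not part of the original answer, that uses the standard java.lang.management API; the class name LockOwnerCheck and the thread names "holder" and "contender" are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: asks the JVM who owns the monitor a BLOCKED thread wants.
class LockOwnerCheck {
    static String ownerOfContendedLock() {
        final Object lock = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // The holder takes the monitor and keeps it until told to let go.
        Thread holder = new Thread(() -> {
            synchronized (lock) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");

        // The contender can only block at monitor entry.
        Thread contender = new Thread(() -> {
            synchronized (lock) {
                // nothing to do once the monitor is finally acquired
            }
        }, "contender");

        try {
            holder.start();
            held.await();
            contender.start();
            while (contender.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10);
            }
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            ThreadInfo info = mx.getThreadInfo(contender.getId());
            // Name of the thread that owns the monitor the contender is blocked on.
            String owner = info.getLockOwnerName();
            release.countDown();
            holder.join();
            contender.join();
            return owner;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling LockOwnerCheck.ownerOfContendedLock() should report the holder thread as the owner, while the contender is only blocked at monitor entry and owns nothing.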

When a thread locks an object but then calls wait(), another thread can lock the same object. You should be able to see a number of threads "holding" the same lock, all waiting.
AFAIK, the only other occasion is when multiple threads have locked and waited and are ready to re-acquire the lock, e.g. on a notifyAll(). They are not waiting any more but cannot continue until they have obtained the lock again (only one thread at a time can do this).
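The first case can be reproduced with a small sketch. It is only an illustration under assumed names (WaitReleasesMonitorDemo, "waiter-a", "waiter-b"): two threads enter synchronized on the same object and call wait(), which is only possible because wait() releases the monitor. In a thread dump both would show a locked line for the same object.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: two threads sit inside synchronized(lock) at the same time
// because both have released the monitor through wait().
class WaitReleasesMonitorDemo {
    static Thread.State[] statesWhileWaiting() {
        final Object lock = new Object();
        final boolean[] released = {false};          // guarded by lock
        final CountDownLatch inside = new CountDownLatch(2);

        Runnable body = () -> {
            synchronized (lock) {
                inside.countDown();                  // reached by both threads
                try {
                    while (!released[0]) {           // loop guards against spurious wakeups
                        lock.wait();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };

        Thread a = new Thread(body, "waiter-a");
        Thread b = new Thread(body, "waiter-b");
        try {
            a.start();
            b.start();
            inside.await();                          // both threads entered the synchronized block
            Thread.State sa = a.getState();
            Thread.State sb = b.getState();
            while (sa != Thread.State.WAITING || sb != Thread.State.WAITING) {
                Thread.sleep(10);
                sa = a.getState();
                sb = b.getState();
            }
            synchronized (lock) {
                released[0] = true;
                lock.notifyAll();                    // waiters now re-acquire the monitor one at a time
            }
            a.join();
            b.join();
            return new Thread.State[] {sa, sb};
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both returned states should be WAITING, which matches the "holding the same lock, all waiting" picture described above.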

"http-0.0.0.0-8080-96" daemon prio=10 tid=0x00002abc000a8800 nid=0x3bc4 waiting for monitor entry [0x0000000050823000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:195)
- locked <0x00002aadae12c048> (a java.util.WeakHashMap)
"http-0.0.0.0-8080-289" daemon prio=10 tid=0x00002abc00376800 nid=0x2688 waiting for monitor entry [0x000000005c8e3000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:195)
- locked <0x00002aadae12c048> (a java.util.WeakHashMap)
"http-0.0.0.0-8080-295" daemon prio=10 tid=0x00002abc00382800 nid=0x268e runnable [0x000000005cee9000]
java.lang.Thread.State: RUNNABLE
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:195)
- locked <0x00002aadae12c048> (a java.util.WeakHashMap)
In our thread dump, several threads report having locked the same monitor, but only one of them is runnable. This is probably caused by lock contention: we have 284 other threads waiting for the lock. Multiple threads hold the same lock? says that this exists only in the thread dump, because taking a thread dump is not an atomic operation.

Related

Why Throwable::printStackTrace holding lock of PrintStream and cause deadlock of logback

Found a deadlock situation when using e.printStackTrace() and logback in different threads. The thread dumps are given below.
It seems to me that logback (used in thread AsyncAppender-Worker-Thread-1) is trying to acquire the lock on the PrintStream, which is already owned by the main thread's java.lang.Throwable$WrappedPrintStream.println. If that's the case, why does printStackTrace keep holding the lock on the PrintStream (it should release it once the printing is done)?
Thread dump For the main thread.
"main#1" prio=5 tid=0x1 nid=NA waiting
java.lang.Thread.State: WAITING
blocks AsyncAppender-Worker-Thread-1#831
at sun.misc.Unsafe.park(Unsafe.java:-1)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
at ch.qos.logback.core.AsyncAppenderBase.put(AsyncAppenderBase.java:139)
at ch.qos.logback.core.AsyncAppenderBase.append(AsyncAppenderBase.java:130)
at ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:88)
at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:48)
at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:273)
at ch.qos.logback.classic.Logger.callAppenders(Logger.java:260)
at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:442)
at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:396)
at ch.qos.logback.classic.Logger.error(Logger.java:543)
at com.side.stdlib.logging.StdOutErrLog$2.print(StdOutErrLog.java:43)
at java.io.PrintStream.println(PrintStream.java:823)
- locked <0x1183> (a com.side.stdlib.logging.StdOutErrLog$2)
at java.lang.Throwable$WrappedPrintStream.println(Throwable.java:749)
at java.lang.Throwable.printEnclosedStackTrace(Throwable.java:698)
at java.lang.Throwable.printStackTrace(Throwable.java:668)
at java.lang.Throwable.printStackTrace(Throwable.java:644)
at java.lang.Throwable.printStackTrace(Throwable.java:635)
at com.side.SidekApi.sideAPIExecution(SidekApi.java:175)
Thread dump for the thread AsyncAppender-Worker-Thread-1
"AsyncAppender-Worker-Thread-1#831" daemon prio=5 tid=0xe nid=NA waiting for monitor entry
java.lang.Thread.State: BLOCKED
waiting for main#1 to release lock on <0x1183> (a com.side.stdlib.logging.StdOutErrLog$2)
at java.io.PrintStream.write(PrintStream.java:478)
at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
at ch.qos.logback.core.joran.spi.ConsoleTarget$2.write(ConsoleTarget.java:55)
at ch.qos.logback.core.encoder.LayoutWrappingEncoder.doEncode(LayoutWrappingEncoder.java:135)
at ch.qos.logback.core.OutputStreamAppender.writeOut(OutputStreamAppender.java:194)
at ch.qos.logback.core.OutputStreamAppender.subAppend(OutputStreamAppender.java:219)
at ch.qos.logback.core.OutputStreamAppender.append(OutputStreamAppender.java:103)
at ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:88)
at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:48)
at ch.qos.logback.core.AsyncAppenderBase$Worker.run(AsyncAppenderBase.java:226)
It seems the situation is a bit similar with https://bugs.openjdk.java.net/browse/JDK-6719464, but no answer there.
If the main thread can't finish its put, it must be because logback's blocking queue is full. The main thread is waiting to deposit its log entry, and since that thread is WAITING we know it released the lock on the queue, but it still holds the lock on the PrintStream. The worker thread that writes to the console is BLOCKED trying to acquire the lock on the PrintStream, which it needs in order to write to the console, so they are deadlocked.
A minimal fix that avoids code changes could be swapping out the console appender for one that doesn't need to acquire a lock on the printstream.
In any case needing to take the lock on the printstream probably reduces the benefit from logging asynchronously. The long term fix will involve replacing the printlns with calls to a logger (like slf4j).
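The shape of this deadlock can be sketched in isolation. The code below is an assumption-laden illustration, not logback's code: the class name HeldLockFullQueueDemo, the stream object and the event strings are hypothetical, and a timed offer() stands in for the final put() so that the demo terminates instead of hanging.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: a producer holds a monitor while feeding a bounded queue
// whose only consumer needs that same monitor to make progress.
class HeldLockFullQueueDemo {
    static boolean thirdOfferSucceeded() {
        final Object stream = new Object();                          // stands in for the PrintStream
        final BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);

        Thread worker = new Thread(() -> {
            try {
                for (int i = 0; i < 2; i++) {
                    queue.take();
                    synchronized (stream) {
                        // "write to the console": needs the monitor the producer holds
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "worker");
        worker.setDaemon(true);
        worker.start();

        boolean third;
        try {
            synchronized (stream) {
                queue.put("event-1");   // taken by the worker, which then blocks on stream
                queue.put("event-2");   // fills the queue; nobody can drain it now
                // With put() this call would never return. The timed offer gives up instead.
                third = queue.offer("event-3", 200, TimeUnit.MILLISECONDS);
            }
            worker.join(2000);          // once the monitor is released the worker can finish
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return third;
    }
}
```

The third offer times out because the queue is full and its consumer is blocked on the monitor held by the producer. Replacing it with put() gives the WAITING/BLOCKED pair seen in the two dumps above.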

How to match theoretical thread states and states showed by jvisualvm

If we google something like 'java thread state' we will see approximately this diagram:
But if we open jVisualVM we will see the following:
Can you help to match these diagrams?
Is the Sleeping state just Thread.sleep()? Is it a special case of Running?
What is the Park state? (I tried to google it but got confused, because before this I knew only the first diagram.)
The diagram represents java.lang.Thread.State enum. The Javadoc is quite helpful to get an understanding of the mapping you seek.
The JVisualVM states represent the extra state description you would see in a thread dump, e.g.:
"Finalizer" daemon prio=8 tid=0x022f4000 nid=0xd14 in Object.wait() [0x044cf000]
java.lang.Thread.State: WAITING (on object monitor)
So you could decipher the states on your own: take a thread dump and, matching threads by name, compare the state shown in JVisualVM with the state in the dump.
Here is the mapping you want:
Running -> java.lang.Thread.State: RUNNABLE
Sleeping -> java.lang.Thread.State: TIMED_WAITING (sleeping)
Wait -> java.lang.Thread.State: WAITING or TIMED_WAITING (on object monitor)
Park -> java.lang.Thread.State: WAITING or TIMED_WAITING (parking)
Monitor -> java.lang.Thread.State: BLOCKED (on object monitor)
The Park state is a special case of WAITING or TIMED_WAITING. The difference from Wait is that Wait happens on an object monitor (i.e. Object.wait() within a synchronized block). The Park, on the other hand, removes a thread from scheduling via Unsafe.park without any need of holding a monitor (i.e. it doesn't need a synchronized block).
park is
sun.misc.Unsafe.park()
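The mapping can be observed directly with Thread.getState(). This is a minimal sketch under assumed names (ThreadStateDemo, observe, "probe"); it uses LockSupport.park(), the public wrapper around Unsafe.park, and interrupts each probe thread once its state has been read.

```java
import java.util.concurrent.locks.LockSupport;

// Hypothetical demo class: reports the java.lang.Thread.State of a thread that is
// parked, sleeping, or waiting on an object monitor.
class ThreadStateDemo {
    // Runs the body in a new thread, waits until it is (TIMED_)WAITING, then interrupts it.
    static Thread.State observe(Runnable blockingBody) {
        Thread t = new Thread(blockingBody, "probe");
        t.setDaemon(true);
        t.start();
        try {
            Thread.State s = t.getState();
            while (s != Thread.State.WAITING
                    && s != Thread.State.TIMED_WAITING
                    && s != Thread.State.TERMINATED) {
                Thread.sleep(10);
                s = t.getState();
            }
            t.interrupt();
            t.join();
            return s;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    // "Park" in JVisualVM: no monitor is needed.
    static Thread.State parked() {
        return observe(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                LockSupport.park();
            }
        });
    }

    // "Sleeping" in JVisualVM.
    static Thread.State sleeping() {
        return observe(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // expected: the probe is interrupted once its state has been read
            }
        });
    }

    // "Wait" in JVisualVM: Object.wait() inside a synchronized block.
    static Thread.State waitingOnMonitor() {
        final Object lock = new Object();
        return observe(() -> {
            synchronized (lock) {
                try {
                    while (true) {
                        lock.wait();
                    }
                } catch (InterruptedException e) {
                    // expected: the probe is interrupted once its state has been read
                }
            }
        });
    }
}
```

parked() and waitingOnMonitor() should both report WAITING, and sleeping() should report TIMED_WAITING. The timed variants (LockSupport.parkNanos, wait(timeout)) would show up as TIMED_WAITING instead.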

How to find out which thread holds the monitor?

My application is using Gson 2.2 for converting POJOs to JSON. When I was making a load test I stumbled upon a lot of threads blocked in Gson constructor:
"http-apr-28201-exec-28" #370 daemon prio=5 os_prio=0 tid=0x0000000001ee7800 nid=0x62cb waiting for monitor entry [0x00007fe64df9a000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.google.gson.Gson.<init>(Gson.java:200)
at com.google.gson.Gson.<init>(Gson.java:179)
Thread dump does NOT show any threads holding [0x00007fe64df9a000] monitor.
How can I find out who holds it?
Gson code at line 200 looks pretty innocent:
// built-in type adapters that cannot be overridden
factories.add(TypeAdapters.STRING_FACTORY);
factories.add(TypeAdapters.INTEGER_FACTORY);
I'm using JRE 1.8.0_91 on Linux
tl;dr I think you are running into GC-related behavior, where threads are being put in waiting state to allow for garbage collection.
I do not have the whole truth but I hope to provide some pieces of insight.
First thing to realize is that the number in brackets, [0x00007fe64df9a000], is not the address of a monitor. The number in brackets can be seen for all threads in a dump, even threads that are in running state. The number also does not change. Example from my test dump:
"main" #1 prio=5 os_prio=0 tid=0x00007fe27c009000 nid=0x27e5c runnable [0x00007fe283bc2000]
java.lang.Thread.State: RUNNABLE
at Foo.main(Foo.java:12)
I am not sure what the number means, but this page hints that it is:
... the pointer to the Java VM internal thread structure. It is generally of no interest unless you are debugging a live Java VM or core file.
Although the format of the trace explained there is a bit different so I am not sure I am correct.
The way a dump looks when the address of the actual monitor is shown:
"qtp48612937-70" #70 prio=5 os_prio=0 tid=0x00007fbb845b4800 nid=0x133c waiting for monitor entry [0x00007fbad69e8000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:233)
- waiting to lock <0x00000005b8d68e90> (a java.lang.Object)
Notice the waiting to lock line in the trace and that the address of the monitor is different from the number in brackets.
The fact that we cannot see the address of the monitor involved indicates that the monitor exists only in native code.
Secondly, the Gson code involved does not contain any synchronization at all. The code just adds an element to an ArrayList (assuming no bytecode manipulation has been done and nothing fishy is being done at low level). I.e., it would not make sense to see the thread waiting for a standard synchronization monitor at this call.
I found some indications that threads can be shown as waiting for a monitor entry when there is a lot of GC going on.
I wrote a simple test program to try to reproduce it by just adding a lot of elements to an array list:
List<String> l = new ArrayList<>();
while (true) {
for (int i = 0; i < 100_100; i++) {
l.add("" + i);
}
l = new ArrayList<>();
}
Then I took thread dumps of this program. Occasionally I ran into the following trace:
"main" #1 prio=5 os_prio=0 tid=0x00007f35a8009000 nid=0x12448 waiting on condition [0x00007f35ac335000]
java.lang.Thread.State: RUNNABLE
at Foo.main(Foo.java:10) <--- Line of l.add()
While not identical to the OP's trace, it is interesting to have a thread waiting on condition when no synchronization is involved. I experienced it more frequently with a smaller heap, indicating that it might be GC related.
Another possibility could be that code that contains synchronization has been JIT compiled and that prevents you from seeing the actual address of the monitor. However, I think that is less likely since you experience it on ArrayList.add. If that is the case, I know of no way to find out the actual holder of the monitor.
If you don't have GC issues, then maybe some thread really has acquired a lock on an object and the stuck thread is waiting to acquire a lock on the same object. The way to figure it out is to look for
- waiting to lock <some_hex_address> (a <java_class>)
for example
- waiting to lock <0x00000000f139bb98> (a java.util.concurrent.ConcurrentHashMap)
in the thread dump entry that says waiting for monitor entry. Once you have found it, search for the thread that has already acquired the lock on the object with address <some_hex_address>. For the example it would look something like this:
- locked <0x00000000f139bb98> (a java.util.concurrent.ConcurrentHashMap)
Now you can look at the stack trace of that thread to figure out which line of code acquired the lock.
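The manual search described above can be automated with a few lines of text processing. This is a rough sketch that assumes the HotSpot text format shown in this thread (a quoted thread name on the header line, then "- locked <...>" or "- waiting to lock <...>" lines); the class name DumpLockIndex and method threadsBy are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: indexes a textual thread dump by lock address.
class DumpLockIndex {
    // Thread header lines start with the quoted thread name.
    private static final Pattern HEADER = Pattern.compile("^\"([^\"]+)\".*");
    // Lock lines look like "- locked <addr> (a Type)" or "- waiting to lock <addr> (a Type)".
    private static final Pattern LOCK =
            Pattern.compile("^\\s*- (locked|waiting to lock) <([^>]+)>.*");

    // kind is either "locked" or "waiting to lock"; the result maps address -> thread names.
    static Map<String, List<String>> threadsBy(String dump, String kind) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        String current = null;
        for (String line : dump.split("\\R")) {
            Matcher header = HEADER.matcher(line);
            if (header.matches()) {
                current = header.group(1);
                continue;
            }
            Matcher lock = LOCK.matcher(line);
            if (current != null && lock.matches() && lock.group(1).equals(kind)) {
                result.computeIfAbsent(lock.group(2), k -> new ArrayList<>()).add(current);
            }
        }
        return result;
    }
}
```

For an address found under "waiting to lock", looking the same address up in the "locked" index gives the candidate owners. As the first question in this thread shows, a thread whose header says waiting for monitor entry may be listed under locked even though it never acquired the monitor, so such candidates still need a look at their headers.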

what does the lock id mean in java thread dump

I usually observed the lock id (as following) in the thread dump:
"Thread-pool-Bill" - Thread t#42
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
- locked <79f0aad8> (a java.net.SocksSocketImpl)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
In locked <79f0aad8> (a java.net.SocksSocketImpl), what does 79f0aad8 mean? It doesn't seem to be an object address or an object id, because I couldn't find it in the heap dump. So what is it?
It's the address of the internal lock construct (monitor object) in the HotSpot JVM.
Code for your reference:
oop o = _locked_monitors->at(i);
instanceKlass* ik = instanceKlass::cast(o->klass());
st->print_cr("\t- locked <" INTPTR_FORMAT "> (a %s)", (address)o, ik->external_name());
You can consider it an opaque identifier. The key is that other pieces of code may be waiting on that particular lock, which will also show up in the thread dump. So you can use it to match up waits and locks between threads. It may also show up in the JVM's thread dump deadlock detection output, but I'm not sure.
The entry appears in the stack trace from either a synchronized method or an explicit synchronized block, and is related to the object synchronized on.
It's the hashcode/id of the object on which the thread owns the lock. Since your thread is RUNNABLE, some other thread may or may not be waiting on it. If another thread is waiting for this lock, then your thread dump would have another entry somewhere else showing a BLOCKED or WAITING thread that is waiting for a lock on the same id, i.e. 79f0aad8.
But since you say you cannot find this string (79f0aad8) anywhere else in your file, either:
you have multiple thread dump files or
the thread Thread t#42 wasn't blocking any other thread when the thread dump was taken.

What does thread dump looks like when JVM spent time in GC

When profiling a Java application I noticed an interesting fact. When the JVM is in a GC spiral of death, the thread dump looks like this:
"1304802943#qtp-393978767-9985" prio=10 tid=0x00007f3ed02dd000 nid=0x74e7 in Object.wait() [0x000000004febb000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)
- locked <0x00000007aed40048> (a org.mortbay.thread.QueuedThreadPool$PoolThread)
"26774405#qtp-393978767-9984" prio=10 tid=0x00007f3ee4b37000 nid=0x74e6 in Object.wait() [0x0000000045d1a000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)
- locked <0x00000007aed83aa0> (a org.mortbay.thread.QueuedThreadPool$PoolThread)
"764808089#qtp-393978767-9983" prio=10 tid=0x00007f3ee4c50000 nid=0x74e5 in Object.wait() [0x000000004ad6a000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)
- locked <0x00000007aed5c448> (a org.mortbay.thread.QueuedThreadPool$PoolThread)
So, there are a lot of threads in the TIMED_WAITING state. Theoretically this situation could easily be found in a normally functioning application (the application simply doesn't have any incoming requests at the moment), but I can't find even a single request-dispatching thread doing something useful (the nominal hit rate is about 100 hps).
Does this behavior have something to do with GC, or is it just a coincidence?
Answering just the question's title:
What does thread dump look like when JVM spent time in GC?
The answer is: you have no means to obtain such a dump (in the usual way).
The JVM processes a thread dump request only after reaching a safepoint, which just can't happen while in GC.
But there is a trick to obtain a thread dump during an active GC with the help of the undocumented JVMTI function AsyncGetCallTrace, which is mentioned in this post:
http://jeremymanson.blogspot.com/2010/07/why-many-profilers-have-serious.html
It also hints that Oracle Solaris Studio can be used to take such mixed native/java thread dumps.
Try a jmap -histo:live over time, you can compare output, see which object types are growing.
You need to have the JDK installed for jmap.
http://docs.oracle.com/javase/6/docs/technotes/tools/share/jmap.html
A warning: jmap is intensive. It will pause all threads while it's running, which should only be a few seconds. Processes can core dump because it's intensive. Generally it's quick and safe, but I have seen it lock up or kill large applications with multi-gigabyte heaps.
My guess is you have a thread pool which is waiting for something to do. If your process is efficient and you have even 100 requests per second, you may have trouble catching even one thread doing something. I suggest you look at the CPU load of your process. If it's 50%, you have a 50% chance of finding one thread (possibly not a request thread) doing something.
If you want to see what your server spends its time doing, I would try a profiler like VisualVM, or a commercial profiler like YourKit.
Doing a Google search for your code, I found a different version, http://grepcode.com/file/repo1.maven.org/maven2/org.mortbay.jetty/jetty-util/7.0.0.pre5/org/mortbay/thread/QueuedThreadPool.java; however, I suspect your threads are TIMED_WAITING in this block in the run() method:
// We are idle
// wait for a dispatched job
synchronized (this)
{
if (_job==null)
this.wait(getMaxIdleTimeMs());
job=_job;
_job=null;
}
