understanding line from java thread dump - java

I have the following thread dump I get using jstack and would like to know what the hex value next to the word runnable shows. I have seen the same value used in other places showing as:
waiting on condition [0x00000000796e9000]
Does this mean the other threads are waiting on this thread?
runnable [0x00000000796e9000]
thread dump
"ajp-bio-8009-exec-2925" daemon prio=10 tid=0x0000000015ca7000 nid=0x53c7 runnable [0x00000000796e9000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)

I have the following thread dump I get using jstack and would like to
know what the hex value next to the word runnable shows. I have seen
the same value used in other places showing as:
waiting on condition [0x00000000796e9000]
Does this mean the other threads are waiting on this thread?
Yes. This indicates one thread holds a lock and another thread is waiting to obtain that lock. This is fairly similar to the synchronized keyword conceptually, but can be quite a bit more powerful (and complicated).
Take a look at the javadoc for condition to get a better understanding of conditions.
This question/answer gives a description of the attributes in a thread dump (for java 6).

Related

Is this a deadlock when using ReentrantReadWriteLock?

I have an application with 1 writer thread and 8 reader threads accessing a shared resource, which is behind a ReentrantReadWriteLock. It froze for about an hour, producing no log output and not responding to requests. This is on Java 8.
Before killing it someone took thread dumps, which look like this:
Writer thread:
"writer-0" #83 prio=5 os_prio=0 tid=0x00007f899c166800 nid=0x2b1f waiting on condition [0x00007f898d3ba000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000002b8dd4ea8> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
Reader:
"reader-1" #249 daemon prio=5 os_prio=0 tid=0x00007f895000c000 nid=0x33d6 waiting on condition [0x00007f898edcf000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000002b8dd4ea8> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
This looks like a deadlock, however there are a couple of things that make me doubt that:
I can't find another thread that could possibly be holding the same lock
Taking a thread dump 4 seconds later yealds the same result, but all threads now report parking to wait for <0x00000002a7daa878>, which is different than 0x00000002b8dd4ea8 in the first dump.
Is this a deadlock? I see that there is some change in the threads' state, but it could only be internal to the lock implementation. What else could be causing this behaviour?
It turned out it was a deadlock. The thread holding the lock was not reported as holding any locks in the thread dump, which made it difficult to diagnose.
The only way to understand that was to inspect a heap dump of the application. For those interested in how, here's the process step-by-step:
A heap dump was taken at roughly the same time as the thread dumps.
I opened it using Java VisualVM, which comes with JDK.
In the "Classes" view I filtered by the class name of the class that contains the lock as a field.
I double-clicked on the class to be taken to the "Instances" view
Thankfully, there were only a few instances of that class, so I was able to find the one that was causing problems.
I inspected the ReentrantReadWriteLock object kept in a field in the class. In particular the sync field of that lock keeps its state - in this case it was ReentrantReadWriteLock$FairSync.
The state property of it was 65536. This represents both the number of shared and exclusive holds of the lock. The shared holds count is stored in the first 16 bits of the state, and is retrieved as state >>> 16. The exclusive holds count is in the last 16 bits, and is retrieved as state & ((1 << 16) - 1). From this we can see that there's 1 shared hold and 0 exclusive holds on the lock.
You can see the threads waiting for the lock in the head field. It is a queue, with thread containing the waiting thread, and next containing the next node in the queue. Going through it I found the writer-0 and 7 of the 8 reader-n threads, confirming what we know from the thread dump.
The firstReader field of the sync object contains the thread that have acquired the read log - from the comment in the code firstReader is the first thread to have acquired the read lock. firstReaderHoldCount is firstReader's hold count.More precisely, firstReader is the unique thread that last changed the shared count from 0 to 1, and has not released theread lock since then; null if there is no such thread.
In this case the thread holding the lock was one of the reader threads. It was blocked on something entirely different, which would have required one of the other reader threads to progress. Ultimately it was caused by a bug, in which a reader thread would not properly release the lock, and keep it forever. That I found by analyzing the code, and adding tracking and logging when the lock was acquired and released.
Is this a deadlock?
I don't think this is evidence of a deadlock. At least, not in the classic sense of the term.
The stack dump shows two threads waiting on the same ReentrantReadWriteLock. One thread is waiting to acquire the read lock. The other is waiting to acquire the write lock.
Now if no threads currently holds any locks, then one of these threads would be able to proceed.
If some other thread currently held the write lock, that would be sufficient to block both of these threads. But that isn't a deadlock. It would only be a deadlock if that third thread was itself waiting on a different lock ... and there was a circularity in the blocking.
So what about the possibility of these two threads blocking each other? I don't think that is possible. The reentrancy rules in the javadocs allow a thread that has the write lock to acquire the read lock without blocking. Likewise it can acquire the write lock it it already holds it.
The other piece of evidence is that things have changed in the thread dump you took a bit later. If there was a genuine deadlock, there would be no change.
If it is not a deadlock between (just) these two threads, what else could it be?
One possibility is that a third thread is holding the write lock (for a long time) and that is gumming things up. Just too much contention on this readwrite lock.
If the (assumed) third thread is using tryLock, it is possible that you have a livelock ... which could explain the "change" evidence. But on the flip-side, that thread should have been parked too ... which you say that you don't see.
Another possibility is that you have too many active threads ... and the OS is struggling to schedule them to cores.
But this is all speculation.

How to find out which thread holds the monitor?

My application is using Gson 2.2 for converting POJOs to JSON. When I was making a load test I stumbled upon a lot of threads blocked in Gson constructor:
"http-apr-28201-exec-28" #370 daemon prio=5 os_prio=0 tid=0x0000000001ee7800 nid=0x62cb waiting for monitor entry [0x00007fe64df9a000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.google.gson.Gson.<init>(Gson.java:200)
at com.google.gson.Gson.<init>(Gson.java:179)
Thread dump does NOT show any threads holding [0x00007fe64df9a000] monitor.
How can I find out who holds it?
Gson code at line 200 looks pretty innocent:
// built-in type adapters that cannot be overridden
factories.add(TypeAdapters.STRING_FACTORY);
factories.add(TypeAdapters.INTEGER_FACTORY);
I'm using JRE 1.8.0_91 on Linux
tl;dr I think you are running into GC-related behavior, where threads are being put in waiting state to allow for garbage collection.
I do not have the whole truth but I hope to provide some pieces of insight.
First thing to realize is that the number in brackets, [0x00007fe64df9a000], is not the address of a monitor. The number in brackets can be seen for all threads in a dump, even threads that are in running state. The number also does not change. Example from my test dump:
main" #1 prio=5 os_prio=0 tid=0x00007fe27c009000 nid=0x27e5c runnable [0x00007fe283bc2000]
java.lang.Thread.State: RUNNABLE
at Foo.main(Foo.java:12)
I am not sure what the number means, but this page hints that it is:
... the pointer to the Java VM internal thread structure. It is generally of no interest unless you are debugging a live Java VM or core file.
Although the format of the trace explained there is a bit different so I am not sure I am correct.
The way a dump looks when the address of the actual monitor is shown:
"qtp48612937-70" #70 prio=5 os_prio=0 tid=0x00007fbb845b4800 nid=0x133c waiting for monitor entry [0x00007fbad69e8000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:233)
- waiting to lock <0x00000005b8d68e90> (a java.lang.Object)
Notice the waiting to lock line in the trace and that the address of the monitor is different from the number in brackets.
The fact that we cannot see the address of the monitor involved indicates that the monitor exists only in native code.
Secondly, the Gson code involved does not contain any synchronization at all. The code just adds an element to an ArrayList (assuming no bytecode manipulation has been done and nothing fishy is being done at low level). I.e., it would not make sense to see the thread waiting for a standard synchronization monitor at this call.
I found some, indications that threads can be shown as waiting for a monitor entry when there is a lot of GC going on.
I wrote a simple test program to try to reproduce it by just adding a lot of elements to an array list:
List<String> l = new ArrayList<>();
while (true) {
for (int i = 0; i < 100_100; i++) {
l.add("" + i);
}
l = new ArrayList<>();
}
Then I took thread dumps of this program. Occasionally I ran into the following trace:
"main" #1 prio=5 os_prio=0 tid=0x00007f35a8009000 nid=0x12448 waiting on condition [0x00007f35ac335000]
java.lang.Thread.State: RUNNABLE
at Foo.main(Foo.java:10) <--- Line of l.add()
While not identical to the OP's trace, it is interesting to have a thread waiting on condition when no synchronization is involved. I experienced it more frequently with a smaller heap, indicating that it might be GC related.
Another possibility could be that code that contains synchronization has been JIT compiled and that prevents you from seeing the actual address of the monitor. However, I think that is less likely since you experience it on ArrayList.add. If that is the case, I know of no way to find out the actual holder of the monitor.
If you don't have GC issues then may be actually there is some thread which has acquired lock on an object and stuck thread is waiting to acquire lock on the same object. The way to figure out is look for
- waiting to lock <some_hex_address> (a <java_class>)
example would be
- waiting to lock <0x00000000f139bb98> (a java.util.concurrent.ConcurrentHashMap)
in the thread dump for entry which says waiting for monitor entry. Once you have found it, you can search for thread that has already acquired lock on the object with address <some_hex_address>, it would look something like this for the example -
- locked <0x00000000f139bb98> (a java.util.concurrent.ConcurrentHashMap)
Now you can see the stacktrace of that thread to figure out which line of code has acquired it.

what does the lock id mean in java thread dump

I usually observed the lock id (as following) in the thread dump:
"Thread-pool-Bill" - Thread t#42
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
- locked <79f0aad8> (a java.net.SocksSocketImpl)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
In locked <79f0aad8> (a java.net.SocksSocketImpl), what does 79f0aad8 mean? It doesn't seem to be an object address nor object id cause I couldn't find it from heap dump. So what it is?
its address of internal lock construct(monitor object) for hotspot JVM.
code for your reference
oop o = _locked_monitors->at(i);
instanceKlass* ik = instanceKlass::cast(o->klass());
st->print_cr("\t- locked <" INTPTR_FORMAT "> (a %s)", (address)o, ik->external_name());
You can consider it an opaque identifier. The key is that other pieces of code may be waiting on that particular lock, which will also show up in the thread dump. So you can use it to match up waits and locks between threads. It may also show up in the JVMs thread dump deadlock detection outputs - but I'm not sure.
The entry appears in the stack trace from either a synchronised method, or explicit synchronised block, and is related to the object synchronized on.
Its the hashcode/id of the object on which the Thread owns the lock. Since your thread is RUNNABLE so some other thread may or may not be waiting on it. If another thread is waiting for this lock then your Thread dump would have another entry somewhere else showing a BLOCKED or WAITING Thread which is waiting for lock on the same id i.e 79f0aad8
But since you say you cannot find this String (79f0aad8) anywhere else in your file so either:
you have multiple thread dump files or
the thread Thread t#42 wasn't blocking any other thread when the thread dump was taken.

How to detect thread being blocked by IO?

In Java, thread can have different state:
NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED
However, when the thread is blocked by IO, its state is "RUNNABLE". How can I tell if it is blocked by IO?
NEW: The thread is created but has not been processed yet.
RUNNABLE:
The thread is occupying the CPU and processing a task. (It may be in WAITING status due to the OS's resource distribution.)
BLOCKED: The thread is waiting for a different thread to release its lock in order to get the monitor lock. JVISULVM shows thta as Monitoring
WAITING: The thread is waiting by using a wait, join or park method.
TIMED_WAITING: The thread is waiting by using a sleep, wait, join or park method. (The difference from WAITING is that the maximum waiting time is specified by the method parameter, and WAITING can be relieved by time as well as external changes.)
TERMINATED: A thread that has exited is in this state.
see also http://architects.dzone.com/articles/how-analyze-java-thread-dumps
Thread Dump
Dumping java thread stack you can find something like that
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
or
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
and you can understand that java is waiting response.
I suggest this tool Java Thread Dump Analyser or this plug-in TDA
ThreadMXBean
Yiu can obtain more information using the ThreadMXBean
http://docs.oracle.com/javase/7/docs/api/java/lang/management/ThreadMXBean.html
You can check the statckTraces of the thread then find if the last stack is in some specific method associated with i/o blocking (eg: java.net.SocketInputStream.socketRead0)
This is not a clever way but it works.
JProfiler supports the feature you need, details show at: WHAT'S NEW IN JPROFILER 3.1

Analyzing thread dump of a java process

I have Java EE based application running on tomcat and I am seeing that all of a sudden the application hangs after running for couple of hours.
I collected the thread dump from the application just before it hangs and put it in TDA for analysis:
TDA (Thread Dump Analyzer) gives the following message for the above monitor:
A lot of threads are waiting for this monitor to become available again.
This might indicate a congestion. You also should analyze other locks
blocked by threads waiting for this monitor as there might be much more
threads waiting for it.
And here is the stacktrace of the thread highlighted above:
"MY_THREAD" prio=10 tid=0x00007f97f1918800 nid=0x776a
waiting for monitor entry [0x00007f9819560000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.Hashtable.get(Hashtable.java:356)
- locked <0x0000000680038b68> (a java.util.Properties)
at java.util.Properties.getProperty(Properties.java:951)
at java.lang.System.getProperty(System.java:709)
at com.MyClass.myMethod(MyClass.java:344)
I want to know what does the "waiting for monitor entry" state means? And also would appreciate any pointers to help me debug this issue.
One of your threads acquired a monitor object (an exclusive lock on a object). That means the thread is executing synchronized code and for whatever reason stuck there, possibly waiting for other threads. But the other threads cannot continue their execution because they encountered a synchronized block and asked for a lock (monitor object), however they cannot get it until it is released by other thread. So... probably deadlock.
Please look for this string from the whole thread dump
- locked <0x00007f9819560000>
If you can find it, the thread is deadlock with thread "tid=0x00007f97f1918800"
Monitor = synchronized. You have lots of threads trying to get the lock on the same object.
Maybe you should switch from using a Hashtable and use a HashMap
This means that your thread is trying to set a lock (on the Hashtable), but some other thread is already accessing it and has set a lock. So it's waiting for the lock to release. Check what your other threads are doing. Especially thread with tid="0x00007f9819560000"

Categories