RUNNABLE Thread.State but in Object.wait() - java

I have extracted a JStack of my container process and got the threads running there with the following distribution grouped by Thread.state:
count thread state
67 RUNNABLE
1 TIMED_WAITING (on object monitor)
8 TIMED_WAITING (parking)
4 TIMED_WAITING (sleeping)
3 WAITING (on object monitor)
17 WAITING (parking)
For the runnable threads I have the following description:
"http-bio-8080-exec-55" daemon prio=10 tid=0x000000002cbab300 nid=0x642b in Object.wait() [0x00002ab37ad11000]
java.lang.Thread.State: RUNNABLE
at com.mysema.query.jpa.impl.JPAQuery.<init>(JPAQuery.java:44)
at net.mbppcb.cube.repository.TransactionDaoImpl.findByBusinessId(TransactionDaoImpl.java:73)
at sun.reflect.GeneratedMethodAccessor76.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:317)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:183)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:155)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
...
The number of threads in RUNNABLE state as shown above raises with the time and it seems to be hanging. If they suppose to be blocked, shouldn't they be on state BLOCKED? Or should they be on WAITING state? Is strange to have RUNNABLE threads but in Object.wait() isn't it?
Update 1
I can see in the documentation:
A thread in the runnable state is executing in the Java virtual
machine but it may be waiting for other resources from the operating
system such as processor.
How can I figure out what is the thread waiting for?

This seems like a class initialization deadlock.
JPAQuery constructor is waiting for the initialization of a dependent class, probably JPAProvider:
public JPAQuery(EntityManager em) {
super(em, JPAProvider.getTemplates(em), new DefaultQueryMetadata());
}
Such deadlocks may be caused by a typical bug when a subclass is referenced from a static initializer. If you share the details of other thread stacks, we'll likely find out which thread holds the class lock.
Why is the thread in RUNNABLE state then?
Well, it's a confusion inside HotSpot JVM. The class initialization procedure is implemented in VM runtime, not in Java land, and the class lock is grabbed natively. Seems to be a reason why a thread state has not been changed, but I guess this behavior should be fixed in JVM as well.

The Oracle Thread.State documentation specifies that a thread in the blocked state is waiting for a monitor lock to enter a synchronized block/method or reenter a synchronized block/method after calling.
It looks like none of the threads in blocking mode.

If all the Runnable threads are apparently blocked in database operations, I would suggest using database monitoring/diagnostic tools to explore the reason. After that, possibly investigating your database code to find problems such as uncommitted transactions, mishandled exceptions leading to unclosed resources.
Java thread dumps have probably given you all the information they can at this point - a pointer for where to start looking next.

"The number of threads in RUNNABLE state as shown above raises with
the time and it seems to be hanging."
The number of RUNNABLES would rise depending upon the number of threads executing the method "com.mysema.query.jpa.impl.JPAQuery." at the time this thread dump was taken. This method is actually the constructor of JPAQuery class - denoted by "init". You would probably need to investigate the code in the constructor and the subsequent calls to the JPA implementation.

The Object.wait() merely calls Object.wait(0) (zero for no timeout).
The implementation of Object.wait(long) is:
public final native void wait(long timeout) throws InterruptedException;
All threads that are in a native frame are RUNNABLE, as the JVM does not know (does not "manage", hence it is a "native" frame) the state of the call.

Related

Is this a deadlock when using ReentrantReadWriteLock?

I have an application with 1 writer thread and 8 reader threads accessing a shared resource, which is behind a ReentrantReadWriteLock. It froze for about an hour, producing no log output and not responding to requests. This is on Java 8.
Before killing it someone took thread dumps, which look like this:
Writer thread:
"writer-0" #83 prio=5 os_prio=0 tid=0x00007f899c166800 nid=0x2b1f waiting on condition [0x00007f898d3ba000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000002b8dd4ea8> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
Reader:
"reader-1" #249 daemon prio=5 os_prio=0 tid=0x00007f895000c000 nid=0x33d6 waiting on condition [0x00007f898edcf000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000002b8dd4ea8> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
This looks like a deadlock, however there are a couple of things that make me doubt that:
I can't find another thread that could possibly be holding the same lock
Taking a thread dump 4 seconds later yealds the same result, but all threads now report parking to wait for <0x00000002a7daa878>, which is different than 0x00000002b8dd4ea8 in the first dump.
Is this a deadlock? I see that there is some change in the threads' state, but it could only be internal to the lock implementation. What else could be causing this behaviour?
It turned out it was a deadlock. The thread holding the lock was not reported as holding any locks in the thread dump, which made it difficult to diagnose.
The only way to understand that was to inspect a heap dump of the application. For those interested in how, here's the process step-by-step:
A heap dump was taken at roughly the same time as the thread dumps.
I opened it using Java VisualVM, which comes with JDK.
In the "Classes" view I filtered by the class name of the class that contains the lock as a field.
I double-clicked on the class to be taken to the "Instances" view
Thankfully, there were only a few instances of that class, so I was able to find the one that was causing problems.
I inspected the ReentrantReadWriteLock object kept in a field in the class. In particular the sync field of that lock keeps its state - in this case it was ReentrantReadWriteLock$FairSync.
The state property of it was 65536. This represents both the number of shared and exclusive holds of the lock. The shared holds count is stored in the first 16 bits of the state, and is retrieved as state >>> 16. The exclusive holds count is in the last 16 bits, and is retrieved as state & ((1 << 16) - 1). From this we can see that there's 1 shared hold and 0 exclusive holds on the lock.
You can see the threads waiting for the lock in the head field. It is a queue, with thread containing the waiting thread, and next containing the next node in the queue. Going through it I found the writer-0 and 7 of the 8 reader-n threads, confirming what we know from the thread dump.
The firstReader field of the sync object contains the thread that have acquired the read log - from the comment in the code firstReader is the first thread to have acquired the read lock. firstReaderHoldCount is firstReader's hold count.More precisely, firstReader is the unique thread that last changed the shared count from 0 to 1, and has not released theread lock since then; null if there is no such thread.
In this case the thread holding the lock was one of the reader threads. It was blocked on something entirely different, which would have required one of the other reader threads to progress. Ultimately it was caused by a bug, in which a reader thread would not properly release the lock, and keep it forever. That I found by analyzing the code, and adding tracking and logging when the lock was acquired and released.
Is this a deadlock?
I don't think this is evidence of a deadlock. At least, not in the classic sense of the term.
The stack dump shows two threads waiting on the same ReentrantReadWriteLock. One thread is waiting to acquire the read lock. The other is waiting to acquire the write lock.
Now if no threads currently holds any locks, then one of these threads would be able to proceed.
If some other thread currently held the write lock, that would be sufficient to block both of these threads. But that isn't a deadlock. It would only be a deadlock if that third thread was itself waiting on a different lock ... and there was a circularity in the blocking.
So what about the possibility of these two threads blocking each other? I don't think that is possible. The reentrancy rules in the javadocs allow a thread that has the write lock to acquire the read lock without blocking. Likewise it can acquire the write lock it it already holds it.
The other piece of evidence is that things have changed in the thread dump you took a bit later. If there was a genuine deadlock, there would be no change.
If it is not a deadlock between (just) these two threads, what else could it be?
One possibility is that a third thread is holding the write lock (for a long time) and that is gumming things up. Just too much contention on this readwrite lock.
If the (assumed) third thread is using tryLock, it is possible that you have a livelock ... which could explain the "change" evidence. But on the flip-side, that thread should have been parked too ... which you say that you don't see.
Another possibility is that you have too many active threads ... and the OS is struggling to schedule them to cores.
But this is all speculation.

How to find out which thread holds the monitor?

My application is using Gson 2.2 for converting POJOs to JSON. When I was making a load test I stumbled upon a lot of threads blocked in Gson constructor:
"http-apr-28201-exec-28" #370 daemon prio=5 os_prio=0 tid=0x0000000001ee7800 nid=0x62cb waiting for monitor entry [0x00007fe64df9a000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.google.gson.Gson.<init>(Gson.java:200)
at com.google.gson.Gson.<init>(Gson.java:179)
Thread dump does NOT show any threads holding [0x00007fe64df9a000] monitor.
How can I find out who holds it?
Gson code at line 200 looks pretty innocent:
// built-in type adapters that cannot be overridden
factories.add(TypeAdapters.STRING_FACTORY);
factories.add(TypeAdapters.INTEGER_FACTORY);
I'm using JRE 1.8.0_91 on Linux
tl;dr I think you are running into GC-related behavior, where threads are being put in waiting state to allow for garbage collection.
I do not have the whole truth but I hope to provide some pieces of insight.
First thing to realize is that the number in brackets, [0x00007fe64df9a000], is not the address of a monitor. The number in brackets can be seen for all threads in a dump, even threads that are in running state. The number also does not change. Example from my test dump:
main" #1 prio=5 os_prio=0 tid=0x00007fe27c009000 nid=0x27e5c runnable [0x00007fe283bc2000]
java.lang.Thread.State: RUNNABLE
at Foo.main(Foo.java:12)
I am not sure what the number means, but this page hints that it is:
... the pointer to the Java VM internal thread structure. It is generally of no interest unless you are debugging a live Java VM or core file.
Although the format of the trace explained there is a bit different so I am not sure I am correct.
The way a dump looks when the address of the actual monitor is shown:
"qtp48612937-70" #70 prio=5 os_prio=0 tid=0x00007fbb845b4800 nid=0x133c waiting for monitor entry [0x00007fbad69e8000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:233)
- waiting to lock <0x00000005b8d68e90> (a java.lang.Object)
Notice the waiting to lock line in the trace and that the address of the monitor is different from the number in brackets.
The fact that we cannot see the address of the monitor involved indicates that the monitor exists only in native code.
Secondly, the Gson code involved does not contain any synchronization at all. The code just adds an element to an ArrayList (assuming no bytecode manipulation has been done and nothing fishy is being done at low level). I.e., it would not make sense to see the thread waiting for a standard synchronization monitor at this call.
I found some, indications that threads can be shown as waiting for a monitor entry when there is a lot of GC going on.
I wrote a simple test program to try to reproduce it by just adding a lot of elements to an array list:
List<String> l = new ArrayList<>();
while (true) {
for (int i = 0; i < 100_100; i++) {
l.add("" + i);
}
l = new ArrayList<>();
}
Then I took thread dumps of this program. Occasionally I ran into the following trace:
"main" #1 prio=5 os_prio=0 tid=0x00007f35a8009000 nid=0x12448 waiting on condition [0x00007f35ac335000]
java.lang.Thread.State: RUNNABLE
at Foo.main(Foo.java:10) <--- Line of l.add()
While not identical to the OP's trace, it is interesting to have a thread waiting on condition when no synchronization is involved. I experienced it more frequently with a smaller heap, indicating that it might be GC related.
Another possibility could be that code that contains synchronization has been JIT compiled and that prevents you from seeing the actual address of the monitor. However, I think that is less likely since you experience it on ArrayList.add. If that is the case, I know of no way to find out the actual holder of the monitor.
If you don't have GC issues then may be actually there is some thread which has acquired lock on an object and stuck thread is waiting to acquire lock on the same object. The way to figure out is look for
- waiting to lock <some_hex_address> (a <java_class>)
example would be
- waiting to lock <0x00000000f139bb98> (a java.util.concurrent.ConcurrentHashMap)
in the thread dump for entry which says waiting for monitor entry. Once you have found it, you can search for thread that has already acquired lock on the object with address <some_hex_address>, it would look something like this for the example -
- locked <0x00000000f139bb98> (a java.util.concurrent.ConcurrentHashMap)
Now you can see the stacktrace of that thread to figure out which line of code has acquired it.

How to detect thread being blocked by IO?

In Java, thread can have different state:
NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, TERMINATED
However, when the thread is blocked by IO, its state is "RUNNABLE". How can I tell if it is blocked by IO?
NEW: The thread is created but has not been processed yet.
RUNNABLE:
The thread is occupying the CPU and processing a task. (It may be in WAITING status due to the OS's resource distribution.)
BLOCKED: The thread is waiting for a different thread to release its lock in order to get the monitor lock. JVISULVM shows thta as Monitoring
WAITING: The thread is waiting by using a wait, join or park method.
TIMED_WAITING: The thread is waiting by using a sleep, wait, join or park method. (The difference from WAITING is that the maximum waiting time is specified by the method parameter, and WAITING can be relieved by time as well as external changes.)
TERMINATED: A thread that has exited is in this state.
see also http://architects.dzone.com/articles/how-analyze-java-thread-dumps
Thread Dump
Dumping java thread stack you can find something like that
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
or
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
and you can understand that java is waiting response.
I suggest this tool Java Thread Dump Analyser or this plug-in TDA
ThreadMXBean
Yiu can obtain more information using the ThreadMXBean
http://docs.oracle.com/javase/7/docs/api/java/lang/management/ThreadMXBean.html
You can check the statckTraces of the thread then find if the last stack is in some specific method associated with i/o blocking (eg: java.net.SocketInputStream.socketRead0)
This is not a clever way but it works.
JProfiler supports the feature you need, details show at: WHAT'S NEW IN JPROFILER 3.1

Analyzing thread dump of a java process

I have Java EE based application running on tomcat and I am seeing that all of a sudden the application hangs after running for couple of hours.
I collected the thread dump from the application just before it hangs and put it in TDA for analysis:
TDA (Thread Dump Analyzer) gives the following message for the above monitor:
A lot of threads are waiting for this monitor to become available again.
This might indicate a congestion. You also should analyze other locks
blocked by threads waiting for this monitor as there might be much more
threads waiting for it.
And here is the stacktrace of the thread highlighted above:
"MY_THREAD" prio=10 tid=0x00007f97f1918800 nid=0x776a
waiting for monitor entry [0x00007f9819560000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.Hashtable.get(Hashtable.java:356)
- locked <0x0000000680038b68> (a java.util.Properties)
at java.util.Properties.getProperty(Properties.java:951)
at java.lang.System.getProperty(System.java:709)
at com.MyClass.myMethod(MyClass.java:344)
I want to know what does the "waiting for monitor entry" state means? And also would appreciate any pointers to help me debug this issue.
One of your threads acquired a monitor object (an exclusive lock on a object). That means the thread is executing synchronized code and for whatever reason stuck there, possibly waiting for other threads. But the other threads cannot continue their execution because they encountered a synchronized block and asked for a lock (monitor object), however they cannot get it until it is released by other thread. So... probably deadlock.
Please look for this string from the whole thread dump
- locked <0x00007f9819560000>
If you can find it, the thread is deadlock with thread "tid=0x00007f97f1918800"
Monitor = synchronized. You have lots of threads trying to get the lock on the same object.
Maybe you should switch from using a Hashtable and use a HashMap
This means that your thread is trying to set a lock (on the Hashtable), but some other thread is already accessing it and has set a lock. So it's waiting for the lock to release. Check what your other threads are doing. Especially thread with tid="0x00007f9819560000"

What does thread dump looks like when JVM spent time in GC

When profiling Java application I note interesting fact. When JVM is in GC spiral of death thread dump is looks like:
"1304802943#qtp-393978767-9985" prio=10 tid=0x00007f3ed02dd000 nid=0x74e7 in Object.wait() [0x000000004febb000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)
- locked <0x00000007aed40048> (a org.mortbay.thread.QueuedThreadPool$PoolThread)
"26774405#qtp-393978767-9984" prio=10 tid=0x00007f3ee4b37000 nid=0x74e6 in Object.wait() [0x0000000045d1a000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)
- locked <0x00000007aed83aa0> (a org.mortbay.thread.QueuedThreadPool$PoolThread)
"764808089#qtp-393978767-9983" prio=10 tid=0x00007f3ee4c50000 nid=0x74e5 in Object.wait() [0x000000004ad6a000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)
- locked <0x00000007aed5c448> (a org.mortbay.thread.QueuedThreadPool$PoolThread)
So, there are a lot of threads in TIMED_WAITING state. Theoretically this situation could be easily found in normally functioning application (application simply doesn't have any incoming requests at the moment), but I can't find even single request dispatching thread doing something useful (nominal hit rate is about 100 hps).
Does this behavior have something to di with GC, or it's just coincidence?
Answering just the question's title:
What does thread dump look like when JVM spent time in GC?
The answer is: you have no means to obtain such dump (in a usual way).
JVM processes the request for thread dump only after reaching safepoint which just can't happen while in GC.
But there is a cheat way to obtain active GC's thread dump with help of undocumented JVMTI function AsyncGetCallTrace which is mentioned in this post:
http://jeremymanson.blogspot.com/2010/07/why-many-profilers-have-serious.html
It also hints that Oracle Solaris Studio can be used to take such mixed native/java thread dumps.
Try a jmap -histo:live over time, you can compare output, see which object types are growing.
You need to have the JDK installed for jmap.
http://docs.oracle.com/javase/6/docs/technotes/tools/share/jmap.html
A warning, jmap is intensive, it will pause all threads while it's running, which should only be a few seconds. Processes can core dump because it's intensive, generally it's quick and safe, but I have see it lock up or kill large applications, multi-gig heaps.
My guess is you have a thread pool which is waiting for something to do. If your process is efficient and you have even 100 requests per second you may have trouble catching even one thread doing something. I suggest you look at the CPU load of your process. If its 50%, you have a 50% chance of finding one thread (possibly not a request thread) doing something.
If you want to see what your server spends its time doing, I would try a profiler like VisualVM, or a commercial profiler like YourKit.
Doing a google search for you code, I found a different version http://grepcode.com/file/repo1.maven.org/maven2/org.mortbay.jetty/jetty-util/7.0.0.pre5/org/mortbay/thread/QueuedThreadPool.java however I suspect your threads are TIMED_WAIT in this block int he run() method
// We are idle
// wait for a dispatched job
synchronized (this)
{
if (_job==null)
this.wait(getMaxIdleTimeMs());
job=_job;
_job=null;
}

Categories