My tomcat server is responding slowly frequently and dying out after some time. This is happening quite often once a week. I took a thread dump and it shows about 50% of the threads are in locked state. Class which is locking the thread is org.apache.tomcat.util.net.AprEndpoint
I think over the time number of thread locking is rising and finally taking up all threads in locked state.
Here is one statement from the thread dump which shows locked state
"http-8080-4" daemon prio=6 tid=0x000000006acca800 nid=0x9d8 in Object.wait() [0x000000007083f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000028ead838> (a org.apache.tomcat.util.net.AprEndpoint$Worker)
at java.lang.Object.wait(Object.java:485)
at org.apache.tomcat.util.net.AprEndpoint$Worker.await(AprEndpoint.java:1511)
- locked <0x0000000028ead838> (a org.apache.tomcat.util.net.AprEndpoint$Worker)
at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1536)
at java.lang.Thread.run(Thread.java:619)
Locked ownable synchronizers:
- None
Related
I have take thread dumps of my application deployed in production, which uses logback. I am NOT expert in analysing thread dumps, however, I am required to do so. I am learning it, and have read some online articles.
The below is the real thread dumps:
"logback-8" #136 daemon prio=5 os_prio=0 tid=0x00007f3588001000 nid=0x13a waiting on condition [0x00007f35f8e7b000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000068d740338> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"logback-7" #135 daemon prio=5 os_prio=0 tid=0x00007f356c030800 nid=0x139 waiting on condition [0x00007f35f8d7a000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000068d740338> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"logback-6" #134 daemon prio=5 os_prio=0 tid=0x00007f357c018000 nid=0x138 waiting on condition [0x00007f3568ef0000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000068d740338> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"logback-5" #133 daemon prio=5 os_prio=0 tid=0x00007f36b8003800 nid=0x137 waiting on condition [0x00007f35f9380000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000068d740338> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
As I can see, most of these threads are in WAITING state.
But how do i know what exactly are they waiting for? Are these all threads trying to "wait" on "something"?
I did read some online materials , but when seeing a real thread dumps, it is not making any sense to me as how to take steps in understanding, as to what exactly these threads are trying to do, and at what thing are they WAITING, BLOCKED etc.
Can anyone please help me understand this?
As far as I have analyse thread dumps, When you are using such frameworks like Spring or other library like you have used here Logback. They create thread pool based on your configuration. For example, you can find such configuration in applicationcontext.xml or any of your java based configuration class. So, what it does is, it will create that much of threads at application startup or first initialisation call. If any task comes, then framework will pick one thread from pool and assign it. After completion of the task thread will come back in the pool and at that time the thread status would be like WAITING (parking) . This does not mean that it is block by any thread. It will not cause any performance headache if your server is capable enough to handle such pool. They are just sitting ideal because they do not have any task to do.
You can find more info on this and if you want to see visual way of analysis of your thread dump then you can use this. I found this site very helpful. Just upload your dump file.It will tell every thing that needed to be known by that thread dump.
Based on the above stack trace, these are the following inferences
There is a queue Q.
N number of jobs(Runnables) are added to the queue.
These jobs will be taken from the queue and executed by the threads
in the ThreadPoolExecutors(It can be cached/Dynamic).
Lets assume, we have a thread pool Executor with only two threads(Fixed Size) and you have submitted only one job to the queue.(Job1)
Thread1 takes the job1 from the queue and execute. Once the execution is over, It goes to the WAITING state because there are no elements in the queue. It has no work to do.
Presence of waiting threads on Stack trace doesn't imply they are harmful.
First thing you may look to do is to search for Runnable thread and the synchronizers that it has locked.
Runnable thread if hangs on for a long time there is a great chance for it to be performing IO or DB operations. Take subsequent thread dumps with Interval of 30 seconds or so and do a comparative study.
I have jstack where many thread are in state WAITING with description "parking to wait for " like:
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base#11.0.3/Native Method)
- parking to wait for <0x0000000307db96c8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base#11.0.3/LockSupport.java:194)
What is this big hex number? Is it time? Is it identifier?
EDIT
I have dumped state of my Java application thereads with threads that work very long (few days) in the morning and in the afternoon. I see that "waiting on condition" is with the same big hex number, but other big hex number in "parking to wait for" is different:
In the morning:
"qtp792232038-1037-..." #1037 prio=5 os_prio=0 cpu=787.64ms elapsed=528768.56s tid=0x00007f164004a800 nid=0x1346
waiting on condition [0x00007f181fffd000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base#11.0.3/Native Method)
- parking to wait for <0x000000030a69c410> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base#11.0.3/LockSupport.java:194)
...
After few hours:
"qtp792232038-1037-..." #1037 prio=5 os_prio=0 cpu=787.64ms elapsed=546900.36s tid=0x00007f164004a800 nid=0x1346
waiting on condition [0x00007f181fffd000]
java.lang.Thread.State: WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base#11.0.3/Native Method)
- parking to wait for <0x0000000307db96c8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(java.base#11.0.3/LockSupport.java:194)
...
Yes, it is an internal identifier for the lock object.
You can use this to see which thread is waiting for which other thread.
Search the thread dump for this id, there should be another thread mentioned with a stack frame that is holding the lock with the same id.
I have a threaddump of an app which showed 3 threads like below.
===============
"http-443-11" daemon prio=10 tid=0x00000000473bc800 nid=0x3590 waiting on condition [0x0000000061818000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007612a3880> (a java.util.concurrent.Semaphore$NonfairSync)
"http-443-4" daemon prio=10 tid=0x00000000451f6000 nid=0x243a waiting on condition [0x0000000055354000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007612a3880> (a java.util.concurrent.Semaphore$NonfairSync)
"http-443-7" daemon prio=10 tid=0x000000004602e000 nid=0x2974 waiting on condition [0x000000005e6e7000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007612a3880> (a java.util.concurrent.Semaphore$NonfairSync)
===============
What is the significance of the "waiting on condition []" ? What does the number in the [] signifiy ?
In the thread stack we can see that threads are daemon threads and are waiting for task to come. Since these threads are created on JVM startup , they dont get killed unless JVM exits or any non daemon thread i not running, thus they wait for tasks to come. Say Garbage collection thread is a daemoon thread which might not be running all the time, it could be in waiting state.
If I'm using more than one GenericKeyedObjectPool in an application with enabled asynchronous idle object eviction how many "idle object eviction" thread will run in the background?
Do multiple GenericKeyedObjectPools create only one eviction thread or do they create separate threads for every pool?
The current implementation (v1.6) uses a static timer, so in practice multiple pools use only one eviction thread. (Assuming that they are loaded into the same classloader.) You can check it with jstack, there is only one timer thread:
"Timer-0" daemon prio=10 tid=0x7bce5000 nid=0x1ca5 in Object.wait() [0x7b23d000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0xa26c0fe8> (a java.util.TaskQueue)
at java.util.TimerThread.mainLoop(Timer.java:509)
- locked <0xa26c0fe8> (a java.util.TaskQueue)
at java.util.TimerThread.run(Timer.java:462)
Locked ownable synchronizers:
- None
In a Java threaddump I found the following:
"TP-Processor184" daemon prio=10 tid=0x00007f2a7c056800 nid=0x47e7 waiting for monitor entry [0x00007f2a21278000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getNonVirtualItemState(SharedItemStateManager.java:1725)
- locked <0x0000000682f99d98> (a org.apache.jackrabbit.core.state.SharedItemStateManager)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getItemState(SharedItemStateManager.java:257)
"TP-Processor137" daemon prio=10 tid=0x00007f2a7c00f800 nid=0x4131 waiting for monitor entry [0x00007f2a1ace7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getNonVirtualItemState(SharedItemStateManager.java:1725)
- locked <0x0000000682f99d98> (a org.apache.jackrabbit.core.state.SharedItemStateManager)
at org.apache.jackrabbit.core.state.SharedItemStateManager.getItemState(SharedItemStateManager.java:257)
The point here being that both threads have locked monitor <0x0000000682f99d98> (regardless of them now waiting for two different other monitors).
When looking at Thread Dump Analyzer, with that monitor being selected, it really says "Threads locking monitor: 2" at the bottom, and "2 Thread(s) locking". Please see https://lh4.googleusercontent.com/-fCmlnohVqE0/T1D5lcPerZI/AAAAAAAAD2c/vAHcDiGOoMo/s971/locked_by_two_threads_3.png for the screenshot, I'm not allowed to paste images here.
Does this mean threaddumps aren't atomic with respect to monitor lock information? I can't imagine this really being a locking bug of the JVM (1.6.0_26-b03).
A similar question has already been asked in Can several threads hold a lock on the same monitor in Java?, but the answer to me didn't see the real point of multiple threads locking the same monitor, even though they may be waiting for some other.
Update May 13th 2014:
Newer question Multiple threads hold the same lock? has code to reproduce the behaviour, and #rsxg has filed an according bug report https://bugs.openjdk.java.net/browse/JDK-8036823 along the lines of his answer here.
I don't think that your thread dump is saying that your two threads are "waiting for two different other monitors". I think it is saying that they are both waiting on the same monitor but at two different code points. That may be a stack location or an object instance location or something. This is a great document about analyzing the stack dumps.
Can several threads hold a lock on the same monitor in Java?
No. Your stack dump is showing two threads locked on the same monitor at the same code location but in different stack frames -- or whatever that value is which seems OS dependent.
Edit:
I'm not sure why the thread dump seems to be saying that both threads have a line locked since that seems to only be allowed if they are in a wait() method. I noticed that you are linking to version 1.6.5. Is that really the version you are using? In version 2.3.6 (which may be the latest), the 1725 line actually is a wait.
1722 synchronized (this) {
1723 while (currentlyLoading.contains(id)) {
1724 try {
1725 wait();
1726 } catch (InterruptedException e) {
You could also see this sort of stack trace even if it was an exclusive synchronized lock. For example, the following stack dump under Linux is for two threads locked on the same object from the same code line but in two different instances of the Runnable.run() method. Here's my stupid little test program. Notice that the monitor entry numbers are different, even thought it is the same lock and same code line number.
"Thread-1" prio=10 tid=0x00002aab34055c00 nid=0x4874
waiting for monitor entry [0x0000000041017000..0x0000000041017d90]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aab072a1318> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <0x00002aab072a1318> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:619)
"Thread-0" prio=10 tid=0x00002aab34054c00 nid=0x4873
waiting for monitor entry [0x0000000040f16000..0x0000000040f16d10]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aab072a1318> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <0x00002aab072a1318> (a java.lang.Object)
at java.lang.Thread.run(Thread.java:619)
On my Mac, the format is different but again the number after the "monitor entry" is not the same for the same line number.
"Thread-2" prio=5 tid=7f8b9c00d000 nid=0x109622000
waiting for monitor entry [109621000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <7f3192fb0> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <7f3192fb0> (a java.lang.Object)
"Thread-1" prio=5 tid=7f8b9f80d800 nid=0x10951f000
waiting for monitor entry [10951e000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <7f3192fb0> (a java.lang.Object)
at com.mprew.be.service.auto.freecause.Foo$OurRunnable.run(Foo.java:38)
- locked <7f3192fb0> (a java.lang.Object)
This Oracle document describe that value as the following:
Address range, which gives an estimate of the valid stack region for the thread
You are probably running into a cosmetic bug in the stack trace routines in the JVM when analyzing heavily contended locks - it may or may not be the same as this bug.
The fact is that neither of your two threads have actually managed to acquire the lock on the SharedItemStateManager, as you can see from the fact that they are reporting waiting for monitor entry. The bug is that further up in the stack trace in both cases they should report waiting to lock instead of locked.
The workaround when analyzing strange stack traces like this is to always check that a thread claiming to have locked an object is not also waiting to acquire a lock on the same object.
(Unfortunately this analysis requires cross-referencing the line numbers in the stack trace with the source, code since there is no relationship between the figures in the waiting for monitor entry header and the locked line in the stack trace. As per this Oracle document, the number 0x00007f2a21278000 in the line TP-Processor184" daemon prio=10 tid=0x00007f2a7c056800 nid=0x47e7 waiting for monitor entry [0x00007f2a21278000] refers to an estimate of the valid stack region for the thread. So it looks like a monitor ID but it isn't - and you can see that the two threads you gave are at different addresses in the stack).
When a thread locks an object but wait()s another thread can lock the same object. You should be able to see a number of threads "holding" the same lock all waiting.
AFAIK, the only other occasion is when multiple threads have locked and waited and are ready to re-acquire the the lock e.g. on a notifyAll(). They are not waiting any more but cannot continue until they have obtained the lock again. (only one thread at a time time can do this)
"http-0.0.0.0-8080-96" daemon prio=10 tid=0x00002abc000a8800 nid=0x3bc4 waiting for monitor entry [0x0000000050823000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:195)
- locked <0x00002aadae12c048> (a java.util.WeakHashMap)
"http-0.0.0.0-8080-289" daemon prio=10 tid=0x00002abc00376800 nid=0x2688 waiting for monitor entry [0x000000005c8e3000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:195)
- locked <0x00002aadae12c048> (a java.util.WeakHashMap
"http-0.0.0.0-8080-295" daemon prio=10 tid=0x00002abc00382800 nid=0x268e runnable [0x000000005cee9000]
java.lang.Thread.State: RUNNABLE
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:195)
- locked <0x00002aadae12c048> (a java.util.WeakHashMap)
In our thread dump, we have several threads lock same monitor, but only one thread is runnable. It probably because of lock competition, we have 284 other threads waiting for the lock. Multiple threads hold the same lock? said this only exists in the thread dump, for thread dump is not atomic operation.