I have a main thread which dispatches jobs to a thread pool. I'm using Java's Executor framework.
From the profiler (VisualVM) I can see each thread's activity: the main thread is waiting a lot (because the executor's queue has an upper limit), which means the executor's queue is full most of the time. However, the executor's threads are not as busy as I would have thought. Most of them have a waiting time of 75%. In VisualVM it says they are waiting on a monitor.
Can anyone explain why this is happening? Why would the executor threads wait while there is still plenty of work available to do? And how can I improve the performance of the executor, and thus the overall performance? More detail on the executor's wait on a monitor would be great.
The job that runs in the workers is just some computation which doesn't depend on anything else and doesn't communicate with any other thread (no synchronisation), except that at the end it puts data into the database, using its own connection.
Parallel execution will yield significantly better results than synchronous execution if (see the sketch after this list):
the pieces of work are independent of each other (no, or only few and very short, critical sections)
each unit of work takes enough time to make up for thread startup and the executor's internal synchronization
the units of work do not compete for the same resource; for example, reading multiple files from the same disk in parallel will probably be slower than reading them sequentially
you actually have enough system resources (processor cores, memory, network speed) to use at once
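For reference, a minimal sketch of a pool sized along these lines; the core size, queue capacity, and CallerRunsPolicy are illustrative choices, not taken from the question:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

int workers = Runtime.getRuntime().availableProcessors();
// CPU-bound work rarely benefits from more threads than cores; the bounded
// queue plus CallerRunsPolicy throttles the submitter instead of dropping work.
ThreadPoolExecutor pool = new ThreadPoolExecutor(
        workers, workers,
        0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<Runnable>(workers * 4),
        new ThreadPoolExecutor.CallerRunsPolicy());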
Threading does not mean that all the threads will work in parallel all the time. Threads will go into the waiting state for various reasons, mostly depending on how the scheduler assigns the CPU to each of them. Is there some synchronized code in your thread class? If so, then while one thread is executing a synchronized method, all the other threads that need the same lock have to wait. If there is too much synchronized code, the threads' waiting time will increase.
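As a rough illustration of that last point (this is not your code, just a sketch), if every worker funnels through a synchronized method on a shared object, only one pool thread can run it at a time and the rest show up as waiting on a monitor:

class SharedSink {
    // Only one thread can be inside this method at a time; all other pool
    // threads contending for it appear in the profiler as waiting on the
    // SharedSink instance's monitor.
    synchronized void store(int result) {
        // ... slow, serialized work such as a database write ...
    }
}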
After doing a thread dump, it turns out that it is the database layer that has the synchronisation. Hibernate's Sequence Generator is synchronised.
"pool-2-thread-1" - Thread t#13
java.lang.Thread.State: BLOCKED
at org.hibernate.id.SequenceHiLoGenerator.generate(SequenceHiLoGenerator.java:73)
- waiting to lock <61fcb35> (a org.hibernate.id.SequenceHiLoGenerator) owned by "pool-2-thread-5" t#23
at org.hibernate.internal.StatelessSessionImpl.insert(StatelessSessionImpl.java:117)
at org.hibernate.internal.StatelessSessionImpl.insert(StatelessSessionImpl.java:110)
at ac.uk.ebi.kraken.unisave.storage.impl.HibernateStorageEngine.saveEntryIndex(HibernateStorageEngine.java:269)
at ac.uk.ebi.kraken.unisave.storage.impl.EntryStoreImpl.storeEntryIndex(EntryStoreImpl.java:302)
at ac.uk.ebi.kraken.unisave.impl.MTEntryIndexLoader$EntryIndexLoader.run(MTEntryIndexLoader.java:129)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
Locked ownable synchronizers:
- locked <3d360c93> (a java.util.concurrent.ThreadPoolExecutor$Worker)
Threads are assigned CPU cycles by the scheduler; that means if the machine has 4 CPUs, only 4 threads can run in parallel at any one time, so the other threads have to wait for the scheduler to assign a CPU to them.
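A quick way to see that number on a given machine (illustrative only):

// With compute-bound tasks, running many more threads than this mostly
// adds scheduling overhead rather than throughput.
int cpus = Runtime.getRuntime().availableProcessors();
System.out.println("Available processors: " + cpus);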
Related
I have an application with 1 writer thread and 8 reader threads accessing a shared resource, which is behind a ReentrantReadWriteLock. It froze for about an hour, producing no log output and not responding to requests. This is on Java 8.
Before killing it someone took thread dumps, which look like this:
Writer thread:
"writer-0" #83 prio=5 os_prio=0 tid=0x00007f899c166800 nid=0x2b1f waiting on condition [0x00007f898d3ba000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000002b8dd4ea8> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
Reader:
"reader-1" #249 daemon prio=5 os_prio=0 tid=0x00007f895000c000 nid=0x33d6 waiting on condition [0x00007f898edcf000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000002b8dd4ea8> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
This looks like a deadlock; however, there are a couple of things that make me doubt that:
I can't find another thread that could possibly be holding the same lock
Taking a thread dump 4 seconds later yields the same result, but all threads now report parking to wait for <0x00000002a7daa878>, which is different from the 0x00000002b8dd4ea8 in the first dump.
Is this a deadlock? I see that there is some change in the threads' state, but it could only be internal to the lock implementation. What else could be causing this behaviour?
It turned out it was a deadlock. The thread holding the lock was not reported as holding any locks in the thread dump, which made it difficult to diagnose.
The only way to understand that was to inspect a heap dump of the application. For those interested in how, here's the process step-by-step:
A heap dump was taken at roughly the same time as the thread dumps.
I opened it using Java VisualVM, which comes with JDK.
In the "Classes" view I filtered by the class name of the class that contains the lock as a field.
I double-clicked on the class to be taken to the "Instances" view
Thankfully, there were only a few instances of that class, so I was able to find the one that was causing problems.
I inspected the ReentrantReadWriteLock object kept in a field in the class. In particular the sync field of that lock keeps its state - in this case it was ReentrantReadWriteLock$FairSync.
The state property of it was 65536. This represents both the number of shared and exclusive holds of the lock. The shared hold count is stored in the high 16 bits of the state and is retrieved as state >>> 16. The exclusive hold count is in the low 16 bits and is retrieved as state & ((1 << 16) - 1). From this we can see that there is 1 shared hold and 0 exclusive holds on the lock (the decoding is written out in the snippet after these steps).
You can see the threads waiting for the lock via the head field. It is a queue of nodes, with each node's thread field holding a waiting thread and next pointing to the next node in the queue. Going through it I found the writer-0 thread and 7 of the 8 reader-n threads, confirming what we know from the thread dump.
The firstReader field of the sync object contains the thread that has acquired the read lock. From the comment in the code: firstReader is the first thread to have acquired the read lock, and firstReaderHoldCount is firstReader's hold count. More precisely, firstReader is the unique thread that last changed the shared count from 0 to 1 and has not released the read lock since then; null if there is no such thread.
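To make that arithmetic concrete, here is the same decoding written out in Java (the value 65536 is the one observed in the heap dump above):

// Decoding the AQS state of a ReentrantReadWriteLock's sync object.
int state = 65536;                            // value read from sync.state
int sharedHolds = state >>> 16;               // high 16 bits: read-lock holds -> 1
int exclusiveHolds = state & ((1 << 16) - 1); // low 16 bits: write-lock holds -> 0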
In this case the thread holding the lock was one of the reader threads. It was blocked on something entirely different, which would have required one of the other reader threads to progress. Ultimately it was caused by a bug in which a reader thread would not properly release the lock and so kept it forever. I found that by analyzing the code and by adding tracking and logging of when the lock was acquired and released.
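For what it's worth, a minimal sketch (not the OP's code) of the acquire/release idiom that prevents the "read lock held forever" bug described above:

import java.util.concurrent.locks.ReentrantReadWriteLock;

ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock(true);  // fair, like in the dump

// Always release in finally, so an early return or exception cannot skip unlock().
rwLock.readLock().lock();
try {
    // ... read the shared resource ...
} finally {
    rwLock.readLock().unlock();
}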
Is this a deadlock?
I don't think this is evidence of a deadlock. At least, not in the classic sense of the term.
The stack dump shows two threads waiting on the same ReentrantReadWriteLock. One thread is waiting to acquire the read lock. The other is waiting to acquire the write lock.
Now, if no thread currently holds any locks, then one of these threads would be able to proceed.
If some other thread currently held the write lock, that would be sufficient to block both of these threads. But that isn't a deadlock. It would only be a deadlock if that third thread was itself waiting on a different lock ... and there was a circularity in the blocking.
So what about the possibility of these two threads blocking each other? I don't think that is possible. The reentrancy rules in the javadocs allow a thread that holds the write lock to acquire the read lock without blocking. Likewise, it can acquire the write lock if it already holds it.
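To make the reentrancy point concrete, a small sketch of what the javadocs allow (lock downgrading; illustrative only, not taken from the question):

import java.util.concurrent.locks.ReentrantReadWriteLock;

ReentrantReadWriteLock rw = new ReentrantReadWriteLock();

rw.writeLock().lock();
try {
    rw.readLock().lock();      // allowed without blocking while holding the write lock
} finally {
    rw.writeLock().unlock();   // the thread still holds the read lock here (downgraded)
}
rw.readLock().unlock();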
The other piece of evidence is that things have changed in the thread dump you took a bit later. If there was a genuine deadlock, there would be no change.
If it is not a deadlock between (just) these two threads, what else could it be?
One possibility is that a third thread is holding the write lock (for a long time) and that is gumming things up: just too much contention on this read-write lock.
If the (assumed) third thread is using tryLock, it is possible that you have a livelock ... which could explain the "change" evidence. But on the flip side, that thread should have been parked too ... which you say you don't see.
Another possibility is that you have too many active threads ... and the OS is struggling to schedule them to cores.
But this is all speculation.
I am trying to analyze a thread dump which seems to indicate that there are numerous threads that are waiting on java.util.concurrent.Semaphore permits, i.e., the threads are waiting on Semaphore.acquire().
I was able to infer this because the threads are in the WAITING (parking) state, and from what I've understood, Semaphores do not use lock monitors but use LockSupport.park() instead, waiting for another thread to unpark them.
Now, is there a way to infer from a thread dump which threads currently hold the Semaphore permits?
Similar to finding threads in the BLOCKED state and checking which thread holds the lock that is causing them to block?
Semaphores do not have a concept of ownership or know anything about threads. This makes them particularly lightweight (and useful in asynchronous programming where your logical thread of execution and the hardware thread on which it is executed won't necessarily have a 1:1 mapping).
You can also see this from the fact that a thread can release a semaphore without ever having acquired it.
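A tiny, self-contained illustration of that point (not from the question):

import java.util.concurrent.Semaphore;

public class SemaphoreNoOwnership {
    public static void main(String[] args) throws InterruptedException {
        Semaphore sem = new Semaphore(0);   // no permits initially
        // Another thread releases a permit it never acquired -- perfectly legal,
        // because a Semaphore only tracks a count, not per-thread ownership.
        new Thread(sem::release).start();
        sem.acquire();                      // parks (WAITING) until the permit appears
        System.out.println("Acquired a permit released by another thread");
    }
}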
You will have to look at the stack traces to see where, and on which semaphores, the threads are waiting, and work backwards from there.
There are tools that help you analyse dumps. YourKit is one such tool that can be used to analyse blocked threads.
Reference:
https://www.yourkit.com/docs/java/help/monitor_profiling.jsp
I know the behavior of sleep method in Java.
When I call the sleep method, the currently executing thread stops its execution and goes into the sleeping state. While it is sleeping it keeps any lock it has acquired.
For example, if I call the sleep method as follows:
Thread.sleep(50)
My question is: what happens after 50 ms?
Will it wake up and directly start executing, or
will it go into the runnable state and wait for the CPU to give it a chance to execute?
In other words, will it go into the runnable state and compete for the CPU with other threads?
Please let me know the answer.
It will go into runnable state. There's never a guarantee a thread will be executing at a particular moment. But you can set the thread's priority to give it a better chance at getting CPU time.
Actually, it depends on which operating system you use; different operating systems have different process scheduling algorithms.
Most desktop operating systems are not real-time operating system. There is no guarantee about the precision of the sleep. When you call sleep, the thread is suspended and is not runnable until the requested duration elapses. When it's runnable again, it's up to the scheduler to run the thread again when some execution time is available.
For example, most Linux distros use CFS as the default scheduling algorithm. CFS uses a concept called "sleeper fairness", which considers sleeping or waiting tasks equivalent to those on the runqueue. So in your case, the thread will get a comparable share of CPU time after sleeping.
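A small way to observe this yourself (illustrative only):

public class SleepWakeup {
    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        Thread.sleep(50);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Typically prints a value a bit above 50: when the timer expires the
        // thread only becomes runnable; it still waits to be scheduled onto a CPU.
        System.out.println("Requested 50 ms, actually slept ~" + elapsedMs + " ms");
    }
}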
It's up to the operating system scheduler. Typically, if the sleep is "sufficiently small" and the thread has enough of its timeslice left, the thread will hold onto the core and resume immediately when the sleep is finished. If the sleep is "too long" (typically around 10ms or more), then the core will be available to do other work and the thread will just be made ready-to-run when the sleep finishes. Depending on relative priorities, a new ready-to-run thread may pre-empt currently-running threads.
It will go into the runnable state and wait for the CPU to give it a chance to execute.
I have a Java EE based application running on Tomcat, and I am seeing that all of a sudden the application hangs after running for a couple of hours.
I collected the thread dump from the application just before it hangs and put it in TDA for analysis:
TDA (Thread Dump Analyzer) gives the following message for the above monitor:
A lot of threads are waiting for this monitor to become available again.
This might indicate a congestion. You also should analyze other locks
blocked by threads waiting for this monitor as there might be much more
threads waiting for it.
And here is the stacktrace of the thread highlighted above:
"MY_THREAD" prio=10 tid=0x00007f97f1918800 nid=0x776a
waiting for monitor entry [0x00007f9819560000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.Hashtable.get(Hashtable.java:356)
- locked <0x0000000680038b68> (a java.util.Properties)
at java.util.Properties.getProperty(Properties.java:951)
at java.lang.System.getProperty(System.java:709)
at com.MyClass.myMethod(MyClass.java:344)
I want to know what the "waiting for monitor entry" state means. I would also appreciate any pointers to help me debug this issue.
One of your threads acquired a monitor object (an exclusive lock on an object). That means the thread is executing synchronized code and, for whatever reason, is stuck there, possibly waiting for other threads. But the other threads cannot continue their execution because they encountered a synchronized block and asked for the lock (monitor object), and they cannot get it until it is released by the other thread. So... probably a deadlock.
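If it helps, here is a minimal, self-contained reproduction (not your code) of a thread that a dump would show as BLOCKED, "waiting for monitor entry":

public class MonitorEntryDemo {
    public static void main(String[] args) throws InterruptedException {
        Object shared = new Object();
        Thread holder = new Thread(() -> {
            synchronized (shared) {            // takes the monitor and keeps it
                try { Thread.sleep(60_000); } catch (InterruptedException ignored) { }
            }
        }, "holder");
        holder.start();
        Thread.sleep(100);                     // let "holder" grab the lock first
        Thread blocked = new Thread(() -> {
            synchronized (shared) { /* never entered while "holder" sleeps */ }
        }, "blocked");
        blocked.start();
        // A jstack dump taken now shows "blocked" as BLOCKED,
        // "waiting for monitor entry", on the monitor owned by "holder".
    }
}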
Look for this string in the whole thread dump:
- locked <0x00007f9819560000>
If you can find it, the thread is deadlocked with thread "tid=0x00007f97f1918800".
Monitor = synchronized. You have lots of threads trying to get the lock on the same object.
Maybe you should switch from using a Hashtable to a HashMap.
This means that your thread is trying to take a lock (on the Hashtable), but some other thread is already accessing it and has taken the lock. So your thread is waiting for the lock to be released. Check what your other threads are doing, especially the thread with tid="0x00007f9819560000".
I have a Java application that is structured as:
One thread watching a java.nio.Selector for IO.
A java.util.concurrent.ScheduledThreadPoolExecutor thread pool handling either work to be done immediately — dispatching IO read by the IO thread — or work to be done after a delay, usually errors.
The ScheduledThreadPoolExecutor has an upper bound on the number of threads to create; currently 5000 in the app, but I haven't tuned that number at all.
After running the app for a while, I get thousands and thousands of threads that have this stack trace:
"pool-1-thread-5262" prio=10 tid=0x00007f636c2df800 nid=0x2516 waiting on condition [0x00007f60246a5000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000581c49520> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
at java.util.concurrent.DelayQueue.poll(DelayQueue.java:209)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.poll(ScheduledThreadPoolExecutor.java:611)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.poll(ScheduledThreadPoolExecutor.java:602)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:662)
I assume that the above is being caused by my calls to schedule(java.lang.Runnable, long, java.util.concurrent.TimeUnit), which certainly happens often in the app. Is this the expected behavior?
Having all of these threads hanging around doesn't seem to impact the application at all: if a worker thread is needed, these TIMED_WAITING threads do not appear to prevent tasks from running when submitted through the submit method, but I'm not totally sure of that. Does having thousands of threads hanging around in this parked state impact the app or system performance?
Tasks that are submitted via the schedule method are very simple: they basically just re-schedule the Channel back with the Selector. So, these tasks are not very long-lived, they just need to execute at some point in the future. Normal worker threads will do traditional blocking-IO to perform their work, and are generally more long-lived.
A related question: is it better to do delayed tasks in an explicit, single thread instead of using the schedule method? That is, have a loop like this:
DelayQueue<SomeTaskClass> tasks = ...;   // SomeTaskClass implements Delayed
while (true) {
    SomeTaskClass task = tasks.take();   // blocks until a task's delay has elapsed
    threadpool.submit(task);
}
Does DelayQueue use any worker threads to implement its functionality? I was going to just experiment with it today, but advice would be well appreciated.
After running the app for a while, I get thousands and thousands of threads that have this stack trace.
Unless you actually plan on having 5000 threads all operating at once, that is too high a number. If they are blocked on IO then that should be fine. Unless you started with a minimum number of threads that is too large, their existence in your thread dump means that at some point they were all needed to process the tasks submitted to the executor. So at some point you had 5000 tasks being run at once, blocking or whatever. If you show the actual executor constructor call I can be more specific.
If you have the time, playing with that upper bound might be good to see if it does affect application behavior.
Does having thousands of threads hanging around in this parked state impact the app or system performance?
They will take up more memory which may affect JVM performance but otherwise it should not impact the application unless too many are running at once. They may just be wasting some system resources which is the only reason why I'd play with the 5000 and other executor constructor args.
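If you do decide to play with it, a hedged sketch of a smaller pool (the size 16 is purely illustrative, not a recommendation for your workload):

import java.util.concurrent.ScheduledThreadPoolExecutor;

// A ScheduledThreadPoolExecutor keeps up to corePoolSize threads alive once
// they have been created, so a smaller core size caps the idle parked threads.
ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(16);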
is it better to do delayed tasks in an explicit, single thread instead of using the schedule method?
I'd say no. Just about any time you can replace hand-rolled thread code with the ExecutorService classes, it is a good thing. I think doing a task and then delaying for a while is a great use of the ScheduledThreadPoolExecutor.
Does DelayQueue use any worker threads to implement its functionality?
No. It is just a BlockingQueue implementation that helps with delaying tasks. I've never actually used the class, although I would have if I'd known about it. The ScheduledThreadPoolExecutor uses this class to do its job, so using DelayQueue yourself is again a waste. Just stick with STPE.
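As a closing illustration (the delay and the printed message are placeholders for your re-registration task, which is not shown in the question):

import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

ScheduledThreadPoolExecutor stpe = new ScheduledThreadPoolExecutor(16);
// The real task would re-register the Channel with the Selector; scheduling it
// directly on the pool replaces the hand-rolled DelayQueue loop entirely.
stpe.schedule(() -> System.out.println("re-register channel with selector"),
        200, TimeUnit.MILLISECONDS);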