Nashorn – Concurrent invocations get stuck in WeakHashMap - java

I am aware that JavaScript by design does not support multithreading, but we use JavaScript code as a service: we compile the JavaScript with Nashorn and invoke one of the methods of the compiled script instance concurrently, with different inputs, to get the desired output. Our JavaScript code is thread-safe: it never accesses or manipulates global data and does not manipulate closures.
Occasionally, one of the threads gets stuck in WeakHashMap and blocks all other concurrent threads forever. As of now, we do not have a workaround or solution, because WeakHashMap.getEntry() spins in a tight loop and there is no way to interrupt or safely kill the thread. This forces us to restart the machine now and then, and it also hurts Nashorn's adoption in Tier-1, high-revenue systems.
Thread Dump:
"Thread-3" #7647 prio=5 os_prio=0 tid=0x00007f023c2d0800 nid=0x9384 runnable [0x00007f03feee9000]
java.lang.Thread.State: RUNNABLE
at java.util.WeakHashMap.getEntry(WeakHashMap.java:431)
at java.util.WeakHashMap.containsKey(WeakHashMap.java:417)
at jdk.nashorn.internal.runtime.PropertyListeners$WeakPropertyMapSet.contains(PropertyListeners.java:217)
at jdk.nashorn.internal.runtime.PropertyListeners.containsListener(PropertyListeners.java:115)
- locked <0x000000063c9ecd68> (a jdk.nashorn.internal.runtime.PropertyListeners)
at jdk.nashorn.internal.runtime.PropertyListeners.addListener(PropertyListeners.java:95)
at jdk.nashorn.internal.runtime.PropertyMap.addListener(PropertyMap.java:247)
at jdk.nashorn.internal.runtime.ScriptObject.getProtoSwitchPoint(ScriptObject.java:2112)
at jdk.nashorn.internal.runtime.ScriptObject.createEmptyGetter(ScriptObject.java:2409)
at jdk.nashorn.internal.runtime.ScriptObject.noSuchProperty(ScriptObject.java:2353)
at jdk.nashorn.internal.runtime.ScriptObject.findGetMethod(ScriptObject.java:1960)
at jdk.nashorn.internal.runtime.ScriptObject.lookup(ScriptObject.java:1828)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:104)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:98)
at jdk.internal.dynalink.support.CompositeTypeBasedGuardingDynamicLinker.getGuardedInvocation(CompositeTypeBasedGuardingDynamicLinker.java:176)
at jdk.internal.dynalink.support.CompositeGuardingDynamicLinker.getGuardedInvocation(CompositeGuardingDynamicLinker.java:124)
at jdk.internal.dynalink.support.LinkerServicesImpl.getGuardedInvocation(LinkerServicesImpl.java:154)
at jdk.internal.dynalink.DynamicLinker.relink(DynamicLinker.java:253)
at java.lang.invoke.LambdaForm$DMH/1376533963.invokeSpecial_LLIL_L(LambdaForm$DMH)
at java.lang.invoke.LambdaForm$BMH/1644775282.reinvoke(LambdaForm$BMH)
at java.lang.invoke.LambdaForm$MH/1967400458.exactInvoker(LambdaForm$MH)
at java.lang.invoke.LambdaForm$reinvoker/1083020379.dontInline(LambdaForm$reinvoker)
//Trimmed Purposely
"Thread-2" #7646 prio=5 os_prio=0 tid=0x00007f023c2d6800 nid=0x9383 waiting for monitor entry [0x00007f03fefea000]
java.lang.Thread.State: BLOCKED (on object monitor)
at jdk.nashorn.internal.runtime.PropertyListeners.containsListener(PropertyListeners.java:111)
- waiting to lock <0x000000063c9ecd68> (a jdk.nashorn.internal.runtime.PropertyListeners)
at jdk.nashorn.internal.runtime.PropertyListeners.addListener(PropertyListeners.java:95)
at jdk.nashorn.internal.runtime.PropertyMap.addListener(PropertyMap.java:247)
at jdk.nashorn.internal.runtime.ScriptObject.getProtoSwitchPoint(ScriptObject.java:2112)
at jdk.nashorn.internal.runtime.ScriptObject.createEmptyGetter(ScriptObject.java:2409)
at jdk.nashorn.internal.runtime.ScriptObject.noSuchProperty(ScriptObject.java:2353)
at jdk.nashorn.internal.runtime.ScriptObject.findGetMethod(ScriptObject.java:1960)
at jdk.nashorn.internal.runtime.ScriptObject.lookup(ScriptObject.java:1828)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:104)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:98)
at jdk.internal.dynalink.support.CompositeTypeBasedGuardingDynamicLinker.getGuardedInvocation(CompositeTypeBasedGuardingDynamicLinker.java:176)
at jdk.internal.dynalink.support.CompositeGuardingDynamicLinker.getGuardedInvocation(CompositeGuardingDynamicLinker.java:124)
at jdk.internal.dynalink.support.LinkerServicesImpl.getGuardedInvocation(LinkerServicesImpl.java:154)
at jdk.internal.dynalink.DynamicLinker.relink(DynamicLinker.java:253)
at java.lang.invoke.LambdaForm$DMH/1376533963.invokeSpecial_LLIL_L(LambdaForm$DMH)
at java.lang.invoke.LambdaForm$BMH/1644775282.reinvoke(LambdaForm$BMH)
at java.lang.invoke.LambdaForm$MH/1967400458.exactInvoker(LambdaForm$MH)
at java.lang.invoke.LambdaForm$reinvoker/1083020379.dontInline(LambdaForm$reinvoker)
at java.lang.invoke.LambdaForm$MH/363682507.guard(LambdaForm$MH)
at java.lang.invoke.LambdaForm$reinvoker/1083020379.dontInline(LambdaForm$reinvoker)
//Trimmed Purposely
An almost identical issue is reported in the following bug, but I am not able to +1 it or add more details to it. As stated in that bug, the problem is really hard to reproduce on a developer system.
https://bugs.openjdk.java.net/browse/JDK-8146274
Questions:
Is there any better workaround to address this problem?
What if the JDK team replaced the WeakHashMap with a ConcurrentHashMap? WeakHashMap is certainly not thread-safe.
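As an illustration of the second question, here is a minimal sketch of the general pattern for sharing a WeakHashMap between threads. It is not a patch for Nashorn's internal PropertyListeners class, which application code cannot change; the class and method names below are hypothetical. The WeakHashMap Javadoc itself points to Collections.synchronizedMap for this purpose. A plain ConcurrentHashMap would hold its keys strongly, so swapping it in would change the memory behaviour the weak map was presumably chosen for.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class: several threads insert into one synchronized WeakHashMap.
class SynchronizedWeakMapDemo {

    // Inserts threads * perThread distinct keys concurrently and returns the final map size.
    public static int concurrentInsert(int threads, int perThread) {
        // Every access goes through one lock, so the bucket chains cannot be corrupted.
        Map<Object, Boolean> map =
                Collections.synchronizedMap(new WeakHashMap<Object, Boolean>());
        // Strong references keep the keys alive so the GC cannot clear entries mid-run.
        List<Object> strongRefs = Collections.synchronizedList(new ArrayList<Object>());
        CountDownLatch start = new CountDownLatch(1);
        List<Thread> workers = new ArrayList<>();

        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                try {
                    start.await(); // release all workers at once to maximise contention
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < perThread; i++) {
                    Object key = new Object();
                    strongRefs.add(key);
                    map.put(key, Boolean.TRUE);
                }
            });
            workers.add(worker);
            worker.start();
        }

        start.countDown();
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }

        int size = map.size();
        // Touch strongRefs after size() so the keys stay reachable until here.
        if (strongRefs.size() != threads * perThread) {
            throw new IllegalStateException("a worker did not finish");
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(concurrentInsert(4, 1000));
    }
}
```

The trade-off is that every read and write now contends on a single lock, which is slower than an unsynchronized map but cannot spin forever on a damaged bucket chain.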

Related

How to find out which thread holds the monitor?

My application uses Gson 2.2 to convert POJOs to JSON. While running a load test I found a lot of threads blocked in the Gson constructor:
"http-apr-28201-exec-28" #370 daemon prio=5 os_prio=0 tid=0x0000000001ee7800 nid=0x62cb waiting for monitor entry [0x00007fe64df9a000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.google.gson.Gson.<init>(Gson.java:200)
at com.google.gson.Gson.<init>(Gson.java:179)
The thread dump does NOT show any thread holding the [0x00007fe64df9a000] monitor.
How can I find out who holds it?
Gson code at line 200 looks pretty innocent:
// built-in type adapters that cannot be overridden
factories.add(TypeAdapters.STRING_FACTORY);
factories.add(TypeAdapters.INTEGER_FACTORY);
I'm using JRE 1.8.0_91 on Linux
tl;dr I think you are running into GC-related behavior, where threads are being put in waiting state to allow for garbage collection.
I do not have the whole truth but I hope to provide some pieces of insight.
The first thing to realize is that the number in brackets, [0x00007fe64df9a000], is not the address of a monitor. A number in brackets appears for every thread in a dump, even for threads that are in the running state, and it does not change between dumps. Example from my test dump:
"main" #1 prio=5 os_prio=0 tid=0x00007fe27c009000 nid=0x27e5c runnable [0x00007fe283bc2000]
java.lang.Thread.State: RUNNABLE
at Foo.main(Foo.java:12)
I am not sure what the number means, but this page hints that it is:
... the pointer to the Java VM internal thread structure. It is generally of no interest unless you are debugging a live Java VM or core file.
However, the trace format explained there is a bit different, so I am not sure I am correct.
The way a dump looks when the address of the actual monitor is shown:
"qtp48612937-70" #70 prio=5 os_prio=0 tid=0x00007fbb845b4800 nid=0x133c waiting for monitor entry [0x00007fbad69e8000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:233)
- waiting to lock <0x00000005b8d68e90> (a java.lang.Object)
Notice the waiting to lock line in the trace and that the address of the monitor is different from the number in brackets.
The fact that we cannot see the address of the monitor involved indicates that the monitor exists only in native code.
Secondly, the Gson code involved does not contain any synchronization at all. The code just adds an element to an ArrayList (assuming no bytecode manipulation has been done and nothing fishy is being done at low level). I.e., it would not make sense to see the thread waiting for a standard synchronization monitor at this call.
I found some indications that threads can be shown as waiting for a monitor entry when there is a lot of GC going on.
I wrote a simple test program to try to reproduce it by just adding a lot of elements to an array list:
List<String> l = new ArrayList<>();
while (true) {
for (int i = 0; i < 100_100; i++) {
l.add("" + i);
}
l = new ArrayList<>();
}
Then I took thread dumps of this program. Occasionally I ran into the following trace:
"main" #1 prio=5 os_prio=0 tid=0x00007f35a8009000 nid=0x12448 waiting on condition [0x00007f35ac335000]
java.lang.Thread.State: RUNNABLE
at Foo.main(Foo.java:10) <--- Line of l.add()
While not identical to the OP's trace, it is interesting to have a thread waiting on condition when no synchronization is involved. I experienced it more frequently with a smaller heap, indicating that it might be GC related.
Another possibility could be that code that contains synchronization has been JIT compiled and that prevents you from seeing the actual address of the monitor. However, I think that is less likely since you experience it on ArrayList.add. If that is the case, I know of no way to find out the actual holder of the monitor.
If you don't have GC issues, then maybe some thread really has acquired a lock on an object and the stuck thread is waiting to acquire a lock on the same object. The way to find out is to look for
- waiting to lock <some_hex_address> (a <java_class>)
example would be
- waiting to lock <0x00000000f139bb98> (a java.util.concurrent.ConcurrentHashMap)
in the thread dump entry that says waiting for monitor entry. Once you have found it, you can search for the thread that has already acquired the lock on the object with address <some_hex_address>. For the example it would look something like this:
- locked <0x00000000f139bb98> (a java.util.concurrent.ConcurrentHashMap)
Now you can see the stacktrace of that thread to figure out which line of code has acquired it.
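The search described above can also be scripted. The following is a minimal sketch, assuming the HotSpot text format shown in the dumps on this page; the class name, method names, and the shortened sample dump are illustrative only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds the thread whose stack contains "- locked <address>".
class MonitorOwnerFinder {

    // A thread entry starts with the quoted thread name.
    private static final Pattern THREAD_HEADER = Pattern.compile("^\"([^\"]+)\".*");
    // Matches "- locked <0x...>" but not "- waiting to lock <0x...>".
    private static final Pattern LOCKED = Pattern.compile("-\\s+locked\\s+<(0x[0-9a-fA-F]+)>");

    // Returns the name of the first thread that reports holding the monitor, or null.
    public static String findOwner(List<String> dumpLines, String address) {
        String currentThread = null;
        for (String line : dumpLines) {
            String trimmed = line.trim();
            Matcher header = THREAD_HEADER.matcher(trimmed);
            if (header.matches()) {
                currentThread = header.group(1);
                continue;
            }
            Matcher locked = LOCKED.matcher(trimmed);
            if (locked.find() && locked.group(1).equalsIgnoreCase(address)) {
                return currentThread;
            }
        }
        return null;
    }

    // Shortened excerpt of the Nashorn dump quoted earlier on this page.
    public static List<String> sampleDump() {
        return Arrays.asList(
            "\"Thread-3\" #7647 prio=5 os_prio=0 tid=0x00007f023c2d0800 nid=0x9384 runnable [0x00007f03feee9000]",
            "java.lang.Thread.State: RUNNABLE",
            "at java.util.WeakHashMap.getEntry(WeakHashMap.java:431)",
            "at jdk.nashorn.internal.runtime.PropertyListeners.containsListener(PropertyListeners.java:115)",
            "- locked <0x000000063c9ecd68> (a jdk.nashorn.internal.runtime.PropertyListeners)",
            "\"Thread-2\" #7646 prio=5 os_prio=0 tid=0x00007f023c2d6800 nid=0x9383 waiting for monitor entry [0x00007f03fefea000]",
            "java.lang.Thread.State: BLOCKED (on object monitor)",
            "at jdk.nashorn.internal.runtime.PropertyListeners.containsListener(PropertyListeners.java:111)",
            "- waiting to lock <0x000000063c9ecd68> (a jdk.nashorn.internal.runtime.PropertyListeners)");
    }

    public static void main(String[] args) {
        System.out.println(findOwner(sampleDump(), "0x000000063c9ecd68"));
    }
}
```

Note that a thread parked in Object.wait() also prints a locked line for the monitor it is waiting on, so a match is a starting point for reading that thread's stack rather than proof that it currently owns the lock.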

How to find which threads currently hold java.util.concurrent.Semaphore permits?

I am trying to analyze a thread dump which seems to indicate that numerous threads are waiting for java.util.concurrent.Semaphore permits, i.e., the threads are waiting in Semaphore.acquire().
I was able to infer this because the threads are in the WAITING (parking) state and, from what I understand, semaphores do not use lock monitors but LockSupport.park() instead, waiting for another thread to unpark them.
Now, is there a way to infer from a thread dump which threads currently hold the Semaphore permits?
This would be similar to finding threads in the BLOCKED state and checking which thread holds the lock that causes them to block.
Semaphores do not have a concept of ownership or know anything about threads. This makes them particularly lightweight (and useful in asynchronous programming where your logical thread of execution and the hardware thread on which it is executed won't necessarily have a 1:1 mapping).
You can also see this from the fact that a thread can release a semaphore without ever having acquired it.
You will have to look at the stack traces to see where, and on which semaphores, the threads are waiting and work backwards from there.
There are tools that help you analyse dumps. YourKit is one such tool that can be used to analyse blocked threads.
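Since Semaphore itself records no owners, one option is to keep that bookkeeping in the application. The wrapper below is a minimal sketch under two assumptions that are not from the question: the code that acquires permits can be changed to go through the wrapper, and each permit is released by the thread that acquired it. TrackedSemaphore and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical wrapper that remembers which threads hold permits.
class TrackedSemaphore {

    private final Semaphore semaphore;
    // Thread name -> number of permits that thread currently holds.
    private final ConcurrentHashMap<String, Integer> permitsByThread = new ConcurrentHashMap<>();

    public TrackedSemaphore(int permits) {
        this.semaphore = new Semaphore(permits);
    }

    // Non-blocking acquire; records the calling thread only if a permit was obtained.
    public boolean tryAcquire() {
        if (!semaphore.tryAcquire()) {
            return false;
        }
        permitsByThread.merge(Thread.currentThread().getName(), 1, Integer::sum);
        return true;
    }

    // Removes the bookkeeping entry first, then hands the permit back.
    public void release() {
        permitsByThread.computeIfPresent(
                Thread.currentThread().getName(),
                (name, held) -> held > 1 ? held - 1 : null); // null removes the entry
        semaphore.release();
    }

    // Sorted snapshot of the current holders, suitable for logging next to a thread dump.
    public Map<String, Integer> holders() {
        return new TreeMap<>(permitsByThread);
    }

    // Shows that a plain Semaphore accepts a release without a prior acquire.
    public static int permitsAfterBareRelease() {
        Semaphore plain = new Semaphore(0);
        plain.release();
        return plain.availablePermits();
    }

    public static void main(String[] args) {
        TrackedSemaphore s = new TrackedSemaphore(1);
        s.tryAcquire();
        System.out.println(s.holders());
        s.release();
    }
}
```

Logging holders() when threads appear stuck gives the ownership view that the thread dump alone cannot provide.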
Reference:
https://www.yourkit.com/docs/java/help/monitor_profiling.jsp

understanding line from java thread dump

I have the following thread dump I get using jstack and would like to know what the hex value next to the word runnable shows. I have seen the same value used in other places showing as:
waiting on condition [0x00000000796e9000]
Does this mean the other threads are waiting on this thread?
runnable [0x00000000796e9000]
thread dump
"ajp-bio-8009-exec-2925" daemon prio=10 tid=0x0000000015ca7000 nid=0x53c7 runnable [0x00000000796e9000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
I have the following thread dump I get using jstack and would like to
know what the hex value next to the word runnable shows. I have seen
the same value used in other places showing as:
waiting on condition [0x00000000796e9000]
Does this mean the other threads are waiting on this thread?
Yes. This indicates one thread holds a lock and another thread is waiting to obtain that lock. This is fairly similar to the synchronized keyword conceptually, but can be quite a bit more powerful (and complicated).
Take a look at the javadoc for condition to get a better understanding of conditions.
This question/answer gives a description of the attributes in a thread dump (for java 6).

Analyzing thread dump of a java process

I have a Java EE based application running on Tomcat, and I am seeing that the application suddenly hangs after running for a couple of hours.
I collected a thread dump from the application just before it hung and put it into TDA for analysis:
TDA (Thread Dump Analyzer) gives the following message for the above monitor:
A lot of threads are waiting for this monitor to become available again.
This might indicate a congestion. You also should analyze other locks
blocked by threads waiting for this monitor as there might be much more
threads waiting for it.
And here is the stacktrace of the thread highlighted above:
"MY_THREAD" prio=10 tid=0x00007f97f1918800 nid=0x776a
waiting for monitor entry [0x00007f9819560000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.Hashtable.get(Hashtable.java:356)
- locked <0x0000000680038b68> (a java.util.Properties)
at java.util.Properties.getProperty(Properties.java:951)
at java.lang.System.getProperty(System.java:709)
at com.MyClass.myMethod(MyClass.java:344)
I want to know what the "waiting for monitor entry" state means. I would also appreciate any pointers to help me debug this issue.
One of your threads acquired a monitor (an exclusive lock on an object). That means the thread is executing synchronized code and is, for whatever reason, stuck there, possibly waiting for other threads. The other threads cannot continue because they reached a synchronized block and asked for the same lock (monitor), which they cannot get until the first thread releases it. So... probably a deadlock.
Please look for this string from the whole thread dump
- locked <0x00007f9819560000>
If you can find it, the thread holding it is deadlocked with thread "tid=0x00007f97f1918800"
Monitor = synchronized. You have lots of threads trying to get the lock on the same object.
Maybe you should switch from using a Hashtable to using a HashMap.
This means that your thread is trying to acquire a lock (on the Hashtable), but some other thread is already accessing it and holds the lock, so your thread is waiting for the lock to be released. Check what your other threads are doing, especially the thread with tid="0x00007f9819560000"
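When the process is still alive, the owner of a contended monitor can also be queried from inside the JVM through the standard java.lang.management API instead of being read off a dump. The sketch below is self-contained and uses two hypothetical demo threads; in a real application you would pass the id of the blocked thread you are investigating.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: asks the JVM which thread owns the monitor another thread is blocked on.
class LockOwnerLookup {

    public static String ownerOfContendedLock() {
        final Object lock = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        // Takes the monitor and keeps it until told to let go.
        Thread holder = new Thread(() -> {
            synchronized (lock) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "demo-holder");

        // Blocks at monitor entry while the holder owns the lock.
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                // nothing to do; entering the block is the point
            }
        }, "demo-waiter");

        holder.setDaemon(true);
        waiter.setDaemon(true);
        holder.start();
        try {
            held.await();
            waiter.start();
            // Poll until the waiter is really parked at the monitor.
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(1);
            }
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();
            ThreadInfo info = threads.getThreadInfo(waiter.getId());
            return info == null ? null : info.getLockOwnerName();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let both demo threads finish
        }
    }

    public static void main(String[] args) {
        System.out.println(ownerOfContendedLock());
    }
}
```

ThreadInfo.getLockName() and getLockOwnerId() give the monitor's identity and the owner's id in the same way, and ThreadMXBean.findDeadlockedThreads() reports an actual deadlock cycle if one exists.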

Understanding the Reference Handler thread

I am continuing on my path to a deep understanding of Java threads. Unfortunately my Java certification didn't cover that part, so the only way of learning is to post a series of dumb questions. After so many years of Java development, I sometimes wonder how much I still have to learn :-)
In particular, my attention is now on the Reference Handler thread.
"Reference Handler" daemon prio=10 tid=0x02da3400 nid=0xb98 in Object.wait() [0x0302f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x1aac0320> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Unknown Source)
- locked <0x1aac0320> (a java.lang.ref.Reference$Lock)
Some questions follow. I know the answer to some of them, but I am not posting it because I would like to hear other opinions:
What is the Reference Handler thread supposed to do?
A thread dump should be read bottom-up, so why does the stack trace start with locked? Shouldn't the locked line appear only after the thread has run?
What does "Native Method" mean?
Why "Unknown Source"? In which cases can the thread dump not resolve the source location?
Lastly, why do the waiting on and locked lines show the same address?
As usual, I kindly ask you to answer all the questions so that I can mark the question as answered.
I suspect it handles running finalizers for the JVM. It's an implementation detail and as such not specified in the JVM spec.
This only means that the java.lang.ref.Reference$Lock was locked in the method mentioned in the line preceding it (i.e. in ReferenceHandler.run()).
"Native Method" simply means that the method is implemented in native (i.e. non-Java) code (think JNI).
Unknown Source only means that the .class file doesn't contain any source code location information (at least for this specific point). This can happen either when the method is a synthetic one (doesn't look like it here), or the class was compiled without debug information.
When a thread waits on some object, then it must have locked that object at some point down the call trace, so you can't really have a waiting on without a corresponding locked.
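The last point can be checked directly: Object.wait() throws IllegalMonitorStateException unless the calling thread already owns the object's monitor. A minimal sketch, with a hypothetical class name:

```java
// Hypothetical demo: wait() is only legal while holding the object's monitor.
class WaitRequiresMonitor {

    // Calls wait() without synchronizing first; returns true if the JVM rejects it.
    public static boolean waitWithoutLockFails() {
        Object lock = new Object();
        try {
            lock.wait(1);
            return false; // not reached: the call above throws
        } catch (IllegalMonitorStateException expected) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Calls wait() inside a synchronized block; returns true once the timed wait completes.
    public static boolean waitWithLockSucceeds() {
        Object lock = new Object();
        synchronized (lock) { // this frame is the one a dump would mark as "locked"
            try {
                lock.wait(1); // monitor is released while waiting and re-acquired on return
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(waitWithoutLockFails());
        System.out.println(waitWithLockSucceeds());
    }
}
```

This is why the dump in the question shows a locked line and a waiting on line with the same address: the frame that entered the synchronized block is reported as having locked the monitor, while the wait() call further up the stack has released it for the duration of the wait.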
1) The Finalizer Thread calls finalizer methods.
The Reference Thread has a similar purpose.
http://www.java2s.com/Open-Source/Java-Document/6.0-JDK-Core/lang/java/lang/ref/Reference.java.htm
The OpenJDK source states it is a
High-priority thread to enqueue pending References
The GC creates a simple linked list of references which need to be processed, and this thread quickly adds them to a proper queue. The reason this is done in two phases is that the GC does nothing but find the References; this thread calls the code which handles those references, e.g. it calls Cleaners and notifies ReferenceQueue listeners.
2) A lock is acquired for a synchronized method before it is entered.
3-5) covered by Joachim ;)
Wow, too deep for me. I can only answer one or two of your questions.
"Native Method" simply means the implementation of that method is in some native (i.e. C or C++) library. Once the call stack has "gone native", the JVM can no longer monitor it. No way for it to provide additional stack information.
"Unknown Source" likely means the code was compiled with optimization turned on and debugging info turned off (-g flag?). This eliminates the file/line information from the .class file.
