Flink TaskManager pod has too many threads named "pool-{xxx}-thread-1" - java

In my Flink job, the TaskManager JVM has too many thread pools; every pool contains a single thread named "pool-{poolNumber}-thread-1".
Thread dump:
// about 2200 threads like this
"pool-236-thread-1" #153 prio=5 os_prio=0 tid=0x00007f9e1c12d000 nid=0x245 waiting on condition [0x00007f9e40c66000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007a8082938> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
My debugging so far:
I started the job in my IDE and debugged it.
There is one waiting thread named like 'pool-{poolNumber}-thread-1', namely "pool-5-thread-1".
I put a breakpoint on the Thread class constructor.
The stack of the thread that creates "pool-5-thread-1" is:
Questions:
I don't know what to do next 😢
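For reference, threads named "pool-{N}-thread-{M}" come from the JDK's default thread factory, i.e. from any java.util.concurrent executor created without a custom ThreadFactory (for example Executors.newSingleThreadExecutor()). One pool per thread usually means some piece of user code or a library used inside the job creates a fresh single-threaded executor per call and never shuts it down; its worker then parks in LinkedBlockingQueue.take() exactly as in the dump above. A minimal sketch of that pattern, and of how naming the pool makes the owner visible in a thread dump (SuspectCode and leakyCall() are hypothetical names, not from the job):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class SuspectCode {

    // Anti-pattern: a new executor per call. Its worker thread is named
    // "pool-N-thread-1" by Executors.defaultThreadFactory() and, without
    // shutdown(), it parks in LinkedBlockingQueue.take() forever.
    static void leakyCall(Runnable task) {
        ExecutorService perCall = Executors.newSingleThreadExecutor();
        perCall.submit(task);
        // missing: perCall.shutdown();
    }

    // A shared, explicitly named executor instead: the owner shows up in thread dumps.
    private static final AtomicInteger COUNTER = new AtomicInteger();
    static final ExecutorService SHARED = Executors.newSingleThreadExecutor(new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r, "my-operator-worker-" + COUNTER.incrementAndGet());
            t.setDaemon(true);
            return t;
        }
    });
}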

Related

Request going pending due to threads in WAITING and TIMED_WAITING state

Requests to a Java Spring Boot application go into a pending state because threads are held in WAITING and TIMED_WAITING states.
jstack logs:
"qtp886341817-1399" #1399 prio=5 os_prio=0 tid=0x00007f02142ae800 nid=0x22f904 waiting on condition [0x00007f01c3fa8000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000684588e00> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:656)
at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:49)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:720)
at java.lang.Thread.run(Thread.java:748)
"threadPoolTaskExecutor-1" #114 prio=5 os_prio=0 tid=0x00007f02140b4800 nid=0x229d78 waiting on condition [0x00007f01c55b2000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000684588e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"qtp886341817-717" #717 prio=5 os_prio=0 tid=0x00007f021c102000 nid=0x22c546 in Object.wait() [0x00007f01ee774000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at org.eclipse.paho.client.mqttv3.internal.Token.waitUntilSent(Token.java:248)
- locked <0x0000000689516c80> (a java.lang.Object)
at org.eclipse.paho.client.mqttv3.MqttTopic.publish(MqttTopic.java:117)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
at java.lang.Thread.run(Thread.java:748)
Details:
In this situation, the application is unable to serve API requests, the number of live threads goes up to 250+, and many threads seem to end up in a deadlock-like state.
This Spring application is hosted on an AWS t2.medium instance with Xms=1g, Xmx=2g, and UseG1GC, and we are using the Jetty server.
The application generally serves long-running APIs; some of them take 12 to 60 seconds to respond.
Questions:
Is there any way to find out how many threads a Spring application/JVM/Jetty server can handle?
How can we tune this application to avoid such a situation (where the application becomes non-responsive)?
How can we throttle API requests before the application hangs like this?
Look here:
at org.eclipse.paho.client.mqttv3.internal.Token.waitUntilSent(Token.java:248)
- locked <0x0000000689516c80>
A lock is being held here during a blocking (possibly timing-out) attempt to send the message. Try replacing this publish with an asynchronous call.
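For illustration, a rough sketch of what an asynchronous publish with the Paho client could look like, so that the Jetty request thread no longer blocks in waitUntilSent(); the broker URL, client id, topic and payload are placeholders, not taken from the application above:

import org.eclipse.paho.client.mqttv3.IMqttActionListener;
import org.eclipse.paho.client.mqttv3.IMqttToken;
import org.eclipse.paho.client.mqttv3.MqttAsyncClient;
import org.eclipse.paho.client.mqttv3.MqttException;
import org.eclipse.paho.client.mqttv3.MqttMessage;

public class AsyncPublishSketch {
    public static void main(String[] args) throws MqttException {
        // MqttAsyncClient does not block the calling thread on publish
        MqttAsyncClient client = new MqttAsyncClient("tcp://broker.example.com:1883", "demo-client");
        client.connect().waitForCompletion(); // connect once at startup

        MqttMessage message = new MqttMessage("payload".getBytes());
        message.setQos(1);

        // publish returns immediately; completion is reported via the callback
        client.publish("some/topic", message, null, new IMqttActionListener() {
            @Override
            public void onSuccess(IMqttToken token) {
                // message handed off to the broker
            }

            @Override
            public void onFailure(IMqttToken token, Throwable cause) {
                // log and decide whether to retry
            }
        });
    }
}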

Wildfly 10.1.0 Final Remoting endpoint task threads keep growing

We are using Wildfly 10.1.0 Final.
We encountered an OutOfMemoryError caused by an ever-growing number of threads.
After examining the thread dump, we found that there are thousands of Remoting "endpoint" task-N threads.
What are the Remoting "endpoint" task-N threads for?
Are they created by jboss-remoting?
After restarting the server, we found that in the beginning there were only 16 of them:
Remoting "endpoint" task-1 ~ Remoting "endpoint" task-16.
After the server has run for several days or months, there may be hundreds or thousands of Remoting threads.
A snippet of the thread dump is listed below.
In this thread dump, there are several "Remoting "endpoint" task-11" threads with different thread IDs.
The same goes for the other tasks, task-1 through task-16.
All these threads were doing nothing but waiting.
"Remoting "endpoint" task-11" #55415 daemon prio=5 os_prio=0 tid=0x00007f2b8c0a8000 nid=0x276e waiting on condition [0x00007f280a36c000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabee2b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-11" #55417 daemon prio=5 os_prio=0 tid=0x00007f2ba003f800 nid=0x276d waiting on condition [0x00007f2794818000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabecf40> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-11" #55414 daemon prio=5 os_prio=0 tid=0x00007f2b98023800 nid=0x276b waiting on condition [0x00007f2792bfc000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabeda50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-10" #55411 daemon prio=5 os_prio=0 tid=0x00007f2ba003e000 nid=0x276a waiting on condition [0x00007f27926f7000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabecf40> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-10" #55413 daemon prio=5 os_prio=0 tid=0x00007f2b8c0a7000 nid=0x2769 waiting on condition [0x00007f27927f8000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabee2b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-10" #55412 daemon prio=5 os_prio=0 tid=0x00007f2b98022800 nid=0x2768 waiting on condition [0x00007f27c4815000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabeda50> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-9" #55372 daemon prio=5 os_prio=0 tid=0x00007f2c7408f000 nid=0x41df waiting on condition [0x00007f27907d8000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-8" #55369 daemon prio=5 os_prio=0 tid=0x00007f2c7408d000 nid=0x41dd waiting on condition [0x00007f27909da000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-7" #55368 daemon prio=5 os_prio=0 tid=0x00007f2c7408b000 nid=0x41dc waiting on condition [0x00007f2790adb000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-6" #55367 daemon prio=5 os_prio=0 tid=0x00007f2c74089000 nid=0x41db waiting on condition [0x00007f2790bdc000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-5" #55366 daemon prio=5 os_prio=0 tid=0x00007f2c74087000 nid=0x41da waiting on condition [0x00007f2790cdd000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-4" #55365 daemon prio=5 os_prio=0 tid=0x00007f2c74085000 nid=0x41d9 waiting on condition [0x00007f2790dde000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabece88> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-9" #55364 daemon prio=5 os_prio=0 tid=0x00007f2bd813c000 nid=0x41d8 waiting on condition [0x00007f2790edf000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabed500> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Remoting "endpoint" task-9" #55363 daemon prio=5 os_prio=0 tid=0x00007f2bf4044000 nid=0x41d7 waiting on condition [0x00007f2790fe0000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006eabee3c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
....
Update 2018-09-03:
I found that the Remoting "endpoint" task threads are created by XNIO.
And I found there is an issue of xnio that is very similar to our scenario:
https://issues.jboss.org/browse/XNIO-285
It says this issue has been fixed in "xnio 3.6.0.Beta1".
Unfortunately, Wildfly 10.1.0 is using xnio 3.4.0.
When I tried to upgrade to xnio 3.6.5, I got a java.lang.NoClassDefFoundError for org/wildfly/common/context/Contextual. Even after upgrading to wildfly-common-1.4.0.Final.jar, which contains the class org/wildfly/common/context/Contextual, the NoClassDefFoundError was still there.
Is there any other way to prevent Remoting "endpoint" task threads from growing?
You might be using scoped EJB client contexts for remote method execution.
Every scoped EJB client context creates new threads, and simply calling the context.close() method won't close the context, so you end up with an OutOfMemoryError.
How to close scoped EJB client contexts?
The answer is the same, use the close() method on the EJB client context. But the real question is how do you get the relevant scoped EJB client context which is associated with a JNDI context. Before we get to that, it's important to understand how the ejb: JNDI namespace that's used for EJB lookups and how the JNDI context (typically the InitialContext that you see in the client code) are related. The JNDI API provided by Java language allows "URL context factory" to be registered in the JNDI framework (see this for details http://docs.oracle.com/javase/jndi/tutorial/provider/url/factory.html). Like that documentation states, the URL context factory can be used to resolve URL strings during JNDI lookup. That's what the ejb: prefix is when you do a remote EJB lookup. The ejb: URL string is backed by a URL context factory.
Internally, when a lookup happens for a ejb: URL string, a relevant javax.naming.Context is created for that ejb: lookup. Let's see some code for better understanding:
// JNDI context "A"
Context jndiCtx = new InitialContext(props);
// Now let's look up an EJB
MyBean bean = (MyBean) jndiCtx.lookup("ejb:app/module/distinct/bean!interface");
So we first create a JNDI context and then use it to lookup an EJB. The bean lookup using the ejb: JNDI name, although, is just one statement, involves a few more things under the hood. What's actually happening when you lookup that string is that a separate javax.naming.Context gets created for the ejb: URL string. This new javax.naming.Context is then used to lookup the rest of the string in that JNDI name.
Let's break up that one line into multiple statements to understand better:
// Remember, ejb: is backed by a URL context factory which returns a Context for the ejb: URL (that's why it's called a context factory)
final Context ejbNamingContext = (Context) jndiCtx.lookup("ejb:");
// Use the returned EJB naming context to look up the rest of the JNDI string for the EJB
final MyBean bean = (MyBean) ejbNamingContext.lookup("app/module/distinct/bean!interface");
As you see above, we split up that single statement into a couple of statements for explaining the details better. So as you can see when the ejb: URL string is parsed in a JNDI name, it gets hold of a javax.naming.Context instance. This instance is different from the one which was used to do the lookup (jndiCtx in this example). This is an important detail to understand (for reasons explained later). Now this returned instance is used to lookup the rest of the JNDI string ("app/module/distinct/bean!interface"), which then returns the EJB proxy. Irrespective of whether the lookup is done in a single statement or multiple parts, the code works the same. i.e. an instance of javax.naming.Context gets created for the ejb: URL string.
So why am I explaining all this when the section is titled "How to close scoped EJB client contexts"? The reason is because client applications dealing with scoped EJB client contexts which are associated with a JNDI context would expect the following code to close the associated EJB client context, but will be surprised that it won't:
final Properties props = new Properties();
// mark it for scoped EJB client context
props.put("org.jboss.ejb.client.scoped.context", "true");
// add other properties
props.put(....);
...
Context jndiCtx = new InitialContext(props);
try {
    final MyBean bean = (MyBean) jndiCtx.lookup("ejb:app/module/distinct/bean!interface");
    bean.doSomething();
} finally {
    jndiCtx.close();
}
Applications expect that the call to jndiCtx.close() will effectively close the EJB client context associated with the JNDI context. That doesn't happen because as explained previously, the javax.naming.Context backing the ejb: URL string is a different instance than the one the code is closing. The JNDI implementation in Java, only just closes the context on which the close was called. As a result, the other javax.naming.Context that backs the ejb: URL string is still not closed, which effectively means that the scoped EJB client context is not closed too which then ultimately means that the connection to the server(s) in the EJB client context are not closed too.
So now let's see how this can be done properly. We know that the ejb: URL string lookup returns us a javax.naming.Context. All we have to do is keep a reference to this instance and close it when we are done with the EJB invocations. So here's how it's going to look:
final Properties props = new Properties();
// mark it for scoped EJB client context
props.put("org.jboss.ejb.client.scoped.context", "true");
// add other properties
props.put(....);
...
Context jndiCtx = new InitialContext(props);
Context ejbRootNamingContext = (Context) jndiCtx.lookup("ejb:");
try {
    final MyBean bean = (MyBean) ejbRootNamingContext.lookup("app/module/distinct/bean!interface"); // the rest of the EJB JNDI string
    bean.doSomething();
} finally {
    try {
        // close the EJB naming JNDI context
        ejbRootNamingContext.close();
    } catch (Throwable t) {
        // log and ignore
    }
    try {
        // also close our other JNDI context since we are done with it too
        jndiCtx.close();
    } catch (Throwable t) {
        // log and ignore
    }
}
As you see, we changed the code to first do a lookup on just the "ejb:" string to get hold of the EJB naming context and then used that ejbRootNamingContext instance to lookup the rest of the EJB JNDI name to get hold of the EJB proxy. Then when it was time to close the context, we closed the ejbRootNamingContext (as well as the other JNDI context). Closing the ejbRootNamingContext ensures that the scoped EJB client context associated with that JNDI context is closed too. Effectively, this closes the connection(s) to the server(s) within that EJB client context.
For more details you can refer to Scoped EJB client contexts.
I found that the Remoting "endpoint" task threads are created by javax.management.remote.JMXConnector. We opened several JMXConnector instances to access MBeans on other servers but never closed them. After closing those JMXConnector instances, the threads were gone.
javax.management.remote.JMXConnector uses XNIO to communicate with the remote MBean servers. It creates an XnioWorker when it is opened, and the XnioWorker creates the Remoting "endpoint" task threads. So the problem was not caused by EJB.
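For illustration, a minimal sketch of opening a JMXConnector with try-with-resources so that it is always closed and its worker threads are released; the service URL is a placeholder for whatever remote JMX endpoint is actually used:

import java.util.Collections;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JmxQuerySketch {
    public static void main(String[] args) throws Exception {
        // placeholder URL; the actual protocol and port depend on the target server
        JMXServiceURL url = new JMXServiceURL("service:jmx:remoting-jmx://other-server:9999");

        // JMXConnector implements Closeable: try-with-resources closes it when done,
        // which releases the underlying worker threads
        try (JMXConnector connector = JMXConnectorFactory.connect(url, Collections.emptyMap())) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            System.out.println("MBean count: " + connection.getMBeanCount());
        }
    }
}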

is it possible for `ConcurrentHashMap` to hang?

First I'll summarize what I've found so far.
This answer suggests that changing the concurrencyLevel parameter of ConcurrentHashMap's constructor might help. I've tried that and my code still hung.
Answers here suggest that it could be a runtime bug.
What I'm trying to do:
I have 10 worker threads running along with a main thread. The worker threads have to process many arrays to find the index of the max element in each array (if there are multiple max values, the first occurrence is used). Among these "many arrays", some can be duplicates, so I'm trying to avoid redundant full array scans to speed up the program.
The controller class contains a ConcurrentHashMap that maps the hash values of arrays to the corresponding max-element indices.
The worker threads will ask the controller class for the mapped index first before trying to calculate the index by doing full array scans. In the latter case, the newly calculated index will be put into the map.
The main thread does not access the hash map.
What happened:
My code will hang after 70,000 ~ 130,000 calls to getMaxIndex(). This count is obtained by putting a log string into getMaxIndex() so it might not be exactly accurate.
My CPU usage will gradually go up for ~6 seconds, and then it will go down to ~10% after peaking at ~100%. I have plenty of unused memory left. (Does this look like deadlock?)
If the code does not use map it works just fine (see getMaxIndex() version 2 below).
I've tried adding synchronized to getMaxIndex()'s signature and using a regular HashMap instead; that also did not work.
I've tried to use different initialCapacity values too (e.g. 50,000 & 100,000). Did not work.
Here's my code:
// in the controller class
int getMaxIndex(@NotNull double[] arr) {
    int hash = Arrays.hashCode(arr);
    if (maxIndices.containsKey(hash)) {
        return maxIndices.get(hash);
    } else {
        int maxIndex =
            IntStream.range(0, arr.length)
                     .reduce((a, b) -> arr[a] < arr[b] ? b : a)
                     .orElse(-1); // -1 to let program crash
        maxIndices.put(hash, maxIndex);
        return maxIndex;
    }
}
The worker thread will call getMaxIndex() like this: return remaining[controller.getMaxIndex(arr)];, remaining is just another int array.
getMaxIndex() v2:
int getMaxIndex(@NotNull double[] arr) {
    return IntStream.range(0, arr.length)
                    .reduce((a, b) -> arr[a] < arr[b] ? b : a)
                    .orElse(-1); // -1 to let program crash
}
JVM info in case it matters:
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
EDIT: stack dump; I used Phaser to synchronize the worker threads, so some of them appear to be waiting on the phaser, but pool-1-thread-2, pool-1-thread-10, pool-1-thread-11, and pool-1-thread-12 do not appear to be waiting on the phaser.
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.151-b12 mixed mode):
"Attach Listener" #23 daemon prio=9 os_prio=0 tid=0x00007f0c54001000 nid=0x4da2 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"pool-1-thread-13" #22 prio=5 os_prio=0 tid=0x00007f0c8c2cb800 nid=0x4d5e waiting on condition [0x00007f0c4eddd000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076e792f40> (a java.util.concurrent.Phaser$QNode)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
at Ant.call(Ant.java:77)
at Ant.call(Ant.java:10)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-12" #21 prio=5 os_prio=0 tid=0x00007f0c8c2ca000 nid=0x4d5d waiting on condition [0x00007f0c4eede000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000775518738> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-11" #20 prio=5 os_prio=0 tid=0x00007f0c8c2c8000 nid=0x4d5c waiting on condition [0x00007f0c4efdf000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000775518738> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-10" #19 prio=5 os_prio=0 tid=0x00007f0c8c2c6000 nid=0x4d5b waiting on condition [0x00007f0c4f0e0000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000775518738> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-9" #18 prio=5 os_prio=0 tid=0x00007f0c8c2c4800 nid=0x4d5a waiting on condition [0x00007f0c4f1e1000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076e7c74f8> (a java.util.concurrent.Phaser$QNode)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
at Ant.call(Ant.java:77)
at Ant.call(Ant.java:10)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-8" #17 prio=5 os_prio=0 tid=0x00007f0c8c2c2800 nid=0x4d59 waiting on condition [0x00007f0c4f2e2000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076e64fb78> (a java.util.concurrent.Phaser$QNode)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
at Ant.call(Ant.java:77)
at Ant.call(Ant.java:10)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-7" #16 prio=5 os_prio=0 tid=0x00007f0c8c2c1000 nid=0x4d58 waiting on condition [0x00007f0c4f3e3000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076e8b44c8> (a java.util.concurrent.Phaser$QNode)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
at Ant.call(Ant.java:77)
at Ant.call(Ant.java:10)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-6" #15 prio=5 os_prio=0 tid=0x00007f0c8c2bf800 nid=0x4d57 waiting on condition [0x00007f0c4f4e4000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076e5b4500> (a java.util.concurrent.Phaser$QNode)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
at Ant.call(Ant.java:77)
at Ant.call(Ant.java:10)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-5" #14 prio=5 os_prio=0 tid=0x00007f0c8c2bd800 nid=0x4d56 waiting on condition [0x00007f0c4f5e5000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076e836958> (a java.util.concurrent.Phaser$QNode)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
at Ant.call(Ant.java:77)
at Ant.call(Ant.java:10)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-4" #13 prio=5 os_prio=0 tid=0x00007f0c8c2bc000 nid=0x4d55 waiting on condition [0x00007f0c4f6e6000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076e4f4cf0> (a java.util.concurrent.Phaser$QNode)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
at Ant.call(Ant.java:77)
at Ant.call(Ant.java:10)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-3" #12 prio=5 os_prio=0 tid=0x00007f0c8c2ba000 nid=0x4d54 waiting on condition [0x00007f0c4f7e7000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076e40abb8> (a java.util.concurrent.Phaser$QNode)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
at Ant.call(Ant.java:77)
at Ant.call(Ant.java:10)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-2" #11 prio=5 os_prio=0 tid=0x00007f0c8c2b8800 nid=0x4d53 waiting on condition [0x00007f0c4f8e8000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000775518738> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-1" #10 prio=5 os_prio=0 tid=0x00007f0c8c2b5800 nid=0x4d52 waiting on condition [0x00007f0c4f9e9000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076e486ab0> (a java.util.concurrent.Phaser$QNode)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.Phaser$QNode.block(Phaser.java:1140)
at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
at java.util.concurrent.Phaser.internalAwaitAdvance(Phaser.java:1067)
at java.util.concurrent.Phaser.arriveAndAwaitAdvance(Phaser.java:690)
at Ant.call(Ant.java:77)
at Ant.call(Ant.java:10)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Service Thread" #9 daemon prio=9 os_prio=0 tid=0x00007f0c8c200800 nid=0x4d50 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread2" #8 daemon prio=9 os_prio=0 tid=0x00007f0c8c1fd800 nid=0x4d4f waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread1" #7 daemon prio=9 os_prio=0 tid=0x00007f0c8c1f8800 nid=0x4d4e waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #6 daemon prio=9 os_prio=0 tid=0x00007f0c8c1f7800 nid=0x4d4d waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Monitor Ctrl-Break" #5 daemon prio=5 os_prio=0 tid=0x00007f0c8c1fb000 nid=0x4d4c runnable [0x00007f0c781b4000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
- locked <0x000000077550ecb0> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
- locked <0x000000077550ecb0> (a java.io.InputStreamReader)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at com.intellij.rt.execution.application.AppMainV2$1.run(AppMainV2.java:64)
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f0c8c181000 nid=0x4d49 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f0c8c14d800 nid=0x4d42 in Object.wait() [0x00007f0c78564000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000775500d08> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x0000000775500d08> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f0c8c149000 nid=0x4d41 in Object.wait() [0x00007f0c78665000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x0000000775500d48> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x0000000775500d48> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"main" #1 prio=5 os_prio=0 tid=0x00007f0c8c00c800 nid=0x4d35 waiting on condition [0x00007f0c91f77000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076dd5e268> (a java.util.concurrent.FutureTask)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at java.util.concurrent.AbstractExecutorService.invokeAll(AbstractExecutorService.java:244)
at ConcurrentACS.loop(ConcurrentACS.java:138)
at ConcurrentACS.compute(ConcurrentACS.java:165)
at ConcurrentACS.main(ConcurrentACS.java:192)
"VM Thread" os_prio=0 tid=0x00007f0c8c141800 nid=0x4d3f runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f0c8c022000 nid=0x4d37 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f0c8c024000 nid=0x4d38 runnable
"GC task thread#2 (ParallelGC)" os_prio=0 tid=0x00007f0c8c025800 nid=0x4d39 runnable
"GC task thread#3 (ParallelGC)" os_prio=0 tid=0x00007f0c8c027800 nid=0x4d3a runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007f0c8c205800 nid=0x4d51 waiting on condition
JNI global references: 272
is it possible for ConcurrentHashMap to hang?
The short answer is no if by "hang" you mean some sort of program loop or deadlock. If you are implying that you have discovered a race condition (bug) in that code that would cause it to hang during normal JVM and system execution then I seriously doubt it.
I suspect that there is something else going on and just because you are using a CHM in the version that is hanging shouldn't imply that the class has a bug. I would use stack dumps or a profiler to show that the code is locked on a CHM line before I'd cast any blame that way.
Is it possible that you are calling the CHM so many times per second that the performance of your program suffers because of it? Sure. But it wouldn't hang in the sense of being stuck or deadlocked.
My CPU usage will gradually go up for ~6 seconds, and then it will go down to ~10% after peaked at ~100%. I have plenty of unused memory left. (Does this look like deadlock?)
Your now-posted stack trace shows that no threads are locked in CHM code, so it doesn't look to be the problem. The performance curve you are talking about seems to be happening because the fork/join thread pool that you are using initially starts X threads, but then some of them finish their tasks and exit. This is to be expected. It has nothing to do with the CHM.
if (maxIndices.containsKey(hash)) {
    return maxIndices.get(hash);
Just a quick comment. This code makes 2 calls to the CHM instead of something like:
Integer maxIndex = maxIndices.get(hash);
if (maxIndex != null) {
    return maxIndex;
}
...
But that's just inefficient and wouldn't cause a bug. Also, it is important to recognize that the race condition in your code means that multiple threads might get a null for the index and each calculate the index value. But again, this is not a bug that would cause a "hang".
The first version isn't thread safe because your check-then-act sequence isn't atomic. Try this implementation instead:
private final Map<Integer, Integer> maxIndices = new ConcurrentHashMap<>();

int getMaxIndex(final double[] arr) {
    // make sure the content of arr can't be modified concurrently,
    // otherwise create a copy of the array inside this method
    int hash = Arrays.hashCode(arr);
    return maxIndices.computeIfAbsent(hash,
            key -> IntStream.range(0, arr.length).reduce((a, b) -> arr[a] < arr[b] ? b : a).orElse(-1));
}

Logback hangs forever

I am writing a Java application to route a high number of concurrent messages. The application uses the Logback framework for logging and I am seeing a surprising behavior where the application hangs. In a stack trace, I can see that application threads are stuck in logging calls:
"New I/O client worker #1-1" #125 prio=5 os_prio=0 tid=0x00007f0524017000 nid=0x29f3 waiting on condition [0x00007f052ecea000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00007f089c4a7e88> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at java.util.concurrent.ArrayBlockingQueue.remainingCapacity(ArrayBlockingQueue.java:468)
at ch.qos.logback.core.AsyncAppenderBase.isQueueBelowDiscardingThreshold(AsyncAppenderBase.java:152)
at ch.qos.logback.core.AsyncAppenderBase.append(AsyncAppenderBase.java:144)
at ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:84)
at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:51)
at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:270)
at ch.qos.logback.classic.Logger.callAppenders(Logger.java:257)
at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:421)
at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:383)
at ch.qos.logback.classic.Logger.info(Logger.java:579)
at com.application.ClientListener$6.operationComplete(***.java:514)
- locked <0x00007f089c372b60> (a com.application.ClientListener)
at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:381)
at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:372)
at org.jboss.netty.channel.DefaultChannelFuture.setSuccess(DefaultChannelFuture.java:316)
at org.jboss.netty.channel.socket.nio.NioWorker$RegisterTask.run(NioWorker.java:776)
at org.jboss.netty.channel.socket.nio.NioWorker.processRegisterTaskQueue(NioWorker.java:257)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:199)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
at org.jboss.netty.util.internal.IoWorkerRunnable.run(IoWorkerRunnable.java:46)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- <0x00007f08a80fc118> (a java.util.concurrent.ThreadPoolExecutor$Worker)
It seems that the logging call is blocked trying to acquire a lock <0x00007f089c4a7e88> inside a java.util.concurrent.ArrayBlockingQueue instance used in AsyncAppenderBase.
In the stack trace, I can see that the lock <0x00007f089c4a7e88> is held by another thread in a thread pool that is idle:
"dispatcher-3" #90 prio=5 os_prio=0 tid=0x00007f04d0004800 nid=0x29d2 waiting on condition [0x00007f0534ed3000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00007f089cbbaae8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- <0x00007f089c4a7e88> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
It looks like the internal lock of the ArrayBlockingQueue was held by that thread and subsequently not released.
What is going on here? A race condition in java.util.concurrent.ArrayBlockingQueue? A bug in Logback?
I am using Java 8u40 and Logback 1.2.1.
You need to set the AsyncAppender's neverBlock option to true so that logging calls never block when the queue is full (events are dropped instead).
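For reference, a minimal logback.xml sketch with neverBlock enabled on the AsyncAppender; the wrapped FILE appender, its pattern and the queue size are placeholders:

<configuration>
  <appender name="FILE" class="ch.qos.logback.core.FileAppender">
    <file>app.log</file>
    <encoder>
      <pattern>%d %level %logger - %msg%n</pattern>
    </encoder>
  </appender>

  <appender name="ASYNC" class="ch.qos.logback.classic.AsyncAppender">
    <!-- drop events instead of blocking callers when the queue is full -->
    <neverBlock>true</neverBlock>
    <queueSize>512</queueSize>
    <appender-ref ref="FILE"/>
  </appender>

  <root level="INFO">
    <appender-ref ref="ASYNC"/>
  </root>
</configuration>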

jetty-9 all the threads in TIMED_WAITING(parking) state

I have an application which is deployed on a Jetty server.
The standalone Jetty server has the default configuration. The application contains Spring-based WebSockets.
Clients connect to the server over WebSocket connections.
Once a connection is established, the client sends heartbeat messages every 5 seconds and occasionally sends requests such as requests for specific data.
When clients open connections without sending any heartbeat messages or specific requests, I am able to open any number of connections (tried up to 10,000).
On the other hand, when all the connections are open and the clients start sending heartbeat messages and specific requests in parallel, the server hangs and is no longer able to accept more than 2,000 connections.
Jetty Server version: jetty-distribution-9.3.8.v20160314
In other words, the application hangs after only about 2,000 connections.
Using jstack took the thread dump and the following is the result:
"qtp1143839598-216" #216 prio=5 os_prio=0 tid=0x00007fcd88017000 nid=0x7365 waiting on condition [0x00007fcd344c3000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000850d7fb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:546)
at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:47)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:609)
at java.lang.Thread.run(Thread.java:745)
"qtp1143839598-215" #215 prio=5 os_prio=0 tid=0x00007fcd8c01c000 nid=0x7364 waiting on condition [0x00007fcd345c4000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000850d7fb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:546)
at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:47)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:609)
at java.lang.Thread.run(Thread.java:745)
"qtp1143839598-214" #214 prio=5 os_prio=0 tid=0x00007fcd805ea800 nid=0x735f waiting on condition [0x00007fcd346c5000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000850d7fb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:546)
at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:47)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:609)
at java.lang.Thread.run(Thread.java:745)
"qtp1143839598-213" #213 prio=5 os_prio=0 tid=0x00007fcd84135000 nid=0x7358 waiting on condition [0x00007fcd347c6000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000850d7fb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:546)
at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:47)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:609)
at java.lang.Thread.run(Thread.java:745)
"qtp1143839598-212" #212 prio=5 os_prio=0 tid=0x00007fcd78032000 nid=0x7354 waiting on condition [0x00007fcd348c7000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000850d7fb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:546)
at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:47)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:609)
at java.lang.Thread.run(Thread.java:745)
There are around 196 threads with stack traces similar to the above.
We are expecting:
1. 500K live connections at any point of time
2. 2000 new connections per minute, 4000 re-connects (error and retry on the client side)
Is there any way to configure Jetty to handle the above requirements? Do the jstack logs above indicate a deadlock, or a problem in the application threads? Or is there any configuration to be made in Spring WebSockets to achieve these requirements? Sorry if the language/framing of the question is not clear.
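For context on the dump: the qtp* threads above are idle workers of Jetty's org.eclipse.jetty.util.thread.QueuedThreadPool parked in idleJobPoll(); that state by itself is not a deadlock, they are simply waiting for work. Sizing that pool is one of the knobs involved. As a hedged sketch using the embedded API (a standalone distribution exposes the equivalent min/max thread settings through its jetty.xml / start.ini rather than code), it might look like the following; the numbers and names are placeholders, not a recommendation for the load described above:

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;
import org.eclipse.jetty.util.thread.QueuedThreadPool;

public class EmbeddedJettySketch {
    public static void main(String[] args) throws Exception {
        // maxThreads=500, minThreads=50; placeholder values for illustration only
        QueuedThreadPool threadPool = new QueuedThreadPool(500, 50);
        threadPool.setName("qtp-app");

        Server server = new Server(threadPool);

        ServerConnector connector = new ServerConnector(server);
        connector.setPort(8080);
        server.addConnector(connector);

        server.start();
        server.join();
    }
}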
