SelectorImpl is BLOCKED - java

I have many clients, each sending about 1000 requests per second to the server. The server's CPU quickly rises to 600% (8 cores) and stays there. When I print the process with jstack, I find SelectorImpl in the BLOCKED state. The record is as follows:
nioEventLoopGroup-4-1 prio=10 tid=0x00007fef28001800 nid=0x1dbf waiting for monitor entry [0x00007fef9eec7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.ch.EPollSelectorImpl.doSelect(Unknown Source)
- waiting to lock <0x00000000c01f1af8> (a java.lang.Object)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
- locked <0x00000000c01d9420> (a io.netty.channel.nio.SelectedSelectionKeySet)
- locked <0x00000000c01f1948> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000c01d92c0> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(Unknown Source)
at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:635)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:319)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
at java.lang.Thread.run(Unknown Source)
Does the high CPU have something to do with this? Another problem is that when I connect many clients, some clients fail to connect. The error is as follows:
"nioEventLoopGroup-4-1" prio=10 tid=0x00007fef28001800 nid=0x1dbf waiting for monitor entry [0x00007fef9eec7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.ch.EPollSelectorImpl.doSelect(Unknown Source)
- waiting to lock <0x00000000c01f1af8> (a java.lang.Object)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
- locked <0x00000000c01d9420> (a io.netty.channel.nio.SelectedSelectionKeySet)
- locked <0x00000000c01f1948> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000c01d92c0> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(Unknown Source)
at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:635)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:319)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
at java.lang.Thread.run(Unknown Source)
The clients are created using a thread pool, and a connection timeout is set, but why are there frequent connection timeouts? Is the server the cause?
public void run() {
    System.out.println(tnum + " connecting...");
    try {
        Bootstrap bootstrap = new Bootstrap();
        bootstrap.group(group)
                .channel(NioSocketChannel.class)
                .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 30000)
                .handler(loadClientInitializer);
        // Start the connection attempt.
        ChannelFuture future = bootstrap.connect(host, port);
        future.channel().attr(AttrNum).set(tnum);
        future.sync();
        if (future.isSuccess()) {
            System.out.println(tnum + " login success.");
            goSend(tnum, future.channel());
        } else {
            System.out.println(tnum + " login failed.");
        }
    } catch (Exception e) {
        XLog.error(e);
    } finally {
        // group.shutdownGracefully();
    }
}

Does the high CPU have something to do with this?
It might. I would diagnose this problem in the following way (on a Linux box):
Find threads which are eating CPU
Using pidstat, I'd find which threads are eating CPU and in which mode (user/kernel) the time is spent.
$ pidstat -p [java-process-pid] -tu 1 | awk '$9 > 50'
This command shows threads eating at least 50% of CPU time. You can inspect what those threads are doing using jstack, VisualVM or Java Flight Recorder.
If the CPU-hungry threads and the BLOCKED threads are the same ones, the CPU usage probably has something to do with contention.
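If the shell tools aren't available, a rough in-process alternative (my sketch, not part of the original diagnosis, and it has to run inside the affected JVM or be adapted to attach over JMX) is to sample per-thread CPU time via ThreadMXBean and print the hungriest threads together with their state:
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

public class TopCpuThreads {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Per-thread CPU timing may be unsupported or disabled on some JVMs.
        if (!mx.isThreadCpuTimeSupported()) {
            System.out.println("Thread CPU time not supported");
            return;
        }
        mx.setThreadCpuTimeEnabled(true);

        // First sample: CPU time (nanoseconds) per thread id.
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id));
        }
        Thread.sleep(1000);

        // Second sample: report threads that burned more than ~10% of a core.
        for (long id : mx.getAllThreadIds()) {
            long prev = before.getOrDefault(id, 0L);
            long now = mx.getThreadCpuTime(id);
            if (prev < 0 || now < 0) {
                continue; // thread died or timing unavailable
            }
            long deltaMs = (now - prev) / 1_000_000;
            ThreadInfo info = mx.getThreadInfo(id);
            if (deltaMs > 100 && info != null) {
                System.out.printf("%s: %d ms CPU over the last second, state=%s%n",
                        info.getThreadName(), deltaMs, info.getThreadState());
            }
        }
    }
}
Threads that show both a high CPU delta here and a BLOCKED state in jstack are the ones worth correlating first.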
Find the reason for the connection timeout
Basically, you get a connection timeout when the two OSes can't finish the TCP handshake in the given time. There are several reasons for this:
Network link saturation. This can be diagnosed using sar -n DEV 1 and comparing the rxkB/s and txkB/s columns to your link's maximum throughput.
The server (Netty) doesn't respond with an accept() call within the given timeout. That thread can be BLOCKED or starving for CPU time. You can find which threads are calling accept() (and therefore finishing the TCP handshake) using strace -f -e trace=accept -p [java-pid], and after that check for possible causes using pidstat/jstack. (A related server-side setting is sketched below.)
You can also find the number of connection requests that were received but not yet confirmed with netstat -an | grep -c SYN_RECV
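On the accept() point, one server-side knob that sometimes matters is the listen backlog. The following is only a hedged sketch of a generic Netty server bootstrap (the port and the LoggingHandler are placeholders, not taken from the question); it does not replace the diagnosis steps above:
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelOption;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.logging.LoggingHandler;

public class BacklogServer {
    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup bossGroup = new NioEventLoopGroup(1);
        EventLoopGroup workerGroup = new NioEventLoopGroup();
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)
                    .channel(NioServerSocketChannel.class)
                    // Length of the kernel's queue of completed connections waiting
                    // for accept(); if the boss thread is slow, a small backlog makes
                    // clients see connect timeouts sooner.
                    .option(ChannelOption.SO_BACKLOG, 1024)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            ch.pipeline().addLast(new LoggingHandler());
                        }
                    });
            ChannelFuture f = b.bind(8080).sync();
            f.channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}
Note that the effective backlog is still capped by the kernel (net.core.somaxconn), so the OS-level checks above remain the primary tool.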

If you can elaborate more on what your Netty code is doing, that would be helpful. Regardless, please make sure you are closing the channels. Note from the Channel javadoc:
It is important to call close() or close(ChannelPromise) to release all resources once you are done with the Channel. This ensures all resources are released in a proper way, i.e. filehandles
If you are closing the channels, then the problem may be in the logic itself (running into infinite loops or similar), which could explain the high CPU.
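As an illustration only (adapting the run() method from the question; goSend() and tnum are the question's own names, and I'm assuming goSend() completes the exchange before returning), releasing the channel could look like this:
// Inside the question's run() method, after bootstrap.connect(host, port):
Channel channel = future.sync().channel();
try {
    goSend(tnum, channel);
} finally {
    // Release the file handle and deregister from the event loop when done.
    channel.close().syncUninterruptibly();
}

// Or close asynchronously once the final write completes:
// channel.writeAndFlush(lastMessage).addListener(ChannelFutureListener.CLOSE);
If the channels are meant to stay open for the lifetime of each client, then the leak is elsewhere and the infinite-loop suspicion above is the better lead.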

Related

GRPC server in JAVA to support huge load of requests

I have a gRPC (1.13.x) server in Java that doesn't perform any computation or I/O-intensive task. The intention is to check how many requests per second this server can support on an 80-core machine.
Server:
ExecutorService executor = new ThreadPoolExecutor(160, Integer.MAX_VALUE,
        60L, TimeUnit.SECONDS,
        new SynchronousQueue<Runnable>(),
        new ThreadFactoryBuilder()
                .setDaemon(true)
                .setNameFormat("Glowroot-IT-Harness-GRPC-Executor-%d")
                .build());
Server server = NettyServerBuilder.forPort(50051)
        .addService(new MyService())
        .executor(executor)
        .build()
        .start();
Service:
@Override
public void verify(Request request, StreamObserver<Result> responseObserver) {
    Result result = Result.newBuilder()
            .setMessage("hello")
            .build();
    responseObserver.onNext(result);
    responseObserver.onCompleted();
}
I am using the ghz client to perform a load test. The server is able to handle 40k requests per second, but the RPS count does not exceed 40k even when I increase the number of concurrent clients with an incoming request rate of 100k. The gRPC server handles just 40K requests per second and queues all the other requests. CPU is underutilized (7%). About 90% of the gRPC threads (with the prefix grpc-default-executor) were in the waiting state, despite there being no I/O operations. More than 25k threads are in the waiting state.
Stack trace of the waiting threads:
grpc-default-executor-4605
PRIORITY :5
THREAD ID :0X00007F15A4440D80
NATIVE ID :
stackTrace:
java.lang.Thread.State: TIMED_WAITING (parking)
at jdk.internal.misc.Unsafe.park(java.base@15.0.1/Native Method)
- parking to wait for <0x00007f1df161ae20> (a java.util.concurrent.SynchronousQueue$TransferStack)
at java.util.concurrent.locks.LockSupport.parkNanos(java.base@15.0.1/LockSupport.java:252)
at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.base@15.0.1/SynchronousQueue.java:462)
at java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.base@15.0.1/SynchronousQueue.java:361)
at java.util.concurrent.SynchronousQueue.poll(java.base@15.0.1/SynchronousQueue.java:937)
at java.util.concurrent.ThreadPoolExecutor.getTask(java.base@15.0.1/ThreadPoolExecutor.java:1055)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@15.0.1/ThreadPoolExecutor.java:1116)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@15.0.1/ThreadPoolExecutor.java:630)
at java.lang.Thread.run(java.base@15.0.1/Thread.java:832)
Locked ownable synchronizers:
- None
How can I configure the server to support 100K+ requests?
Nothing in the gRPC stack seems to cause this limit. What's the average response time on the server side? It looks like you are limited by ephemeral ports or the TCP connection limit, and you may want to tweak your kernel as described here https://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1 or here https://blog.box.com/ephemeral-port-exhaustion-and-web-services-at-scale
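To get at the "average response time" question empirically, one option (a sketch of my own, not something the answer prescribes) is a gRPC server interceptor that measures each call:
import io.grpc.ForwardingServerCall.SimpleForwardingServerCall;
import io.grpc.Metadata;
import io.grpc.ServerCall;
import io.grpc.ServerCallHandler;
import io.grpc.ServerInterceptor;
import io.grpc.Status;

/** Logs the wall-clock duration of each gRPC call. */
public class TimingInterceptor implements ServerInterceptor {
    @Override
    public <ReqT, RespT> ServerCall.Listener<ReqT> interceptCall(
            ServerCall<ReqT, RespT> call, Metadata headers,
            ServerCallHandler<ReqT, RespT> next) {
        long startNanos = System.nanoTime();
        ServerCall<ReqT, RespT> timedCall = new SimpleForwardingServerCall<ReqT, RespT>(call) {
            @Override
            public void close(Status status, Metadata trailers) {
                // close() is invoked once per call, when the response is completed.
                long micros = (System.nanoTime() - startNanos) / 1_000;
                System.out.println(call.getMethodDescriptor().getFullMethodName()
                        + " took " + micros + " us, status=" + status.getCode());
                super.close(status, trailers);
            }
        };
        return next.startCall(timedCall, headers);
    }
}
It would be wired in by replacing addService(new MyService()) with addService(ServerInterceptors.intercept(new MyService(), new TimingInterceptor())). Printing every call is only for a quick check; for a real load test you would aggregate the durations instead.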

How to configure 'MQCNO_HANDLE_SHARE_NO_BLOCK' in IBM MQ Connection Factory?

I'm having a race condition between two threads simultaneously trying to close a JMS Session that was created for the IBM MQ broker. It appears there's an option to prevent different threads from using the same connection handle simultaneously. The option is called 'MQCNO_HANDLE_SHARE_NO_BLOCK' (see https://www.ibm.com/support/knowledgecenter/en/SSFKSJ_7.5.0/com.ibm.mq.ref.dev.doc/q095480_.htm).
I'm looking for some way to configure this property using the MQConnectionFactory, in the IBM MQ Java client (v.9.1.1.0).
I've already tried using connectionFactory.setMQConnectionOptions and OR-ing the 'com.ibm.mq.constants.CMQC#MQCNO_HANDLE_SHARE_NO_BLOCK' constant into the current value, but the client fails to start, telling me that the connection options are not valid.
connectionFactory.setMQConnectionOptions(connectionFactory.getMQConnectionOptions());
connectionFactory.setMQConnectionOptions(connectionFactory.getMQConnectionOptions() | MQCNO_HANDLE_SHARE_NO_BLOCK);
I've found an adaptation of the IBM MQ client for Go in which the flag is set here:
if gocno == nil {
    // Because Go programs are always threaded, and we cannot
    // tell on which thread we might get dispatched, allow handles always to
    // be shareable.
    gocno = NewMQCNO()
    gocno.Options = MQCNO_HANDLE_SHARE_NO_BLOCK
} else {
    if (gocno.Options & (MQCNO_HANDLE_SHARE_NO_BLOCK |
        MQCNO_HANDLE_SHARE_BLOCK)) == 0 {
        gocno.Options |= MQCNO_HANDLE_SHARE_NO_BLOCK
    }
}
copyCNOtoC(&mqcno, gocno)
C.MQCONNX((*C.MQCHAR)(mqQMgrName), &mqcno, &qMgr.hConn, &mqcc, &mqrc)
Has anyone dealt with this problem, or used this flag?
EDIT - Added thread dumps from two locked threads.
I have the following threads locked:
1) JMSCCThreadPoolWorker, i.e. the IBM worker handling the exception raised by an IBM TCP receiver thread:
"JMSCCThreadPoolWorker-493": inconsistent?, holding [0x00000006d605a0b8, 0x00000006d5f6b9e8, 0x00000005c631e140]
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at com.ibm.mq.jmqi.remote.util.ReentrantMutex.acquire(ReentrantMutex.java:167)
at com.ibm.mq.jmqi.remote.util.ReentrantMutex.acquire(ReentrantMutex.java:73)
at com.ibm.mq.jmqi.remote.api.RemoteHconn.requestDispatchLock(RemoteHconn.java:1219)
at com.ibm.mq.jmqi.remote.api.RemoteFAP.MQCTL(RemoteFAP.java:2576)
at com.ibm.mq.jmqi.monitoring.JmqiInterceptAdapter.MQCTL(JmqiInterceptAdapter.java:333)
at com.ibm.msg.client.wmq.internal.WMQConsumerOwnerShadow.controlAsyncService(WMQConsumerOwnerShadow.java:169)
at com.ibm.msg.client.wmq.internal.WMQConsumerOwnerShadow.stop(WMQConsumerOwnerShadow.java:471)
at com.ibm.msg.client.wmq.internal.WMQSession.stop(WMQSession.java:1894)
at com.ibm.msg.client.jms.internal.JmsSessionImpl.stop(JmsSessionImpl.java:2515)
at com.ibm.msg.client.jms.internal.JmsSessionImpl.stop(JmsSessionImpl.java:2498)
at com.ibm.msg.client.jms.internal.JmsConnectionImpl.stop(JmsConnectionImpl.java:1263)
at com.ibm.mq.jms.MQConnection.stop(MQConnection.java:473)
at org.springframework.jms.connection.SingleConnectionFactory.closeConnection(SingleConnectionFactory.java:491)
at org.springframework.jms.connection.SingleConnectionFactory.resetConnection(SingleConnectionFactory.java:383)
at org.springframework.jms.connection.CachingConnectionFactory.resetConnection(CachingConnectionFactory.java:199)
at org.springframework.jms.connection.SingleConnectionFactory.onException(SingleConnectionFactory.java:361)
at org.springframework.jms.connection.SingleConnectionFactory$AggregatedExceptionListener.onException(SingleConnectionFactory.java:715)
at com.ibm.msg.client.jms.internal.JmsProviderExceptionListener.run(JmsProviderExceptionListener.java:413)
at com.ibm.msg.client.commonservices.workqueue.WorkQueueItem.runTask(WorkQueueItem.java:319)
at com.ibm.msg.client.commonservices.workqueue.SimpleWorkQueueItem.runItem(SimpleWorkQueueItem.java:99)
at com.ibm.msg.client.commonservices.workqueue.WorkQueueItem.run(WorkQueueItem.java:343)
at com.ibm.msg.client.commonservices.workqueue.WorkQueueManager.runWorkQueueItem(WorkQueueManager.java:312)
at com.ibm.msg.client.commonservices.j2se.workqueue.WorkQueueManagerImplementation$ThreadPoolWorker.run(WorkQueueManagerImplementation.java:1227)
2) A message-handling thread, which happens to be processing a message and, in a catch clause, attempts to close the session/producer/consumer (there's a JMS REPLY_TO handling situation).
"[MuleRuntime].cpuLight.07: CPU_LITE #76c382e9": awaiting notification on [0x00000006d5f6b9c8], holding [0x0000000718d73900]
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at com.ibm.msg.client.jms.internal.JmsSessionImpl.close(JmsSessionImpl.java:383)
at com.ibm.msg.client.jms.internal.JmsSessionImpl.close(JmsSessionImpl.java:349)
at com.ibm.mq.jms.MQSession.close(MQSession.java:275)
at org.springframework.jms.connection.CachingConnectionFactory$CachedSessionInvocationHandler.physicalClose(CachingConnectionFactory.java:481)
at org.springframework.jms.connection.CachingConnectionFactory$CachedSessionInvocationHandler.invoke(CachingConnectionFactory.java:311)
at com.sun.proxy.$Proxy197.close(Unknown Source)
at org.mule.jms.commons.internal.connection.session.DefaultJmsSession.close(DefaultJmsSession.java:65)
at org.mule.jms.commons.internal.common.JmsCommons.closeQuietly(JmsCommons.java:165)
at org.mule.jms.commons.internal.source.JmsListener.doReply(JmsListener.java:326)
at MORE STUFF BELOW

tomcat: Waiting on condition thread

I'm investigating a strange issue with the Tomcat shutdown process: after running shutdown.sh, the Java process is still there (I check by using ps -ef | grep tomcat).
The situation is a bit complicated, because I have very limited access to the server (no debugging, for example).
I took a thread dump (using kill -3 <PID>) and a heap dump using remote JConsole and HotSpot features.
After looking into the thread dump I found this:
"pool-2-thread-1" #74 prio=5 os_prio=0 tid=0x00007f3354359800 nid=0x7b46 waiting on condition [0x00007f333e55d000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c378d330> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
So, my understanding of the problem is as follows: there is a resource (a DB connection or something else) which is used in a cached thread pool, and this resource is now locked, preventing the thread pool-2-thread-1 from stopping. Assuming this thread isn't a daemon, the JVM cannot stop gracefully.
Is there a way to find out which resource is locked, where it is locked from, and how to avoid that? Another question is: how can I prevent this situation?
One more thing: what is the address 0x00007f333e55d000 for?
Thanks!
After a while of struggling I found out that the hanging thread always has the same name, pool-2-thread-1. I grepped all the sources of the project to find every place where a scheduled thread pool is started: Executors#newScheduledThreadPool. After a huge amount of time I added logging to all those places, including the thread names in each pool.
One more server restart, and gotcha!
I found one thread pool that is started with only one thread and uses the following code:
private void shutdownThreadpool(int tries) {
    try {
        boolean terminated;
        LOGGER.debug("Trying to shutdown thread pool in {} tries", tries);
        pool.shutdown();
        do {
            terminated = pool.awaitTermination(WAIT_ON_SHUTDOWN, TimeUnit.MILLISECONDS);
            if (--tries == 0 && !terminated) {
                LOGGER.debug("After 10 attempts couldn't shutdown thread pool, force shutdown");
                pool.shutdownNow();
                terminated = pool.awaitTermination(WAIT_ON_SHUTDOWN, TimeUnit.MILLISECONDS);
                if (!terminated) {
                    LOGGER.debug("Cannot stop thread pool even with force");
                    LOGGER.trace("Some of the workers doesn't react to Interruption event properly");
                    terminated = true;
                }
            } else {
                LOGGER.info("After {} attempts doesn't stop", tries);
            }
        } while (!terminated);
        LOGGER.debug("Successfully stop thread pool");
    } catch (final InterruptedException ie) {
        LOGGER.warn("Thread pool shutdown interrupted");
    }
}
After that the issue was solved.
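A side note of mine, not part of the original answer: naming the pool's threads up front (and deciding explicitly whether they are daemons) would have made pool-2-thread-1 identifiable straight from the thread dump, without the grep-and-log hunt. A minimal sketch:
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public final class NamedThreads {
    /** A factory that names threads after their pool and optionally marks them daemon. */
    public static ThreadFactory named(String poolName, boolean daemon) {
        AtomicInteger counter = new AtomicInteger(1);
        return runnable -> {
            Thread t = new Thread(runnable, poolName + "-" + counter.getAndIncrement());
            t.setDaemon(daemon); // daemon threads do not block JVM shutdown
            return t;
        };
    }

    public static void main(String[] args) {
        // Shows up in jstack as "report-scheduler-1" instead of "pool-2-thread-1".
        ScheduledExecutorService pool =
                Executors.newScheduledThreadPool(1, named("report-scheduler", false));
        pool.shutdown();
    }
}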

Reading thread dump to debug an exhausted database connection pool

I have a web app being served by jetty + mysql. I'm running into an issue where my database connection pool gets exhausted, and all threads start blocking waiting for a connection. I've tried two database connection pool libraries: (1) bonecp (2) hikari. Both exhibit the same behavior with my app.
I've done several thread dumps when I see this state, and all the blocked threads are in this state (not picking on bonecp, I'm sure it's something on my end now):
"qtp1218743501-131" prio=10 tid=0x00007fb858295800 nid=0x669b waiting on condition [0x00007fb8cd5d3000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000763f42d20> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at com.jolbox.bonecp.DefaultConnectionStrategy.getConnectionInternal(DefaultConnectionStrategy.java:82)
at com.jolbox.bonecp.AbstractConnectionStrategy.getConnection(AbstractConnectionStrategy.java:90)
at com.jolbox.bonecp.BoneCP.getConnection(BoneCP.java:553)
at com.me.Foo.start(Foo.java:30)
...
I'm not sure where to go from here. I was thinking that I would see some stack traces in the thread dump where my code was stuck doing some lengthy operation, not waiting for a connection. For example, if my code looks like this:
public class Foo {
    public void start() {
        Connection conn = threadPool.getConnection();
        work(conn);
        conn.close();
    }

    public void work(Connection conn) {
        // .. something lengthy like scan every row in the database etc ..
    }
}
I would expect one of the threads above to have a stack trace that shows it working away in the work() method:
...
at com.me.mycode.Foo.work()
at com.me.mycode.Foo.start()
but instead they're just all waiting for a connection:
...
at com.jolbox.bonecp.BoneCP.getConnection() // ?
at com.me.mycode.Foo.work()
at com.me.mycode.Foo.start()
Any thoughts on how to continue debugging would be great.
Some other background: the app operates normally for about 45 minutes, and memory and thread dumps show nothing out of the ordinary. Then the condition is triggered and the thread count spikes up. I started thinking it might be some combination of SQL statements the app is trying to perform that turns into some sort of lock on the MySQL side, but again I would expect some of the threads in the stack traces above to show that they're in that part of the code.
The thread dumps were taken using visualvm.
Thanks
Take advantage of the configuration options for the connection pool (see BoneCPConfig / HikariConfig). First of all, set a connection timeout (HikariCP connectionTimeout) and a leak-detection timeout (HikariCP leakDetectionThreshold; I could not find the counterpart in BoneCP). There might be more configuration options that dump stack traces when something is not quite right.
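For HikariCP specifically, a minimal configuration sketch (the JDBC URL, credentials and timeout values below are placeholders, not taken from the question):
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class PoolSetup {
    public static HikariDataSource createPool() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://localhost:3306/mydb"); // placeholder URL
        config.setUsername("app");                             // placeholder credentials
        config.setPassword("secret");
        config.setMaximumPoolSize(20);
        // Fail fast instead of parking forever when the pool is exhausted.
        config.setConnectionTimeout(30_000); // milliseconds
        // Warn (with a stack trace) when a connection is held longer than this.
        config.setLeakDetectionThreshold(60_000); // milliseconds
        return new HikariDataSource(config);
    }
}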
My guess is that your application does not always return a connection to the pool and after 45 minutes has no connection in the pool anymore (and thus blocks forever trying to get a connection from the pool). Treat a connection like opening/closing a file, i.e. always use try/finally:
public void start() {
    Connection conn = null;
    try {
        work(conn = dbPool.getConnection());
    } finally {
        if (conn != null) {
            conn.close();
        }
    }
}
Finally, both connection pools have options to allow JMX monitoring. You can use this to monitor for strange behavior in the pool.
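A variant of the same idea using try-with-resources (Java 7+), assuming dbPool is a javax.sql.DataSource such as the HikariCP one above; this is a sketch, not the asker's actual class:
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

public class Foo {
    private final DataSource dbPool;

    public Foo(DataSource dbPool) {
        this.dbPool = dbPool;
    }

    public void start() throws SQLException {
        // The connection is returned to the pool even if work() throws.
        try (Connection conn = dbPool.getConnection()) {
            work(conn);
        }
    }

    private void work(Connection conn) {
        // lengthy database work goes here
    }
}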
I question the whole design.
If you have a blocking wait in multithreaded network IO, you need a better implementation of the connection handling.
I suggest you take a look at non-blocking IO (java.nio, the channels package), or make your locks more granular.

java app, thread hangs after killing MySQL connection

I have some worker threads running, with MySQL and mysql-connector-java-5.1.20.
When I kill some SQL statement (using kill "connection id" from the mysql client), the Java thread hangs, although it should throw an exception.
jstack prints:
"quartzBase$child#45e3dd3c_Worker-3" prio=10 tid=0x00007f960004c800 nid=0x713d runnable [0x00007f943b3a0000]
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAvailable(Native Method)
at java.net.PlainSocketImpl.available(PlainSocketImpl.java:472)
- locked <0x00007f9e11fe13a8> (a java.net.SocksSocketImpl)
at java.net.SocketInputStream.available(SocketInputStream.java:217)
at com.mysql.jdbc.util.ReadAheadInputStream.available(ReadAheadInputStream.java:232)
at com.mysql.jdbc.MysqlIO.clearInputStream(MysqlIO.java:981)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2426)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2677)
- locked <0x00007f9e17de2b50> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.ConnectionImpl.rollbackNoChecks(ConnectionImpl.java:4863)
at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4749)
- locked <0x00007f9e17de2b50> (a com.mysql.jdbc.JDBC4Connection)
at org.apache.commons.dbcp.DelegatingConnection.rollback(DelegatingConnection.java:368)
at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.rollback(PoolingDataSource.java:323)
at org.hibernate.transaction.JDBCTransaction.rollbackAndResetAutoCommit(JDBCTransaction.java:217)
at org.hibernate.transaction.JDBCTransaction.rollback(JDBCTransaction.java:196)
at org.springframework.orm.hibernate3.HibernateTransactionManager.doRollback(HibernateTransactionManager.java:676)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:845)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:822)
at org.springframework.transaction.interceptor.TransactionAspectSupport.completeTransactionAfterThrowing(TransactionAspectSupport.java:430)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:112)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
at $Proxy1021.process(Unknown Source)
Using jvmtop, I saw this:
JvmTop 0.8.0 alpha - 22:48:37, amd64, 24 cpus, Linux 2.6.32-35, load avg 11.53
http://code.google.com/p/jvmtop
Profiling PID 27403: com.caucho.server.resin.Resin --root-dir
36.41% ( 0.22s) com.mysql.jdbc.util.ReadAheadInputStream.available()
33.42% ( 0.20s) ....opensymphony.xwork2.conversion.impl.DefaultTypeConve()
30.17% ( 0.18s) com.mysql.jdbc.util.ReadAheadInputStream.fill()
0.00% ( 0.00s) com.rabbitmq.client.impl.Frame.readFrom()
The worker threads will never accept new tasks.
Any ideas?
According to the MySQL documentation, "kill connection thread_id" should terminate the connection associated with the given thread_id. But it looks like that is not happening (in which case the Java thread will wait for an answer forever). Maybe you can verify that the connection is actually closed using some network tool (e.g. netstat).
I've run into hanging MySQL connections before and had to resort to using the socketTimeout JDBC connection parameter (but be careful: the socketTimeout needs to be larger than the time it takes to complete the longest-running query). You could also try using the query timeout on a prepared statement.
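A sketch of both suggestions combined (connectTimeout and socketTimeout are Connector/J URL properties in milliseconds; the URL, credentials and timeout values are placeholders I chose for illustration, and a plain Statement stands in for the prepared statement):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TimeoutExample {
    public static void main(String[] args) throws SQLException {
        // socketTimeout must exceed the longest-running query,
        // otherwise healthy queries get cut off.
        String url = "jdbc:mysql://localhost:3306/mydb"
                + "?connectTimeout=10000&socketTimeout=120000";
        try (Connection conn = DriverManager.getConnection(url, "app", "secret");
             Statement stmt = conn.createStatement()) {
            // Per-statement limit, in seconds; the driver cancels the query when exceeded.
            stmt.setQueryTimeout(30);
            try (ResultSet rs = stmt.executeQuery("SELECT 1")) {
                while (rs.next()) {
                    System.out.println(rs.getInt(1));
                }
            }
        }
    }
}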
