In normal operation my application exits cleanly when sent a SIGTERM (kill -s SIGTERM).
However, under load the process sometimes does not exit.
I'm wondering whether http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6392332 could be the reason for this, or whether it is something else.
Here are the relevant parts of the thread dump of the process in question, showing the shutdown threads; any help is much appreciated.
Note: this is a Java process running on 64-bit RHEL 6.3.
2013-05-22 08:01:33
Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.21-b01 mixed mode):
...
"Thread-15" prio=10 tid=0x000000001994d000 nid=0x4d5a waiting on condition [0x00007f4da08a3000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000079fd9fce8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1468)
at org.jboss.netty.util.internal.ExecutorUtil.terminate(ExecutorUtil.java:109)
at org.jboss.netty.util.internal.ExecutorUtil.terminate(ExecutorUtil.java:49)
at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.releaseExternalResources(AbstractNioWorkerPool.java:77)
at org.jboss.netty.channel.socket.nio.NioServerSocketChannelFactory.releaseExternalResources(NioServerSocketChannelFactory.java:164)
at com.test.services.radius.server.RadiusServerImpl.stop(RadiusServerImpl.java:87)
at com.test.services.radius.ServiceProvider.unload(ServiceProvider.java:61)
at com.test.spf.ServiceProviderCacheImpl.clearCurrentCache(ServiceProviderCacheImpl.java:150)
- locked <0x00000006b8968038> (a com.test.spf.ServiceProviderCacheImpl)
at com.test.spf.ServiceProviderCacheImpl.unload(ServiceProviderCacheImpl.java:170)
at com.test.spf.SPAImpl.stop(SPAImpl.java:178)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.springframework.beans.factory.support.DisposableBeanAdapter.invokeCustomDestroyMethod(DisposableBeanAdapter.java:273)
at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:199)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:487)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:463)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:480)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:463)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:480)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:463)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:480)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:463)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:431)
- locked <0x00000006da966290> (a java.util.LinkedHashMap)
at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1048)
at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1022)
at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:970)
- locked <0x000000073ad1a920> (a java.lang.Object)
at org.springframework.web.context.ContextLoader.closeWebApplicationContext(ContextLoader.java:384)
at org.springframework.web.context.ContextLoaderListener.contextDestroyed(ContextLoaderListener.java:78)
at org.apache.catalina.core.StandardContext.listenerStop(StandardContext.java:4245)
at org.apache.catalina.core.StandardContext.stop(StandardContext.java:4886)
- locked <0x00000006b8968118> (a org.apache.catalina.core.StandardContext)
at org.apache.catalina.core.ContainerBase.removeChild(ContainerBase.java:936)
at org.apache.catalina.startup.HostConfig.undeployApps(HostConfig.java:1359)
at org.apache.catalina.startup.HostConfig.stop(HostConfig.java:1330)
at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:326)
at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142)
at org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1098)
- locked <0x00000006b89682f8> (a org.apache.catalina.core.StandardHost)
at org.apache.catalina.core.ContainerBase.stop(ContainerBase.java:1110)
- locked <0x000000068e63ff10> (a org.apache.catalina.core.StandardEngine)
at org.apache.catalina.core.StandardEngine.stop(StandardEngine.java:468)
at org.apache.catalina.core.StandardService.stop(StandardService.java:604)
- locked <0x000000068e63ff10> (a org.apache.catalina.core.StandardEngine)
at org.apache.catalina.core.StandardServer.stop(StandardServer.java:788)
at org.apache.catalina.startup.Catalina.stop(Catalina.java:662)
at org.apache.catalina.startup.Catalina$CatalinaShutdownHook.run(Catalina.java:706)
"SIGTERM handler" daemon prio=10 tid=0x0000000025453000 nid=0x4d58 in Object.wait() [0x00007f4da0b2f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x000000072c10db20> (a org.apache.catalina.startup.Catalina$CatalinaShutdownHook)
at java.lang.Thread.join(Thread.java:1258)
- locked <0x000000072c10db20> (a org.apache.catalina.startup.Catalina$CatalinaShutdownHook)
at java.lang.Thread.join(Thread.java:1332)
at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
at java.lang.Shutdown.runHooks(Shutdown.java:123)
at java.lang.Shutdown.sequence(Shutdown.java:167)
at java.lang.Shutdown.exit(Shutdown.java:212)
- locked <0x0000000707855738> (a java.lang.Class for java.lang.Shutdown)
at java.lang.Terminator$1.handle(Terminator.java:52)
at sun.misc.Signal$1.run(Signal.java:212)
at java.lang.Thread.run(Thread.java:722)
A JVM process will not exit on SIGTERM until its shutdown hooks, and the threads they wait for, have finished properly, i.e. returned from their run methods. Under load the hang could be rooted in a deadlock (deadlocked threads usually never finish) or in an endless loop in some thread.
By the way, the proper way to shut down Tomcat is to use the bundled shutdown script. It can fail in some extraordinary situations, but then you can simply SIGKILL the process.
SIGTERM or SIGKILL is often a business question. A proper shutdown of a complex, partially swapped-out application can easily take more than 15 minutes. So can you stand a 15-minute outage, or would you rather kill it and have it started again within the next 2 minutes?
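To illustrate the general pattern (a minimal sketch only, not the Netty/Tomcat code in the dump above; the pool and hook names are hypothetical): a shutdown hook that bounds how long it waits for an executor instead of waiting indefinitely, and interrupts the remaining tasks when the bound is exceeded.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedShutdownExample {
    public static void main(String[] args) {
        // Hypothetical worker pool standing in for the application's executors.
        final ExecutorService workers = Executors.newFixedThreadPool(4);

        Runtime.getRuntime().addShutdownHook(new Thread("bounded-shutdown") {
            @Override
            public void run() {
                workers.shutdown();                       // stop accepting new tasks
                try {
                    // Wait a bounded time instead of indefinitely, then interrupt.
                    if (!workers.awaitTermination(30, TimeUnit.SECONDS)) {
                        workers.shutdownNow();
                    }
                } catch (InterruptedException e) {
                    workers.shutdownNow();
                    Thread.currentThread().interrupt();
                }
                // Last resort: Runtime.getRuntime().halt(1) ends the VM immediately,
                // but it skips any remaining shutdown hooks.
            }
        });

        // ... application work ...
    }
}
If even that bound is exceeded, an external kill -9 (SIGKILL) remains the fallback, as noted above.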
Related
Requests to a Java Spring Boot application go into a pending state because threads are held in WAITING and TIMED_WAITING states.
jstack output:
"qtp886341817-1399" #1399 prio=5 os_prio=0 tid=0x00007f02142ae800 nid=0x22f904 waiting on condition [0x00007f01c3fa8000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000684588e00> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
at org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:656)
at org.eclipse.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:49)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:720)
at java.lang.Thread.run(Thread.java:748)
"threadPoolTaskExecutor-1" #114 prio=5 os_prio=0 tid=0x00007f02140b4800 nid=0x229d78 waiting on condition [0x00007f01c55b2000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000684588e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"qtp886341817-717" #717 prio=5 os_prio=0 tid=0x00007f021c102000 nid=0x22c546 in Object.wait() [0x00007f01ee774000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at org.eclipse.paho.client.mqttv3.internal.Token.waitUntilSent(Token.java:248)
- locked <0x0000000689516c80> (a java.lang.Object)
at org.eclipse.paho.client.mqttv3.MqttTopic.publish(MqttTopic.java:117)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
at java.lang.Thread.run(Thread.java:748)
Details:
In this situation the application is unable to serve API requests; the number of live threads climbs to 250+ and many threads appear to be deadlocked.
The Spring application is hosted on an AWS t2.medium instance with Xms=1g, Xmx=2g and UseG1GC, and it uses the Jetty server.
The application generally serves long-running APIs; some of them take at least 12 to 60 seconds to respond.
Questions:
Is there any way to find out how many threads a Spring application/JVM/Jetty server can handle?
How can we tune the application to avoid this situation (where the application becomes non-responsive)?
How can we restrict incoming API requests before the application hangs like this?
Look here:
at org.eclipse.paho.client.mqttv3.internal.Token.waitUntilSent(Token.java:248)
- locked <0x0000000689516c80>
The thread holds a lock here while it waits (with a timeout) for the message to be sent. Try making this call asynchronous, as sketched below.
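For example, a minimal sketch under the assumption that you can change the calling code (AsyncMqttPublisher and publishAsync are illustrative names, not part of the application): hand the blocking publish off to a dedicated executor so the Jetty request thread is not the one waiting for delivery.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.eclipse.paho.client.mqttv3.MqttException;
import org.eclipse.paho.client.mqttv3.MqttMessage;
import org.eclipse.paho.client.mqttv3.MqttTopic;

public class AsyncMqttPublisher {

    // A single worker preserves publish order; widen the pool if ordering is not needed.
    private final ExecutorService publishExecutor = Executors.newSingleThreadExecutor();

    /** Hands the blocking publish off so the HTTP request thread returns immediately. */
    public void publishAsync(MqttTopic topic, MqttMessage message) {
        publishExecutor.submit(() -> {
            try {
                // The wait for delivery now happens on this worker, not on a Jetty thread.
                topic.publish(message).waitForCompletion();
            } catch (MqttException e) {
                // log and decide whether to retry or drop the message
            }
        });
    }
}
The trade-off is that the HTTP response no longer confirms delivery; if the caller needs that confirmation, consider returning a future or using Paho's asynchronous client API instead.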
As shown below in the thread dump, the thread with id qtp336276309-556300 acquired org.apache.log4j.spi.RootLogger's lock and did not release it.
This causes other threads to be BLOCKED.
This is sometimes accompanied by the CPU utilization on the host machine steadily increasing over a week from 5% to 30%.
I would like to know whether threads in the BLOCKED state could cause a CPU spike. If yes, how should I approach fixing the problem?
If not, what else should be reviewed to resolve the spiking CPU utilization?
The thread dump is below:
"qtp336276309-561036" #561036 prio=5 os_prio=0 tid=0x00007efe80576800 nid=0x20a7 waiting for monitor entry [0x00007efe19fbd000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.log4j.Category.callAppenders(Category.java:204)
- waiting to lock <0x00000000e02333c0> (a org.apache.log4j.spi.RootLogger)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.log(Category.java:856)
at org.slf4j.impl.Log4jLoggerAdapter.log(Log4jLoggerAdapter.java:601)
at org.slf4j.bridge.SLF4JBridgeHandler.callLocationAwareLogger(SLF4JBridgeHandler.java:224)
at org.slf4j.bridge.SLF4JBridgeHandler.publish(SLF4JBridgeHandler.java:301)
at java.util.logging.Logger.log(Logger.java:738)
at java.util.logging.Logger.doLog(Logger.java:765)
at java.util.logging.Logger.log(Logger.java:788)
at java.util.logging.Logger.info(Logger.java:1490)
at org.apache.mesos.chronos.scheduler.api.TaskManagementResource.updateStatus(TaskManagementResource.scala:43)
"qtp336276309-561035" #561035 prio=5 os_prio=0 tid=0x00007efe80193000 nid=0x20a6 waiting for monitor entry [0x00007efe248f4000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.log4j.Category.callAppenders(Category.java:204)
- waiting to lock <0x00000000e02333c0> (a org.apache.log4j.spi.RootLogger)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.info(Category.java:666)
at mesosphere.chaos.http.ChaosRequestLog.write(ChaosRequestLog.scala:15)
at org.eclipse.jetty.server.NCSARequestLog.log(NCSARequestLog.java:591)
at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:92)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SslConnection.handle(SslConnection.java:196)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
"Thread-328668" #561032 prio=5 os_prio=0 tid=0x00007efe50052000 nid=0x62 waiting for monitor entry [0x00007efe6b0cb000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.log4j.Category.callAppenders(Category.java:204)
- waiting to lock <0x00000000e02333c0> (a org.apache.log4j.spi.RootLogger)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.log(Category.java:856)
at org.slf4j.impl.Log4jLoggerAdapter.log(Log4jLoggerAdapter.java:601)
at org.slf4j.bridge.SLF4JBridgeHandler.callLocationAwareLogger(SLF4JBridgeHandler.java:224)
at org.slf4j.bridge.SLF4JBridgeHandler.publish(SLF4JBridgeHandler.java:301)
at java.util.logging.Logger.log(Logger.java:738)
at java.util.logging.Logger.doLog(Logger.java:765)
at java.util.logging.Logger.log(Logger.java:788)
at java.util.logging.Logger.info(Logger.java:1490)
at org.apache.mesos.chronos.scheduler.mesos.MesosJobFramework.statusUpdate(MesosJobFramework.scala:224)
at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.google.inject.internal.DelegatingInvocationHandler.invoke(DelegatingInvocationHandler.java:37)
at com.sun.proxy.$Proxy30.statusUpdate(Unknown Source)
Locked ownable synchronizers:
- None
"qtp336276309-556300" #556300 prio=5 os_prio=0 tid=0x00007efe81654800 nid=0x2047 runnable [0x00007efe1a1c0000]
java.lang.Thread.State: RUNNABLE
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
- locked <0x00000000e0234470> (a java.io.BufferedOutputStream)
at java.io.PrintStream.write(PrintStream.java:480)
- locked <0x00000000e0234450> (a java.io.PrintStream)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
- locked <0x00000000e0234438> (a java.io.OutputStreamWriter)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
- locked <0x00000000e0233cc0> (a org.apache.log4j.ConsoleAppender)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
at org.apache.log4j.Category.callAppenders(Category.java:206)
- locked <0x00000000e02333c0> (a org.apache.log4j.spi.RootLogger)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.info(Category.java:666)
at mesosphere.chaos.http.ChaosRequestLog.write(ChaosRequestLog.scala:15)
at org.eclipse.jetty.server.NCSARequestLog.log(NCSARequestLog.java:591)
at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:92)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:370)
at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
at org.eclipse.jetty.io.nio.SslConnection.handle(SslConnection.java:196)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None
As shown in the thread dump, the thread with id qtp336276309-556300 acquired org.apache.log4j.spi.RootLogger's lock and did not release it. This causes other threads to be BLOCKED.
This is normal: only one thread can log to a file at any one time. If multiple threads try to write at once, the others have to wait.
This is sometimes accompanied by the CPU utilization on the host machine steadily increasing over a week from 5% to 30%.
While logging is a very common cause of slowness, it usually doesn't get worse over time.
What often causes a problem is logging too much. You might be logging more over time, which could explain the increase; however, I would check whether something else is causing the extra activity, i.e. logging might just be the symptom.
I would like to know whether threads in the BLOCKED state could cause a CPU spike.
BLOCKED threads don't use much CPU. While blocking can result in higher latencies, you might not see any increase in CPU usage (in fact it might go down).
If not, what else should be reviewed to resolve the spiking CPU utilization?
I would look at what the other threads are doing at that time. I would also try reducing the logging to cut down the amount of "noise" so you can see what the application is doing most of the time.
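If you stay on log4j 1.x, one way to take the slow console/file write off the request threads is to wrap the existing appenders in an AsyncAppender. A minimal sketch (programmatic configuration shown here as an illustration; the same can be expressed in the log4j configuration file, and the buffer size and non-blocking policy are arbitrary choices):
import java.util.Enumeration;

import org.apache.log4j.Appender;
import org.apache.log4j.AsyncAppender;
import org.apache.log4j.Logger;

public class AsyncLoggingSetup {

    /** Wraps the root logger's current appenders in an AsyncAppender. */
    public static void install() {
        Logger root = Logger.getRootLogger();

        AsyncAppender async = new AsyncAppender();
        async.setBufferSize(8192);
        async.setBlocking(false); // drop events rather than block callers when the buffer is full

        // Move the existing (console/file) appenders behind the async buffer.
        Enumeration<?> appenders = root.getAllAppenders();
        while (appenders.hasMoreElements()) {
            async.addAppender((Appender) appenders.nextElement());
        }
        root.removeAllAppenders();
        root.addAppender(async);
    }
}
Note that with blocking disabled the AsyncAppender discards events once its buffer fills, so this trades log completeness for lower latency; reducing the overall log volume, as suggested above, is still the first thing to try.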
I got the following thread dump on Tomcat; all of these threads are in the WAITING state, so the application slows down.
Please suggest a solution.
I am using Tomcat 7 and Java 7.
"ImageLoadWorker(653)" prio=5 tid=0x2089 nid=0x829 in Object.wait() - stats: cpu=0 blk=-1 wait=-1
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Native Method)
- waiting on org.xhtmlrenderer.swing.ImageLoadQueue#4651e7d2
at java.lang.Object.wait(Object.java:503)
at org.xhtmlrenderer.swing.ImageLoadQueue.getTask(ImageLoadQueue.java:83)
at org.xhtmlrenderer.swing.ImageLoadWorker.run(ImageLoadWorker.java:53)
Locked synchronizers: count = 0
There is a thread leak in this class of the flyingsaucer jar. I found an answer to this problem at the URL below:
https://technotailor.wordpress.com/2017/04/17/thread-leak-with-imageloadworker-in-flying-saucer-jar/
Could somebody help me and explain what exactly is going on in the following thread dump? It is a web application running on Tomcat 7, and we are seeing that some requests never get answered:
"ajp-bio-8012-exec-161" daemon prio=10 tid=0x00007fe170603000 nid=0x344f runnable [0x00007fe174fae000]
java.lang.Thread.State: RUNNABLE
at java.security.AccessController.doPrivileged(Native Method)
at java.io.FilePermission.init(FilePermission.java:209)
at java.io.FilePermission.<init>(FilePermission.java:285)
at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
at java.io.File.exists(File.java:808)
at sun.misc.URLClassPath$FileLoader.getResource(URLClassPath.java:1080)
at sun.misc.URLClassPath$FileLoader.findResource(URLClassPath.java:1047)
at sun.misc.URLClassPath.findResource(URLClassPath.java:176)
at java.net.URLClassLoader$2.run(URLClassLoader.java:551)
at java.net.URLClassLoader$2.run(URLClassLoader.java:549)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findResource(URLClassLoader.java:548)
at java.lang.ClassLoader.getResource(ClassLoader.java:1147)
at java.lang.ClassLoader.getResource(ClassLoader.java:1142)
at org.apache.catalina.loader.WebappClassLoader.getResource(WebappClassLoader.java:1445)
at java.lang.Class.getResource(Class.java:2142)
at javassist.ClassClassPath.find(ClassClassPath.java:84)
at javassist.ClassPoolTail.find(ClassPoolTail.java:317)
at javassist.ClassPool.find(ClassPool.java:495)
at javassist.ClassPool.createCtClass(ClassPool.java:479)
at javassist.ClassPool.get0(ClassPool.java:445)
- locked <0x00000000db6cdb48> (a javassist.ClassPool)
at javassist.ClassPool.get(ClassPool.java:414)
at javassist.compiler.MemberResolver.lookupClass0(MemberResolver.java:425)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:389)
at javassist.compiler.MemberResolver.lookupClassByJvmName(MemberResolver.java:310)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:327)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:314)
at javassist.compiler.Javac.compileField(Javac.java:122)
at javassist.compiler.Javac.compile(Javac.java:91)
at javassist.CtField.make(CtField.java:163)
at com.mycompany.validation.util.BeanGenerator.addProperty(BeanGenerator.java:157)
...
at java.lang.Thread.run(Thread.java:745)
"ajp-bio-8012-exec-12" daemon prio=10 tid=0x0000000000c49800 nid=0x3d99 waiting for monitor entry [0x00007fe1756b7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at javassist.ClassPool.get0(ClassPool.java:432)
- waiting to lock <0x00000000db6cdb48> (a javassist.ClassPool)
at javassist.ClassPool.get(ClassPool.java:414)
at javassist.compiler.MemberResolver.lookupClass0(MemberResolver.java:425)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:389)
at javassist.compiler.MemberResolver.lookupClassByJvmName(MemberResolver.java:310)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:327)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:314)
at javassist.compiler.Javac.compileField(Javac.java:122)
at javassist.compiler.Javac.compile(Javac.java:91)
at javassist.CtField.make(CtField.java:163)
at com.mycompany.validation.util.BeanGenerator.addProperty(BeanGenerator.java:157)
...
at java.lang.Thread.run(Thread.java:745)
"ajp-bio-8012-exec-5" daemon prio=10 tid=0x0000000001603800 nid=0x7c77 waiting for monitor entry [0x00007fe174aaa000]
java.lang.Thread.State: BLOCKED (on object monitor)
at javassist.ClassPool.makeClass(ClassPool.java:621)
- waiting to lock <0x00000000db6cdb48> (a javassist.ClassPool)
at javassist.ClassPool.makeClass(ClassPool.java:606)
at com.mycompany.validation.util.BeanGenerator.init(BeanGenerator.java:334)
...
at java.lang.Thread.run(Thread.java:745)
I am not a specialist in thread dumps and just would like to know how to get rid of the problem.
Has the first thread locked an object (the ClassPool) that the two other request threads are waiting on? Why is it locked? Do I have to synchronize, and how?
Thanks for any hint!
You have a single ClassPool in your application, whose monitor id is 0x00000000db6cdb48.
If you look at the method signature of ClassPool.get0:
protected synchronized CtClass get0(String classname, boolean useCache)
So it is synchronized. I suppose you simply used ClassPool.getDefault(). Be aware that this method returns a single shared instance, so all of your calls are synchronized on it.
Create a new pool for each thread (with new ClassPool(ClassPool.getDefault()), for example) and it will be fine.
Apart from that, you might check your security manager to see why the checkRead call takes so long.
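A minimal sketch of that suggestion (the wrapper class and its method name are illustrative, not part of the application): one child ClassPool per thread, parented on the shared default pool so the usual class path is still visible.
import javassist.ClassPool;
import javassist.CtClass;
import javassist.NotFoundException;

public class PerThreadClassPool {

    // One child pool per thread; it falls back to the shared default pool's
    // class path but keeps its own CtClass cache.
    private static final ThreadLocal<ClassPool> POOL = new ThreadLocal<ClassPool>() {
        @Override
        protected ClassPool initialValue() {
            return new ClassPool(ClassPool.getDefault());
        }
    };

    public static CtClass load(String className) throws NotFoundException {
        return POOL.get().get(className);
    }
}
Keep in mind that per-thread pools each cache their own CtClass instances, so this trades some memory for reduced contention.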
I am launching a batch of traversals on an embedded Neo4j 2.0.3 database.
23M nodes
87M relationships
16GB heap
Tried different cache settings
Tried different thread settings (from 20 down to 5 threads)
The job runs fine for some time (e.g. 1 hour) and then the throughput slows down dramatically because the threads spend most of their time waiting for locks.
It looks like the memory buffers (MappedPersistenceWindow) can't be shared by multiple threads, which seems a bit odd.
Some samples from the thread dump:
"taskExecutor-6" - Thread t#17
java.lang.Thread.State: BLOCKED
at java.lang.Object.wait(Native Method)
- waiting on <3a67f9f7> (a org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow)
at java.lang.Object.wait(Object.java:503)
at org.neo4j.kernel.impl.nioneo.store.LockableWindow.lock(LockableWindow.java:96)
at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:197)
at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:430)
at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getRecord(RelationshipStore.java:84)
at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction$3.load(NeoStoreTransaction.java:208)
at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction$3.load(NeoStoreTransaction.java:198)
at org.neo4j.kernel.impl.nioneo.xa.RecordChanges.getOrLoad(RecordChanges.java:63)
at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.relLoadLight(NeoStoreTransaction.java:1189)
at org.neo4j.kernel.impl.persistence.PersistenceManager.loadLightRelationship(PersistenceManager.java:109)
at org.neo4j.kernel.impl.core.NodeManager$2.loadById(NodeManager.java:114)
at org.neo4j.kernel.impl.core.NodeManager$2.loadById(NodeManager.java:110)
at org.neo4j.kernel.impl.cache.AutoLoadingCache.get(AutoLoadingCache.java:93)
at org.neo4j.kernel.impl.core.NodeManager.getRelationshipForProxy(NodeManager.java:544)
at org.neo4j.kernel.InternalAbstractGraphDatabase$6.lookupRelationship(InternalAbstractGraphDatabase.java:849)
at org.neo4j.kernel.impl.core.RelationshipProxy.getType(RelationshipProxy.java:141)
"taskExecutor-5" - Thread t#16
java.lang.Thread.State: RUNNABLE
at org.neo4j.kernel.impl.nioneo.store.LockableWindow.markAsInUse(LockableWindow.java:70)
- locked <6abe6e1> (a org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow)
at org.neo4j.kernel.impl.nioneo.store.BrickElement.getAndMarkWindow(BrickElement.java:94)
at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:147)
at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:430)
at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getChainRecord(RelationshipStore.java:325)
at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.getMoreRelationships(NeoStoreTransaction.java:2331)
at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.getMoreRelationships(NeoStoreTransaction.java:1390)
at org.neo4j.kernel.impl.persistence.PersistenceManager.getMoreRelationships(PersistenceManager.java:94)
at org.neo4j.kernel.impl.core.RelationshipLoader.getMoreRelationships(RelationshipLoader.java:50)
at org.neo4j.kernel.impl.core.NodeManager.getMoreRelationships(NodeManager.java:779)
at org.neo4j.kernel.impl.core.NodeImpl.loadMoreRelationshipsFromNodeManager(NodeImpl.java:577)
at org.neo4j.kernel.impl.core.NodeImpl.getMoreRelationships(NodeImpl.java:540)
- locked <2b29ed5b> (a org.neo4j.kernel.impl.core.NodeImpl)
at org.neo4j.kernel.impl.core.RelationshipIterator.fetchNextOrNull(RelationshipIterator.java:98)
at org.neo4j.kernel.impl.core.RelationshipIterator.fetchNextOrNull(RelationshipIterator.java:36)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:55)
at org.neo4j.kernel.impl.core.NodeImpl.hasRelationship(NodeImpl.java:644)
at org.neo4j.kernel.impl.core.NodeProxy.hasRelationship(NodeProxy.java:183)
"taskExecutor-4" - Thread t#15
java.lang.Thread.State: BLOCKED
at java.lang.Object.wait(Native Method)
- waiting on <3a67f9f7> (a org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow)
at java.lang.Object.wait(Object.java:503)
at org.neo4j.kernel.impl.nioneo.store.LockableWindow.lock(LockableWindow.java:96)
at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:197)
at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:430)
at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getChainRecord(RelationshipStore.java:325)
at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.getMoreRelationships(NeoStoreTransaction.java:2331)
at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.getMoreRelationships(NeoStoreTransaction.java:1390)
at org.neo4j.kernel.impl.persistence.PersistenceManager.getMoreRelationships(PersistenceManager.java:94)
at org.neo4j.kernel.impl.core.RelationshipLoader.getMoreRelationships(RelationshipLoader.java:50)
at org.neo4j.kernel.impl.core.NodeManager.getMoreRelationships(NodeManager.java:779)
at org.neo4j.kernel.impl.core.NodeImpl.loadMoreRelationshipsFromNodeManager(NodeImpl.java:577)
at org.neo4j.kernel.impl.core.NodeImpl.getMoreRelationships(NodeImpl.java:540)
- locked <41fe8c4f> (a org.neo4j.kernel.impl.core.NodeImpl)
at org.neo4j.kernel.impl.core.RelationshipIterator.fetchNextOrNull(RelationshipIterator.java:98)
at org.neo4j.kernel.impl.core.RelationshipIterator.fetchNextOrNull(RelationshipIterator.java:36)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:55)
at org.neo4j.kernel.impl.core.NodeImpl.hasRelationship(NodeImpl.java:644)
at org.neo4j.kernel.impl.core.NodeProxy.hasRelationship(NodeProxy.java:183)
"taskExecutor-3" - Thread t#14
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Native Method)
- waiting on <3a67f9f7> (a org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow)
at java.lang.Object.wait(Object.java:503)
at org.neo4j.kernel.impl.nioneo.store.LockableWindow.lock(LockableWindow.java:96)
at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:197)
at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:430)
at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getRecord(RelationshipStore.java:84)
at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction$3.load(NeoStoreTransaction.java:208)
at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction$3.load(NeoStoreTransaction.java:198)
at org.neo4j.kernel.impl.nioneo.xa.RecordChanges.getOrLoad(RecordChanges.java:63)
at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.relLoadLight(NeoStoreTransaction.java:1189)
at org.neo4j.kernel.impl.persistence.PersistenceManager.loadLightRelationship(PersistenceManager.java:109)
at org.neo4j.kernel.impl.core.NodeManager$2.loadById(NodeManager.java:114)
at org.neo4j.kernel.impl.core.NodeManager$2.loadById(NodeManager.java:110)
at org.neo4j.kernel.impl.cache.AutoLoadingCache.get(AutoLoadingCache.java:93)
at org.neo4j.kernel.impl.core.NodeManager.getRelationshipForProxy(NodeManager.java:544)
at org.neo4j.kernel.InternalAbstractGraphDatabase$6.lookupRelationship(InternalAbstractGraphDatabase.java:849)
at org.neo4j.kernel.impl.core.RelationshipProxy.getOtherNode(RelationshipProxy.java:108)
at org.neo4j.kernel.impl.traversal.TraversalBranchImpl.next(TraversalBranchImpl.java:145)
at org.neo4j.graphdb.traversal.PreorderDepthFirstSelector.next(PreorderDepthFirstSelector.java:49)
at org.neo4j.kernel.impl.traversal.MonoDirectionalTraverserIterator.fetchNextOrNull(MonoDirectionalTraverserIterator.java:68)
at org.neo4j.kernel.impl.traversal.MonoDirectionalTraverserIterator.fetchNextOrNull(MonoDirectionalTraverserIterator.java:35)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:55)
Any ideas?