I've got a UI automation framework that launches tests using TestNG and drives pages using Selenium WebDriver. The pages I'm testing often make AJAX calls that modify the DOM when they return. In these cases I use Selenium explicit waits to declare a DOM condition that must be met before the automation can proceed (e.g., some button becomes enabled).
Internally Selenium's FluentWait.until method handles this by polling the DOM for my ExpectedCondition every 500ms and calling Thread.sleep() in-between these checks.
When I run two tests back to back in a TestNG suite this works perfectly fine for the first test, but starts to fail with an InterruptedException about halfway through each subsequent test. This is consistent. The exceptions look like this:
Associated Throwable Type: class org.openqa.selenium.WebDriverException Associated Throwable Message: java.lang.InterruptedException: sleep interrupted
The strange thing is that there's no multi-threading going on here. I've disabled Selenium Grid, BrowserMob Proxy, and every other bit of code that could be conflicting. I've read both of these questions:
https://stackoverflow.com/questions/24495176/why-is-thread-sleep-being-interrupted - Closed for not providing enough detail, but one of the proposed answers states that one should override the Thread.interrupt method for debugging.
Who interrupts my thread? - Accepted answer also states that one should override the Thread.interrupt method for debugging.
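The override suggested in both answers can be sketched roughly as follows. This is a minimal illustration, assuming you control how the thread is created; InterruptTracingThread and lastInterruptSite() are names made up for this example and are not part of TestNG or Selenium. Because interrupt() is an ordinary overridable method, any caller that goes through it is recorded.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper: a Thread that remembers where it was interrupted from.
class InterruptTracingThread extends Thread {
    private final AtomicReference<Throwable> site = new AtomicReference<>();

    InterruptTracingThread(Runnable target, String name) {
        super(target, name);
    }

    @Override
    public void interrupt() {
        // A Throwable created here captures the stack of the *calling* thread.
        site.set(new Throwable(
                "interrupt() called from thread " + Thread.currentThread().getName()));
        super.interrupt();
    }

    // Returns the recorded call site, or null if interrupt() was never called.
    Throwable lastInterruptSite() {
        return site.get();
    }

    public static void main(String[] args) throws InterruptedException {
        InterruptTracingThread worker = new InterruptTracingThread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // Expected: main interrupts this thread below.
            }
        }, "traced-worker");
        worker.start();
        worker.interrupt();
        worker.join();
        worker.lastInterruptSite().printStackTrace();
    }
}
```

The limitation is that the thread has to be created as this subclass (for example through a custom ThreadFactory); threads created inside a third-party library are not affected.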
My problem with this solution is that placing a breakpoint inside the existing Thread.interrupt method does not reveal any calls around the time that the thread is interrupted. This includes calls from all of my third-party dependencies (e.g., TestNG and Selenium). Whatever is calling this thread interrupt appears to be external to my framework.
I've also tried calling Thread.currentThread().isInterrupted() at every point prior to the FluentWait.until call, and it consistently returns false. I've even used IntelliJ's evaluate function to check isInterrupted inside the Selenium code itself. The thread is only interrupted once the Thread.sleep call occurs inside FluentWait.until.
I've seen this happen on multiple Windows build servers as well as on my Macbook, so this does not appear to be machine specific.
I thought for a while that this might be caused by a TestNG timeout, but reducing the TestNG timeout in my suite yielded a different behavior than these interruptions.
Currently I'm working around this issue with the following code which swallows the exception and resumes the explicit wait:
public static boolean waitForElementStatus(Stuff)
{
/* snip - setup for ExpectedCondition (change) */
long startSeconds = new Date().getTime() / 1000;
long currentSeconds = startSeconds;
long remainingSeconds = maxElementStatusChangeSeconds;
WebDriverWait waitForElement = new WebDriverWait(driver, maxElementStatusChangeSeconds);
boolean changed = false;
boolean firstWait = true; // If specified time is 0 we still want to check once.
out:while(firstWait || remainingSeconds > 0)
{
firstWait = false;
Boolean exceptionThrown = false;
try
{
waitForElement.until(change);
}
catch(Throwable t)
{
exceptionThrown = true;
if(t.getCause() != null)
{
t = t.getCause(); // InterruptedException is wrapped inside a WebDriverException
}
if(t.getClass().equals(InterruptedException.class))
{
Thread.interrupted(); // clear interrupt status for this thread
currentSeconds = new Date().getTime() / 1000;
remainingSeconds = startSeconds + maxElementStatusChangeSeconds - currentSeconds;
if(remainingSeconds > 0)
{
String warning = String.format("Caught unidentified interrupt inside Selenium " +
"FluentWait.until call. Swallowing interrupt and repeating call with [%s] seconds " +
"remaining.", remainingSeconds);
CombinedLogger.warn(warning);
waitForElement = new WebDriverWait(driver, remainingSeconds);
}
else
{
// If a timeout exception would have been thrown instead of the interruption then
// we'll allow the WebDriverWait to execute one last time so it can throw the
// timeout instead.
waitForElement = new WebDriverWait(driver, 0);
}
}
else if(haltOnFailure) // for any other exception type such as TimeoutException
{
CombinedLogger.error(stuff + "...FAILURE(HALTING)", t);
break out;
}
else // for any other exception type such as TimeoutException
{
CombinedLogger.info(stuff + "...failure(non-halting)");
break out;
}
}
if(!exceptionThrown)
{
changed = true;
CombinedLogger.info(stuff + "...success ");
break out;
}
}
return changed;
}
This workaround does function, and fortunately these mystery interrupts are only occurring sporadically afterwards (they don't happen repeatedly), so the tests are able to proceed. However, I understand that swallowing InterruptedException is bad form. If possible, I'd like to determine where and why these interrupts are taking place so that I can put an end to them instead of using this hack.
Simply propagating the exceptions is not an option since these tests need to continue running instead of obediently crashing.
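One common way to keep going without losing the interrupt entirely is to finish the wait and then restore the thread's interrupt flag, so that code further up the stack can still see it. A minimal sketch, assuming a plain sleep rather than Selenium's FluentWait; the class and method names are made up for illustration:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Selenium or TestNG.
class UninterruptibleSleep {
    // Sleeps for the full duration even if interrupted along the way.
    // Returns true if at least one interrupt was seen; in that case the
    // interrupt flag is set again before returning.
    static boolean sleepUninterruptibly(long duration, TimeUnit unit) {
        boolean interrupted = false;
        try {
            long remainingNanos = unit.toNanos(duration);
            long end = System.nanoTime() + remainingNanos;
            while (true) {
                try {
                    // TimeUnit.sleep returns immediately for non-positive values.
                    TimeUnit.NANOSECONDS.sleep(remainingNanos);
                    return interrupted;
                } catch (InterruptedException e) {
                    interrupted = true;
                    remainingNanos = end - System.nanoTime();
                }
            }
        } finally {
            if (interrupted) {
                // Restore the flag instead of swallowing the interrupt.
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt(); // simulate a stray interrupt
        boolean seen = sleepUninterruptibly(200, TimeUnit.MILLISECONDS);
        System.out.println("interrupt seen: " + seen);
        System.out.println("flag restored: " + Thread.currentThread().isInterrupted());
    }
}
```

This does not explain where the interrupt comes from; it only avoids discarding it silently.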
Are there any known utilities, JVM arguments, or libraries that I could use which would help me track down Java thread interruptions that are caused by code which is out of my control?
Update 12/10/2014: I've captured two thread dumps. One is from immediately before the interrupt and one is from immediately after it. The only difference between the two is the line number of the interrupted thread (it goes from the try block to the catch block after being interrupted). Not sure what this tells me, but here's the data:
Full thread dump (immediately before interrupt)
"TestNG#1359" prio=5 tid=0xc nid=NA runnable
java.lang.Thread.State: RUNNABLE
at org.openqa.selenium.support.ui.FluentWait.until(FluentWait.java:232)
/* snip - company stuff */
at sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-1)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:84)
at org.testng.internal.InvokeMethodRunnable.runOne(InvokeMethodRunnable.java:46)
at org.testng.internal.InvokeMethodRunnable.run(InvokeMethodRunnable.java:37)
at org.testng.internal.MethodInvocationHelper.invokeWithTimeoutWithNoExecutor(MethodInvocationHelper.java:240)
at org.testng.internal.MethodInvocationHelper.invokeWithTimeout(MethodInvocationHelper.java:229)
at org.testng.internal.Invoker.invokeMethod(Invoker.java:724)
at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:901)
at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1231)
at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:127)
at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:111)
at org.testng.TestRunner.privateRun(TestRunner.java:767)
at org.testng.TestRunner.run(TestRunner.java:617)
at org.testng.SuiteRunner.runTest(SuiteRunner.java:348)
at org.testng.SuiteRunner.access$000(SuiteRunner.java:38)
at org.testng.SuiteRunner$SuiteWorker.run(SuiteRunner.java:382)
at org.testng.internal.thread.ThreadUtil$2.call(ThreadUtil.java:64)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
"main#1" prio=5 tid=0x1 nid=NA waiting
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Unsafe.java:-1)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:422)
at java.util.concurrent.FutureTask.get(FutureTask.java:199)
at java.util.concurrent.AbstractExecutorService.invokeAll(AbstractExecutorService.java:289)
at org.testng.internal.thread.ThreadUtil.execute(ThreadUtil.java:72)
at org.testng.SuiteRunner.runInParallelTestMode(SuiteRunner.java:367)
at org.testng.SuiteRunner.privateRun(SuiteRunner.java:308)
at org.testng.SuiteRunner.run(SuiteRunner.java:254)
at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
at org.testng.TestNG.runSuitesSequentially(TestNG.java:1224)
at org.testng.TestNG.runSuitesLocally(TestNG.java:1149)
at org.testng.TestNG.run(TestNG.java:1057)
at org.testng.remote.RemoteTestNG.run(RemoteTestNG.java:111)
at org.testng.remote.RemoteTestNG.initAndRun(RemoteTestNG.java:204)
at org.testng.remote.RemoteTestNG.main(RemoteTestNG.java:175)
at org.testng.RemoteTestNGStarter.main(RemoteTestNGStarter.java:125)
"Thread-8#2432" daemon prio=5 tid=0x15 nid=NA runnable
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(FileInputStream.java:-1)
at java.io.FileInputStream.read(FileInputStream.java:272)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
- locked <0xe08> (a java.lang.UNIXProcess$ProcessPipeInputStream)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at org.apache.commons.exec.StreamPumper.run(StreamPumper.java:105)
at java.lang.Thread.run(Thread.java:745)
"Thread-7#2431" daemon prio=5 tid=0x14 nid=NA runnable
java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(FileInputStream.java:-1)
at java.io.FileInputStream.read(FileInputStream.java:272)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
- locked <0xe09> (a java.lang.UNIXProcess$ProcessPipeInputStream)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at org.apache.commons.exec.StreamPumper.run(StreamPumper.java:105)
at java.lang.Thread.run(Thread.java:745)
"Thread-6#2424" prio=5 tid=0x13 nid=NA waiting
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Object.java:-1)
at java.lang.Object.wait(Object.java:503)
at java.lang.UNIXProcess.waitFor(UNIXProcess.java:261)
at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:347)
at org.apache.commons.exec.DefaultExecutor.access$200(DefaultExecutor.java:46)
at org.apache.commons.exec.DefaultExecutor$1.run(DefaultExecutor.java:188)
"process reaper#2008" daemon prio=10 tid=0x10 nid=NA runnable
java.lang.Thread.State: RUNNABLE
at java.lang.UNIXProcess.waitForProcessExit(UNIXProcess.java:-1)
at java.lang.UNIXProcess.access$500(UNIXProcess.java:54)
at java.lang.UNIXProcess$4.run(UNIXProcess.java:225)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
"ReaderThread#645" prio=5 tid=0xb nid=NA runnable
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(SocketInputStream.java:-1)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
- locked <0xe0b> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:154)
at java.io.BufferedReader.readLine(BufferedReader.java:317)
at java.io.BufferedReader.readLine(BufferedReader.java:382)
at org.testng.remote.strprotocol.BaseMessageSender$ReaderThread.run(BaseMessageSender.java:245)
"Finalizer#2957" daemon prio=8 tid=0x3 nid=NA waiting
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Object.java:-1)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)
"Reference Handler#2958" daemon prio=10 tid=0x2 nid=NA waiting
java.lang.Thread.State: WAITING
at java.lang.Object.wait(Object.java:-1)
at java.lang.Object.wait(Object.java:503)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
"Signal Dispatcher#2956" daemon prio=9 tid=0x4 nid=NA runnable
java.lang.Thread.State: RUNNABLE
Not much can be inferred from the thread dump about what caused the interrupt.
In practice, though, you cannot rely on Thread.sleep() running to completion; it may be interrupted for known or unknown reasons. In the latter case the OS might be the cause.
Thread.sleep() is one of the few methods that take interruption seriously: it throws InterruptedException as soon as the sleeping thread is interrupted, so the calling code has to handle it.
What you are doing right now may be less of a workaround and more the way to go in cases where you cannot do without Thread.sleep().
This is a bit dated, but I had a similar problem, and with the help of the link you posted earlier (https://stackoverflow.com/a/2476246) I put a breakpoint in the Thread.interrupt() method.
It revealed that the interrupt came from the StoryManager.waitUntilAllDoneOrFailed() method, which calls future.cancel() once the timeout set on the whole story expires.
My whole setup is:
page.getPageObject().withTimeoutOf(convertDuration(duration)).waitFor(by);
where duration is about 60 secs. (the minute is due to some async stuff)
and
configuredEmbedder().embedderControls().useStoryTimeouts("30");
And the stackTrace is:
at java.util.concurrent.FutureTask.cancel(FutureTask.java:174)
at org.jbehave.core.embedder.StoryManager.waitUntilAllDoneOrFailed(StoryManager.java:184)
at org.jbehave.core.embedder.StoryManager.performStories(StoryManager.java:121)
at org.jbehave.core.embedder.StoryManager.runStories(StoryManager.java:107)
and that later interrupts the Thread.sleep() call in ThucydidesFluentWait.doWait() (essentially in the sleep() method of the underlying Sleeper instance).
Increasing the story timeout, or setting the waitFor(...) timeout consistently with the story timeout, solved the problem on my side.
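The mechanism can be reproduced outside JBehave with plain java.util.concurrent. The sketch below is an illustration only, with made-up names: a task sleeps longer than the caller is willing to wait, the caller cancels the Future with cancel(true), and the sleeping task receives an InterruptedException.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; not JBehave or Thucydides code.
class CancelInterruptsSleep {
    // Runs a task that sleeps for taskSleepMs while the caller waits at most
    // timeoutMs. Returns true if the task's sleep was interrupted.
    static boolean sleepWasInterrupted(long taskSleepMs, long timeoutMs) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);
        try {
            Runnable task = () -> {
                try {
                    Thread.sleep(taskSleepMs);
                } catch (InterruptedException e) {
                    interrupted.set(true);
                } finally {
                    finished.countDown();
                }
            };
            Future<?> future = executor.submit(task);
            try {
                future.get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The outer timeout expired first: cancel(true) interrupts the worker.
                future.cancel(true);
            }
            finished.await(5, TimeUnit.SECONDS);
            return interrupted.get();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("long wait, short timeout: " + sleepWasInterrupted(10_000, 100));
        System.out.println("short wait, long timeout: " + sleepWasInterrupted(10, 2_000));
    }
}
```

The inner wait is only safe from this kind of interrupt when it is shorter than the outer timeout, which matches what I saw with the 60-second waitFor against the 30-second story timeout.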
I have written the following program. Basically, I am using the executor framework to manage threads. I've also used a BlockingQueue and deliberately kept it empty so that the thread remains in a waiting state.
The program is below:
package com.example.executors;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
public class ExecutorDemo {
public static void main(String[] args) throws InterruptedException {
ScheduledExecutorService scheduledThreadPool = null;
BlockingQueue<Integer> bq = new LinkedBlockingQueue<>();
scheduledThreadPool = Executors.newSingleThreadScheduledExecutor((Runnable run) -> {
Thread t = Executors.defaultThreadFactory().newThread(run);
t.setDaemon(true);
t.setName("Worker-pool-" + Thread.currentThread().getName());
t.setUncaughtExceptionHandler(
(thread, e) -> System.out.println("thread is --> " + thread + "exception is --> " + e));
return t;
});
ScheduledFuture<?> f = scheduledThreadPool.scheduleAtFixedRate(() -> {
System.out.println("Inside thread.. working");
try {
bq.take();
} catch (InterruptedException e) {
e.printStackTrace();
}
}, 2000, 30000, TimeUnit.MILLISECONDS);
System.out.println("f.isDone() ---> " + f.isDone());
Thread.sleep(100000000000L);
}
}
Once the program runs, the main thread remains in the TIMED_WAITING state because of Thread.sleep(). In the thread managed by the executor, I make it read from an empty blocking queue, so this thread remains in the WAITING state forever. I wanted to see what the thread dump looks like in this scenario. I have captured it below:
"Worker-pool-main" #10 daemon prio=5 os_prio=31 tid=0x00007f7ef393d800 nid=0x5503 waiting on condition [0x000070000a3d8000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007955f7110> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at com.example.cs.executors.CSExecutorUnderstanding.lambda$2(CSExecutorUnderstanding.java:34)
at com.example.cs.executors.CSExecutorUnderstanding$$Lambda$2/1705736037.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
As expected, the thread Worker-pool-main remains in the WAITING state. My question is about the thread dump.
Since the executor service manages the life cycle of threads in the executor framework, why does this thread's stack start with the Thread.run() method?
Shouldn't some part of the executor appear first, and then Thread.run()?
Basically, the question is: when the life cycle is managed by the executor, how come Thread.run() appears first (at the bottom of the stack) and the executor frames appear above it? The executor starts these threads, so why do its frames appear further up the stack?
When you start a new Thread, it will execute its run method on a completely new call stack. That is the entrypoint for the code in that Thread. It is completely decoupled from the thread that called start. The "parent" thread continues to run its own code on its own stack independently, and if either of the two threads crashes or completes it does not impact the other.
The only thing that shows up in a thread's stack frames is whatever gets called inside of run. You don't get to see who called run (the JVM did that). Unless of course, you confused start with run and called the run directly from your own code. Then there is no new thread involved at all.
Here, the thread is not created by your own code directly, but by the executor service. But that one does not do anything different, it also has to create threads by calling constructors and start them using start. The end result is the same.
What run usually does is delegate to a Runnable that has been set in its constructor. You see that here: The executor service has installed a ThreadPoolExecutor$Worker instance. This one contains all the code to be run on the new thread and control its interactions with the executor.
That ThreadPoolExecutor$Worker in turn will then call into its payload code, your application code, the tasks that have been submitted to the executor. In your case, that is com.example.cs.executors.CSExecutorUnderstanding$$Lambda$2/1705736037.
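You can check this ordering yourself by capturing the stack from inside a submitted task. The following is a small sketch with a made-up class name; it only uses standard JDK calls.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class for inspecting a pool thread's own stack.
class WorkerStackDemo {
    // Returns the stack trace as seen from inside a task running on a pool thread.
    static StackTraceElement[] captureWorkerStack() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<StackTraceElement[]> task = () -> Thread.currentThread().getStackTrace();
            return executor.submit(task).get();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        StackTraceElement[] stack = captureWorkerStack();
        // Index 0 is the innermost frame; the last element is the thread's entry point.
        for (StackTraceElement frame : stack) {
            System.out.println("  at " + frame);
        }
        System.out.println("entry point: " + stack[stack.length - 1]);
    }
}
```

The last frame is the new thread's entry point, Thread.run, with ThreadPoolExecutor$Worker.run and the task itself above it. Nothing from the thread that called submit appears, because that call happened on a different stack.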
I have the following code snippet in my actual code. I have multiple consumers listening to a queue.
@RabbitListener
private void abc(ETLConfigDTO config){
try{
log.info("load started");
loadService.loadData(config);
}
catch(Exception e){
log.error("Load failed");
}
finally{
log.info("finished processing");
}
}
loadData() takes anywhere from a few minutes to a few hours to process. It's a kind of ETL processing. There is intensive logging inside this method, so I know which state the process is in.
The problem is that the process gets stuck inside the loadPlans() method. The message in the queue is in the unacknowledged state since it is still processing, which is what I want.
There is no exception: neither the catch block nor the finally block prints anything.
I also have a Spring cron job (5-minute interval) in the same class, which keeps running fine and doing its tasks.
Note that this runs fine if I do not use RabbitMQ (AMQP).
Is there a connection or network drop? Or a timeout? Or is the main thread hung or dead? I really don't understand what is happening here.
Thanks in advance.
UPDATE:
Thanks Gary,
I see this in jstack 19:
"SimpleAsyncTaskExecutor-1" #25 prio=5 os_prio=0 tid=0x00007f5615b3d800 nid=0x2f runnable [0x00007f56703cd000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
- locked <0x000000067a6bada8> (a java.lang.Object)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:940)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
- locked <0x000000067a6baea0> (a sun.security.ssl.AppInputStream)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked <0x000000067a717cb8> (a java.io.BufferedInputStream)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569)
- locked <0x000000067a6b4b60> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
- locked <0x000000067a6b4b60> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338)
at org.springframework.http.client.SimpleClientHttpResponse.getRawStatusCode(SimpleClientHttpResponse.java:48)
at org.springframework.http.client.AbstractClientHttpResponse.getStatusCode(AbstractClientHttpResponse.java:33)
at org.springframework.web.client.DefaultResponseErrorHandler.getHttpStatusCode(DefaultResponseErrorHandler.java:56)
at org.springframework.web.client.DefaultResponseErrorHandler.hasError(DefaultResponseErrorHandler.java:50)
at org.springframework.web.client.RestTemplate.handleResponse(RestTemplate.java:602)
at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:570)
at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:530)
at org.springframework.web.client.RestTemplate.exchange(RestTemplate.java:448)
..
...
...
...
Please advise.
NEW UPDATE:
I have increased memory: -XX:MaxMetaspaceSize=1024M -Xms4096M -Xmx4096M
The thread is now stuck on the Oracle connection.
"SimpleAsyncTaskExecutor-1" #25 prio=5 os_prio=0 tid=0x00007ff6102c8800 nid=0x33 runnable [0x00007ff619ad9000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at oracle.net.ns.Packet.receive(Packet.java:300)
at oracle.net.ns.DataPacket.receive(DataPacket.java:106)
at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:315)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:260)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:185)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:102)
at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:124)
at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:80)
at oracle.jdbc.driver.T4CMAREngine.unmarshalUB1(T4CMAREngine.java:1137)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:290)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:192)
at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:193)
at oracle.jdbc.driver.T4CStatement.executeForRows(T4CStatement.java:1033)
at oracle.jdbc.driver.OracleStatement.executeBatch(OracleStatement.java:4536)
- locked <0x00000007b01c6b20> (a oracle.jdbc.driver.T4CConnection)
at oracle.jdbc.driver.OracleStatementWrapper.executeBatch(OracleStatementWrapper.java:230)
at org.springframework.jdbc.core.JdbcTemplate$1BatchUpdateStatementCallback.doInStatement(JdbcTemplate.java:572)
at org.springframework.jdbc.core.JdbcTemplate$1BatchUpdateStatementCallback.doInStatement(JdbcTemplate.java:559)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:405)
at org.springframework.jdbc.core.JdbcTemplate.batchUpdate(JdbcTemplate.java:611)
...
...
Your listener is rather unusual; in most cases it would be void listen(SomeObject) where the listener processes the object and exits and the message is acknowledged.
You appear to be ignoring the contents of the message and simply using its presence to trigger loadData().
Regardless, by default, the message won't be acknowledged until the method exits; the container thread will remain in the listener method until it exits.
The default acknowledge mode for the container is AUTO which means the container will automatically acknowledge (or reject) the message when the method exits.
You can change the acknowledge mode to NONE which means RabbitMQ does not require an acknowledgment at all and will remove the message immediately.
However, the container thread will still run in the method until the method exits.
The message will be lost if the application crashes.
The problem is that an application hangs indefinitely from time to time.
It seems that the bug sits in the following snippet:
ForkJoinPool pool = new ForkJoinPool(1); // parallelism = 1
List<String> entries = ...;
pool.submit(() -> {
entries.stream().parallel().forEach(entry -> {
// An I/O op.
...
});
}).get();
Thread pool-4-thread-1 that executes the code freezes on get():
"pool-4-thread-1" #35 prio=5 os_prio=0 tid=0x00002b42e4013800 nid=0xb7d1 in Object.wait() [0x00002b427b72f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.util.concurrent.ForkJoinTask.externalInterruptibleAwaitDone(ForkJoinTask.java:367)
- locked <0x00000000e08b68b8> (a java.util.concurrent.ForkJoinTask$AdaptedRunnableAction)
at java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1001)
...other app methods
One could assume that the task passed to submit() simply takes too long to execute.
But surprisingly, there are no ForkJoinPool-N-worker-N threads in the thread dump, so it looks like the pool isn't performing any computations!
How is that possible? If the pool is not executing any tasks, why does the pool-4-thread-1 thread wait inside get()?
P.S. I know that it's not recommended to execute I/O-related tasks in a ForkJoinPool, but I'm still interested in the root of the problem.
Update. When parallelism is set to a value greater than 1, no problems are detected.
Setting parallelism = N where N > 1 solved the problem.
It's strange, but there seems to be a bug in ForkJoinPool similar to what is stated here.
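To check which threads actually run the stream body, something like the sketch below can help. It relies on an assumption worth stating loudly: that a parallel stream started from inside a ForkJoinPool task executes on that pool's workers. This is widely relied upon but is an implementation detail rather than a documented guarantee. The class and method names are made up, and the bounded get() is there so that the check itself cannot hang forever.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

// Hypothetical diagnostic helper.
class StreamInCustomPool {
    // Runs a parallel forEach inside a dedicated pool and returns the names
    // of the threads that processed the entries.
    static Set<String> workerNames(int parallelism, List<String> entries) throws Exception {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        Set<String> names = ConcurrentHashMap.newKeySet();
        try {
            Runnable job = () -> entries.parallelStream()
                    .forEach(entry -> names.add(Thread.currentThread().getName()));
            // Bounded wait: throws TimeoutException instead of blocking indefinitely.
            pool.submit(job).get(30, TimeUnit.SECONDS);
            return names;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> entries = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h");
        System.out.println(workerNames(2, entries));
    }
}
```

With parallelism greater than 1 the names have the ForkJoinPool-N-worker-N form, which is the pattern that was missing from the thread dump of the hanging application.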
I'm investigating a strange issue with the Tomcat shutdown process: after running shutdown.sh, the Java process is still there (I check this with ps -ef|grep tomcat).
The situation is a bit complicated because I have very limited access to the server (no debugging, for example).
I took a thread dump (using kill -3 <PID>) and a heap dump using remote JConsole and HotSpot features.
After looking into the thread dump I found this:
"pool-2-thread-1" #74 prio=5 os_prio=0 tid=0x00007f3354359800 nid=0x7b46 waiting on condition [0x00007f333e55d000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c378d330> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
So my understanding of the problem is as follows: there is a resource (a DB connection or something else) that is used in a CachedThreadPool, and this resource is now locked,
preventing the thread pool-2-thread-1 from stopping. Assuming that this thread isn't a daemon, the JVM cannot stop gracefully.
Is there a way to find out which resource is locked, where it is locked from, and how to avoid that? Another question is: how can I prevent this situation?
One more thing: what is the address 0x00007f333e55d000 for?
Thanks!
After a while of struggling I found out that the stuck thread always has the same name, pool-2-thread-1. I grepped all the sources of the project for places where a scheduled thread pool is started: Executors#newScheduledThreadPool. After a huge amount of time I had added logging in all of those places, including the name of the thread in the thread pool.
One more server restart and gotcha!
I found one thread pool that is started with only one thread, and used the following code:
private void shutdownThreadpool(int tries) {
try {
boolean terminated;
LOGGER.debug("Trying to shutdown thread pool in {} tries", tries);
pool.shutdown();
do {
terminated = pool.awaitTermination(WAIT_ON_SHUTDOWN,TimeUnit.MILLISECONDS);
if (--tries == 0
&& !terminated) {
LOGGER.debug("After 10 attempts couldn't shutdown thread pool, force shutdown");
pool.shutdownNow();
terminated = pool.awaitTermination(WAIT_ON_SHUTDOWN, TimeUnit.MILLISECONDS);
if (!terminated) {
LOGGER.debug("Cannot stop thread pool even with force");
LOGGER.trace("Some of the workers doesn't react to Interruption event properly");
terminated = true;
}
} else {
LOGGER.info("After {} attempts doesn't stop", tries);
}
} while (!terminated);
LOGGER.debug("Successfully stop thread pool");
} catch (final InterruptedException ie) {
LOGGER.warn("Thread pool shutdown interrupted");
}
}
After that the issue was solved.
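For comparison, here is a shorter version of the same idea, following the two-phase shutdown pattern described in the ExecutorService Javadoc. It is a sketch with made-up names (PoolShutdown, shutdownAndAwait), not the code from my project.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for stopping a pool so its threads do not keep the JVM alive.
class PoolShutdown {
    // Returns true if the pool terminated within the allowed time.
    static boolean shutdownAndAwait(ExecutorService pool, long timeoutMs) {
        pool.shutdown(); // stop accepting new tasks
        try {
            if (!pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
                pool.shutdownNow(); // interrupt running tasks
                return pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
            }
            return true;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
    }

    public static void main(String[] args) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        pool.scheduleAtFixedRate(() -> { }, 0, 10, TimeUnit.MILLISECONDS);
        System.out.println("terminated: " + shutdownAndAwait(pool, 2_000));
    }
}
```

In a web application this kind of call would typically go into whatever hook runs when the application is stopped, so that the non-daemon pool thread is gone before Tomcat tries to exit.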
Could somebody help me and explain what exactly is going on, looking at the following thread dump. It is a web application running on Tomcat 7 and we experience that some requests do not get answered:
"ajp-bio-8012-exec-161" daemon prio=10 tid=0x00007fe170603000 nid=0x344f runnable [0x00007fe174fae000]
java.lang.Thread.State: RUNNABLE
at java.security.AccessController.doPrivileged(Native Method)
at java.io.FilePermission.init(FilePermission.java:209)
at java.io.FilePermission.<init>(FilePermission.java:285)
at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
at java.io.File.exists(File.java:808)
at sun.misc.URLClassPath$FileLoader.getResource(URLClassPath.java:1080)
at sun.misc.URLClassPath$FileLoader.findResource(URLClassPath.java:1047)
at sun.misc.URLClassPath.findResource(URLClassPath.java:176)
at java.net.URLClassLoader$2.run(URLClassLoader.java:551)
at java.net.URLClassLoader$2.run(URLClassLoader.java:549)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findResource(URLClassLoader.java:548)
at java.lang.ClassLoader.getResource(ClassLoader.java:1147)
at java.lang.ClassLoader.getResource(ClassLoader.java:1142)
at org.apache.catalina.loader.WebappClassLoader.getResource(WebappClassLoader.java:1445)
at java.lang.Class.getResource(Class.java:2142)
at javassist.ClassClassPath.find(ClassClassPath.java:84)
at javassist.ClassPoolTail.find(ClassPoolTail.java:317)
at javassist.ClassPool.find(ClassPool.java:495)
at javassist.ClassPool.createCtClass(ClassPool.java:479)
at javassist.ClassPool.get0(ClassPool.java:445)
- locked <0x00000000db6cdb48> (a javassist.ClassPool)
at javassist.ClassPool.get(ClassPool.java:414)
at javassist.compiler.MemberResolver.lookupClass0(MemberResolver.java:425)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:389)
at javassist.compiler.MemberResolver.lookupClassByJvmName(MemberResolver.java:310)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:327)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:314)
at javassist.compiler.Javac.compileField(Javac.java:122)
at javassist.compiler.Javac.compile(Javac.java:91)
at javassist.CtField.make(CtField.java:163)
at com.mycompany.validation.util.BeanGenerator.addProperty(BeanGenerator.java:157)
...
at java.lang.Thread.run(Thread.java:745)
"ajp-bio-8012-exec-12" daemon prio=10 tid=0x0000000000c49800 nid=0x3d99 waiting for monitor entry [0x00007fe1756b7000]
java.lang.Thread.State: BLOCKED (on object monitor)
at javassist.ClassPool.get0(ClassPool.java:432)
- waiting to lock <0x00000000db6cdb48> (a javassist.ClassPool)
at javassist.ClassPool.get(ClassPool.java:414)
at javassist.compiler.MemberResolver.lookupClass0(MemberResolver.java:425)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:389)
at javassist.compiler.MemberResolver.lookupClassByJvmName(MemberResolver.java:310)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:327)
at javassist.compiler.MemberResolver.lookupClass(MemberResolver.java:314)
at javassist.compiler.Javac.compileField(Javac.java:122)
at javassist.compiler.Javac.compile(Javac.java:91)
at javassist.CtField.make(CtField.java:163)
at com.mycompany.validation.util.BeanGenerator.addProperty(BeanGenerator.java:157)
...
at java.lang.Thread.run(Thread.java:745)
"ajp-bio-8012-exec-5" daemon prio=10 tid=0x0000000001603800 nid=0x7c77 waiting for monitor entry [0x00007fe174aaa000]
java.lang.Thread.State: BLOCKED (on object monitor)
at javassist.ClassPool.makeClass(ClassPool.java:621)
- waiting to lock <0x00000000db6cdb48> (a javassist.ClassPool)
at javassist.ClassPool.makeClass(ClassPool.java:606)
at com.mycompany.validation.util.BeanGenerator.init(BeanGenerator.java:334)
...
at java.lang.Thread.run(Thread.java:745)
I am not a specialist in thread dumps and just would like to know how to get rid of the problem.
Has the first thread locked an object (ClassPool) that the two other request threads are waiting on? Why is it locked? Do I have to synchronize, and how?
Thanks for any hint!
You have a single ClassPool in your application, which has the monitor ID 0x00000000db6cdb48.
Look at the method signature of ClassPool.get0:
protected synchronized CtClass get0(String classname, boolean useCache)
So it is synchronized. I suppose you simply used ClassPool.getDefault(). You have to know that this method returns a single instance, so all of your calls will be synchronized.
Create a new pool for each thread (with new ClassPool(ClassPool.getDefault()), for example) and it will be fine.
Apart from that, you might check why your security manager takes so long.
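The BLOCKED state in your dump can be reproduced with a few lines of plain Java. In the sketch below, SharedPool is a hypothetical stand-in for a shared object with a synchronized method (Javassist itself is not used): one thread holds the monitor while a second caller of the same instance shows up as BLOCKED.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo; SharedPool only imitates "one instance, synchronized method".
class MonitorContentionDemo {
    static class SharedPool {
        synchronized void slowLookup(CountDownLatch entered, CountDownLatch release)
                throws InterruptedException {
            entered.countDown(); // signal that the monitor is now held
            release.await();     // simulate slow work while holding it
        }
    }

    // Returns the state of a second thread calling the same instance
    // while the first thread is still inside the synchronized method.
    static Thread.State observeSecondCallerState() throws InterruptedException {
        SharedPool pool = new SharedPool();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Runnable call = () -> {
            try {
                pool.slowLookup(entered, release);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread holder = new Thread(call, "holder");
        Thread waiter = new Thread(call, "waiter");
        holder.start();
        entered.await(); // holder owns the monitor from here on
        waiter.start();
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while (waiter.getState() != Thread.State.BLOCKED && System.nanoTime() < deadline) {
            Thread.sleep(10);
        }
        Thread.State observed = waiter.getState();
        release.countDown(); // let both threads finish
        holder.join();
        waiter.join();
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("second caller state: " + observeSecondCallerState());
    }
}
```

Giving each thread its own instance removes this contention, because each thread then synchronizes on a different monitor.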