Handling job/step exceptions with asynchronous TaskManager

I can't find a proper way to handle Spring Batch exceptions in an asynchronous context.
When I set a ThreadPoolTaskManager to my JobLauncher, the real job/step exception is not logged anymore. Instead the log will be something like:
org.springframework.batch.core.JobInterruptedException: Job interrupted by step execution
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:165)
at ...
I tried to resolve this adding a JobExecutionListener like this:
@Override
public void afterJob(JobExecution jobExecution) {
List<Throwable> jobExceptions = jobExecution.getFailureExceptions();
if (CollectionUtils.isNotEmpty(jobExceptions)) {
Throwable lastJobException = jobExceptions.get(jobExceptions.size() - 1);
LOGGER.error("Spring-Batch error at job level", lastJobException);
String lastJobExceptionMessage = ExceptionUtils.getRootCauseMessage(lastJobException);
// storing message in ExecutionContext for the batch-admin webapp
String message = "";
if (jobExecution.getExecutionContext().get(Consts.JOB_EXECUTION_MESSAGE_KEY) != null) {
message = jobExecution.getExecutionContext().getString(Consts.JOB_EXECUTION_MESSAGE_KEY);
}
message += "\n" + lastJobExceptionMessage;
jobExecution.getExecutionContext().put(Consts.JOB_EXECUTION_MESSAGE_KEY, message);
}
}
But I still end up with a JobInterruptedException.
Is there a way to retrieve the initial cause of the interruption (which might be an error in the reader/processor/writer code)?

I don't think your diagnosis is correct. That exception is thrown with that error message only in SimpleStepHandler:
if (currentStepExecution.getStatus() == BatchStatus.STOPPING
|| currentStepExecution.getStatus() == BatchStatus.STOPPED) {
// Ensure that the job gets the message that it is stopping
execution.setStatus(BatchStatus.STOPPING);
throw new JobInterruptedException("Job interrupted by step execution");
}
and only if the step itself didn't throw JobInterruptedException. The most obvious case where this can happen is if the job was stopped. See this example, whose output ends with
INFO: Executing step: [step1]
Feb 24, 2016 1:25:02 PM org.springframework.batch.core.repository.support.SimpleJobRepository checkForInterruption
INFO: Parent JobExecution is stopped, so passing message on to StepExecution
Feb 24, 2016 1:25:02 PM org.springframework.batch.core.step.ThreadStepInterruptionPolicy isInterrupted
INFO: Step interrupted through StepExecution
Feb 24, 2016 1:25:02 PM org.springframework.batch.core.step.AbstractStep execute
INFO: Encountered interruption executing step step1 in job myJob : Job interrupted status detected.
Feb 24, 2016 1:25:02 PM org.springframework.batch.core.repository.support.SimpleJobRepository checkForInterruption
INFO: Parent JobExecution is stopped, so passing message on to StepExecution
Feb 24, 2016 1:25:02 PM org.springframework.batch.core.job.AbstractJob execute
INFO: Encountered interruption executing job: Job interrupted by step execution
Feb 24, 2016 1:25:02 PM org.springframework.batch.core.launch.support.SimpleJobLauncher$1 run
INFO: Job: [SimpleJob: [name=myJob]] completed with the following parameters: [{}] and the following status: [STOPPED]
Status is: STOPPED
This other example shows that throwing an exception when using a thread pool changes nothing. The final output is
INFO: Executing step: [step1]
Feb 24, 2016 1:28:44 PM org.springframework.batch.core.step.AbstractStep execute
SEVERE: Encountered an error executing step step1 in job myJob
java.lang.RuntimeException: My exception
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
(...)
Feb 24, 2016 1:28:44 PM org.springframework.batch.core.launch.support.SimpleJobLauncher$1 run
INFO: Job: [SimpleJob: [name=myJob]] completed with the following parameters: [{}] and the following status: [FAILED]
Status is: FAILED, job execution id 0
#1 step1 FAILED
Step step1
java.lang.RuntimeException: My exception
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
(...)

So the answer was so simple that I felt really stupid when I understood it:
@Artefacto was right. The job was stopped. By the end of the process. Because it reached the end of the main() method.
When I switched to an asynchronous mode with my ThreadPoolTaskManager, I forgot to add one very important line to my main method:
// Wait for the end of the JobExecution
main.endOfJobLatch.await();
Hope this answer will help someone else...
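The launcher code is not shown in the post, so the following is only a minimal sketch of the latch idea using plain java.util.concurrent, with no Spring Batch classes involved. The class name AsyncJobWait, the method runAndWait and the timeout are assumptions made for illustration; in a real Spring Batch setup the countDown() call would presumably live in a JobExecutionListener.afterJob implementation, and the latch would be the endOfJobLatch field mentioned above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a main class that launches a job asynchronously.
class AsyncJobWait {

    /**
     * Runs the given "job" on a pool thread and blocks the caller until the job
     * signals completion or the timeout expires.
     * Returns true if the job finished in time.
     */
    static boolean runAndWait(Runnable job, long timeoutSeconds) throws InterruptedException {
        CountDownLatch endOfJobLatch = new CountDownLatch(1);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(() -> {
                try {
                    job.run();
                } finally {
                    // Assumption: with Spring Batch this would be called from afterJob(...)
                    endOfJobLatch.countDown();
                }
            });
            // Without this wait, the caller would carry on and tear things down
            // while the job is still running on the pool thread.
            return endOfJobLatch.await(timeoutSeconds, TimeUnit.SECONDS);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        boolean finished = runAndWait(() -> System.out.println("job running"), 10);
        System.out.println("finished = " + finished);
    }
}
```

Using await with a timeout rather than the bare await() is a design choice in this sketch: it keeps main() from hanging forever if the job never signals completion.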


HttpMethodDirector executeWithRetry and SSLProtocolException in Java

I am using the httpclient-3.0 library to send data to a cloud service. When I run the application on my local machine (Windows 10), it works fine: the data is sent to the server and I receive a success response. But when I deploy it on our server, which runs Windows Server 2012 R2, it throws the error below. I am using the same JDK on both machines, and I have tried several things, such as adding -Djdk.tls.client.protocols="TLSv1,TLSv1.1,TLSv1.2" in my JDK's java.security file, but the issue is still not resolved.
Here is my code
PostMethod post = new PostMethod(apiUrl);
post.setParameter("authtoken", authToken);
post.setParameter("dateFormat", dateTimeFormat);
post.setParameter("data", emloyeesAttendanceJsonArr.toString());
HttpClient httpclient = new HttpClient();
// Configuring proxy
httpclient.getHostConfiguration().setProxy("**.**.**.**", ****);
try {
long timeTrace = System.currentTimeMillis();
int result = httpclient.executeMethod(post);
System.out.println(">> HTTP Response status code: "+result);
System.out.println(">> Response Time: "+(System.currentTimeMillis() - timeTrace));
.......
.......
.......
}
I appreciate any quick help and guidelines.
Here is the error I get
Mar 11, 2020 4:23:08 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: I/O exception (javax.net.ssl.SSLProtocolException) caught when processing request: Connection reset
Mar 11, 2020 4:23:08 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: Retrying request
Mar 11, 2020 4:23:23 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: I/O exception (javax.net.ssl.SSLProtocolException) caught when processing request: Connection reset
Mar 11, 2020 4:23:23 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: Retrying request
Mar 11, 2020 4:23:38 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: I/O exception (javax.net.ssl.SSLProtocolException) caught when processing request: Connection reset
Mar 11, 2020 4:23:38 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry
INFO: Retrying request
javax.net.ssl.SSLProtocolException: Connection reset
at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:126)
at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:321)
at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:264)
at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:259)
at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:137)
at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1152)
at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1063)
at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:402)
at java.base/sun.security.ssl.SSLSocketImpl.ensureNegotiated(SSLSocketImpl.java:716)
at java.base/sun.security.ssl.SSLSocketImpl$AppOutputStream.write(SSLSocketImpl.java:970)
at java.base/java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:81)
at java.base/java.io.BufferedOutputStream.flush(BufferedOutputStream.java:142)
at java.base/java.io.FilterOutputStream.flush(FilterOutputStream.java:153)
at org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:502)
at org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:1973)
at org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:993)
at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:397)
at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)
at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:324)
at af.aib.etl.AttendanceETL.fetchAndParseAttendanceRecord(AttendanceETL.java:99)
at af.aib.attendance.ApplicationStartPoint.main(ApplicationStartPoint.java:28)
Caused by: java.net.SocketException: Connection reset
at java.base/java.net.SocketInputStream.read(SocketInputStream.java:186)
at java.base/java.net.SocketInputStream.read(SocketInputStream.java:140)
at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:448)
at java.base/sun.security.ssl.SSLSocketInputRecord.decode(SSLSocketInputRecord.java:165)
at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:108)
The local proxy and the server proxy were different, which is why it ran fine on my local machine but not on the server. Once I changed the proxy setting to the proxy that the server actually uses, the application worked.
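One way to avoid this kind of mismatch is to read the proxy from configuration instead of hard-coding it, so each machine can supply its own value. The sketch below is only an illustration using the standard library; the class name ProxySettings and the property names attendance.proxy.host and attendance.proxy.port are made up for this example and are not part of the original code.

```java
import java.net.InetSocketAddress;

// Hypothetical helper: resolves the proxy address from JVM system properties.
class ProxySettings {

    /**
     * Returns the proxy address configured under the given property keys,
     * or null when no proxy host/port is configured on this machine.
     */
    static InetSocketAddress fromProperties(String hostKey, String portKey) {
        String host = System.getProperty(hostKey);
        String port = System.getProperty(portKey);
        if (host == null || host.isEmpty() || port == null || port.isEmpty()) {
            return null; // no proxy configured
        }
        // createUnresolved avoids a DNS lookup at configuration time
        return InetSocketAddress.createUnresolved(host, Integer.parseInt(port));
    }

    public static void main(String[] args) {
        // Example launch: java -Dattendance.proxy.host=... -Dattendance.proxy.port=... ProxySettings
        InetSocketAddress proxy = fromProperties("attendance.proxy.host", "attendance.proxy.port");
        System.out.println(proxy == null ? "no proxy configured" : "proxy = " + proxy);
    }
}
```

With the code from the question, the resolved host and port would then be passed to httpclient.getHostConfiguration().setProxy(host, port) in place of the hard-coded values.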

How to add timeout to a task in jBPM

I have a service task in jBPM. The process diagram is shown below.
I register the work item handler for the service task "Hello" as shown below using the default service task handler provided.
ksession.getWorkItemManager().registerWorkItemHandler("Service Task", new ServiceTaskHandler());
I am calling a Java function from the service task. The function calls an API, and it takes almost 10 minutes to get the response. But before the task is completed, I get the error below:
Jun 11, 2018 8:09:43 AM com.arjuna.ats.arjuna.coordinator.TransactionReaper check
WARN: ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a923832:e68d:5b1e2e09:19 in state RUN
Jun 11, 2018 8:09:43 AM com.arjuna.ats.arjuna.coordinator.BasicAction checkChildren
WARN: ARJUNA012095: Abort of action id 0:ffff0a923832:e68d:5b1e2e09:19 invoked while multiple threads active within it.
Jun 11, 2018 8:09:43 AM com.arjuna.ats.arjuna.coordinator.BasicAction checkChildren
WARN: ARJUNA012381: Action id 0:ffff0a923832:e68d:5b1e2e09:19 completed with multiple threads - thread main was in progress with java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.socketRead(Unknown Source)
java.net.SocketInputStream.read(Unknown Source)
java.net.SocketInputStream.read(Unknown Source)
org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
Jun 11, 2018 8:09:43 AM com.arjuna.ats.arjuna.coordinator.CheckedAction check
WARN: ARJUNA012108: CheckedAction::check - atomic action 0:ffff0a923832:e68d:5b1e2e09:19 aborting with 1 threads active!
WARNING: Unable to put resource app-updateable-resource value [] due to No transaction is running
Jun 11, 2018 8:10:18 AM org.drools.persistence.PersistableRunner rollbackTransaction
WARNING: Could not commit session
java.lang.IllegalStateException: Process instance 131[com.sample.bpmn.hello] is disconnected.
How can I solve this issue? Is it a timeout issue? If so, how can I increase the timeout?

Expectit doesn't find my expected string

I'm trying to use the expectit library with sshj like so:
final Session session = getSharedSession();
final Session.Command sessionCommand = session.exec(command);
try (Expect expect = new ExpectBuilder()
.withOutput(sessionCommand.getOutputStream())
.withInputs(sessionCommand.getInputStream(), sessionCommand.getErrorStream())
.withInputFilters(removeColors(), removeNonPrintable())
.withEchoInput(LoggingAppendableAdapter.getInstance())
.withEchoOutput(LoggingAppendableAdapter.getInstance())
.withExceptionOnFailure()
.build()) {
for (SshExpectations sshExpectation : sequence) {
expect.expect(contains(sshExpectation.getExpectation()));
expect.sendLine(sshExpectation.getReaction());
}
}
The command I'm executing is "sleep 5; rm -i test.txt", I get the following output:
Jul 26, 2017 6:07:14 PM net.sf.expectit.SingleInputExpect start
FINE: Starting expect thread: input=< ChannelInputStream for Channel #1 >, charset=UTF-8, echoInput=LoggingAppendableAdapter@7a1099d7, filter=net.sf.expectit.filter.Filters$3@3afe1f22, bufferSize=1024
Jul 26, 2017 6:07:14 PM net.sf.expectit.SingleInputExpect start
FINE: Starting expect thread: input=< ChannelInputStream for Channel #1 >, charset=UTF-8, echoInput=LoggingAppendableAdapter@7a1099d7, filter=net.sf.expectit.filter.Filters$3@3afe1f22, bufferSize=1024
Jul 26, 2017 6:07:14 PM net.sf.expectit.ExpectImpl expectIn
FINE: Expect matcher 'contains('remove regular empty')' with timeout 30000 (ms) in input #0
Jul 26, 2017 6:07:14 PM net.sf.expectit.SingleInputExpect expect
FINE: Initial matcher contains('remove regular empty') result: SimpleResult{succeeded=false, before='null', group='null', input='', canStopMatching=false}
2017-07-26 18:07:19,574 INFO [expect-pool-26-thread-2] LoggingAppendableAdapter rm: remove regular empty file ‘test’?
Jul 26, 2017 6:07:19 PM net.sf.expectit.InputStreamCopier call
FINE: Received from < ChannelInputStream for Channel #1 >: rm: remove regular empty file ‘test’?
Jul 26, 2017 6:07:44 PM net.sf.expectit.SingleInputExpect expect
FINE: Selector returns 0 key
Jul 26, 2017 6:07:44 PM net.sf.expectit.SingleInputExpect stop
FINE: Releasing resources for input: < ChannelInputStream for Channel #1 >
Jul 26, 2017 6:07:44 PM net.sf.expectit.SingleInputExpect stop
FINE: Releasing resources for input: < ChannelInputStream for Channel #1 >
2017-07-26 18:07:46,800 ERROR Could not execute command
net.sf.expectit.ExpectIOException: Expect operation fails (timeout: 30000 ms) for matcher: contains('remove regular empty')
at net.sf.expectit.ExpectImpl.expectIn(ExpectImpl.java:106)
at net.sf.expectit.AbstractExpectImpl.expectIn(AbstractExpectImpl.java:57)
at net.sf.expectit.AbstractExpectImpl.expect(AbstractExpectImpl.java:61)
One would think that everything should work fine as both the LoggingAppendableAdapter as well as the internal logging report the string "rm: remove regular empty file ‘test’?".
Any suggestions on what I might be doing wrong?
The issue is that rm -i test.txt writes its question to stderr instead of stdout, but expectit by default only checks the first input stream (which in my case was the stdout stream).
By using
expect.expectIn(1, contains(sshExpectation.getExpectation()));
expectit then expects the string to be on the second given input stream, which in my case is stderr.

Stopping threads in a multi threaded application

I have created a service using procrun which launches certain jars through reflection. When the service is started, it starts a thread, and the rest of the execution happens in that thread. Each plugin then starts its own threads and does its work in them.
During service stop, I have called the stop method of the plugins. Those methods have returned and whatever thread I have created has been terminated for the plugins. But even after that the following threads are still running.
INFO: Thread No:0 = Timer-0
Jan 13, 2016 10:49:58 AM com.test.desktop.SdkMain stop
INFO: Thread No:1 = WebSocketWorker-14
Jan 13, 2016 10:49:58 AM com.test.desktop.SdkMain stop
INFO: Thread No:2 = WebSocketWorker-15
Jan 13, 2016 10:49:58 AM com.test.desktop.SdkMain stop
INFO: Thread No:3 = WebSocketWorker-16
Jan 13, 2016 10:49:58 AM com.test.desktop.SdkMain stop
INFO: Thread No:4 = WebSocketWorker-17
Jan 13, 2016 10:49:58 AM com.kube.desktop.KubeSdkMain stop
INFO: Thread No:5 = WebsocketSelector18
Jan 13, 2016 10:49:58 AM com.test.desktop.SdkMain stop
INFO: Thread No:6 = AWT-EventQueue-0
Jan 13, 2016 10:49:58 AM com.test.desktop.SdkMain stop
INFO: Thread No:7 = DestroyJavaVM
Jan 13, 2016 10:49:58 AM com.test.desktop.SdkMain stop
INFO: Thread No:8 = Thread-11
Jan 13, 2016 10:49:58 AM com.test.desktop.SdkMain stop
The following is how I printed those threads.
ThreadGroup currentGroup = Thread.currentThread().getThreadGroup();
int noThreads = currentGroup.activeCount();
Thread[] lstThreads = new Thread[noThreads];
currentGroup.enumerate(lstThreads);
for (int i = 0; i < noThreads; i++)
LOGGER.log(Level.INFO, "Thread No:" + i + " = " + lstThreads[i].getName());
Because of these threads, when I stop the service, it takes forever and then times out. But when I call System.exit(0) the service stops quickly. What should I do to get rid of these threads? When I launch the jars through reflection, are there separate threads created for each plugin? If so could these be them? Please advise.
It looks like the plugins are themselves launching threads ("INFO: Thread No:1 = WebSocketWorker-14"; sockets are usually handled in separate threads), and those threads are not shut down when you kill the initiating thread. You'll have to make your plugins stop every thread they started when they are shut down, so that they leave nothing behind. As bayou.io put it well: "some plugins don't do a good job cleaning up whatever they created. it's just sloppy programming."
Calling System.exit() will just kill the process meaning it will kill all threads created by the process as well.
The other way would be to manually iterate over all running threads, check if it's the main thread, and if not proceed to kill it. You can get all running threads in an iterable set using
Set<Thread> threadSet = Thread.getAllStackTraces().keySet();
And you can get your currently running Thread using
Thread currentThread = Thread.currentThread();
Still, this is not how you would normally want to do it; it is a way to clean up after plugins that leave threads behind. The plugins themselves should shut down their threads when they are disabled, but if they don't, you can use the approach above to clean up manually.
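As a rough sketch of that manual cleanup: Java has no safe way to forcibly kill a thread (Thread.stop() is deprecated), so the example below interrupts threads instead, which only works if the target threads respond to interruption. To avoid touching JVM-internal threads it filters by a name prefix. The class name ThreadCleanup, the method interruptByPrefix and the prefix are assumptions for illustration, not something from the plugins in the question.

```java
import java.util.Set;

// Hypothetical cleanup helper for threads that plugins left behind.
class ThreadCleanup {

    /**
     * Interrupts every live thread, other than the calling thread, whose name
     * starts with the given prefix. Returns the number of threads interrupted.
     */
    static int interruptByPrefix(String namePrefix) {
        Thread currentThread = Thread.currentThread();
        Set<Thread> threadSet = Thread.getAllStackTraces().keySet();
        int interrupted = 0;
        for (Thread t : threadSet) {
            if (t != currentThread && t.isAlive() && t.getName().startsWith(namePrefix)) {
                t.interrupt(); // a request to stop, not a forced kill
                interrupted++;
            }
        }
        return interrupted;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a worker thread that a plugin forgot to stop.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // exit when interrupted
            }
        }, "WebSocketWorker-demo");
        worker.start();

        int count = interruptByPrefix("WebSocketWorker");
        worker.join(5_000);
        System.out.println("interrupted " + count + " thread(s), worker alive: " + worker.isAlive());
    }
}
```

A thread that is blocked in non-interruptible I/O or that swallows InterruptedException in a loop will not stop this way, which is one more reason the plugins should own their shutdown.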

Selenium occasional UnreachableBrowserException

I am trying to access several websites using Selenium in Java. Occasionally, I get an UnreachableBrowserException. I have read many threads about this error, but there seem to be many different causes. I get the error about 1% of the time when I attempt to access a new page, and I cannot find any similarities between occurrences. I am currently using Firefox, but I have also tried Internet Explorer and experienced similar errors. I only open one page at a time. I have tried both reusing the same window and completely quitting the driver before accessing another page, and the error still occurs either way. Note that I do not always get this error; sometimes my code runs without it occurring. Here is the error message:
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://127.0.0.1:7055: Permission denied: connect
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: Retrying request to {}->http://127.0.0.1:7055
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://127.0.0.1:7055: Permission denied: connect
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: Retrying request to {}->http://127.0.0.1:7055
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://127.0.0.1:7055: Permission denied: connect
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: Retrying request to {}->http://127.0.0.1:7055
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://127.0.0.1:7055: Permission denied: connect
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: Retrying request to {}->http://127.0.0.1:7055
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://127.0.0.1:7055: Permission denied: connect
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: Retrying request to {}->http://127.0.0.1:7055
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: I/O exception (java.net.SocketException) caught when processing request to {}->http://127.0.0.1:7055: Permission denied: connect
Jan 12, 2015 10:39:40 PM org.apache.http.impl.execchain.RetryExec execute
INFO: Retrying request to {}->http://127.0.0.1:7055
Exception in thread "main" org.openqa.selenium.remote.UnreachableBrowserException: Error communicating with the remote browser. It may have died.
Build info: version: '2.44.0', revision: '76d78cf', time: '2014-10-23 20:03:00'
System info: host: '****', ip: '**.*.*.*', os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.version: '1.7.0_60'
Driver info: driver.version: RemoteWebDriver
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:593)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:614)
at org.openqa.selenium.remote.RemoteWebDriver.quit(RemoteWebDriver.java:468)
at scrape.Scraper.killInstance(Scraper.java:162)
at scrape.Updater.main(Updater.java:93)
Caused by: java.net.SocketException: Permission denied: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:72)
at org.apache.http.impl.conn.HttpClientConnectionOperator.connect(HttpClientConnectionOperator.java:123)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:318)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:363)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:219)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:86)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
at org.openqa.selenium.remote.HttpCommandExecutor.fallBackExecute(HttpCommandExecutor.java:215)
at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:184)
at org.openqa.selenium.firefox.internal.NewProfileExtensionConnection.execute(NewProfileExtensionConnection.java:165)
at org.openqa.selenium.firefox.FirefoxDriver$LazyCommandExecutor.execute(FirefoxDriver.java:362)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:572)
... 4 more
How can I prevent this error or at least catch the error and deal with it effectively?
UnreachableBrowserExceptions can happen for multiple reasons - the most obvious is that the browser was closed, either in code or physically in the GUI, and then the code attempted to access it. Often, like in your case, they are caused by socket errors. This can mean, again, multiple things - your program tried to open too many sockets, it couldn't connect to a remote website, and others.
What I would suggest doing in a situation like this is waiting a short time, then retrying to see if the exception is still thrown. Sometimes these situations resolve themselves and your program can recover.
Here is some code to do that. It keeps retrying as long as the UnreachableBrowserException is thrown and the number of retries is below some limit that you set. If it hits the retry limit and the exception is still being thrown, it closes the browser and restarts it, resetting the retry count to 0. There is also a restart counter, to make sure that if for some reason restarting the browser doesn't help, you don't loop endlessly through running code -> exception -> wait -> retry -> hit retry limit, restart browser -> run code -> exception. Here, exceeding the restart limit (or successfully accessing the browser) will break out of the loop.
If you want more help, let me know. Hope this is helpful!
// imports needed at the top of the file:
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.concurrent.TimeUnit;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.remote.UnreachableBrowserException;

// errorfile is assumed to be a PrintWriter (or similar) opened elsewhere
WebDriver driver = new FirefoxDriver(); // or whatever you're using
boolean worked = false;
int numredos = 0;
final int REDO_LIMIT = 3; // or however many times you want to retry before giving up
final int RESTART_LIMIT = 3; // or however many times you want to restart the browser before terminating
int numrestarts = 0;
boolean restart = false;
do
{
    try
    {
        if (restart)
        {
            numrestarts++; // count the attempt even if starting the browser fails
            driver = new FirefoxDriver();
            restart = false; // restart once per request, not on every iteration
        }
        // RUN YOUR BROWSER CODE HERE
        worked = true;
    }
    // if the browser becomes unreachable (probably because of a socket issue),
    // write the error to the log and then sleep for 10 seconds
    // if we've already retried the set limit number of times, restart the browser and try again
    catch (UnreachableBrowserException ube)
    {
        worked = false;
        if (numredos >= REDO_LIMIT)
        {
            // try quitting. If it can't do it, it's already dead; just set it to null
            // (set it to null either way, just in case)
            try
            {
                driver.quit();
            }
            catch (Exception j)
            {
                errorfile.println(j);
            }
            driver = null;
            if (numrestarts < RESTART_LIMIT)
            {
                // log that you're restarting the driver (not coded here), then set the restart flag.
                // The browser will be restarted at the start of the next iteration.
                numredos = 0;
                restart = true;
            }
            else
            {
                // the browser has been restarted too many times: give up, leaving driver == null
                break;
            }
        }
        else
        {
            // print details of the exception to the error file
            errorfile.println("\n\n\n");
            // timestamp, and some exception details - you can decide which you want
            errorfile.println(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(Calendar.getInstance().getTime()));
            errorfile.println(ube.getClass());
            errorfile.println(ube.getMessage());
            errorfile.println("Cause: " + ube.getCause());
            errorfile.flush();
            // now sleep for some number of seconds - here 10
            try
            {
                TimeUnit.SECONDS.sleep(10);
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                System.out.println("waiting after socket crash interrupted");
            }
            numredos++;
        }
    }
} while (!worked && numredos <= REDO_LIMIT && numrestarts <= RESTART_LIMIT);
