How to handle akka AskTimeoutException when submiting flink job

How to handle akka AskTimeoutException when submiting flink job - java

Flink 1.5.3, When I submit flink job to flink cluster (on yarn), it always throw AskTimeoutException. In flink configuration file, I have configed the parmater "akka.ask.timeout=1000s" , but the Exception is still like this below.
That means I have increased the timeout parameter, "akka.ask.timeout=1000s" , but it does not work.
org.apache.flink.runtime.rest.handler.RestHandlerException: Job submission failed.
at org.apache.flink.runtime.rest.handler.job.JobSubmitHandler.lambda$handleRequest$2(JobSubmitHandler.java:116)
at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:770)
at akka.dispatch.OnComplete.internal(Future.scala:258)
at akka.dispatch.OnComplete.internal(Future.scala:256)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:186)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:183)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:83)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:603)
at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/dispatcher#-1851759541]] after [10000 ms]. Sender[null] sent message of type "org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899)
... 21 more
Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/dispatcher#-1851759541]] after [10000 ms]. Sender[null] sent message of type "org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)
... 9 more
So is there any solution to avoid this issue?

The timeouts of the communication between the REST handlers and the Flink cluster is controlled by web.timeout. The timeout is specified in milliseconds and, thus, you would need to set it to web.timeout: 1000000 in your flink-conf.yaml if you want to wait 1000s.
Moreover, it would be good to check the cluster entrypoint logs why the job submission takes so long. Usually it should not take longer than 10 seconds.

Related

What does Retry mean in context of Java Couchbase SDK?

I am using Java couchbase sdk in my application. While setting up the DefaultCouchbaseEnvironment, I came across the property RetryStrategy. Now I am using the default configuration for which the retry strategy is BestEffortRetryStrategy. According to documentation
BestEffortRetryStrategy will retry the operation until it either succeeds or the maximum request lifetime is reached
By default the maximum request lifetime is 75 seconds.
Now what i what i want to understand here is what does retry mean here. Does retry mean retrying the request whenever an exception occurs or does it mean it will retry to allocate this request to some node to process the request in case it can't and it will keep retrying for 75 seconds?
I am looking at my application logs for different exceptions to understand this and I could see that TemporaryFailureException wasn't retried and i could also see that in some instances RequestCancelledException was being thrown after 75 seconds. Is it fair to assume that couchbase retries a request to assign it to node to process it and not actually retries on any exception once it makes it to the node that will process this request?
StackTrace for TemporaryFailureException-
stackTrace: com.couchbase.client.java.error.TemporaryFailureException: null
at com.couchbase.client.java.bucket.api.Mutate$2$1.call(Mutate.java:246)
at com.couchbase.client.java.bucket.api.Mutate$2$1.call(Mutate.java:220)
at rx.internal.operators.OnSubscribeMap$MapSubscriber.onNext(OnSubscribeMap.java:69)
at rx.observers.Subscribers$5.onNext(Subscribers.java:235)
at rx.internal.operators.OnSubscribeDoOnEach$DoOnEachSubscriber.onNext(OnSubscribeDoOnEach.java:101)
at rx.internal.producers.SingleProducer.request(SingleProducer.java:65)
at rx.Subscriber.setProducer(Subscriber.java:211)
at rx.internal.operators.OnSubscribeMap$MapSubscriber.setProducer(OnSubscribeMap.java:102)
at rx.Subscriber.setProducer(Subscriber.java:205)
at rx.internal.operators.OnSubscribeMap$MapSubscriber.setProducer(OnSubscribeMap.java:102)
at rx.Subscriber.setProducer(Subscriber.java:205)
at rx.Subscriber.setProducer(Subscriber.java:205)
at rx.subjects.AsyncSubject.onCompleted(AsyncSubject.java:103)
at com.couchbase.client.core.endpoint.AbstractGenericHandler.completeResponse(AbstractGenericHandler.java:508)
at com.couchbase.client.core.endpoint.AbstractGenericHandler.access$000(AbstractGenericHandler.java:86)
at com.couchbase.client.core.endpoint.AbstractGenericHandler$1.call(AbstractGenericHandler.java:526)
at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java)
at java.lang.Thread.run(Thread.java:748)
Caused by: rx.exceptions.OnErrorThrowable$OnNextValue: OnError while emitting onNext value: com.couchbase.client.core.message.kv.UpsertResponse.class
at rx.exceptions.OnErrorThrowable.addValueAsLastCause(OnErrorThrowable.java:118)
at rx.internal.operators.OnSubscribeMap$MapSubscriber.onNext(OnSubscribeMap.java:73)
... 21 common frames omitted```

BestEffortRetryStrategy should retry until the the request is cancelled by the timeout.
FailFastRetryStrategy should not retry. It should fail immediately.
If you have a TemporaryFailureException and have BestEffortRetryStrategy, that should have been retried. If you had one that was not retried can you share the stacktrace?
Mike

How to block Java Thin client request till preloading of data in Ignite cluster is completed

We are running an ignite cluster with 3 nodes which pre-loads the data from 3rd party database (using custom cache store). when we try to connect to the cluster using java thin client and if the request reaches the cluster before data loading gets completed, we are getting unknown pair exception and some unstable behavior.
Is there anyway we can block the client request (TCP socket connection) till the data loading gets completed?
I tried with different life cycle events (NODE_START_COMPLETED) but no luck.
Stack trace
Caused by: org.apache.ignite.binary.BinaryInvalidTypeException: Unknown pair [platformId=0, typeId=-845247802]
at org.apache.ignite.internal.binary.BinaryContext.descriptorForTypeId(BinaryContext.java:707)
at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1757)
at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716)
at org.apache.ignite.internal.binary.BinaryObjectImpl.deserializeValue(BinaryObjectImpl.java:798)
at org.apache.ignite.internal.binary.BinaryObjectImpl.value(BinaryObjectImpl.java:143)
at org.apache.ignite.internal.processors.cache.CacheObjectUtils.unwrapBinary(CacheObjectUtils.java:177)
at org.apache.ignite.internal.processors.cache.CacheObjectUtils.unwrapBinaryIfNeeded(CacheObjectUtils.java:67)
at org.apache.ignite.internal.processors.cache.CacheObjectContext.unwrapBinaryIfNeeded(CacheObjectContext.java:125)
at org.apache.ignite.internal.processors.cache.GridCacheContext.unwrapBinaryIfNeeded(GridCacheContext.java:1773)
at org.apache.ignite.internal.processors.cache.GridCacheContext.unwrapBinaryIfNeeded(GridCacheContext.java:1761)
at org.apache.ignite.internal.processors.cache.store.GridCacheStoreManagerAdapter.put(GridCacheStoreManagerAdapter.java:573)
at org.apache.ignite.internal.processors.cache.store.GridCacheStoreManagerAdapter.putAll(GridCacheStoreManagerAdapter.java:627)
at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.batchStoreCommit(IgniteTxAdapter.java:1507)
at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:589)
at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.localFinish(GridNearTxLocal.java:3646)
at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxFinishFuture.doFinish(GridNearTxFinishFuture.java:475)
... 41 common frames omitted
Caused by: java.lang.ClassNotFoundException: Unknown pair [platformId=0, typeId=-845247802]
at org.apache.ignite.internal.MarshallerContextImpl.getClassName(MarshallerContextImpl.java:394)
at org.apache.ignite.internal.MarshallerContextImpl.getClass(MarshallerContextImpl.java:344)
at org.apache.ignite.internal.binary.BinaryContext.descriptorForTypeId(BinaryContext.java:698)
... 56 common frames omitted

There is no way to forbid thin clients to connect to a cluster using Ignite API at the moment. I created a JIRA ticket for this improvement: https://issues.apache.org/jira/browse/IGNITE-12237
The unknown pair exception doesn't seem to be caused by thin clients connecting at a wrong time though. Usually it's caused by a missing marshaller directory in the work path.

ERROR Error cleaning broadcast Exception [duplicate]

This question already has answers here:
What are possible reasons for receiving TimeoutException: Futures timed out after [n seconds] when working with Spark [duplicate]
(4 answers)
Closed 5 years ago.
I get the following error while running my spark streaming application, we have a large application running multiple stateful (with mapWithState) and stateless operations. It's getting difficult to isolate the error since spark itself hangs and the only error we see is in the spark log and not the application log itself.
The error happens only after abount 4-5 mins with a micro-batch interval of 10 seconds.
I am using Spark 1.6.1 on an ubuntu server with Kafka based input and output streams.
Please note it's not possible for me to provide the smallest possible code to re-create this bug as it does not occur in unit test-cases, and the application itself is very large
Any direction you can give to solve this issue will be helpful. Please let me know if I can provide any more information.
Error inline below:
[2017-07-11 16:15:15,338] ERROR Error cleaning broadcast 2211 (org.apache.spark.ContextCleaner)
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
at org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:136)
at org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:228)
at org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
at org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:77)
at org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:233)
at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:189)
at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:180)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:180)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1180)
at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:173)
at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:68)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)

Your exception message clearly says that its RPCTimeout due to default configuration of 120 seconds and adjust to optimal value as per your work load.
please see 1.6 configuration
your error messages org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds].
and
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76) confirms that.
For Better understanding please see the below code from
see RpcTimeout.scala
/**
* Wait for the completed result and return it. If the result is not available within this
* timeout, throw a [[RpcTimeoutException]] to indicate which configuration controls the timeout.
* #param awaitable the `Awaitable` to be awaited
* #throws RpcTimeoutException if after waiting for the specified time `awaitable`
* is still not ready
*/
def awaitResult[T](awaitable: Awaitable[T]): T = {
try {
Await.result(awaitable, duration)
} catch addMessageIfTimeout
}
}
Also see my answer in another context

GC over limit exceeded when using Spring integration executor channel

I have got the below exception , I suspect heap memory is full so GC exception was thrown . Kindly explain if any other perspective for the below application solution
2017:06:07 21:18:36.275 [loginputtaskexecutor-7] ERROR o.s.i.handler.LoggingHandler - org.springframework.messaging.MessageHandlingException: nested exception is java.lang.IllegalStateException: Cannot process message
at org.springframework.integration.handler.MethodInvokingMessageProcessor.processMessage(MethodInvokingMessageProcessor.java:96)
at org.springframework.integration.handler.ServiceActivatingHandler.handleRequestMessage(ServiceActivatingHandler.java:89)
at org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:109)
at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:127)
at org.springframework.integration.dispatcher.AbstractDispatcher.tryOptimizedDispatch(AbstractDispatcher.java:116)
at org.springframework.integration.dispatcher.UnicastingDispatcher.doDispatch(UnicastingDispatcher.java:148)
at org.springframework.integration.dispatcher.UnicastingDispatcher.access$000(UnicastingDispatcher.java:53)
at org.springframework.integration.dispatcher.UnicastingDispatcher$3.run(UnicastingDispatcher.java:129)
at org.springframework.integration.util.ErrorHandlingTaskExecutor$1.run(ErrorHandlingTaskExecutor.java:55)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Cannot process message
at org.springframework.integration.util.MessagingMethodInvokerHelper.processInternal(MessagingMethodInvokerHelper.java:333)
at org.springframework.integration.util.MessagingMethodInvokerHelper.process(MessagingMethodInvokerHelper.java:155)
at org.springframework.integration.handler.MethodInvokingMessageProcessor.processMessage(MethodInvokingMessageProcessor.java:93)
... 11 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
**Application flow in detail :**
Spring integration application build to listen message from ActiveMQ , after consuming message from ActiveMQ it will be handed over to input channel (Executor channel) which has subscriber as service activator . In Service Activator message is converted to json then stored to Cassandra . # transaction was mentioned on the Service activator method .
With the above solution , I thought of breaking Message transaction flow by implementing executor channel , after message consumed it will be handed over to executor channel and the transaction ends . after then threads in executor channel would take care of performing parallel write to cassandra.
Is there any better way to write as fast as possible for large volume of data to casandra using java spring integration

If the data sink can't keep up, add a limit to the queue size in the TaskExecutor and use a CallerRunsPolicy or CallerBlocksPolicy when the queue is full.
That will naturally throttle the workload at the rate the sink can deal with.

odd SQLException - Could not retrieve transation read-only status server

I have a Quartz Job that executes a Stored Procedure in my MySQL database once every 5 minutes, and for some reason, 1 out of 3 executions fails and gives this weird exception. I have searched and searched for what this exception means, but I could not find a solution. Here is the full stack trace:
java.sql.SQLException: Could not retrieve transation read-only status server
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1078)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:989)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:975)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:920)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:951)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:941)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3939)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3910)
at com.mysql.jdbc.PreparedStatement.checkReadOnlySafeStatement(PreparedStatement.java:1258)
at com.mysql.jdbc.CallableStatement.checkReadOnlySafeStatement(CallableStatement.java:2656)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1278)
at com.mysql.jdbc.CallableStatement.execute(CallableStatement.java:920)
at com.mchange.v2.c3p0.impl.NewProxyCallableStatement.execute(NewProxyCallableStatement.java:3044)
at org.deadmandungeons.website.tasks.RankUpdateTask.execute(RankUpdateTask.java:30)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 1,198,219 milliseconds ago. The last packet sent successfully to the server was 950,420 milliseconds ago.
at sun.reflect.GeneratedConstructorAccessor43.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1121)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3673)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3562)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4113)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2731)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2812)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2761)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1612)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3933)
... 9 more
Caused by: java.net.SocketException: Connection timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:114)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:161)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:189)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3116)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3573)
... 17 more
So I figured it is timing out because it thinks the MySQL server is in read-only status?
This only happens for this quartz job, and not any other time when I communicate with the database. This execution is of course happening in another thread, but I don't think that would have anything to do with it.
Why would it think the server was in read-only mode?
Also, I don't think "transation" is a word, so there's that...

Sorry for posting on old thread,
As stack trace says
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
This implies the link between JDBC and DB is broken.As per your observation you say 1 out of 3 job invocations fails.
You have these jobs scheduled every 5 minutes and as per trace the last successful message sent to server is ~15 minutes before.
Hence I suspect either
You are procedure is not returning (waiting on something)
The JDBC connection has been invalidated by the firewall/ proxy
It will interesting to see the how connections are managed, As per logs I see you are using c3p0.
You can try setting unreturnedConnectionTimeout and debugUnreturnedConnectionStackTraces. This will give you more insight into connection leaks or db calls which are taking long.

Research takes nowhere, as you guys said, but the error shows what seems to be a Database being populated by two applications at the same time.
Do you have admin privileges on this MySQL server? If you do, you should try setting
FLUSH TABLES WITH READ LOCK;
SET GLOBAL READ_ONLY=ON;
as a test to reproduce the error. Just to warn you, this command makes your database unwritable, so you will not be able to add data in it until you revert this configuration, obviously with
SET GLOBAL READ_ONLY=0;
UNLOCK TABLES;
If the result of this test is positive (same error had been reproduced), you should try isolating applications that are storing data on your database, to find out which one is conflicting with Quartz.
I'm sorry for being vague, but I hope it gives you some help...

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

How to handle akka AskTimeoutException when submiting flink job - java

Related

What does Retry mean in context of Java Couchbase SDK?

How to block Java Thin client request till preloading of data in Ignite cluster is completed

ERROR Error cleaning broadcast Exception [duplicate]

GC over limit exceeded when using Spring integration executor channel

odd SQLException - Could not retrieve transation read-only status server

Categories

Resources