DEADLOCK Situation - SQL/Hibernate/Java how to resolve it? - java

[12-Jun-2015 12:40:36][Timer-0] WARN com.mchange.v2.async.ThreadPoolAsynchronousRunner - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector#6a93dd24 -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
[12-Jun-2015 12:40:39][Timer-0] WARN com.mchange.v2.async.ThreadPoolAsynchronousRunner - com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector#6a93dd24 -- APPARENT DEADLOCK!!! Complete Status:
Managed Threads: 3
Active Threads: 3
Active Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#4da39158 (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0)
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#2d9b503a (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#2)
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#2fb0273a (com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1)
Pending Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#55df694b
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#3e79e6f8
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#352ac3d3
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#a9e6cea
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#27007c18
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#10d31fa9
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#14c398e0
com.mchange.v2.resourcepool.BasicResourcePool$1RefurbishCheckinResourceTask#3569a4c6
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#2ac0de8f
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#5c539436
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#73279494
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#2b8bbb9c
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#62ca1519
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#412f4efa
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask#8ea2085
Pool thread stack traces:
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#0,5,main]
com.mchange.v2.resourcepool.BasicResourcePool$AsyncTestIdleResourceTask.run(BasicResourcePool.java:2023)
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:547)
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#2,5,main]
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:560)
Thread[com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread-#1,5,main]
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:560)
[12-Jun-2015 12:40:45][scheduler-3] INFO com.ssl.htms.tasks.TaskCheckDatabaseConnection - Database connection pool error, count=1 : This connection has been closed.

This is a warning message and not an actual deadlock. Look at the following threads for more details:
C3P0 apparent deadlock when the threads are all empty?
and also
C3p0 APPARENT DEADLOCK exception

Related

Threads are increasing but not decreasing when the task finished in Java

I have an application that I did thread configuration as follows,
corepoolsize = 10
maxpoolsize = 10
queuecapacity = 10
But when I do use Threads and functions that run on threads (asynchronous process)
I see it is increasing like the follows, But I don't see the threads are decreasing. Is that a indication of memory leak? Can anybody guide me?
Log prints like this,
2021-09-15 01:13:48.554 INFO 111 --- [Async-1]
2021-09-15 01:13:48.654 INFO 121 --- [Async-2]
2021-09-15 01:13:48.754 INFO 132 --- [Async-3]
2021-09-15 01:13:48.854 INFO 140 --- [Async-4]
2021-09-15 01:13:48.954 INFO 155 --- [Async-5]
2021-09-15 01:13:49.554 INFO 160 --- [Async-6]
But I don't see it is calling again [Async-1], [Async-2] like that. Is this a memory leak or a thread cache or after filling up corepoolsize then will it run [Async-1]?
Corepoolsize represents the worker thread that is always alive.
You can check ThreadPoolExecutor#runWorker, ThreadPoolExecutor#getTask method, these worker threads will always try to get tasks from the task queue and execute them. So if no new tasks are posted to the thread pool, then these worker threads will always live, usually blocked in the take method of workQueue, So you don't have to worry about the cpu idling.
Worker threads outside the number of corepoolsize will be recycled, still ThreadPoolExecutor#getTask method:
try {
Runnable r = timed ?
workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
workQueue.take();
if (r != null)
return r;
timedOut = true;
} catch (InterruptedException retry) {
timedOut = false;
}
These threads to be recycled can survive at most keepAliveTime time.
The thread pool uses a Hash table to hold the thread's reference, so that the life cycle of the thread can be controlled by operations such as adding and removing references.(ThreadPoolExecutor#processWorkerExit method).
This is expected behaviour. The documentation for ThreadPoolExecutor explicitly states:
When a new task is submitted in method execute(Runnable), if fewer than corePoolSize threads are running, a new thread is created to handle the request, even if other worker threads are idle.

Mongo Connection issue with error "state should be: open"

I am running an event in Akka actor system, where we run multiple actors to query mongo db and retrieve data. Each actor queries for 1000 documents (each document's size is 9kb)
When running an event that is required to fire 14 actors to query for Mongo DB to retrieve 13000 documents.Once I experienced below exception, not sure why? Have anyone experienced this before?
2020-04-14 19:17:28,818 [erp-writer-actor-system-akka.actor.default-dispatcher-378] ERROR c.a.s.c.m.GlobalContextMongoClientService- 76cd7a80-83ef-4389-885a-be9caed77449 - Exception occured while reading data from cursor
java.lang.IllegalStateException: state should be: open
at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
at com.mongodb.connection.DefaultServer.getConnection(DefaultServer.java:84)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.getConnection(ClusterBinding.java:86)
at com.mongodb.operation.QueryBatchCursor.getMore(QueryBatchCursor.java:203)
at com.mongodb.operation.QueryBatchCursor.hasNext(QueryBatchCursor.java:103)
at com.mongodb.MongoBatchCursorAdapter.hasNext(MongoBatchCursorAdapter.java:46)
at com.xyz.smartconnect.commons.mongoclient.GlobalContextMongoClientService.findWorkers(GlobalContextMongoClientService.java:145)
at com.xyz.smartconnect.actors.QueryWorkersActor.lambda$createReceive$0(QueryWorkersActor.java:40)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
at akka.actor.Actor$class.aroundReceive(Actor.scala:513)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:132)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:519)
at akka.actor.ActorCell.invoke(ActorCell.scala:488)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
at akka.dispatch.Mailbox.run(Mailbox.scala:224)
at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Suppressed: java.lang. IllegalStateException: state should be: open
at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
at com.mongodb.connection.DefaultServer.getConnection(DefaultServer.java:84)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.getConnection(ClusterBinding.java:86)
at com.mongodb.operation.QueryBatchCursor.killCursor(QueryBatchCursor.java:261)
at com.mongodb.operation.QueryBatchCursor.close(QueryBatchCursor.java:147)
at com.mongodb.MongoBatchCursorAdapter.close(MongoBatchCursorAdapter.java:41)
at com.xyz.smartconnect.commons.mongoclient.GlobalContextMongoClientService.findWorkers(GlobalContextMongoClientService.java:149)
After running multiple tests and analyzing the logs carefully, I found the root cause. Below are the details.
While the application is using cursor to query data from mongoDb, connection has been released/closed. 'State should be : open' is complaining about a released connection.
In my case, my application experienced OutOfMemory, which caused disposing beans and releasing connections. Here is timeline of log events for this issue.
Since this is a memory issue for my case, fixing memory issue will fix below exception for me.
2020-04-19 12:57:32,981 [xyz-actor-system-akka.actor.default-dispatcher-72] ERROR a.a.ActorSystemImpl- - 413f9298-ca92-4744-913b-59934e4ce831 - exception on LARS’ timer thread
java.lang.OutOfMemoryError: GC overhead limit exceeded
at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:269)
at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:235)
at java.lang.Thread.run(Thread.java:748)
2020-04-19 12:57:43,649 [Thread-19] INFO o.s.c.s.DefaultLifecycleProcessor- - - Stopping beans in phase 2147483647
2020-04-19 12:58:13,483 [Thread-19] INFO o.s.j.e.a.AnnotationMBeanExporter- - - Unregistering JMX-exposed beans on shutdown
2020-04-19 12:58:45,186 [localhost-startStop-2] INFO c.a.s.ApplicationContextListener- - - >>>>>>>>> Disposing beans
2020-04-19 12:59:00,182 [localhost-startStop-2] INFO c.a.s.c.SpringBeanDisposer- - - Mongo connections are released.
2020-04-19 12:59:09,591 [xyz-actor-system-akka.actor.default-dispatcher-73] ERROR c.a.s.c.m.GlobalContextMongoClientService- - 413f9298-ca92-4744-913b-59934e4ce831 - Exception occured while reading data from cursor
java.lang.IllegalStateException: state should be: open
at com.mongodb.assertions.Assertions.isTrue(Assertions.java:70)
at com.mongodb.connection.DefaultServer.getDescription(DefaultServer.java:114)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.getServerDescription(ClusterBinding.java:81)
at com.mongodb.operation.QueryBatchCursor.initFromCommandResult(QueryBatchCursor.java:251)
at com.mongodb.operation.QueryBatchCursor.getMore(QueryBatchCursor.java:207)
at com.mongodb.operation.QueryBatchCursor.hasNext(QueryBatchCursor.java:103)
at com.mongodb.MongoBatchCursorAdapter.hasNext(MongoBatchCursorAdapter.java:46)

JTA Transaction Timeout Troubleshooting

Setup:
Oracle 12 DB
JBoss EAP7
Webservice running on JBoss, inserts into DB
Batchprogramm calling the webservice from multiple threads about 130.000 times in the span of an hour
The problem:
2018-04-26 18:20:44,675 +0200 [WARN ] [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffffac110923:-4c44ed1d:5ac9329e:6866ea in state RUN
2018-04-26 18:20:44,675 +0200 [WARN ] [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012095: Abort of action id 0:ffffac110923:-4c44ed1d:5ac9329e:6866ea invoked while multiple threads active within it.
2018-04-26 18:20:44,679 +0200 [WARN ] [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012381: Action id 0:ffffac110923:-4c44ed1d:5ac9329e:6866ea completed with multiple threads - thread default task-48 was in progress with xxx.BaseEntity.getNextValue(BaseEntity.java:28)
This happens routinely in the production environment under heavy load, not when processing fewer records and not in an identical test environment with the exact same load.
The last line shows that this transaction timeout (300s) occurs while fetching the next value from a sequence:
CREATE SEQUENCE "XXX_S" MINVALUE xxx MAXVALUE xxx INCREMENT BY 1 START WITH xxx CACHE 2 NOORDER NOCYCLE NOPARTITION ;
I know Oracle needs to lock/unlock the sequence in order to keep it consistent, so my parallel webservice calls must somehow run into a deadlock or massive contention, producing the timeout.
How do I find the root of this problem? Which parameters can I try to manipulate?
Issue is now resolved, though very unsatisfyingly. We removed parallelism.

AMQP Channel shutdown but Consumer not always restart

I have a frequent Channel shutdown: connection error issues (under 24.133.241:5671 thread, name is truncated) in RabbitMQ Java client (my producer and consumer are far apart). Most of the time consumer is automatically restarted as I have enabled heartbeat (15 seconds). However, there were some instances only Channel shutdown: connection error but no Consumer raised exception and no Restarting Consumer (under cTaskExecutor-4 thread).
My current workaround is to restart my application. Anyone can shed some light on this matter?
2017-03-20 12:42:38.856 ERROR 24245 --- [24.133.241:5671] o.s.a.r.c.CachingConnectionFactory
: Channel shutdown: connection error
2017-03-20 12:42:39.642 WARN 24245 --- [cTaskExecutor-4] o.s.a.r.l.SimpleMessageListenerCont
ainer : Consumer raised exception, processing can restart if the connection factory supports
it
...
2017-03-20 12:42:39.642 INFO 24245 --- [cTaskExecutor-4] o.s.a.r.l.SimpleMessageListenerCont
ainer : Restarting Consumer: tags=[{amq.ctag-4CqrRsUP8plDpLQdNcOjDw=21-05060179}], channel=Ca
ched Rabbit Channel: AMQChannel(amqp://21-05060179#10.24.133.241:5671/,1), conn: Proxy#7ec317
54 Shared Rabbit Connection: SimpleConnection#44bac9ec [delegate=amqp://21-05060179#10.24.133
.241:5671/], acknowledgeMode=NONE local queue size=0
Generally, this is due to the consumer thread being "stuck" in user code somewhere, so it can't react to the broken connection.
If you have network issues, perhaps it's stuck reading or writing to a socket; make sure you have timeouts set for any I/O operations.
Next time it happens take a thread dump to see what the consumer threads are doing.

EJB JPA CMT - Flush failure on large dataset

I've got a JBoss 6.3 EAP, JPA 2.0, EJB 3.1, CMT JTA web app. DB is MSSQL2008R2, using MS JDBC driver, and hibernate 4.2.14 under the hood.
I've got a method that looks kind of like this, to duplicate a million Prices entities:
public void doStuff(Date newDate)
{
List<Prices> prices = dao.getPrices(); //<< 1000000+ prices
for (Prices price : prices)
{
Prices copy = price.clone();
copy.setDate(newDate);
entityManager.persist(copy);
if (newDate.before(someDate))
{
price.setDate(someDate);
entityManager.merge(price);
}
}
}
I set the JBoss EJB coordinator timeout to an hour, to let it run. I increased heap size to -Xmx 3G after it ran out of memory the first time.
The code starts at 1:24am, it finishes at 1:36am, then at 2:24am, it fails with a transaction error, and rolls back. The stacktrace says its during the flush.
at org.hibernate.ejb.AbstractEntityManagerImpl$CallbackExceptionMapperImpl.mapManagedFlushFailure(AbstractEntityManagerImpl.java:1510) [hibernate-e
ntitymanager-4.2.14.SP1-redhat-1.jar:4.2.14.SP1-redhat-1]
I can see that if I break up the million into chunks of 10000 and flush after each, it doesn't even get near a million during the hour. So flushing is clearly an expensive task. But I suppose it starts flushing implicitly during JTA's post-intercept commit.
Should I just increase the timeout and try again? It is a DEV database being used by a few others, and my code seems to lock the prices table, making it unqueryable from MSSQL SMSS, so it's not something I want to let run indefinitely. But is this just an issue of needing more time?
Start of stacktrace:
02:24:45,157 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a14021f:3d218bb8:56009132:22 in state RUN
02:24:45,169 WARN [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012095: Abort of action id 0:ffff0a14021f:3d218bb8:56009132:22 invoked while multiple threads active within it.
02:24:45,169 WARN [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012108: CheckedAction::check - atomic action 0:ffff0a14021f:3d218bb8:56009132:22 aborting with 1 threads active!
02:24:45,667 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a14021f:3d218bb8:56009132:22 in state CANCEL
02:24:46,209 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a14021f:3d218bb8:56009132:22 in state CANCEL_INTERRUPTED
02:24:46,210 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012120: TransactionReaper::check worker Thread[Transaction Reaper Worker 0,5,main] not responding to interrupt when cancelling TX 0:ffff0a14021f:3d218bb8:56009132:22 -- worker marked as zombie and TX scheduled for mark-as-rollback
02:24:46,210 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012110: TransactionReaper::check successfuly marked TX 0:ffff0a14021f:3d218bb8:56009132:22 as rollback only
02:25:07,968 WARN [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (http-/0.0.0.0:8080-1) SQL Error: 0, SQLState: null
02:25:07,968 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (http-/0.0.0.0:8080-1) Transaction cannot proceed STATUS_ROLLEDBACK
02:25:08,085 WARN [com.arjuna.ats.arjuna] (http-/0.0.0.0:8080-1) ARJUNA012125: TwoPhaseCoordinator.beforeCompletion - failed for SynchronizationImple< 0:ffff0a14021f:3d218bb8:56009132:24, org.hibernate.engine.transaction.synchronization.internal.RegisteredSynchronization#2d633a18 >: javax.persistence.PersistenceException: org.hibernate.exception.GenericJDBCException: could not prepare statement
Well I rewrote it as SQL, and used 2 entityManager.createNativeQuery calls, instead of programmatic JPA, and it finished in 30 seconds or so.
So, the lesson is, don't bother with JPA for large data sets. Work out the solution in SQL, and then grab the straight JDBC connection to do it.

Categories