Transaction is alternating Timeouts - java

I am using jboss 5.1.x, EJB3.0
I have MDB which listens to JMS queue. when the MDB taking a message, it dispatch a msg via TCP to some modem.
sometimes that Modem doesnt response when the server is waiting for an answer:
byte[] byteData = receive(is);
coz I cant set timeout on InputStream.
so thanks to the EJB container the transaction timeout(which is there by default) rolling back the operation and then a retry executed again.
this mechanism by default works fine for me, the problem is:
Sometimes the transaction never timed out, and after long time I get the following
msg in the console:
15:18:22,578 WARN [arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.TransactionReaper_18] - TransactionReaper::check timeout for TX a6b2232:5f8:4d3591c6:76 in state RUN
15:18:22,578 WARN [arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.BasicAction_58] - Abort of action id a6b2232:5f8:4d3591c6:76 invoked while multiple threads active within it.
15:18:22,578 WARN [arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.CheckedAction_2] - CheckedAction::check - atomic action a6b2232:5f8:4d3591c6:76 aborting with 1 threads active!
15:18:22,578 WARN [arjLoggerI18N] [com.arjuna.ats.arjuna.coordinator.TransactionReaper_7] - TransactionReaper::doCancellations worker Thread[Thread-10,5,jboss] successfully canceled TX a6b2232:5f8:4d3591c6:76
Any idea what's wrong? and why sometimes it work and sometimes it doesnt?
thanks,
ray.

JBossAS that uses the Arjuna's Transaction Manager. In EJB3 interceptor chain would begin to unroll and eventually hit the transaction manager interceptors whose job it is to abort the transaction.
For MDB's you can annote it with #ActivationConfigProperty(propertyName="transactionTimeout" value="1500")
For other beans you can have #TransactionTimeout(1500) at class level or method level.
When the transaction manager detects that the transaction has timed out and then aborts it from within an asynchronous thread (different from the thread running in method), but it never sends an interrupt to the currently running thread.
Therefore resulting in : invoked while multiple threads active within it ... aborting with 1 threads active!
Edit :
//---
ThreadGroup root = Thread.currentThread().getThreadGroup().getParent();
while (root.getParent() != null)
root = root.getParent();
findAllThread(root,0);
//---
public static findAllThread(ThreadGroup threadGroup, int level){
int actCount = threadGroup.activeCount();
Thread[] threads = new Thread[actCount*2];
actCount = threadGroup.enumerate(threads, false);
for (int i=0; i<actCount; i++) {
Thread thread = threads[i];
thread.interrupt();
}
int groupCount = threadGroup.activeGroupCount();
ThreadGroup[] groups = new ThreadGroup[numGroups*2];
groupCount = threadGroup.enumerate(groups, false);
for (int i=0; i<groupCount; i++)
findAllThread(groups[i], level+1);
//---
It will list other active threads also like Reference Handler, Finalizer, Signal Dispatcher etc.

Related

Threads are increasing but not decreasing when the task finished in Java

I have an application that I did thread configuration as follows,
corepoolsize = 10
maxpoolsize = 10
queuecapacity = 10
But when I do use Threads and functions that run on threads (asynchronous process)
I see it is increasing like the follows, But I don't see the threads are decreasing. Is that a indication of memory leak? Can anybody guide me?
Log prints like this,
2021-09-15 01:13:48.554 INFO 111 --- [Async-1]
2021-09-15 01:13:48.654 INFO 121 --- [Async-2]
2021-09-15 01:13:48.754 INFO 132 --- [Async-3]
2021-09-15 01:13:48.854 INFO 140 --- [Async-4]
2021-09-15 01:13:48.954 INFO 155 --- [Async-5]
2021-09-15 01:13:49.554 INFO 160 --- [Async-6]
But I don't see it is calling again [Async-1], [Async-2] like that. Is this a memory leak or a thread cache or after filling up corepoolsize then will it run [Async-1]?
Corepoolsize represents the worker thread that is always alive.
You can check ThreadPoolExecutor#runWorker, ThreadPoolExecutor#getTask method, these worker threads will always try to get tasks from the task queue and execute them. So if no new tasks are posted to the thread pool, then these worker threads will always live, usually blocked in the take method of workQueue, So you don't have to worry about the cpu idling.
Worker threads outside the number of corepoolsize will be recycled, still ThreadPoolExecutor#getTask method:
try {
Runnable r = timed ?
workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
workQueue.take();
if (r != null)
return r;
timedOut = true;
} catch (InterruptedException retry) {
timedOut = false;
}
These threads to be recycled can survive at most keepAliveTime time.
The thread pool uses a Hash table to hold the thread's reference, so that the life cycle of the thread can be controlled by operations such as adding and removing references.(ThreadPoolExecutor#processWorkerExit method).
This is expected behaviour. The documentation for ThreadPoolExecutor explicitly states:
When a new task is submitted in method execute(Runnable), if fewer than corePoolSize threads are running, a new thread is created to handle the request, even if other worker threads are idle.

Socket read interrupted during callback from Docker Java

Context
I have a webservice writing a test id in a queue. Then, a listener reads the queue, searches the test and starts it. During those steps, it writes updates of the test in the database in order to be shown to the user. To be more precise: the test is launched in a docker container and at the end of it, I want to update the status of the test to FINISHED. For that, I use the docker java library with a callback.
Problem
When it comes to call the callback, I receive multiple error messages on the call to update the test (but it happens only once, if I try twice the second time works, but it still writes a lot of error messages from the transaction manager).
Here are the error messages logged:
2020-11-20 09:20:43,639 WARN [docker-java-stream--1032099154] (org.jboss.jca.core.connectionmanager.listener.TxConnectionListener) IJ000305: Connection error occured: org.jboss.jca.core.connectionmanager.listener.TxConnectionListener#600268e6[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection#236f1a69 connection handles=1 lastReturned=1605860423264 lastValidated=1605860242146 lastCheckedOut=1605860443564 trackByTx=true pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool#2efb0d3b mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool#3e8e7a62[pool=ApplicationDS] xaResource=LocalXAResourceImpl#482fdad2[connectionListener=600268e6 connectionManager=4c83f895 warned=false currentXid=null productName=Oracle productVersion=Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.3.0.0.0 jndiName=java:/ApplicationDS] txSync=TransactionSynchronization#1387480544{tx=Local transaction (delegate=TransactionImple < ac, BasicAction: 0:ffffac110002:-50a6b0bf:5fb6bdb9:73db4 status: ActionStatus.ABORTING >, owner=Local transaction context for provider JBoss JTA transaction provider) wasTrackByTx=true enlisted=true cancel=false}]: java.sql.SQLRecoverableException: IO Error: Socket read interrupted
2020-11-20 09:20:43,647 INFO [docker-java-stream--1032099154] (org.jboss.jca.core.connectionmanager.listener.TxConnectionListener) IJ000302: Unregistered handle that was not registered: org.jboss.jca.adapters.jdbc.jdk8.WrappedConnectionJDK8#4f3c1cb2 for managed connection: org.jboss.jca.adapters.jdbc.local.LocalManagedConnection#236f1a69
2020-11-20 09:20:43,656 WARN [docker-java-stream--1032099154] (com.arjuna.ats.jta) ARJUNA016031: XAOnePhaseResource.rollback for < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac110002:-50a6b0bf:5fb6bdb9:73db4, node_name=1, branch_uid=0:ffffac110002:-50a6b0bf:5fb6bdb9:73db8, subordinatenodename=null, eis_name=java:/ApplicationDS > failed with exception: org.jboss.jca.core.spi.transaction.local.LocalXAException: IJ001160: Could not rollback local transaction
Caused by: org.jboss.jca.core.spi.transaction.local.LocalResourceException: IO Error: Socket read interrupted
at org.jboss.ironjacamar.jdbcadapters#1.4.22.Final//org.jboss.jca.adapters.jdbc.local.LocalManagedConnection.rollback(LocalManagedConnection.java:139)
...
Caused by: java.sql.SQLRecoverableException: IO Error: Socket read interrupted
at com.oracle.jdbc//oracle.jdbc.driver.T4CConnection.doRollback(T4CConnection.java:1140)
...
Caused by: java.io.InterruptedIOException: Socket read interrupted
at com.oracle.jdbc//oracle.net.nt.TimeoutSocketChannel.handleInterrupt(TimeoutSocketChannel.java:258)
...
Explanation
At the beginning, I thought about a connection problem, maybe the the transaction is no more available at the callback time (because the docker run took too long), maybe it has to be invalidated.
But at the end, as written in the console, it came from an interruption of the thread when it tries to acquire the lock to update the test and I discovered where this interruption came from: I took a look at the method executeAndStream in DefaultInvocationBuilder from the docker java library and I discovered this:
Thread thread = new Thread(() -> {
Thread streamingThread = Thread.currentThread();
try (DockerHttpClient.Response response = execute(request)) {
callback.onStart(() -> {
streamingThread.interrupt();
response.close();
});
sourceConsumer.accept(response);
callback.onComplete();
} catch (Exception e) {
callback.onError(e);
}
}, "docker-java-stream-" + Objects.hashCode(request));
thread.setDaemon(true);
thread.start();
And here it is, the closable given to onStart interrupts the thread. After that, I discovered in the method onComplete from ResultCallbackTemplate (that I was extending for my callback) a close on that closable:
#Override
public void onComplete() {
try {
close();
} catch (IOException e) {
throw new RuntimeException(e);
}
}
Resolution
The problem finally came from the following code I wrote:
#Override
public void onComplete() {
super.onComplete();
updateTest(FINISHED);
}
I was calling the onComplete method from the parent without knowing what is does and as usual, first before doing anything else. To correct that, I only had to call the super method at the end:
#Override
public void onComplete() {
updateTest(FINISHED);
super.onComplete();
}

How to solve java.util.concurrent.RejectedExecutionException jboss?

Hitting this error in jboss server:
Caused by:
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask#542f7a70 rejected from java.util.concurrent.ThreadPoolExecutor#42XXXXXX[Running, pool size = 10, active threads = 10, queued tasks = 0, completed tasks = 64491]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
Is it because the active threads is limited to 10? How to increase it?
Don't display the shutdown method. For example, in Android, if you cancel AsyncTask in the Destory method, then no active thread in the thread pool will recycle itself.
When calling the thread pool, determine whether it has been shut down, and determine it by the API method isShutDown. Sample code:
OR
Maybe your threads is over maximumPoolSize? is possable!?

JTA Transaction Timeout Troubleshooting

Setup:
Oracle 12 DB
JBoss EAP7
Webservice running on JBoss, inserts into DB
Batchprogramm calling the webservice from multiple threads about 130.000 times in the span of an hour
The problem:
2018-04-26 18:20:44,675 +0200 [WARN ] [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffffac110923:-4c44ed1d:5ac9329e:6866ea in state RUN
2018-04-26 18:20:44,675 +0200 [WARN ] [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012095: Abort of action id 0:ffffac110923:-4c44ed1d:5ac9329e:6866ea invoked while multiple threads active within it.
2018-04-26 18:20:44,679 +0200 [WARN ] [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012381: Action id 0:ffffac110923:-4c44ed1d:5ac9329e:6866ea completed with multiple threads - thread default task-48 was in progress with xxx.BaseEntity.getNextValue(BaseEntity.java:28)
This happens routinely in the production environment under heavy load, not when processing fewer records and not in an identical test environment with the exact same load.
The last line shows that this transaction timeout (300s) occurs while fetching the next value from a sequence:
CREATE SEQUENCE "XXX_S" MINVALUE xxx MAXVALUE xxx INCREMENT BY 1 START WITH xxx CACHE 2 NOORDER NOCYCLE NOPARTITION ;
I know Oracle needs to lock/unlock the sequence in order to keep it consistent, so my parallel webservice calls must somehow run into a deadlock or massive contention, producing the timeout.
How do I find the root of this problem? Which parameters can I try to manipulate?
Issue is now resolved, though very unsatisfyingly. We removed parallelism.

Java catch/finally blocks skipped completely?

We have a ThreadPoolExecutor-back task execution module. Some tasks are apparently finished in the application codes but the worker code somehow skip all the try/catch/finally and just go pick up the next task, causing the previous one to miss important status reporting code.
Like in this example, 257832:5 was picked up by thread-8 but this thread eventually just start another task.
2012-07-11 15:53:39,389 INFO [pool-1-thread-6]: Task (258861:5) : Starting.
2012-07-11 15:53:39,389 INFO [pool-1-thread-6]: A task running ### logged by app logic
2012-07-11 15:54:18,186 INFO [pool-1-thread-6]: Task (258868:5) : Starting. ### pick up another suddenly!
2012-07-11 15:54:18,186 INFO [pool-1-thread-6]: A task running ### logged by app logic
2012-07-11 15:54:18,445 INFO [pool-1-thread-6]: Task (258868:5) : returned from Task.doWork.
2012-07-11 15:54:18,446 INFO [pool-1-thread-6]: Task (258868:5) : finalizing.
2012-07-11 15:54:18,487 INFO [pool-1-thread-6]: Task (258868:5) : notifying status result: 200.
2012-07-11 15:54:18,487 INFO [pool-1-thread-6]: Task (258868:5) : Finished
The Runnable for the ThreadPoolExecutor) looks fine. It looks like
public void run() {
log.info(String.format("Task (%s:%s) : Starting.", wfId, taskId));
try {
Object result = task.doWork(...); // call the application codes
log.info(String.format("Task (%s:%s) : returned from Task.doWork.", wfId, taskId));
status = "DONE";
} catch (Throwable t) {
log.error("Error executing task: " + taskId + " - " + className);
} finally {
log.info(String.format("Task (%s:%s) : finalizing.", wfId, taskId));
// notify status to a server
log.info(String.format("Task (%s:%s) : Finished", wfId, taskId));
}
}
// the task framework looks like
// Use a queue size larger than numThreads, since the ThreadPoolExecutor#getActiveCount() call only returns an approximate count
this.executor = new ThreadPoolExecutor(numThreads, numThreads, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(numThreads * 2));
executor.execute(new TaskRunnable(...));
Note:
we catch the ultimate Throwable but there is no exception logged between the two tasks
there is no sign that exit() is called or program restart
the first one skipped all the log lines, whether the one right after the call to the app code or the ones in the catch and finally blocks, and skip the status reporting codes
this only happens randomly with low probability; but due to the large number of tasks executed, it still causes us a lot of headache.
It happens as if the thread pool executor simply evicts the running runnable magically (else it would require an InterruptedException which would be caught as a Throwable; Java threads are not to be stopped non-cooperatively except during a shutdown/exit) and skip all the blocks. I checked the ThreeadPoolExecutor javadoc, nothing should cause such event to happen.
What could have happened?
One possible explanation for the log messages from the finally block no appearing is that an exception was thrown while the finally block was executing:
If log was null at that point, you would get an NPE. (If log is declared as final you can eliminate this as a possible cause.)
Depending on what the objects are, you could get an unchecked exception in the wfId or taskId objects' toString() methods.
The Logger object could be broken ...
It is also possible that you are looking at the wrong source code for some reason.
Yet another theoretical possibility is that the Android platform implements Thread.destroy() to actually do something, and something has called it on the thread. If this DEPRECATED method was implemented according to the original javadoc, you'd find that finally blocks were not executed. It might also be possible to do the equivalent of this from outside the JVM. But if this is what is going on, all bets are off!!

Categories