I have an application running on WebSphere Application Server 6.1.0.43, and I'm having slowdown issues when trying to invoke a remote service.
The slowdown is in the method findGroupAndGetConnection of the class OutboundConnectionCache.
According to the IBM APAR PK94494:
The delay occurs after the client-side JAX-RPC handler (if present) is invoked and before the actual SOAP message is sent to the provider.
Because the delay occurs in the IBM web services engine, this problem
can be difficult to detect.
A com.ibm.ws.webservices.engine.transport.*=all trace will show entries similar to these which repeat:
[8/19/09 18:08:29:658 GMT] 00000047 OutboundConne 1 Enter:
WSWS3595I: Current pool size: 25. Connections-in-use size: 0.
Configured pool size: 25
In addition, that same trace spec will show long delays in executing
the .findGroupAndGetConnection() method:
[8/19/09 18:08:03:428 GMT] 00000047 OutboundConne >
OutboundConnectionCache.findGroupAndGetConnection()
WAITING_THREADS_THRESHOLD is 5 Entry
[8/19/09 18:08:38:358 GMT] 00000047 OutboundConne <
OutboundConnectionCache.findGroupAndGetConnection() Exit
And they recommend the following:
- Reduce 'com.ibm.websphere.webservices.http.connectionPoolCleanUpTime' from the default of 180 to 120 seconds.
- Increase the max connections property 'com.ibm.websphere.webservices.http.maxConnection' from the default of 25 to 50. This will also require increasing the web container thread pool size to 100.
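For reference, these are typically applied as JVM custom properties; a sketch of the equivalent generic JVM arguments using the values from the APAR (whether those values suit a given workload is a separate question):

-Dcom.ibm.websphere.webservices.http.connectionPoolCleanUpTime=120
-Dcom.ibm.websphere.webservices.http.maxConnection=50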
Before changing the default properties, I decided to monitor the web container thread usage, and I noticed that the maximum thread pool size (50) is never reached, but the minimum pool size (10) is reached very often, forcing connections to be destroyed and recreated.
Could hitting the minimum pool size (and the resulting destroy/recreate churn) be causing this slowdown? Should I increase the minimum pool size? Or is my problem something other than the HTTP outbound connection pool?
I'm trying to use Jetty WebSockets and want to set up a user-configurable idle timeout for connected sessions. It seems everything I've tried results in a timeout at 60 seconds:
org.eclipse.jetty.websocket.api.CloseException: java.util.concurrent.TimeoutException: Idle timeout expired: 60002/60000 ms
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onReadTimeout(AbstractWebSocketConnection.java:567)
at org.eclipse.jetty.io.AbstractConnection.onFillInterestedFailed(AbstractConnection.java:172)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillInterestedFailed(AbstractWebSocketConnection.java:540)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.failed(AbstractConnection.java:317)
at org.eclipse.jetty.io.FillInterest.onFail(FillInterest.java:138)
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillableFail(SslConnection.java:579)
at org.eclipse.jetty.io.ssl.SslConnection.onFillInterestedFailed(SslConnection.java:407)
at org.eclipse.jetty.io.ssl.SslConnection$2.failed(SslConnection.java:167)
at org.eclipse.jetty.io.FillInterest.onFail(FillInterest.java:138)
at org.eclipse.jetty.io.AbstractEndPoint.onIdleExpired(AbstractEndPoint.java:407)
at org.eclipse.jetty.io.IdleTimeout.checkIdleTimeout(IdleTimeout.java:171)
at org.eclipse.jetty.io.IdleTimeout.idleCheck(IdleTimeout.java:113)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
Using Jetty 9.4.26, with @WebSocket-annotated POJOs for each WS session.
Setting a value in the annotation is not really workable, as the idle timeout needs to be configurable at runtime. However, even when set, it only makes a difference if set to something less than 60 seconds:
@WebSocket(maxIdleTime = 65000) // Idle timeout expired: 60002/60000 ms
@WebSocket(maxIdleTime = 50000) // Idle timeout expired: 50001/50000 ms
It's exactly the same if I manually set the timeout on the session passed into the @OnWebSocketConnect method:
session.setIdleTimeout(65000); // Idle timeout expired: 60001/60000 ms
session.setIdleTimeout(50000); // Idle timeout expired: 50001/50000 ms
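For context, a minimal sketch of the kind of annotated endpoint involved here (class name and timeout value are illustrative):

import org.eclipse.jetty.websocket.api.Session;
import org.eclipse.jetty.websocket.api.annotations.OnWebSocketConnect;
import org.eclipse.jetty.websocket.api.annotations.WebSocket;

@WebSocket
public class MySocket {
    @OnWebSocketConnect
    public void onConnect(Session session) {
        // Attempting to raise the idle timeout at runtime; still capped at 60s
        session.setIdleTimeout(65000);
    }
}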
I'm calling connect() on a WebSocketClient, which I instantiate as a singleton, passing the annotated POJO into connect():
// Shared HttpClient carrying the SSL context factory for wss:// connections
HttpClient jettyHttpClient = new HttpClient(getWebSocketSslContextFactory());
jettyHttpClient.start();
// Singleton WebSocketClient built on top of the shared HttpClient
WebSocketClient wsClient = new WebSocketClient(jettyHttpClient);
wsClient.start();
However the above timeout behavior is still identical if using either of these:
wsClient.getPolicy().setIdleTimeout(65000);
wsClient.setMaxIdleTimeout(65000);
jettyHttpClient.setIdleTimeout() seems to have no effect at all.
Adding a breakpoint in setIdleTimeout() of IdleTimeout is not making things much clearer; this breakpoint is hit 6 times on 4 different instances during a connection:
0 set in SocketChannelEndpoint (instance 1) via ClientSelectorManager.newEndPoint
300000 set in SocketChannelEndpoint (instance 2) via ServerConnector.newEndPoint
-1 set in SslConnection$DecryptedEndPoint (instance 1) via constructor
-1 set in SslConnection$DecryptedEndPoint (instance 2) via setIdleTimeout
60000 set in SocketChannelEndpoint (instance 2) via WebSocketServerFactory.upgrade
65000 set in SocketChannelEndpoint (instance 1) via WebSocketUpgradeRequest.upgrade
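Since several of those hits come from the server side (ServerConnector, WebSocketServerFactory.upgrade), one possibility is that the server's own policy is what wins at 60000 ms, in which case the server-side timeout would need raising too. A minimal sketch of how that is typically done in Jetty 9.4, assuming a WebSocketServlet-based endpoint on the server (value and class names illustrative):

import org.eclipse.jetty.websocket.servlet.WebSocketServlet;
import org.eclipse.jetty.websocket.servlet.WebSocketServletFactory;

public class MyWebSocketServlet extends WebSocketServlet {
    @Override
    public void configure(WebSocketServletFactory factory) {
        // Server-side policy: the effective timeout is whichever side's is lower
        factory.getPolicy().setIdleTimeout(65000);
        factory.register(MySocket.class);
    }
}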
I'm basically lost as to what is going on internally. There are 5 ways of setting the timeout, and nothing seems to work above 60 seconds.
This is a similar request to the thread below, but I didn't have enough rep to comment there, hence the new thread:
Jetty Websocket IdleTimeout
I am developing an application using the Play Framework (version 2.8.0) and Java (version 1.8) with an Oracle database (version 12c).
The database gets at most one hit per day, and I am getting the error below:
java.sql.SQLRecoverableException: IO Error: Socket read timed out
at oracle.jdbc.driver.T4CConnection.logoff(T4CConnection.java:919)
at oracle.jdbc.driver.PhysicalConnection.close(PhysicalConnection.java:2005)
at com.zaxxer.hikari.pool.PoolBase.quietlyCloseConnection(PoolBase.java:138)
at com.zaxxer.hikari.pool.HikariPool.lambda$closeConnection$1(HikariPool.java:447)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.SocketTimeoutException: Socket read timed out
at oracle.net.nt.TimeoutSocketChannel.read(TimeoutSocketChannel.java:174)
at oracle.net.ns.NIOHeader.readHeaderBuffer(NIOHeader.java:82)
at oracle.net.ns.NIOPacket.readFromSocketChannel(NIOPacket.java:139)
at oracle.net.ns.NIOPacket.readFromSocketChannel(NIOPacket.java:101)
at oracle.net.ns.NIONSDataChannel.readDataFromSocketChannel(NIONSDataChannel.java:80)
at oracle.jdbc.driver.T4CMAREngineNIO.prepareForReading(T4CMAREngineNIO.java:98)
at oracle.jdbc.driver.T4CMAREngineNIO.unmarshalUB1(T4CMAREngineNIO.java:534)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:485)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:252)
at oracle.jdbc.driver.T4C7Ocommoncall.doOLOGOFF(T4C7Ocommoncall.java:62)
at oracle.jdbc.driver.T4CConnection.logoff(T4CConnection.java:908)
... 6 common frames omitted
db {
  default {
    driver = oracle.jdbc.OracleDriver
    url = "jdbc:oracle:thin:@XXX.XXX.XXX.XX:XXXX/XXXXXXX"
    username = "XXXXXXXXX"
    password = "XXXXXXXXX"
    hikaricp {
      dataSource {
        cachePrepStmts = true
        prepStmtCacheSize = 250
        prepStmtCacheSqlLimit = 2048
      }
    }
  }
}
It seems this is caused by an inactive database connection. How can I solve it?
Please let me know if any other information is required.
You can enable TCP keepalive for JDBC, either by setting the driver's keepalive directive or by adding "ENABLE=BROKEN" to the connection string.
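For example, a long-form descriptor URL carrying that directive might look like this (host, port, and service name are placeholders):

url = "jdbc:oracle:thin:@(DESCRIPTION=(ENABLE=BROKEN)(ADDRESS=(PROTOCOL=TCP)(HOST=dbhost)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=mysvc)))"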
Usually Cisco/Juniper equipment cuts off a TCP connection when it has been inactive for more than one hour, while the Linux kernel starts sending keepalive probes only after two hours (tcp_keepalive_time). So if you decide to turn TCP keepalive on, you will also need root access to change this kernel tunable to a lower value (10-15 minutes).
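For example, on Linux the tunable can be lowered like this (the 600-second value is illustrative):

sysctl -w net.ipv4.tcp_keepalive_time=600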
Moreover, HikariCP should not keep any connection open for longer than 30 minutes, by default.
So if your firewall, Linux kernel, and HikariCP all use default settings, then this error should not occur in your system.
See the official HikariCP documentation:
maxLifetime: This property controls the maximum lifetime of a connection in the pool. An in-use connection will never be retired, only when it is closed will it then be removed. On a connection-by-connection basis, minor negative attenuation is applied to avoid mass-extinction in the pool. We strongly recommend setting this value, and it should be several seconds shorter than any database or infrastructure imposed connection time limit. A value of 0 indicates no maximum lifetime (infinite lifetime), subject of course to the idleTimeout setting. The minimum allowed value is 30000ms (30 seconds). Default: 1800000 (30 minutes)
I have added the configuration below for HikariCP in the configuration file, and it is working fine.
## Database Connection Pool
play.db.pool = hikaricp
play.db.prototype.hikaricp.connectionTimeout=120000
play.db.prototype.hikaricp.idleTimeout=15000
play.db.prototype.hikaricp.leakDetectionThreshold=120000
play.db.prototype.hikaricp.validationTimeout=10000
play.db.prototype.hikaricp.maxLifetime=120000
I am experiencing an embedded Infinispan cache issue where nodes time out when re-joining the cluster.
Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 7 from vvshost
at org.infinispan.remoting.transport.impl.SingleTargetRequest.onTimeout(SingleTargetRequest.java:64)
at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:86)
at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:21)
The only way I can get the node to re-join is to switch off the cache and delete all local cache persistence files.
Here is the configuration which I am using:
Transport:
- TransportConfigurationBuilder - defaultClusteredBuild
- JMX Statistics - Enabled
- Duplicate domains - Allowed
Cache Manager:
- Manager Class - EmbeddedCacheManager
- Memory - Memory Size: 0
- Persistence: Single File Store (async: disabled)
- Clustering Cache Mode - CacheMode.DIST_SYNC
The configuration seems right to me, but the value of remote-timeout is 15000 milliseconds by default. Increase the timeout until you stop getting the error.
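A minimal sketch of raising it programmatically with the embedded ConfigurationBuilder API (the 30000 ms value is illustrative; tune it to your environment):

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

Configuration cfg = new ConfigurationBuilder()
    .clustering()
        .cacheMode(CacheMode.DIST_SYNC)
        .remoteTimeout(30000) // default is 15000 ms
    .build();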
Hope it helps
StackOverflowError while using the H2 database in a multi-threaded environment
Our application has a service layer querying an H2 database and retrieving the result set.
The service layer connects to the H2 database using the open-source clustering middleware "Sequoia" (which offers load balancing and transparent failover, and also manages database connections):
https://sourceforge.net/projects/sequoiadb/
Our service layer has 50 service methods, and we have exposed them as EJBs. When invoking the EJBs, we get the response from the service (which includes an H2 read) with an average response time of 0.2 seconds.
The DAO layer queries the database using Hibernate Criteria, and we also use the JPA 2.0 entity manager to manage the datasource.
For load testing, we created a test class (with a main method) that invokes all 50 EJB methods. 50 threads were created, and all of them invoked the test class. The execution was OK for the first run: all 50 threads successfully completed invoking the 50 EJB methods.
When we triggered the test class again, we encountered a StackOverflowError. The detailed stack trace is shown below:
org.h2.jdbc.JdbcSQLException: General error: "java.lang.StackOverflowError" [50000-176]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:344)
at org.h2.message.DbException.get(DbException.java:167)
at org.h2.message.DbException.convert(DbException.java:290)
at org.h2.server.TcpServerThread.sendError(TcpServerThread.java:222)
at org.h2.server.TcpServerThread.run(TcpServerThread.java:155)
at java.lang.Thread.run(Thread.java:784)
Caused by: java.lang.StackOverflowError
at java.lang.Character.digit(Character.java:4505)
at java.lang.Integer.parseInt(Integer.java:458)
at java.lang.Integer.parseInt(Integer.java:510)
at java.text.MessageFormat.makeFormat(MessageFormat.java:1348)
at java.text.MessageFormat.applyPattern(MessageFormat.java:469)
at java.text.MessageFormat.<init>(MessageFormat.java:361)
at java.text.MessageFormat.format(MessageFormat.java:822)
at org.h2.message.DbException.translate(DbException.java:92)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:343)
at org.h2.message.DbException.get(DbException.java:167)
at org.h2.message.DbException.convert(DbException.java:290)
at org.h2.command.Command.executeUpdate(Command.java:262)
at org.h2.jdbc.JdbcPreparedStatement.execute(JdbcPreparedStatement.java:199)
at org.h2.server.TcpServer.addConnection(TcpServer.java:140)
at org.h2.server.TcpServerThread.run(TcpServerThread.java:152)
... 1 more
at org.h2.engine.SessionRemote.done(SessionRemote.java:606)
at org.h2.engine.SessionRemote.initTransfer(SessionRemote.java:129)
at org.h2.engine.SessionRemote.connectServer(SessionRemote.java:430)
at org.h2.engine.SessionRemote.connectEmbeddedOrServer(SessionRemote.java:311)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:107)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:91)
at org.h2.Driver.connect(Driver.java:74)
at org.continuent.sequoia.controller.connection.DriverManager.getConnectionForDriver(DriverManager.java:266)
We then even added a random thread sleep (10-25 seconds) between EJB invocations. The execution was successful three times (all 50 EJB invocations), and when we triggered it a fourth time, it failed with the above error.
We see the above failure even with a thread count of 25.
The failure is random, and there doesn't seem to be a pattern. Kindly let us know if we have missed any configuration.
Please let me know if you need any additional information. Thanks in advance for any help.
Technology stack:
1) Java 1.6
2) h2-1.3.176
3) Sequoia middleware that manages DB connection open and close
- Variable connection pool manager
- Initial pool size 250
Thanks Lance Java for your suggestions. Increasing the stack size didn't help in our scenario (the additional stack helped only for a few more executions), for the following reasons.
In our app, we are using the JPA entity manager, and the transaction attribute was not set. Hence each query to the database created a thread to carry out the execution. Observing the DB threads in JVisualVM, the number of live threads was equal to the total number of threads started.
Eventually our app created more than 30K threads, which resulted in the StackOverflowError.
Upon setting the transaction attribute, the threads are killed after DB execution, and all the transactions are then managed by only 25-30 threads.
The Issue is resolved now .
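For reference, a minimal sketch of declaring the transaction attribute on an EJB business method (class and method names are hypothetical):

import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;

@Stateless
public class ReportService {

    // Container-managed transaction: DB work runs inside the caller's
    // transaction rather than spawning a new thread per query
    @TransactionAttribute(TransactionAttributeType.REQUIRED)
    public void runReport(long reportId) {
        // entity manager work here
    }
}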
There are two main causes of a stack overflow error:
1) A bug containing a non-terminating recursive call
2) The allocated stack size for the JVM isn't big enough
Looking at your stack trace it doesn't look recursive so I'm guessing you are running out of space. Have you set the -Xss flag for your JVM? You might need to increase this value.
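For example (the 2 MB stack size and jar name are placeholders):

java -Xss2m -jar your-app.jar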
After running my application, I am getting this error after around 5 minutes.
Even though I am returning the resource after use, I keep getting this.
I have built jedis-2.2.2-SNAPSHOT.jar from the Jedis code base, since it's not released yet.
I had set minIdle = 100, maxIdle = 200, and maxActive = 200. At the time of this exception, the connection count to Redis from my application was 122.
redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
at redis.clients.util.Pool.getResource(Pool.java:42)
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:442)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:360)
at redis.clients.util.Pool.getResource(Pool.java:40)
... 6 more
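For context, a minimal sketch of the pool setup and return-after-use pattern described above (host, port, and key are placeholders; with commons-pool2, "maxActive" corresponds to setMaxTotal):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

JedisPoolConfig poolConfig = new JedisPoolConfig();
poolConfig.setMinIdle(100);
poolConfig.setMaxIdle(200);
poolConfig.setMaxTotal(200); // "maxActive" in older commons-pool terms

JedisPool pool = new JedisPool(poolConfig, "localhost", 6379);
Jedis jedis = pool.getResource();
try {
    jedis.set("key", "value");
} finally {
    pool.returnResource(jedis); // return the resource after use
}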
Did you check that Redis is still up and running?
If not, investigate why it died.
Try redis-cli in a terminal if you can; the INFO command would give you more details.
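For example (host and port are placeholders):

redis-cli -h localhost -p 6379 ping
redis-cli -h localhost -p 6379 info clients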