I have a question about the scenario where a Java application connects via the Microsoft JDBC Driver 4.0 to SQL Server 2014 with AlwaysOn Availability Groups set up for high availability.
With this setup we connect to an availability group listener (specified in the connection string instead of any particular instance), so that database failover is handled gracefully: if the current primary in the AG cluster goes down, the listener connects to the next available instance behind the scenes.
My questions are:
In the data source configured on the Java EE application server side (we use WebSphere), what happens to the connections already pooled by the data source?
When a database goes down, the AG listener will reconnect to the next available instance on the database side. But will it also, through the JDBC driver, send an event to the data source on the application server so that the already-pooled connections are discarded and new ones created? That way transactions on the application side would not fail (beyond a short window until failover completes and new connections are created). Or does the Java application only find out after requesting a connection from the data source?
WebSphere Application Server is able to cope with bad connections and removes them from the pool. Exactly when this happens depends on some configurable options and on how fully the Microsoft JDBC driver takes advantage of the javax.sql.ConnectionEventListener API to send notifications to the application server (a sketch of that API follows the list below).
In the ideal case, where a JDBC driver sends the connectionErrorOccurred event immediately for all connections, WebSphere Application Server responds by removing all of these connections from the pool and by marking any connection that is currently in use as bad, so that it does not get returned to the pool once the application closes the handle. Lacking this, WebSphere Application Server discovers the first bad connection upon next use by the application, either via a connectionErrorOccurred event sent by the JDBC driver at that time or, lacking that, by inspecting the SQLState/error code of an exception for known indicators of bad connections. WebSphere Application Server then purges bad connections from the pool according to the configured Purge Policy. There are 3 options:
Purge Policy of Entire Pool - all connections are removed from the pool, and in-use connections are marked as bad so that they are not pooled.
Purge Policy of Failing Connection Only - only the specific connection upon which the error actually occurred is removed from the pool or marked as bad and not returned to the pool.
Purge Policy of Validate All Connections - all connections are tested for validity (Connection.isValid API); connections found to be bad are removed from the pool or marked as bad and not returned to the pool, while connections found to be valid remain in the pool and continue to be used.
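For reference, the ConnectionEventListener mechanism mentioned above looks roughly like this on the pooling side. This is a hedged sketch against the standard javax.sql API, not WebSphere internals:

import java.sql.SQLException;
import javax.sql.ConnectionEvent;
import javax.sql.ConnectionEventListener;
import javax.sql.ConnectionPoolDataSource;
import javax.sql.PooledConnection;

// The pooling layer registers a listener on each PooledConnection;
// a cooperative JDBC driver then fires connectionErrorOccurred when it
// detects a fatal connection error.
static PooledConnection open(ConnectionPoolDataSource cpds) throws SQLException {
    PooledConnection pc = cpds.getPooledConnection();
    pc.addConnectionEventListener(new ConnectionEventListener() {
        @Override
        public void connectionClosed(ConnectionEvent event) {
            // application closed its handle: return the physical connection
            // to the pool for reuse
        }

        @Override
        public void connectionErrorOccurred(ConnectionEvent event) {
            // mark the connection bad and apply the configured Purge Policy
        }
    });
    return pc;
}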
I'm not sure from your description if you are using WebSphere Application Server traditional or Liberty. If traditional, there is an additional option for pre-testing connections as they are handed out of the pool, but be aware that turning this on can have performance implications.
That said, the one thing to be aware of is that regardless of any of the above, your application will always need to be capable of handling the possibility of errors due to bad connections (even if the connection pool is cleared, connections can go bad while in use) and respond by requesting a new connection and retrying the operation in a new transaction.
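For example, the retry could look roughly like this. dataSource, doWork and isConnectionError are illustrative placeholders, not WebSphere APIs:

import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Borrow a connection, do the work in one transaction, and on a
// bad-connection error discard the handle and retry on a fresh connection.
void runWithRetry(DataSource dataSource) throws SQLException {
    for (int attempt = 1; ; attempt++) {
        try (Connection con = dataSource.getConnection()) {
            con.setAutoCommit(false);
            doWork(con);     // the application's JDBC operations (placeholder)
            con.commit();    // a new transaction per attempt
            return;
        } catch (SQLException e) {
            if (attempt == 3 || !isConnectionError(e)) {  // placeholder check
                throw e;     // unrelated error, or too many tries
            }
            // else loop: the pool purges bad connections per its Purge Policy,
            // so the next borrow should hand out a fresh connection
        }
    }
}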
Version 4.0 of the Microsoft SQL Server JDBC driver is old and knows nothing about the AlwaysOn feature.
Any data source connection pool can be configured to check the status of a connection from the pool before doling it out to the client; if the connection cannot be used, the pool creates a new one. That is true of all vendors and versions, and I believe it is the best you can do.
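To illustrate the mechanism only (real pools do this internally when you enable validation), a validate-on-borrow check amounts to something like the following. Connection.isValid is standard JDBC 4.0, while SimplePool and its methods are hypothetical stand-ins for whatever pool is in use:

import java.sql.Connection;
import java.sql.SQLException;

// Test the connection before handing it out; replace it if the test fails.
static Connection borrowValidated(SimplePool pool) throws SQLException {
    Connection con = pool.borrow();
    if (!con.isValid(5)) {        // liveness test with a 5-second timeout
        pool.discard(con);        // drop the dead connection
        con = pool.borrow();      // borrow again; the pool creates a new one
    }
    return con;
}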
Related
We are using H2, started as a database server process and listening on the standard TCP/IP port 9092.
Our application is deployed in Tomcat and uses connection pooling. We purge the pool during idle time, which ultimately closes all connections to H2. From time to time we then observe errors when the application tries to open a connection to H2 again:
SCHEDULERSERVICE schedule: Exception: Database may be already in use: "Waited for database closing longer than 1 minute". Possible solutions: close all other connection(s); use the server mode [90020-199]
org.h2.jdbc.JdbcSQLNonTransientConnectionException: Database may be already in use: "Waited for database closing longer than 1 minute". Possible solutions: close all other connection(s); use the server mode [90020-199]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:617)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:427)
at org.h2.message.DbException.get(DbException.java:205)
at org.h2.message.DbException.get(DbException.java:181)
at org.h2.engine.Engine.openSession(Engine.java:209)
at org.h2.engine.Engine.createSessionAndValidate(Engine.java:178)
at org.h2.engine.Engine.createSession(Engine.java:161)
at org.h2.server.TcpServerThread.run(TcpServerThread.java:160)
at java.lang.Thread.run(Thread.java:748)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:617)
at org.h2.engine.SessionRemote.done(SessionRemote.java:607)
at org.h2.engine.SessionRemote.initTransfer(SessionRemote.java:143)
at org.h2.engine.SessionRemote.connectServer(SessionRemote.java:431)
at org.h2.engine.SessionRemote.connectEmbeddedOrServer(SessionRemote.java:317)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:169)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:148)
at org.h2.Driver.connect(Driver.java:69)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
The problem occurs when the Tomcat connection pool closes all idle (unused) connections and a connection still in use is closed afterwards.
The next attempt to open a new connection fails; a retry succeeds after some wait time.
Under which circumstances does this exception happen?
What does the exception mean?
Are there any recommendations to follow to avoid the problem?
It sounds to me like H2 closes the database after the last connection has been closed.
When does the database close occur?
How can database closures be controlled?
Thx in advance
Thorsten
An embedded database in a web application needs careful handling of its lifecycle.
You can add a javax.servlet.ServletContextListener implementation (marked with the @WebListener annotation or registered in web.xml) and perform an explicit database shutdown in its contextDestroyed() method.
You can force database shutdown there with connection.createStatement().execute("SHUTDOWN"). If your application needs to write something to the database during unload, it should do so before that command.
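A minimal sketch of such a listener, assuming the javax.servlet API. DataSourceHolder is a hypothetical helper standing in for however your application obtains its connection pool (JNDI, dependency injection, ...):

import java.sql.Connection;
import java.sql.SQLException;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.annotation.WebListener;
import javax.sql.DataSource;

@WebListener
public class H2ShutdownListener implements ServletContextListener {

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        // nothing to do on startup
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        DataSource ds = DataSourceHolder.get();  // hypothetical lookup
        try (Connection con = ds.getConnection()) {
            con.createStatement().execute("SHUTDOWN");  // close the database
        } catch (SQLException e) {
            sce.getServletContext().log("H2 shutdown failed", e);
        }
    }
}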
Without an explicit shutdown, H2 closes the database when all connections are closed, unless some other behavior was configured explicitly (with parameters in the JDBC URL, for example). DB_CLOSE_DELAY, for instance, sets an additional delay: maybe your application uses that setting and therefore H2 doesn't close the database immediately, or the application doesn't close all connections immediately.
Anyway, when you update the web application on the fly, Tomcat initializes the new version before the old version is unloaded. If H2 is on the classpath of the web application itself, the new version will be unable to connect to the database during the short period when the new version is already online but the old version isn't unloaded yet.
If you don't like that, you can run a standalone H2 server process and use remote connections to it from your web applications.
Another option is to move H2 to the classpath of Tomcat itself and configure the connection pool as a resource in server.xml; in that case it shouldn't be affected by the lifecycle of your applications.
In both these cases you shouldn't use the SHUTDOWN command.
UPDATED
With client-server connections to a remote server, this exception means that the server decided to close the database because there were no active connections. This operation can't be interrupted or reverted in the middle. An attempt to open a new connection to the same database during this process waits at most 1 minute for the close to complete before re-opening the database. This timeout is not configurable.
There are two possible solutions.
The DB_CLOSE_DELAY setting can be used with some large value in seconds: when all connections are closed, the database stays online for the specified number of seconds. -1 can also be used to set an infinite timeout.
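For example (host, port and database name are placeholders; the same setting can also be issued per database with SET DB_CLOSE_DELAY -1):

import java.sql.Connection;
import java.sql.DriverManager;

// DB_CLOSE_DELAY=-1 keeps the database open until the server is shut down;
// a positive value keeps it open for that many seconds after the last
// connection closes.
Connection con = DriverManager.getConnection(
        "jdbc:h2:tcp://localhost:9092/~/mydb;DB_CLOSE_DELAY=-1", "sa", "");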
You can try to speed up the shutdown process, but you will have to figure out yourself what takes so much time. The file compaction procedure is limited to 200 milliseconds by default; it may take a longer time, but I don't think it should be that long. Maybe you have a lot of temporary objects or uncommitted data, or very high fragmentation of the database file. It's hard to say what's going wrong without further investigation.
I am trying to achieve application continuity with an Oracle 12c database and Oracle UCP (Universal Connection Pool). As per the official documentation, I have implemented the following in my application, using ojdbc8.jar along with the matching ons.jar and ucp.jar.
import oracle.ucp.jdbc.PoolDataSource;
import oracle.ucp.jdbc.PoolDataSourceFactory;

PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();

// Properties as per Oracle documentation:
pds.setConnectionFactoryClassName("oracle.jdbc.replay.OracleDataSourceImpl");
pds.setUser("username");
pds.setPassword("password");
pds.setInitialPoolSize(10);
pds.setMinPoolSize(10);
pds.setMaxPoolSize(20);
pds.setFastConnectionFailoverEnabled(true);
pds.setONSConfiguration("nodes=IP_1:ONS_PORT_NUMBER,IP_2:ONS_PORT_NUMBER");
pds.setValidateConnectionOnBorrow(true);
pds.setURL("jdbc:oracle:thin:@my_scan_name.my_domain_name.com:PORT_NUMBER/my_service_name");
// I have also tried using the TNS-like URL as well.
However, I am not able to achieve application continuity. I have some in-flight transactions that I expect to be replayed when I bring down the RAC node on which my database service is running. What I observe is that my service migrates to the next available RAC node in the cluster, but my in-flight transactions fail. What I expect to happen is that the driver automatically restarts the failed in-flight transactions; however, I don't see this happening. As for the queries I fire at the database, sometimes I see them being triggered again on the database side, but we get a Connection Closed exception on the client side.
According to some documentation, application continuity allows the application to mask outages from the user. My doubt is whether my understanding is correct that application continuity replays the SQL statements that were in flight when the outage occurred, or whether the true meaning of application continuity is something else.
I have referred to some blogs, such as this one:
https://martincarstenbach.wordpress.com/2013/12/13/playing-with-application-continuity-in-rac-12c/
The example mentioned there does not seem to be intended for replaying in-flight SQL statements.
Is application continuity capable of replaying the in-flight SQL statements during an outage, or do FCF and application continuity only restore the state of the connection object and make it usable by the user after the outage has occurred? If the former is true, please point out what I am missing in the application-level settings in my code that is keeping me from achieving replay.
Yes your understanding is correct. With the replay driver, Application Continuity can replay in-flight work so that an outage is invisible to the application and the application can continue, hence the name of the feature. The only thing that's visible from the application is a slight delay on the JDBC call that hit the outage. What's also visible is an increase in memory usage on the JDBC side because the driver maintains a queue of calls. What happens under the covers is that, upon outage, your physical JDBC connection will be replaced by a brand new one and the replay driver will replay its queue of calls.
Now there could be cases where replay fails. For example replay will fail if the data has changed. Replay will also be disabled if you have multiple transactions within a "request". A "request" starts when a connection is borrowed from the pool and ends when it's returned back to the pool. Typically a "request" matches a servlet execution. If within this request you have more than one "commit" then replay will be disabled at runtime and the replay driver will stop queuing. Also note that auto-commit must be disabled.
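As a rough sketch of what that means in code (pds is the PoolDataSource configured earlier; the table and bind values are illustrative):

import java.sql.Connection;
import java.sql.PreparedStatement;

try (Connection con = pds.getConnection()) {   // "request" starts at borrow
    con.setAutoCommit(false);                  // auto-commit must be off
    try (PreparedStatement ps = con.prepareStatement(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?")) {
        ps.setInt(1, 100);
        ps.setInt(2, 42);
        ps.executeUpdate();
    }
    con.commit();                              // exactly one commit per request
}                                              // "request" ends at close/return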
[I'm part of the Oracle team that designed and implemented this feature]
I think the JDBC connection string could be your problem:
pds.setURL("jdbc:oracle:thin:#my_scan_name.my_domain_name.com:PORT_NUMBER/my_service_name");
You are using a so called EZConnect String but this is not supported with AC.
Alias (or URL) = (DESCRIPTION=
    (CONNECT_TIMEOUT=120)(RETRY_COUNT=20)(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=3)
    (ADDRESS_LIST=(LOAD_BALANCE=on)
        (ADDRESS=(PROTOCOL=TCP)(HOST=primary-scan)(PORT=1521)))
    (ADDRESS_LIST=(LOAD_BALANCE=on)
        (ADDRESS=(PROTOCOL=TCP)(HOST=secondary-scan)(PORT=1521)))
    (CONNECT_DATA=(SERVICE_NAME=gold-cloud)))
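If you prefer to keep the descriptor inline instead of resolving an alias via tnsnames.ora, it can be passed to UCP directly. A sketch reusing the placeholders above:

pds.setURL("jdbc:oracle:thin:@(DESCRIPTION="
        + "(CONNECT_TIMEOUT=120)(RETRY_COUNT=20)(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=3)"
        + "(ADDRESS_LIST=(LOAD_BALANCE=on)(ADDRESS=(PROTOCOL=TCP)(HOST=primary-scan)(PORT=1521)))"
        + "(ADDRESS_LIST=(LOAD_BALANCE=on)(ADDRESS=(PROTOCOL=TCP)(HOST=secondary-scan)(PORT=1521)))"
        + "(CONNECT_DATA=(SERVICE_NAME=gold-cloud)))");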
As per IBM documentation:
Purge Policy
Specifies how to purge connections when a stale connection or fatal connection error is detected.
Valid values are EntirePool and FailingConnectionOnly.
Question:
How and when does the server learn that a connection has gone stale? Does it purge the pool as soon as any connection goes stale, or does that happen per the reap time?
Say the reap time is 180 seconds, the reap thread last ran at 3:05 PM, and a connection goes stale at 3:06 PM. Will the server purge the pool at 3:06 PM, or only at 3:08 PM? Is there a risk of clients getting stale connection objects between 3:06 and 3:08?
The IBM document I'm referring to is:
https://www.ibm.com/support/knowledgecenter/en/ssw_i5_54/rzamy/50/admin/help/udat_conpoolset.html
Stale connections are identified in the following ways:
A JDBC operation raises SQLRecoverableException or SQLNonTransientConnectionException, or a general SQLException with a SQL state or error code that the application server has built-in knowledge of. For specific lists of SQL states and error codes, refer to the SQLState mappings in DatabaseHelper and its various subclasses per database.
The JDBC driver's ConnectionEventListener.connectionErrorOccurred signals the application server that a connection has gone bad.
When the application server learns that a connection has gone bad, it does not return that connection to the pool. Subsequent requests outside of the sharing scope would never get that same connection.
Purge Policy determines what the application server does with the other connections that are in the pool at the time that a stale connection occurred. The application server can aggressively purge all connections from the pool (EntirePool option), or it can leave the others there (FailingConnectionOnly option), or it can check all connections in the pool before allowing them to be handed out (ValidateAllConnections option).
Note that the property values above are for WebSphere Application Server Liberty. If using traditional, ValidateAllConnections is done as the combination of FailingConnectionOnly plus defaultPretestOptimizationOverride=true.
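For illustration, the kind of test described above amounts to roughly the following. The "08" SQLState class (connection exception) is one common indicator; real servers consult per-database lists such as the DatabaseHelper mappings mentioned earlier:

import java.sql.SQLException;
import java.sql.SQLNonTransientConnectionException;
import java.sql.SQLRecoverableException;

// Heuristic check for exceptions that indicate a dead connection.
static boolean looksLikeStaleConnection(SQLException e) {
    return e instanceof SQLRecoverableException
            || e instanceof SQLNonTransientConnectionException
            || (e.getSQLState() != null && e.getSQLState().startsWith("08"));
}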
If the server running an Oracle database reboots, this presumably invalidates all existing JDBC connections in the connection pool (running as part of an application on another server).
Some connection pools can handle this, others require manual re-initialization; about Oracle I do not know. Would the Oracle connection pool (versions 11g and up) be able to detect such a situation and recover, or do I need to check for this myself and reinitialize?
Ideally, I would also like to know what happens to a connection in use that has been borrowed from the pool. Would such a connection proxy be able to use the rebooted server?
I have read http://docs.oracle.com/cd/B10501_01/java.920/a96654/connpoca.htm without particular success.
How can I know the average or exact number of users accessing the database simultaneously in my Java EE web application? I would like to see whether the connection pool settings I configured in the GlassFish application server are suitable for my web application, since I need to set the maximum number of connections in the pool correctly. Recently, my application ran out of connections and threw exceptions when client requests for database connections timed out.
There are multiple ways.
The first and easiest is to ask your DBAs - they can tell you exactly how many connections are active from your web server, or for your connection pool's user ID, at a given time.
If you want some excitement, you can use the JMX management extensions provided by GlassFish. Listing 6 on this page gives an example of how to write a JMX-based snippet to monitor a connection pool.
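A hedged sketch using plain JMX; the ObjectName pattern and attribute name below are assumptions that vary by GlassFish version, so check the monitoring documentation for your release:

import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class PoolMonitor {
    public static void main(String[] args) throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        // Assumed MBean name pattern and attribute; adjust for your server.
        for (ObjectName pool : mbs.queryNames(
                new ObjectName("amx:type=jdbc-connection-pool-mon,*"), null)) {
            Object inUse = mbs.getAttribute(pool, "numconnused");
            System.out.println(pool + " -> connections in use: " + inUse);
        }
    }
}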
Finally, you must make sure that all connections are closed explicitly by a connection.close() type of call in your application; in some cases you need to close the ResultSet as well (see the sketch below).
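For example, try-with-resources closes the ResultSet, Statement and Connection in reverse order even when an exception is thrown, so connections always go back to the pool. dataSource and the query are illustrative:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

static void printUser(DataSource dataSource, int userId) throws SQLException {
    try (Connection con = dataSource.getConnection();
         PreparedStatement ps = con.prepareStatement(
                 "SELECT name FROM users WHERE id = ?")) {
        ps.setInt(1, userId);
        try (ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                System.out.println(rs.getString("name"));
            }
        }
    }
}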
Next is throttling your HTTP thread pool to avoid too much concurrent access if your database connections are taking longer to close.