JDBC requests to Oracle 11g failing to be committed although apparently succeeding - java

We have an older web-based application (Java, with the Spring 2.5.4 framework) running on a GlassFish 3.1 (build 43) server. This application was recently (a few weeks ago) redirected to use an Oracle 11g (11.2.0.3.0) database with ojdbc6.jar/orai18n.jar (up from Oracle 10g 10.2.0.3.0 and ojdbc14.jar), using a JDBC Thin connection. The application uses org.apache.commons.dbcp.BasicDataSource version 1.2.2 for connections, and database requests are handled either through Spring's JdbcTemplate (via the JdbcDaoSupport abstract class) or through Spring's PlatformTransactionManager.
This morning we noticed that application users had been able to enter information, modify it, and later retrieve and print that data through the application, yet no updates had been committed for the last 24 hours. The application currently has only a few users each day, and they were apparently sharing the same connection, which the connection pool had kept open during that day; their uncommitted updates were therefore visible through the application, but not through other connections to the database. When the connection was closed, the uncommitted updates were lost.
Examining the server logs showed no errors from the time of the last committed change to the database through the time the reports were printed the next morning. Moreover, even if some of the changes had somehow been made with the JDBC connection set to auto-commit false, specific commits were issued for some of those updates as part of a transaction; within its try/catch block, either transactionManager.commit(transactionStatus) or transactionManager.rollback(transactionStatus) must have been called, and must have completed without error. It looks as though the commit was returning successfully, but no commit actually occurred.
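For context, the programmatic transaction handling described above follows the standard Spring pattern, roughly like this sketch (the SQL and identifiers are illustrative, not our actual code; transactionManager and jdbcTemplate are the injected Spring beans):

import org.springframework.transaction.TransactionStatus;
import org.springframework.transaction.support.DefaultTransactionDefinition;

// Begin the transaction explicitly.
TransactionStatus transactionStatus =
        transactionManager.getTransaction(new DefaultTransactionDefinition());
try {
    // One or more SQL requests making up the transaction.
    jdbcTemplate.update("UPDATE orders SET status = ? WHERE id = ?", newStatus, orderId);
    transactionManager.commit(transactionStatus);
} catch (RuntimeException e) {
    transactionManager.rollback(transactionStatus);
    throw e;
}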
Restarting the GlassFish domain and the application restored the normal operation with the various updates being committed as they are entered.
My question is: has anyone seen or heard of anything like this occurring and, if so, what could have caused it?
Thank you for any ideas here -- we are at a loss.
Some new information:
Examination of our Oracle 11g server showed that, near the time we believe the commits appeared to stop, four operations were blocked on some other operation that we were not able to fully identify, but which was probably an update.
Examination of the GlassFish server logs showed that the pattern of worker threads changed after this estimated start time: fewer and fewer threads appeared in the log, until only one thread remained in use for several hours.
The problem occurred again about one week later and was caught after about half an hour. At that time, two worker threads were in operation.

The problem occurred due to a combination of two things. The first was a method that set up a Spring transaction but had an exit path that bypassed both TransactionManager.commit() and TransactionManager.rollback() (as well as the several SQL requests making up the transaction). Although this was admittedly incorrect coding, in the past this transaction was closed anyway and therefore had no effect on subsequent usage.
The solution was to ensure that the transaction is not started if there is nothing to be done; or, more generally, to double-check that every transaction, once started, is completed.
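To make the failure mode concrete, the flawed method looked roughly like the following sketch (the method, types, and SQL here are hypothetical reconstructions, not the actual application code; transactionManager and jdbcTemplate are the usual injected beans):

// Hypothetical reconstruction of the defect, not the actual code.
public void applyChanges(List<Change> changes) {
    TransactionStatus status =
            transactionManager.getTransaction(new DefaultTransactionDefinition());
    if (changes.isEmpty()) {
        return; // BUG: exits without commit() or rollback(); the pooled
                // connection keeps auto-commit off and an open transaction
    }
    try {
        for (Change c : changes) {
            jdbcTemplate.update("UPDATE item SET qty = ? WHERE id = ?",
                    c.getQty(), c.getId());
        }
        transactionManager.commit(status);
    } catch (RuntimeException e) {
        transactionManager.rollback(status);
        throw e;
    }
}

The fix was simply to perform the isEmpty() check before calling getTransaction(), so that no transaction is started when there is nothing to do.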
I am not certain exactly how or why this problem began presenting itself, so the following is partly conjecture. Apparently, upgrading to Oracle 11g and/or switching to the ojdbc6.jar driver altered the earlier behavior of the incorrect code, so that the transaction was no longer terminated and the connection's auto-commit was left false. (It could also be due to some other change we have not identified, since the special case above happens rarely, but it does happen.) The corresponding JDBC connection appears to be bound to a specific GlassFish worker thread (I will call this a 'bad' thread in the following, as opposed to the normally behaving 'good' threads). Whenever this 'bad' thread handles an application request (for this particular application), changes go uncommitted and selects return dirty data. As time goes on, when a change is requested on a 'good' thread whose JDBC connection touches data that already has an uncommitted change from the 'bad' thread, the new request hangs and that worker thread hangs with it. Eventually all but the 'bad' worker thread are hung; everything seems to work correctly from the application's viewpoint, but nothing is ever committed.
Again, the solution was to correct the bad code.

Related

Does Oracle keep queries running in the background despite losing the connection with the app?

Our app, written in Java, sends long-running queries to Oracle through the JDBC API. It's inevitable that the app will sometimes crash or be killed abruptly, for a plethora of reasons, without the chance to terminate the queries it has sent. When the app is killed or stops, it also loses its connection to Oracle.
Does Oracle DB keep the query running in the background even if it has already lost the connection with the app that sent the query?
Please cite sources.
When a connection between the database and the app is lost, Oracle will stop the session's queries and kill the session. But there are two potential exceptions:
Rollback must finish. From the Database Concepts manual: "A transaction ends when ... A client process terminates abnormally, causing the transaction to be implicitly rolled back using metadata stored in the transaction table and the undo segment." That rollback process cannot be stopped regardless of what happens to the connection. Even if you kill the database instance, when the instance restarts it will resume the rollback. As a general rule of thumb, the time to rollback will be about the same as the time the database spent running the DML. You just have to wait while Oracle puts itself back into a consistent state.
Zombie sessions. Although I don't have a reproducible test case for this problem, I'm sure every DBA has a story about sessions running after the client process disappeared, or even after they killed the session. Before you dismiss this concern as an old myth, note that the SQLNET.EXPIRE_TIME parameter was created for this scenario. Setting the value greater than 0 will have Oracle periodically check and clear terminated sessions. But you don't need to set this parameter unless you're having specific problems.
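For reference, the parameter goes in sqlnet.ora on the database server; the value is in minutes, and the 10 below is only an example, not a recommendation for any particular site:

# sqlnet.ora on the database server
# Probe clients every 10 minutes and clean up sessions whose clients are gone.
SQLNET.EXPIRE_TIME = 10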

Application continuity with Universal Connection Pool java JDBC Oracle 12c

I am trying to achieve Application Continuity with an Oracle 12c database and Oracle UCP (Universal Connection Pool). As per the official documentation, I have implemented the following in my application. I am using ojdbc8.jar along with the matching ons.jar and ucp.jar in my application.
PoolDataSource pds = oracle.ucp.jdbc.PoolDataSourceFactory.getPoolDataSource();
Properties as per the Oracle documentation:
// The replay data source is required for Application Continuity.
pds.setConnectionFactoryClassName("oracle.jdbc.replay.OracleDataSourceImpl");
pds.setUser("username");
pds.setPassword("password");
pds.setInitialPoolSize(10);
pds.setMinPoolSize(10);
pds.setMaxPoolSize(20);
// Fast Connection Failover, driven by ONS notifications from the cluster.
pds.setFastConnectionFailoverEnabled(true);
pds.setONSConfiguration("nodes=IP_1:ONS_PORT_NUMBER,IP_2:ONS_PORT_NUMBER");
pds.setValidateConnectionOnBorrow(true);
pds.setURL("jdbc:oracle:thin:@my_scan_name.my_domain_name.com:PORT_NUMBER/my_service_name");
// I have also tried using the TNS-style URL as well.
However, I am not able to achieve Application Continuity. I have some in-flight transactions that I expect to be replayed when I bring down the RAC node on which my database service is running. What I observe is that my service migrates to the next available RAC node in the cluster; however, my in-flight transactions fail. What I expect to happen here is that the driver will automatically restart the failed in-flight transactions, but I don't see this happening. As for the queries that I fire at the database, sometimes I see them being triggered again on the database side, but we see a Connection Closed exception on the client side.
According to some documentation, Application Continuity allows the application to mask outages from the user. My doubt here is whether my understanding is correct that Application Continuity will replay the SQL statements that were in flight when the outage occurred, or whether the true meaning of Application Continuity is something else.
I have referred to some blogs, such as this one:
https://martincarstenbach.wordpress.com/2013/12/13/playing-with-application-continuity-in-rac-12c/
The example mentioned there does not seem to be intended for replaying in-flight SQL statements.
Is Application Continuity capable of replaying the in-flight SQL statements during an outage, or do FCF and Application Continuity only restore the state of the connection object and make it usable by the user after the outage has occurred? If the former is true, then please guide me as to what I am missing in the application-level settings in my code that is keeping me from achieving replay.
Yes your understanding is correct. With the replay driver, Application Continuity can replay in-flight work so that an outage is invisible to the application and the application can continue, hence the name of the feature. The only thing that's visible from the application is a slight delay on the JDBC call that hit the outage. What's also visible is an increase in memory usage on the JDBC side because the driver maintains a queue of calls. What happens under the covers is that, upon outage, your physical JDBC connection will be replaced by a brand new one and the replay driver will replay its queue of calls.
Now there could be cases where replay fails. For example replay will fail if the data has changed. Replay will also be disabled if you have multiple transactions within a "request". A "request" starts when a connection is borrowed from the pool and ends when it's returned back to the pool. Typically a "request" matches a servlet execution. If within this request you have more than one "commit" then replay will be disabled at runtime and the replay driver will stop queuing. Also note that auto-commit must be disabled.
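Putting those rules together, a replay-friendly "request" looks roughly like the sketch below, using the pds pool from the question (the SQL and identifiers are placeholders; assumes java.sql.Connection and java.sql.PreparedStatement):

// Sketch of a single-transaction "request" that stays replayable.
try (Connection conn = pds.getConnection()) {      // request starts: connection borrowed
    conn.setAutoCommit(false);                     // auto-commit must be off for replay
    try (PreparedStatement ps = conn.prepareStatement(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?")) {
        ps.setBigDecimal(1, amount);
        ps.setLong(2, accountId);
        ps.executeUpdate();
    }
    conn.commit();                                 // exactly one commit in this request
}                                                  // request ends: connection returned to the pool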
[I'm part of the Oracle team that designed and implemented this feature]
I think the JDBC connection string could be your problem:
pds.setURL("jdbc:oracle:thin:@my_scan_name.my_domain_name.com:PORT_NUMBER/my_service_name");
You are using a so-called EZConnect string, but this is not supported with AC. Use a long TNS-style descriptor instead, for example:
Alias (or URL) = (DESCRIPTION=
  (CONNECT_TIMEOUT=120)(RETRY_COUNT=20)(RETRY_DELAY=3)(TRANSPORT_CONNECT_TIMEOUT=3)
  (ADDRESS_LIST=(LOAD_BALANCE=on)
    (ADDRESS=(PROTOCOL=TCP)(HOST=primary-scan)(PORT=1521)))
  (ADDRESS_LIST=(LOAD_BALANCE=on)
    (ADDRESS=(PROTOCOL=TCP)(HOST=secondary-scan)(PORT=1521)))
  (CONNECT_DATA=(SERVICE_NAME=gold-cloud)))

What happens to H2 database cache on a system crash?

I really had a bad experience today. I applied some batch SQL scripts through the NetBeans IDE to my H2 database (which is running in TCP mode). After 5 hours of work, the database connection in NetBeans suddenly froze... Subsequently I restarted the server on which the H2 database is running, and then I realized that all the changes of the last 5 hours had not been applied, or had somehow been rolled back...
My conclusion is that the changes were only in the cache and were never flushed to the database, since the results were visible at all times when queried after each SQL script.
Therefore, what happens to the database cache in the case of a system failure? Is it gone...?
Yes, the cache is gone in the event of a system failure. You must not have committed the transaction. The only guarantee then is that it must be rolled back (since it wasn't committed and the client has disconnected).
If it had been committed and the server had subsequently crashed (before flushing), then the server could still have recovered the data from some combination of the commit/transaction log and internal metadata.
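The practical takeaway is to commit explicitly at checkpoints when running long batch scripts, roughly like this JDBC sketch (the URL, credentials, and batchStatements list are placeholders):

// Commit at checkpoints so a crash cannot discard hours of uncommitted work.
try (Connection conn = DriverManager.getConnection(
        "jdbc:h2:tcp://localhost/~/testdb", "sa", "")) {
    conn.setAutoCommit(false);
    try (Statement st = conn.createStatement()) {
        for (String sql : batchStatements) {
            st.execute(sql);
        }
        conn.commit();   // only committed work is recoverable after a crash
    } catch (SQLException e) {
        conn.rollback(); // undo the partially applied batch
        throw e;
    }
}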

What are the implications of running a query against a MySQL database via Hibernate without starting a transaction?

It seems to me that we have some code that does not start a transaction, yet for read-only operations our queries, via JPA/Hibernate as well as straight SQL, seem to work. A Hibernate/JPA session would have been opened by our framework, but in a few spots in legacy code we found that no transactions were being opened.
What seems to happen is that the code usually runs fine as long as it does not use EntityManager.persist or EntityManager.merge. However, once in a while (maybe 1 in 10 times) the servlet container fails with this error...
Failed to load resource: the server responded with a status of 500 (org.hibernate.exception.JDBCConnectionException: The last packet successfully received from the server was 314,024,057 milliseconds ago. The last packet sent successfully to the server was 314,024,057 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.)
As far as I can tell, only the few spots in our application code where no transaction is started before a query is made have this problem. Does anybody else think it could be the non-transactional queries that cause this behaviour?
FYI here is our stack...
-Guice
-Guice-Persist
-Guice-Servlet
-MySql 5.1.63
-Hibernate/C3P0 4.1.4.Final
-Jetty
Yes, I think so.
If you run a query without opening a transaction, a transaction will be opened automatically by the underlying layer. That connection, with an open transaction, will then be returned to the connection pool and given to another user, who will receive a connection with an already-open transaction, and that can lead to inconsistent state.
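One way to avoid handing an open transaction back to the pool is to complete a transaction around every unit of work, even read-only ones. With the Guice-Persist stack from the question, that is roughly this sketch (the service and entity names are made up):

import com.google.inject.Inject;
import com.google.inject.Provider;
import com.google.inject.persist.Transactional;
import javax.persistence.EntityManager;
import java.util.List;

public class CustomerService {               // hypothetical service; Customer is a mapped entity
    @Inject private Provider<EntityManager> em;

    @Transactional                           // guice-persist begins the transaction and
    public List<Customer> findByRegion(String region) {  // commits (or rolls back) on exit
        return em.get()
                 .createQuery("select c from Customer c where c.region = :r", Customer.class)
                 .setParameter("r", region)
                 .getResultList();
    }
}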
Here at my company we had a lot of problems in the past with read-only non-transactional queries, and we adjusted our framework to handle this.
Besides that, we talked to the BoneCP developers and they agreed to develop a set of features to help handle this problem, such as auto-rolling-back uncommitted transactions returned to the pool and printing a stack trace of the method that forgot to commit the transaction.
This matter was discussed here:
http://jolbox.com/forum/viewtopic.php?f=3&t=98

How to fix error: [BEA][SQLServer JDBC Driver]No more data available to read

My Java application uses DB connection pooling. One piece of functionality started failing today with this error:
[BEA][SQLServer JDBC Driver]No more data available to read
This doesn't occur daily. Once I restart my application server, things look fine for some days, and then this error comes back again.
Has anyone encountered this error? The reasons might vary, but I would like to know those various reasons so I can mitigate my issue.
Is it possible that the database or network connection has briefly had an outage? You might expect any currently open result sets then to become invalid, with resulting errors.
I've never seen this particular error, but then I don't work with BEA or SQL Server; a quick Google does show other folks suggesting such a cause.
When you're using a connection pool, if you do get such a glitch, then all connections in the pool become "stale" or invalid. My application server (WebSphere) has the option to discard the entire connection pool after particular errors are detected. The result then is that one unlucky request sees the error, but subsequent requests get a new connection and recover. If you don't discard the whole pool, then you get a failure each time a stale connection is used and discarded.
I suggest you investigate: a) whether your app server has such a capability, and b) how your application responds if the database is bounced; if that replicates the error, then maybe you've found the cause.
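As a concrete example of (a): with a Commons DBCP pool (like the BasicDataSource in the question at the top of this page), you can have the pool validate each connection as it is borrowed, so a stale connection is quietly discarded rather than handed to the application (the driver class and URL below are placeholders):

import org.apache.commons.dbcp.BasicDataSource;

BasicDataSource ds = new BasicDataSource();
ds.setDriverClassName("weblogic.jdbc.sqlserver.SQLServerDriver"); // placeholder driver class
ds.setUrl("jdbc:bea:sqlserver://dbhost:1433");                    // placeholder URL
ds.setValidationQuery("SELECT 1");  // cheap query the pool runs before handing out a connection
ds.setTestOnBorrow(true);           // failed validation discards the connection and borrows another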
