Log database connections hanging - java

I have a java application that connects to a Sybase database.
I want to log, at the application level, database connections that are hanging.
I am not sure whether this is possible; if it is, please point me in the right direction.

If a connection actually "hangs", it's often a deadlock. The database is the best place to diagnose and log these problems in detail.
However, I know that the MS SQL Server driver will throw a pretty specific exception after a deadlock victim is chosen, and from my (much more limited) experience with Java, I imagine the Sybase driver does the same. You could trap that exception in your application. Even if it's a general exception, you might examine the stack trace or description and determine that it pertained to a lock issue.
This post mentions a LockAcquisitionException with Java/Sybase.
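As a rough illustration (not specific to the Sybase driver; the "40001" SQLState and the message check are heuristics you would need to confirm against your driver), you could trap and log suspected lock failures like this:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.logging.Logger;

public class LockFailureLogging {

    private static final Logger LOG = Logger.getLogger(LockFailureLogging.class.getName());

    // Runs a query and logs anything that looks like a lock/deadlock failure.
    // The "40001" SQLState (serialization failure) and the message check are
    // heuristics, not guaranteed behaviour of the Sybase driver.
    static void runWithLockLogging(Connection conn, String sql) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                // process rows here
            }
        } catch (SQLException e) {
            String state = e.getSQLState();
            String message = String.valueOf(e.getMessage()).toLowerCase();
            if ("40001".equals(state) || message.contains("deadlock") || message.contains("lock")) {
                LOG.severe("Suspected lock/deadlock failure for query [" + sql
                        + "], SQLState=" + state + ", vendor code=" + e.getErrorCode());
            }
            throw e; // rethrow so callers still see the failure
        }
    }
}
```

If the connection genuinely hangs and no exception ever arrives, Statement.setQueryTimeout (where the driver supports it) can turn the hang into a catchable timeout exception, which gives you something concrete to log at the application level.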

Related

Microsoft SQL Server - Query with huge results, causes ASYNC_NETWORK_IO wait issue

I have a Java application requesting about 2.4 million records from a Microsoft SQL Server database (SQL Server 2008 R2 SP3).
The application runs fine on all hosts except one. On this host, the application is able to retrieve data on some occasions, but on others it hangs.
Monitoring the SQL Server instance indicates that the SPID associated with the query is in an ASYNC_NETWORK_IO wait state.
There are a few links online that talk about it.
https://blogs.msdn.microsoft.com/joesack/2009/01/08/troubleshooting-async_network_io-networkio/
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/6db233d5-8892-4f8a-88c7-b72d0fc59ca9/very-high-asyncnetworkio?forum=sqldatabaseengine
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/1df2cab8-33ca-4870-9daf-ed333a64630c/network-packet-size-and-delay-by-sql-server-sending-data-to-client?forum=sqldatabaseengine
Based on the above, an ASYNC_NETWORK_IO wait indicates one of two things:
1. The application is slow to process the results.
2. The network between the application and the DB has some issue.
For #1 above, we analyzed tcpdumps and found that in the cases where the query goes into the ASYNC_NETWORK_IO state, the application server's TCP connection has a window size that oscillates between 0 and a small number, and eventually remains stuck at 0. Based on some further analysis, firewall issues between the DB and the application have also been mostly ruled out.
So I am staring at #2, unable to understand what could possibly be going wrong. It is all the more baffling because the same code has been running under similar data loads for more than a year now, and it also runs fine on other hosts.
The JDBC driver being used is sqljdbc4-4.0.jar.
By default this driver has adaptive buffering enabled, which retrieves results from the server as the application needs them in order to keep client memory use down.
We use the default fetch size of 128 (which I believe is not a good fit here).
So I am going to experiment with overriding the default adaptive buffering behavior, even though the MS docs suggest that adaptive buffering is a good idea for large result sets.
I will change the connection setting to use selectMethod=cursor, and also change the fetch size to 1024, roughly as sketched below.
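For reference, the experiment looks roughly like this. The host, database, credentials and query are placeholders; selectMethod=cursor asks the driver to use a server-side cursor, and as far as I can tell responseBuffering=full would be the property to use if I also end up disabling adaptive buffering outright:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchTuningExperiment {
    public static void main(String[] args) throws Exception {
        // Placeholder host/database/credentials. selectMethod=cursor asks the driver
        // to use a server-side cursor instead of streaming the whole result set.
        // Adding responseBuffering=full would disable adaptive buffering entirely,
        // if that experiment is needed too.
        String url = "jdbc:sqlserver://dbhost:1433;databaseName=mydb;selectMethod=cursor";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(1024); // up from the default of 128
            try (ResultSet rs = stmt.executeQuery("SELECT * FROM big_table")) { // illustrative query
                while (rs.next()) {
                    // Consume rows promptly; a slow reader here is what leaves the
                    // server SPID sitting in ASYNC_NETWORK_IO.
                }
            }
        }
    }
}
```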
Now, if it does not work:
What aspects of the problem are worth investigating?
Assuming it's still an issue with the client, what other connection or network settings should be inspected or changed to make progress?
If it does work consistently, what is the impact of switching the connection to selectMethod=cursor
on the application side?
on the database side?
Update: I tested the application after adding selectMethod=cursor to the connection. However, it resulted in the same issue as above.
Based on discussions with other administrators on the team, at this point the issue may be in the JDBC driver, or in the OS (when it tries to handle the data coming off the network).
After a good amount of discussion with the system admin, network admin and database admin, it was agreed that somewhere in the OS-to-application stack the data from the network wasn't being handled. In the meantime, we tested a workaround where we broke the query down into smaller result sets: 5 queries, each returning about 500k records.
Now when we ran these queries sequentially, we still ran into the same issue.
However, when we ran the queries in parallel, they always succeeded.
Given that this workaround always worked, we haven't pursued the root cause of the problem any further.
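The parallel variant was along these lines; the connection URL and the five id ranges are illustrative, but the key point is one connection per chunk so each result set is drained on its own socket:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelChunkedQueries {

    private static final String URL =
            "jdbc:sqlserver://dbhost:1433;databaseName=mydb"; // placeholder URL

    public static void main(String[] args) throws Exception {
        // Hypothetical split of the 2.4M-row query into ~500k-row ranges.
        List<String> chunks = Arrays.asList(
                "SELECT * FROM big_table WHERE id BETWEEN 1 AND 500000",
                "SELECT * FROM big_table WHERE id BETWEEN 500001 AND 1000000",
                "SELECT * FROM big_table WHERE id BETWEEN 1000001 AND 1500000",
                "SELECT * FROM big_table WHERE id BETWEEN 1500001 AND 2000000",
                "SELECT * FROM big_table WHERE id BETWEEN 2000001 AND 2400000");

        ExecutorService pool = Executors.newFixedThreadPool(chunks.size());
        List<Future<Integer>> results = new ArrayList<>();
        for (final String sql : chunks) {
            Callable<Integer> task = () -> {
                // One connection per chunk, so each result set is drained independently.
                try (Connection conn = DriverManager.getConnection(URL, "user", "password");
                     Statement stmt = conn.createStatement();
                     ResultSet rs = stmt.executeQuery(sql)) {
                    int rows = 0;
                    while (rs.next()) {
                        rows++; // real row processing would go here
                    }
                    return rows;
                }
            };
            results.add(pool.submit(task));
        }
        for (Future<Integer> f : results) {
            System.out.println("chunk rows: " + f.get()); // get() rethrows any SQLException
        }
        pool.shutdown();
    }
}
```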
On another note, the hardware and software running the application were also outdated; it was running Red Hat 5, so the problem could well have something to do with that.

My application is stuck when the DB is down

I'm using Hibernate and the Tomcat JDBC connection pool to access my MySQL DB.
When the DB is down, for whatever reason, my application gets stuck.
For example, my REST resources (built with Jersey) stop serving requests.
I'm also using Quartz for scheduled tasks, and they stop running as well.
When I start my DB again, everything goes back to normal.
I don't even know where to start looking; does anyone have an idea?
Thanks
What must be happening is that your application is receiving requests, but an exception is thrown while establishing the database connection. Check the logs.
Try a flow where you are not doing any database operation; it should work fine.
When the application has hung, get a thread dump of the JVM, this will tell you the state of each thread and, rather than guessing as to the cause, you'll have concrete evidence.
Having said that, and going with the guess work approach, a lot comes down to how your connection pool is configured and the action your code takes when it receives the SQLException. If the application is totally hung, then my first port of call would be to find out if the db accessing threads are in a WAIT state, or even deadlocked. The thread dump will help you to determine if that is the case.
See kill -3 to get java thread dump for how to take a thread dump.
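If kill -3 isn't convenient (for example, you want the dump to land in your own application log), a minimal in-process sketch using ThreadMXBean works too, and it will also flag deadlocked threads:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;

public class ThreadDumpLogger {

    // Dumps every thread's state, held locks and stack trace to stderr.
    // Call it from an admin-only endpoint or a scheduled job when the app hangs.
    public static void dumpThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] deadlocked = mx.findDeadlockedThreads();
        if (deadlocked != null) {
            System.err.println("Deadlocked thread ids: " + Arrays.toString(deadlocked));
        }
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            System.err.println(info); // prints state, locks and a truncated stack trace
        }
    }

    public static void main(String[] args) {
        dumpThreads();
    }
}
```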

An SQLException was provoked, java.lang.InterruptedException, am I running out of db connections?

So we run a Hibernate, Spring, Spring Webflow stack. From what I've read so far it might also be important to know we use c3p0-0.9.1.2.
Over the last couple of days we've noticed the server suddenly stop. Users cannot log into the website, nothing appears to happen, the browser simply sits loading the page forever. The server logs also simply halt.
When we notice this, we shut down the Tomcat instance, and all of a sudden quite a few of the following errors get logged:
13:05:57.492 [TP-Processor7] WARN o.h.util.JDBCExceptionReporter - SQL Error: 0, SQLState: null
13:05:57.492 [TP-Processor7] ERROR o.h.util.JDBCExceptionReporter - An SQLException was provoked by the following failure: java.lang.InterruptedException
Any ideas what these mean? Google hasn't been too helpful. Are we leaking db connections somewhere and the pool cannot gain a new session?
We have just put in a couple new Spring Webflow flows and are experiencing a slightly increased amount of website traffic but we haven't seen this behaviour before.
I suspect those InterruptedExceptions come from the container shutting those threads down, and simply indicate that those threads still exist when Tomcat shuts down.
Instead, I would grab a thread dump from Tomcat when it next freezes. I would also get a DBA to tell you what's happening in the database. From the above I'm guessing you're hung on a database resource, but a thread dump and analysis from a DBA will certainly point you in the right direction.
Here's a Thread Dump JSP as an alternative means of generating thread dumps.
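One thing worth checking with that c3p0 version: by default, a thread asking the pool for a connection will wait indefinitely if the pool is exhausted (leaked connections, or every connection blocked on the database), which looks exactly like a silent hang. A sketch of the standard c3p0 settings that turn that into a loggable error (the driver class and URL are placeholders):

```java
import com.mchange.v2.c3p0.ComboPooledDataSource;

public class C3p0DiagnosticsConfig {

    public static ComboPooledDataSource buildDataSource() throws Exception {
        ComboPooledDataSource ds = new ComboPooledDataSource();
        ds.setDriverClass("org.example.jdbc.Driver");  // placeholder driver class
        ds.setJdbcUrl("jdbc:example://dbhost/mydb");    // placeholder URL
        ds.setUser("user");
        ds.setPassword("password");

        // Throw an SQLException after 10 seconds instead of blocking forever
        // when no connection is available.
        ds.setCheckoutTimeout(10000);

        // Treat a connection checked out for more than 5 minutes as leaked, and
        // log the stack trace of the code that checked it out.
        ds.setUnreturnedConnectionTimeout(300);
        ds.setDebugUnreturnedConnectionStackTraces(true);
        return ds;
    }
}
```

With checkoutTimeout set, starved threads fail fast with an exception you will actually see in the logs, and the unreturned-connection settings point at leaks if those turn out to be the cause.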

How to fix error: [BEA][SQLServer JDBC Driver]No more data available to read

My Java application uses DB connection pooling. One piece of functionality started failing today with this error:
[BEA][SQLServer JDBC Driver]No more data available to read
This doesn't occur daily. Once I restart my application server, things look fine for some days, and then the error comes back.
Has anyone encountered this error? The reasons might vary, but I would like to know those various reasons so I can mitigate the issue.
Is it possible that the database or network connection has briefly had an outage? You might expect any currently open result sets then to become invalid with resulting errors.
I've never seen this particular error, but then I don't work with BEA or SQL Server, but a quick google does show other folks suggesting such a cause.
When you're using a connection pool and you do get such a glitch, all connections in the pool become "stale" or invalid. My application server (WebSphere) has the option to discard the entire connection pool after particular errors are detected. The result is that one unlucky request sees the error, but subsequent requests get a new connection and recover. If you don't discard the whole pool, you get a failure each time a stale connection is used and discarded.
I suggest you investigate (a) whether your app server has such a capability, and (b) how your application responds if the database is bounced; if that replicates the error, then maybe you've found the cause.
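If the app server can't recycle the whole pool for you, a defensive option at the application level is to validate each borrowed connection and retry once on a fresh one. A rough sketch, assuming a JDBC 4 driver (Connection.isValid); with older drivers you'd run a cheap test query instead:

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

public class StaleConnectionRetry {

    interface JdbcWork<T> {
        T run(Connection conn) throws SQLException;
    }

    // Borrows a connection from the pool, discards it if it fails a cheap validity
    // check, and retries the work once on a fresh connection.
    static <T> T withRetry(DataSource ds, JdbcWork<T> work) throws SQLException {
        SQLException last = null;
        for (int attempt = 0; attempt < 2; attempt++) {
            try (Connection conn = ds.getConnection()) {
                if (!conn.isValid(5)) {   // 5-second validation timeout
                    continue;             // closing returns/discards it; try another
                }
                return work.run(conn);
            } catch (SQLException e) {
                last = e;                 // a stale-connection error lands here; loop retries once
            }
        }
        throw last != null ? last : new SQLException("No valid connection available");
    }
}
```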

Spring UncategorizedSQLException: ORA-01012

I am trying to retrieve data from an Oracle database using JDBC (ojdbc14.jar). I have a limited number of concurrent connections to the database, and these connections are managed by a WebSphere connection pool.
Sometimes when I make the call I see an UncategorizedSQLException exception thrown in my logs with one of the following oracle codes:
ORA-01012 (not logged in)
ORA-17410 (connection timed out, socket empty)
ORA-02396 (exceeded maximum idle time, please connect again)
Other times I get no exceptions and it works fine.
Anyone understand what might be happening here?
In WebSphere I have my statement cache size set to 10. I'm not sure if that is relevant in this situation, when it looks like the connection is being dropped.
It looks like the database is deciding to drop the connection. It's a good idea to write your code in a way that doesn't require that a connection be held forever. A better choice is to have the program connect to the database, do its work, and disconnect. This eliminates the problem of the database deciding to disconnect the application due to inactivity/server overload/whatever, and the program needing to figure this out and make a reasonable stab at reconnecting.
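In practice that means scoping the connection to the unit of work and letting the WebSphere pool handle reconnection, for example (the DataSource, table and query are illustrative):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class ScopedConnectionDao {

    private final DataSource dataSource; // the WebSphere-managed pool, injected

    public ScopedConnectionDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Borrow a connection, do the work, and return it immediately; the pool,
    // not the application, then deals with idle timeouts and reconnection.
    public String findName(long id) throws SQLException {
        String sql = "SELECT name FROM customers WHERE id = ?"; // illustrative table/query
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}
```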
I hope this helps.
