Calls made from Java to SQL server using jtds driver is stuck. We have checked there is no locking or any long running queries on SQL server.
Following is the stack trace from java thread dump. Any one has any idea?
"core-CommandInvoker-thread-7133" prio=5 tid=7362 #### RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at net.sourceforge.jtds.jdbc.SharedSocket.readPacket(SharedSocket.java:850)
at net.sourceforge.jtds.jdbc.SharedSocket.getNetPacket(SharedSocket.java:731)
at net.sourceforge.jtds.jdbc.ResponseStream.getPacket(ResponseStream.java:477)
at net.sourceforge.jtds.jdbc.ResponseStream.read(ResponseStream.java:146)
at net.sourceforge.jtds.jdbc.ResponseStream.read(ResponseStream.java:128)
at net.sourceforge.jtds.jdbc.TdsData.readData(TdsData.java:767)
at net.sourceforge.jtds.jdbc.TdsCore.tdsRowToken(TdsCore.java:3172)
at net.sourceforge.jtds.jdbc.TdsCore.nextToken(TdsCore.java:2430)
at net.sourceforge.jtds.jdbc.TdsCore.getNextRow(TdsCore.java:802)
at net.sourceforge.jtds.jdbc.JtdsResultSet.next(JtdsResultSet.java:608)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:207)
at org.apache.commons.dbcp.DelegatingResultSet.next(DelegatingResultSet.java:20
7)
I had the same problem today on Java 8 / JDBI / Postgresql / HikariCP.
Mysteriously adding order by to the query solved the problem.
To be more specific, the resultset needs to be in deterministic order. If there're any ties in the order by, it'll stuck on socketRead0.
Here is an old report that leads me to this solution.
Related
i have this issue where in even if the query process in Oracle ended, thread in Java Spring is still waiting or reading from DB Connection. this query loads for more than 30 minutes in the Oracle DB, after it ends, i saw web container thread in java and it's still waiting on the DB side.
all i see in thread dump is - locked <33019fca> (a oracle.jdbc.driver.T4CConnection), but i cant see any thread related to 33019fca, plus the web container is in RUNNABLE State so it puzzles me.
my question is, what are things do I need to check? I know the query is the root cause as it should not load that long, but why does web container is still waiting when the query ended in the DB side?
Application:
Java 6
OJDBC 6
Oracle DB
Spring
Websphere 8
Java Thread:
"WebContainer : 11" - Thread t#191
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:128)
at oracle.net.ns.Packet.receive(Packet.java:311)
at oracle.net.ns.DataPacket.receive(DataPacket.java:105)
at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:305)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:249)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:171)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:89)
at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:123)
at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:79)
at oracle.jdbc.driver.T4CMAREngineStream.unmarshalUB1(T4CMAREngineStream.java:429)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:397)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:257)
at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:587)
at oracle.jdbc.driver.T4CCallableStatement.doOall8(T4CCallableStatement.java:220)
at oracle.jdbc.driver.T4CCallableStatement.doOall8(T4CCallableStatement.java:48)
at oracle.jdbc.driver.T4CCallableStatement.executeForRows(T4CCallableStatement.java:938)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1150)
at oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePreparedStatement.java:4798)
at oracle.jdbc.driver.OraclePreparedStatement.execute(OraclePreparedStatement.java:4901)
**- locked <33019fca> (a oracle.jdbc.driver.T4CConnection)**
at oracle.jdbc.driver.OracleCallableStatement.execute(OracleCallableStatement.java:5631)
**- locked <33019fca> (a oracle.jdbc.driver.T4CConnection)**
at oracle.jdbc.driver.OraclePreparedStatementWrapper.execute(OraclePreparedStatementWrapper.java:1385)
at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.pmiExecute(WSJdbcPreparedStatement.java:956)
at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.execute(WSJdbcPreparedStatement.java:623)
at com.xx.xx.xxxx.dao.xxDataQueryJdbcDAO.getData(xxDataQueryJdbcDAO.java:346)
Locked ownable synchronizers:
- None
from fiddler perspective, it is still waiting for the response.
from Oracle Side,
I cant see the query loading anymore plus all connections related to schema is inactive.
I have a Quartz Job that executes a Stored Procedure in my MySQL database once every 5 minutes, and for some reason, 1 out of 3 executions fails and gives this weird exception. I have searched and searched for what this exception means, but I could not find a solution. Here is the full stack trace:
java.sql.SQLException: Could not retrieve transation read-only status server
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1078)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:989)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:975)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:920)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:951)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:941)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3939)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3910)
at com.mysql.jdbc.PreparedStatement.checkReadOnlySafeStatement(PreparedStatement.java:1258)
at com.mysql.jdbc.CallableStatement.checkReadOnlySafeStatement(CallableStatement.java:2656)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1278)
at com.mysql.jdbc.CallableStatement.execute(CallableStatement.java:920)
at com.mchange.v2.c3p0.impl.NewProxyCallableStatement.execute(NewProxyCallableStatement.java:3044)
at org.deadmandungeons.website.tasks.RankUpdateTask.execute(RankUpdateTask.java:30)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 1,198,219 milliseconds ago. The last packet sent successfully to the server was 950,420 milliseconds ago.
at sun.reflect.GeneratedConstructorAccessor43.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1121)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3673)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3562)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4113)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2731)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2812)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2761)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1612)
at com.mysql.jdbc.ConnectionImpl.isReadOnly(ConnectionImpl.java:3933)
... 9 more
Caused by: java.net.SocketException: Connection timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:114)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:161)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:189)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3116)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3573)
... 17 more
So I figured it is timing out because it thinks the MySQL server is in read-only status?
This only happens for this quartz job, and not any other time when I communicate with the database. This execution is of course happening in another thread, but I don't think that would have anything to do with it.
Why would it think the server was in read-only mode?
Also, I don't think "transation" is a word, so there's that...
Sorry for posting on old thread,
As stack trace says
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
This implies the link between JDBC and DB is broken.As per your observation you say 1 out of 3 job invocations fails.
You have these jobs scheduled every 5 minutes and as per trace the last successful message sent to server is ~15 minutes before.
Hence I suspect either
You are procedure is not returning (waiting on something)
The JDBC connection has been invalidated by the firewall/ proxy
It will interesting to see the how connections are managed, As per logs I see you are using c3p0.
You can try setting unreturnedConnectionTimeout and debugUnreturnedConnectionStackTraces. This will give you more insight into connection leaks or db calls which are taking long.
Research takes nowhere, as you guys said, but the error shows what seems to be a Database being populated by two applications at the same time.
Do you have admin privileges on this MySQL server? If you do, you should try setting
FLUSH TABLES WITH READ LOCK;
SET GLOBAL READ_ONLY=ON;
as a test to reproduce the error. Just to warn you, this command makes your database unwritable, so you will not be able to add data in it until you revert this configuration, obviously with
SET GLOBAL READ_ONLY=0;
UNLOCK TABLES;
If the result of this test is positive (same error had been reproduced), you should try isolating applications that are storing data on your database, to find out which one is conflicting with Quartz.
I'm sorry for being vague, but I hope it gives you some help...
There is on error that appears in my log file every time:
com.mysql.jdbc.CommunicationsException: Communications link failure due to underlying exception:
** BEGIN NESTED EXCEPTION **
java.net.SocketException
MESSAGE: Socket closed
STACKTRACE:
java.net.SocketException: Socket closed
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:113)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:160)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:188)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:1994)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2411)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2916)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1631)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1723)
at com.mysql.jdbc.Connection.execSQL(Connection.java:3250)
at com.mysql.jdbc.Connection.execSQL(Connection.java:3179)
at com.mysql.jdbc.Statement.executeQuery(Statement.java:1207)
at com.mchange.v2.c3p0.impl.NewProxyStatement.executeQuery(NewProxyStatement.java:35)
It happens always in the same point of the code, when the application does a specific query in the MySQL database.
The query is LIKE THIS:
SELECT M.*, O.id
FROM order_message M
INNER JOIN orders O ON M.order_id = O.id
WHERE O.seller_id = 14224 AND M.sender_user_id <> 14224
ORDER BY M.creation_date DESC
LIMIT 5
I noticed (by EXPLAINING this query) that it always use temporary/filesort to execute. All indexes are properly created and I think these is no way to improve this, but I suspect the query performance or resource utilization is causing the exception error.
I am using amazon RDS
The problem was that my connection pool was configured in a way that any database connection that took longer than 10 seconds would be dropped by the connection pool (c3p0). I was using the unreturnedConnectionTimeout parameter.
The c3p0 documentation page discourage to use this parameter. Ideally, all connections should be properly closed (and thus returned to the pool)
http://www.mchange.com/projects/c3p0/#unreturnedConnectionTimeout
I have increased the parameter to 60 seconds and the problem get solved.
I have some worker threads running, with MySQL and mysql-connector-java-5.1.20.
When I kill some SQL statement ( using kill "connection id" from mysql client), the java thread hangs, which should throw some exception.
jstack prints:
"quartzBase$child#45e3dd3c_Worker-3" prio=10 tid=0x00007f960004c800 nid=0x713d runnable [0x00007f943b3a0000]
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAvailable(Native Method)
at java.net.PlainSocketImpl.available(PlainSocketImpl.java:472)
- locked <0x00007f9e11fe13a8> (a java.net.SocksSocketImpl)
at java.net.SocketInputStream.available(SocketInputStream.java:217)
at com.mysql.jdbc.util.ReadAheadInputStream.available(ReadAheadInputStream.java:232)
at com.mysql.jdbc.MysqlIO.clearInputStream(MysqlIO.java:981)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2426)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2677)
- locked <0x00007f9e17de2b50> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.ConnectionImpl.rollbackNoChecks(ConnectionImpl.java:4863)
at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4749)
- locked <0x00007f9e17de2b50> (a com.mysql.jdbc.JDBC4Connection)
at org.apache.commons.dbcp.DelegatingConnection.rollback(DelegatingConnection.java:368)
at org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.rollback(PoolingDataSource.java:323)
at org.hibernate.transaction.JDBCTransaction.rollbackAndResetAutoCommit(JDBCTransaction.java:217)
at org.hibernate.transaction.JDBCTransaction.rollback(JDBCTransaction.java:196)
at org.springframework.orm.hibernate3.HibernateTransactionManager.doRollback(HibernateTransactionManager.java:676)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:845)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:822)
at org.springframework.transaction.interceptor.TransactionAspectSupport.completeTransactionAfterThrowing(TransactionAspectSupport.java:430)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:112)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:202)
at $Proxy1021.process(Unknown Source)
Using jvmtop, I saw this:
JvmTop 0.8.0 alpha - 22:48:37, amd64, 24 cpus, Linux 2.6.32-35, load avg 11.53
http://code.google.com/p/jvmtop
Profiling PID 27403: com.caucho.server.resin.Resin --root-dir
36.41% ( 0.22s) com.mysql.jdbc.util.ReadAheadInputStream.available()
33.42% ( 0.20s) ....opensymphony.xwork2.conversion.impl.DefaultTypeConve()
30.17% ( 0.18s) com.mysql.jdbc.util.ReadAheadInputStream.fill()
0.00% ( 0.00s) com.rabbitmq.client.impl.Frame.readFrom()
The worker threads will never accept new tasks.
any idea?
According to MySQL documentation "kill connection thread_id" should terminate the connection associated with the given thread_id. But it looks like that is not happening (in which case the Java thread will wait for an answer forever). Maybe you can verify that the connection is actually closed using some network tool (e.g. netstat) .
I've ran into hanging MySQL connections before and had to resort to using the socketTimeout JDBC connection parameter (but be careful: the socketTimeout needs to be larger than the time it takes to complete the longest running query). You could also try to use the QueryTimeout for a prepared statement.
I am experiencing some strange behaviour in my Java server application whereby database operations that usually take a few milliseconds are sporadically taking much longer (30s - 170s) to complete. This isn't isolated to a specific query as I've seen the delays occurring for both SQL update and select statements. Also, all of my select statements use the NOLOCK option so I've ruled out possible lock contention.
The last time I saw a delay I managed to capture the following stack trace from JConsole; the update in question typically takes 5ms to complete but this stack trace was accessible for at least 10 - 20 seconds. The trace suggests to me that the statement has been executed but there is some delay in retrieving the result although I could be wrong? Obviously as this was an update statement the only result I'd expect would be the row count (i.e. not a large result set of data).
I saw a "transport level error" in SQL Server Management Studio at around the time of the delay.
One suggestion I've had is that these problems are due to SQL Server resources being exhausted. Has anyone seen anything similar? Can anyone shed any light on this problem?
Thanks in advance.
Stack Trace:
Name: MessageRouterImplThread-2
State: RUNNABLE
Total blocked: 0 Total waited: 224
Stack trace:
java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.read(SocketInputStream.java:129)
com.microsoft.util.UtilSocketDataProvider.getArrayOfBytes(Unknown Source)
com.microsoft.util.UtilBufferedDataProvider.cacheNextBlock(Unknown Source)
com.microsoft.util.UtilBufferedDataProvider.getArrayOfBytes(Unknown Source)
com.microsoft.jdbc.sqlserver.SQLServerDepacketizingDataProvider.signalStartOfPacket(Unknown Source)
com.microsoft.util.UtilDepacketizingDataProvider.getByte(Unknown Source)
com.microsoft.util.UtilByteOrderedDataReader.readInt8(Unknown Source)
com.microsoft.jdbc.sqlserver.tds.TDSRequest.getTokenType(Unknown Source)
com.microsoft.jdbc.sqlserver.tds.TDSRequest.processReply(Unknown Source)
com.microsoft.jdbc.sqlserver.SQLServerImplStatement.getNextResultType(Unknown Source)
com.microsoft.jdbc.base.BaseStatement.commonTransitionToState(Unknown Source)
com.microsoft.jdbc.base.BaseStatement.postImplExecute(Unknown Source)
com.microsoft.jdbc.base.BasePreparedStatement.postImplExecute(Unknown Source)
com.microsoft.jdbc.base.BaseStatement.commonExecute(Unknown Source)
com.microsoft.jdbc.base.BaseStatement.executeUpdateInternal(Unknown Source)
com.microsoft.jdbc.base.BasePreparedStatement.executeUpdate(Unknown Source)
- locked com.microsoft.jdbc.sqlserver.SQLServerConnection#c4b83f
org.apache.commons.dbcp.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:101)
org.springframework.jdbc.core.JdbcTemplate$2.doInPreparedStatement(JdbcTemplate.java:798)
org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:591)
org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:792)
org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:850)
org.springframework.jdbc.core.JdbcTemplate.update(JdbcTemplate.java:858)
org.springframework.jdbc.core.simple.SimpleJdbcTemplate.update(SimpleJdbcTemplate.java:237)
"...whereby database operations that
usually take a few milliseconds are
sporadically taking much longer (30s -
170s) to complete."
What you are describing sounds like an incorrectly cached query plan, due to out of date statistics (and/or indexes that need rebuilding), or incorrect parameter sniffing. The timeout could be occuring because the server is taking longer than the default connection timeout.
I would talk to your DBA and first get statistics updated, and if that doesn't work get the indexes of the tables involved in the query rebuilt.
Run this on your Database (with the usual caveat about not runninhg in Production without talking to your admin/DBA, and run at own risk etc.):
EXEC sp_updatestats
EXEC sp_refreshview
EXEC sp_msForEachTable 'EXEC sp_recompile ''?'''
Alternatively, you mention time of day being a factor. Could it be that a backup or scheduled job is occuring at that time?
Update: You could kick off a profiler trace: MS SQL Server 2008 - How Can I Log and Find the Most Expensive Queries? but don't restrict to your DB. Such a trace, as long as it is started from SSMS as per that post, is relatively low impact (3-5% ish).
The "transport level error" seems to indicate connectivity problems. Is the database on a separate machine?