The setup:
2-node Cassandra 1.2.6 cluster
replicas=2
very large CQL3 table with no secondary index
Rowkey is a UUID.randomUUID().toString()
read consistency set to ONE
Using DataStax java driver 1.0
The request:
Attempting to do a table scan by "SELECT some-col from schema.table LIMIT nnn;"
The fail:
Once I go beyond a certain nnn LIMIT, I start to get NoHostAvailableExceptions from the driver.
It reads like this:
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered))
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:64)
at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:214)
at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:169)
at com.jpmc.es.rtm.storage.impl.EventExtract.main(EventExtract.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered))
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:98)
at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:165)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
Given: This is probably not the most enlightened thing to do to a large table with millions of rows, but this is how I learn what not to do, so I would really appreciate someone who could explain how this kind of error can be debugged.
For example, when this happens, there is no indication that the nodes in the cluster ever had an issue with the request (there is nothing in the logs on either node indicating any timeout or failure). Also, I enabled tracing on the driver, which gives you some nice autotrace-style (a la Oracle) info as long as the query succeeds. But in this case the driver throws a NoHostAvailableException and no ExecutionInfo is available, so tracing has provided no benefit here.
I also find it interesting that this does not seem to be recorded as a timeout (my JMX consoles tell me no timeouts have occurred). So I am left not understanding WHERE the failure is actually occurring; my working theory is that the driver is having the problem, but I don't know how to debug it (and I would really like to).
I have read several posts from folks stating that querying for result sets of more than 10,000 rows is probably not a good idea, and I am willing to accept this, but I would like to understand what is causing the exception and where it is happening.
FWIW, I also tried bumping the timeout properties in the cassandra.yaml, but this made no difference whatsoever.
I welcome any suggestions, anecdotes, insults, or monetary contributions for my registration in the house of moron-developers.
Regards!!
My guess (and perhaps others can confirm) is that the query is putting too high a load on the cluster, which is causing the timeout. So, yes, it's a little difficult to debug, as the root cause isn't obvious: was the limit set too large, or is the cluster actually down?
You want to avoid setting large limits on the amount of data you request in a single query; instead, set a reasonable page size and page through the results, e.g.:
SELECT * FROM messages WHERE user_id = 101 LIMIT 1000;
SELECT * FROM messages WHERE user_id = 101 AND msg_id > [Last message ID received] LIMIT 1000;
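Since the row key here is a random UUID, there is no clustering column to page on; a full scan can instead be paged on the partitioner token. Below is a rough sketch against driver 1.x; the keyspace, table, and column names are illustrative, not taken from the question:
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class TokenScan {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("10.181.13.239").build();
        Session session = cluster.connect("schema");   // illustrative keyspace
        final int pageSize = 1000;                     // keep pages small
        String lastKey = null;
        while (true) {
            // First page has no token bound; later pages resume after the last key seen
            String cql = (lastKey == null)
                ? "SELECT key, some_col FROM events LIMIT " + pageSize
                : "SELECT key, some_col FROM events WHERE token(key) > token('"
                    + lastKey + "') LIMIT " + pageSize;
            int rows = 0;
            for (Row row : session.execute(cql)) {
                lastKey = row.getString("key");
                rows++;
                // process the row here
            }
            if (rows < pageSize) break;                // short page => scan complete
        }
        cluster.shutdown();
    }
}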
The automatic paging functionality added in version 2.0 of the DataStax java-driver (see this document, from which the code examples in this answer are copied) is a big improvement, as it removes the need to page manually and lets you do the following:
Statement stmt = new SimpleStatement("SELECT * FROM images");
stmt.setFetchSize(100);
ResultSet rs = session.execute(stmt);
// Iterate over the ResultSet here
While this won't necessarily solve your problem, it will minimise the possibility that it was a "too-big" query.
Related
I have an application written by a third party which uses Java/Tomcat talking to an Oracle 12c (12.2.0.1) DB. In its logs it reports "Error inserting into table" but provides no details. In talking with the vendor's support staff, they indicated it is old code and they have no way to give more detail. They say the application is better supported with MSSQL, which we do not support in our shop.
I would like to see what the insert statement going to the Oracle DB looks like, but I haven't been able to find it in v$sqltext. As an alternative, I was hoping to find a tool like Fiddler to view the outbound traffic on port 1521.
Is there a specific tool that would allow trapping this traffic (which is not encrypted) so I can see the "query" sent and the response coming back from the Oracle DB?
A general sniffer may work, but sniffers tend to capture a lot of extraneous traffic and require a fair amount of mucking about to find what you want.
Note:
As I mentioned in the comments, I am not a Tomcat/Java person. I think I found where the classpath is set. Given the Windows batch file below, is the "driver" that needs to be replaced bcprov-jdk16-138.jar?
set PROJLIB=..\..
set JAVA_HOME=%PROJLIB%\jdk\
set libDIR=%PROJLIB%\appserver\webapps\receiver\WEB-INF\lib
set consoleDIR=%PROJLIB%\bin\lib
set endorsedLibDir=%PROJLIB%\appserver\endorsed
set CPATH= %consoleDIR%\console.jar;%libDIR%\ebxml.jar;%libDIR%\commons-io-1.1.jar;%libDIR%\bcprov-jdk16-138.jar;%libDIR%\xercesImpl.jar
set CLASSPATH=%CPATH%
set PATH=%JAVA_HOME%\bin;%SystemRoot%;%SystemRoot%\system32
Additional Notes:
The above file is called setenv.bat.
Regarding trying to capture the SQL from the database: the application is not a Windows app; it accepts data from the network and writes it to the DB. This makes it difficult to know precisely when to start and stop monitoring, as it seems to be connected only for a very short period. It does seem to be able to read data, but not insert.
Assuming that you are using the Oracle JDBC driver and that you have the ability to replace the JDBC driver in some environment in order to debug the problem, Oracle provides versions of the JDBC driver that can be configured to log the SQL statements that are executed.
An alternative would be to create a SERVERERROR trigger in the database that logs the SQL statements that fail. I believe that would require the failing SQL statement to be well-formed, which isn't guaranteed if the third-party app hits an error while dynamically assembling the statement. If the statement never lands in v$sql, that may indicate it isn't well-formed, but it's worth a try.
If you're licensed to use the AWR/ ASH tables, you could also try querying dba_hist_active_sess_history. Oracle samples the active sessions every second. If the failing statement happens to be caught in the sampling, you'd see it there. If this is a typical OLTP application doing single-row inserts, you may need to run through a lot of samples in order to catch an active session with that statement but that may be reasonable.
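If it helps, here is a rough sketch of that ASH check run over JDBC (this assumes the Diagnostics Pack license and SELECT access on the DBA_HIST views; the connection details and the table name my_table are illustrative):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class AshScan {
    public static void main(String[] args) throws Exception {
        // Connection details are illustrative
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@dbhost:1521/ORCL", "scott", "tiger");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT h.sample_time, h.sql_id, s.sql_text "
               + "FROM dba_hist_active_sess_history h "
               + "JOIN dba_hist_sqltext s ON s.sql_id = h.sql_id "
               + "WHERE UPPER(s.sql_text) LIKE 'INSERT INTO MY_TABLE%'");
             ResultSet rs = ps.executeQuery()) {
            while (rs.next()) {
                // Each hit is a one-second sample that caught the statement active
                System.out.println(rs.getTimestamp(1) + "  " + rs.getString(2));
            }
        }
    }
}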
The simplest approach, if you can identify your database session (by querying gv$session and filtering on your connection's USERNAME), is the following.
Get the SID and SERIAL# of the connection and activate the 10046 trace using the statement below (substitute your SID for session_id and SERIAL# for serial_num):
EXEC DBMS_MONITOR.session_trace_enable(session_id =>271, serial_num=>46473, binds=>TRUE);
Note that you need permissions for both querying gv$session and executing DBMS_MONITOR so DBA access is required to grant them to your user.
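If it is more convenient to enable the trace from Java, an anonymous PL/SQL block sidesteps binding the BOOLEAN parameter through JDBC; a minimal sketch, with connection details illustrative:
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;

public class EnableTrace {
    public static void main(String[] args) throws Exception {
        // Connection details are illustrative; the user needs EXECUTE on DBMS_MONITOR
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@dbhost:1521/ORCL", "admin", "secret");
             CallableStatement cs = conn.prepareCall(
                 "begin DBMS_MONITOR.session_trace_enable("
               + "session_id => ?, serial_num => ?, binds => TRUE); end;")) {
            cs.setInt(1, 271);    // SID from gv$session
            cs.setInt(2, 46473);  // SERIAL#
            cs.execute();
        }
    }
}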
Then check the trace file on the database server, in the trace folder; the trace file has a name such as xe_m005_1336.trc.
Grep for the table name; you should see something like the following, which I simulated for a failed insert on the table my_table.
=====================
PARSING IN CURSOR #854854488 len=38 dep=0 uid=104 oct=2 lid=104 tim=380974114197 hv=1259660490 ad='7ff08904d88' sqlid='1ttgvst5j9t6a'
insert into my_table(col1) values(:1 )
END OF STMT
PARSE #854854488:c=0,e=495,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=380974114195
=====================
PARSE ERROR #854854488:len=39 dep=0 uid=104 oct=2 lid=104 tim=380974117361 err=904
insert into my_table(col1) values(:1 )
Note that this is the trace of an exception such as
java.sql.SQLSyntaxErrorException: ORA-00904: "COL1": invalid identifier
so the statement fails with a PARSE ERROR.
If the insert fails due to some constraint violation, you will see a sequence like this:
=====================
PARSING IN CURSOR #715594288 len=37 dep=0 uid=104 oct=2 lid=104 tim=382407621534 hv=3290870806 ad='7ff0032e238' sqlid='17t3q0v22dd0q'
insert into my_table(col) values(:1 )
END OF STMT
PARSE #715594288:c=0,e=245,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=382407621532
=====================
The cursor ID is #715594288, so search for this ID further on in the trace file:
BINDS #715594288:
Bind#0
oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1000000 frm=01 csi=873 siz=24 off=0
kxsbbbfp=2aa71a00 bln=22 avl=02 flg=05
value=7
=====================
Here you see the bind variables passed to the insert; it was the value 7 that caused the failure.
EXEC #715594288:c=0,e=4614,p=0,cr=7,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=382407626259
ERROR #715594288:err=2290 tim=382407626283
The statement failed during execution with an exception such as
java.sql.SQLIntegrityConstraintViolationException: ORA-02290: check constraint (XXXX.SYS_C0012357) violated
Check the documentation for further details
If you have DB access via SQL Developer:
Go to the Reports tab, then drill down through Data Dictionary, Database Administration, Sessions, and finally Sessions.
In that view, look for your app's active module(s) and look at the Active SQL tab.
One of them should have your insert statement.
This might help as well...
https://docs.oracle.com/cd/E17781_01/server.112/e18804/monitoring.htm#ADMQS252
The ultimate approach is to trace the JDBC connection on the client. Please find the full documentation here.
The first step is to get the logging JDBC driver on the CLASSPATH. The logging driver has the suffix _g in its name, e.g. ojdbc8_g.jar if you use ojdbc8.jar.
The driver can be found in the Oracle installation in the folder jdbc/lib/.
Next, define a properties file, say jdbcLogging.properties, with the following content:
.level=SEVERE
oracle.jdbc.level=ALL
oracle.jdbc.handlers=java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level=FINE
java.util.logging.ConsoleHandler.formatter=java.util.logging.SimpleFormatter
Finally, when you run the Java application, define two system properties:
java -Doracle.jdbc.Trace=true -Djava.util.logging.config.file=jdbcLogging.properties ...
This will produce a trace on the error output where you can find the executed statements.
Example
INFO: DRCP Enabled: false
Mar 23, 2021 10:40:31 PM oracle.jdbc.driver.OracleStatement logSQL
CONFIG: BAB2F1 SQL: insert into my_table(col1) values(?)
What I ended up doing was downloading Wireshark, a sniffer, and monitoring the TCP/IP packets.
I am a Java developer. Since I am new to SQL Server, I have limited knowledge of it. I am trying to find the root cause of why our SQL Server suddenly hung and then became normal again after a restart.
Symptoms:
~ Java threads started getting stuck; we figured out that the Java JDBC connections were hanging with no response from the DB, which caused the threads to get stuck
~ All connections (around 100) stayed active until SQL Server was restarted. The DB connections were finally closed by the DB after the restart, at which point the Java JDBC connections received 'Connection reset by peer'
Impact duration : 5 hours (until restart)
Tech stack:
Java spring boot running on weblogic. ORM: hibernate
SQL server 2016
Limitation:
The team restarted SQL Server before we could export any statistics from the DB, and it is doubtful we could even have run diagnostic queries beforehand, as the DB was already hung
Findings/actions:
After the DB was restarted, I tried to extract statistics from dm_exec_query_stats; however, it tracks queries based on last run time only, and there were no results for the affected period. The same goes for dm_os_waiting_tasks.
The server team says CPU and memory usage were normal (I have yet to receive the complete report)
I could see no errors/problems in the Windows event log or the cluster logs; they look normal
Some Google sources say a query may consume all available CPU and make SQL Server appear to hang; others say certain queries might have caused blocking.
It may look simple or common to SQL Server experts/DBAs; I have googled for a relevant issue and resolution, but nothing seems to help.
Even just pointing me to a document or offering expert advice would be great. Let me know if additional info is needed. Thanks in advance!
I tried these queries, but no joy:
SELECT deqs.last_execution_time AS [Time], dest.TEXT AS [Query], dbid, deqs.*
FROM sys.dm_exec_query_stats AS deqs
CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
where deqs.last_execution_time between '2020-09-29 13:16:52.710' and '2020-09-29 23:16:52.710'
ORDER BY deqs.last_execution_time DESC ;
SELECT
qs.sql_handle,
qs.execution_count,
qs.total_worker_time AS Total_CPU,
total_CPU_inSeconds = --Converted from microseconds
qs.total_worker_time/1000000,
average_CPU_inSeconds = --Converted from microseconds
(qs.total_worker_time/1000000) / qs.execution_count,
qs.total_elapsed_time,
total_elapsed_time_inSeconds = --Converted from microseconds
qs.total_elapsed_time/1000000,
st.text,qs.query_hash,
qp.query_plan
FROM
sys.dm_exec_query_stats AS qs
CROSS APPLY
sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY
sys.dm_exec_query_plan (qs.plan_handle) AS qp
where qs.last_execution_time between '2020-09-29 13:16:52.710' and '2020-09-29 23:16:52.710'
ORDER BY
qs.total_worker_time DESC;
--View waiting tasks per connection
SELECT st.text AS [SQL Text], c.connection_id, w.session_id,
w.wait_duration_ms, w.wait_type, w.resource_address,
w.blocking_session_id, w.resource_description, c.client_net_address, c.connect_time
FROM sys.dm_os_waiting_tasks AS w
INNER JOIN sys.dm_exec_connections AS c ON w.session_id = c.session_id
CROSS APPLY (SELECT * FROM sys.dm_exec_sql_text(c.most_recent_sql_handle)) AS st
WHERE w.session_id > 50 AND w.wait_duration_ms > 0
ORDER BY c.connection_id, w.session_id
GO
-- View waiting tasks for all user processes with additional information
SELECT 'Waiting_tasks' AS [Information], owt.session_id,
owt.wait_duration_ms, owt.wait_type, owt.blocking_session_id,
owt.resource_description, es.program_name, est.text,
est.dbid, eqp.query_plan, er.database_id, es.cpu_time,
es.memory_usage*8 AS memory_usage_KB
FROM sys.dm_os_waiting_tasks owt
INNER JOIN sys.dm_exec_sessions es ON owt.session_id = es.session_id
INNER JOIN sys.dm_exec_requests er ON es.session_id = er.session_id
OUTER APPLY sys.dm_exec_sql_text (er.sql_handle) est
OUTER APPLY sys.dm_exec_query_plan (er.plan_handle) eqp
WHERE es.is_user_process = 1
ORDER BY owt.session_id;
GO
We are having a problem with a prepared statement in Java. The exception seems to be very clear:
Root Exception stack trace:
com.microsoft.sqlserver.jdbc.SQLServerException: The statement must be executed before any results can be obtained.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:170)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.getGeneratedKeys(SQLServerStatement.java:1973)
at org.apache.commons.dbcp.DelegatingStatement.getGeneratedKeys(DelegatingStatement.java:315)
It basically states that we are trying to fetch the query results before the statement has been executed. Sounds plausible. Now, the code causing this exception is as follows:
...
preparedStatement.executeUpdate();
ResultSet resultSet = preparedStatement.getGeneratedKeys();
if(resultSet.next()) {
retval = resultSet.getLong(1);
}
...
As you can see, we fetch the query result after we executed the statement.
In this case, we try to get the generated key from the ResultSet of the INSERT query we just successfully executed.
Problem
We run this code on three different servers (load balanced, in docker containers). Strangely enough, this exception only occurs on the third docker server; the other two have never run into it.
Extra: the failing query is executed approximately 13,000 times per day (4,500 of them processed by server 3). Most of the time the query works fine on server 3 as well. Sometimes, say 20 times per day, it fails. Always the same query, always the same server, never one of the other servers.
What we've tried
We checked the software versions, but these are all the same because all servers run the same docker image.
We updated to the newest Microsoft SQL driver for Java
We checked if all our PreparedStatements were constructed using PreparedStatement.RETURN_GENERATED_KEYS parameter.
It looks like some server-configuration-related problem, since the docker images are all the same, but we can't find the cause. Does anyone have suggestions as to what the problem could be, or has anyone ever run into this problem as well?
As far as I know, getGeneratedKeys() is not supported by SQL Server for batch execution.
Here is the feature request, which is not satisfied yet: https://github.com/Microsoft/mssql-jdbc/issues/245
My suggestion is that if, for some reason, the insert is being executed as a batch on your third server, that could cause the exception you mention (while on the other two only single items are inserted).
You can try logging the SQL statements to check this.
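To illustrate the difference, here is a hedged sketch of the two code paths; the table and column names are illustrative. getGeneratedKeys() after a single executeUpdate() works, while calling it after executeBatch() is where the SQL Server driver raises exactly this "statement must be executed" error:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class GeneratedKeysDemo {
    static long insertOne(Connection conn, String payload) throws Exception {
        try (PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO events (payload) VALUES (?)",   // illustrative table
                 Statement.RETURN_GENERATED_KEYS)) {
            ps.setString(1, payload);
            ps.executeUpdate();                  // single execution: keys are available
            try (ResultSet keys = ps.getGeneratedKeys()) {
                return keys.next() ? keys.getLong(1) : -1L;
            }
        }
    }

    static void insertBatch(Connection conn) throws Exception {
        try (PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO events (payload) VALUES (?)",
                 Statement.RETURN_GENERATED_KEYS)) {
            ps.setString(1, "a");
            ps.addBatch();
            ps.setString(1, "b");
            ps.addBatch();
            ps.executeBatch();
            // Calling ps.getGeneratedKeys() here is what fails on the SQL Server driver
        }
    }
}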
I'm using the streaming result sets provided by Spring Data JPA's repositories with MySQL in order to reduce the memory consumption of methods that scan large result sets (which is looking increasingly like a hopelessly vain attempt; in theory the idea of using streams for this is brilliant, in practice the constraints are very difficult to work with).
If I attempt to start a second query in a thread while a stream produced by a previous query is still unclosed, I get an exception like this:
org.springframework.web.util.NestedServletException: Request processing failed; nested exception is org.hibernate.exception.GenericJDBCException: could not extract ResultSet
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:982)
org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872)
javax.servlet.http.HttpServlet.service(HttpServlet.java:661)
...
Root cause: org.hibernate.exception.GenericJDBCException: could not extract ResultSet
org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:47)
org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:111)
...
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic#15f2d1f is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
com.mysql.jdbc.SQLError.createSQLException(SQLError.java:868)
com.mysql.jdbc.SQLError.createSQLException(SQLError.java:864)
com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:3214)
...
Unfortunately, when I locate my code in the very long stack trace, I don't see any items that are allocated and not disposed, so I'm not really sure what's going on. How can I go about finding which query was not closed in time?
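For reference, the discipline I'm now trying to enforce is to wrap every stream in try-with-resources so that nothing can stay open past its block; a sketch, where the repository, entity, and method names are illustrative rather than my actual code:
import java.util.stream.Stream;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.Repository;
import org.springframework.transaction.annotation.Transactional;

@Entity
class Event {
    @Id @GeneratedValue Long id;
    String payload;
}

interface EventRepository extends Repository<Event, Long> {
    // Stream-returning query methods require an open transaction
    @Query("select e from Event e order by e.id")
    Stream<Event> streamAll();
}

class EventScanner {
    private final EventRepository repo;

    EventScanner(EventRepository repo) {
        this.repo = repo;
    }

    @Transactional(readOnly = true)
    public void scan() {
        // The MySQL streaming ResultSet stays open until the Stream is closed, so
        // no other statement may run on this connection inside the try block.
        try (Stream<Event> events = repo.streamAll()) {
            events.forEach(e -> { /* process each event */ });
        } // closed here; the connection is free for the next query
    }
}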
We are using Spring SimpleJdbcCall to call stored procedures in Oracle that return cursors. It looks like SimpleJdbcCall isn't closing the cursors and after a while the max open cursors is exceeded.
ORA-01000: maximum open cursors exceeded ; nested exception is java.sql.SQLException: ORA-01000: maximum open cursors exceeded spring
There are a few other people on forums who've experienced this, but seemingly no answers. It looks to me like a bug in the Spring/Oracle support.
This bug is critical and could impact our future use of Spring JDBC.
Has anybody come across a fix - either traced the problem to the Spring code or found a workaround that avoids it?
We are using Spring 2.5.6.
Here is the new version of the code, using SimpleJdbcCall, which appears not to be correctly closing the result set that the proc returns via a cursor:
...
SimpleJdbcCall call = new SimpleJdbcCall(dataSource);
Map params = new HashMap();
params.put("remote_user", session.getAttribute("cas_username") );
Map result = call
.withSchemaName("urs")
.withCatalogName("ursWeb")
.withProcedureName("get_roles")
.returningResultSet("rolesCur", new au.edu.une.common.util.ParameterizedMapRowMapper() )
.execute(params);
List roles = (List)result.get("rolesCur");
The older version of the code which doesn't use Spring JDBC doesn't have this problem:
oracleConnection = dataSource.getConnection();
callable = oracleConnection.prepareCall(
"{ call urs.ursweb.get_roles(?, ?) }" );
callable.setString(1, (String)session.getAttribute("cas_username"));
callable.registerOutParameter (2, oracle.jdbc.OracleTypes.CURSOR);
callable.execute();
ResultSet rset = (ResultSet)callable.getObject(2);
... do stuff with the result set
if (rset != null) rset.close(); // Explicitly close the resultset
if (callable != null) callable.close(); //Close the callable
if (oracleConnection != null) oracleConnection.close(); //Close the connection
It would appear that Spring JDBC is NOT calling rset.close(). If I comment out that line in the old code then after load testing we get the same database exception.
After much testing we have fixed this problem. It was a combination of how we were using the Spring framework, the Oracle client, and the Oracle DB. We were creating new SimpleJdbcCalls, each of which used the Oracle JDBC client's metadata calls; the metadata came back as cursors that were never closed and cleaned up. I consider this a bug in the Spring JDBC framework: it fetches the metadata but then does not close the cursor. Spring should copy the metadata out of the cursor and close it properly. I haven't bothered opening a JIRA issue with Spring, because if you follow best practice the bug isn't exhibited.
Tweaking OPEN_CURSORS or any of the other parameters is the wrong way to fix this problem and just delays it from appearing.
We worked around it/fixed it by moving the SimpleJdbcCall into a singleton DAO, so there is only one cursor open for each Oracle proc that we call. These cursors stay open for the lifetime of the app, which I consider a bug. As long as OPEN_CURSORS is larger than the number of SimpleJdbcCall objects, there won't be hassles.
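For what it's worth, here is a sketch of that singleton-DAO workaround, built on the question's own example (the class name and method shape are illustrative):
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.sql.DataSource;
import org.springframework.jdbc.core.simple.SimpleJdbcCall;

// Singleton DAO: the SimpleJdbcCall (and its one-time metadata lookup)
// is created once, not per request
public class RolesDao {
    private final SimpleJdbcCall getRolesCall;

    public RolesDao(DataSource dataSource) {
        this.getRolesCall = new SimpleJdbcCall(dataSource)
                .withSchemaName("urs")
                .withCatalogName("ursWeb")
                .withProcedureName("get_roles")
                .returningResultSet("rolesCur",
                        new au.edu.une.common.util.ParameterizedMapRowMapper());
    }

    @SuppressWarnings("unchecked")
    public List<Map<String, Object>> getRoles(String remoteUser) {
        Map<String, Object> params = new HashMap<String, Object>();
        params.put("remote_user", remoteUser);
        Map<String, Object> result = getRolesCall.execute(params);
        return (List<Map<String, Object>>) result.get("rolesCur");
    }
}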
Well, I got this problem when I was reading BLOBs. The main cause was that I was also updating the table, and the Statement containing the UPDATE clause was not closed automatically. The resulting cursor leak ate all the free cursors; after an explicit call to statement.close(), the error disappeared.
Moral: always close everything explicitly; don't rely on automatic closing when a Statement is garbage collected.
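In current Java the idiom that enforces this is try-with-resources (Java 7+), which closes the Statement, and hence releases its Oracle cursor, even on the exception path; a minimal sketch against an illustrative table:
import java.sql.Connection;
import java.sql.PreparedStatement;

public class CloseDiscipline {
    // Each Statement holds an Oracle cursor; try-with-resources guarantees
    // close() runs even when an exception is thrown mid-update.
    static void touch(Connection conn, long id) throws Exception {
        try (PreparedStatement ps = conn.prepareStatement(
                 "UPDATE docs SET updated = SYSTIMESTAMP WHERE id = ?")) { // illustrative
            ps.setLong(1, id);
            ps.executeUpdate();
        } // cursor released here
    }
}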
Just be careful about setting OPEN_CURSORS to higher and higher values: there are overheads, and it can just band-aid over an actual problem/error in your code.
I don't have experience with the Spring side of this, but I worked on an app where we had many issues with ORA-01000 errors, and constantly adjusting OPEN_CURSORS just made the problem go away for a little while...
I can promise you that it's not Spring. I worked on a Spring 1.x app that went live in 2005 and hasn't leaked a connection since (WebLogic 9.x, JDK 5). You aren't closing your resources properly.
Are you using a connection pool? Which app server are you deploying to? Which version of Spring? Oracle? Java? Details, please.
Oracle OPEN_CURSORS is the key alright. We have a small 24x7 app running against Oracle XE with only a few apparently open cursors. We had intermittent max open cursors errors until we set the OPEN_CURSORS initialization value to > 300
The solution is not in Spring, but in Oracle: you need to set the OPEN_CURSORS initialization parameter to some value higher than the default 50.
Oracle -- at least as of 8i; perhaps it's changed -- would reparse JDBC PreparedStatement objects unless you left them open. This was expensive, and most people ended up maintaining a fixed pool of open statements that were resubmitted.
(Taking a quick look at the 10g docs, they explicitly note that the OCI driver will cache PreparedStatements, so I'm assuming that the native driver still recreates them each time.)
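For what it's worth, current Oracle JDBC drivers can maintain that pool for you via implicit statement caching, where close() returns the statement to a per-connection cache instead of discarding the parsed form. A hedged sketch, with connection details illustrative:
import java.sql.Connection;
import java.sql.PreparedStatement;
import oracle.jdbc.pool.OracleDataSource;

public class StatementCacheDemo {
    public static void main(String[] args) throws Exception {
        OracleDataSource ods = new OracleDataSource();
        ods.setURL("jdbc:oracle:thin:@dbhost:1521/ORCL"); // illustrative
        ods.setUser("scott");
        ods.setPassword("tiger");
        ods.setImplicitCachingEnabled(true);   // cache statements per connection
        try (Connection conn = ods.getConnection()) {
            ((oracle.jdbc.OracleConnection) conn).setStatementCacheSize(50);
            try (PreparedStatement ps = conn.prepareStatement(
                     "SELECT 1 FROM dual")) {
                ps.executeQuery().close();
            } // close() returns the statement to the cache instead of reparsing
        }
    }
}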