How to find unclosed Spring Data JPA streaming result set - java

I'm using the streaming result sets provided by Spring Data JPA's repositories along with MySQL in order to reduce the memory consumption of methods that scan large result sets (which is looking increasingly like a hopelessly vain attempt; in theory the idea of using streams for this is brilliant; in practice, the constraints are very difficult to work with).
If I attempt to start a second query in a thread while a stream produced by a previous query is still unclosed, I get an exception like this:
org.springframework.web.util.NestedServletException: Request processing failed; nested exception is org.hibernate.exception.GenericJDBCException: could not extract ResultSet
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:982)
org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:872)
javax.servlet.http.HttpServlet.service(HttpServlet.java:661)
...
Root cause: org.hibernate.exception.GenericJDBCException: could not extract ResultSet
org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:47)
org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:111)
...
java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@15f2d1f is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.
com.mysql.jdbc.SQLError.createSQLException(SQLError.java:868)
com.mysql.jdbc.SQLError.createSQLException(SQLError.java:864)
com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:3214)
...
Unfortunately, when I locate my code in the very long stack trace, I don't see anything that is allocated and not disposed, so I'm not really sure what's going on. How can I go about finding which query was not closed in time?
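For context, the pattern I believe is required is roughly this minimal sketch (repository, entity, and method names are illustrative, not from my actual code):

// A derived streaming query, declared in the repository as:
//   Stream<MyEntity> streamAllByOrderByIdAsc();
@Transactional(readOnly = true)
public void scanLargeTable() {
    try (Stream<MyEntity> rows = myEntityRepository.streamAllByOrderByIdAsc()) {
        rows.forEach(row -> process(row)); // process() is a placeholder
    } // closing the stream releases the MySQL streaming ResultSet
}

Any stream that escapes such a try-with-resources block keeps the connection's streaming result set open, which is exactly what the exception above complains about, so presumably one of my streams is escaping somewhere.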

Related

Handling exception resulting from transaction timeout when using Spring / Hibernate / Postgres

I'm attempting to provide a useful error message to users of a system where some queries may take a long time to execute. I've set a transaction timeout with Spring's @Transactional(timeout = 5).
This works as expected, but the caller of the annotated method receives a JpaSystemException: could not extract ResultSet exception (caused by GenericJDBCException: could not extract ResultSet, in turn caused by PSQLException: ERROR: canceling statement due to user request).
As far as I can tell, the exception is a result of a statement being cancelled by Hibernate and the JDBC driver once the timeout has been exceeded.
Is there any way I can determine that the exception was a result of the transaction timeout being exceeded so that I can provide an error message to the user about why their query failed?
The application is using Spring framework 4.2.9, Hibernate 4.3.11, and Postgres JDBC driver 9.4.1209. I realise these are quite old. If newer versions make handling this situation easier I would be interested to know.
How about checking the exception message of the cause of the cause against some regex pattern, or just checking whether the message equals the expected string, like this?
exception.getCause().getCause().getMessage().equals("ERROR: canceling statement due to user request")
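Since chaining getCause() twice is brittle (the nesting depth can change across Spring and Hibernate versions), a slightly more robust sketch walks the whole cause chain and checks the SQLState rather than the message text; the helper name is made up:

// Hedged sketch: 57014 is Postgres's SQLState for query_canceled.
static boolean isQueryCancelled(Throwable t) {
    for (Throwable cause = t; cause != null; cause = cause.getCause()) {
        if (cause instanceof java.sql.SQLException
                && "57014".equals(((java.sql.SQLException) cause).getSQLState())) {
            return true;
        }
    }
    return false;
}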

Getting 'It is illegal to run command renameCollection in a multi-document transaction'

I get this exception when I use the code below from Java.
db.getCollection(collectionName).renameCollection(session, new MongoNamespace(db.getName(), newCollectionName));
I went through the MongoDB documentation, which mentions that some operations are restricted in multi-document transactions.
If it's restricted, why does this method take a session as input?
How can I execute this in the middle of a transaction?
I got the same error with listCollectionNames and createCollection as well.
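As a hedged sketch of one common workaround (assuming the rename does not have to be atomic with the rest of the transaction), such restricted commands can be issued outside the session, once the transaction has committed; db, collectionName, and newCollectionName are the variables from the question:

// renameCollection is one of the DDL-style commands MongoDB disallows inside a
// multi-document transaction, so run it without passing the session:
session.commitTransaction(); // finish the transactional work first
db.getCollection(collectionName)
  .renameCollection(new MongoNamespace(db.getName(), newCollectionName));

(The session overload presumably exists because client sessions are also used for causal consistency, not only for transactions.)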

SQL exception which only occurs on one of the three servers

We are having a problem with a prepared statement in Java. The exception seems to be very clear:
Root Exception stack trace:
com.microsoft.sqlserver.jdbc.SQLServerException: The statement must be executed before any results can be obtained.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:170)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.getGeneratedKeys(SQLServerStatement.java:1973)
at org.apache.commons.dbcp.DelegatingStatement.getGeneratedKeys(DelegatingStatement.java:315)
It basically states that we are trying to fetch the query results before the statement has been executed. Sounds plausible. Now, the code which is causing this exception is as follows:
...
preparedStatement.executeUpdate();
ResultSet resultSet = preparedStatement.getGeneratedKeys();
if (resultSet.next()) {
    retval = resultSet.getLong(1);
}
...
As you can see, we fetch the query result after we executed the statement.
In this case, we try to get the generated key from the ResultSet of the INSERT query we just successfully executed.
Problem
We run this code on three different servers (load balanced, in Docker containers). Strangely enough, this exception only occurs on the third Docker server. The other two Docker servers have never run into this exception.
Extra: the failing query is executed approximately 13000 times per day (4500 of those processed by server 3). Most of the time the query works fine on server 3 as well. Sometimes, let's say 20 times per day, the query fails. Always the same query, always the same server. Never one of the other servers.
What we've tried
We checked the software versions, but these are all the same because all servers run the same Docker image.
We updated to the newest Microsoft SQL driver for Java.
We checked that all our PreparedStatements were constructed with the PreparedStatement.RETURN_GENERATED_KEYS parameter.
It looks like some server-configuration-related problem, since the Docker images are all the same. But we can't find the cause. Does anyone have suggestions as to what the problem could be? Or has anyone ever run into this problem as well?
As far as I know, getGeneratedKeys() is not supported by SQL Server in the case of batch execution.
Here is the feature request, which is not satisfied yet: https://github.com/Microsoft/mssql-jdbc/issues/245
My suggestion is that if, for some reason, the insert was occasionally executed as a batch on your third server, this could cause the exception you mentioned (while on the other two only one item was inserted at a time).
You can try logging the SQL statements to check this.
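A hedged illustration of the suspected difference, using a made-up orders table: with the Microsoft driver, getGeneratedKeys() works after a single executeUpdate() but, per the feature request above, not after executeBatch():

// Hypothetical table and column names, for illustration only.
PreparedStatement ps = connection.prepareStatement(
        "INSERT INTO orders (name) VALUES (?)",
        PreparedStatement.RETURN_GENERATED_KEYS);

ps.setString(1, "single");
ps.executeUpdate(); // single execution: generated keys are available
try (ResultSet keys = ps.getGeneratedKeys()) {
    if (keys.next()) {
        long id = keys.getLong(1);
    }
}

ps.setString(1, "batched");
ps.addBatch();
ps.executeBatch(); // batch execution: the next call may throw
ps.getGeneratedKeys(); // "The statement must be executed before any results can be obtained."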

How can I retrieve a long response from MongoDB?

In a Spring MVC + MongoDB application, I have 400k documents. If I need to return 300k documents when a query is made, how can I do that?
Here is the stack trace:
HTTP Status 500 - Request processing failed; nested exception is java.lang.IllegalArgumentException: response too long: 1634887426
org.springframework.web.util.NestedServletException: Request processing failed; nested exception is java.lang.IllegalArgumentException: response too long: 1634887426
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:973)
org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:863)
javax.servlet.http.HttpServlet.service(HttpServlet.java:646)
org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:837)
javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
root cause
java.lang.IllegalArgumentException: response too long: 1634887426
com.mongodb.Response.<init>(Response.java:49)
com.mongodb.DBPort$1.execute(DBPort.java:141)
com.mongodb.DBPort$1.execute(DBPort.java:135)
com.mongodb.DBPort.doOperation(DBPort.java:164)
com.mongodb.DBPort.call(DBPort.java:135)
com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:292)
com.mongodb.DBTCPConnector.call(DBTCPConnector.java:271)
com.mongodb.DBCollectionImpl.find(DBCollectionImpl.java:84)
com.mongodb.DBCollectionImpl.find(DBCollectionImpl.java:66)
com.mongodb.DBCursor._check(DBCursor.java:458)
com.mongodb.DBCursor._hasNext(DBCursor.java:546)
com.mongodb.DBCursor.hasNext(DBCursor.java:571)
org.springframework.data.mongodb.core.MongoTemplate.executeFindMultiInternal(MongoTemplate.java:1803)
org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1628)
org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1611)
org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:535)
com.AnnaUnivResults.www.service.ResultService.getStudentList(ResultService.java:38)
com.AnnaUnivResults.www.service.ResultService$$FastClassBySpringCGLIB$$1f19973d.invoke(<generated>)
org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:711)
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:136)
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:644)
com.AnnaUnivResults.www.service.ResultService$$EnhancerBySpringCGLIB$$f9296292.getStudentList(<generated>)
com.AnnaUnivResults.www.controller.ResultController.searchStudentByCollOrDept(ResultController.java:87)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
java.lang.reflect.Method.invoke(Method.java:597)
org.springframework.web.method.support.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:215)
org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:132)
org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:104)
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandleMethod(RequestMappingHandlerAdapter.java:749)
org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:690)
org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:83)
org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:945)
org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:876)
org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:961)
org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:863)
javax.servlet.http.HttpServlet.service(HttpServlet.java:646)
org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:837)
javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
I guess the above stack trace is because the documents returned are very large. How can I manage this? I changed the Tomcat server memory configuration to 4096M, but I still have the problem.
The math first: you are trying to load more than 1.5 GB in a single query. That will take a while and, except for very rare use cases, points to (sorry) bad application design.
Don't think of how to deal with that from the database side. You should refactor your code.
There are two common scenarios for loading a lot of documents.
Scenario 1: Calculations over the result set
Sometimes you want to do calculations over a large part of your result set. Let's say you want to find out how much turnover all customers from EMEA generated so far and your order documents look like this (simplified for the sake of this example):
{
  _id: <...>,
  customerId: <...>,
  deliveryAddress: { <...> },
  region: "EMEA",
  items: [ {<...>}, {<...>}, ... ],
  total: 12345.78
}
Now, what you could do to a certain extent is to load all the orders from the EMEA region with the equivalent of
db.orders.find({region:"EMEA"})
// the repository method would be something like
// findByRegion(String region)
and iterate over the result set, building a sum of total. This approach has several problems. First of all, even when doing it this way, you load a lot of data you don't need (items, deliveryAddress). So the first way to reduce the amount of data returned by MongoDB is to use projection:
db.orders.find({region:"EMEA"},{_id:0,total:1})
// as of now, you would have to create a custom method
// and a custom repository implementation
// See "Further Reading"
which will give you a lot of documents only containing the total of all orders from EMEA, vastly reducing the size returned from the database. As far as I know, this can't be done using spring-data's dynamic finders (repositories) automagically.
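A hedged sketch of what that custom repository implementation might look like, using MongoTemplate directly; the Order class and the method name are illustrative:

import java.util.List;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;

public class OrderRepositoryImpl implements OrderRepositoryCustom {

    private final MongoTemplate mongoTemplate;

    public OrderRepositoryImpl(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    @Override
    public List<Order> findTotalsByRegion(String region) {
        Query query = Query.query(Criteria.where("region").is(region));
        // Mirror the shell projection {_id:0, total:1}
        query.fields().exclude("_id").include("total");
        return mongoTemplate.find(query, Order.class);
    }
}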
But this approach still has the drawback that it doesn't scale too well, since there might be a point in time where you have more orders from EMEA than you can load in a single transaction. You could use a server side cursor and an iterator (see scenario 2 for details), but this still is a bit awkward.
A far better approach would be to let MongoDB do the calculations. For this, you would use MongoDB's aggregation framework. As for the example, the query would look like
db.orders.aggregate([{$match:{region:"EMEA"}},{$group:{_id:"$region",totalTurnover:{$sum:"$total"}}}])
which would return a single document looking like
{_id:"EMEA",totalTurnover:<very large Sum>}
The advantage is obvious: you keep the load off your application, and you don't need to load all the data, drastically improving performance. And it is scalable.
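For completeness, a hedged sketch of the same aggregation through spring-data-mongodb's aggregation support; the TurnoverResult mapping class and the "orders" collection name are assumptions from this example:

import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import org.springframework.data.mongodb.core.query.Criteria;

// Equivalent of the shell pipeline above: $match on region, then $group with $sum
Aggregation agg = newAggregation(
        match(Criteria.where("region").is("EMEA")),
        group("region").sum("total").as("totalTurnover"));

AggregationResults<TurnoverResult> results =
        mongoTemplate.aggregate(agg, "orders", TurnoverResult.class);
TurnoverResult emea = results.getUniqueMappedResult(); // {_id:"EMEA", totalTurnover:<sum>}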
Scenario 2: You really need a lot of the documents
Even when you really need a lot of the documents, loading them all in one huge result set is bad practice, as that approach isn't scalable, as you found out. A better approach is to request parts of the result set. For this you use server side cursors.
With spring-data-mongodb you would use PagingAndSortingRepository instead of a CrudRepository or any other. Since PagingAndSortingRepository is an extension of CrudRepository, migration should be quite easy. The advantage is that you only request a part of the result set at a given point in time, which makes your query scalable at the cost of manually iterating over it.
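A hedged sketch of that manual iteration, assuming an orderRepository whose interface extends PagingAndSortingRepository<Order, String> (exact factory and method names vary by Spring Data version):

// Walk the result set 1000 documents at a time instead of loading 300k at once.
Pageable pageable = new PageRequest(0, 1000); // PageRequest.of(0, 1000) in Spring Data 2.x+
Page<Order> page;
do {
    page = orderRepository.findAll(pageable);
    for (Order order : page) {
        process(order); // process() is a placeholder
    }
    pageable = pageable.next(); // advance to the next chunk
} while (page.hasNext());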
Further reading
Customization of spring-data-mongodb repositories
Aggregation framework support in spring-data-mongodb
PagingAndSortingRepository in "Core Concepts" of the spring-data-mongodb docs

NoHostAvailableException With Cassandra & DataStax Java Driver If Large ResultSet

The setup:
2-node Cassandra 1.2.6 cluster
replicas=2
very large CQL3 table with no secondary index
Rowkey is a UUID.randomUUID().toString()
read consistency set to ONE
Using DataStax java driver 1.0
The request:
Attempting to do a table scan by "SELECT some-col from schema.table LIMIT nnn;"
The fail:
Once I go beyond a certain nnn LIMIT, I start to get NoHostAvailableExceptions from the driver.
It reads like this:
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered))
at com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:64)
at com.datastax.driver.core.ResultSetFuture.extractCauseFromExecutionException(ResultSetFuture.java:214)
at com.datastax.driver.core.ResultSetFuture.getUninterruptibly(ResultSetFuture.java:169)
at com.jpmc.es.rtm.storage.impl.EventExtract.main(EventExtract.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.181.13.239 ([/10.181.13.239] Unexpected exception triggered))
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:98)
at com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:165)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
Given: This is probably not the most enlightened thing to do to a large table with millions of rows, but this is how I learn what not to do, so I would really appreciate it if someone could explain how this kind of error can be debugged.
For example, when this happens, there are no indications that the nodes in the cluster ever had an issue with the request (there is nothing in the logs on either node indicating any timeout or failure). Also, I enabled tracing on the driver, which gives you some nice autotrace (à la Oracle) info as long as the query succeeds. But in this case the driver throws a NoHostAvailableException and no ExecutionInfo is available, so tracing has not provided any benefit here.
I also find it interesting that this does not seem to be recorded as a timeout (my JMX consoles tell me no timeouts have occurred). So, I am left not understanding WHERE the failure is actually occurring. I am left with the idea that it is the driver that is having a problem, but I don't know how to debug it (and I would really like to).
I have read several posts from folks stating that querying for result sets of more than 10000 rows is probably not a good idea, and I am willing to accept this, but I would like to understand what is causing the exception and where it is happening.
FWIW, I also tried bumping the timeout properties in the cassandra.yaml, but this made no difference whatsoever.
I welcome any suggestions, anecdotes, insults, or monetary contributions for my registration in the house of moron-developers.
Regards!!
My guess (and perhaps others can confirm) is that the query is putting too high a load on the cluster, which is causing the timeout. So, yes, it's a little difficult to debug, as it's not obvious what the root cause was: was the limit I set too large, or is the cluster actually down?
You want to avoid setting large limits on the amount of data you request in a single query, typically by setting a reasonable limit and paging through the results, e.g.,
SELECT * FROM messages WHERE user_id = 101 LIMIT 1000;
SELECT * FROM messages WHERE user_id = 101 AND msg_id > [Last message ID received] LIMIT 1000;
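A hedged sketch of what that manual paging loop can look like with the 1.0 driver; the table and column names come from the two queries above, and msg_id is assumed to be a string (real code should use a prepared statement rather than string concatenation):

String lastId = null;
while (true) {
    String cql = (lastId == null)
            ? "SELECT * FROM messages WHERE user_id = 101 LIMIT 1000"
            : "SELECT * FROM messages WHERE user_id = 101 AND msg_id > '" + lastId + "' LIMIT 1000";
    List<Row> rows = session.execute(cql).all();
    if (rows.isEmpty()) {
        break; // no more pages
    }
    for (Row row : rows) {
        lastId = row.getString("msg_id"); // remember the last key seen
        // process the row here
    }
}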
The automatic paging functionality added in newer versions of the DataStax java-driver (the code examples in this answer are copied from its documentation) is a big improvement, as it removes the need to page manually and lets you do the following:
Statement stmt = new SimpleStatement("SELECT * FROM images");
stmt.setFetchSize(100);               // fetch 100 rows per page from the server
ResultSet rs = session.execute(stmt);
for (Row row : rs) {
    // the driver transparently fetches the next page as you iterate
}
While this won't necessarily solve your problem it will minimise the possibility that it was a "too-big" query.
