Using Spring Boot with Spanner in the Google Cloud environment, we are struggling with performance issues.
To demonstrate this, I set up a small demo case baselining our different approaches to retrieving data from Spanner.
The first approach
uses "native" drivers from Google to instantiate a dbClient and retrieves data like so.
@Repository
public class SpannerNativeDAO implements CustomerDAO {

    private final DatabaseClient dbClient;
    private final String SQL = "select * from customer where customer_id = ";

    public SpannerNativeDAO(
            @Value("${spring.cloud.gcp.spanner.instanceId}") String instanceId,
            @Value("${spring.cloud.gcp.spanner.database}") String dbId,
            @Value("${spring.cloud.gcp.spanner.project-id}") String projectId,
            @Value("${google.application.credentials}") String pathToCredentials)
            throws IOException {
        try (FileInputStream credentialsStream = new FileInputStream(pathToCredentials)) {
            final SpannerOptions spannerOptions = SpannerOptions.newBuilder()
                    .setProjectId(projectId)
                    .setCredentials(ServiceAccountCredentials.fromStream(credentialsStream))
                    .build();
            final Spanner spanner = spannerOptions.getService();
            final DatabaseId databaseId = DatabaseId.of(projectId, instanceId, dbId);
            dbClient = spanner.getDatabaseClient(databaseId);
            // give it a first shot to speed up consecutive calls
            dbClient.singleUse().executeQuery(Statement.of("select 1 from customer"));
        }
    }

    private Customer readCustomerFromSpanner(Long customerId) {
        try {
            Statement statement = Statement.of(SQL + customerId);
            ResultSet resultSet = dbClient.singleUse().executeQuery(statement);
            while (resultSet.next()) {
                return Customer.builder()
                        .customerId(resultSet.getLong("customer_id"))
                        .customerStatus(CustomerStatus.valueOf(resultSet.getString("status")))
                        .updateTimestamp(Timestamp.from(Instant.now()))
                        .build();
            }
        } catch (Exception ex) {
            // log
        }
        return null;
    }
    ....
}
The second approach
uses the Spring Boot Data Spanner starter (https://github.com/spring-cloud/spring-cloud-gcp/tree/master/spring-cloud-gcp-starters/spring-cloud-gcp-starter-data-spanner)
and simply goes like this:
@Repository
public interface SpannerCustomerRepository extends SpannerRepository<Customer, Long> {

    @Query("SELECT customer.customer_id, customer.status, customer.status_info, customer.update_timestamp "
            + "FROM customer customer WHERE customer.customer_id = @arg1")
    List<Customer> findByCustomerId(@Param("arg1") Long customerId);
}
Now with the first approach, establishing the initial gRPC connection to Spanner takes > 5 seconds, and all consecutive calls are around 1 second. With the second approach, each call after the initial one takes only approx. 400 ms.
To test the differences, I wired up both solutions in one Spring Boot project and compared them to an in-memory solution (~100 ms).
All given timings refer to local tests on dev machines, but they go back to investigating performance problems within the cloud environment.
I tested several different SpannerOptions (SessionOptions) with no results, and ran a profiler on the project.
It seems like 96% of the response time comes from establishing a gRPC channel to Spanner, whereas the database itself processes and responds within 5 ms.
We really don't understand this behaviour. We only work with very little test data and a couple of small tables.
The DatabaseClient is supposed to manage the session pool, and it is itself wired into a singleton-scoped repository bean. So sessions should be reused, right?
Why does the first approach take so much longer than the second one? The Spring framework itself simply uses the DatabaseClient as a member within SpannerOperations / SpannerTemplate.
How can we generally reduce the latency? More than 200 ms for a plain response on each DB call is four times more than we would have expected. (I am aware that local timing benchmarks need to be treated with care.)
Tracing gives us good visibility into the client; hopefully it can help you with diagnosing the latencies.
Running the TracingSample, I get the traces in Stackdriver. There are different backends you can use, or you can print the traces out as logs.
The sample above also exports http://localhost:8080/rpcz and http://localhost:8080/tracez, which you can poke around in to check latencies and traces.
A tutorial on setting it up: Cloud Spanner, instrumented by OpenCensus and exported to Stackdriver.
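For reference, a minimal sketch of that setup, assuming the opencensus-java Stackdriver trace exporter and zPages modules are on the classpath; the project id and the always-sample rate are placeholders to adapt:

import io.opencensus.contrib.zpages.ZPageHandlers;
import io.opencensus.exporter.trace.stackdriver.StackdriverTraceConfiguration;
import io.opencensus.exporter.trace.stackdriver.StackdriverTraceExporter;
import io.opencensus.trace.Tracing;
import io.opencensus.trace.config.TraceConfig;
import io.opencensus.trace.samplers.Samplers;

public class TracingSetup {
    public static void main(String[] args) throws Exception {
        // Sample every request while testing; use a lower rate in production.
        TraceConfig traceConfig = Tracing.getTraceConfig();
        traceConfig.updateActiveTraceParams(
                traceConfig.getActiveTraceParams().toBuilder()
                        .setSampler(Samplers.alwaysSample())
                        .build());

        // Export collected spans to Stackdriver Trace ("my-project-id" is a placeholder).
        StackdriverTraceExporter.createAndRegister(
                StackdriverTraceConfiguration.builder()
                        .setProjectId("my-project-id")
                        .build());

        // Serve /rpcz and /tracez locally for ad-hoc latency inspection.
        ZPageHandlers.startHttpServerAndRegisterAll(8080);
    }
}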
The problem here is not related to Spring or DAOs, but to the fact that you are not closing the ResultSet that is returned by the query. This causes the Spanner library to think that the session that is used to execute your query is still in use, and causes the library to create a new session every time you execute a query. This session creation, handling and pooling is all taken care of for you by the client library, but it does require you to close resources when they are no longer being used.
I tested this with a very simple example, and I can reproduce the exact same behavior as what you are seeing by not closing the ResultSet.
Consider the following example:
/**
 * This method will execute the query quickly, as the ResultSet
 * is closed automatically by the try-with-resources block.
 */
private Long executeQueryFast() {
    Statement statement = Statement.of("SELECT * FROM T WHERE ID=1");
    try (ResultSet resultSet = dbClient.singleUse().executeQuery(statement)) {
        while (resultSet.next()) {
            return resultSet.getLong("ID");
        }
    } catch (Exception ex) {
        // log
    }
    return null;
}

/**
 * This method will execute the query slowly, as the ResultSet is
 * not closed and the Spanner library thinks that the session is
 * still in use. Executing this method repeatedly will cause
 * the library to create a new session for each method call.
 * Closing the ResultSet will cause the session that was used
 * to be returned to the session pool, and the sessions will be
 * re-used.
 */
private Long executeQuerySlow() {
    Statement statement = Statement.of("SELECT * FROM T WHERE ID=1");
    try {
        ResultSet resultSet = dbClient.singleUse().executeQuery(statement);
        while (resultSet.next()) {
            return resultSet.getLong("ID");
        }
    } catch (Exception ex) {
        // log
    }
    return null;
}
You should always place ResultSets (and all other AutoCloseables) in a try-with-resources block whenever possible.
Note that if you consume a ResultSet that is returned by Spanner completely, i.e. you call ResultSet#next() until it returns false, the ResultSet is also implicitly closed and the session is returned to the pool. I would however recommend not to rely solely on that, but to always wrap a ResultSet in a try-with-resources.
Can you confirm that the performance doesn't change if the SQL strings are made the same between the two methods? (select * vs. spelling the columns out individually.)
Also, since you're expecting a single customer in the first method, I'm inferring that the customer ID is a key column? If so, you can use the read-by-key methods from SpannerRepository, and that might be faster than a SQL query.
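For illustration, a minimal sketch of what that could look like, assuming customer_id is the primary key and Customer is mapped as a Spanner entity; findById is inherited from Spring Data's CrudRepository, which SpannerRepository extends:

private Customer readCustomerByKey(Long customerId) {
    // findById performs a key-based read rather than a SQL query.
    return spannerCustomerRepository.findById(customerId).orElse(null);
}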
Related
I am trying to build an SP to return a result set from a remote iSeries, and I just can't seem to do it. I can return result sets from the local database, but not from the remote one when I use JT400native.jar (also tried just the jt400.jar) to connect. Does anyone have any idea what I am doing wrong?
My SP is defined like this:
CREATE PROCEDURE MYLIB.TEST
(INOUT I INTEGER)
EXTERNAL NAME 'jproc1.returnTEST'
PARAMETER STYLE JAVA
MODIFIES SQL DATA
DYNAMIC RESULT SETS 1
LANGUAGE JAVA
Here is the Java code behind it, which works; it will return result sets from the local database to the client (Run SQL Scripts in iNavigator):
import java.sql.*;
import com.ibm.db2.app.*;

public class jproc1 {
    public static void returnTEST(int[] myInputInteger, ResultSet[] myResultSet) throws Exception {
        Connection con = DriverManager.getConnection("jdbc:default:connection");
        Statement stmt1 = con.createStatement();
        String sql1 = "select TEST FROM MYLIB.TEST";
        myInputInteger[0] = 5;
        myResultSet[0] = stmt1.executeQuery(sql1);
    }
}
When I change the SP to replace the connection with one to a remote iSeries, it won't return the result set back to the client; it does, however, return the first out variable myInputInteger just fine. I believe I have everything set up correctly, and I have all the jars I need registered. The important part here is that, internally to the Java program, I get the result set from the remote iSeries: I can loop through it, count it, dump it to the IFS. It just won't pass it back to the client (Run SQL Scripts in iNavigator).
import java.sql.*;
import com.ibm.db2.app.*;

public class jproc1 {
    public static void returnTEST(int[] myInputInteger, ResultSet[] myResultSet) throws Exception {
        Class.forName("com.ibm.as400.access.AS400JDBCDriver").newInstance();
        String url = "jdbc:as400://remoteiseries;naming=sql;prompt=false;user=myuser;password=mypass;translate binary=true";
        Connection con = DriverManager.getConnection(url);
        Statement stmt1 = con.createStatement();
        String sql1 = "select TEST FROM MYLIB.TEST";
        myInputInteger[0] = 5;
        myResultSet[0] = stmt1.executeQuery(sql1);
    }
}
So, what am I missing? I have tried a ton of variations, including using the DB2GENERAL parameter style (and the corresponding changes to the Java program per chapter 7 of the DB2 UDB for iSeries manual). No matter what I do, it won't return those remote result sets back to the client, and I don't get any errors.
TIA.
What version of IBM i?
DB2 for IBM i has long been behind in its federation support...
Only recently, with v7.1 TR4, has it become possible to do a
insert into localtbl
select * from remotetbl
But you still can't do
select * from localtbl join remotetbl using (key)
As the system only really supports one DB connection at a time in a job.
Interestingly, this article gives a technique for using a UDTF to get around the one-connection limit. I don't know enough about the internals to understand why a UDTF would work but an SP wouldn't, but you might try changing from an SP to a UDTF.
I am getting a "No operations allowed after statement closed." error, which is very obvious and self-explanatory
as to what is going on with my code. In any case, I am wondering how I can do this in a cleaner way.
public class BaseClass {
    Connection con;
    Statement st;
    String user;
    String password;

    protected void establishDBConnection() throws ClassNotFoundException,
            SQLException {
        Class.forName("com.mysql.jdbc.Driver");
        String cString = "....";
        con = DriverManager.getConnection(cString, user, password);
        st = con.createStatement();
    }

    public BaseClass() {
        try {
            establishDBConnection();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
public class ClassB extends BaseClass {

    public ClassB() {
        super();
    }

    public void doSomething() {
        try {
            String q = "select * from my_table";
            String moreQuery = "update my_table ....";
            String anotherQuery = "do something fancy...";
            ResultSet rs = st.executeQuery(q);
            while (rs.next()) {
                // re-executing on the same Statement closes the ResultSet above
                st.executeUpdate(moreQuery);
                st.executeUpdate(anotherQuery);
            }
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("Error in getAllAssociatesOfMra: " + e);
        }
    }
}
Currently my code is throwing com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after statement closed.
The exception is obvious as to what is going on, but I was wondering how I can go about handling the close in the BaseClass.
Update
I am aware that there are a couple of related questions like mine. The only problem with those questions is that everything is done in the main class. Consider this to be a kind of design/abstraction question.
Your design is not good. You should be getting the connection, preferably from a connection pool, creating the statements at the beginning of your doSomething() method (for example by calling the superclass method), and then closing the Statements and ResultSets when you've done "something".
Before you can make a good design you have to understand what you're trying to accomplish. So I want to establish some goals for this, then look at how the design meets those goals. In order to get the goals for this, let's go over how database connections work.
A database is a separate process; it can be on the same machine or on a different machine where it's accessed over the network. Network connections can go stale due to transient conditions, database downtime, etc. Even if the database is on the same machine and the network is not an issue, it's still bad form to have your application dependent on the state of a separate process with no way to recover; it means that if the database goes down, the application can't recover by itself, and you have to restart it.
Some properties of connections:
They take a while to get initialized, long enough you wouldn't want to create a new one for every user request. (This will not be nearly as big an issue if the database is on the same machine.)
There is a limited number of them, you don't want one user request to take more than necessary because that will limit the number of other users who can connect concurrently.
There's a commit method on your database connection object that lets you group your operations into transactions, so that a) your queries have a consistent view of the data, and b) the operations get applied in an all-or-nothing manner (if an operation fails you don't have half-done junk cluttering up the database that you have to undo). Currently your connections are in autocommit mode, which means each operation is committed separately (and you can't group operations together).
They're synchronized so only one thread at a time can use them. That way multiple users can't corrupt each others' work, but if you have only one connection your application won't scale since every user is waiting in line for the connection.
From all this we can derive some goals for the design. One is that we want to be able to reinitialize database connections that can go bad, we want to have multiple ones available so everybody's not waiting on the same one, and we want to associate a connection with a user request so that we can group the operations for a given user into transactions.
It's hard to tell what your posted code does exactly because it depends on the scope of the objects, which the question leaves unspecified. If these objects are created per-user-request then you will get a new connection every time, which solves the staleness issue, but may be slow. Having a different connection for every table can make the slowness worse, limits the application's concurrency unnecessarily, and also doesn't allow for transactions that group operations on different tables, because you don't have a common connection object to call commit on. How connections get closed is not apparent; it would be wasteful for them to get abandoned to timeout.
A commonly used alternative approach would be to pool the database connections (where the pool can test database connections and replace stale ones) and hand them out to user requests, returning them to the pool when the request is done with them (the pool can wrap a connection in an object where calling close on it returns it to the pool). In the meantime the user thread can execute operations on the connection, create its own statements and close them on the way out, and commit the results.
Spring has a well-thought out way of handling this situation which follows the approach described above (except with a lot more functionality). Some people dislike frameworks for overcomplicating things, but I recommend at least looking at Spring examples. That way you are aware of an alternative viable approach to organizing your code and you can improve your own design.
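To make that concrete, here is a minimal sketch of the pooled approach described above, using HikariCP as an example pool; the JDBC URL, credentials, and pool size are placeholder assumptions, not something from the question:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class PooledDao {
    private final HikariDataSource dataSource;

    public PooledDao() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://localhost/mydb"); // placeholder
        config.setUsername("user");                       // placeholder
        config.setPassword("password");                   // placeholder
        config.setMaximumPoolSize(10);
        dataSource = new HikariDataSource(config);
    }

    public void doSomething() throws SQLException {
        // Borrow a connection for this request; close() returns it to the pool.
        try (Connection con = dataSource.getConnection();
             PreparedStatement st = con.prepareStatement("select * from my_table");
             ResultSet rs = st.executeQuery()) {
            while (rs.next()) {
                // process row
            }
        }
    }
}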
If I understand your question and objective, you will need to create multiple Statement objects in doSomething(), and you need to clean up your Statements and ResultSet in a finally block, with something like:
Statement st = con.createStatement();
String q = "select * from my_table";
String moreQuery = "update my_table ....";
String anotherQuery = "do something fancy...";
ResultSet rs = st.executeQuery(q);
try {
    while (rs.next()) {
        Statement stmt = null;
        try {
            stmt = con.createStatement();
            stmt.executeUpdate(moreQuery);
        } finally {
            if (stmt != null) {
                stmt.close();
            }
        }
        try {
            stmt = con.createStatement();
            stmt.executeUpdate(anotherQuery);
        } finally {
            if (stmt != null) {
                stmt.close();
            }
        }
    }
} finally {
    if (rs != null) {
        rs.close();
    }
    if (st != null) {
        st.close();
    }
}
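On Java 7 and later, the same cleanup can be written more compactly with try-with-resources, which closes each resource automatically in reverse order even when an exception is thrown; a sketch using the same queries:

String q = "select * from my_table";
try (Statement st = con.createStatement();
     ResultSet rs = st.executeQuery(q)) {
    while (rs.next()) {
        // a fresh Statement per iteration, closed automatically as well
        try (Statement stmt = con.createStatement()) {
            stmt.executeUpdate(moreQuery);
            stmt.executeUpdate(anotherQuery);
        }
    }
}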
I suggest a few things:
Use a connection pool design.
To prevent the statement-closed error, use a finally block to close your statements.
Since you run one query after another, use a transaction (commit/rollback) to prevent things being left "half done"; see the sketch below.
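A minimal sketch of the transaction suggestion, using plain JDBC commit/rollback on the connection from the question (moreQuery and anotherQuery are the statements from the original code):

con.setAutoCommit(false);
try {
    try (Statement stmt = con.createStatement()) {
        stmt.executeUpdate(moreQuery);
        stmt.executeUpdate(anotherQuery);
    }
    con.commit(); // apply both changes atomically
} catch (SQLException e) {
    con.rollback(); // undo everything if either statement failed
    throw e;
}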
I'm running what would seem an otherwise simple piece of code. In its simplified form, it looks like this:
public class ReadDB {

    private Connection conn;
    private PreparedStatement myStmt;

    public ReadDB(Connection connection) {
        conn = connection;
    }

    public List<GameEvent> getEvents(int gameId) throws SQLException {
        List<GameEvent> ret = new ArrayList<GameEvent>();
        myStmt = conn.prepareStatement("select * from logs where gameid=? order by id");
        myStmt.setInt(1, gameId);
        myStmt.setQueryTimeout(10); // Wasn't there before, doesn't really help
        ResultSet rs = myStmt.executeQuery();
        while (rs.next()) {
            // Do stuff, using "rs.getString()"
        }
        rs.close();
        myStmt.close();
        return ret;
    }
}
And this is what the database initialization looks like (the connection parameter):
String url = "jdbc:mysql://server.example.com/database_name";
cProperties = new Properties();
cProperties.put("user", user);
cProperties.put("password", password);
// truncate field values that are too long
cProperties.put("jdbcCompliantTruncation", "false");
connection = DriverManager.getConnection(url, cProperties);
Now, my problem is: after calling the getEvents method several times (around 30), executeQuery() will just hang. No exception, no return value, nothing; it just stops there, probably in some kind of loop.
The database is read-only, so there are no INSERTs of any kind. Connecting to the (MySQL) database, show processlist lists the connection as Sleep while the connection time goes up. Of course, I can run the query just fine in a parallel window, but the Java program for some reason cannot. Also, it always hangs on a different gameId, so it's not related to that particular set.
Given that a very similar piece of code used to run just fine, I'm guessing that either I'm not opening/closing the connection the right way, or it's a network-related problem.
Ideas, anyone?
Edit: I updated the code to address some of the comments, still with no positive results. Regarding debugging, the code seems to be stuck at the deepest level in
n = socketRead0(fd, b, off, length, timeout);
inside the read() function of java.net.SocketInputStream. The trace would be: an instance of java.sql.PreparedStatement (the one in the code) calls executeQuery, which calls executeInternal, which calls several MysqlIO functions, the deepest of which is MysqlIO.readFully (called by MysqlIO.nextRowFast). I can't peek inside these functions, but I can see them being called. I suspect, however, that this is too much detail, and that the error must be somewhere else.
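One diagnostic worth trying (an assumption on my part, not a confirmed fix): MySQL Connector/J accepts a socketTimeout connection property in milliseconds, which makes a read that blocks in socketRead0 fail with an exception instead of hanging forever. It can be added to the initialization code from the question:

// Turn a silent hang in socketRead0 into a catchable exception
// (30000 ms is an arbitrary example value).
cProperties.put("socketTimeout", "30000");
connection = DriverManager.getConnection(url, cProperties);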
I have also faced a similar issue, where the program stops and waits at the executeQuery() command.
In my case, the issue was resolved when I did the following:
committed my Oracle database after deleting the table directly from the Oracle
client (Toad).
I'm developing a web application with Play 2.1.0, programming it in Java, and I need access to data already saved in a DB in order to modify it.
I tried to create a new instance without the new operator and reference it to my object saved in the database, but even though there is no pointer error, it won't change the values of the attributes. I couldn't figure out why, so I decided to issue SQL queries directly.
Same thing: there does not seem to be any mistake, but it won't change anything... I think this comes from a bad link to the database.
Here is my code in application.java :
public static Result modifyQuestionnaire(Long id) throws SQLException {
    Statement stmt = null;
    Connection con = DB.getConnection();
    try {
        stmt = con.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_UPDATABLE);
        String query = "SELECT * FROM WOQ.questionnaire WHERE id=id";
        ResultSet uprs = stmt.executeQuery(query);
        uprs.updateString("name", "baba");
        uprs.updateRow();
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (stmt != null) {
            stmt.close();
        }
    }
    return redirect(routes.Application.questionnaire(id));
}
I also tried issuing an UPDATE query directly, still the same.
I've looked everywhere and did not find any solution (except Anorm, but it seems to work only with the Scala language).
Btw, if anyone knows a solution with a second instance that refers to the same object (it seems possible, but as I said, there is no error but no action either), that's fine for me.
Huh, you showed us that you are trying to create a totally new connection, so I supposed that you didn't want to use Ebean; but in case you are already using it, you can just use its methods for the task:
(copied) There are some options in Ebean's API, so you should check it and choose one:
Update<T> - check the sample for the @NamedUpdates annotation
Ebean.createUpdate(beanType, updStatement)
SqlUpdate - you can just perform a raw SQL update, without needing to give the entity type (sketched below)
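For the SqlUpdate option, a minimal sketch, assuming the table and column names from the question; the named parameters are illustrative:

// Raw SQL update through Ebean; :name and :id are named parameters.
String upd = "update questionnaire set name = :name where id = :id";
SqlUpdate update = Ebean.createSqlUpdate(upd);
update.setParameter("name", "baba");
update.setParameter("id", id);
int rows = update.execute(); // number of rows affected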
I have a problem and need some enlightenment here.
I am using triggers to detect changes made to my database, meaning that I have set up all my tables with triggers for insert, update, and delete (MySQL).
I then write each change into a table that I have made specifically to contain all information about the changes. Let's name it xtable. (This table is not equipped with a trigger.)
My Java program needs to continuously read that xtable to let other applications know about the changes.
Well, the problem is, when I read the xtable in a loop, I can only read the initial contents of the xtable, i.e. the rows present when I established the connection to the database. (The connection is established outside the loop.)
If a change is made to the database which leads to a new row in xtable, this new row produced by the trigger is not detected, no matter how many times I read the table by executing a "select * from xtable" query.
The code looks like this:
public static void main(String[] args) throws SQLException {
    Connection conn = null;
    try {
        conn = database.getConnection();
        Statement state = conn.createStatement();
        String query = "select * from `xtable`;";
        while (true) {
            ResultSet rs = state.executeQuery(query);
            while (rs.next()) {
                // Some code for letting the other application know of the change
            }
        }
    } catch (SQLException ex) {
        // log
    } finally {
        if (conn != null) {
            conn.close();
        }
    }
}
So basically, if I run the program while the xtable is empty, I always get an empty ResultSet, even when a new row appears after some time.
Actually, this problem can be solved by establishing the connection inside the loop, but that leads to another problem, because it consumes more and more resources as the loop goes around. (I have already tried this, and it eventually uses up all the resources on my computer after a while, even when I have properly closed everything.)
So can anyone please give me some suggestions on what to do?
This is my first time posting a question here; I am sorry if there is some rule that I don't follow, and please point me in the right direction.
There is such a thing as transaction isolation. It could be that your connection does not see changes because the transaction containing the trigger's writes was not committed, or because you did not start a new transaction on the client side. It is impossible to tell without seeing your database setup.
PS: Message queuing is a way better alternative.
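A sketch of one way to address this, assuming InnoDB's default REPEATABLE READ isolation is the cause (the polling transaction keeps reading the same snapshot): lower the isolation level and/or commit after each poll so the next query starts a fresh snapshot.

// Assumes conn is the long-lived connection from the question's code.
conn.setAutoCommit(false);
conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
while (true) {
    try (Statement state = conn.createStatement();
         ResultSet rs = state.executeQuery("select * from `xtable`")) {
        while (rs.next()) {
            // notify the other application of the change
        }
    }
    conn.commit(); // end the snapshot so the next iteration sees new rows
}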
I think you'd better consider trigger-based notification instead of querying the DBMS in a loop.
If you use a trigger that notifies your application, you don't have to use that while loop on the Java side to check for DB changes.
Instead, the trigger mechanism embedded in the DBMS will notify the Java side when a change happens.
For Oracle, you can call a Java method from PL/SQL.
For PostgreSQL, you can call a Java method from PL/Java.
For CUBRID, you can call a Java method from a Java stored procedure.
For MySQL, you can call a Java method, but I don't think it is as easy as the above.
I hope this link helps you out: http://code.rocksol.it/call-java-from-mysql-trigger
Or google this keyword: "mysql java user defined functions"
Connection connection = getConnection();
String statement = "query";
PreparedStatement stmt = null;
try {
    stmt = connection.prepareStatement(statement);
    ResultSet rs = stmt.executeQuery();
    while (rs.next()) {
        // process each row
    }
} catch (SQLException e) {
    e.printStackTrace();
} finally {
    if (stmt != null) {
        try {
            stmt.close();
        } catch (SQLException e) {
            // ignore
        }
    }
}