java - Multiple update statements in MySQL

So I have a piece of software which downloads about 1.5K game server addresses from my MySQL db. It then pings all of them and uploads information such as online players back to the database. The process looks like this:
Download the server addresses
Ping the servers and collect information
Upload the information back to the database
So far I have been able to solve the part where it downloads the server host names and pings them, but the problem arises when updating the servers.
To update, I thought about using a for loop to construct one BIG string of many update statements and execute it at once, but this is prone to SQL injection. So ideally one would want to use prepared statements.
The SQL UPDATE statement I'm using is:
UPDATE serverlist SET `onlineplayers` = '3', maxplayers = '10',
name = 'A game server' WHERE `ip` = 'xxx.xxx.xxx.xxx' AND `port` = 1234;
So my question is: how can I execute all 1.5K UPDATE statements using parameterized queries?

If you google for "jdbc bulk update" you'll get lots of results like this one or this one.
The latter has an example like this:
try {
    ...
    con.setAutoCommit(false);
    PreparedStatement prepStmt = con.prepareStatement(
        "UPDATE DEPT SET MGRNO=? WHERE DEPTNO=?");
    prepStmt.setString(1, mgrnum1);
    prepStmt.setString(2, deptnum1);
    prepStmt.addBatch();
    prepStmt.setString(1, mgrnum2);
    prepStmt.setString(2, deptnum2);
    prepStmt.addBatch();
    int[] numUpdates = prepStmt.executeBatch();
    for (int i = 0; i < numUpdates.length; i++) {
        if (numUpdates[i] == -2)
            System.out.println("Execution " + i +
                ": unknown number of rows updated");
        else
            System.out.println("Execution " + i +
                " successful: " + numUpdates[i] + " rows updated");
    }
    con.commit();
} catch (BatchUpdateException b) {
    // process BatchUpdateException
}

Sounds like you want to do a batch SQL update. Prepared statements are your friend. Here's an example of using prepared statements in batch:
http://www.mkyong.com/jdbc/jdbc-preparedstatement-example-batch-update/
Using prepared statements makes setting parameters easier, and it allows the DB to perform multiple updates efficiently. Executing multiple SQL strings would work, but would be inefficient, since each SQL string has to be sent to the DBMS, parsed, and compiled before it is executed. With prepared statements the SQL is parsed and compiled once, then reused for subsequent updates with different parameters.

Another important thing to be aware of for MySQL batch updates / inserts is the JDBC connection property rewriteBatchedStatements=true (false by default). Without it, batch mode is useless.
It cost me a day of "fixing a bug" until I found this out.
When you have a small number of rows and a close client-to-DB location (1 ms ping), you can't even tell that you are in "fake batch mode", but when I switched the environment to a remote client (ping = 100 ms) with 100k rows to update, the "batch mode update" took 4 hours with the default rewriteBatchedStatements=false and just 2 minutes with rewriteBatchedStatements=true.
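For example (host, database, and credentials here are placeholders):
String url = "jdbc:mysql://host:3306/db?rewriteBatchedStatements=true";
Connection con = DriverManager.getConnection(url, user, password);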

Create a prepared statement:
String sql = "update serverlist SET onlineplayers = ?, maxplayers = ?, name = ? where ip = ? and port = ?";
PreparedStatement stmt = connection.prepareStatement(sql);
Then loop through your list, and at each iteration, do
stmt.setInt(1, onlinePlayers);
stmt.setInt(2, maxPlayers);
stmt.setString(3, name);
stmt.setString(4, ip);
stmt.setInt(5, port);
stmt.executeUpdate();
For better performance, you could also use batch updates, as sketched below.
Read the JDBC tutorial.
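A sketch of the batched variant of the loop above, assuming a hypothetical Server class holding the ping results in a list named servers (the batch is flushed every 500 rows so it stays bounded in memory):
connection.setAutoCommit(false);
int count = 0;
for (Server s : servers) {      // Server is a hypothetical holder of the ping results
    stmt.setInt(1, s.getOnlinePlayers());
    stmt.setInt(2, s.getMaxPlayers());
    stmt.setString(3, s.getName());
    stmt.setString(4, s.getIp());
    stmt.setInt(5, s.getPort());
    stmt.addBatch();
    if (++count % 500 == 0) {
        stmt.executeBatch();    // flush periodically
    }
}
stmt.executeBatch();            // flush the remainder
connection.commit();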

Related

30 minutes to insert 100k records with JDBC Batch Update while inserting records [duplicate]

I need to insert a couple hundred million records into the MySQL db. I'm batch inserting 1 million at a time. Please see my code below. It seems to be slow. Is there any way to optimize it?
try {
    // Disable auto-commit
    connection.setAutoCommit(false);
    // Create a prepared statement
    String sql = "INSERT INTO mytable (xxx) VALUES (?)";
    PreparedStatement pstmt = connection.prepareStatement(sql);
    Object[] vals = set.toArray();
    for (int i = 0; i < vals.length; i++) {
        pstmt.setString(1, vals[i].toString());
        pstmt.addBatch();
    }
    // Execute the batch
    int[] updateCounts = pstmt.executeBatch();
    System.out.println("inserted " + updateCounts.length);
I had a similar performance issue with mysql and solved it by setting the useServerPrepStmts and the rewriteBatchedStatements properties in the connection url.
Connection c = DriverManager.getConnection("jdbc:mysql://host:3306/db?useServerPrepStmts=false&rewriteBatchedStatements=true", "username", "password");
I'd like to expand on Bertil's answer, as I've been experimenting with the connection URL parameters.
rewriteBatchedStatements=true is the important parameter. useServerPrepStmts is already false by default, and even changing it to true doesn't make much difference in terms of batch insert performance.
Now is the time to explain how rewriteBatchedStatements=true improves performance so dramatically. It does so by rewriting batched prepared INSERT statements into multi-value inserts when executeBatch() is called (source). That means that instead of sending the following n INSERT statements to the MySQL server each time executeBatch() is called:
INSERT INTO X VALUES (A1,B1,C1)
INSERT INTO X VALUES (A2,B2,C2)
...
INSERT INTO X VALUES (An,Bn,Cn)
It would send a single INSERT statement:
INSERT INTO X VALUES (A1,B1,C1),(A2,B2,C2),...,(An,Bn,Cn)
You can observe this by turning on the MySQL general log (SET GLOBAL general_log = 1), which logs every statement sent to the MySQL server to a file.
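For example, on a server where you have sufficient privileges:
SHOW VARIABLES LIKE 'general_log%'; -- shows whether logging is on and where the log file lives
SET GLOBAL general_log = 1;         -- log every statement the server receives
-- run the batch insert, inspect the log, then:
SET GLOBAL general_log = 0;         -- the log grows quickly, so turn it off again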
You can insert multiple rows with one INSERT statement; doing a few thousand at a time can greatly speed things up. That is, instead of doing e.g. 3 inserts of the form INSERT INTO tbl_name (a,b,c) VALUES(1,2,3);, you do INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(1,2,3),(1,2,3);. (It might be that JDBC's .addBatch() does a similar optimization now, though the MySQL addBatch used to be entirely un-optimized and just issued individual queries anyhow; I don't know if that's still the case with recent drivers.)
If you really need speed, load your data from a comma-separated file with LOAD DATA INFILE; we get around a 7-8x speedup doing that vs. doing tens of millions of inserts.
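A minimal example, with a hypothetical file path and a table whose columns match the CSV layout:
LOAD DATA INFILE '/tmp/mytable.csv'
INTO TABLE mytable
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';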
If:
It's a new table, or the amount to be inserted is greater than the data already in the table
There are indexes on the table
You do not need other access to the table during the insert
Then ALTER TABLE tbl_name DISABLE KEYS can greatly improve the speed of your inserts. When you're done, run ALTER TABLE tbl_name ENABLE KEYS to start building the indexes, which can take a while, but not nearly as long as doing it for every insert.
You may try using a DDBulkLoad object.
// Get a DDBulkLoad object
DDBulkLoad bulkLoad = DDBulkLoadFactory.getInstance(connection);
bulkLoad.setTableName("mytable");
bulkLoad.load("data.csv");
try {
    // Disable auto-commit
    connection.setAutoCommit(false);
    int maxInsertBatch = 10000;
    // Create a prepared statement
    String sql = "INSERT INTO mytable (xxx) VALUES (?)";
    PreparedStatement pstmt = connection.prepareStatement(sql);
    Object[] vals = set.toArray();
    int count = 1;
    for (int i = 0; i < vals.length; i++) {
        pstmt.setString(1, vals[i].toString());
        pstmt.addBatch();
        // Flush every maxInsertBatch rows so the batch never grows unbounded
        if (count % maxInsertBatch == 0) {
            pstmt.executeBatch();
        }
        count++;
    }
    // Execute the remaining batch
    pstmt.executeBatch();
    System.out.println("inserted " + count);

SQL Server deadlock when using PreparedStatements

I have a Java servlet application and I'm using a prepared query to update a record in a SQL Server database table.
Let's say I want to execute UPDATE MyTable SET name = 'test' WHERE id = '10'. (Yes, id is a varchar.)
I used the following code to make this happen:
PreparedStatement pstmt = con.prepareStatement("UPDATE MyTable SET name = ? WHERE id = ?");
pstmt.setString(1, getName() );
pstmt.setString(2, getID() );
pstmt.executeUpdate();
I found out, while running a JMeter script to simulate 2 users, that this statement causes a deadlock in my database.
I wanted to check what my values were in the SQL Profiler, so I used the following code to inspect the values:
String query = String.format("UPDATE MyTable SET name = '%s' WHERE id = '%s' ", getName(), getID() );
PreparedStatement pstmt = con.prepareStatement(query);
pstmt.executeUpdate();
Suddenly my deadlock was gone! It's a shame the last approach is vulnerable to SQL injection.
Is there somebody who can tell me what is going on and/or how to fix it?
OK, I finally found the problem and the solution.
It seems that the combination of the jTDS JDBC driver with MSSQL was the 'problem'.
This article explained my situation exactly, and with the help of this FAQ I was able to set the datasource to the right configuration.
From what I understand: if a statement filters on a string-typed index column (like in my situation), the query performs an index SCAN instead of an index SEEK. This causes the whole table to be locked, making it vulnerable to deadlocks.
I hope this will help other people too.
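If the FAQ entry in question is the usual jTDS one about Unicode parameters, the relevant setting is the sendStringParametersAsUnicode connection property (true by default); with it turned off the driver sends plain varchar parameters, so an index on a varchar column can be used for a SEEK instead of a SCAN. A sketch of the URL (server and database names are placeholders):
jdbc:jtds:sqlserver://host:1433/mydb;sendStringParametersAsUnicode=false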

Execute more than one statement at once in JDBC

I am using a MySQL database. The following piece of code creates a record and gets the id from the created record:
insertStmt = connection
    .prepareStatement("INSERT INTO bugs (summary, status, report_date) VALUES (?, ?, ?)");
//...
insertStmt.executeUpdate();
idQuery = connection.prepareStatement("SELECT LAST_INSERT_ID()");
rs = idQuery.executeQuery();
if (rs != null) {
    rs.next();
    return new Long(rs.getLong(1)).toString();
}
Now, if two threads execute this and their execution is interleaved (say, the first thread inserts a record, followed by an insert from the second thread, after which the first thread calls LAST_INSERT_ID()), the id will be incorrect for the first thread because the second thread has already inserted a record.
This could be overcome with synchronization; however, is there a way to execute the two statements in a single database call?
LAST_INSERT_ID works per-connection, and as your question states you can have a race condition if two statements in two threads use the same connection.
You have two ways around this:
1: Use a separate connection per thread (not easy, but this is really the best option for scaling; use connection pooling)
2: Use the form of executeUpdate that records the auto-generated key in the same API call, allowing you to read it back later using getGeneratedKeys. That way you don't have to use LAST_INSERT_ID in a second query, avoiding the race condition. There's a similar form of prepareStatement that you can use with prepared statements.
Option 2 is probably what you want in the short term. The link in option 2 goes straight to that API. This link is a MySQL article outlining how to use it.
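A minimal sketch of option 2, using the question's own INSERT (error handling omitted):
PreparedStatement ps = connection.prepareStatement(
    "INSERT INTO bugs (summary, status, report_date) VALUES (?, ?, ?)",
    Statement.RETURN_GENERATED_KEYS);
// ... set the three parameters ...
ps.executeUpdate();
ResultSet keys = ps.getGeneratedKeys();
if (keys.next()) {
    long id = keys.getLong(1); // the generated key for this statement on this connection
}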
According to https://dev.mysql.com/doc/refman/5.7/en/connector-j-reference-configuration-properties.html, you should be able to add ?allowMultiQueries=true to your JDBC connection string. Then you would be able to pass multiple statements, separated by semicolons, in Statement#execute(String sql) calls.
Edit: or, use a stored procedure that does what you want. Or, as you said, synchronize the Java code.
You can try using a multiquery, combining the INSERT and the SELECT LAST_INSERT_ID() in the same string.
1) Prepare the connection for using multiqueries:
"jdbc:mysql://"+host+"/"+database+"?allowMultiQueries=true"
2) Combine the INSERT query with the SELECT:
multiQuerySqlString = "INSERT INTO bugs (summary, status, report_date) VALUES (1, 2, 3); SELECT LAST_INSERT_ID()";
3) Execute the query and expect multiple result sets:
statement.execute(multiQuerySqlString); // first result: the update count of the INSERT
if (statement.getMoreResults()) {       // advance to the second result set
    ResultSet res = statement.getResultSet();
    res.next();
    long id = res.getLong(1);           // the LAST_INSERT_ID() value
}
I hope it works.
If you have to do this all on a single connection, you can ask the driver to return the generated ID:
insertStmt = connection.prepareStatement("...", Statement.RETURN_GENERATED_KEYS);
insertStmt.executeUpdate();
ResultSet rs = insertStmt.getGeneratedKeys();
Long id = null;
if (rs != null) {
    rs.next();
    id = rs.getLong(1);
}
connection.commit();
return id;
Depending on the driver, you might need a different prepareStatement() call that takes the column names as the second parameter:
insertStmt = connection.prepareStatement("INSERT ", new String[] {"ID"});
But even with the above code, you should be doing the concurrent inserts on different physical connections to be able to properly control your transactions.

JDBC prepared statement, batch insert performance improvement [duplicate]


PreparedStatement performance tuning

Is there any way to improve the performance of prepared statements? It's about many SELECT queries. I run the queries like this:
String query = "SELECT NAME, ADDRESS "
+ "FROM USERS "
+ "where ID = ? "
+ "group by NAME, ADDRESS";
PreparedStatement pstmt = connection.prepareStatement(query);
for(long id: listIDs){
pstmt.setLong(1, id);
ResultSet rs = pstmt.executeQuery();
...
}
The database is MySQL.
It's the server that prepares the queries (that's why you need a connection). To improve the performance of prepared statements you have to tune the DB server itself (indexes, etc.).
Another way is to write queries that fetch only the results you want.
Another idea is to cache on the client side the data you know you'll be using a lot; that way you won't be querying the DB for the same data again and again.
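A minimal sketch of such a cache, assuming a hypothetical UserRow holder and a queryUser(id) helper that runs the prepared statement on a miss:
Map<Long, UserRow> cache = new HashMap<>();               // java.util.Map / HashMap
UserRow row = cache.computeIfAbsent(id, this::queryUser); // hits the DB only on a cache miss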
Two suggestions:
Make sure the ID field is indexed.
Combine many small queries into one, for example by using WHERE ID IN (...).
For a more detailed discussion of the latter, see Batching Select Statements in JDBC.
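A sketch of the second suggestion; the placeholder list has to be built dynamically, since IN (?) binds only a single value (chunk the ID list if it can get very large):
String placeholders = String.join(",", Collections.nCopies(listIDs.size(), "?")); // java.util.Collections
PreparedStatement pstmt = connection.prepareStatement(
    "SELECT NAME, ADDRESS FROM USERS WHERE ID IN (" + placeholders + ") "
    + "GROUP BY NAME, ADDRESS");
int idx = 1;
for (long id : listIDs) {
    pstmt.setLong(idx++, id);
}
ResultSet rs = pstmt.executeQuery();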
You might also want to investigate whether your JDBC driver supports statement caching. I know Oracle's JDBC driver does.
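For MySQL Connector/J, client-side prepared statement caching is controlled by connection properties; the values below are illustrative:
jdbc:mysql://host:3306/db?cachePrepStmts=true&prepStmtCacheSize=250&prepStmtCacheSqlLimit=2048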
