Improve the speed of insert in mysql - java

I want to insert 700 million rows into a table that is defined as follows:
CREATE TABLE KeywordIndex (id INT PRIMARY KEY AUTO_INCREMENT,
keyValue VARCHAR(45) NOT NULL, postings LONGTEXT NOT NULL);
To insert data into the table, I first check whether the keyValue already exists. If it does, I update the postings column by concatenating the new value onto the old one; otherwise, I insert the data as a new row. Also, if postings grows beyond the size allowed by its definition, I write the overflow as an additional row for that keyValue. With my current implementation, inserting 70,294 entries took 12 hours!
(I am not a database expert, so the code I've written could be based on wrong assumptions. Please help me understand my mistakes :) )
I read this page but could not find a solution to my problem.
Here is the code I wrote for this process:
public void writeTermIndex(
        HashMap<String, ArrayList<TermPosting>> finalInvertedLists) {
    try {
        for (String key : finalInvertedLists.keySet()) {
            int exist = ExistTerm("KeywordIndex", key);
            ArrayList<TermPosting> currentTermPostings = finalInvertedLists.get(key);
            if (exist > 0) {
                String postings = null;
                String query = "select postings from KeywordIndex where keyValue=?";
                PreparedStatement preparedStmt = conn.prepareStatement(query);
                preparedStmt.setString(1, key);
                ResultSet rs = preparedStmt.executeQuery();
                if (rs.next())
                    postings = rs.getString("postings");
                postings = postings + convertTermPostingsToString(currentTermPostings);
                if (getByteSize(postings) > 65530)
                    insertUpdatePostingList("KeywordIndex", key, postings);
                else
                    updatePosting("KeywordIndex", key, postings);
                // release the result set and statement before moving to the next key
                rs.close();
                preparedStmt.close();
            } else {
                String postings = convertTermPostingsToString(currentTermPostings);
                if (getByteSize(postings) > 65530)
                    insertPostingList("KeywordIndex", key, postings);
                else
                    insetToHashmap("KeywordIndex", key, postings);
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

You should think about using executeBatch() for the inserts (I'm not talking about the load part of your request). Depending on your database, performance can change a lot (see the benchmark at the end of this page; I once tested it with an Oracle database).
Something like:
PreparedStatement statement = null;
try {
    statement = getConnection().prepareStatement(insertQuery);
    for (/* ... */) {
        statement.clearParameters();
        statement.setString(1, "Hi");
        statement.addBatch();
    }
    statement.executeBatch();
} catch (SQLException se) {
    // Handle exception
} finally {
    // Close everything
}
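Applied to the question's table, a rough sketch of what a batched write could look like. This is only an illustration, not the asker's code: it assumes a UNIQUE index on keyValue so that MySQL's INSERT ... ON DUPLICATE KEY UPDATE can replace the SELECT-then-UPDATE round trip, it reuses finalInvertedLists and convertTermPostingsToString from the question, it leaves out the 65530-byte overflow handling, and the batch size of 1000 is arbitrary:
conn.setAutoCommit(false);                               // group many rows per commit
String sql = "INSERT INTO KeywordIndex (keyValue, postings) VALUES (?, ?) "
           + "ON DUPLICATE KEY UPDATE postings = CONCAT(postings, VALUES(postings))";
try (PreparedStatement ps = conn.prepareStatement(sql)) {
    int count = 0;
    for (Map.Entry<String, ArrayList<TermPosting>> e : finalInvertedLists.entrySet()) {
        ps.setString(1, e.getKey());
        ps.setString(2, convertTermPostingsToString(e.getValue()));
        ps.addBatch();
        if (++count % 1000 == 0) {                       // flush every 1000 rows
            ps.executeBatch();
        }
    }
    ps.executeBatch();                                   // flush the remainder
    conn.commit();
}
With MySQL Connector/J, setting rewriteBatchedStatements=true on the JDBC URL often lets the driver rewrite such batches into multi-row inserts, which can help further.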

Related

In a Java/MySQL application, can an SQL UPDATE behave as an SQL INSERT?

In a database application, while testing an update button for a JTable that displays a database table's contents, I got a surprising result: the save either created a new row/record or threw a duplicate-primary-key exception when the row/record key was kept the same, even though the UPDATE statement and its syntax were perfectly clear. I asked here before and was guided to provide a "Minimal, Complete, and Verifiable example", so I moved those updating lines into a new runnable Java class, which worked very well (I couldn't spot the difference). Here is that counterpart example with hard-coded parameters:
import java.sql.*;
import java.util.logging.Level;
import java.util.logging.Logger;

public class QueryTesting
{
    public static void main(String[] args)
    {
        try
        {
            // create a java mysql database connection
            Class.forName("com.mysql.jdbc.Driver");
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost/?verifyServerCertificate=false&useSSL=true",
                    "username", "password")) {
                Statement s = null;
                try {
                    s = conn.createStatement();
                } catch (SQLException ex) {
                    Logger.getLogger(QueryTesting.class.getName()).log(Level.SEVERE, null, ex);
                }
                int result;
                try {
                    result = s.executeUpdate("USE myDatabase");
                } catch (SQLException ex) {
                    Logger.getLogger(QueryTesting.class.getName()).log(Level.SEVERE, null, ex);
                }
                // create the java mysql update preparedstatement
                String query = "update relativesTable set idRelativeMembers = ? , Name = ? , Picture = ? , RelationDegree = ? , persons_idPersons = ? where idRelativeMembers = ?";
                PreparedStatement preparedStmt = conn.prepareStatement(query);
                preparedStmt.setString(1, "00002/10");
                preparedStmt.setString(2, "عبد الحفيظ أحمد عبد الفتاح الدؤري");
                preparedStmt.setBinaryStream(3, null);
                preparedStmt.setString(4, "إبن عم");
                preparedStmt.setString(5, "00002");
                preparedStmt.setString(6, "00002/10");
                // execute the java preparedstatement
                preparedStmt.executeUpdate();
                preparedStmt.close();
                s.close();
            }
            catch (Exception e)
            {
                System.err.println("Got an exception! ");
                System.err.println(e.getMessage());
            }
        } catch (ClassNotFoundException e)
        {
            System.err.println("Got an exception! ");
            System.err.println(e.getMessage());
        }
    }
}
The update code block in the original application includes a System.out.println() hint to confirm that the update conditional branch is the one being executed:
// the mysql update statement
String selectedMemberPrimaryKey = getMembersWithoutPhotos().get(jTable5.convertRowIndexToView(jTable5.getSelectedRow())).getId() ;
System.out.println("This is an update query");
String updateStatement = "update relativesTable set idRelativeMembers = ? , Name = ? , Picture = ? , RelationDegree = ? , persons_idPersons = ? where idRelativeMembers = ?";
// create the mysql update preparedstatement
preparedStmt = conn.prepareStatement(updateStatement);
preparedStmt.setString(1, jTextField2.getText());
while the insert branch is marked with a similar hint:
// the mysql insert statement
System.out.println("This is an insert query");
String insertStatement = "insert into relativesTable (idRelativeMembers, Name, Picture, RelationDegree, persons_idPersons) values ( ?, ?, ?, ?, ?)";
// create the mysql insert preparedstatement
preparedStmt = conn.prepareStatement(insertStatement);
preparedStmt.setString(1, jTextField2.getText());
The database table is related to another table through the persons_idPersons foreign key. So far I cannot figure out where this behavior comes from; is there any condition that could make a JDBC SQL UPDATE statement behave like an INSERT?
NEW NOTION:
I have tried preparing the update query a different way, with the values hard-coded into the statement:
System.out.println("This is an update query");
String updateStatement = "update relativesTable set idRelativeMembers = '00002/20' , Name =' أحمد عبد البارئ الثبنيتي ', Picture = null , RelationDegree =' عم ', persons_idPersons = '00002' where idRelativeMembers = '00002/10'";
// create the mysql update preparedstatement
preparedStmt = conn.prepareStatement(updateStatement);
and it worked very well!
Here is the whole UPDATE block in the JButton that saves data to the database (I have commented out the entire INSERT block in that button, yet the UPDATE still shows this INSERT-like behavior):
try
{
    // the mysql update statement
    String selectedRelateivesMemberPrimaryKey = getMembersWithoutPhotos()
            .get(jTable5.convertRowIndexToView(jTable5.getSelectedRow())).getId();
    System.out.println("This is an update query");
    String updateStatement = "update relativesTable set idRelativeMembers = ? , Name = ? , Picture = ? , RelationDegree = ? , persons_idPersons = ? where idRelativeMembers = ?";
    // create the mysql update preparedstatement
    preparedStmt = conn.prepareStatement(updateStatement);
    preparedStmt.setString(1, jTextField2.getText());
    selectedRelativeMemberForUpdateId = jTextField2.getText();
    preparedStmt.setString(2, jTextField4.getText());
    if (null == jLabel17.getIcon())
    {
        preparedStmt.setBinaryStream(3, null);
    } else
    {
        Icon icon = jLabel17.getIcon();
        ImageIcon img = (ImageIcon) icon;
        BufferedImage bI = new BufferedImage(img.getIconWidth(), img.getIconHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics g = bI.createGraphics();
        icon.paintIcon(null, g, 0, 0);
        g.dispose();
        ImageIO.write(bI, "jpg", new File("personRelativeMemTempImage.jpg"));
        personRelativeMemImageFileForDatabase = new File("personRelativeMemTempImage.jpg");
        personRelativeMemImageFileInputStream = new FileInputStream(personRelativeMemImageFileForDatabase);
        preparedStmt.setBinaryStream(3, (InputStream) personRelativeMemImageFileInputStream,
                (int) personRelativeMemImageFileForDatabase.length());
    }
    preparedStmt.setString(4, jTextField15.getText());
    preparedStmt.setString(5, selectedPersonForRelativeMembersId);
    preparedStmt.setString(6, selectedRelateivesMemberPrimaryKey);
    // execute the preparedstatement
    preparedStmt.execute();
    try {
        preparedStmt.close();
    } catch (SQLException ex) {
        Logger.getLogger(GUI.class.getName()).log(Level.SEVERE, null, ex);
    }
}
catch (FileNotFoundException | SQLException e)
{
    System.err.println("Got an exception!");
    System.err.println(e.getMessage());
    String exceptionMessage = e.getMessage();
    if (exceptionMessage.contains(exceptionStringPartOne) && exceptionMessage.contains(exceptionStringPartTwo))
    {
        JOptionPane.showMessageDialog(null, "Duplicate Keys", "error", JOptionPane.ERROR_MESSAGE);
        duplicatePrimaryKeyFlag = true;
    } else
    {
        JOptionPane.showMessageDialog(null, "A Related Database Error", "error", JOptionPane.ERROR_MESSAGE);
        return;
    }
} catch (IOException ex) {
    Logger.getLogger(GUI.class.getName()).log(Level.SEVERE, null, ex);
}
No. You need an INSERT statement to insert a new row; an UPDATE only acts on existing records.
It is possible that a BEFORE/AFTER UPDATE trigger defined on the table performs the INSERT, but it would still be an INSERT that creates the new record, however it is invoked.
You could check for a trigger with:
select * from `information_schema`.`triggers`
where event_object_schema = 'myDatabase'
and event_object_table = 'relativesTable'\G
However, a more likely explanation is an error in your application logic that runs the insert path rather than the update path, or both.
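One quick way to tell whether the update path ran, and whether it matched anything, is to log the affected-row count. A small diagnostic sketch, not part of the original code; it assumes the question's preparedStmt and key variable are in scope:
// executeUpdate() returns how many rows the UPDATE matched;
// 0 means the WHERE clause found no record, so the new row must come from elsewhere.
int affected = preparedStmt.executeUpdate();
System.out.println("UPDATE affected " + affected + " row(s) for key " + selectedRelateivesMemberPrimaryKey);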
Preamble:
The application form lets the user select a person from a persons JTable and then select one of that person's relative members from a relatives JTable; selecting a relative member brings that member's row data into JTextFields to be modified and saved with a save button. What was happening: updating a member's data while keeping the same key threw a duplicate-primary-key exception, and entering a new id for the key made a new member appear in the members JTable.
Behind the behavior:
Regardless of the overall state of this application's code, the problem was the superfluous chain of method calls on this line:
String selectedMemberPrimaryKey = getMembersWithoutPhotos().get(jTable5.convertRowIndexToView(jTable5.getSelectedRow())).getId() ;
It was enough to read the key straight from the relative members JTable in order to update the correspondingly keyed record (via the id column) in the relative members database table:
String selectedMemberPrimaryKey = jTable5.getValueAt(jTable5.convertRowIndexToView(jTable5.getSelectedRow()), 0).toString();
This mismatch retrieved a different key and therefore pointed the UPDATE at a different record. Updating that wrong record (key column and all) with the desired record's data produced the duplicate-primary-key exception, and updating it with the data of a brand-new relative member made the "new" record appear in the desired person's members JTable, because a member belonging to another (wrong) person had been overwritten.
So, in a Java/MySQL application an SQL UPDATE cannot directly behave as an SQL INSERT, but it can leave INSERT-like effects, as in this case, when the wrong record is overwritten and effectively re-created as a new one.
Thank you all for caring, viewing and trying to answer.

Getting inserted (or existing) ids with executeBatch()

I am trying to insert some words into a database and return the newly inserted id, or the existing id if the word is already in the database.
I found that I can do this using a PreparedStatement with Statement.RETURN_GENERATED_KEYS, but that approach is terribly slow, and I need to insert around 5,000 words at once. Another way I could achieve it is by running an individual query in a for loop:
public ArrayList<Integer> addWords(ArrayList<String[]> allTermsForTag) {
    ArrayList<Integer> ids = new ArrayList<Integer>();
    ResultSet rs = null;
    try {
        Statement st = connection.createStatement();
        for (String[] articleTerms : allTermsForTag) {
            for (String term : articleTerms) {
                String query = "WITH a AS (INSERT INTO tag (name) SELECT '" + term + "' WHERE NOT EXISTS (SELECT name FROM tag WHERE name = '" + term + "') " +
                        "RETURNING id) SELECT id FROM a UNION SELECT id FROM tag WHERE name = '" + term + "'";
                rs = st.executeQuery(query);
                while (rs.next()) {
                    int id = rs.getInt(1);
                    ids.add(id);
                    System.out.printf("id: " + id);
                }
            }
        }
        rs.close();
        st.close();
    } catch (SQLException e) {
        System.out.println("SQL exception was raised while performing SELECT: " + e);
    }
    return ids;
}
This does what I need nicely, but it is too slow as well.
Another method I wrote uses executeBatch(); however, it does not return ids:
public ArrayList<Integer> addWords(ArrayList<String[]> allTermsForTag) {
    ResultSet rs = null;
    ArrayList<Integer> ids = new ArrayList<Integer>();
    try {
        Statement st = connection.createStatement();
        for (String[] articleTerms : allTermsForTag) {
            for (String term : articleTerms) {
                String query = "WITH a AS (INSERT INTO tag (name) SELECT '" + term + "' WHERE NOT EXISTS (SELECT name FROM tag WHERE name = '" + term + "') " +
                        "RETURNING id) SELECT id FROM a UNION SELECT id FROM tag WHERE name = '" + term + "'";
                st.addBatch(query);
            }
            st.executeBatch();
            rs = st.getGeneratedKeys();
            while (rs.next()) {
                int id = rs.getInt(1);
                ids.add(id);
            }
        }
        st.close();
        return ids;
    } catch (SQLException e) {
        System.out.println("SQL exception was raised while performing batch INSERT: " + e.getNextException());
        System.out.println("dub");
    }
    return null;
}
So the question is: how do I get the ids when using executeBatch(), or if that is not possible, how should I approach this problem? It needs to be as fast as possible, because there will be a lot of INSERT operations with large amounts of data.
Thank you!
Set<Long> set = new HashSet<>();
try {
    PreparedStatement ps = cn.prepareStatement("delete from myTable where... ",
            Statement.RETURN_GENERATED_KEYS);
    ps.setInt(1, 200);
    ps.setInt(2, 262);
    ps.setString(3, "108gf99");
    ps.addBatch();
    ps.setInt(1, 200);
    ps.setInt(2, 250);
    ps.setString(3, "hgfha");
    ps.addBatch();
    ps.executeBatch();
    ResultSet rs = ps.getGeneratedKeys();
    while (rs.next()) {
        set.addAll(Collections.singleton(rs.getLong(1)));
    }
    System.out.println(set);
} catch (SQLException e) {
    e.printStackTrace();
}
executeBatch can return generated keys in the latest PgJDBC versions. See issue 195 and pull 204. You must use the prepareStatement variant that takes a String[] of returned column names.
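For example, a minimal sketch of that variant (the tag table and id column are taken from the question, and this only covers the plain-insert case; whether getGeneratedKeys() is actually populated after executeBatch() depends on the PgJDBC version, as noted above):
// Request the returned column(s) by name instead of Statement.RETURN_GENERATED_KEYS.
PreparedStatement ps = connection.prepareStatement(
        "INSERT INTO tag (name) VALUES (?)",
        new String[] { "id" });
for (String term : terms) {              // 'terms' stands in for the question's word list
    ps.setString(1, term);
    ps.addBatch();
}
ps.executeBatch();
try (ResultSet keys = ps.getGeneratedKeys()) {
    while (keys.next()) {
        ids.add(keys.getInt(1));         // one generated id per inserted row
    }
}
ps.close();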
However... take a step back here. The solution isn't loops. The solution is almost never loops.
In this case, you should almost certainly use COPY via the PgJDBC CopyManager API to COPY data into a TEMPORARY table. Then do an INSERT INTO ... SELECT ... RETURNING ... to insert the temp table's contents into the final table and return any generated fields. You can also do a SELECT to join on the temp table to return any that already exist. This is basically a bulk upsert or closely related bulk insert-if-not-exists.
If for some reason you can't do that, the next-best option is probably multi-valued INSERTs with large VALUES lists, but this requires some ugly dynamic SQL. Since you need existing values if the row already exists you'll probably need a writeable CTE too. So really, just use COPY and a query to do the table merge.
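A rough, self-contained sketch of that COPY-into-a-temp-table approach. This is not the answerer's code: it assumes the PostgreSQL JDBC driver is on the classpath, that tag.name has a unique constraint, and that the staging-table name and queries are free to choose; the ids come back in no particular order:
import org.postgresql.PGConnection;
import org.postgresql.copy.CopyManager;

import java.io.StringReader;
import java.sql.*;
import java.util.ArrayList;
import java.util.List;

public class BulkTagLoader {

    // Returns the ids of all given words (newly inserted or pre-existing).
    public static List<Integer> addWords(Connection conn, List<String> words) throws Exception {
        List<Integer> ids = new ArrayList<>();
        conn.setAutoCommit(false);                       // keep the temp table alive until commit
        try (Statement st = conn.createStatement()) {
            // 1. Staging table that vanishes when the transaction ends.
            st.execute("CREATE TEMPORARY TABLE tag_stage (name text) ON COMMIT DROP");

            // 2. Stream all words in with COPY (assumes the words contain no tabs,
            //    newlines or backslashes; escape them for real data).
            StringBuilder sb = new StringBuilder();
            for (String w : words) {
                sb.append(w).append('\n');
            }
            CopyManager copy = conn.unwrap(PGConnection.class).getCopyAPI();
            copy.copyIn("COPY tag_stage (name) FROM STDIN", new StringReader(sb.toString()));

            // 3. Insert only the missing names, then read back ids for all of them.
            st.execute("INSERT INTO tag (name) " +
                       "SELECT s.name FROM tag_stage s " +
                       "WHERE NOT EXISTS (SELECT 1 FROM tag t WHERE t.name = s.name)");
            try (ResultSet rs = st.executeQuery(
                    "SELECT t.id FROM tag t JOIN tag_stage s ON s.name = t.name")) {
                while (rs.next()) {
                    ids.add(rs.getInt(1));
                }
            }
            conn.commit();
        }
        return ids;
    }
}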

Using Java, what is the best way to update a column in every row with values from an array? (SQLite)

Today is my first day using SQLite. I am using Java to interact with an SQLite database that contains fields called ID, NAME, and CITY. I would like to take every record in the database and replace the CITY field with a value from an array. Here is what I tried, but I realized right away it was wrong: I believe the query replaces every record three times, which is why every CITY field ends up as 'Compton'. I am not sure what a good or efficient way to do this would be.
public void update(String cities[]) throws SQLException {
    PreparedStatement updateCity = null;
    Connection con = null;
    String updateString = "update SUPPLIERS set CITY = ?";
    try {
        Class.forName("org.sqlite.JDBC");
        con = DriverManager.getConnection("jdbc:sqlite:suppliers.db");
        con.setAutoCommit(false);
        updateCity = con.prepareStatement(updateString);
        for (int i = 0; i < cities.length; i++) {
            updateCity.setString(1, cities[i]);
            updateCity.executeUpdate();
            con.commit();
        }
    } catch (Exception e) {
        e.printStackTrace();
        if (con != null) {
            try {
                System.err.print("Transaction is being rolled back");
                con.rollback();
            } catch (SQLException excep) {
                excep.printStackTrace();
            }
        }
    } finally {
        if (updateCity != null) {
            updateCity.close();
        }
        con.setAutoCommit(true);
        con.close();
    }
}
I was calling the method like this: instance.update(new String[]{"san diego", "los angeles", "Compton"});. I would like to know how to do this with a PreparedStatement if possible, but if that is not the best way to go, please post an alternative suggestion.
Note: this is not my code; it is taken from the SQLite Java tutorials.
Your update statement will update every record in the table each time it is called. If you want the statement to update only one record at a time, you will need to change your update statement to something like this:
update SUPPLIERS set CITY = ? where ID = ?
If you want to update all records, you'll need to execute a query to get all of the IDs. A query like this should work:
select ID from SUPPLIERS
Then for each ID returned, call the update statement using that ID and whatever city you wish to update the record with.
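Putting the two statements together, a sketch of that loop might look like the following (it reuses con and cities from the question's method, assumes java.util.List/ArrayList are imported and autoCommit is off, and simply pairs the i-th returned ID with the i-th city, since the question doesn't say how IDs should map to cities):
// Hedged sketch: read all IDs first, then update CITY row by row, keyed by ID.
List<Integer> ids = new ArrayList<>();
try (Statement sel = con.createStatement();
     ResultSet rs = sel.executeQuery("select ID from SUPPLIERS")) {
    while (rs.next()) {
        ids.add(rs.getInt("ID"));
    }
}
try (PreparedStatement upd = con.prepareStatement(
        "update SUPPLIERS set CITY = ? where ID = ?")) {
    for (int i = 0; i < ids.size() && i < cities.length; i++) {
        upd.setString(1, cities[i]);     // next city from the array
        upd.setInt(2, ids.get(i));       // the row to update
        upd.executeUpdate();
    }
    con.commit();                        // single commit for all updates
}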

Child records rarely wind up with wrong foreign key using java batch statements / arraylists

I have an app that logs a lot of data to a database. I write records to a primary table and when necessary, some extra data is then written to a separate table with a foreign key pointing to the primary table.
The dilemma originally was that for performance reasons, the primary table data was written in batches. I had to find a way to write the "child" data after the primary record batches were complete.
However, I'm noticing that in a small percentage of cases (less than 5% of the time) the child records are written with incorrect foreign keys: they end up with the wrong parent.
Essentially, code runs through a queue of objects that contain data which will be written to a mysql database:
conn.setAutoCommit(false);
s = conn.prepareStatement("insert into ... ");
while (!queue.isEmpty()) {
    SomeObject a = queue.poll();
    s.setString(1, a.someValue);
    // ... etc, prepare the remaining parameters
    s.addBatch();
    extraDataQueue.add(a);
}
s.executeBatch();
insertExtraData(extraDataQueue, s.getGeneratedKeys());
conn.commit();
We have a separate table for extra data that's only needed for about a third of the records written to the primary table. The solution I came up with was to keep an array of all objects actually used in writing the data to the primary table.
Then I could match those objects with the returned keys, so I can find the correct foreign key when saving the "extra" data:
/**
 * Writes the extra/child rows for the primary rows that were just inserted.
 *
 * @param extraDataQueue the objects whose extra data should be written
 * @param keys the generated keys returned for the primary-table batch
 * @throws SQLException
 */
protected void insertExtraData(ArrayList<SomeObject> extraDataQueue, ResultSet keys) throws SQLException {
    if (extraDataQueue.isEmpty()) return;
    PreparedStatement s = null;
    Connection conn = null;
    try {
        conn = Database.conn();
        conn.setAutoCommit(false);
        s = conn.prepareStatement("INSERT INTO data_extra ...");
        int i = 0;
        while (keys.next()) {
            SomeObject a = extraDataQueue.get(i);
            s.setString(/* stuff */);
            s.addBatch();
            i++;
        }
        s.executeBatch();
        conn.commit();
    } catch (SQLException e) {
        e.printStackTrace();
    } finally {
        if (s != null) try { s.close(); } catch (SQLException e) {}
        if (conn != null) try { conn.close(); } catch (SQLException e) {}
    }
}
The extraDataQueue ArrayList was originally a class-level private field, which could have caused some overlap when a lot of data was being recorded, so I've moved it to a local variable created fresh each time the "queue dump" runs. I hope that will reduce or solve the issue, but I'm trying to find out:
Is there a better way to write child table data when using batches?
Am I missing something about ArrayLists or returned keys that could produce a different order than I'm expecting? I've seen that with HashMaps, but never with ArrayLists.

ResultSet.next() not working with PreparedStatement

I am trying to figure out why ResultSet.next() is never true in my Java code after I execute a SQL query that should return results from an Oracle 11g table. It seems as though the code does not read the returned ResultSet's contents correctly when using a PreparedStatement on a java.sql.Connection. Any help appreciated; here are the details:
Table:
CREATE TABLE "SHANDB"."ABSCLOBS"
( "ID" NUMBER,
"XMLVAL" "XMLTYPE",
"IDSTRING" VARCHAR2(20 BYTE)
)
Data:
INSERT INTO absclobs VALUES (1,
    xmltype('<?xml version="1.0"?>
    <EMP>
      <EMPNO>221</EMPNO>
      <ENAME>John</ENAME>
    </EMP>'), '1');
INSERT INTO absclobs VALUES (2,
    xmltype('<?xml version="1.0"?>
    <PO>
      <PONO>331</PONO>
      <PONAME>PO_1</PONAME>
    </PO>'), '2');
Java code I am running to get values from the above to test the code:
public static void main(String[] args) throws Exception {
    try {
        String url = "jdbc:oracle:thin:@//localhost:1521/xe";
        String driver = "sun.jdbc.odbc.JdbcOdbcDriver";
        String user = "shandb";
        String password = "test";
        Class.forName(driver);
        connection = DriverManager.getConnection(url, user, password);
        String selectID1 = "SELECT a.xmlval.getClobval() AS poXML FROM absclobs a where idstring=? and id=? ";
        PreparedStatement preparedStatement = connection.prepareStatement(selectID1);
        preparedStatement.setString(1, "1");
        preparedStatement.setInt(2, 1);
        rowsUpdated = preparedStatement.executeQuery();
        while (rowsUpdated.next()) {
            String clobxml = rowsUpdated.getString(1);
            System.out.println(clobxml);
        }
    } catch (ClassNotFoundException cnfe) {
        System.err.println(cnfe);
    } catch (SQLException sqle) {
        System.err.println(sqle);
    } finally {
        System.out.println("Rows affected: " + rowsUpdated);
        connection.close();
    }
}
This part of the above code is never run, which I don't understand:
while(rowsUpdated.next()){
String clobxml = rowsUpdated.getString(1);
System.out.println(clobxml);
}
... however the final print statement shows that the ResultSet is not empty:
Rows affected: oracle.jdbc.driver.OracleResultSetImpl#15f157b
Does anyone know why I can't display the actual retrieved XML clob contents, and/or why the while block above is never true?
Thanks :)
Your diagnostics are incorrect - this:
Rows affected: oracle.jdbc.driver.OracleResultSetImpl#15f157b
doesn't show that the result set is non-empty. It just shows that the value of rowsUpdated is a reference to an instance of oracle.jdbc.driver.OracleResultSetImpl, which doesn't override toString(). That can very easily be empty.
I suspect the problem is just that your WHERE clause doesn't match any records. For the sake of diagnostics, I suggest you change it to just:
String selectID1 = "SELECT a.xmlval.getClobval() AS poXML FROM absclobs a";
(and get rid of the parameter-setting calls, of course). That way you should be able to see all your table's values. You can then work on discovering why your WHERE clause wasn't working as expected.
(As an aside, it's not clear why you haven't declared connection or rowsUpdated in the code in the question. They should definitely be local variables...)
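A minimal sketch of that diagnostic step (connection comes from the question's code; the extra columns and the row counter are additions so you can see what idstring and id actually contain, not part of the original answer):
// Dump everything in the table and count the rows for real, unlike toString().
String selectAll = "SELECT a.idstring, a.id, a.xmlval.getClobval() AS poXML FROM absclobs a";
try (PreparedStatement ps = connection.prepareStatement(selectAll);
     ResultSet rs = ps.executeQuery()) {
    int rows = 0;
    while (rs.next()) {
        rows++;
        System.out.println("idstring=" + rs.getString(1)
                + ", id=" + rs.getInt(2)
                + ", xml=" + rs.getString(3));
    }
    System.out.println("Rows returned: " + rows);
}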
