Optimization reading from database - java

I need to check whether about 2,000 records exist in a database. I tested it on a database with 1,000,000 rows and it takes 46 s. That is too long, because in the future this database may hold more than 500,000,000 records. Is there any way to speed up the search? I use JDBC in Java; here's the code:
public int search(List<String> toSearch) throws SQLException {
String query = "SELECT * FROM strings where string = ?";
StringBuilder sB = new StringBuilder(query);
for (int i=0; i<toSearch.size()-1; i++) {
sB.append(" OR string = ?"); // leading space keeps the clauses from running together
}
System.out.println(toSearch.size());
PreparedStatement prep = con.prepareStatement(sB.toString());
int i=1;
for (String string : toSearch) {
prep.setString(i, string);
i++;
}
long data = System.currentTimeMillis();
ResultSet resultSet = prep.executeQuery();
long data2 = System.currentTimeMillis();
System.out.println((data2 - data) / 1000);
List<String> toReturn = new ArrayList<>();
while (resultSet.next()) {
toReturn.add(resultSet.getString("string"));
}
return toReturn.size();
}
Table name is strings, column string.

First of all, you need an index on the string column.
CREATE INDEX indexName ON strings(string);
Do not construct queries with variable number of arguments for H2. Use
PreparedStatement ps = con.prepareStatement("SELECT * FROM strings WHERE string = ANY(?)");
ps.setObject(1, toSearch.toArray(new String[0]));
If you use some old unsupported version of H2, use its TABLE() function instead as described in documentation distributed with your version (Prepared Statements and IN(...) section in PDF).
With other DBMS you may need other tricks to avoid dynamic generation of SQL code, = ANY(?) will work in PostgreSQL and its forks and in H2, but not in other DBMS.

Related

How to batch process contents of a database table?

I want to do the following:
Connect to a Postgres database and select the contents of a particular table 2 rows at a time
Take the 2 rows, convert them to CSV strings and send that string over the wire to a RESTful API endpoint
The problem I am having is that in order to process the ResultSet and convert it into a CSV String, I need to call rs.next() but in doing that, I am pulling the next 'batch' from the database. Here's the code I have so far:
public void load() throws Exception {
Connection conn = connectToDB();
conn.setAutoCommit(false);
Statement st = conn.createStatement();
st.setFetchSize(2);
ResultSet rs = st.executeQuery("SELECT * FROM friends");
String csv = resultSetToCSV(rs);
System.out.println("CSV output is: " + csv);
rs.close();
st.close();
}
private static String resultSetToCSV(ResultSet rs) throws Exception {
int colCount = rs.getMetaData().getColumnCount();
StringBuilder result = new StringBuilder();
while (rs.next()) {
for (int i = 0; i < colCount; i++) {
if (i > 0) result.append(','); // separate fields with commas
result.append("\"").append(rs.getString(i + 1)).append('"');
}
result.append(System.lineSeparator());
}
return result.toString();
}
How can I get this to work so that each "batch" of 2 rows is processed and converted to a CSV string? I want to do this because I am going to be running this on a very large table which won't fit into memory (I am just testing it out with a batch size of 2).
You can set the fetch size with ResultSet.setFetchSize(int) but generally you should not concern yourself with the entire table fitting in memory and let the driver do the job.
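One way to get fixed-size batches is to keep a single pass over the ResultSet and flush a buffer every N rows, so the fetch size only controls how many rows the driver pulls per round trip. The sketch below assumes that approach. The class CsvBatcher, its methods, and the Consumer<String> sink (standing in for the REST call) are hypothetical names for illustration, not part of JDBC.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.function.Consumer;

/** Hypothetical helper: collects CSV lines and hands them to a sink every batchSize rows. */
final class CsvBatcher {
    private final int batchSize;
    private final Consumer<String> sink;
    private final StringBuilder buffer = new StringBuilder();
    private int rowsInBuffer = 0;

    CsvBatcher(int batchSize, Consumer<String> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Quotes one value, doubling embedded quotes; null becomes an empty field. */
    static String quote(String value) {
        if (value == null) {
            return "";
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    /** Appends one row and flushes once the batch is full. */
    void addRow(String[] values) {
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                buffer.append(',');
            }
            buffer.append(quote(values[i]));
        }
        buffer.append('\n');
        if (++rowsInBuffer == batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered (possibly a short final batch) to the sink. */
    void flush() {
        if (rowsInBuffer == 0) {
            return;
        }
        sink.accept(buffer.toString());
        buffer.setLength(0);
        rowsInBuffer = 0;
    }

    /** Walks the ResultSet once, emitting one CSV string per batch. */
    static void drain(ResultSet rs, int batchSize, Consumer<String> sink) throws SQLException {
        int colCount = rs.getMetaData().getColumnCount();
        CsvBatcher batcher = new CsvBatcher(batchSize, sink);
        String[] row = new String[colCount];
        while (rs.next()) {
            for (int i = 0; i < colCount; i++) {
                row[i] = rs.getString(i + 1);
            }
            batcher.addRow(row);
        }
        batcher.flush();
    }
}
```

Calling CsvBatcher.drain(rs, 2, csv -> send(csv)) would hand each two-row CSV string to a send method of your own (also an assumed name), so only one batch is held in the buffer at a time.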

How can I get the full query that a PreparedStatement is about to execute? [duplicate]

I have a general Java method with the following method signature:
private static ResultSet runSQLResultSet(String sql, Object... queryParams)
It opens a connection, builds a PreparedStatement using the sql statement and the parameters in the queryParams variable length array, runs it, caches the ResultSet (in a CachedRowSetImpl), closes the connection, and returns the cached result set.
I have exception handling in the method that logs errors. I log the sql statement as part of the log since it's very helpful for debugging. My problem is that logging the String variable sql logs the template statement with ?'s instead of actual values. I want to log the actual statement that was executed (or tried to execute).
So... Is there any way to get the actual SQL statement that will be run by a PreparedStatement? (Without building it myself. If I can't find a way to access the PreparedStatement's SQL, I'll probably end up building it myself in my catches.)
Using prepared statements, there is no "SQL query" :
You have a statement, containing placeholders
it is sent to the DB server
and prepared there
which means the SQL statement is "analysed", parsed, some data-structure representing it is prepared in memory
And, then, you have bound variables
which are sent to the server
and the prepared statement is executed -- working on those data
But there is no re-construction of an actual real SQL query -- neither on the Java side, nor on the database side.
So, there is no way to get the prepared statement's SQL -- as there is no such SQL.
For debugging purpose, the solutions are either to :
Output the code of the statement, with the placeholders and the list of data
Or to "build" some SQL query "by hand".
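Building the query "by hand" for a log line could look roughly like the sketch below. It is an illustration only: SqlDebug and render are made-up names, the quoting rules are a simplification rather than any driver's actual behaviour, and the resulting string is meant for reading in a log, never for execution.

```java
/** Hypothetical debugging helper: fills ? placeholders with readable values. */
final class SqlDebug {
    private SqlDebug() {
    }

    /** Returns the template with each ? outside a quoted literal replaced by a value. */
    static String render(String sql, Object... params) {
        StringBuilder out = new StringBuilder();
        int next = 0;              // index of the next parameter to substitute
        boolean inLiteral = false; // true while scanning inside '...'
        for (int i = 0; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (c == '\'') {
                inLiteral = !inLiteral;
            }
            if (c == '?' && !inLiteral && next < params.length) {
                out.append(literal(params[next++]));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Numbers and booleans are printed bare, everything else as a quoted string. */
    private static String literal(Object value) {
        if (value == null) {
            return "NULL";
        }
        if (value instanceof Number || value instanceof Boolean) {
            return value.toString();
        }
        return "'" + value.toString().replace("'", "''") + "'";
    }
}
```

Placeholders without a matching value are left as ?, which makes a mismatch between the template and the argument list visible in the log.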
It's nowhere defined in the JDBC API contract, but if you're lucky, the JDBC driver in question may return the complete SQL when you just call PreparedStatement#toString(). I.e.
System.out.println(preparedStatement);
At least the MySQL 5.x and PostgreSQL 8.x JDBC drivers support it. However, most other JDBC drivers don't. If yours is one of those, your best bet is using Log4jdbc or P6Spy.
Alternatively, you can also write a generic function which takes a Connection, a SQL string and the statement values and returns a PreparedStatement after logging the SQL string and the values. Kickoff example:
public static PreparedStatement prepareStatement(Connection connection, String sql, Object... values) throws SQLException {
PreparedStatement preparedStatement = connection.prepareStatement(sql);
for (int i = 0; i < values.length; i++) {
preparedStatement.setObject(i + 1, values[i]);
}
logger.debug(sql + " " + Arrays.asList(values));
return preparedStatement;
}
and use it as
try {
connection = database.getConnection();
preparedStatement = prepareStatement(connection, SQL, values);
resultSet = preparedStatement.executeQuery();
// ...
Another alternative is to implement a custom PreparedStatement which wraps (decorates) the real PreparedStatement on construction and overrides all the methods so that it calls the methods of the real PreparedStatement, collects the values in all the setXXX() methods, and lazily constructs the "actual" SQL string whenever one of the executeXXX() methods is called (quite some work, but most IDEs provide autogenerators for decorator methods; Eclipse does). Finally, just use it instead. That's also basically what P6Spy and consorts already do under the hood.
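A lighter-weight variant of the decorator idea is a dynamic proxy, which avoids writing every delegating method by hand. The sketch below only records the bound values and does not build the final SQL string. RecordingStatements and wrap are hypothetical names, and the rule "a set method with an int first argument is a parameter setter" is an assumption that happens to fit the standard PreparedStatement interface.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.util.Map;

/** Hypothetical helper: wraps a PreparedStatement and records the values bound to it. */
final class RecordingStatements {
    private RecordingStatements() {
    }

    /**
     * Returns a proxy that forwards every call to target and stores each
     * parameter index and value in bound (pass a TreeMap for ordered logs).
     */
    static PreparedStatement wrap(PreparedStatement target, Map<Integer, Object> bound) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            // Parameter setters take (int index, value, ...); setFetchSize and
            // similar one-argument setters are skipped by the length check.
            if (name.startsWith("set") && args != null && args.length >= 2
                    && args[0] instanceof Integer) {
                bound.put((Integer) args[0], name.equals("setNull") ? null : args[1]);
            } else if (name.equals("clearParameters")) {
                bound.clear();
            }
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the driver's own exception
            }
        };
        return (PreparedStatement) Proxy.newProxyInstance(
                RecordingStatements.class.getClassLoader(),
                new Class<?>[] {PreparedStatement.class},
                handler);
    }
}
```

In an error handler, the SQL template and the contents of the bound map can then be logged together.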
I'm using Java 8, JDBC driver with MySQL connector v. 5.1.31.
I may get real SQL string using this method:
// 1. make connection somehow, it's conn variable
// 2. make prepared statement template
PreparedStatement stmt = conn.prepareStatement(
"INSERT INTO oc_manufacturer" +
" SET" +
" manufacturer_id = ?," +
" name = ?," +
" sort_order=0;"
);
// 3. fill template
stmt.setInt(1, 23);
stmt.setString(2, "Google");
// 4. print sql string
System.out.println(((JDBC4PreparedStatement)stmt).asSql());
So it returns something like this:
INSERT INTO oc_manufacturer SET manufacturer_id = 23, name = 'Google', sort_order=0;
If you're executing the query and expecting a ResultSet (you are in this scenario, at least) then you can simply call ResultSet's getStatement() like so:
ResultSet rs = pstmt.executeQuery();
String executedQuery = rs.getStatement().toString();
The variable executedQuery will contain the statement that was used to create the ResultSet.
Now, I realize this question is quite old, but I hope this helps someone..
I've extracted my SQL from the PreparedStatement using preparedStatement.toString(). In my case toString() returns a String like this:
org.hsqldb.jdbc.JDBCPreparedStatement@7098b907[sql=[INSERT INTO
TABLE_NAME(COLUMN_NAME, COLUMN_NAME, COLUMN_NAME) VALUES(?, ?, ?)],
parameters=[[value], [value], [value]]]
Now I've created a method (Java 8), which is using regex to extract both query and values and put them into map:
private Map<String, String> extractSql(PreparedStatement preparedStatement) {
Map<String, String> extractedParameters = new HashMap<>();
Pattern pattern = Pattern.compile(".*\\[sql=\\[(.*)],\\sparameters=\\[(.*)]].*");
Matcher matcher = pattern.matcher(preparedStatement.toString());
while (matcher.find()) {
extractedParameters.put("query", matcher.group(1));
extractedParameters.put("values", Stream.of(matcher.group(2).split(","))
.map(line -> line.replaceAll("(\\[|])", ""))
.collect(Collectors.joining(", ")));
}
return extractedParameters;
}
This method returns map where we have key-value pairs:
"query" -> "INSERT INTO TABLE_NAME(COLUMN_NAME, COLUMN_NAME, COLUMN_NAME) VALUES(?, ?, ?)"
"values" -> "value, value, value"
Now - if you want values as list you can just simply use:
List<String> values = Stream.of(yourExtractedParametersMap.get("values").split(","))
.collect(Collectors.toList());
If your preparedStatement.toString() is different than in my case it's just a matter of "adjusting" regex.
Using PostgreSQL 9.6.x with official Java driver 42.2.4:
...myPreparedStatement.execute...
myPreparedStatement.toString()
Will show the SQL with the ? already replaced, which is what I was looking for.
Just added this answer to cover the postgres case.
I would never have thought it could be so simple.
Code snippet to convert a SQL PreparedStatement template and its list of arguments into a single string. It works for me:
/**
*
* formatQuery Utility function which substitutes the arguments into the SQL template
*
* @param sql the SQL with ? placeholders
* @param arguments the values to substitute, in order
* @return the formatted query
*/
public static String formatQuery(final String sql, Object... arguments) {
if (arguments == null || arguments.length == 0) {
return sql;
}
String query = sql;
int count = 0;
while (query.matches("(.*)\\?(.*)")) {
query = query.replaceFirst("\\?", "{" + count + "}");
count++;
}
String formatedString = java.text.MessageFormat.format(query, arguments);
return formatedString;
}
Very late :) but you can get the original SQL from an OraclePreparedStatementWrapper by
((OraclePreparedStatementWrapper) preparedStatement).getOriginalSql();
I implemented the following code for printing the SQL from a PreparedStatement. JDBC offers no way to read bound values back from the statement, so the values are passed in alongside the SQL:
public void printSqlStatement(String sql, Object... values) {
try {
Pattern pattern = Pattern.compile("\\?");
Matcher matcher = pattern.matcher(sql);
StringBuffer sb = new StringBuffer();
int indx = 0; // index of the next value to substitute
while (indx < values.length && matcher.find()) {
// quoteReplacement keeps '$' and '\' in a value from being read as group references
matcher.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf(values[indx++])));
}
matcher.appendTail(sb);
System.out.println("Executing Query [" + sb.toString() + "] with Database[" + "] ...");
} catch (Exception ex) {
System.out.println("Executing Query [" + sql + "] with Database[" + "] ...");
}
}
If you're using MySQL you can log the queries using MySQL's query log. I don't know if other vendors provide this feature, but chances are they do.
A simple function:
public static String getSQL (Statement stmt){
String tempSQL = stmt.toString();
//please cut everything before sql from statement
//javadb...:
int i1 = tempSQL.indexOf(":")+2;
tempSQL = tempSQL.substring(i1);
return tempSQL;
}
It works for a PreparedStatement as well.
I'm using Oracle 11g and couldn't manage to get the final SQL from the PreparedStatement. After reading @Pascal MARTIN's answer I understand why.
I just abandoned the idea of using a PreparedStatement and used a simple text formatter, which fitted my needs. Here's my example:
//I jump to the point after connexion has been made ...
java.sql.Statement stmt = cnx.createStatement();
String sqlTemplate = "SELECT * FROM Users WHERE Id IN ({0})";
String sqlInParam = "21,34,3434,32"; //some random ids
String sqlFinalSql = java.text.MessageFormat.format(sqlTemplate, sqlInParam);
System.out.println("SQL : " + sqlFinalSql);
rsRes = stmt.executeQuery(sqlFinalSql);
The sqlInParam string can of course be built dynamically in a loop; I kept it simple here to get to the point, which is using the MessageFormat class as a string template formatter for the SQL query.
You can try to use javaagent to print SQL:
public class Main {
private static final String mybatisPath = "org.apache.ibatis.executor.statement.PreparedStatementHandler";
private static final String mybatisMethod = "parameterize";
private static final String sqlPath = "java.sql.Statement";
public static void premain(String arg, Instrumentation instrumentation) {
instrumentation.addTransformer(new ClassFileTransformer() {
@Override
public byte[] transform(
ClassLoader loader,
String className,
Class<?> classBeingRedefined,
ProtectionDomain protectionDomain,
byte[] classfileBuffer) throws IllegalClassFormatException {
if (!mybatisPath.replaceAll("\\.", "/").equals(className)) {
return null;
}
ClassPool pool = new ClassPool();
pool.appendClassPath(new LoaderClassPath(loader));
pool.appendSystemPath();
try {
CtClass ctClass = pool.get(mybatisPath);
CtMethod method = ctClass.getDeclaredMethod(mybatisMethod, new CtClass[]{pool.get(sqlPath)});
method.insertAfter("cn.wjhub.Main.printSQL($1);");
return ctClass.toBytecode();
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
});
}
/**
* printSQL
*
* @param statement statement
*/
public static void printSQL(Statement statement) {
String sqlSource = statement.toString();
System.out.println(sqlSource);
}
}
To do this you need a JDBC Connection and/or driver that supports logging the sql at a low level.
Take a look at log4jdbc

JDBC with MySQL - SELECT ... IN

Using PreparedStatement to build a query that looks like this...
SELECT * FROM table1 WHERE column1 IN ('foo', 'bar')
...without knowing the number of strings in the IN statement
Constructing a string like...
"'foo', 'bar'"
...and passing that in with ps.setString() results in:
"\'foo\', \'bar\'"
Which is probably a good thing, but it makes this approach to my problem useless.
Any ideas on how to pass in an unknown number of values into a JDBC PreparedStatement without dynamically creating the query string too (this query lives in a file for easy reuse and I'd like to keep it that way)?
I tend to use a method that modifies the query to add the required number of placeholders. This is a basic example that omits error handling for simplicity:
public String addDynamicParameters(String query, List<Object> parameters) {
StringBuilder queryBuilder = new StringBuilder(query);
queryBuilder.append("?");
for (int i = 1; i < parameters.size(); i++) {
queryBuilder.append(", ?");
}
queryBuilder.append(") ");
return queryBuilder.toString();
}
public void addParameters(PreparedStatement pstmt, List<Object> parameters) {
int i = 1;
for(Object param : parameters) {
pstmt.setObject(i++, param);
}
}
public void testDynamicParameters() {
String query = "SELECT col3 FROM tableX WHERE col1 = ? AND col2 IN (";
List<Object> parametersForIn = ...;
query = addDynamicParameters(query, parametersForIn);
List<Object> parameters = ...;
PreparedStatement pstmt = ...; //using your Connection object...
parameters.addAll(parametersForIn);
addParameters(pstmt, parameters);
//execute prepared statement...
//clean resources...
}

Building a String parameter (with SQL content) from resource file for SQL PreparedStatement

I need to execute a SQL PreparedStatement in Java using jdbc.
I'm facing problems with one of the parameters because it has SQL content and also Strings from a resource file.
It looks something like this:
Required SQL:
SELECT * FROM Table T WHERE T.value = 10 AND T.display IN ('Sample1', 'Sample2')
In the above query, the Sample1 and Sample2 values must be passed through a parameter to a PreparedStatement.
PreparedStatement:
SELECT * FROM Table T WHERE T.value = 10 ?
In my application code I'm setting the parameters like:
statement.setString(1, "AND T.display IN ('Sample1', 'Sample2')");
However this is not returning the appropriate results.
Is there a better way to build this particular parameter considering it has SQL content and Strings too?
EDIT:
Sample1, Sample2 etc. are strings that are retrieved from an external file at run-time and there can be different number of these strings each time. I.e. there can be only one string Sample1 or multiple strings Sample1, Sample2, Sample3, etc..
EDIT2:
Database being used is Oracle.
The ? placeholder can only be used in a position where a value is expected in the query. Having a ? in any other position (as in your question: WHERE T.value = 10 ?) is simply a syntax error.
In other words: it is not possible to parametrize part of the query itself as you are trying to do; you can only parametrize values. If you need to add a dynamic number of parameters, you will need to construct the query dynamically by adding the required number of parameters and using setString(). For example:
StringBuilder sb = new StringBuilder(
"SELECT * FROM Table T WHERE T.value = 10 AND T.display IN (?");
// Note: intentionally starting at 1, first parameter already above
// Assuming always at least 1 parameter
for (int i = 1; i < params.length; i++) {
sb.append(", ?");
}
sb.append(')');
try (
PreparedStatement pstmt = con.prepareStatement(sb.toString())
) {
for (int i = 0; i < params.length; i++) {
pstmt.setString(i + 1, params[i]);
}
try (
ResultSet rs = pstmt.executeQuery();
) {
// Use resultset
}
}
Use this as the PreparedStatement (without a trailing semicolon, which Oracle rejects over JDBC)
"SELECT * FROM Table T WHERE T.value = 10 AND T.display IN (?, ?)"
and then call
statement.setString(1, "Sample1");
statement.setString(2, "Sample2");
before executing the statement.
Update:
String generateParamString(int params) {
StringBuilder sb = new StringBuilder("(");
for (int i = 1; i < params; i++) {
sb.append("?, ");
}
sb.append("?)");
return sb.toString();
}
List<String> samples = ... // your list with samples.
String stmtString = "SELECT * FROM Table T WHERE T.value = 10 AND T.display IN "
+ generateParamString(samples.size());
// generate statement with stmtString
for (int i = 0; i < samples.size(); i++) {
statement.setString(i + 1, samples.get(i));
}
// execute statement...

JAVA Writing to Access database and retrieving an index

I'm writing data from Java to an Access database on Windows 32 bit. When I write a record, I need to retrieve the row ID / primary key so that I can a) update the record easily if I want to and b) cross reference other data to that record.
When I did something similar in C, I could make an updatable cursor which allowed me to write a new record and simultaneously retrieve the row ID. With Java, it looks as though I should be able to do this, but the following code throws an exception.
con = openAccessDatabase();
String selectString = "SELECT ID, RunCount FROM SpeedTable";
try {
PreparedStatement selectStatement = con.prepareStatement(selectString,
ResultSet.TYPE_SCROLL_INSENSITIVE,
ResultSet.CONCUR_UPDATABLE);
ResultSet idResult = selectStatement.executeQuery();
int id;
for (int i = 0; i < nWrites; i++) {
idResult.moveToInsertRow();
idResult.updateObject(1, null); // this line makes no difference whatsoever !
idResult.updateInt(2, i);
idResult.insertRow(); // throws java.sql.SQLException: [Microsoft][ODBC Microsoft Access Driver]Error in row
id = idResult.getInt(1);
}
selectStatement.close();
} catch (SQLException e) {
e.printStackTrace();
}
The only thing I've been able to do is to write a new record and then run a different query to get the Row id back ...
String insertString = "INSERT INTO SpeedTable (RunCount) VALUES (?)";
String idString = "SELECT ID FROM SpeedTable ORDER BY ID DESC";
//
try {
ResultSet idResult = null;
PreparedStatement preparedStatement, idStatement;
preparedStatement = con.prepareStatement(insertString,
ResultSet.TYPE_FORWARD_ONLY,
ResultSet.CONCUR_READ_ONLY);
idStatement = con.prepareStatement(idString,
ResultSet.TYPE_FORWARD_ONLY,
ResultSet.CONCUR_READ_ONLY);
for (int i = 0; i < nWrites; i++) {
// write the data into the database
preparedStatement.setInt(1, i);
preparedStatement.execute();
// re-run the query to get the index back from the database.
idResult = idStatement.executeQuery();
idResult.next();
int lastIndex = idResult.getInt(1);
idResult.close();
}
This works but becomes impossibly slow when the table has more than a few tens of thousands of records in it. There is also a risk of returning the wrong ID if two parts of the program start writing at the same time (unlikely but not impossible).
I know that at least one suggestion will be to either not use Java or not use Access, but they are not options. It's also part of a free open source software package, so I'm reluctant to pay for anything. Writing my own C JNI interface which provides the basic functionality that I need for my application is even less appealing.
Not sure if this works for MS Access but you can try:
st.executeUpdate("INSERT INTO SpeedTable (RunCount) VALUES (1000)", Statement.RETURN_GENERATED_KEYS);
ResultSet rs = st.getGeneratedKeys();
rs.next();
long id = rs.getLong(1);
