So basically, I would like to avoid stored procedures, but at the same time I would'nt want multiple round-trips to database to execute sequential statements.
Apparently this blog says Facebook uses mysql's multiple-statement-queries. Unfortunately, its a C API, is there a java equivalent of it?
So in brief, the question "is in java+mysql how can a second jdbc statement use the output of the first statement as input to execute, without a round-trip to database and without a storedproc" ?
If not how do other people approach this problem?
Yes, the JDBC driver for MySQL support the multi-statement queries. It is however disabled by default for security reasons, as multi-statement queries significantly increase the risks associated with eventual SQL injections.
To turn on multi-statement queries support, simply add the allowMultiQueries=true option to your connection string (or pass the equivalent option in map format). You can get more information on that option here: https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-configuration-properties.html.
Once this option enabled, you can simply execute a call similar to: statement.execute("select ... ; select ... ; select ..."). Returned ResultSets can be iterated from the Statement object: each call to statement.getResultSet() return the next one. Use statement.getMoreResults() to determine if there are indeed more ResultSet available.
It sounds like you want to do batch processing.
here is a duplicate question with an good answer:
How to execute multiple SQL statements from java
I have looked into various places, and have heard a lot of dubious claims, ranging from PreparedStatement should be preferred over Statement everywhere, even if only for the performance benefit; all the way to claims that PreparedStatements should be used exclusively for batched statements and nothing else.
However, there seems to be a blind spot in (primarily online) discussions I have followed. Let me present a concrete scenario.
We have an EDA-designed application with a DB connection pool. Events come, some of them require persistence, some do not. Some are artificially generated (e.g. update/reset something every X minutes, for example).
Some events come and are handled sequentially, but other types of events (also requiring persistence) can (and will) be handled concurrently.
Aside from those artificially generated events, there is no structure in how events requiring persistence arrive.
This application was designed quite a while ago (roughly 2005) and supports several DBMSes. The typical event handler (where persistence is required):
get connection from pool
prepare sql statement
execute prepared statement
process the result set, if applicable, close it
close prepared statement
prepare a different statement, if necessary and handle the same way
return connection to pool
If an event requires batch processing, the statement is prepared once and addBatch/executeBatch methods are used. This is an obvious performance benefit and these cases are not related to this question.
Recently, I have received an opinion, that the whole idea of preparing (parsing) a statement, executing it once and closing is essentially a misuse of PreparedStatement, provides zero performance benefits, regardless of whether server or client prepared statements are used and that typical DBMSes (Oracle, DB2, MSSQL, MySQL, Derby, etc.) will not even promote such a statement to prepared statement cache (or at least, their default JDBC driver/datasource will not).
Moreover, I had to test certain scenarios in dev environment on MySQL, and it seems that the Connector/J usage analyzer agrees with this idea. For all non-batched prepared statements, calling close() prints:
PreparedStatement created, but used 1 or fewer times. It is more efficient to prepare statements once, and re-use them many times
Due to application design choices outlined earlier, having a PreparedStatement instance cache that holds every single SQL statement used by any event for each connection in the connection pool sounds like a poor choice.
Could someone elaborate further on this? Is the logic "prepare-execute (once)-close" flawed and essentially discouraged?
P.S. Explicitly specifying useUsageAdvisor=true and cachePrepStmts=true for Connector/J and using either useServerPrepStmts=true or useServerPrepStmts=false still results in warnings about efficiency when calling close() on PreparedStatement instances for every non-batched SQL statement.
Is the logic prepare-execute [once]-close flawed and essentially discouraged?
I don't see that as being a problem, per se. A given SQL statement needs to be "prepared" at some point, whether explicitly (with a PreparedStatement) or "on the fly" (with a Statement). There may be a tiny bit more overhead incurred if we use a PreparedStatement instead of a Statement for something that will only be executed once, but it is unlikely that the overhead involved would be significant, especially if the statement you cite is true:
typical DBMSes (Oracle, DB2, MSSQL, MySQL, Derby, etc.) will not even promote such a statement to prepared statement cache (or at least, their default JDBC driver/datasource will not).
What is discouraged is a pattern like this:
for (int thing : thingList) {
PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ");
ps.setInt(1, thing);
ps.executeUpdate();
ps.close();
}
because the PreparedStatement is only used once and the same SQL statement is being prepared over and over again. (Although even that might not be such a big deal if the SQL statement and its executation plan are indeed cached.) The better way to do that is
PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ");
for (int thing : thingList) {
ps.setInt(1, thing);
ps.executeUpdate();
}
ps.close();
... or even better, with a "try with resources" ...
try (PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ")) {
for (int thing : thingList) {
ps.setInt(1, thing);
ps.executeUpdate();
}
}
Note that this is true even without using batch processing. The SQL statement is still only prepared once and used several times.
As others already stated, the most expensive part is the parsing the statement in the database. Some database systems (this is pretty much DB dependent – I will speak mainly for Oracle) may profit, if the statement is already parsed in the shared pool. (In Oracle terminology this is called a soft parse that is cheaper than a hard parse - a parse of a new statement). You can profit from soft parse even if you use the prepared statement only once.
So the important task is to give the database a chance to reuse the statement. A typical counter example is the handling of the IN list based on a collection in Hibernate. You end with the statement such as
.. FROM T WHERE X in (?,?,?, … length based on the size of the collection,?,? ,?,?)
You can’t reuse this statement if the size of the collection differ.
A good starting point to get overview of the spectrum of the SQL queries produced by a running application is (by Oracle) the V$SQL view. Filter the PARSING_SCHEMA_NAME with you connection pool user and check the SQL_TEXT and the EXECUTIONS count.
Two extreme situation should be avoided:
Passing parameters (IDs) in the query text (this is well known) and
Reusing statement for different access paths.
An example of the latter is a query that with a provided parameter performs an index access to a limited part of the table, while without the parameter all records should be processed (full table scan). In that case is definitively no problem to create two different statements (as the parsing of both leads to different execution plans).
PreparedStatements are preferable because one is needed regardless of whether you create one programmatically or not; internally the database creates one every time a query is run - creating one programatically just gives you a handle to it. Creating and throwing away a PreparedStatement every time doesn't add much overhead over using Statement.
A large effort is required by the database to create one (syntax checking, parsing, permissions checking, optimization, access strategy, etc). Reusing one bypasses this effort for subsequent executions.
Instead of throwing them away, try either writing the query in such a way that it can be reused, eg by ignoring null input parameters:
where someCol = coalesce(?, someCol)
so if you set the parameter to null (ie "unspecified), the condition succeeds)
or if you absolutely must build the query every time, keep references to the PreparedStatements in a Map where the built query is the key and reuse them if you get a hit. Use a WeakHashMap<String, PreparedStatements> for you map implementation to prevent running out of memory.
PreparedStatement created, but used 1 or fewer times. It is more efficient to prepare statements once, and re-use them many times
I thing you may safely ignore this warning, it is similar to a claim It is more efficient to work first 40 hour in the week, than sleep next 56 hours, eat following 7 hours and the rest is your free time.
You need exactly one execution per event - should you perform 50 to get a better average?
SQL commands that run only once, in terms of performance, just waste database resources (memory, processing) being sent in a Prepared Statement. In other hand, not using Prepared Statement let app vulnerable to SQL injection.
Are security (protection from SQL injection) working against performance (prepared statement that runs just once) ? Yes, but...
But it should not be that way. It's a choice java does NOT implement an interface to let developers call the right database API: SQL commands that run just once AND are properly protected against SQL injection ! Why Java just not implement the correct tool for this specific task?
It could be as follows:
Statement Interface - Different SQL commands could be submitted. One execution of SQL commands. Bind variables not allowed.
PreparedStatement Interface - One SQL command could be submitted. Multiple executions of SQL command. Bind variables allowed.
(MISSING IN JAVA!) RunOnceStatement - One SQL command could be submitted. One execution of SQL command. Bind variables allowed.
For example, the correct routine (API) could be called in Postgres, by driver mapping to:
- Statement Interface - call PQExec()
- PreparedStatement Interface - call PQPrepare() / PQExecPrepare() / ...
- (MISSING IN JAVA!) RunOnceStatement Interface - call PQExecParams()
Using prepared statement in SQL code that runs just once is a BIG performance problem: more processing in database, waste database memory, by maintaining plans that will not called later. Cache plans get so crowed that actual SQL commands that are executed multiple times could be deleted from cache.
But Java does not implement the correct interface, and forces everybody to use Prepared Statement everywhere, just to protect against SQL injection...
When to use statement instead of prepared statement. i suppose statement is used in queries with no parameter but why not use prepared statement ? Which one is faster for queries with no params.
I suppose statement is used in queries with no parameter but why not use prepared statement ?
That's not even close. PreparedStatements are used in the case of INSERT, UPDATE and DELETE statements that return a ResultSet or an update count. They will not work for DDL statements as pointed out by Joachim, and neither will they work for invocation of stored procedures where a CallableStatement ought to be used (this is not a difference between the two classes). As far as queries with no bind parameters are concerned, PreparedStatements can turn out to be better than Statements (see below).
Which one is faster for queries with no params.
PreparedStatements will turn out to be faster in the long run, over extended use in a single connection. This is because, although PreparedStatements have to be compiled, which would take some time (this really isn't a lot, so don't see this as a drawback), the compiled version essentially holds a reference to the SQL execution plan in the database. Once compiled, the PreparedStatement is stored in a connection specific cache, so that the compiled version may be reused to achieve performance gains. If you are using JDBC Batch operations, using PreparedStatements will make the execution of the batch much faster than the use of plain Statement objects, where the plan may have to be prepared time and again, if the database has to do so.
That's depending on Your requirement.
If you have a SQL statement which runs in a loop or frequently with different parameters then PreparedStatement is the best candidate since it is getting pre-compiled and cache the execution plan for this parameterized SQL query. Each time it runs from the same PreparedStatement object it will use cached execution plan and gives the better performance.
Also SQL injection can be avoided using PreparedStatement .
But if you are sure that you run SQL query only once, sometimes Statement will be the best candidate since when you create PreparedStatement object sometimes it make additional db call, if the driver supports precompilation, the method Connection.prepareStatement(java.lang.String) will send the statement to the database for precompilation.
Read below article to understand "Statement Versus PreparedStatement"
Java Programming with Oracle JDBC
Recently one of my colleagues made a comment that I should not use
LIKE '%'||?||'%'
rather use
LIKE ?
in the SQL and then replace the LIKE ? marker with LIKE '%'||?||'%' before I execute the SQL. He made the point that with a single parameter marker DB2 database will cache the statement always and thus cut down on the SQL prepare time.
However, I am not sure if it is accurate or not. To me it should be the other way around since we are doing more processing by doing a string replace on the SQL everytime the query is getting executed.
Does anyone know if a single marker really speeds up execution? Just FYI - I am using Spring 2.5 JDBC framework and the DB2 version is 9.2.
My question is - does DB2 treat "LIKE ?" differently from "LIKE '%'||?||'%'" as far as caching and preparation goes.
'LIKE ?' is a PreparedStatement. Prepared statements are an optimization at the JDBC driver level. The thinking is that databases analyze queries to decide how to most efficiently process them. The DB can then cache the resulting query plan, keyed on the full statement. Reusing identical statements reuses the query plan. So basically if you are running the same query multiple times with different comparison strings, and if the query plan stays cached, then yes, using 'LIKE ?' will be faster.
Some useful (though somewhat dated) info on PreparedStatements:
Prepared Statments
More Prepared Statments
I haven't done too much DB2, not since the 90's and I'm not really sure if I'm understanding what your underlying question is. Way back then I got a phone call from the head of the DBA team. "What are you doing different than every other programmer we've got!??" Mind you, this was early in my career, so tentatively I answered, "Nothing....", imagine it in kind of a whiny voice. "Well then, why do your queries take 50% of the cpu resources of any the other guys???". I took a quick poll of all the other guys and found I was the only one using prepared statements. Now under the covers Spring automatically makes prepared statements, and they've improved statement caching in the database a lot over the years, but if you make use of the properly, you can get the speedup there, AND it'll make the statement cache swap things out less often. It really depends on your use case, if you're only going to hit the query once, then there would be no difference, if its a few thousand times, obviously it would make a much greater difference.
in the SQL and then replace the LIKE ? marker with LIKE '%'||?||'%' before I execute the SQL. He made the point that with a single parameter marker DB2 database will cache the statement always and thus cut down on the SQL prepare time.
Unless DB2 is some sort of weird alien SQL database, or if it's driver does some crazy things, then the database server will never see your prepared statement until you actually execute it. So you can swap clauses in and out of the PreparedStatement all day long, and it will have no effect until you actually send it to the server when you execute it.
I have the below questions on Prepared Statements in Java.
Is it beneficial to use Prepared Statements when the SQL Query does not have any Where clause ? Assume a simple query Select * from tablename;
It is said that the Prepared Statement is compiled once and only the values are substituted the second time. Hence it is faster as the Query validation and compilation step can be skipped. Where is the compiled form stored ? What is the life time of this compiled form ?
A PreparedStatement is beneficial when there are parameters to be passed and when the query is to be executed repeatedly. If there is a simple query to be fired once, a Statement will prove faster.
The caching takes place on DB server. The DB server has APIs that help caching compiled queries. Hence for repeated execution of queries, the same compiled query will run again and boost performance.
Use PreparedStatement everytime there's an input or more from the user. It will help you escape the needed characters to prevent SQL Injection and errors in queries.