Prepared Statements and JDBC Drivers


I have the following questions about prepared statements in Java.
Is it beneficial to use a PreparedStatement when the SQL query has no WHERE clause? Assume a simple query such as SELECT * FROM tablename;
It is said that a PreparedStatement is compiled once and only the values are substituted on later executions, so it is faster because the query validation and compilation steps can be skipped. Where is the compiled form stored? What is the lifetime of this compiled form?

A PreparedStatement is beneficial when there are parameters to pass and when the query is executed repeatedly. For a simple query that is fired only once, a Statement can be slightly faster.
The caching takes place on the DB server, which has APIs for caching compiled queries. For repeated executions of a query, the same compiled query is run again, which improves performance.

Use a PreparedStatement whenever the query takes one or more inputs from the user. It takes care of escaping the necessary characters, which prevents SQL injection and errors in queries.

Related

Closing a PreparedStatement after a single execute – is it a design flaw?

I have looked into various places, and have heard a lot of dubious claims, ranging from PreparedStatement should be preferred over Statement everywhere, even if only for the performance benefit; all the way to claims that PreparedStatements should be used exclusively for batched statements and nothing else.
However, there seems to be a blind spot in (primarily online) discussions I have followed. Let me present a concrete scenario.
We have an EDA-designed application with a DB connection pool. Events arrive; some of them require persistence, some do not. Some are artificially generated (e.g. update/reset something every X minutes).
Some events come and are handled sequentially, but other types of events (also requiring persistence) can (and will) be handled concurrently.
Aside from those artificially generated events, there is no structure in how events requiring persistence arrive.
This application was designed quite a while ago (roughly 2005) and supports several DBMSes. The typical event handler (where persistence is required):
get connection from pool
prepare sql statement
execute prepared statement
process the result set, if applicable, close it
close prepared statement
prepare a different statement, if necessary and handle the same way
return connection to pool
If an event requires batch processing, the statement is prepared once and addBatch/executeBatch methods are used. This is an obvious performance benefit and these cases are not related to this question.
Recently, I have received an opinion, that the whole idea of preparing (parsing) a statement, executing it once and closing is essentially a misuse of PreparedStatement, provides zero performance benefits, regardless of whether server or client prepared statements are used and that typical DBMSes (Oracle, DB2, MSSQL, MySQL, Derby, etc.) will not even promote such a statement to prepared statement cache (or at least, their default JDBC driver/datasource will not).
Moreover, I had to test certain scenarios in dev environment on MySQL, and it seems that the Connector/J usage analyzer agrees with this idea. For all non-batched prepared statements, calling close() prints:
PreparedStatement created, but used 1 or fewer times. It is more efficient to prepare statements once, and re-use them many times
Due to application design choices outlined earlier, having a PreparedStatement instance cache that holds every single SQL statement used by any event for each connection in the connection pool sounds like a poor choice.
Could someone elaborate further on this? Is the logic "prepare-execute (once)-close" flawed and essentially discouraged?
P.S. Explicitly specifying useUsageAdvisor=true and cachePrepStmts=true for Connector/J and using either useServerPrepStmts=true or useServerPrepStmts=false still results in warnings about efficiency when calling close() on PreparedStatement instances for every non-batched SQL statement.

Is the logic prepare-execute [once]-close flawed and essentially discouraged?
I don't see that as being a problem, per se. A given SQL statement needs to be "prepared" at some point, whether explicitly (with a PreparedStatement) or "on the fly" (with a Statement). There may be a tiny bit more overhead incurred if we use a PreparedStatement instead of a Statement for something that will only be executed once, but it is unlikely that the overhead involved would be significant, especially if the statement you cite is true:
typical DBMSes (Oracle, DB2, MSSQL, MySQL, Derby, etc.) will not even promote such a statement to prepared statement cache (or at least, their default JDBC driver/datasource will not).
What is discouraged is a pattern like this:
for (int thing : thingList) {
    PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ");
    ps.setInt(1, thing);
    ps.executeUpdate();
    ps.close();
}
because the PreparedStatement is only used once and the same SQL statement is being prepared over and over again. (Although even that might not be such a big deal if the SQL statement and its execution plan are indeed cached.) The better way to do that is
PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ");
for (int thing : thingList) {
    ps.setInt(1, thing);
    ps.executeUpdate();
}
ps.close();
... or even better, with a "try with resources" ...
try (PreparedStatement ps = conn.prepareStatement(" {some constant SQL statement} ")) {
    for (int thing : thingList) {
        ps.setInt(1, thing);
        ps.executeUpdate();
    }
}
Note that this is true even without using batch processing. The SQL statement is still only prepared once and used several times.

As others have already stated, the most expensive part is parsing the statement in the database. Some database systems (this is pretty much DB-dependent; I will speak mainly for Oracle) may profit if the statement is already parsed in the shared pool. (In Oracle terminology this is called a soft parse, which is cheaper than a hard parse, i.e. a parse of a new statement.) You can profit from a soft parse even if you use the prepared statement only once.
So the important task is to give the database a chance to reuse the statement. A typical counterexample is the handling of an IN list based on a collection in Hibernate. You end up with a statement such as
.. FROM T WHERE X in (?,?,?, … length based on the size of the collection,?,? ,?,?)
You can’t reuse this statement if the size of the collection differs.
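To illustrate why the statement text changes with the collection size, here is a minimal sketch. The class and method names are hypothetical, and the table T and column X are taken from the fragment above. Padding the list to the next power of two is an assumption of this sketch (a commonly used workaround, not something prescribed above): it reduces the number of distinct statement texts, at the cost of binding some value more than once to fill the extra placeholders.

```java
import java.util.Collections;

public class InListSql {
    // Builds the statement text for an IN list with n placeholders.
    static String inListSql(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        return "SELECT * FROM T WHERE X IN ("
                + String.join(",", Collections.nCopies(n, "?")) + ")";
    }

    // Rounds n up to the next power of two (1, 2, 4, 8, ...), so that
    // collections of similar size share one statement text.
    static int paddedSize(int n) {
        if (n < 1 || n > (1 << 30)) {
            throw new IllegalArgumentException("n out of range");
        }
        int size = 1;
        while (size < n) {
            size <<= 1;
        }
        return size;
    }

    public static void main(String[] args) {
        // Different sizes give different texts, so nothing can be reused.
        System.out.println(inListSql(3));
        System.out.println(inListSql(5));
        // With padding, sizes 5 to 8 all map to the same 8-placeholder text.
        System.out.println(inListSql(paddedSize(5)));
    }
}
```

With padding, the unused placeholders would typically be filled by repeating the last value of the collection, which does not change the result of the IN predicate.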
A good starting point for getting an overview of the spectrum of SQL queries produced by a running application is (in Oracle) the V$SQL view. Filter PARSING_SCHEMA_NAME by your connection pool user and check the SQL_TEXT and the EXECUTIONS count.
Two extreme situations should be avoided:
Passing parameters (IDs) in the query text (this is well known) and
Reusing a statement for different access paths.
An example of the latter is a query that, with a provided parameter, performs an index access to a limited part of the table, while without the parameter all records should be processed (full table scan). In that case it is definitely no problem to create two different statements (as parsing them leads to different execution plans).
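A minimal sketch of the "two statements" idea, assuming a hypothetical table T with an indexed column X (names reused from the IN-list fragment; the class and method names are made up for illustration). Each access path gets its own constant statement text, so the database can parse and plan each one separately.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class AccessPathSketch {
    // Selective lookup: expected to use an index on X (assumed to exist).
    static final String BY_X = "SELECT * FROM T WHERE X = ?";
    // No filter: all rows are processed (full table scan).
    static final String ALL_ROWS = "SELECT * FROM T";

    // Picks one of two constant texts instead of one text serving both cases.
    static String sqlFor(Integer x) {
        return (x != null) ? BY_X : ALL_ROWS;
    }

    // Counts the rows returned for the chosen access path.
    static int countRows(Connection conn, Integer x) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(sqlFor(x))) {
            if (x != null) {
                ps.setInt(1, x);
            }
            try (ResultSet rs = ps.executeQuery()) {
                int rows = 0;
                while (rs.next()) {
                    rows++;
                }
                return rows;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(sqlFor(42));
        System.out.println(sqlFor(null));
    }
}
```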

PreparedStatements are preferable because one is needed regardless of whether you create one programmatically or not; internally the database creates one every time a query is run, and creating one programmatically just gives you a handle to it. Creating and throwing away a PreparedStatement every time doesn't add much overhead over using Statement.
A large effort is required by the database to create one (syntax checking, parsing, permissions checking, optimization, access strategy, etc). Reusing one bypasses this effort for subsequent executions.
Instead of throwing them away, try either writing the query in such a way that it can be reused, e.g. by ignoring null input parameters:
where someCol = coalesce(?, someCol)
so that if you set the parameter to null (i.e. "unspecified"), the condition succeeds,
or, if you absolutely must build the query every time, keep references to the PreparedStatements in a Map where the built query is the key, and reuse them if you get a hit. Use a WeakHashMap<String, PreparedStatement> for your map implementation to prevent running out of memory.

PreparedStatement created, but used 1 or fewer times. It is more efficient to prepare statements once, and re-use them many times
I think you may safely ignore this warning. It is similar to the claim: it is more efficient to work the first 40 hours of the week, then sleep the next 56 hours, eat for the following 7 hours, and keep the rest as your free time.
You need exactly one execution per event - should you perform 50 to get a better average?

In terms of performance, SQL commands that run only once just waste database resources (memory, processing) when sent as a prepared statement. On the other hand, not using a prepared statement leaves the app vulnerable to SQL injection.
Is security (protection from SQL injection) working against performance (a prepared statement that runs just once)? Yes, but...
But it should not be that way. It is a choice that Java does NOT provide an interface that lets developers call the right database API: SQL commands that run just once AND are properly protected against SQL injection! Why does Java not provide the correct tool for this specific task?
It could be as follows:
Statement Interface - Different SQL commands could be submitted. One execution of SQL commands. Bind variables not allowed.
PreparedStatement Interface - One SQL command could be submitted. Multiple executions of SQL command. Bind variables allowed.
(MISSING IN JAVA!) RunOnceStatement - One SQL command could be submitted. One execution of SQL command. Bind variables allowed.
For example, in Postgres the driver could map each interface to the correct routine (API):
- Statement Interface - call PQexec()
- PreparedStatement Interface - call PQprepare() / PQexecPrepared() / ...
- (MISSING IN JAVA!) RunOnceStatement Interface - call PQexecParams()
Using a prepared statement for SQL code that runs just once is a BIG performance problem: more processing in the database, and database memory wasted on maintaining plans that will not be called later. The plan cache gets so crowded that SQL commands that are actually executed multiple times can be evicted from it.
But Java does not implement the correct interface, and forces everybody to use Prepared Statement everywhere, just to protect against SQL injection...

PreparedStatement is faster in Java; how does the DB do it?

I know that PreparedStatement is faster than Statement in Java.
I don't know how the Oracle DB server does it.
A PreparedStatement gets precompiled in the database server -> less work.
It reduces load on the database.
String sql = "SELECT * FROM users u WHERE u.id = ?";
PreparedStatement pstmt = connection.prepareStatement(sql);
pstmt.setLong(1, userId);
pstmt.executeQuery();
Is the query cached in the database server and compiled only once?
If yes, how does the database server know that this query was executed before?
For how long is it cached?

Is the query cached in the database server and compiled only once?
More precisely, the query plan is cached on the server. When you run a query, your RDBMS prepares a plan first, then executes it. Preparing a plan requires parsing the query, then analyzing and optimizing it, considering the indexes available and the statistics collected on the participating tables.
If yes, how does the database server know that this query was executed before?
By comparing the string of the query to other queries available in the cache. Since you use parameterized queries, the other query would be textually identical. Caching is the second major reason* to use query parameters: if you were to prepare statements like this
// WRONG! Don't do it like this!
String sql = "SELECT * FROM users u WHERE u.id = "+userId;
PreparedStatement pstmt = connection.prepareStatement(sql);
all the performance improvements would be gone, because providing a different ID would make it a different query that needs a new plan.
* The top reason for parameterizing queries is, of course, avoiding injection attacks.
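The effect of text-based caching can be illustrated with a toy model. This is only a sketch: the class below is hypothetical and merely mimics a cache keyed by the exact query text, whereas real servers use more elaborate keys, statistics and eviction. It counts how many "plans" have to be built for concatenated versus parameterized query text.

```java
import java.util.HashMap;
import java.util.Map;

public class ToyPlanCache {
    // Maps the exact query text to a stand-in for an execution plan.
    private final Map<String, String> plans = new HashMap<>();
    private int plansBuilt = 0;

    // Returns the cached "plan" for this exact text, building one on a miss.
    String planFor(String sql) {
        return plans.computeIfAbsent(sql, text -> {
            plansBuilt++;
            return "plan#" + plansBuilt;
        });
    }

    // Runs the given query texts through a fresh cache and reports
    // how many plans had to be built.
    static int countPlans(String... queries) {
        ToyPlanCache cache = new ToyPlanCache();
        for (String sql : queries) {
            cache.planFor(sql);
        }
        return cache.plansBuilt;
    }

    public static void main(String[] args) {
        long[] userIds = {1, 2, 3};
        String[] concatenated = new String[userIds.length];
        String[] parameterized = new String[userIds.length];
        for (int i = 0; i < userIds.length; i++) {
            // Each ID yields a different text, hence a different cache key.
            concatenated[i] = "SELECT * FROM users u WHERE u.id = " + userIds[i];
            // The text stays identical; only the bound value would change.
            parameterized[i] = "SELECT * FROM users u WHERE u.id = ?";
        }
        System.out.println("concatenated:  " + countPlans(concatenated));
        System.out.println("parameterized: " + countPlans(parameterized));
    }
}
```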

You can think of a PreparedStatement as a "cached" statement that will be compiled once on the database server.
When you create the statement, it will be sent to the DB server, which will do the usual syntax checking and determine an efficient execution plan for the query. This way, it can re-use the same execution plan (which is cached as well) for multiple invocations of the same statement.
The key of this cache is the statement itself, without its parameter values filled in. If the values are written directly into the statement text (i.e. you don't fill them in using the set* methods), each distinct text is a different cache key, so a new execution plan has to be created. That's why prepared statements are best used when executing the same statement several times with different values.

I use a DataSource in my project, where the query cache is tied to the connection.
The query cache is maintained per connection (the default size is 10).
That means that if I have 5 connections, the latest 10 prepared statements are cached for each connection.
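A per-connection cache that keeps only the most recently used statements can be sketched with a LinkedHashMap in access order. This is an illustrative sketch, not the implementation of any particular driver or DataSource; the class name is hypothetical and the capacity of 10 merely mirrors the default mentioned above. The value type is generic so the sketch runs without a database; with JDBC it would be PreparedStatement, and an evicted statement would also have to be closed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruStatementCache<S> extends LinkedHashMap<String, S> {
    private final int capacity;

    public LruStatementCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most
        // recently used, so the "eldest" entry is the LRU one.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // evicts the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, S> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        // One cache per connection; keys are SQL texts.
        LruStatementCache<String> cache = new LruStatementCache<>(10);
        for (int i = 1; i <= 11; i++) {
            cache.put("SELECT " + i, "statement handle " + i);
        }
        // The 11th insertion pushed out the oldest entry.
        System.out.println(cache.containsKey("SELECT 1"));
        System.out.println(cache.size());
    }
}
```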

Does Connection.setAutoCommit(true) work for SQL stored procedures and functions?

DB is Oracle.
I want to use Connection.setAutoCommit for SQL stored procedures and functions. Will it work?
For calling procedures I use CallableStatement

It won't work: in connection.setAutoCommit(true), connection is a Java object, so you can't use it inside stored procedures.
There is an SQL system variable named autocommit; if it is set to on when you write the stored procedure, your procedure will autocommit the results of whatever you execute.

Yes it works. Actually auto-commit is on by default. This chapter will give you an idea but to save you some time:
By default, JDBC's auto-commit feature is on, which means that each SQL statement is committed as it is executed. If more than one SQL statement is executed by your program, then a small performance increase can be achieved by turning off auto-commit.
Read the CallableStatement part too:
As you may recall, CallableStatement objects are used to execute database stored procedures. I've saved CallableStatement objects until last, because they are the slowest performers of all the JDBC SQL execution interfaces. This may sound counterintuitive, because it's commonly believed that calling stored procedures is faster than using SQL, but that's simply not true. Given a simple SQL statement, and a stored procedure call that accomplishes the same task, the simple SQL statement will always execute faster. Why? Because with the stored procedure, you not only have the time needed to execute the SQL statement but also the time needed to deal with the overhead of the procedure call itself.
I think it will help you to take decisions concerning your application's performance.

In a word: yes. Just make sure that you don't embed transaction logic in the stored procedure. Leave it all up to the JDBC driver or transaction manager.
There's nothing like trying it for yourself and seeing. I think that's even better than asking questions here.

Which is faster: Statement or PreparedStatement?

Code like this can often be found on the web:
private static final String SQL = "SELECT * FROM table_name";
....
and a PreparedStatement is used for this SQL query. Why?
As far as I know, a PreparedStatement spends time precompiling the SQL statement, so it would seem that a Statement is faster than a PreparedStatement. Or am I mistaken?

Prepared statements are much faster when you have to run the same statement multiple times with different data. That's because the database validates the query only once, whereas with a plain Statement it validates the query each time.
The other benefit of using PreparedStatements is to avoid causing a SQL injection vulnerability - though in your case your query is so simple you haven't encountered that.
For your query, the difference between running a prepared statement vs a statement is probably negligible.
EDIT: In response to your comment below, you will need to look closely at the DAO class to see what it is doing. If for example, each time the method is called it re-creates the prepared statement then you will lose any benefit of using prepared statements.
What you want to achieve is the encapsulation of your persistence layer, so that there is no specific call to MySQL or Postgres or whatever you are using, while at the same time taking advantage of the performance and security benefits of things like prepared statements. To do this you need to rely on Java's own objects such as PreparedStatement.
I personally would build my own DAO class for doing CRUD operations, using Hibernate underneath and the Java Persistence API to encapsulate it all, and that should use prepared statements for the security benefits. If you have a specific use-case for doing repeated operations, then I would be inclined to wrap that within its own object.
Hibernate can be configured to use whatever database vendor you are using via an XML file, and thus it provides really neat encapsulation of your persistence layer. However, it is quite a complicated product to get right!

Most of the time queries are not as simple as your example. If there is any variation to the query, i.e. any parameters that are not known at compile time, you must use PreparedStatement to avoid SQL injection vulnerabilities. This trumps any performance concerns.
If there is any difference between PreparedStatement and Statement, it is highly dependent on the particular JDBC driver in question, and most of the time the penalty will be negligible compared to the cost of going to the database, executing the actual query and fetching the results back.

To my knowledge, PreparedStatement is much faster than Statement. Here are some reasons why; read on for more detail.
The JDBC API provides connectivity with the database. We then execute queries using either Statement or PreparedStatement.
There are four steps to execute a query:
Parse the SQL query.
Compile the query.
Optimize the data acquisition path.
Execute the query.
The Statement interface is suitable when we do not need to execute the query multiple times.
Disadvantages of the Statement interface:
An attacker can easily get at the data. Suppose we have a query that takes a username and a password as parameters. The intended input is username='abc#example.com' and password='abc123'. But an attacker can supply username='abc#example.com' or '1'='1' and password='', which means they log in successfully. That is possible with Statement.
And the SQL is validated every time we fetch data from the database.
Java's solution to the problems above is PreparedStatement.
This interface has many advantages. The main one is that the query is not validated every time, so you get the result faster. Further advantages of PreparedStatement:
1) We can safely provide the values of the query's parameters with setter methods.
2) It prevents SQL injection because it automatically escapes special characters.
3) With Statement, all four steps above are executed every time, but with PreparedStatement only the last step is executed on repeated runs, so it is faster than Statement.
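The login example above can be made concrete with a small sketch. The table and column names (users, username, password) and the class and method names are assumptions for illustration. The first method shows how concatenation lets the input change the structure of the query; the parameterized text never contains the input at all, because the values are bound with setString.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class InjectionSketch {
    // UNSAFE: the input becomes part of the SQL text itself.
    static String concatenated(String username, String password) {
        return "SELECT * FROM users WHERE username = '" + username
                + "' AND password = '" + password + "'";
    }

    // Safe: constant text with placeholders, independent of the input.
    static final String PARAMETERIZED =
            "SELECT * FROM users WHERE username = ? AND password = ?";

    // Binds the input as values; the caller is responsible for closing
    // the returned statement.
    static PreparedStatement prepareLogin(Connection conn, String username,
                                          String password) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(PARAMETERIZED);
        try {
            ps.setString(1, username);
            ps.setString(2, password);
            return ps;
        } catch (SQLException e) {
            ps.close();
            throw e;
        }
    }

    public static void main(String[] args) {
        String malicious = "abc#example.com' OR '1'='1";
        // The quote in the input closes the string literal early and
        // adds an OR condition to the WHERE clause.
        System.out.println(concatenated(malicious, ""));
        // The parameterized text stays the same for any input.
        System.out.println(PARAMETERIZED);
    }
}
```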

Faster is not the consideration here. Parsing of the sql will generally be a tiny part of overall execution. See more at When should we use a PreparedStatement instead of a Statement?

When to use Statement over PreparedStatement?

When should I use Statement instead of PreparedStatement? I suppose Statement is used for queries with no parameters, but why not use a PreparedStatement? Which one is faster for queries with no params?

I suppose statement is used in queries with no parameter but why not use prepared statement ?
That's not even close. PreparedStatements are used in the case of SELECT, INSERT, UPDATE and DELETE statements that return a ResultSet or an update count. They will not work for DDL statements as pointed out by Joachim, and neither will they work for invocation of stored procedures where a CallableStatement ought to be used (this is not a difference between the two classes). As far as queries with no bind parameters are concerned, PreparedStatements can turn out to be better than Statements (see below).
Which one is faster for queries with no params.
PreparedStatements will turn out to be faster in the long run, over extended use in a single connection. This is because, although PreparedStatements have to be compiled, which would take some time (this really isn't a lot, so don't see this as a drawback), the compiled version essentially holds a reference to the SQL execution plan in the database. Once compiled, the PreparedStatement is stored in a connection specific cache, so that the compiled version may be reused to achieve performance gains. If you are using JDBC Batch operations, using PreparedStatements will make the execution of the batch much faster than the use of plain Statement objects, where the plan may have to be prepared time and again, if the database has to do so.

That depends on your requirement.
If you have an SQL statement that runs in a loop, or frequently with different parameters, then PreparedStatement is the best candidate, since it is precompiled and the execution plan for the parameterized SQL query is cached. Each time it runs from the same PreparedStatement object it uses the cached execution plan and gives better performance.
SQL injection can also be avoided by using PreparedStatement.
But if you are sure that you will run the SQL query only once, Statement is sometimes the best candidate, because creating a PreparedStatement object can cost an additional DB call: if the driver supports precompilation, the method Connection.prepareStatement(java.lang.String) sends the statement to the database for precompilation.
Read below article to understand "Statement Versus PreparedStatement"
Java Programming with Oracle JDBC
