Given the following function:
public void foo(Connection conn) throws SQLException
{
    PreparedStatement statement = conn.prepareStatement("Select a from table");
    statement.execute();
}
with a Connection object that was instantiated elsewhere (whether external or internal to the application does not matter). Are there any security issues that can arise from allowing functions in the API of an application to accept a non-validated Connection? For example, can a non-validated connection trick my application into running malicious queries?
Yes, it is possible for the Connection object to run malicious queries.
And no, I don't know of a way to prevent this, since the code that executes queries needs a Connection (or a DataSource).
Wrappers of this kind are used by connection pools, for example. A pool hands you a Connection object whose close() method does not work in the usual way (the pool, not your code, decides whether the physical connection is really closed). A connection pool could just as easily override other methods, since Connection is only an interface.
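To make the "Connection is only an interface" point concrete, here is a minimal sketch that wraps a connection in a dynamic proxy and swaps the SQL before delegating. Everything except foo is an assumption made up for illustration: the class name, the recording stand-in (used so the example runs without a database), and the replacement query.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class WrappedConnectionDemo {

    /** Creates a dynamic proxy for a JDBC interface (helper for this sketch). */
    static <T> T proxy(Class<T> type, InvocationHandler handler) {
        return type.cast(Proxy.newProxyInstance(
                WrappedConnectionDemo.class.getClassLoader(), new Class<?>[] {type}, handler));
    }

    /** Stand-in for a driver connection: it only records the SQL it is asked to prepare. */
    static Connection recordingConnection(List<String> preparedSql) {
        return proxy(Connection.class, (self, method, args) -> {
            if (!method.getName().equals("prepareStatement")) {
                throw new UnsupportedOperationException(method.getName());
            }
            preparedSql.add((String) args[0]);
            // Statement stub: execute() returns false, close() does nothing.
            return proxy(PreparedStatement.class, (s, m, a) ->
                    m.getReturnType() == boolean.class ? Boolean.FALSE : null);
        });
    }

    /** A wrapper in the style of a pool's proxy, except that it swaps the SQL text. */
    static Connection tamperingWrapper(Connection delegate) {
        return proxy(Connection.class, (self, method, args) -> {
            if (method.getName().equals("prepareStatement")) {
                args[0] = "DELETE FROM table"; // not what the caller asked for
            }
            try {
                return method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        });
    }

    /** The API method from the question: it trusts whatever Connection it is handed. */
    static void foo(Connection conn) throws SQLException {
        try (PreparedStatement statement = conn.prepareStatement("Select a from table")) {
            statement.execute();
        }
    }

    /** Returns the SQL that actually reached the underlying connection. */
    static List<String> demo() {
        List<String> preparedSql = new ArrayList<>();
        try {
            foo(tamperingWrapper(recordingConnection(preparedSql)));
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return preparedSql;
    }
}
```

Calling WrappedConnectionDemo.demo() shows that the underlying connection only ever sees the replaced statement, even though foo asked for a SELECT.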
If you think about it, whatever is injecting that Connection object already has a reference to the connection object. It could run malicious queries on its own, with or without your code.
Will closing the preparedStatement also close and return the connection to the connection pool?
public void insertProjectIntoDatabase(Project project) {
String insertProjectIntoDatabase =
"INSERT INTO projects(Project_Id, Project_Name, Project_StartDate, Deadline) " +
"VALUES (?, ?, ?, ?)";
try {
preparedStatement = DBCPDataSource.getConnection().prepareStatement(insertProjectIntoDatabase);
preparedStatement.setInt(1, project.getProjectId());
preparedStatement.setString(2, project.getName());
preparedStatement.setDate(3, java.sql.Date.valueOf(project.getStartDate()));
preparedStatement.setDate(4, java.sql.Date.valueOf(project.getDeadline()));
preparedStatement.execute();
preparedStatement.close();
}
catch (SQLException e)
{
System.out.println("Error happened in ProjectRepository at insertProjectIntoDatabase(): " + e.getMessage());
}
}
Bonus question:
I have created performance tests for creating a new connection each time an object needs one, Singleton connection and connection pool.
Singleton - Fastest
Creating a new connection each time - Slower (1.2s slower than the one above)
Connection Pool - Slowest (First connection - 2-3s slower than the one above, following tests are 0.4s slower than the one above)
I am using Apache Commons DBCP for the connection pool.
I thought using a connection pool would be just a little slower than a Singleton connection.
Have I done something wrong?
You asked:
Will closing the preparedStatement also close and return the connection to the connection pool?
Start with the documentation:
Releases this Statement object's database and JDBC resources immediately instead of waiting for this to happen when it is automatically closed. It is generally good practice to release resources as soon as you are finished with them to avoid tying up database resources.
Calling the method close on a Statement object that is already closed has no effect.
Note:When a Statement object is closed, its current ResultSet object, if one exists, is also closed.
No mention of closing the connection.
Try intuition: Do we ever run more than one statement in SQL? Yes, obviously. So logically the connection needs to survive across multiple statements to be useful.
Lastly: Try it yourself, an empirical test. Call Connection#isClosed after calling Statement#close; it still returns false.
➥ No, closing the statement does not close the connection.
For the simplest code, learn to use try-with-resources syntax to auto-close your database resources such as result set, statement, and connection. You’ll find many examples of such code on this site, including some written by me.
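As a minimal sketch of the try-with-resources pattern, the countRows method below declares the connection, statement, and result set together so that all three are closed automatically. The stub helper is only an assumption made so the example runs without a database: it logs close() calls and hands out further stubs. The class name and the query are likewise illustrative.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

class TryWithResourcesDemo {

    /** Counts rows; every resource is closed automatically, even if an exception is thrown. */
    static int countRows(DataSource dataSource, String sql) throws SQLException {
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(sql);
             ResultSet rs = stmt.executeQuery()) {
            int rows = 0;
            while (rs.next()) {
                rows++;
            }
            return rows;
        } // closed here in reverse order: ResultSet, PreparedStatement, Connection
    }

    /** Stand-in for a real driver: a stub that logs close() calls and hands out further stubs. */
    static <T> T stub(Class<T> type, List<String> closeLog) {
        return type.cast(Proxy.newProxyInstance(
                TryWithResourcesDemo.class.getClassLoader(), new Class<?>[] {type},
                (self, method, args) -> {
                    Class<?> returnType = method.getReturnType();
                    if (method.getName().equals("close")) {
                        closeLog.add(type.getSimpleName());
                        return null;
                    }
                    if (returnType == boolean.class) return false; // e.g. next(): no rows
                    if (returnType.isInterface()) return stub(returnType, closeLog);
                    return null;
                }));
    }

    /** Runs countRows against the stubs and returns the order in which resources were closed. */
    static List<String> demo() {
        List<String> closeLog = new ArrayList<>();
        try {
            countRows(stub(DataSource.class, closeLog), "SELECT Project_Id FROM projects");
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return closeLog;
    }
}
```

With a real DataSource only countRows is needed; the stub and demo exist purely to show the close order.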
As for connection pools, yes, calling close on a connection retrieved from a pool causes the connection object to be returned to the pool. The pool may choose to re-use the connection, or the pool may choose to close the connection. (Not our concern.)
The only point to a connection pool is speed. If opening a connection to the database takes a significant amount of time, we can save that time by re-using existing connections. Generating and re-using connections is the job of a connection pool.
If a connection pool is showing the slowest results in your testing, then there is something seriously wrong with either your pool or your tests. You did not reveal to us your tests, so we cannot help there. Note: As Marmite Bomber commented, be sure your tests do not include the time needed to establish the connection pool.
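One way to keep pool start-up out of the measurement is to run a few untimed warm-up calls first. The helper below is a rough sketch under that assumption; the class and method names are made up, and it is not a substitute for a proper benchmarking harness.

```java
import java.util.concurrent.Callable;

class ConnectionTiming {

    /**
     * Average time per call in nanoseconds. The warm-up calls run first and are not timed,
     * so one-off costs such as creating the pool do not distort the result.
     */
    static long averageNanos(Callable<?> task, int warmups, int runs) {
        try {
            for (int i = 0; i < warmups; i++) {
                task.call();
            }
            long start = System.nanoTime();
            for (int i = 0; i < runs; i++) {
                task.call();
            }
            return (System.nanoTime() - start) / runs;
        } catch (Exception e) {
            throw new IllegalStateException("timed task failed", e);
        }
    }
}
```

In a real test the task would borrow and close one connection, for example `() -> { try (Connection c = dataSource.getConnection()) { return null; } }`, where dataSource is whichever pool or plain DataSource is being compared.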
Frankly, I have found in my experience that opening a database connection does not take a significant amount of time. Furthermore, the details involved in properly implementing a connection pool are complex and treacherous, as evidenced by the list of failed and abandoned connection pool implementation projects. That, combined with the inherent risks such as a transaction being left open on a retrieved connection, led me to avoid using connection pools. I would posit that using a connection pool before collecting proof of an actual problem is a case of premature optimization.
I suggest using an implementation of the interface DataSource as a way to mask from the rest of your code whether you are using a pool and to hide which pool implementation you are currently using. Using DataSource gives you the flexibility to change between using or not using a connection pool, and the flexibility to change between pools. Those changes become deployment choices, with no need to change your app programming.
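As an illustration of that flexibility, here is a minimal sketch of a non-pooling DataSource backed by DriverManager. PlainDataSource and DataSourceCheck are hypothetical names written for this example, not library classes; code that depends only on the DataSource interface (like canConnect) would work unchanged if a pooling implementation were supplied instead.

```java
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.logging.Logger;
import javax.sql.DataSource;

/** Hypothetical non-pooling DataSource: every getConnection() opens a new physical connection. */
class PlainDataSource implements DataSource {
    private final String url;
    private PrintWriter logWriter;
    private int loginTimeoutSeconds;

    PlainDataSource(String url) {
        this.url = url;
    }

    @Override public Connection getConnection() throws SQLException {
        return DriverManager.getConnection(url);
    }

    @Override public Connection getConnection(String user, String password) throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }

    @Override public PrintWriter getLogWriter() { return logWriter; }
    @Override public void setLogWriter(PrintWriter out) { logWriter = out; }
    @Override public void setLoginTimeout(int seconds) { loginTimeoutSeconds = seconds; }
    @Override public int getLoginTimeout() { return loginTimeoutSeconds; }

    @Override public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        throw new SQLFeatureNotSupportedException("not used in this sketch");
    }

    @Override public <T> T unwrap(Class<T> iface) throws SQLException {
        if (iface.isInstance(this)) {
            return iface.cast(this);
        }
        throw new SQLException("Not a wrapper for " + iface.getName());
    }

    @Override public boolean isWrapperFor(Class<?> iface) {
        return iface.isInstance(this);
    }
}

/** Application code sees only the DataSource interface, never the implementation behind it. */
class DataSourceCheck {
    static boolean canConnect(DataSource dataSource) {
        try (Connection conn = dataSource.getConnection()) {
            return true;
        } catch (SQLException e) {
            return false;
        }
    }
}
```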
Pools are meant to improve performance, not degrade it. DBCP is naive, complicated, and outdated.
I don't think it's appropriate for a production application, especially when so many drivers support pooling in their DataSource natively. The entire pool gets locked the whole time a new connection attempt is made to the database. So, if something happens to your database that results in slow connections or timeouts, other threads are blocked when they try to return a connection to the pool—even though they are done using a database.
Even c3p0 performs terribly.
Please try one of these two connection pools instead: the Tomcat JDBC Connection Pool or HikariCP.
Now, coming to the main part of your question: have you closed the connection correctly?
Whenever you use a connection pool and fetch an available connection from it, you do not physically close that connection in your DAO layer; calling close() on it only hands it back to the pool. The pool manages the connections it has created, and each connection it lends out has a timeout associated with it, before which it has to be returned to the pool. When the pool is shut down, all of its connections are shut down too.
For more information on how to configure these properties in your connection pool, please check the documentation for each of the connection pools above.
When using JDBC in Java, the generally accepted method of querying a database is to acquire a connection, create a statement from that connection, and then execute a query from that statement.
// load driver
Connection con = DriverManager.getConnection(..);
Statement stmt = con.createStatement();
ResultSet result = stmt.executeQuery("SELECT..");
// ...
However, I am unsure of how to treat a second query to the same database.
Can another query be executed safely on the same Statement object, or must another statement be created from the Connection object in order to execute another query?
If the same Statement object can be used for multiple queries, what is the purpose of the Statement class (since it would then make more sense for a Connection.executeQuery() method to exist)?
Yes, you can reuse the Statement object, but each call to executeQuery closes any ResultSet that the statement had already opened.
See the javadoc for the explanation:
By default, only one ResultSet object per Statement object can be open
at the same time. Therefore, if the reading of one ResultSet object is
interleaved with the reading of another, each must have been generated
by different Statement objects. All execution methods in the Statement
interface implicitly close a statment's current ResultSet object if an
open one exists.
So the following occurs:
// load driver
Connection con = DriverManager.getConnection(..);
Statement stmt = con.createStatement();
ResultSet result = stmt.executeQuery("select ..");
// do something with result ... or not
ResultSet result2 = stmt.executeQuery("select ...");
// result is now closed, you cannot read from it anymore
// do something with result2
stmt.close(); // will close the resultset bound to it
For example you can find an open source implementation of Statement in the jTDS project.
In the Statement.executeQuery() method you can see a call to initialize() that closes all the resultsets already opened
protected void initialize() throws SQLException {
updateCount = -1;
resultQueue.clear();
genKeyResultSet = null;
tds.clearResponseQueue();
// FIXME Should old exceptions found now be thrown instead of lost?
messages.exceptions = null;
messages.clearWarnings();
closeAllResultSets();
}
Programmatically, you can reuse the same connection and the same statement for more than one query and close the statement and the connection at the end.
However, this is not a good practice. Application performance is very sensitive to the way the database is accessed. Ideally, each connection should be open for the least amount of time possible, and the connections should be pooled. Going by that, you would enclose each query in a block of {open connection, create a prepared statement, run query, close statement, close connection}. This is also the way most SQL templates are implemented. If concurrency permits, you can fire several such queries at the same time using a thread pool.
I have one thing to add should you use Connection and Statement in a threaded environment.
My experience shows that stmt.executeQuery(..) is safe to use in a parallel environment, but with the consequence that each query is serialized and thus processed sequentially, not yielding any speed-ups.
So it is better to use a new Connection (not Statement) for every thread.
For a standard sequential environment my experience has shown that reusing Statements is no problem at all and ResultSets need not be closed manually.
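The "one Connection per thread" advice can be sketched as follows: each task submitted to a thread pool obtains its own connection and closes it when done. The class name, the stub DataSource (used only so the example runs without a database), and the queries are assumptions for illustration.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import javax.sql.DataSource;

class PerTaskConnections {

    /** Runs each query on a worker thread; every task opens and closes its own connection. */
    static void runAll(DataSource dataSource, List<String> queries, int threads) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String sql : queries) {
                pending.add(executor.submit(() -> {
                    try (Connection conn = dataSource.getConnection();
                         Statement stmt = conn.createStatement()) {
                        stmt.execute(sql);
                    }
                    return null;
                }));
            }
            for (Future<?> task : pending) {
                task.get(); // wait for completion and propagate failures
            }
        } finally {
            executor.shutdown();
        }
    }

    /** Stand-in for a real driver: counts getConnection() calls and Connection.close() calls. */
    static <T> T stub(Class<T> type, AtomicInteger opened, AtomicInteger closed) {
        return type.cast(Proxy.newProxyInstance(
                PerTaskConnections.class.getClassLoader(), new Class<?>[] {type},
                (self, method, args) -> {
                    String name = method.getName();
                    if (name.equals("getConnection")) opened.incrementAndGet();
                    if (name.equals("close") && type == Connection.class) closed.incrementAndGet();
                    Class<?> returnType = method.getReturnType();
                    if (returnType == boolean.class) return false;
                    if (returnType.isInterface()) return stub(returnType, opened, closed);
                    return null;
                }));
    }

    /** Runs four queries on two threads; returns {connections opened, connections closed}. */
    static int[] demo() {
        AtomicInteger opened = new AtomicInteger();
        AtomicInteger closed = new AtomicInteger();
        try {
            runAll(stub(DataSource.class, opened, closed),
                    Arrays.asList("SELECT 1", "SELECT 2", "SELECT 3", "SELECT 4"), 2);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return new int[] {opened.get(), closed.get()};
    }
}
```

Whether this actually yields a speed-up depends on the driver and the database; the sketch only shows that no Connection or Statement is shared between threads.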
I wouldn't worry about creating new statements. However opening up a database connection may be resource intensive and opening and closing connections does impact performance.
Keeping connections open and managing them yourself is usually a pretty bad idea.
You should consider using connection pooling. You still issue a close command, but it only gives the connection back to the pool. When you request a new connection, the pool will reuse the connection you gave back earlier.
You may want to have different statements for one connection. Statement is an interface with more specialized sub-interfaces; depending on what you need, you sometimes want to use a CallableStatement instead. Some logic may be reused when required.
Usually, it's one statement for one query. It might not be necessary to do that, but when writing a real application you don't want to repeat those same steps again and again. That's against the DRY principle, and it also gets more complicated as the application grows.
It's good to write objects that handle that kind of low-level (repetitive) work and provide different methods to access the database when given the queries.
Well, that's why we have the concept of classes in object-oriented programming. A class defines constituent members which enable its instances to have state and behavior. Here, Statement deals with everything related to an SQL statement. There are many more functions that one might perform, such as batch queries.
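A small helper of that kind might look like the sketch below. QueryHelper and RowMapper are hypothetical names invented for this example (similar in spirit to the SQL templates offered by some frameworks), and the stub with its two sample rows is only there so the code runs without a database.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import javax.sql.DataSource;

/** Hypothetical callback that turns the current row into an object. */
interface RowMapper<T> {
    T mapRow(ResultSet rs) throws SQLException;
}

class QueryHelper {
    private final DataSource dataSource;

    QueryHelper(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Opens a connection, binds the parameters, runs the query, maps rows, closes everything. */
    <T> List<T> query(String sql, RowMapper<T> mapper, Object... params) throws SQLException {
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            for (int i = 0; i < params.length; i++) {
                stmt.setObject(i + 1, params[i]); // JDBC parameters are 1-based
            }
            try (ResultSet rs = stmt.executeQuery()) {
                List<T> rows = new ArrayList<>();
                while (rs.next()) {
                    rows.add(mapper.mapRow(rs));
                }
                return rows;
            }
        }
    }

    /** Stand-in for a real driver: its result set yields one row per remaining string. */
    static <T> T stub(Class<T> type, Iterator<String> rows) {
        return type.cast(Proxy.newProxyInstance(
                QueryHelper.class.getClassLoader(), new Class<?>[] {type},
                (self, method, args) -> {
                    Class<?> returnType = method.getReturnType();
                    if (method.getName().equals("next")) return rows.hasNext();
                    if (method.getName().equals("getString")) return rows.next();
                    if (returnType == boolean.class) return false;
                    if (returnType.isInterface()) return stub(returnType, rows);
                    return null;
                }));
    }

    /** Example call: the caller supplies only the SQL, the parameters, and the row mapping. */
    static List<String> demo() {
        DataSource dataSource = stub(DataSource.class, Arrays.asList("Apollo", "Gemini").iterator());
        try {
            return new QueryHelper(dataSource).query(
                    "SELECT Project_Name FROM projects WHERE Project_Id > ?",
                    rs -> rs.getString("Project_Name"), 0);
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }
}
```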
If I get connection objects using DriverManager.getConnection() and DataSource.getConnection(), how do they differ in behavior when .close() is called on them?
Before .close() method call, I got relevant Statement and ResultSet objects from these two different connections. Soon after getting these two objects, if I say connection1.close() (through DriverManager.getConnection()), it will nullify the connection object and I'm not supposed / allowed to access the relevant Statement and ResultSet objects. Correct me if I'm wrong?
Second scenario, now if I say connection2.close() (through DataSource.getConnection()), it simply returns it back to the pool. But the connection is still live. Will I be able to access the associated Statement and ResultSet objects?
If we assume a (basic) DataSource (that is: one that does not do connection pooling), then you obtain a physical connection that is the same as one obtained from DriverManager (some drivers even internally use DriverManager from the DataSource, or a DataSource from DriverManager). So those connections will behave identically.
Now if we assume a DataSource that provides connection pooling, then the DataSource itself uses a ConnectionPoolDataSource (or a similar internal mechanism) to obtain a PooledConnection. This PooledConnection manages the actual physical connection to the database.
When a user requests a connection from the DataSource, the DataSource will check out a PooledConnection and ask it for a Connection. The PooledConnection will then create a logical connection that uses or wraps the physical connection (e.g. using a Proxy). The DataSource will return that logical connection to the user.
To the user, the logical connection should behave identically to a physical connection in all aspects. So when a user closes the connection, that logical connection and all dependent JDBC objects will be closed, behaving exactly as they would on a physical connection close.
JDBC 4.1 section 11.1 says:
Connection pooling is completely transparent to the client: A client obtains a pooled
connection and uses it just the same way it obtains and uses a non pooled connection.
And section 11.4:
If the application attempts to reuse the logical handle, the Connection implementation
throws an SQLException.
and
For a given PooledConnection object, only the most recently produced logical Connection object will be valid. Any previously existing Connection object is automatically closed when the associated PooledConnection.getConnection method is called.
In the background however, when the logical connection is closed, the PooledConnection will signal the DataSource that it is available for reuse, and the DataSource will then return it to the connection pool, or close the PooledConnection (which closes the physical connection) if it no longer needs the connection.
The DataSource can also forcefully revoke a connection from a user (e.g. when a connection has been checked out for too long) by asking the PooledConnection to close the logical connection.
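The logical-handle behavior described above can be sketched with a dynamic proxy: close() invalidates the handle and notifies the owner, while the physical connection stays open. This is only an illustration of the idea, not how any particular driver or pool implements it; the class name, the onClose callback, and the physical stand-in are assumptions.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

class LogicalConnectionDemo {

    /** Wraps a physical connection in a logical handle that is unusable after close(). */
    static Connection logicalHandle(Connection physical, Runnable onClose) {
        boolean[] closed = {false};
        return (Connection) Proxy.newProxyInstance(
                LogicalConnectionDemo.class.getClassLoader(), new Class<?>[] {Connection.class},
                (self, method, args) -> {
                    switch (method.getName()) {
                        case "close":
                            if (!closed[0]) {
                                closed[0] = true;
                                onClose.run(); // notify the owner; the physical connection stays open
                            }
                            return null;
                        case "isClosed":
                            return closed[0];
                        default:
                            if (closed[0]) {
                                throw new SQLException("Logical connection is closed");
                            }
                            try {
                                return method.invoke(physical, args);
                            } catch (InvocationTargetException e) {
                                throw e.getCause();
                            }
                    }
                });
    }

    /** Stand-in for a driver connection that only tracks whether it was really closed. */
    static Connection physicalStub(boolean[] physicallyClosed) {
        return (Connection) Proxy.newProxyInstance(
                LogicalConnectionDemo.class.getClassLoader(), new Class<?>[] {Connection.class},
                (self, method, args) -> {
                    if (method.getName().equals("close")) {
                        physicallyClosed[0] = true;
                        return null;
                    }
                    if (method.getName().equals("isClosed")) return physicallyClosed[0];
                    return method.getReturnType() == boolean.class ? Boolean.TRUE : null;
                });
    }

    /** Closes the logical handle, then tries to reuse it. */
    static String demo() {
        int[] returned = {0};
        Connection physical = physicalStub(new boolean[] {false});
        Connection logical = logicalHandle(physical, () -> returned[0]++);
        try {
            logical.getAutoCommit(); // delegated while the handle is open
            logical.close();
            boolean rejected;
            try {
                logical.getAutoCommit();
                rejected = false;
            } catch (SQLException e) {
                rejected = true;
            }
            return "returned=" + returned[0] + " rejected=" + rejected
                    + " physicalClosed=" + physical.isClosed();
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The demo returns "returned=1 rejected=true physicalClosed=false": the owner was notified once, the closed handle rejects further use, and the physical connection is still open.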
connection1.close() (through DriverManager.getConnection()),
This will close the physical connection established to the database and all the resources viz. Resultset, Statement, Connection are released. So, you cannot access them after the connection is closed.
connection2.close() (through DataSource.getConnection())
This is DataSource-implementation dependent, so the behavior need not be consistent across different DataSource implementations. Also, within a given DataSource implementation, the connection's actual life cycle depends on so many other parameters that it is strongly recommended not to treat this Connection differently from one obtained through DriverManager.
If you really want the data held in the ResultSet to be available after the Statement and Connection are closed, you can take a look at CachedRowSet if that fits your use case.
The client-side caching might depend on the driver being used for the connection; I'm not sure.
But some drivers specifically prevent you from using a statement or result set once the connection has been closed. Others maintain the result set on the client side.
Well, I've been thinking of making database requests a little faster by keeping the connection to the database open as long as the object is being used. So I was thinking of opening the connection in the constructor of that class.
Now the question is, how can I close the connection after I have stopped using it? I have to call close() somewhere, don't I?
I've been reading about the finalize() method, but people seemed to be skeptical about usage of this method anywhere at all. I'd expect it to have something like a destructor, but Java doesn't have that, so?
So could anyone provide me with a solution? Thanks in advance.
I would suggest that you implement database connection pooling instead, if the application allows it. With connection pooling, a pool of connections is created and stays connected to the database. Your application then grabs an open, unused connection from the pool, uses it, and returns it to the pool.
This would allow you to acquire connections faster and you won't have to modify your classes too much. Database connection pooling is a great technique if you need to scale your application.
The other benefit is that your database connection pool will be managed by some sort of driver which will take care of opening connections, keeping them open, growing the pool if required and also shrinking the pool when extra connections are not used for a certain amount of time. This would be similar to the code you are trying to implement in the constructor and finalization methods.
Generally speaking, you acquire a database connection only when needed and release it as soon as possible.
I would recommend making your class implement java.io.Closeable. According to this interface you will have to implement void close() throws IOException, which all clients of the class will call, because it's good practice to close Closeable objects after use.
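A minimal sketch of that suggestion follows. ProjectStore is a hypothetical class name, and the stub connection in the demo exists only so the example runs without a database. Because Closeable extends AutoCloseable, clients can use try-with-resources to guarantee that close() is called.

```java
import java.io.Closeable;
import java.io.IOException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical class that keeps one connection open for its whole lifetime. */
class ProjectStore implements Closeable {
    private final Connection connection;

    ProjectStore(Connection connection) {
        this.connection = connection;
    }

    // ... methods that run queries on this.connection would go here ...

    /** Releases the connection. Closeable only permits IOException, so SQLException is wrapped. */
    @Override
    public void close() throws IOException {
        try {
            connection.close();
        } catch (SQLException e) {
            throw new IOException("Could not close the database connection", e);
        }
    }
}

class ProjectStoreDemo {
    /** Returns true if leaving the try block closed the underlying connection. */
    static boolean demo() {
        boolean[] closed = {false};
        // Stand-in for a real connection: it only remembers that close() was called.
        Connection stub = (Connection) Proxy.newProxyInstance(
                ProjectStoreDemo.class.getClassLoader(), new Class<?>[] {Connection.class},
                (self, method, args) -> {
                    if (method.getName().equals("close")) {
                        closed[0] = true;
                    }
                    return null;
                });
        try (ProjectStore store = new ProjectStore(stub)) {
            // work with the store here; close() runs automatically afterwards
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return closed[0];
    }
}
```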
As per my understanding, JDBC Connection Pooling (at a basic level) works this way:
create connections during app initialization and put in a cache
provide these cached connections on demand to the app
a separate thread maintains the Connection Pool, performing activities like:
discard connections that have been used (closed)
create new connections and add to the cache to maintain a specific count of connections
But whenever I hear the term "connection reuse" in a JDBC Connection Pooling discussion, I get confused. When does the connection reuse occur?
Does it mean that the Connection Pool provides the same connection for two different database interactions (without closing it)? Or is there a way to continue using a connection even after it gets closed after a DB call?
Connection pooling works by re-using connections. Applications "borrow" a connection from the pool, then "return" it when finished. The connection is then handed out again to another part of the application, or even a different application.
This is perfectly safe as long as the same connection is not in use by two threads at the same time.
The key point with connection pooling is to avoid creating new connections where possible, since it's usually an expensive operation. Reusing connections is critical for performance.
The connection pool does not provide you with the actual Connection instance from the driver, but returns a wrapper. When you call 'close()' on a Connection instance from the pool, it will not close the driver's Connection, but instead just return the open connection to the pool so that it can be re-used (see skaffman's answer).
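The borrow-and-return cycle can be sketched with a deliberately tiny pool. TinyPool and ConnectionFactory are hypothetical names; a real pool adds size limits, validation, and timeouts that are left out here, and the stub factory only exists so the example runs without a database.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical callback that opens a new physical connection. */
interface ConnectionFactory {
    Connection open() throws SQLException;
}

/** Deliberately tiny, non-production pool: no size limit, validation, or timeouts. */
class TinyPool {
    private final ConnectionFactory factory;
    private final Deque<Connection> idle = new ArrayDeque<>();
    private int physicalConnectionsOpened;

    TinyPool(ConnectionFactory factory) {
        this.factory = factory;
    }

    /** Hands out a wrapper around an idle connection, opening a new one only if none is idle. */
    synchronized Connection borrow() throws SQLException {
        Connection physical = idle.poll();
        if (physical == null) {
            physical = factory.open();
            physicalConnectionsOpened++;
        }
        return wrap(physical);
    }

    synchronized int physicalConnectionsOpened() {
        return physicalConnectionsOpened;
    }

    private synchronized void giveBack(Connection physical) {
        idle.push(physical);
    }

    /** The wrapper's close() returns the physical connection to the pool instead of closing it. */
    private Connection wrap(Connection physical) {
        boolean[] returned = {false};
        return (Connection) Proxy.newProxyInstance(
                TinyPool.class.getClassLoader(), new Class<?>[] {Connection.class},
                (self, method, args) -> {
                    if (method.getName().equals("close")) {
                        if (!returned[0]) {
                            returned[0] = true;
                            giveBack(physical);
                        }
                        return null;
                    }
                    try {
                        return method.invoke(physical, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause();
                    }
                });
    }
}

class TinyPoolDemo {
    /** Stand-in for a driver connection; every method is a no-op. */
    static Connection openStub() {
        return (Connection) Proxy.newProxyInstance(
                TinyPoolDemo.class.getClassLoader(), new Class<?>[] {Connection.class},
                (self, method, args) -> null);
    }

    /** Three borrow/close cycles in a row; returns how many physical connections were opened. */
    static int demo() {
        TinyPool pool = new TinyPool(TinyPoolDemo::openStub);
        try {
            for (int i = 0; i < 3; i++) {
                try (Connection conn = pool.borrow()) {
                    conn.setAutoCommit(true); // any work; delegated to the physical connection
                }
            }
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
        return pool.physicalConnectionsOpened();
    }
}
```

Three sequential borrow/close cycles open only one physical connection, which is the reuse the question asks about.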
Connection pooling reuses connections.
Here is how Apache DBCP works under the hood.
Connection poolableConnection= apacheDbcpDataSource.getConnection();
Apache DBCP implementation returns connection wrapper which is of type PoolableConnection.
poolableConnection.close();
PoolableConnection.close() checks whether the actual underlying connection is closed; if it is not, it returns this PoolableConnection instance to the connection pool (GenericObjectPool in this case).
if (!isUnderlyingConectionClosed) {
// Normal close: underlying connection is still open, so we
// simply need to return this proxy to the pool
try {
genericObjectPool.returnObject(this); //this is PoolableConnection instance in this case
....
}
My understanding is the same as stated above and, thanks to a bug, I have evidence that it's correct. In the application I work with there was a bug, an SQL command with an invalid column name. On execution an exception is thrown. If the connection is closed, then the next time a connection is obtained and used, with correct SQL this time, an exception is thrown again, and the error message is the same as the first time even though the incorrect column name doesn't even appear in the second SQL. So the connection is obviously being reused. If the connection is not closed after the first exception is thrown (because of the bad column name), then the next time a connection is used everything works just fine. Presumably this is because the first connection hasn't been returned to the pool for reuse. (This bug is occurring with Java 1.6_30 and a connection to a MySQL database.)