Cassandra - using PreparedStatement with ListenableFuture

I'm attempting to make async writes to a Cassandra cluster using ListenableFuture as follows:
private static Cluster cluster = null;
private ListeningExecutorService executorService;
private PreparedStatement preparedStatement;
private Session session = null;
...
executorService = MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(POOL_SIZE));
...
public void writeValue(Tuple tuple) {
    ListenableFuture<String> future = executorService.submit(new Callable<String>() {
        @Override
        public String call() throws Exception {
            if (session == null) {
                session = getCluster().connect("dbname");
                preparedStatement = session.prepare(queryString);
            }
            try {
                BoundStatement boundStatement = preparedStatement.bind(tuple values);
                session.execute(boundStatement);
            } catch (Exception exception) {
                // handle exception
            }
            return null;
        }
    });
}
If I set POOL_SIZE to 1 everything works.
If I set POOL_SIZE to > 1 I get errors as follows:
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Tried to execute unknown prepared query : 0x75c5b41b9f07afa5384a69790503f963. You may have used a PreparedStatement that was created with another Cluster instance.
So I moved session and preparedStatement into local vars. Then I get warnings about Re-preparing already prepared query ... plus it's creating a new session every time.
I want to reuse as much as possible. What am I doing wrong and what are my options?
Would it help to make this class static?

You have all sorts of race conditions here and execution isn't thread safe.
Cluster, Session, and PreparedStatement are each meant to be application-scoped singletons, i.e. you only need one of each (one per query for PreparedStatement).
However, you are recreating a Session and potentially preparing PreparedStatement multiple times.
Don't. Initialize your Session once, in a constructor or some location that only runs once and prepare your statements at the same time. Then use the Session and PreparedStatement where appropriate.
Using a single threaded executor, everything runs as if it was synchronous. When you add more threads, many of them may call
session.prepare(queryString);
at the same time. Or the PreparedStatement you use here
BoundStatement boundStatement = preparedStatement.bind(tuple values);
session.execute(boundStatement);
might be different from the one you initialized
preparedStatement = session.prepare(queryString);
even within the same thread of execution. Or you might be attempting to execute the PreparedStatement with a different Session than the one used to initialize it.
Here are some things you should be doing when using CQL drivers.
Is a prepared statement bound on one session or is it useable on another session?
A prepared statement is derived from a particular session instance.
So when you prepare a statement and it is sent over to the server, it
is sent to the cluster with which this session instance is associated
with.
The javadoc of Session states
Session instances are thread-safe and usually a single instance is
enough per application.
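To make the "initialize once, then share" advice concrete, here is a minimal sketch. It deliberately does not use the DataStax driver: StubSession is a hypothetical stand-in that only counts calls, and ValueWriter, SingleInitDemo and the query text are likewise made up for illustration. The point is only the shape: the session and the prepared statement are created once in a constructor and stored in final fields, and worker threads do nothing but read them.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a driver session; this is NOT the DataStax API.
class StubSession {
    final AtomicInteger prepareCalls = new AtomicInteger();
    final AtomicInteger executeCalls = new AtomicInteger();

    // Stands in for session.prepare(queryString).
    String prepare(String query) {
        prepareCalls.incrementAndGet();
        return "prepared:" + query;
    }

    // Stands in for preparedStatement.bind(...) followed by session.execute(...).
    void execute(String prepared, Object value) {
        executeCalls.incrementAndGet();
    }
}

class ValueWriter {
    private final StubSession session; // one instance, shared by every thread
    private final String insert;       // prepared exactly once, in the constructor

    ValueWriter(StubSession session, String queryString) {
        this.session = session;
        this.insert = session.prepare(queryString);
    }

    // Called from many threads; it only reads final fields.
    void writeValue(Object value) {
        session.execute(insert, value);
    }
}

class SingleInitDemo {
    // Runs `writes` writes on `threads` threads; returns {prepareCalls, executeCalls}.
    static int[] run(int threads, int writes) {
        StubSession session = new StubSession();
        // Hypothetical table and column names.
        ValueWriter writer = new ValueWriter(session, "INSERT INTO demo (value) VALUES (?)");
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < writes; i++) {
            final int value = i;
            pool.execute(() -> writer.writeValue(value));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writes did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { session.prepareCalls.get(), session.executeCalls.get() };
    }
}
```

Calling SingleInitDemo.run(4, 100) performs 100 writes on four threads while prepare is called only once, which is the behaviour the question was trying to reach with POOL_SIZE > 1.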

You might also want to use the driver's asynchronous API. Instead of calling execute (which will block your thread for the duration of the query), call executeAsync and register a callback on the resulting future to process the result.
If that callback is expensive and you don't want to block the driver's internal I/O thread, then you can provide your own executor:
ListenableFuture<ResultSet> future = session.executeAsync(statement);
Futures.addCallback(future, new FutureCallback<ResultSet>() {
public void onSuccess(ResultSet rs) { ... }
public void onFailure(Throwable t) { ... }
},
executorService);
This page in the documentation has a few tips on async programming.

Related

Trying to identify a design pattern or strategy used to isolate, utilize, and manage database connections

I ran into some code and I wanted to research other people's approaches to it, but I'm not sure what the design pattern is called. I tried searching for "database executer" and mostly got results about Java's Executor framework which is unrelated.
The pattern I'm trying to identify uses a single class to manage connections and execute queries through the use of functions that allow you to isolate any issues related to connection management.
Example:
// Service class
public class Service {
    private final Executer executor;
    public void query(String query) {
        ResultSet rs = (ResultSet) executor.execute((connection) -> {
            Statement st = connection.createStatement();
            return st.executeQuery(query);
        });
    }
}
// Executer class
public class Executer {
    private final DataSource dataSource;
    public Object execute(Function<Connection, Object> function) {
        Connection connection = dataSource.getConnection();
        try {
            return function.apply(connection);
        } catch (Exception e) {
            log...
        } finally {
            // close or return connection to pool
        }
    }
}
As you can see from above, if you ever have a connection leak you don't need to search through a bunch of DAOs or services, it's all contained in a single executor class. Any idea what this strategy or design pattern is called? Anyone see this before or know of open source projects that utilize this strategy/pattern?

How to release a connection automatically when exiting any method of a class?

So, here is some background info: I'm currently working at a company providing SaaS, and my work involves writing methods that use JDBC to retrieve and process data in a database. Here is the problem: most of the methods follow a certain pattern to manage the connection:
public Object someMethod(Object... parameters) throws MyCompanyException {
    Connection con = null; // declared outside try so catch and finally can see it
    try {
        con = ConnectionPool.getConnection();
        con.setAutoCommit(false);
        // do something here
        con.commit();
        con.setAutoCommit(true);
    }
    catch (SomeException1 e) {
        con.rollback();
        throw new MyCompanyException(e);
    }
    catch (SomeException2 e) {
        con.rollback();
        throw new MyCompanyException(e);
    }
    // repeat until all exceptions are caught and handled
    finally {
        ConnectionPool.freeConnection(con);
    }
    // return something if the method is not void
}
It has already been adopted as a company standard to write all methods like this, so that a method rolls back all the changes it has made if any exception is caught, and the connection is freed as soon as possible. However, from time to time some of us forget one of these routine steps when coding, such as releasing the connection or rolling back when an error occurs, and such mistakes are not easily detected until our customers complain. So I've decided to have these routine steps done automatically even if they are not declared in the method. Connection initiation and setup can easily be done in the constructor.
public abstract class SomeAbstractClass {
    protected Connection con;
    public SomeAbstractClass() {
        con = ConnectionPool.getConnection();
        con.setAutoCommit(false);
    }
}
But the real problem is getting the connection released automatically, immediately after the method finishes. I've considered using finalize() to do so, but this is not what I'm looking for: finalize() is called by the GC, which means my object might not be finalized when the method finishes, or even once the object can never be referenced again. finalize() is only called when the JVM really runs out of memory.
Is there any way to free my connection automatically and immediately when the method finishes its job?

Use "try with resources". It is a programming pattern such that you write a typical looking try - catch block, and if anything goes wrong or you exit it, the resources are closed.
try (Connection con = ConnectionPool.getConnection()) {
con.doStuff(...);
}
// at here Connection con is closed.
It works because Connection extends AutoCloseable: any object declared in the "resource acquisition" portion of the try statement must implement AutoCloseable (Closeable is a subinterface of it), and its close() method will be called before control is passed out of the try block.
This removes the need for finally { ... } in many scenarios, and is actually safer than most hand-written finally { ... } blocks: the resource is closed before any catch { ... } or finally { ... } block runs, and an exception thrown by close() while another exception is already propagating is attached to it as a suppressed exception instead of replacing it.
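Try-with-resources takes care of releasing the connection, but the commit and rollback from the question still have to be written somewhere. The sketch below shows one way the two might be combined. SqlWork, ConnectionSource and Transactions are hypothetical names introduced here; ConnectionSource stands in for the question's ConnectionPool.getConnection(), and the sketch assumes that calling close() on a pooled connection hands it back to the pool.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical callback type for the unit of work; not part of JDBC.
interface SqlWork<T> {
    T run(Connection con) throws SQLException;
}

class Transactions {
    // Hypothetical stand-in for ConnectionPool.getConnection().
    interface ConnectionSource {
        Connection get() throws SQLException;
    }

    static <T> T inTransaction(ConnectionSource source, SqlWork<T> work) throws SQLException {
        // close() is called on every exit path, normal or exceptional.
        try (Connection con = source.get()) {
            con.setAutoCommit(false);
            try {
                T result = work.run(con);
                con.commit();
                return result;
            } catch (SQLException | RuntimeException e) {
                con.rollback(); // undo partial changes, then let the caller see the failure
                throw e;
            }
        }
    }
}
```

A caller would write something like Transactions.inTransaction(ConnectionPool::getConnection, con -> { ... }), so forgetting the rollback or the release is no longer possible in individual methods.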

One of the standard ways to do this is using AOP. You can look at how Spring Framework handles JDBC transactions and connections and manages them using MethodInterceptor. My advice is to use Spring in your project and not reinvent the wheel.
The idea behind MethodInterceptor is that you add code that creates and opens a connection before a JDBC-related method is called, puts the connection into a thread local so that your method can get the connection to make SQL calls, and then closes it after the method has executed.

You could add a method to your ConnectionPool class for example:
public static <T> T execute(Function<Connection, T> query,
                            T defaultValue,
                            Object... parameters) throws MyCompanyException {
    Connection con = null; // declared outside try so catch and finally can see it
    try {
        con = ConnectionPool.getConnection();
        con.setAutoCommit(false);
        T result = query.apply(con);
        con.commit();
        con.setAutoCommit(true);
        return result;
    } catch (SomeException1 e) {
        con.rollback();
        throw new MyCompanyException(e);
    }
    //etc.
    finally {
        ConnectionPool.freeConnection(con);
    }
    // reached only if one of the catch blocks above swallows its exception
    return defaultValue;
}
And you call it from the rest of your code with:
public Object someMethod(Object... parameters) throws MyCompanyException {
return ConnectionPool.execute(
con -> { ... }, //use the connection and return something
null, //default value
parameters
);
}

Does synchronized lock a Result Set object?

I'm trying to process a ResultSet from multiple threads. I want to make sure that whenever one of the threads calls next(), all other threads are locked out. This is important because if many threads call next() simultaneously, rows will be skipped. Here is what I did:
public class MainClass {
private static ResultSet rs;
public static void main (String [] args) {
Thread thread1 = new Thread(new Runnable() {
@Override
public void run() {
runWhile();
}});
Thread thread2 = new Thread(new Runnable() {
@Override
public void run() {
runWhile();
}});
thread1.start();
thread2.start();
thread1.join();
thread2.join();
System.exit(0);
}
private static void runWhile () {
String username = null;
while ((username = getUsername()) != null) {
// Use username to complete my logic
}
}
/**
* This method locks ResultSet rs until the String username is retrieved.
* This prevents skipping the rows
* @return
* @throws SQLException
*/
private synchronized static String getUsername() throws SQLException {
if(rs.next()) {
return rs.getString(1).trim();
}
else
return null;
}
}
Is this a correct way of using synchronized? Does it lock the ResultSet and make sure other threads do not interfere?
Is this a good approach ?

JDBC objects shouldn't be shared between threads. That goes for Connections, Statements, and ResultSets. The best case here would be that the JDBC vendor follows the spec and does internal locking so that you can get by with this, in which case all the threads are still trying to acquire the same lock and only one can make progress at a time. This will be slower than using a single thread, because on top of doing the same work to read from the database there is extra overhead from managing all the threading.
(Locking done by the driver could be for the driver's benefit, so the provider doesn't have to deal with bug reports of race conditions caused by users misusing their software. That it does locking doesn't necessarily imply the software should actually be used by multiple threads.)
Multithreading works when threads can make progress concurrently, see Amdahl's Law. If you have a situation where you can read the ResultSet and use the results to create tasks which you submit to an ExecutorService (as Peter Lawrey recommends in a comment) then that would make more sense (as long as those tasks can work independently and don't have to wait on each other).
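The "one reader, many workers" arrangement described above could look roughly like the sketch below. To keep it runnable without a database, an Iterator<String> stands in for the ResultSet (hasNext()/next() play the role of rs.next()/rs.getString(1)), and RowFanOut and handle are hypothetical names. Only the calling thread touches the row source; the pool threads receive plain String values.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RowFanOut {
    // One thread drains the rows; the workers only ever see plain values.
    static List<String> processAll(Iterator<String> rows, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> pending = new ArrayList<>();
            while (rows.hasNext()) {                  // stands in for rs.next()
                String username = rows.next().trim(); // stands in for rs.getString(1).trim()
                pending.add(pool.submit(() -> handle(username)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get());                 // collected in submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Hypothetical per-row work; replace with the real logic.
    static String handle(String username) {
        return username.toUpperCase(Locale.ROOT);
    }
}
```

This only pays off if handle is expensive compared with reading a row and the tasks do not have to wait on each other, as the answer notes.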

I suggest creating the ResultSet, then copying all the data into a DTO (Data Transfer Object) or a DAO (Data Access Object). Once the data is in the DTO or DAO, close your ResultSet, Statement and Connection.
A very simple structure for creating a DTO/DAO that stores records in order, with their fields and parsing capabilities, is this:
ArrayList<HashMap<String, Object>> table = new ArrayList<HashMap<String, Object>>();
HashMap<String, Object> record = new HashMap<String, Object>();
String field1 = "something";
Integer field2 = Integer.valueOf(45); // new Integer(int) is deprecated
record.put("field1", field1);
record.put("field2", field2);
table.add(record);
You may (and probably you should) automate and make the DTO/DAO flexible enough to use the same class in any table, without hard code or fixed names.
Remember that you will need to create a wrapper and the methods for storing/reading the data, and that these methods should be thread safe.
Keep in mind that this design only works if you have enough memory to store all the records of your ResultSet.
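One possible way to automate the copy without hard-coded field names is to read the column labels from ResultSetMetaData. The following is a sketch under that assumption; Rows and readAll are hypothetical names, and the method loads every remaining row into memory, so the caveat about available memory applies in full.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class Rows {
    // Copies every remaining row into memory, keyed by column label.
    static List<Map<String, Object>> readAll(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        int columns = meta.getColumnCount();
        List<Map<String, Object>> table = new ArrayList<>();
        while (rs.next()) {
            Map<String, Object> record = new LinkedHashMap<>(); // keeps column order
            for (int i = 1; i <= columns; i++) {                // JDBC columns are 1-based
                record.put(meta.getColumnLabel(i), rs.getObject(i));
            }
            table.add(record);
        }
        return table;
    }
}
```

The returned list is an ordinary collection, so after the ResultSet, Statement and Connection are closed it can be handed to worker threads, provided they only read it or access is otherwise made thread safe.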

DB connection pool getting exhausted -- Java

I am using Connection Pool (snaq.db.ConnectionPool) in my application. The connection pool is initialized like:
String dburl = propertyUtil.getProperty("dburl");
String dbuserName = propertyUtil.getProperty("dbuserName");
String dbpassword = propertyUtil.getProperty("dbpassword");
String dbclass = propertyUtil.getProperty("dbclass");
String dbpoolName = propertyUtil.getProperty("dbpoolName");
int dbminPool = Integer.parseInt(propertyUtil.getProperty("dbminPool"));
int dbmaxPool = Integer.parseInt(propertyUtil.getProperty("dbmaxPool"));
int dbmaxSize = Integer.parseInt(propertyUtil.getProperty("dbmaxSize"));
long dbidletimeout = Long.parseLong(propertyUtil.getProperty("dbidletimeout"));
Class.forName(dbclass).newInstance();
ConnectionPool moPool = new ConnectionPool(dbpoolName, dbminPool, dbmaxPool, dbmaxSize,
dbidletimeout, dburl, dbuserName, dbpassword);
DB Pool values used are:
dbminPool=5
dbmaxPool=30
dbmaxSize=30
dbclass=org.postgresql.Driver
dbidletimeout=25
My application was leaking connections somewhere (connections were not released), and because of that the connection pool was getting exhausted. I have fixed that code for now.
Shouldn't the connections be closed after the idle timeout period? If that assumption is not correct, is there any way to close the open idle connections anyway (through Java code only)?

The timeout variable does not seem to correspond to how long a connection has been idle, but to how long the pool can wait to return a new connection before throwing an exception (I had a look at this source code; I don't know if it is up to date). I think it would be rather difficult to keep track of "idle" connections, because what does "idle" really mean in this case? You might want to get a connection for later use. So I would say that the only safe way for the connection pool to know that you are done with a connection is for you to call close() on it.
If you are worried about the development team forgetting to call close() in their code, there is a technique which I describe below and I have used in the past (in my case we wanted to keep track of unclosed InputStreams but the concept is the same).
Disclaimer:
I assume that the connections are only used during a single request and do not span during consecutive requests. In the latter case you can't use the solution below.
Your connection pool implementation seems to already use similar techniques with the ones I describe below (i.e. it already wraps the connections) so I cannot possibly know if this will work for your case or not. I have not tested the code below, I just use it to describe the concept.
Please use that only in your development environment. In production you should feel confident that your code is tested and that it behaves correctly.
Having said the above, the main idea is this: We have a central place (the connection pool) from where we acquire resources (connections) and we want to keep track if those resources are released by our code. We can use a web Filter that uses a ThreadLocal object that keeps track of the connections used during the request. I named this class TrackingFilter and the object that keeps track of the resources is the Tracker class.
public class TrackingFilter implements Filter {
@Override
public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
Tracker.start();
try {
chain.doFilter(request, response);
} finally {
Tracker.stop();
}
}
...
}
For the Tracker to be able to keep track of the connections, it needs to be notified every time a connection is acquired with getConnection() and every time a connection is closed with a close() call. To be able to do that in a way that is transparent to the rest of the code we need to wrap the ConnectionPool and the returned Connection objects. Your code should return the new TrackingConnectionPool instead of the original pool (I assume the way to access the connection pool is at a single place). This new pool will wrap in turn, every Connection it provides, as a TrackableConnection. The TrackableConnection is the object that knows how to notify our Tracker when created and when closed.
When you call Tracker.stop() at the end of the request it will report any connections for which close() has not been called yet. Since this is a per request operation you will identify only the faulty operations (i.e. during your "Create new product" functionality) and then hopefully you will be able to track down those queries that leave open connections and fix them.
Below you can find code and comments for the TrackingConnectionPool, TrackableConnection and the Tracker class. The delegate methods were left out for brevity. I hope that helps.
Note: For the wrappers use an automated IDE feature (like Eclipse's "Generate delegate methods") otherwise it would be a time-consuming and error prone task.
//------------- Pool Creation
ConnectionPool original = new ConnectionPool(String dbpoolName, ...);
TrackingConnectionPool trackingCP = new TrackingConnectionPool(original);
// ... or without creating the ConnectionPool yourself
TrackingConnectionPool trackingCP = new TrackingConnectionPool(dbpoolName, ...);
// store the reference to the trackingCP instead of the original
//------------- TrackingConnectionPool
public class TrackingConnectionPool extends ConnectionPool {
private ConnectionPool originalPool; // reference to the original pool
// Wrap all available ConnectionPool constructors like this
public TrackingConnectionPool(String dbpoolName, ...) {
originalPool = new ConnectionPool(dbpoolName, ...);
}
// ... or use this convenient constructor after you create a pool manually
public TrackingConnectionPool(ConnectionPool pool) {
this.originalPool = pool;
}
@Override
public Connection getConnection() throws SQLException {
Connection con = originalPool.getConnection();
return new TrackableConnection(con); // wrap the connections with our own wrapper
}
@Override
public Connection getConnection(long timeout) throws SQLException {
Connection con = originalPool.getConnection(timeout);
return new TrackableConnection(con); // wrap the connections with our own wrapper
}
// for all the rest public methods of ConnectionPool and its parent just delegate to the original
@Override
public void setCaching(boolean b) {
originalPool.setCaching(b);
}
...
}
//------------- TrackableConnection
public class TrackableConnection implements Connection, Tracker.Trackable {
private Connection originalConnection;
private boolean released = false;
public TrackableConnection(Connection con) {
this.originalConnection = con;
Tracker.resourceAquired(this); // notify tracker that this resource is acquired
}
// Trackable interface
@Override
public boolean isReleased() {
return this.released;
}
// Note: this method will be called by Tracker class (if needed). Do not invoke manually
@Override
public void release() {
if (!released) {
try {
// attempt to close the connection
originalConnection.close();
this.released = true;
} catch(SQLException e) {
throw new RuntimeException(e);
}
}
}
// Connection interface
@Override
public void close() throws SQLException {
originalConnection.close();
this.released = true;
Tracker.resourceReleased(this); // notify tracker that this resource is "released"
}
// rest of the methods just delegate to the original connection
@Override
public Statement createStatement() throws SQLException {
return originalConnection.createStatement();
}
....
}
//------------- Tracker
public class Tracker {
// Create a single object per thread
private static final ThreadLocal<Tracker> _tracker = new ThreadLocal<Tracker>() {
@Override
protected Tracker initialValue() {
return new Tracker();
};
};
public interface Trackable {
boolean isReleased();
void release();
}
// Stores all the resources that are used during the thread.
// When a resource is used a call should be made to resourceAquired()
// Similarly when we are done with the resource a call should be made to resourceReleased()
private Map<Trackable, Trackable> monitoredResources = new HashMap<Trackable, Trackable>();
// Call this at the start of each thread. It is important to clear the map
// because you can't know if the server reuses this thread
public static void start() {
Tracker monitor = _tracker.get();
monitor.monitoredResources.clear();
}
// Call this at the end of each thread. If all resources have been released
// the map should be empty. If it isn't then someone, somewhere forgot to release the resource
// A warning is issued and the resource is released.
public static void stop() {
Tracker monitor = _tracker.get();
if ( !monitor.monitoredResources.isEmpty() ) {
// there are resources that have not been released. Issue a warning and release each one of them
for (Iterator<Trackable> it = monitor.monitoredResources.keySet().iterator(); it.hasNext();) {
Trackable resource = it.next();
if (!resource.isReleased()) {
System.out.println("WARNING: resource " + resource + " has not been released. Releasing it now.");
resource.release();
} else {
System.out.println("Trackable " + resource
+ " is released but is still under monitoring. Perhaps you forgot to call resourceReleased()?");
}
}
monitor.monitoredResources.clear();
}
}
// Call this when a new resource is acquired i.e. you a get a connection from the pool
public static void resourceAquired(Trackable resource) {
Tracker monitor = _tracker.get();
monitor.monitoredResources.put(resource, resource);
}
// Call this when the resource is released
public static void resourceReleased(Trackable resource) {
Tracker monitor = _tracker.get();
monitor.monitoredResources.remove(resource);
}
}

You don't have your full code posted so I assume you are not closing your connections. You STILL need to close the connection object obtained from the pool as you would if you were not using a pool. Closing the connection makes it available for the pool to reissue to another caller. If you fail to do this, you will eventually consume all available connections from your pool. A pool's stale connection scavenger is not the best place to clean up your connections. Like your momma told you, put your things away when you are done with them.
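Connection conn = null;
try {
    conn = moPool.getConnection(timeout);
    if (conn != null) {
        // do something
    }
} catch (Exception e) {
    // deal with me
} finally {
    if (conn != null) {
        try {
            conn.close(); // makes the connection available to the pool again
        } catch (Exception e) {
            // maybe deal with me
        }
    }
}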

The whole point of connection pooling is to let the pool handle all such things for you.
Code that closes open idle connections in the pool will not help in your case.
Think of the connection pool as maintaining maps of IDLE and IN-USE connections.
IN-USE: if a connection object is being referenced by the application, the pool puts it into the in-use map.
IDLE: if a connection object is not referenced by the application, or has been closed, the pool puts it into the idle map.
Your pool was exhausted because you were not closing connections. As a result, every idle connection ended up in the in-use map.
Since the idle map had no entries available, the pool was forced to create more connections.
In this way all your connections got marked as IN-USE.
Your pool does not have any open idle connections that you could close from code.
The pool is not in a position to close any connection even when a timeout occurs, because nothing is idle.
You did the best thing when you fixed the connection leak in your code.
You can force the pool to release and recreate it, but be careful: existing connections that are in use might be interrupted in their tasks.
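The idle/in-use bookkeeping described above can be illustrated with a toy pool. This is a sketch of the idea only, not how snaq.db.ConnectionPool is implemented: ToyPool and its method names are made up, and plain strings stand in for connections.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Toy pool of string "connections"; illustrates the bookkeeping only.
class ToyPool {
    private final Deque<String> idle = new ArrayDeque<>();
    private final Set<String> inUse = new HashSet<>();
    private final int maxSize;
    private int created = 0;

    ToyPool(int maxSize) {
        this.maxSize = maxSize;
    }

    synchronized String borrow() {
        String con = idle.poll();            // prefer an idle connection
        if (con == null) {
            if (created == maxSize) {
                throw new IllegalStateException(
                        "pool exhausted: " + inUse.size() + " in use, 0 idle");
            }
            con = "con-" + (++created);      // otherwise create a new one
        }
        inUse.add(con);
        return con;
    }

    // What close() on a pooled connection amounts to: move it back to idle.
    synchronized void giveBack(String con) {
        if (inUse.remove(con)) {
            idle.push(con);
        }
    }

    synchronized int idleCount() {
        return idle.size();
    }

    synchronized int inUseCount() {
        return inUse.size();
    }
}
```

With new ToyPool(2), two borrow() calls that are never followed by giveBack() leave zero idle entries, so a third borrow() fails, and no idle timeout could help because nothing is idle. Once one connection is given back, the next borrow() succeeds.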

In most connection pools, the idle timeout is the maximum time a connection may sit idle in the pool (waiting to be requested), not how long it is in use (checked out from the pool).
Some connection pools also have timeout settings for how long a connection is allowed to be in use (e.g. DBCP has removeAbandonedTimeout, c3p0 has unreturnedConnectionTimeout). If those are enabled and the timeout has expired, the connection is forcefully revoked from the user and either returned to the pool or really closed.

log4jdbc can help with troubleshooting connection leaks by means of its jdbc.connection logger.
This technique doesn't require any modification of the code.

dao as a member of a servlet - normal?

I guess, DAO is thread safe, does not use any class members.
So can it be used without any problem as a private field of a Servlet ? We need only one copy, and
multiple threads can access it simultaneously, so why bother creating a local variable, right?

"DAO" is just a general term for database abstraction classes. Whether they are threadsafe or not depends on the specific implementation.
This bad example could be called a DAO, but it would get you into trouble if multiple threads call the insert method at the same time.
class MyDAO {
private Connection connection = null;
public boolean insertSomething(Something o) throws Exception {
try {
connection = getConnection();
//do insert on connection.
} finally {
if (connection != null) {
connection.close();
}
}
}
}
So the answer is: if your DAO handles connections and transactions right, it should work.
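For contrast with the bad example, a DAO that is safe to share as a servlet field might look like the sketch below. SafeDao is a hypothetical name and the use of javax.sql.DataSource is an assumption; the essential change is that the connection is a local variable, so concurrent calls never overwrite each other's connection.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

class SafeDao {
    private final DataSource dataSource; // shared reference, never reassigned

    SafeDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    boolean insertSomething(String value) throws SQLException {
        // The connection lives in a local variable, so each calling thread has its own.
        try (Connection connection = dataSource.getConnection()) {
            // do the insert on connection (omitted in this sketch)
            return true;
        }
    }
}
```

The only state the object holds is the final dataSource reference, so whether the whole arrangement is thread safe then comes down to the DataSource implementation itself.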
