My team is moving from the Astyanax driver (which is deprecated, or soon will be) to the DataStax 3.0 driver.
Our code implements Astyanax's ConnectionPoolMonitor class and we capture about 22 different metrics on our connection pool usage.
I am trying to find an equivalent way to do this with the DataStax driver, but all I could find is this:
https://datastax.github.io/java-driver/manual/pooling/#monitoring-and-tuning-the-pool
Basically, the example above shows how you can run a background thread that continuously polls Session.State. This seems rather awkward; Astyanax, by contrast, issues callbacks to the classes that implement ConnectionPoolMonitor.
And the amount of information exposed in Session.State is rather limited: connected hosts, in-flight queries, open connections, and trashed connections.
Is there a better option out there that I haven't found somehow? How can I capture metrics such as these:
count of pool exhaustions, connection timeouts, socket timeouts, and "no hosts available" errors
count of connections created, closed, borrowed, returned, and creation errors
count of hosts added, removed, down, and reactivated/reconnected
count of exceptions: unknown error, bad request, interrupted, transport error
Try cluster.getMetrics() and read this Java doc: http://docs.datastax.com/en/drivers/java/3.0/com/datastax/driver/core/Metrics.html
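For example, here is a sketch of reading a few of those metrics with the 3.0 driver (the contact point is a placeholder; the driver also registers everything in a Dropwizard MetricRegistry, with JMX reporting enabled by default, which can replace the Astyanax ConnectionPoolMonitor callbacks):

```java
import com.codahale.metrics.Gauge;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Metrics;

public class PoolMetricsDemo {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")   // adjust to your cluster
                .build();
        cluster.init();

        Metrics metrics = cluster.getMetrics();

        // Gauges describing current pool state
        Gauge<Integer> open = metrics.getOpenConnections();
        System.out.println("open connections: " + open.getValue());

        // Counters for errors, maintained by the driver itself
        Metrics.Errors errors = metrics.getErrorMetrics();
        System.out.println("connection errors: " + errors.getConnectionErrors().getCount());
        System.out.println("read timeouts:     " + errors.getReadTimeouts().getCount());
        System.out.println("write timeouts:    " + errors.getWriteTimeouts().getCount());

        // The underlying Dropwizard registry can be wired to any reporter
        metrics.getRegistry().getNames().forEach(System.out::println);

        cluster.close();
    }
}
```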
I've been trying to get some clarity on how the sync and async drivers from MongoDB Inc. differ in their underlying implementation, but I can't seem to find any information on it. Specifically, I'm interested in knowing whether, with the async driver, we get the advantage of requests being multiplexed over a single connection. If so, what are the limits on the number of concurrent requests for a connection pool with n connections?
On the ticket here - https://github.com/brianfrankcooper/YCSB/issues/1039 - the comment by @allanbank says that multiplexing is implemented for the async driver, but it's not clear whether that refers to MongoDB Inc.'s driver or to allanbank's own driver.
As per IBM documentation:
Purge Policy
Specifies how to purge connections when a stale connection or fatal connection error is detected.
Valid values are EntirePool and FailingConnectionOnly.
Question:
How/when does the server get to know that a connection has gone stale? Does it purge the pool as soon as (immediately?) any connection goes stale, or does that happen per the reap time?
Say the reap time is 180 seconds, the reap thread last ran at 3:05 PM, and a connection goes stale at 3:06 PM. Will the server purge the pool at 3:06 PM itself, or will the purge happen only at 3:08 PM? Is there a risk of clients getting stale connection objects between 3:06 and 3:08?
The IBM document I'm referring to is:
https://www.ibm.com/support/knowledgecenter/en/ssw_i5_54/rzamy/50/admin/help/udat_conpoolset.html
Stale connections are identified in the following ways:
A JDBC operation raises SQLRecoverableException or SQLNonTransientConnectionException, or a general SQLException with a SQL state or error code that the application server has built-in knowledge of. For the specific lists of SQL states and error codes, refer to the SQLState mappings in DatabaseHelper and its various subclasses per database.
The JDBC driver's ConnectionEventListener.connectionErrorOccurred callback signals the application server that a connection has gone bad.
When the application server learns that a connection has gone bad, it does not return that connection to the pool. Subsequent requests outside of the sharing scope would never get that same connection.
Purge Policy determines what the application server does with the other connections that are in the pool at the time that a stale connection occurred. The application server can aggressively purge all connections from the pool (EntirePool option), or it can leave the others there (FailingConnectionOnly option), or it can check all connections in the pool before allowing them to be handed out (ValidateAllConnections option).
Note that the property values above are for WebSphere Application Server Liberty. If using traditional, ValidateAllConnections is done as the combination of FailingConnectionOnly plus defaultPretestOptimizationOverride=true.
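For reference, on Liberty the purge policy and reap time are set on the connectionManager element in server.xml. A minimal sketch (the jndiName, driver properties, and values are placeholders, not recommendations):

```xml
<!-- server.xml sketch; names and values are placeholders -->
<dataSource jndiName="jdbc/myDS">
    <connectionManager maxPoolSize="50"
                       reapTime="180s"
                       purgePolicy="EntirePool"/>
    <properties databaseName="MYDB" serverName="dbhost" portNumber="50000"/>
</dataSource>
```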
I have a question related to the scenario when connecting from a Java application using the Microsoft JDBC Driver 4.0 to a SQL Server 2014 with AlwaysOn Availability Groups set up for high availability.
With this setup, we will be connecting to an availability group listener (specified in the DB connection string instead of any particular instance), so that DB failover etc. is handled gracefully by the listener, which tries to connect to the next available instance behind the scenes if the current primary goes down in the AG cluster.
My questions are:
In the data source configured on the J2EE application server side (we use WebSphere), what happens to the connections already pooled by the data source?
When a database goes down, the AG listener will reconnect to the next available DB on the database side. But will the listener, via the JDBC driver, also send an event to the data source on the app server so that the already-pooled connections are discarded and new ones created, and transactions on the application side won't fail (beyond a brief window while new connections are created and failover completes)? Or does the application only find out after requesting a connection from the data source?
WebSphere Application Server is able to cope with bad connections and removes them from the pool. Exactly when this happens depends on some configurable options and on how fully the Microsoft JDBC driver takes advantage of the javax.sql.ConnectionEventListener API to send notifications to the application server.
In the ideal case, where a JDBC driver sends the connectionErrorOccurred event immediately for all connections, WebSphere Application Server responds by removing all of these connections from the pool and by marking any connection that is currently in use as bad, so that it does not get returned to the pool once the application closes the handle. Lacking this, WebSphere Application Server will discover the first bad connection upon next use by the application, either via a connectionErrorOccurred event sent by the JDBC driver at that time or, lacking that, by inspecting the SQLState/error code of an exception for known indicators of bad connections.
WebSphere Application Server then goes about purging bad connections from the pool according to the configured Purge Policy. There are 3 options:
- Purge Policy of EntirePool: all connections are removed from the pool, and in-use connections are marked as bad so that they are not pooled.
- Purge Policy of FailingConnectionOnly: only the specific connection upon which the error actually occurred is removed from the pool, or marked as bad and not returned to the pool.
- Purge Policy of ValidateAllConnections: all connections are tested for validity (the Connection.isValid API), and connections found to be bad are removed from the pool, or marked as bad and not returned to the pool. Connections found to be valid remain in the pool and continue to be used.
I'm not sure from your description if you are using WebSphere Application Server traditional or Liberty. If traditional, there is an additional option for pre-testing connections as they are handed out of the pool, but be aware that turning this on can have performance implications.
That said, the one thing to be aware of is that regardless of any of the above, your application will always need to be capable of handling the possibility of errors due to bad connections (even if the connection pool is cleared, connections can go bad while in use) and respond by requesting a new connection and retrying the operation in a new transaction.
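The retry pattern described in the last paragraph can be sketched in plain Java; the withRetry helper below is hypothetical, and a real version would obtain a fresh connection from the data source and run the work in a new transaction:

```java
import java.sql.SQLRecoverableException;
import java.util.concurrent.Callable;

public class RetryDemo {
    // Run the given unit of work; if it fails with a recoverable
    // (stale-connection) error, retry it exactly once. After the first
    // failure, the pool has been purged per its Purge Policy, so the
    // second attempt obtains a fresh connection.
    static <T> T withRetry(Callable<T> work) throws Exception {
        try {
            return work.call();
        } catch (SQLRecoverableException e) {
            return work.call();
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] attempts = {0};
        String result = withRetry(() -> {
            attempts[0]++;
            if (attempts[0] == 1) {
                // Simulate a stale connection on the first attempt
                throw new SQLRecoverableException("stale connection");
            }
            return "ok on attempt " + attempts[0];
        });
        System.out.println(result); // prints "ok on attempt 2"
    }
}
```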
Version 4 of that SQL Server JDBC driver is old and doesn't know anything about the AlwaysOn feature.
Any data source connection pool can be configured to check the status of a connection from the pool prior to doling it out to the client. If the connection cannot be used, the pool will create a new one. That's true of all vendors and versions. I believe that's the best you can do.
I am using Datastax Java Driver.
There is a tutorial to use the same.
What I do not understand is how would one close the connection to cassandra?
There is no close method available, and I assume we do not want to shut down the Session, as it is expected to be one per application.
Regards
Gaurav
tl;dr Calling shutdown on Session is the correct way to close connections.
You should be safe to keep a Session object at hand and shut it down when you're finished with Cassandra - it can be long-lived. You can get individual Session objects as you need them and shut them down when done, but ideally you should create only one Session object per application: a Session is a fairly heavyweight object that keeps pools of connections to the nodes in the cluster, so creating several of them is inefficient (and unnecessary) (taken almost verbatim from advice given by Sylvain Lebresne on the mailing list). If you forget to shut down your session(s), they will all be closed when you call shutdown on your Cluster instance. A really simple example below:
Cluster cluster = Cluster.builder().addContactPoints(host).withPort(port).build();
Session session = cluster.connect(keyspace);
// Do something with session...
session.shutdown();
cluster.shutdown();
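Note that in driver 2.0 and later, shutdown() was renamed close(), and Cluster and Session implement java.io.Closeable, so try-with-resources can be used instead. A sketch (the contact point and keyspace are placeholders):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class CloseDemo {
    public static void main(String[] args) {
        // In driver 2.0+, shutdown() became close(), and both Cluster and
        // Session are Closeable, so try-with-resources applies.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("my_keyspace")) {
            // Do something with session...
        } // both are closed automatically, even on exceptions
    }
}
```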
See here - http://www.datastax.com/drivers....
The driver uses connections in an asynchronous manner, meaning that multiple requests can be submitted on the same connection at the same time. The driver therefore only needs to maintain a relatively small number of connections to each Cassandra host. These options allow you to control exactly how many connections are kept.
For each host, the driver keeps a core pool of connections open at all times. If the use of those connections reaches a configurable threshold, more connections are created, up to a configurable maximum number of connections. When the pool exceeds the maximum number of connections, the excess connections are reclaimed once the use of open connections drops below the configured threshold.
Each of these parameters can be set separately for LOCAL and REMOTE hosts (HostDistance). For IGNORED hosts, the default for all of these settings is 0 and cannot be changed.
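Those core/maximum settings are configured through PoolingOptions; a sketch (the contact point and the numbers are illustrative, not recommendations):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;

public class PoolingDemo {
    public static void main(String[] args) {
        PoolingOptions pooling = new PoolingOptions()
                // Connections kept open at all times per LOCAL host
                .setCoreConnectionsPerHost(HostDistance.LOCAL, 2)
                // Upper bound the pool may grow to under load
                .setMaxConnectionsPerHost(HostDistance.LOCAL, 8)
                // REMOTE hosts usually get a smaller pool
                .setCoreConnectionsPerHost(HostDistance.REMOTE, 1)
                .setMaxConnectionsPerHost(HostDistance.REMOTE, 2);

        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withPoolingOptions(pooling)
                .build();
        // ... use the cluster ...
        cluster.close(); // shutdown() on 1.x drivers
    }
}
```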
How do you tell the Datastax Java Cassandra driver to time-out when it attempts to connect to your cluster?
I'm particularly interested in the case when the hosts are reachable, but the Cassandra ports are blocked or the Cassandra daemons are not running. I'm writing a command-line client that ought to exit and report a suitable error message if it can not connect in a reasonable time. At present it seems that the driver will wait forever for a contact point to respond, if the contact point is reachable.
That is, I want Cluster.build() to throw a NoHostAvailableException if the driver can not communicate with the Cassandra daemon of any of the contact points within a given maximum time.
Creating my own RetryPolicy won't work: that is for retrying queries, and I want the timeout to apply before we are ready to run queries.
Creating my own ReconnectionPolicy initially looked promising, but the contract of the interface gives no means of indicating "consider this node dead forevermore".
That is, I want Cluster.build() to throw a NoHostAvailableException if the driver can not communicate with the Cassandra daemon of any of the contact points within a given maximum time.
This is supposed to be the case. The driver will try to connect to each of the contact points and throw an exception if it fails to connect to any. You can control the maximum time the driver will try connecting (to each node) through SocketOptions.setConnectTimeoutMillis() (the default is 5 seconds).
My experience is that Cluster.build() does throw an exception if no node can be connected to, but if your experience differs, you might want to report it as a bug (a bit more detail on how you reproduce this would help).
That being said:
The timeout above is per host, so if you pass a list of 100 contact points, you could in theory have to wait 500 seconds (by default) before getting a NoHostAvailableException. But there is no real point in providing that many contact points, and in practice, if Cassandra is not running on the tried node, the connection attempt will usually fail right away (you won't wait for the timeout).
There is currently no real query timeout on the driver side, which means that if the driver does connect to a node (i.e., some process is listening on that port and accepts the connection) but gets no answer to its initial messages, then it can indeed hang forever. That should probably be fixed, and I encourage you to open a ticket for it on https://datastax-oss.atlassian.net/browse/JAVA. However, this doesn't seem to be the case you are describing, since if "Cassandra ports are blocked or the Cassandra daemons are not running", the driver shouldn't be able to connect in the first place.
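A sketch of setting that connect timeout (the contact point is a placeholder; close() assumes a 2.0+ driver, use shutdown() on 1.x):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.SocketOptions;
import com.datastax.driver.core.exceptions.NoHostAvailableException;

public class ConnectTimeoutDemo {
    public static void main(String[] args) {
        SocketOptions socketOptions = new SocketOptions()
                .setConnectTimeoutMillis(2000); // default is 5000

        Cluster cluster = Cluster.builder()
                .addContactPoint("10.0.0.1")    // placeholder contact point
                .withSocketOptions(socketOptions)
                .build();
        try {
            cluster.connect();
        } catch (NoHostAvailableException e) {
            // Thrown once every contact point has failed within its timeout
            System.err.println("Could not connect: " + e.getMessage());
        } finally {
            cluster.close();
        }
    }
}
```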