How do I close Database Connection Datastax Java Driver - java

I am using Datastax Java Driver.
There is a tutorial to use the same.
What I do not understand is how would one close the connection to cassandra?
There is no close method available and I assume we do not want to shutdown the Session as it it expected to be one per application.
Regards
Gaurav

tl;dr Calling shutdown on Session is the correct way to close connections.
You should be safe to keep a Session object at hand and shut it when you're finished with Cassandra - this can be long-lived. You can get individual connections in the form of Session objects as you need them and shut them down when done but ideally you should only create one Session object per application. A Session is a fairly heavyweight object that keeps pools of pools of connections to the node in the cluster, so creating multiple of those will be inefficient (and unnecessary) (taken verbatim from advice given by Sylvain Lebresne on the mailing list). If you forget to shut down the session(s) they will be all closed when you call shutdown on your Cluster instance... really simple example below:
Cluster cluster = Cluster.builder().addContactPoints(host).withPort(port).build();
Session session = cluster.connect(keyspace);
// Do something with session...
session.shutdown();
cluster.shutdown();

See here - http://www.datastax.com/drivers....
The driver uses connections in an asynchronous manner. Meaning that
multiple requests can be submitted on the same connection at the same
time. This means that the driver only needs to maintain a relatively
small number of connections to each Cassandra host. These options
allow the driver to control how many connections are kept exactly.
For each host, the driver keeps a core pool of connections open at all
times determined by calling . If the use of those connections reaches
a configurable threshold , more connections are created up to the
configurable maximum number of connections. When the pool exceeds the
maximum number of connections, connections in excess are reclaimed if
the use of opened connections drops below the configured threshold
Each of these parameters can be separately set for LOCAL and REMOTE
hosts (HostDistance). For IGNORED hosts, the default for all those
settings is 0 and cannot be changed.

Related

How costly is opening and closing of a DB connection in Connection Pool?

If we use any connection pooling framework or Tomcat JDBC pool then how much it is costly to open and close the DB connection?
Is it a good practice to frequently open and close the DB connection whenever DB operations are required?
Or same connection can be carried across different methods for DB operations?
Jdbc Connection goes through the network and usually works over TCP/IP and optionally with SSL. You can read this post to find out why it is expensive.
You can use a single connection across multiple methods for different db operations because for each DB operations you would need to create a Statement to execute.
Connection pooling avoids the overhead of creating Connections during a request and should be used whenever possible. Hikari is one of the fastest.
The answer is - its almost always recommended to re-use DB Connections. Thats the whole reason why Connection Pools exist. Not only for the performance, but also for the DB stability. For instance, if you don't limit the number of connections and mistakenly open 100s of DB connections, the DB might go down. Also lets say if DB connections don't get closed due to some reason (Out of Memory error / shut down / unhandled exception etc), you would have a bigger issue. Not only would this affect your application but it could also drag down other services using the common DB. Connection pool would contain such catastrophes.
What people don't realize that behind the simple ORM API there are often 100s of raw SQLs. Imagine running these sqls independent of connection pools - we are talking about a very large overhead.
I couldn't fathom running a commercial DB application without using Connection Pools.
Some good resources on this topic:
https://www.cockroachlabs.com/blog/what-is-connection-pooling/
https://stackoverflow.blog/2020/10/14/improve-database-performance-with-connection-pooling/
Whether the maintenance (opening, closing, testing) of the database connections in a DBConnection Pool affects the working performance of the application depends on the implementation of the pool and to some extent on the underlying hardware.
A pool can be implemented to run in its own thread, or to initialise all connections during startup (of the container), or both. If the hardware provides enough cores, the working thread (the "business payload") will not be affected by the activities of the pool at all.
Other connection pools are implemented to create a new connection only on demand (a connection is requested, but currently there is none available in the pool) and within the thread of the caller. In this case, the creation of that connection reduces the performance of the working thread – this time! It should not happen too often, otherwise your application needs too many connections and/or does not return them fast enough.
But whether you really need a Database Connection Pool at all depends from the kind of your application!
If we talk about a typical server application that is intended to run forever and to serve a permanently changing crowd of multiple clients at the same time, it will definitely benefit from a connection pool.
If we talk about a tool type application that starts, performs a more or less linear task in a defined amount of time, and terminates when done, then using a connection pool for the database connection(s) may cause more overhead than it provides advantages. For such an application it might be better to keep the connection open for the whole runtime.
Taking the RDBMS view, both does not make a difference: in both cases the connections are seen as open.
If you have performance as a key parameter then better to switch to the Hikari connection pool. If you are using spring-boot then by default Hikari connection pool is used and you do not need to add any dependency. The beautiful thing about the Hikari connection pool is its entire lifecycle is managed and you do not have to do anything.
Also, it is always recommended to close the connection and let it return to the connection pool so that other threads can use it, especially in multi-tenant environments. The best way to do this is using "try with resources" and that connection is always closed.
try(Connection con = datasource.getConnection()){
// your code here.
}
To create your data source you can pass the credentials and create your data source for example:
DataSource dataSource = DataSourceBuilder.create()
.driverClassName(JDBC_DRIVER)
.url(url)
.username(username)
.password(password)
.build();
Link: https://github.com/brettwooldridge/HikariCP
If you want to know the answer in your case, just write two implementations (one with a pool, one without) and benchmark the difference.
Exactly how costly it is, depends on so many factors that it is hard to tell without measuring
But in general, a pool will be more efficient.
The costly is always a definition of impact.
Consider, you have following environment.
A web application with assuming a UI-transaction (user click) and causes a thread on the webserver. This thread is coupled to one connection/thread on the database
10 connections per 60000ms / 1min or better to say 0.167 connections/s
10 connections per 1000ms / 1sec => 10 connections/s
10 connections per 100ms / 0.1sec => 100 connections/s
10 connections per 10ms / 0.01sec => 1000 connections/s
I have worked in even bigger environments.
And believe me the more you exceed the 100 conn/s by 10^x factors the more pain you will feel without having a clean connection pool.
The more connections you generate in 1 second the higher latency you generate and the higher impact is it for the database. And the more bandwidth you will eat for recreating over and over a new "water pipeline" for dropping a few drops of water from one side to the other side.
Now getting back, if you have to access a existing connection from a connection pool it is a matter of micros or few ms to access the database connection. So considering one, it is no real impact at all.
If you have a network in between, it will grow to probably x10¹ to x10² ms to create a new connection.
Considering now the impact on your webserver, that each user blocks a thread, memory and network connection it will impact also your webserver load. Typically you run into webserver (e.g. revProxy apache + tomcat, or tomcat alone) thread pools issues on high load environments, if the connections get exhausted or they need too long time (10¹, 10² millis) to create
Now considering also the database.
If you have open connection, each connection is typically mapped to a thread on a DB. So the DB can use thread based caches to make prepared statements and to reuse pre-calculated access plan to make the accesses to data on database very fast.
You may loose this option if you have to recreate the connection over and over again.
But as said, if you are in up to 10 connections per second you shall not face any bigger issue without a connection pool, except the first additional delay to access the DB.
If you get into higher levels, you will have to manage the resources better and to avoid any useless IO-delay like recreating the connection.
Experience hints:
it does not cost you anything to use a connection pool. If you have issues with the connection pool, in all my previous performance tuning projects it was a matter of bad configuration.
You can configure
a connection check to check the connection (use a real SQL to access a real db field). so on every new access the connection gets checked and if defective it gets kicked from the connection pool
you can define a lifetime of a connections, so that you get new connection after a defined time
=> all this together ensure that even if your admins are doing crap and do not inform you (killing connection / threads on DB) the pool gets quickly rebuilt and the impact stays very low. Read the docs of the connection pool.
Is one connection pool better as the other?
A clear no, it is only getting a matter if you get into high end, or into distributed environments/clusters or into cloud based environments. If you have one connection pool already and it is still maintained, stick to it and become a pro on your connection pool settings.

Datasource Microsoft JDBC Driver for SQL Server (AlwaysOn Availability Groups)

I have a question related to the scenario when connecting from a Java application using the Microsoft JDBC Driver 4.0 to a SQL Server 2014 with AlwaysOn Availability Groups set up for high availability.
With this set up, we will be connecting to an availability group listener (specified in the db connecting string instead of any particular instance), so that the DB fail-over etc. is handled gracefully by the listener and it tries to connect to the next available instance behind the scenes if current primary goes down in the AG cluster.
Question(s) I have is,
In the data-source that is configured on the j2ee application server side (we use WebSphere), what happens to those connections already pooled by the data-source?
When a database goes down, though the AG listener would try to reconnect on the db side to the next available DB, will the AG Listener also through the jdbc driver send an event or something to the data-source created on the app server and make sure those connections that are already pooled by the datasource to be discarded and have it create new ones so that transactions on the application side wont fail (though they might for a while till new connections are created and fail over is successful) or the java application has to find out only after requesting it from the datasource?
WebSphere Application Server is able to cope with bad connections and removes them from the pool. Exactly when this happens depends on some configurable options and on how fully the Microsoft JDBC driver takes advantage of the javax.sql.ConnectionEventListener API to send notifications to the application server. In the ideal case where a JDBC driver sends the connectionErrorOccurred event immediately for all connections, WebSphere Application Server responds by removing all of these connections from the pool and by marking any connection that is currently in-use as bad so that it does not get returned to the pool once the application closes the handle. Lacking this, WebSphere Application Server will discover the first bad connection upon next use by the application. It is discovered either by a connectionErrorOcurred event that is sent by the JDBC driver at that time, or lacking that, upon inspecting the SQLState/error code of an exception for known indicators of bad connections. WebSphere Application Server then goes about purging bad connections from the pool according to the configured Purge Policy. There are 3 options:
Purge Policy of Entire Pool - all connections are removed from
the pool and in-use connections marked as bad so that they are not
pooled.
Purge Policy of Failing Connection Only - only the
specific connection upon which the error actually occurred is
removed from the pool or marked as bad and not returned to the pool
Purge Policy of Validate All Connections - all connections are
tested for validity (Connection.isValid API) and connections found
to be bad are removed from the pool or marked as bad and not
returned to the pool. Connections found to be valid remain in the
pool and continue to be used.
I'm not sure from your description if you are using WebSphere Application Server traditional or Liberty. If traditional, there is an additional option for pre-testing connections as they are handed out of the pool, but be aware that turning this on can have performance implications.
That said, the one thing to be aware of is that regardless of any of the above, your application will always need to be capable of handling the possibility of errors due to bad connections (even if the connection pool is cleared, connections can go bad while in use) and respond by requesting a new connection and retrying the operation in a new transaction.
Version 4 of that SQL Server JDBC driver is old and doesn't know anything about the always on feature.
Any data source connection pool can be configured to check the status of the connection from the pool prior to doling it out to the client. If the connection cannot be used the pool will create a new one. That's true of all vendors and versions. I believe that's the best you can do.

Opening a new database connection for every client that connects to the server application?

I am in the process of building a client-server application and I would really like an advise on how to design the server-database connection part.
Let's say the basic idea is the following:
Client authenticates himself on the server.
Client sends a request to server.
Server stores client's request to the local database.
In terms of Java Objects we have
Client Object
Server Object
Database Object
So when a client connects to the server a session is created between them through which all the data is exchanged. Now what bothers me is whether i should create a database object/connection for each client session or whether I should create one database object that will handle all requests.
Thus the two concepts are
Create one database object that handles all client requests
For each client-server session create a database object that is used exclusively for the client.
Going with option 1, I guess that all methods should become synchronized in order to avoid one client thread not overwriting the variables of the other. However, making it synchronize it will be time consuming in the case of a lot of concurrent requests as each request will be placed in queue until the one running is completed.
Going with option 2, seems a more appropriate solution but creating a database object for every client-server session is a memory consuming task, plus creating a database connection for each client could lead to a problem again when the number of concurrent connected users is big.
These are just my thoughts, so please add any comments that it may help on the decision.
Thank you
Option 3: use a connection pool. Every time you want to connect to the database, you get a connection from the pool. When you're done with it, you close the connection to give it back to the pool.
That way, you can
have several clients accessing the database concurrently (your option 1 doesn't allow that)
have a reasonable number of connections opened and avoid bringing the database to its knees or run out of available connections (your option 2 doesn't allow that)
avoid opening new database connections all the time (your option 2 doesn't allow that). Opening a connection is a costly operation.
Basically all server apps use this strategy. All Java EE servers come with a connection pool. You can also use it in Java SE applications, by using a pool as a library (HikariCP, Tomcat connection pool, etc.)
I would suggested a third option, database connection pooling. This way you create a specified number of connections and give out the first available free connection as soon as it becomes available. This gives you the best of both worlds - there will almost always be free connections available quickly and you keep the number of connections the database at a reasonable level. There are plenty of the box java connection pooling solutions, so have a look online.
Just use connection pooling and go with option 2. There are quite a few - C3P0, BoneCP, DBCP. I prefer BoneCP.
Both are not good solutions.
Problem with Option 1:
You already stated the problems with synchronizing when there are multiple threads. But apart from that there are many other problems like transaction management (when are you going to commit your connection?), Security (all clients can see precommitted values).. just to state a few..
Problem with Option 2:
Two of the biggest problems with this are:
It takes a lot of time to create a new connection each and every time. So performance will become an issue.
Database connections are extremely expensive resources which should be used in limited numbers. If you start creating DB Connections for every client you will soon run out of them although most of the connections would not be actively used. You will also see your application performance drop.
The Connection Pooling Option
That is why almost all client-server applications go with the connection pooling solution. You have a set connections in the pool which are obtained and released appropriately. Almost all Java Frameworks have sophisticated connection pooling solutions.
If you are not using any JDBC framework (most use the Spring JDBC\Hibernate) read the following article:
http://docs.oracle.com/javase/jndi/tutorial/ldap/connect/pool.html
If you are using any of the popular Java Frameworks like Spring, I would suggest you use Connection Pooling provided by the framework.

Simultaneous access to Database in Web application

How can I know the average or exact number of users accessing database simultaneously in my Java EE web enterprise application? I would like to see if the "Connection pool setting" I set in the Glassfish application server is suitable for my web application or not. I need to correctly set the maximun number of connection in Connection Pool setting in Application Server. Recently, my application ran out of connections and threw exceptions when the client request for DB expires.
There are multiple ways.
One and easiest would be take help from your DBAs - they can tell you exactly how many connections are active from your webserver or the user id for connection pool at a given time.
If you want some excitement, you will have to JMX management extensions provided by glassfish. Listing 6 on this page - gives an example as to how to write a JMS based snippet to monitor a connection pool.
Finally, you must make sure that all connections are closed explicitly by a connection.close(); type of call in your application. In some cases, you need to close ResultSet as well.
Next is throttling your http thread pool to avoid too many concurrent access if your db connections are taking longer to close.

Relationship between JDBC sessions and Oracle processes

We are having a problem with too many Oracle processes being created (over 2,000) when connections are limited to 1,100 (using C3P0)
Two questions:
What's the relationship between an Oracle process and a JDBC connection? Is one Oracle process created for each session? Is one created for every JDBC statement? No relationship at all?
Did you ever face this scenario, where you are creating more processes than JDBC connections?
Any comment would be really appreciated.
There is one session per connection. This sounds like you have a connection leak, somewhere you're opening a new connection and not closing properly. One possibility is that you open, use and close a connection inside a try block and are handling an exception in a catch, or returning early for someother reason. If so you need to make sure the connection close is done in finally or it may not happen, leaving the connection (and thus session) hanging. Opening two connections in the same scope without an explicit close in between can also do this.
I'm not familiar with C3PO so don't know how connections are handled, or where and how your 1100 limit is imposed; if it (or you) have a connection pool and the 1100 you refer to is the maximm pool size, then this doesn't sound like the issue as you'd hit the pool cap before the session cap.
You can look in v$session to confirm that all the sessions are coming from JDBC, and there isn't something else connecting.
Maybe you want to check if your server runs in dedicated or shared mode (you probably want to switch it to shared mode if you want to decrease the number of active processes).
You can check that by doing
select server from v$session
More information about process architecture
http://docs.oracle.com/cd/B19306_01/server.102/b14220/process.htm
Shared/Dedicated server mode
http://docs.oracle.com/cd/B10501_01/server.920/a96521/manproc.htm

Categories