JDBC data pooling

I have trouble understanding how exactly a DataSource object differs from a plain Connection. I know that with a DataSource we can reuse the connection throughout the classes where we write our SQL queries. As far as I know, in a small standalone application, if we create one Connection object (private or public) we can reuse that object anywhere. It is the same with a DataSource: once the object is created, we can reuse it just like a Connection object. So how is it different? Why do we need to use a DataSource object at all?

DataSource is most commonly used in webapps, where your code needs a database Connection for the duration of the web request processing.
The DataSource is configured for the webapp server, and can then be used by your code to get a new Connection, without your code needing to know the details (e.g. URL, user, password).
Since creating a connection to a database is a slow operation, rather than closing the connection when your code is done with it, the connection is stored in a pool. When your code then processes another web request, it can get that connection from the pool instantly, gaining a huge boost in performance.
To summarize, the DataSource has two purposes: Removing configuration from your code, and supporting reuse through pooling.
For a small standalone single-threaded application, where the connection is created at startup and just used while the program is running, there is nothing gained by using a DataSource. Using DriverManager is easier for such programs.
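To make the "removing configuration from your code" point concrete, here is a minimal sketch. The helper class `Db` and its `withConnection` method are hypothetical names made up for this illustration; only `DataSource.getConnection()` and `Connection.close()` are standard JDBC. The calling code never sees a URL, user, or password, and it works the same whether the DataSource pools connections or not.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper; the name Db and its methods are illustrative only.
final class Db {
    /** A unit of work that needs a Connection and may throw SQLException. */
    interface Work<T> {
        T run(Connection con) throws SQLException;
    }

    private final DataSource dataSource;

    Db(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Borrows a connection, runs the work, and always gives the connection back. */
    <T> T withConnection(Work<T> work) throws SQLException {
        // try-with-resources always calls close(); with a pooling DataSource,
        // close() typically hands the connection back to the pool.
        try (Connection con = dataSource.getConnection()) {
            return work.run(con);
        }
    }
}
```

A caller would write something like `db.withConnection(con -> con.getCatalog())`, borrowing a connection only for the duration of one piece of work.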

Related

Why do we need a connection pool for a standalone application?

I need to know why we need a connection pool for a standalone application. According to my knowledge, a standalone application needs only one database connection instance. That's why we use the singleton pattern while creating the connection object using JDBC. So what's the use of having a connection pool for a standalone application? If I am using a connection pool, do I need to specify the max size as 1? Here I am trying to use the c3p0 connection pool with native Hibernate.

A major reason for using a connection pool is that it makes it easier for your application to recover in case the connection goes bad. The only time I would not use a connection pool is if it were acceptable for the program to fail when the connection stops working. An example would be a very simple batch job that executes one transaction, where the job framework running it retries it on failure.

I agree that you have a stand-alone application, but that does not mean you always need the Singleton pattern. What about a single application that spins up multiple threads, each connecting to the database? In that case a singleton connection won't help, and you should use a connection pool so that the database operations are handled gracefully.
Whether a connection pool is needed depends mostly on the use case rather than on whether the application is stand-alone or distributed. For a stand-alone desktop application that does simple CRUD, I agree you need not use a connection pool; but with multiple users working in parallel, I think we should always use one.
I'm not sure what your use case is, but the general statement "a stand-alone application does not need connection pooling" does not always hold.

The cost of using a connection pool is usually insignificant.
Your data access layer does not need to know whether it's being called from a standalone application or, say, a multithreaded web application. So there's a good case for always using connection pooling, which doesn't hurt in the first case and is probably necessary in the second.

Some questions about connection pooling in a servlet application

I'm new to Java web application design and I'm surprised by how many things I don't know.
In particular, I'm having trouble understanding how servlet containers manage resources like connection pooling classes.
Assuming that I choose a pooling library (let's say c3p0), I read that there are many ways to use and manage the connection pooling classes.
For instance, in many examples I saw a certain class (let's say ComboPooledDataSource) instantiated in the init() method of the servlet, and here I'm getting a bit confused. I think a connection pooling system must exist and have a life separate from all the servlets that need a connection, otherwise it makes no sense. So I think that this class may be a thread that is started once by the first servlet that calls the init method and then continues to exist until someone interrupts it. Is that correct? If not, how does it work?
Anyway, once I start this class, is it shared among all the servlets in the context (I mean all the servlets that call it in the init method)?
Other examples set up the connection pooling system as a resource, for instance by defining it in context.xml, and then any servlet that wants a connection just has to access it via JNDI (is JNDI correct?). What I understood (or think I did) is that in this case the thread that executes the pooling system is started when the application is started, and each servlet can access it when it wants to. Is that correct?
In this case, can I modify the connection pooling system's properties from a servlet or a background thread at runtime? (For example, if I want to change the number of connections as a function of statistics on the number of requests, and so on.)
If I want to create different pools (for example, to split the database access across N different databases, or to access with different usernames), do I need to create as many resources as there are different kinds of connection?
Is one of these two ways "better", or are they equivalent?

It comes down to ease of use of the webapp and (Tomcat) server maintenance.
You describe 2 use cases:
1- a connection pool used by one webapp
2- a connection pool shared between webapps
The first case is suitable for "ease of use": you can drop the war-file in any Tomcat server and it will work (e.g. Jenkins delivers such a war-file) since the webapp contains all the code needed to access a database. No need to change the configuration of the Tomcat server and restart it.
If I can, I like to deliver this kind of war-file since fewer things can go wrong (e.g. no clashes with configuration for other webapps). You can take it a step further by delivering Tomcat embedded with the application (so you don't need a Tomcat server to start with, an example project is here).
Note that opening a database connection pool (preferably via ServletContextListener.contextInitialized, not a servlet) and closing it (via contextDestroyed) does not involve starting and stopping a thread. The pool implementation may decide to start a background thread (e.g. to remove idle and/or abandoned connections), but it does not have to.
The second case is suitable for several webapps that share a common ground. For example, if several webapps all running in the same Tomcat server all use the same database, it saves time and effort if the Tomcat server already has a connection pool available for the different webapps (via JNDI). The Tomcat server can contain the JDBC driver software and connection pool software in the "lib" directory and configuration is done once in the context.xml of the server. If the database changes or different (patched) software components are required, only the Tomcat server needs to be updated and all the webapps can remain the same. In this case, updating the Tomcat server instead of each webapp is a lot easier.
The second case is also more suitable for monitoring: there is a good chance you can monitor the connection pool via JMX. This kind of monitoring may not be available in the first case.
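For the second case, the webapp side usually boils down to a JNDI lookup. The following is a rough sketch only: the helper class `JndiDataSources` and the resource name `java:comp/env/jdbc/appDb` are hypothetical and would have to match whatever the server's context.xml actually declares. `InitialContext` and `Context.lookup` are the standard `javax.naming` API.

```java
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Hypothetical helper; the class name and the resource name used below are illustrative.
final class JndiDataSources {
    private JndiDataSources() {}

    /**
     * Looks up a container-managed DataSource, e.g. "java:comp/env/jdbc/appDb".
     * Outside a container there is typically no JNDI provider, so the lookup
     * fails with a NamingException.
     */
    static DataSource lookup(String jndiName) throws NamingException {
        Context ctx = new InitialContext();
        try {
            return (DataSource) ctx.lookup(jndiName);
        } finally {
            ctx.close(); // the DataSource stays usable after the context is closed
        }
    }
}
```

Doing the lookup once (for example at application startup) and keeping the DataSource in a field is a common choice; connections are then borrowed from it per request.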
Changing "the number of connections as a function of the statistics on the number of requests" is not needed with connection pools: you set a maximum number of connections and a timeout after which idle connections are removed. The connection pool will then grow and shrink with the number of requests.
Subdividing the database access to N different databases would require a different connection pool for each database, unless this is "read-only" in which case you could also use a load-balancer.
Access using different usernames is a bit tricky. I read in another question (which I cannot find anymore, sorry) that you can change the schema during runtime, but changing username/password might require a new connection in which case connection pooling is out.

Are you using Tomcat to run your servlets? Tomcat 7 has a new pooling solution that you could investigate. Tomcat also includes the DBCP libraries; you can use them by setting factory="org.apache.tomcat.dbcp.dbcp.BasicDataSourceFactory". The DBCP pools are self-managed, and you can define the configuration values in the context.xml file; I am not aware of a way to change these values on the fly. I believe the Tomcat 7 pool uses the same configuration as DBCP, with a couple of added options, to make conversion between the two simple. We use DBCP in all our applications and have not experienced problems, so I have not used the Tomcat 7 pooling. I think you would need to create several pools to handle most of your requirements.

how to manage connections to dynamically created databases

I need to manage connections to multiple databases in my web app. The following are facts about the current implementation:
1- I use Tomcat
2- Databases are created dynamically at runtime (I am using MySQL)
Without a doubt, having a connection pool to manage database connections is optimal.
Since the databases are not known at the start of the application, it was not possible for me to set up datasources and make connection pools. (I could not find a way in Tomcat to make a dynamic connection pool: a connection pool that is created at runtime.)
My question is: what other options do I have to work efficiently with connections to multiple databases? (I don't have the experience to implement connection pools myself.)
Is there any library that can be used with Tomcat and allows me to establish multiple connection pools to different databases at runtime? If not, what do you suggest I do instead of connection pools?
I am fairly new to this issue, so please correct and guide me if I am mixing up concepts.
Thank you in advance.

The MySQL JDBC driver allows omitting the database name from the connection URL as follows:
jdbc:mysql://localhost:3306
You only need to specify the database by Connection#setCatalog() or directly in the SQL queries. See also its reference documentation:
If the database is not specified, the connection will be made with no default database. In this case, you will need to either call the setCatalog() method on the Connection instance or fully specify table names using the database name (that is, SELECT dbname.tablename.colname FROM dbname.tablename...) in your SQL. Not specifying the database to use upon connection is generally only useful when building tools that work with multiple databases, such as GUI database managers.
This allows you to create a single, reusable, connection-pooled datasource in Tomcat. You'll perhaps only need to rewrite your connection manager and/or SQL queries.
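A rough sketch of what such a connection manager could look like, assuming one pooled DataSource configured without a database name. The class `TenantConnections` and its `open` method are hypothetical; `Connection.setCatalog()` is the standard JDBC call mentioned above.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper; class and method names are illustrative only.
final class TenantConnections {
    private final DataSource dataSource; // one pool, configured without a database name

    TenantConnections(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Borrows a connection and points it at the given database (catalog). */
    Connection open(String databaseName) throws SQLException {
        Connection con = dataSource.getConnection();
        try {
            con.setCatalog(databaseName); // selects the default database for this connection
            return con;
        } catch (SQLException e) {
            con.close(); // do not leak the connection if the switch fails
            throw e;
        }
    }
}
```

Since a pooled connection may still carry the catalog chosen by its previous user, it seems safest to set the catalog on every borrow rather than assume a default. The caller remains responsible for closing the returned connection.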

There are plenty of open-source connection pooling frameworks. Proxool is definitely among the best. It's pretty flexible and easy to use.

MongoDB Java Driver database connection pooling with Tomcat

According to the MongoDB Java driver documentation, database connection pooling is magically handled by the Mongo object.
Does this mean it is safe to create an instance of a singleton object which connects to the MongoDB database in a servlet that will run when Tomcat starts and not worry about configuring database connection pooling in Tomcat via the context.xml?
Is this the right way to think about it? Am I misunderstanding some basic concept of Tomcat / database connection pooling in general?

We've been using the Java drivers via the CFMongoDB project and we use it as you describe, but in a ColdFusion application rather than in Java. Same idea though: one object is created and we reuse it and that object maintains the one connection to the Mongo server.
You can create one Mongo Java instance and it will maintain an internal pool of connections (default size of 10) - to you it's hidden and you don't need to worry about it. The Mongo Java docs recommend this:
http://www.mongodb.org/display/DOCS/Java+Driver+Concurrency
We have it running in production now and there have been no issues. Multiple web request threads use the same Mongo instance and Mongo is quick enough to deal with this using its internal pool (we're doing logging so it can write very fast!).
It is worth remembering to call close() on any instances that you are finished with - this will stop connections getting used up on the Mongo server over time:
http://api.mongodb.org/java/2.5-pre-/com/mongodb/Mongo.html#close()
So in summary, don't worry about configuring Tomcat.
Hope that helps!

Best way to manage database connection for a Java servlet

What is the best way to manage a database connection in a Java servlet?
Currently, I simply open a connection in the init() function, and then close it in destroy().
However, I am concerned that "permanently" holding onto a database connection could be a bad thing.
Is this the correct way to handle this? If not, what are some better options?
edit: to give a bit more clarification: I have tried simply opening/closing a new connection for each request, but with testing I've seen performance issues due to creating too many connections.
Is there any value in sharing a connection over multiple requests? The requests for this application are almost all "read-only" and come fairly rapidly (although the data requested is fairly small).

As everybody says, you need to use a connection pool. Why? What up? Etc.
What's Wrong With Your Solution
I know this since I also thought it was a good idea once upon a time. The problem is two-fold:
All threads (servlet requests get served with one thread per each) will be sharing the same connection. The requests will therefore get processed one at a time. This is very slow, even if you just sit in a single browser and lean on the F5 key. Try it: this stuff sounds high-level and abstract, but it's empirical and testable.
If the connection breaks for any reason, the init method will not be called again (because the servlet will not be taken out of service). Do not try to handle this problem by putting a try-catch in the doGet or doPost, because then you will be in hell (sort of writing an app server without being asked).
Contrary to what one might think, you will not have problems with transactions, since the transaction start gets associated with the thread and not just the connection. I might be wrong, but since this is a bad solution anyway, don't sweat it.
Why Connection Pool
Connection pools give you a whole bunch of advantages, but most of all they solve the problems of
Making a real database connection is costly. The connection pool always has a few extra connections around and gives you one of those.
If the connections fail, the connection pool knows how to open a new one
Very important: every thread gets its own connection. This means that threading is handled where it should be: at the DB level. DBs are super efficient and can handle concurrent requests with ease.
Other stuff (like centralizing location of JDBC connect strings, etc.), but there are millions of articles, books, etc. on this
When to Get a Connection
Somewhere in the call stack initiated in your service delegate (doPost, doGet, doDisco, whatever) you should get a connection and then you should do the right thing and return it in a finally block. I should mention that the C# main architect dude said once upon a time that you should use finally blocks 100x more than catch blocks. Truer words were never spoken...
Which Connection Pool
You're in a servlet, so you should use the connection pool the container provides. Your JDBC code will be completely normal except for how you obtain the connection. As far as I know, all servlet containers have connection pools.
Some of the comments on the answers above suggest using a particular connection pool API instead, arguing that your WAR should be portable and "just deploy." I think this is basically wrong. If you use the connection pool provided by your container, your app will be deployable on containers that span multiple machines and all that fancy stuff that the Java EE spec provides. Yes, the container-specific deployment descriptors will have to be written, but that's the EE way, mon.
One commenter mentions that certain container-provided connection pools do not work with JDBC drivers (he/she mentions Websphere). That sounds totally far-fetched and ridiculous, so it's probably true. When stuff like that happens, throw everything you're "supposed to do" in the garbage and do whatever you can. That's what we get paid for, sometimes :)

I actually disagree with using Commons DBCP. You should really defer to the container to manage connection pooling for you.
Since you're using Java Servlets, that implies running in a Servlet container, and all major Servlet containers that I'm familiar with provide connection pool management (the Java EE spec may even require it). If your container happens to use DBCP (as Tomcat does), great, otherwise, just use whatever your container provides.

I'd use Commons DBCP. It's an Apache project that manages the connection pool for you.
You'd just get your connection in your doGet or doPost, run your query, and then close the connection in a finally block. (con.close() just returns it to the pool; it doesn't actually close it.)
DBCP can manage connection timeouts and recover from them. The way you are currently doing things, if your database goes down for any period of time you'll have to restart your application.
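How can close() return a connection to the pool instead of closing it? One way to picture it is a wrapper around the physical connection that intercepts close(). The sketch below uses a JDK dynamic proxy for that; it is an illustration of the idea only, not how DBCP is actually implemented, and the class name `PooledConnections` and the `idle` queue are made up for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.concurrent.BlockingQueue;

// Illustrative only: real pools (DBCP, c3p0, ...) use their own wrapper classes.
final class PooledConnections {
    private PooledConnections() {}

    /** Wraps a physical connection so that close() returns it to idle instead of closing it. */
    static Connection wrap(Connection physical, BlockingQueue<Connection> idle) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("close")) {
                idle.offer(physical); // hand the still-open connection back
                return null;
            }
            try {
                return method.invoke(physical, args); // everything else is delegated
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the original SQLException etc.
            }
        };
        return (Connection) Proxy.newProxyInstance(
                PooledConnections.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                handler);
    }
}
```

A production pool would also guard against double close(), reset connection state, and validate connections before handing them out again; none of that is shown here.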

Are you pooling your connections? If not, you probably should, to reduce the overhead of opening and closing your connections.
Once that's out of the way, just keep the connection open for as long as it's needed, as John suggested.

The best way, and I'm currently looking through Google for a better reference sheet, is to use pools.
On initialization, you create a pool that contains X number of SQL connection objects to your database. Store these objects in some kind of List, such as ArrayList. Each of these objects has a private boolean for 'isLeased', a long for the time it was last used and a Connection. Whenever you need a connection, you request one from the pool. The pool will either give you the first available connection, checking on the isLeased variable, or it will create a new one and add it to the pool. Make sure to set the timestamp. Once you are done with the connection, simply return it to the pool, which will set isLeased to false.
To keep from constantly having connections tie up the database, you can create a worker thread that will occasionally go through the pool and see when the last time a connection was used. If it has been long enough, it can close that connection and remove it from the pool.
The benefit of using this is that you don't have long wait times waiting for a Connection object to connect to the database. Your already established connections can be reused as much as you like. And you'll be able to set the number of connections based on how busy you think your application will be.
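The pool described above can be sketched in a few dozen lines. This is a teaching sketch under some assumptions, not a replacement for an existing pool library: the class `SimplePool` is hypothetical, it is generic so that it can be tried without a database (for JDBC, `T` would be `Connection`), and the idle cleanup is a plain method that a worker thread could call periodically.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Teaching sketch of the pool described above; prefer an existing pool library in real code.
final class SimplePool<T> {
    private static final class Entry<T> {
        final T resource;
        boolean isLeased;
        long lastUsedMillis;

        Entry(T resource) {
            this.resource = resource;
        }
    }

    private final List<Entry<T>> entries = new ArrayList<>();
    private final Supplier<T> factory; // e.g. opens a new JDBC connection
    private final Consumer<T> closer;  // e.g. closes it for real

    SimplePool(Supplier<T> factory, Consumer<T> closer) {
        this.factory = factory;
        this.closer = closer;
    }

    /** Returns the first free resource, or creates a new one if all are leased. */
    synchronized T borrow() {
        for (Entry<T> e : entries) {
            if (!e.isLeased) {
                e.isLeased = true;
                e.lastUsedMillis = System.currentTimeMillis();
                return e.resource;
            }
        }
        Entry<T> created = new Entry<>(factory.get());
        created.isLeased = true;
        created.lastUsedMillis = System.currentTimeMillis();
        entries.add(created);
        return created.resource;
    }

    /** Marks the resource as free again; it stays open for the next caller. */
    synchronized void release(T resource) {
        for (Entry<T> e : entries) {
            if (e.resource == resource) {
                e.isLeased = false;
                e.lastUsedMillis = System.currentTimeMillis();
                return;
            }
        }
    }

    /** Closes and removes free resources that have been unused for at least maxIdleMillis. */
    synchronized int evictIdle(long maxIdleMillis) {
        long now = System.currentTimeMillis();
        int removed = 0;
        for (Iterator<Entry<T>> it = entries.iterator(); it.hasNext();) {
            Entry<T> e = it.next();
            if (!e.isLeased && now - e.lastUsedMillis >= maxIdleMillis) {
                closer.accept(e.resource);
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

The sketch has no upper bound on the number of resources and does not check whether a resource is still healthy before reusing it. Those are two of the things that mature pools add.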

You should only hold a database connection open for as long as you need it, which, depending on what you're doing, is probably within the scope of your doGet/doPost methods.

Pool it.
Also, if you are doing raw JDBC, you could look into something that helps you manage the Connection, PreparedStatement, etc. Unless you have very tight "lightweightness" requirements, using Spring's JDBC support, for instance, is going to simplify your code a lot, and you are not forced to use any other part of Spring.
See some examples here:
http://static.springframework.org/spring/docs/2.5.x/reference/jdbc.html

A connection pool associated with a DataSource should do the trick. You can get hold of the connection from the DataSource in the servlet request method (doGet/doPost, etc.).
DBCP, c3p0 and many other connection pools can do what you're looking for. While you're pooling connections, you might want to pool Statements and PreparedStatements too. Also, if you're in a read-heavy environment as you indicated, you might want to cache some of the results using something like Ehcache.

Usually you will find that opening connections per request is easier to manage. That means in the doPost() or the doGet() method of your servlet.
Opening it in the init() makes it available to all requests and what happens when you have concurrent requests?
