Does Hibernate, when used with a connection pool, need retries to cope with intermittent failures (e.g. network issues)? My colleague thinks they are unnecessary because we use a connection pool: if anything were wrong with the connection, the connection pool manager would take care of it. I'm not convinced, as the connection could be open and valid when it is handed out, yet the request itself could still succumb to network issues.
As this is related to payments, we need strong guarantees that the update takes place. I tried googling how Hibernate and connection pools deal with intermittent issues during a single request but couldn't find much information.
The entity is saved by a call to getSession().update(object); where getSession() returns the current Hibernate session. We use Hibernate v4.3, and the Hibernate documentation only mentions that an exception is thrown if there is a persistent instance with the same identifier.
I would appreciate links to references or documentation that might clear up my confusion.
You should rely on transactions to give you strong guarantees that a change is made atomically. In case of a (network) failure, your transaction would roll back.
Connection pools provide no such functionality; they only facilitate the reuse of connections. See this question about connection pooling: What is database pooling?
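To make the point concrete: the transaction gives you all-or-nothing behaviour, and any retrying has to happen one level up, around the whole unit of work. Below is a minimal sketch of such a wrapper in plain Java. The class and method names (RetryingUnitOfWork, runWithRetry) and the isTransient predicate are hypothetical, not part of Hibernate or of any pool. It assumes each attempt opens its own transaction, commits on success, and rolls back before rethrowing.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/** Hypothetical helper: retries one whole unit of work a bounded number of times. */
final class RetryingUnitOfWork {

    /**
     * Runs {@code work} until it succeeds, fails with a non-transient error,
     * or {@code maxAttempts} is reached. Each attempt is assumed to begin its
     * own transaction, commit on success, and roll back before rethrowing.
     */
    static <T> T runWithRetry(Callable<T> work,
                              Predicate<Exception> isTransient,
                              int maxAttempts,
                              long backoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return work.call();
            } catch (Exception e) {
                // Give up on the last attempt or on errors not known to be transient.
                if (attempt >= maxAttempts || !isTransient.test(e)) {
                    throw e;
                }
                Thread.sleep(backoffMillis * attempt); // simple linear backoff
            }
        }
    }
}
```

With Hibernate, the work passed in would typically call session.beginTransaction(), then update(object), then commit(), with rollback() in the catch path. Note that if the failure happens during the commit itself, the client cannot tell whether the commit went through, so for payments the retried update should be safe to apply twice.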
My current setup:
Java backend
Ebean
HikariCP
RDS Aurora MySQL v5.7 having writer and reader nodes
We use reader RDS node for business operations which only require read access to the database. This works just fine (no db locks, better performance, yay!).
However, looking at AWS Performance Insights, I can see that a lot of time is spent on the COMMIT operation. In fact, it's by far the most expensive operation on the read instance.
Not only does it take time to process, it also requires an extra client-server round trip. My naïve self suggests this could be avoided entirely, but I could not find any HikariCP settings for it. Surely there's nothing to commit for read-only database access, no?
That said, I do know that databases are allowed to create temporary tables even on read-only replicas, but it seems to me that they should be smart enough to destroy them once the transaction is over and the connection is returned to the pool.
FWIW, we never use autocommit=true for write access due to the nature of our app. I'd prefer not to use it for read-only access either.
Has anybody managed to get a COMMIT-less setup working, or is this perhaps a bad idea?
I need to know why we need a connection pool for a standalone application. As far as I know, a standalone application needs only one database connection instance. That's why we use the singleton pattern when creating the connection object using JDBC. So what's the use of having a connection pool for a standalone application? If I am using a connection pool, do I need to specify the max size as 1? Here I am trying to use the C3P0 connection pool with native Hibernate.
A major reason for using a connection pool is that it makes it easier for your application to recover in case the connection goes bad. The only time I would not use a connection pool is if it were acceptable for the program to fail when the connection stopped working. An example could be a very simple batch job that executes one transaction, where the job framework running it would retry it if it failed.
I agree that you have a stand-alone application, but that does not mean you always need to use the Singleton design pattern. What about a single application that spins up multiple threads, each connecting to the database? In that case a Singleton won't be of any help, and you should use a connection pool so that the DB operations are handled gracefully.
Connection pools and applications (stand-alone or distributed) are related to some extent, but it mostly depends on the use case. If you are working on a stand-alone desktop application that does simple CRUD, I agree that you need not use a connection pool. But if we are talking about multiple users working in parallel, I think we should always leverage a connection pool.
I'm not sure what your use case is, but the general statement "a stand-alone application does not need connection pooling" does not always hold true.
The cost of using a connection pool is usually insignificant.
Your data access layer does not need to know whether it's being called from a standalone application or, say, a multithreaded web application. So there's a good case for always using connection pooling, which doesn't hurt in the first case and is probably necessary in the second.
I am looking to develop a transaction framework that needs to update database tables concurrently.
In simple words, a single transaction should concurrently update around 8 independent tables, and the whole transaction should fail if any update throws an error.
Is there any way I can handle this concurrently,
i.e. 10 threads update 10 tables, and if any update fails, all the updates are rolled back?
Is there any framework that allows me to handle this scenario?
If I use JTA or Spring transactions, the work will share the same connection, which defeats the purpose of concurrent updates.
Or is there any way I can write a custom thread-based solution?
Why would using JTA or Spring Transaction mean you'll use the same connection? If you configure a connection pool and connect to it correctly, surely you'll get a different connection for each thread that you use?
This just seems like an unusually configured distributed transaction to me, and my first attempt at this would be to use Spring and/or Hibernate. I think you'd just have to ensure that you were treating the transactions as distributed transactions.
You can use standard JDBC. JDBC allows you to share a single Connection among multiple threads. To make several threads work in one transaction you should:
create a java.sql.Connection or take one from a pool
turn autocommit off
run concurrent tasks with the same connection
wait for the tasks to finish
commit if all tasks finished successfully; rollback otherwise
close connection
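The steps above can be sketched roughly as follows. This is only an illustration: SharedConnectionTransaction, ConnectionTask and runAll are made-up names, the Connection is assumed to be created or borrowed by the caller (step 1), and whether your driver really executes statements from several threads in parallel on one Connection is something to verify for your database.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper that runs several tasks inside one JDBC transaction. */
final class SharedConnectionTransaction {

    /** One unit of work that uses the shared connection (illustrative name). */
    interface ConnectionTask {
        void run(Connection con) throws Exception;
    }

    /**
     * Returns true if every task succeeded and the transaction was committed,
     * false if at least one task failed and the transaction was rolled back.
     */
    static boolean runAll(Connection con, List<ConnectionTask> tasks)
            throws SQLException, InterruptedException {
        List<Callable<Void>> calls = new ArrayList<>();
        for (ConnectionTask task : tasks) {
            calls.add(() -> {
                task.run(con);
                return null;
            });
        }
        ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        boolean committed = false;
        try {
            con.setAutoCommit(false);                      // turn autocommit off
            boolean allOk = true;
            // invokeAll starts the tasks and blocks until all of them have finished.
            for (Future<Void> result : executor.invokeAll(calls)) {
                try {
                    result.get();
                } catch (ExecutionException e) {
                    allOk = false;                         // this task threw
                }
            }
            if (allOk) {
                con.commit();                              // everything succeeded
                committed = true;
            }
            return committed;
        } finally {
            try {
                if (!committed) {
                    con.rollback();                        // any failure undoes all updates
                }
            } finally {
                executor.shutdown();
                con.close();                               // close, or return to the pool
            }
        }
    }
}
```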
It is also possible to use Spring JDBC if you use Spring's SingleConnectionDataSource
The framework is JTA.
Whether you can use one connection for all threads depends on your database. Details can be found here. So in the general case you need a connection for each thread.
If you use an XA data source, you could try to run the concurrent threads under control of a JTA transaction.
This is a lot of complexity, and it takes time to prepare the threads, so it's probably only useful if the updates take a long time, the affected tables are independent, and you have enough CPUs in your database server.
Update
Regarding transaction propagation, here you can find some thoughts on it.
What is the best way to manage a database connection in a Java servlet?
Currently, I simply open a connection in the init() function, and then close it in destroy().
However, I am concerned that "permanently" holding onto a database connection could be a bad thing.
Is this the correct way to handle this? If not, what are some better options?
edit: to give a bit more clarification: I have tried simply opening/closing a new connection for each request, but with testing I've seen performance issues due to creating too many connections.
Is there any value in sharing a connection over multiple requests? The requests for this application are almost all "read-only" and come fairly rapidly (although the data requested is fairly small).
As everybody says, you need to use a connection pool. Why? What up? Etc.
What's Wrong With Your Solution
I know this since I also thought it was a good idea once upon a time. The problem is two-fold:
All threads (each servlet request is served by its own thread) will be sharing the same connection. The requests will therefore get processed one at a time. This is very slow, even if you just sit in a single browser and lean on the F5 key. Try it: this stuff sounds high-level and abstract, but it's empirical and testable.
If the connection breaks for any reason, the init method will not be called again (because the servlet will not be taken out of service). Do not try to handle this problem by putting a try-catch in the doGet or doPost, because then you will be in hell (sort of writing an app server without being asked).
Contrary to what one might hope, you may also have problems with transactions: in plain JDBC a transaction belongs to the connection, not to the thread, so requests sharing one connection with auto-commit off would also share one transaction. But since this is a bad solution anyway, don't sweat it.
Why Connection Pool
Connection pools give you a whole bunch of advantages, but most of all they solve these problems:
Making a real database connection is costly. The connection pool always has a few extra connections around and gives you one of those.
If the connections fail, the connection pool knows how to open a new one.
Very important: every thread gets its own connection. This means that threading is handled where it should be: at the DB level. DBs are super efficient and can handle concurrent requests with ease.
Other stuff (like centralizing location of JDBC connect strings, etc.), but there are millions of articles, books, etc. on this
When to Get a Connection
Somewhere in the call stack initiated in your service delegate (doPost, doGet, doDisco, whatever) you should get a connection, and then you should do the right thing and return it in a finally block. I should mention that the C# main architect dude once said that you should use finally blocks 100x more than catch blocks. Truer words never spoken...
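A minimal sketch of that get-then-return-in-finally pattern, assuming the pool is exposed as a javax.sql.DataSource. The PooledConnections class, the SqlWork interface and the withConnection method are hypothetical names used only for illustration; how you look up the DataSource (JNDI or otherwise) is left out.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Hypothetical helper showing the borrow / use / return-in-finally pattern. */
final class PooledConnections {

    /** Work to run with a borrowed connection (illustrative name). */
    interface SqlWork<T> {
        T apply(Connection con) throws SQLException;
    }

    static <T> T withConnection(DataSource pool, SqlWork<T> work) throws SQLException {
        Connection con = pool.getConnection();   // borrow a connection for this request
        try {
            return work.apply(con);              // run the queries for this request
        } finally {
            // Runs on success and on failure. With a pooling DataSource,
            // close() hands the connection back to the pool.
            con.close();
        }
    }
}
```

From doGet or doPost you would call something like withConnection(dataSource, con -> ...). On Java 7 and later, a try-with-resources block around pool.getConnection() gives the same guarantee with less code.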
Which Connection Pool
You're in a servlet, so you should use the connection pool the container provides. Your JNDI code will be completely normal except for how you obtain the connection. As far as I know, all servlet containers have connection pools.
Some of the comments on the answers above suggest using a particular connection pool API instead, on the grounds that your WAR should be portable and "just deploy." I think this is basically wrong. If you use the connection pool provided by your container, your app will be deployable on containers that span multiple machines and all that fancy stuff that the Java EE spec provides. Yes, the container-specific deployment descriptors will have to be written, but that's the EE way, mon.
One commenter mentions that certain container-provided connection pools do not work with JDBC drivers (he/she mentions Websphere). That sounds totally far-fetched and ridiculous, so it's probably true. When stuff like that happens, throw everything you're "supposed to do" in the garbage and do whatever you can. That's what we get paid for, sometimes :)
I actually disagree with using Commons DBCP. You should really defer to the container to manage connection pooling for you.
Since you're using Java Servlets, that implies running in a Servlet container, and all major Servlet containers that I'm familiar with provide connection pool management (the Java EE spec may even require it). If your container happens to use DBCP (as Tomcat does), great, otherwise, just use whatever your container provides.
I'd use Commons DBCP. It's an Apache project that manages the connection pool for you.
You'd just get your connection in your doGet or doPost, run your query, and then close the connection in a finally block. (con.close() just returns it to the pool; it doesn't actually close it.)
DBCP can manage connection timeouts and recover from them. The way you are currently doing things, if your database goes down for any period of time, you'll have to restart your application.
Are you pooling your connections? If not, you probably should to reduce the overhead of opening and closing your connections.
Once that's out of the way, just keep the connection open for as long as it's needed, as John suggested.
The best way, and I'm currently looking through Google for a better reference sheet, is to use pools.
On initialization, you create a pool that contains X number of SQL connection objects to your database. Store these objects in some kind of List, such as ArrayList. Each of these objects has a private boolean for 'isLeased', a long for the time it was last used and a Connection. Whenever you need a connection, you request one from the pool. The pool will either give you the first available connection, checking on the isLeased variable, or it will create a new one and add it to the pool. Make sure to set the timestamp. Once you are done with the connection, simply return it to the pool, which will set isLeased to false.
To keep from constantly having connections tie up the database, you can create a worker thread that will occasionally go through the pool and see when the last time a connection was used. If it has been long enough, it can close that connection and remove it from the pool.
The benefit of this approach is that you don't have long wait times while a Connection object connects to the database. Your already established connections can be reused as much as you like. And you'll be able to set the number of connections based on how busy you think your application will be.
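The pool described above could look roughly like this toy version. It is a sketch for understanding the idea, not something to run in production: SimplePool and its method names are made up, it is generic so it can be tried without a database (in real use T would be java.sql.Connection), and it has no maximum size or liveness checks. The eviction method takes the current time as a parameter so that a worker thread can call it periodically with System.currentTimeMillis().

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Toy pool following the description above; all names are illustrative. */
final class SimplePool<T> {

    /** One pooled resource plus the bookkeeping described in the text. */
    private static final class Entry<T> {
        final T resource;
        boolean isLeased;
        long lastUsedMillis;

        Entry(T resource) {
            this.resource = resource;
        }
    }

    private final List<Entry<T>> entries = new ArrayList<>();
    private final Supplier<T> factory; // opens a new connection
    private final Consumer<T> closer;  // closes a connection for good

    SimplePool(Supplier<T> factory, Consumer<T> closer) {
        this.factory = factory;
        this.closer = closer;
    }

    /** Returns the first free resource, or creates one and adds it to the pool. */
    synchronized T lease() {
        for (Entry<T> e : entries) {
            if (!e.isLeased) {
                e.isLeased = true;
                e.lastUsedMillis = System.currentTimeMillis();
                return e.resource;
            }
        }
        Entry<T> fresh = new Entry<>(factory.get());
        fresh.isLeased = true;
        fresh.lastUsedMillis = System.currentTimeMillis();
        entries.add(fresh);
        return fresh.resource;
    }

    /** Hands a resource back so that later callers can reuse it. */
    synchronized void release(T resource) {
        for (Entry<T> e : entries) {
            if (e.resource == resource) {
                e.isLeased = false;
                e.lastUsedMillis = System.currentTimeMillis();
                return;
            }
        }
    }

    /** Closes and removes free resources idle for longer than maxIdleMillis. */
    synchronized int evictIdle(long maxIdleMillis, long nowMillis) {
        int evicted = 0;
        for (Iterator<Entry<T>> it = entries.iterator(); it.hasNext(); ) {
            Entry<T> e = it.next();
            if (!e.isLeased && nowMillis - e.lastUsedMillis > maxIdleMillis) {
                closer.accept(e.resource);
                it.remove();
                evicted++;
            }
        }
        return evicted;
    }
}
```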
You should only hold a database connection open for as long as you need it, which, depending on what you're doing, is probably within the scope of your doGet/doPost methods.
Pool it.
Also, if you are doing raw JDBC, you could look into something that helps you manage the Connection, PreparedStatement, etc. Unless you have very tight "lightweightness" requirements, using Spring's JDBC support, for instance, is going to simplify your code a lot, and you are not forced to use any other part of Spring.
See some examples here:
http://static.springframework.org/spring/docs/2.5.x/reference/jdbc.html
A connection pool associated with a DataSource should do the trick. You can get hold of the connection from the DataSource in the servlet request method (doGet/doPost, etc.).
dbcp, c3p0 and many other connection pools can do what you're looking for. While you're pooling connections, you might want to pool Statements and PreparedStatements; Also, if you're a READ HEAVY environment as you indicated, you might want to cache some of the results using something like ehcache.
Usually you will find that opening connections per request is easier to manage. That means in the doPost() or the doGet() method of your servlet.
Opening it in the init() makes it available to all requests and what happens when you have concurrent requests?
The JDBC 3.0 spec talks about Connection (and Prepared Statement) pooling.
We have several standalone Java programs (i.e. we are not using an application server) that have been using DBCP to provide connection pooling. Should we continue to use DBCP, or can we take advantage of the JDBC-provided pooling and get rid of DBCP?
We are using MySQL (Connector/J) and will eventually be adding SQL Server support (jTDS); it's unlikely that we'll support any other databases.
EDIT: See comment below about my attempt to eliminate the connection pooling library. It appears that DBCP is still relevant (note that some commenters recommended C3P0 over DBCP).
Based on the encouragement of other posters, I attempted to eliminate DBCP and use the MySQL JDBC driver directly (Connector/J 5.0.4). I was unable to do so.
It appears that while the driver does provide a foundation for pooling, it does not provide the most important thing: an actual pool (the source code came in handy for this). It is left up to the application server to provide this part.
I took another look at the JDBC 3.0 documentation (I have a printed copy of something labeled "Chapter 11 Connection Pooling", not sure exactly where it came from) and I can see that the MySQL driver is following the JDBC doc.
When I look at DBCP, this decision starts to make sense. Good pool management provides many options. For example: When do you purge unused connections? Which connections do you purge? Is there a hard or soft limit on the max number of connections in the pool? Should you test a connection for "liveness" before giving it to a caller?
Summary: if you're doing a standalone Java application, you need to use a connection pooling library. Connection pooling libraries are still relevant.
DBCP has serious flaws. I don't think it's appropriate for a production application, especially when so many drivers support pooling in their DataSource natively.
The straw that broke the camel's back, in my case, was when I found that the entire pool was locked the whole time a new connection attempt is made to the database. So, if something happens to your database that results in slow connections or timeouts, other threads are blocked when they try to return a connection to the pool—even though they are done using a database.
Pools are meant to improve performance, not degrade it. DBCP is naive, complicated, and outdated.
I prefer using DBCP or c3p0 because they are vendor neutral. I found out, at least with MySQL and Oracle, that whenever I try to do something with the JDBC client that is not standard SQL, I have to introduce a compile-time dependency on the vendor's classes. See, for example, a very annoying example here.
I am not sure about MySQL, but Oracle uses its own specific, non-standard classes for connection pooling.
People still use DBCP, I think it even comes as a default with Hibernate.
Is DBCP not meeting your current needs?
I'm not a big believer in replacing infrastructure unless there's already a performance or functionality gap that it can't fill, even if there are newer or fancier alternatives around.