I have a Java program which updates a table in an Oracle database.
I tried it with a single JDBC connection, but it is very slow and takes hours to complete.
I'm trying to use HikariCP to create a connection pool and have multiple threads get separate connections from the pool.
Suppose I have 6 threads and 5 database connections in the pool and 5 of the threads call the HikariDataSource.getConnection() method. Will each of them get a separate db connection object?
If yes, then will the thread be in a blocked/waiting state when it calls the getConnection method, or will it execute the remaining code with a null connection?
If no, how do I get them separate connections?
Will each of them get a separate db connection object?
Yes. Each thread asks the pool for a connection and, if one is available, gets its own separate connection object.
If yes, then will the thread be in a blocked/waiting state when it calls the getConnection method, or will it execute the remaining code with a null connection?
getConnection() never returns null. If no connection is available, the thread waits until one is released back to the pool and then takes it. If none becomes available within the configured timeout (connectionTimeout in HikariCP), getConnection() throws an SQLException.
If no, how do I get them separate connections?
Not applicable, because each thread gets a different connection.
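The wait-then-timeout behaviour described above can be illustrated without HikariCP itself. What follows is only a minimal sketch of a toy pool built on a BlockingQueue. The class ToyPool and its methods borrow and giveBack are hypothetical names, not HikariCP API, and a real pool does much more (validation, leak detection, resizing).

```java
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy pool for illustration only; NOT how HikariCP is implemented. */
final class ToyPool<T> {
    private final BlockingQueue<T> idle;
    private final long timeoutMillis;

    ToyPool(Collection<T> resources, long timeoutMillis) {
        // All resources start out idle; the queue capacity is the pool size.
        this.idle = new ArrayBlockingQueue<>(resources.size(), false, resources);
        this.timeoutMillis = timeoutMillis;
    }

    /** Waits up to the timeout for a free resource; never returns null. */
    T borrow() throws InterruptedException, TimeoutException {
        T resource = idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (resource == null) {
            throw new TimeoutException("no resource free within " + timeoutMillis + " ms");
        }
        return resource;
    }

    /** Hands a borrowed resource back so a waiting thread can take it. */
    void giveBack(T resource) {
        idle.offer(resource);
    }
}
```

With 6 threads and a pool of 5, the sixth caller of borrow() simply waits until another thread calls giveBack(), or fails with the timeout exception if nobody does in time.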
About HikariCP and concurrency:
HikariCP contains a custom lock-free collection called a ConcurrentBag. The idea was borrowed from the C# .NET ConcurrentBag class, but the internal implementation is quite different. The ConcurrentBag provides...
A lock-free design
ThreadLocal caching
Queue-stealing
Direct hand-off optimizations
...resulting in a high degree of concurrency, extremely low latency, and minimized occurrences of false-sharing.
Related
I've seen two ways to deal with database connections:
1) Connection pool
2) Bind connection to a thread (when we have fixed and constant threads count)
But I don't understand the purpose of #2. What are the advantages of the second approach over the first one?
If you're working with a single thread or a very small set of threads (that need database functionality), binding a connection to a thread would act like a poor man's connection pool. Instead of checking out a connection from the pool every time you use it, you would just use the single connection bound to the thread. This would allow for quick execution of database queries, even with code that hasn't been very well designed.
However in many cases you're not working with a single thread or a small set of threads. As soon as you're developing an application with even dozens of simultaneous users, you're better off working with a connection pool as it will become impossible to dedicate a connection to every thread (see next paragraph).
Some people also have the misunderstanding that a connection pool can and should have a lot of connections (100 or more), even though it's often more advantageous to have fewer. Since all of the connections use the database's resources, the effect is similar to having a store with a single cash register. It's not more efficient to have 10 doors to the store instead of 1, since it will just fill up with customers but the payments won't happen any faster.
In database connection pooling, does each connection correspond to one thread in the database?
Would it really matter? I mean, there could be a thread pool executor that executes whatever a connection object has to execute.
I wonder how it works, because that would help in understanding how this thing is actually tuned.
My understanding so far is "one connection, one thread". Otherwise, why would most databases be blocking?
Connection pooling is what you have on client side (i.e. in Java).
The connection pool is just that: a pool of open connections to the database. These are not bound to threads; any number of threads can request a connection from the pool at any given time. The pool grants the request if a connection is available, and if not, it either creates a new one, blocks, or denies the request (depending on the implementation). The main idea is to have fewer connections than threads. Another purpose is to keep those connections open when there are many short DB operations, because creating a DB connection is expensive.
On the server side, this depends on the DB implementation. I would expect most DB servers to use one thread per connection - someone has to listen on the open socket after all. For many DB engines this can be much more complex though, e.g. there may be one module listening on the socket, and in turn sending the queries to another module that may have different number of threads running.
I have a Java server and PostgreSQL database.
There is a background process that queries (inserts some rows) the database 2..3 times per second. And there is a servlet that queries the database once per request (also inserts a row).
I am wondering whether I should have separate Connection instances for them or share a single Connection instance between them.
Also, does this even matter? Or is the PostgreSQL JDBC driver internally just sending all requests to a unified pool anyway?
One more thing: should I make a new Connection instance for every servlet request thread? Or share one Connection instance among all servlet threads and keep it open for the entire uptime?
By separate I mean that every thread creates its own Connection instance like this:
Connection connection = DriverManager.getConnection(url, user, pw);
If you use a single connection and share it, only one thread at a time can use it and the others will block, which will severely limit how much your application can get done. Using a connection pool means that the threads can have their own database connections and can make concurrent calls to the database server.
See the postgres documentation, "Chapter 10. Using the Driver in a Multithreaded or a Servlet Environment":
A problem with many JDBC drivers is that only one thread can use a
Connection at any one time --- otherwise a thread could send a query
while another one is receiving results, and this could cause severe
confusion.
The PostgreSQL™ JDBC driver is thread safe. Consequently, if your
application uses multiple threads then you do not have to worry about
complex algorithms to ensure that only one thread uses the database at
a time.
If a thread attempts to use the connection while another one is using
it, it will wait until the other thread has finished its current
operation. If the operation is a regular SQL statement, then the
operation consists of sending the statement and retrieving any
ResultSet (in full). If it is a fast-path call (e.g., reading a block
from a large object) then it consists of sending and retrieving the
respective data.
This is fine for applications and applets but can cause a performance
problem with servlets. If you have several threads performing queries
then each but one will pause. To solve this, you are advised to create
a pool of connections. When ever a thread needs to use the database,
it asks a manager class for a Connection object. The manager hands a
free connection to the thread and marks it as busy. If a free
connection is not available, it opens one. Once the thread has
finished using the connection, it returns it to the manager which can
then either close it or add it to the pool. The manager would also
check that the connection is still alive and remove it from the pool
if it is dead. The down side of a connection pool is that it increases
the load on the server because a new session is created for each
Connection object. It is up to you and your applications'
requirements.
As I understand it, you should let the container manage connection pooling for you.
You're using servlets, which run in a servlet container, and all major servlet containers that I'm aware of provide connection pool management.
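As a rough sketch of what using a pooled connection per request can look like: the code borrows a connection from a javax.sql.DataSource for one unit of work and lets try-with-resources close it, which for a pooled DataSource typically returns the connection to the pool rather than closing the socket. The class name EventDao, the table events and its column payload are hypothetical. How the DataSource is obtained (for example from the container) is left out and depends on your setup.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Hypothetical DAO: one short-lived connection per call, taken from a pool. */
final class EventDao {
    private final DataSource dataSource; // assumed to be backed by a connection pool

    EventDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Inserts one row and returns the update count reported by the driver. */
    int insertEvent(String payload) throws SQLException {
        // Resources are closed in reverse order: the statement first, then the connection.
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(
                     "INSERT INTO events(payload) VALUES (?)")) { // hypothetical table
            statement.setString(1, payload);
            return statement.executeUpdate();
        }
    }
}
```

Both the servlet and the background process could share one EventDao instance, since no Connection is held between calls.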
See Also
Best way to manage database connection for a Java servlet
I have a program that creates multiple threads, and I need any one of them to be able to write to the DB. The problem is that if I use the same connection, the data is incorrect because multiple threads access the same variables (for example via statement.setString()).
If I use a different connection per thread, it takes away all the benefit of the threads.
In summary: I need all threads to access a class or another thread that holds one connection, accumulates a batch of queries, and once in a while executes the batch.
thank you!
I see no point in doing this but if you want to do it anyway, then I suggest you synchronize the access to the DB through this connection. Add some common LOCK object and do this:
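// one lock object shared by every thread that uses the connection
private static final Object LOCK = new Object();

synchronized (LOCK) {
    // use the connection from the current thread here, including
    // the sensitive operations (setString, addBatch, executeBatch)
    // that need this synchronization
}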
But then note that even though you're using multiple threads they will wait on each other, i.e. their access to the DB through this connection will be serialized (not simultaneous).
The situation you are describing (many threads accessing a DB) is exactly what is common in web applications. The recommended practice is to set up a pool of connections, which strikes a balance between the contention that would arise with one single connection and the resource consumption of one connection per thread. Apache DBCP is one example of such a connection pool.
If you really can't open more than one connection, you could add a boolean flag called bound to your connection class: a thread may not use the connection while bound is set, and any thread that uses it clears the flag when done. Each thread loops until the flag is false, then sets it to true and starts using the connection. Note that the check and the set must happen as one atomic step (for example inside a synchronized block), otherwise two threads can both see the flag as false and take the connection at the same time. You could also create a dedicated thread for all network communication that accepts tasks from the other threads through a queue and executes them one at a time, so nothing gets mixed up. There are many ways around this; choose whichever is most appropriate for your application.
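The dedicated-thread idea can be sketched roughly as follows. This is only an illustration under assumptions: BatchingWriter is a hypothetical class, and the flush callback stands in for whatever would call addBatch/executeBatch on the single connection. Producers never touch the connection; they only enqueue items.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical single-writer helper: many producers, one thread that flushes batches. */
final class BatchingWriter<T> implements AutoCloseable {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private final Thread worker;
    private volatile boolean running = true;

    BatchingWriter(int maxBatch, Consumer<List<T>> flush) {
        if (maxBatch < 1) throw new IllegalArgumentException("maxBatch must be >= 1");
        worker = new Thread(() -> {
            try {
                // Keep going until close() was called AND the queue is drained.
                while (running || !queue.isEmpty()) {
                    T first = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (first == null) continue;        // nothing queued yet
                    List<T> batch = new ArrayList<>();
                    batch.add(first);
                    queue.drainTo(batch, maxBatch - 1); // take up to maxBatch items in total
                    flush.accept(batch);                // only this thread ever flushes
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "db-writer");
        worker.start();
    }

    /** Called from any thread; items submitted after close() may be dropped. */
    void submit(T item) {
        queue.add(item);
    }

    /** Stops accepting work, lets the writer drain the queue, then waits for it. */
    @Override
    public void close() throws InterruptedException {
        running = false;
        worker.join();
    }
}
```

Because only the writer thread runs flush, the statement and connection used inside it need no further locking. Error handling in flush is deliberately left out of the sketch.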
What is a Connection object in JDBC? How is this connection maintained (I mean, is it a network connection)? Are they TCP/IP connections? Why is it a costly operation to create a connection every time? Why do these connections become stale after some time, so that I need to refresh the pool? Why can't I use one connection to execute multiple queries?
These connections are TCP/IP connections. To avoid the overhead of creating a new connection every time, there are connection pools that expand and shrink dynamically. You can use one connection for multiple queries. I think you mean that you release it to the pool. If you do that, you might get the same connection back from the pool. In this case it doesn't matter whether you run one query or several.
The cost of a connection is the connecting itself, which takes some time. The database also prepares some things, such as a session, for every connection, and that would have to be done every time. Connections become stale for several reasons. The most prominent is a firewall in between. Connection problems can lead to the connection being reset, or there can be simple timeouts.
To add to the other answers:
Yes, you can reuse the same connection for multiple queries. This is even advisable, as creating a new connection is quite expensive.
You can even execute multiple queries concurrently. You just have to use a new java.sql.Statement/PreparedStatement instance for every query. Statements are what JDBC uses to keep track of ongoing queries, so each parallel query needs its own Statement. You can and should reuse Statements for consecutive queries, though.
The answer to your questions is that they are implementation-defined. A JDBC connection is an interface that exposes methods. What happens behind the scenes can be anything that delivers the interface. For example, consider the Oracle internal JDBC driver, used for supporting Java stored procedures. Simultaneous queries are not only possible on that, they are more or less inevitable, since each request for a new connection returns the one and only connection object. I don't know for sure whether it uses TCP/IP internally, but I doubt it.
So you should not assume implementation details, without being clear about precisely which JDBC implementation you are using.
Since I cannot comment yet, I will post an answer just to comment on Vinegar's answer. Resetting setAutoCommit() to its default state when a connection is returned to the pool is not mandatory behaviour and should not be taken for granted. The same goes for closing statements and result sets: you may read that if you do not close them they will be closed automatically when the connection is closed, but don't rely on that, since on some versions of JDBC drivers they will keep consuming resources.
We had a serious problem with a DB2 database on AS400: developers who needed transactional isolation called connection.setAutoCommit(false) and, after finishing their work, returned the connection to the pool (JNDI) without calling connection.setAutoCommit(old_state). When another thread got this connection from the pool, its inserts and updates were not committed, and for a long time nobody could figure out why.
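One way to guard against this is to restore the previous auto-commit state in a finally block before the connection goes back to the pool. The helper below is a minimal sketch, not a standard API: AutoCommitGuard, SqlWork and inTransaction are hypothetical names, and whether your pool already resets this state for you depends on the pool and its configuration.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper that always restores the caller's auto-commit setting. */
final class AutoCommitGuard {

    /** Unit of work that runs against an already opened connection. */
    interface SqlWork {
        void run(Connection connection) throws SQLException;
    }

    static void inTransaction(Connection connection, SqlWork work) throws SQLException {
        boolean previous = connection.getAutoCommit(); // remember the old state
        connection.setAutoCommit(false);
        try {
            work.run(connection);
            connection.commit();
        } catch (SQLException | RuntimeException e) {
            connection.rollback();                     // undo partial work, then rethrow
            throw e;
        } finally {
            connection.setAutoCommit(previous);        // runs on success and on failure
        }
    }
}
```

The next thread that takes this connection from the pool then sees the same auto-commit mode as before the transactional code ran.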