Simultaneous access to Database in Web application - java

How can I know the average or exact number of users accessing database simultaneously in my Java EE web enterprise application? I would like to see if the "Connection pool setting" I set in the Glassfish application server is suitable for my web application or not. I need to correctly set the maximun number of connection in Connection Pool setting in Application Server. Recently, my application ran out of connections and threw exceptions when the client request for DB expires.

There are multiple ways.
One and easiest would be take help from your DBAs - they can tell you exactly how many connections are active from your webserver or the user id for connection pool at a given time.
If you want some excitement, you will have to JMX management extensions provided by glassfish. Listing 6 on this page - gives an example as to how to write a JMS based snippet to monitor a connection pool.
Finally, you must make sure that all connections are closed explicitly by a connection.close(); type of call in your application. In some cases, you need to close ResultSet as well.
Next is throttling your http thread pool to avoid too many concurrent access if your db connections are taking longer to close.

Related

Direct db connection per http request vs connection pooling- what is the difference

Let's say I am storing data of Person(id, country_id, name). And let's say user just sent the id and country_id and we send back the name.
Now I have one db and 2 webserver and each webserver keeps a connection pool (e.g. c3p0) of 20 connection.
That means db is maintaining 40 connections and each webserver is maintaining 20 connections.
Analyzing the above system we can see that we used connection pool because people say "creating db connection is expensive"
This all make sense
Now let's say I shard table data on country_id, so now there may be 200 db, also assuming our app is popular now and we need to have 50 webserver.
Now the above strategy of connection pooling fails as if each webserver is keeping 20 connections in the pool for each db.
that means each webserver will have 20*200 db = 4000 connection
and each db will have 50 web server *20 = 1000 connection.
This doesn't sound good, so I got the question that why use connection pooling what is the overhead of creating 1 connection per web request?
So I run a test where I saw that DriverManager.getConnection() takes a average of 20 ms on localhost.
20 ms extra per request is not a game killer
Question1: Is there any other downside of using 1 connection per web request ?
Question2: People all over internet say "db connection is expensive". What are the different expenses?
PS: I also see pinterest doing same https://medium.com/#Pinterest_Engineering/sharding-pinterest-how-we-scaled-our-mysql-fleet-3f341e96ca6f
Other than Connection creation & Connection close cycle being a time consuming task ( i.e. being costly ) , pooling is also done to control the number of simultaneous open connections to your database since there is a limit on number of simultaneous connections that a db server can handle. When you do , one connection per request , you loose that control and your application is always at risk of crashing at peak load.
Secondly, you would unnecessarily tie your web server capacity with your database capacity and target is also to treat db connection management not as a developer concern but an infrastructure concern. Would you like to give control to open a database connection for production application to developer as per his/her code?
In traditional monolithic application servers like Weblogic, JBoss, WebSphere etc, Its sys admin who will create a connection pool as per db serer capacity and pass on JNDI name to developers to use to.Developer's job is to only get connection using that JNDI.
Next comes if database is shared among various independent applications then pooling lets you know as what you are giving out to which application. Some apps might be more data intensive and some might not be that intensive.
Traditional problem of resource leak i.e when developers forget to cleanly close their connection is also taken care of with pooling.
All in all - idea behind pooling is to let developers be concerned only about using a connection and do their job and not being worried about opening and closing it. If a connection is not being used for X minutes, it will be returned to pool per configuration.
If you have a busy web site and every request to the database opens and closes a connection, you are dead in the water.
The 20ms you measured are for a localhost connection. I don't think that all your 50 web servers will be on localhost...
Apart from the time it takes to establish and close a database connection, it also uses resources on the database server. This is mostly the CPU, but there could also be contention on kernel data structures.
Also, if you allow several thousand connections, there is nothing that keeps them from all gettings busy at the same time, in which case your database server will be overloaded and unresponsive unless it has several thousand cores (and even then you'd be limited by lock contention).
Your solution is an external connection pool like pgBouncer.

Datasource Microsoft JDBC Driver for SQL Server (AlwaysOn Availability Groups)

I have a question related to the scenario when connecting from a Java application using the Microsoft JDBC Driver 4.0 to a SQL Server 2014 with AlwaysOn Availability Groups set up for high availability.
With this set up, we will be connecting to an availability group listener (specified in the db connecting string instead of any particular instance), so that the DB fail-over etc. is handled gracefully by the listener and it tries to connect to the next available instance behind the scenes if current primary goes down in the AG cluster.
Question(s) I have is,
In the data-source that is configured on the j2ee application server side (we use WebSphere), what happens to those connections already pooled by the data-source?
When a database goes down, though the AG listener would try to reconnect on the db side to the next available DB, will the AG Listener also through the jdbc driver send an event or something to the data-source created on the app server and make sure those connections that are already pooled by the datasource to be discarded and have it create new ones so that transactions on the application side wont fail (though they might for a while till new connections are created and fail over is successful) or the java application has to find out only after requesting it from the datasource?
WebSphere Application Server is able to cope with bad connections and removes them from the pool. Exactly when this happens depends on some configurable options and on how fully the Microsoft JDBC driver takes advantage of the javax.sql.ConnectionEventListener API to send notifications to the application server. In the ideal case where a JDBC driver sends the connectionErrorOccurred event immediately for all connections, WebSphere Application Server responds by removing all of these connections from the pool and by marking any connection that is currently in-use as bad so that it does not get returned to the pool once the application closes the handle. Lacking this, WebSphere Application Server will discover the first bad connection upon next use by the application. It is discovered either by a connectionErrorOcurred event that is sent by the JDBC driver at that time, or lacking that, upon inspecting the SQLState/error code of an exception for known indicators of bad connections. WebSphere Application Server then goes about purging bad connections from the pool according to the configured Purge Policy. There are 3 options:
Purge Policy of Entire Pool - all connections are removed from
the pool and in-use connections marked as bad so that they are not
pooled.
Purge Policy of Failing Connection Only - only the
specific connection upon which the error actually occurred is
removed from the pool or marked as bad and not returned to the pool
Purge Policy of Validate All Connections - all connections are
tested for validity (Connection.isValid API) and connections found
to be bad are removed from the pool or marked as bad and not
returned to the pool. Connections found to be valid remain in the
pool and continue to be used.
I'm not sure from your description if you are using WebSphere Application Server traditional or Liberty. If traditional, there is an additional option for pre-testing connections as they are handed out of the pool, but be aware that turning this on can have performance implications.
That said, the one thing to be aware of is that regardless of any of the above, your application will always need to be capable of handling the possibility of errors due to bad connections (even if the connection pool is cleared, connections can go bad while in use) and respond by requesting a new connection and retrying the operation in a new transaction.
Version 4 of that SQL Server JDBC driver is old and doesn't know anything about the always on feature.
Any data source connection pool can be configured to check the status of the connection from the pool prior to doling it out to the client. If the connection cannot be used the pool will create a new one. That's true of all vendors and versions. I believe that's the best you can do.

Opening a new database connection for every client that connects to the server application?

I am in the process of building a client-server application and I would really like an advise on how to design the server-database connection part.
Let's say the basic idea is the following:
Client authenticates himself on the server.
Client sends a request to server.
Server stores client's request to the local database.
In terms of Java Objects we have
Client Object
Server Object
Database Object
So when a client connects to the server a session is created between them through which all the data is exchanged. Now what bothers me is whether i should create a database object/connection for each client session or whether I should create one database object that will handle all requests.
Thus the two concepts are
Create one database object that handles all client requests
For each client-server session create a database object that is used exclusively for the client.
Going with option 1, I guess that all methods should become synchronized in order to avoid one client thread not overwriting the variables of the other. However, making it synchronize it will be time consuming in the case of a lot of concurrent requests as each request will be placed in queue until the one running is completed.
Going with option 2, seems a more appropriate solution but creating a database object for every client-server session is a memory consuming task, plus creating a database connection for each client could lead to a problem again when the number of concurrent connected users is big.
These are just my thoughts, so please add any comments that it may help on the decision.
Thank you
Option 3: use a connection pool. Every time you want to connect to the database, you get a connection from the pool. When you're done with it, you close the connection to give it back to the pool.
That way, you can
have several clients accessing the database concurrently (your option 1 doesn't allow that)
have a reasonable number of connections opened and avoid bringing the database to its knees or run out of available connections (your option 2 doesn't allow that)
avoid opening new database connections all the time (your option 2 doesn't allow that). Opening a connection is a costly operation.
Basically all server apps use this strategy. All Java EE servers come with a connection pool. You can also use it in Java SE applications, by using a pool as a library (HikariCP, Tomcat connection pool, etc.)
I would suggested a third option, database connection pooling. This way you create a specified number of connections and give out the first available free connection as soon as it becomes available. This gives you the best of both worlds - there will almost always be free connections available quickly and you keep the number of connections the database at a reasonable level. There are plenty of the box java connection pooling solutions, so have a look online.
Just use connection pooling and go with option 2. There are quite a few - C3P0, BoneCP, DBCP. I prefer BoneCP.
Both are not good solutions.
Problem with Option 1:
You already stated the problems with synchronizing when there are multiple threads. But apart from that there are many other problems like transaction management (when are you going to commit your connection?), Security (all clients can see precommitted values).. just to state a few..
Problem with Option 2:
Two of the biggest problems with this are:
It takes a lot of time to create a new connection each and every time. So performance will become an issue.
Database connections are extremely expensive resources which should be used in limited numbers. If you start creating DB Connections for every client you will soon run out of them although most of the connections would not be actively used. You will also see your application performance drop.
The Connection Pooling Option
That is why almost all client-server applications go with the connection pooling solution. You have a set connections in the pool which are obtained and released appropriately. Almost all Java Frameworks have sophisticated connection pooling solutions.
If you are not using any JDBC framework (most use the Spring JDBC\Hibernate) read the following article:
http://docs.oracle.com/javase/jndi/tutorial/ldap/connect/pool.html
If you are using any of the popular Java Frameworks like Spring, I would suggest you use Connection Pooling provided by the framework.

Connection Pooling and Oracle Seesion

Before I start with my question, I would like to clarify that I am a DB developer and have limited understanding of things on Java/J2EE side.
Ours is a web application (n-tier with app server/web server). We are using connection pooling to manage connections to the database. I have limited understanding of connection pooling - App server manages connections for applications, lets the application get a connection from a pool, return the connection once its done back to the pool.
Let's say that I follow these steps -
1. Let's say that I log in the application
2. Application requests for a connection from connection pool to authenticate me
3. Once authentication is done, App server will return the connection back to pool
4. I browse to a page where I have to do some CRUD operation and let's say that I am updating some data on the page.
5. App Server will again request for a connection from Pool
6. Application will process the data using the connection.
Here is my problem statement -
Let's say that I have to capture audit information using triggers (on the tables which are undergoing update). One of the attribute which I need to capture is username (logged in user).
I set a global package variable when I log in (step 1 - 3), which stores the logged in user name. My trigger is going to read the global package variable for the username. Since the connections are not going to remain same (connection pool manages connection), will my global package variable be available when I am processing the trigger?
What will happen to the variable (it obviously depends on answer to first question) when multiple users are logged in and accessing the application?
I tried to look around but have not been able to get clear answer to my doubts.
Pardon me, if my question is not clear. Let me know and I can edit to provide more information.
You can use CLIENT_IDENTIFIER attribute to preserve the actual user who logged in to the application.
Please find below more information from Oracle documentation:
Support for Application User Models by Using Client Identifiers
Many applications use session pooling to set up a number of sessions to be reused by multiple application users. Users authenticate themselves to a middle-tier application, which uses a single identity to log in to the database and maintains all the user connections. In this model, application users are users who are authenticated to the middle tier of an application, but who are not known to the database. Oracle Database supports use of a CLIENT_IDENTIFIER attribute that acts like an application user proxy for these types of applications.
In this model, the middle tier passes a client identifier to the database upon the session establishment. The client identifier could actually be anything that represents a client connecting to the middle tier, for example, a cookie or an IP address. The client identifier, representing the application user, is available in user session information and can also be accessed with an application context (by using the USERENV naming context). In this way, applications can set up and reuse sessions, while still being able to keep track of the application user in the session. Applications can reset the client identifier and thus reuse the session for a different user, enabling high performance.
You can set the CLIENT_IDENTIFIER in java using the following code snippet:
public Connection prepare(Connection conn) throws SQLException {
String prepSql = "{ call DBMS_SESSION.SET_IDENTIFIER('userName') }";
CallableStatement cs = conn.prepareCall(prepSql);
cs.execute();
cs.close();
return conn;
}

JDBC Connection Link Failure - How to fail over?

I have a stand-alone Java windows application developed based on Swing. It connects to a MySQL database for data storage. In case the database connection fails, I am getting a link failure exception from the MySQL JDBC driver (MySQLNonTransientConnectionException). I don't want to re-instantiate my database connection object or the whole program in case such a link failure issue happens. I just want to tell the user to try again later without having to restart the entire application. If the user is asked to restart the entire application, that would probably give a negative impression on the quality of the program. What do you think would be the preferred way for a standard java application to fail-over after such a database link failure without having to re-instantiate all the communication objects? Thanks in advance.
Use a Connection Pool (such as C3PO or DBCP). Your application takes the Connections from the pool, executes the statement(s) and puts the Connection back into the pool. The pool can be configured to test the JDBC Connections. For example, if they become stale, they can be automatically reinstantiated by the pool.
If your application takes the Connection from the pool, it will be a valid Connection. Let the pool handle the management of valid/invalid/stale JDBC Connections.

Categories