I am sure this has been asked before, but I had a hard time finding the proper search terms, so I was unable to find any information. I apologize if it is a duplicate.
Consider the following scenario:
A game server is backed by an SQL database for player storage and logging. Every time a player logs in, data is retrieved and written. In addition, every few seconds (say 20), the logs are written to the database along with any changed player data.
I am wondering how to handle these connections. Keeping the connection open forever is a bad idea because the MySQL server closes it after a period of inactivity (wait_timeout).
Opening a new connection each time works, but I am wondering whether that is the best approach or whether there is another possibility.
That is what connection pools are good for. Try HikariCP; it's extremely fast. You can use it on top of plain JDBC, as well as with JPA or O/R mappers. It keeps a set of connections open (pooling) and manages their reuse when you have many concurrent connections.
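For illustration, here is a minimal sketch of HikariCP over plain JDBC. The URL, credentials, players table, and savePlayerScore method are hypothetical stand-ins for your game server's storage code:

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;

public class Pool {
    // One pool for the whole application; HikariDataSource is thread-safe.
    private static final HikariDataSource DS;

    static {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:mysql://localhost:3306/game"); // hypothetical URL
        config.setUsername("game");                            // hypothetical credentials
        config.setPassword("secret");
        config.setMaximumPoolSize(10); // tune to your concurrency
        DS = new HikariDataSource(config);
    }

    public static void savePlayerScore(int playerId, int score) throws Exception {
        // getConnection() borrows from the pool; close() returns the
        // connection to the pool instead of closing it.
        try (Connection conn = DS.getConnection();
             PreparedStatement ps = conn.prepareStatement(
                     "UPDATE players SET score = ? WHERE id = ?")) {
            ps.setInt(1, score);
            ps.setInt(2, playerId);
            ps.executeUpdate();
        }
    }
}

The pool also handles the idle-timeout problem from the question: it validates and replaces connections the server has closed, so your code never sees a stale one.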
If you have to store logs in the database, several logging frameworks already have functions to do so. For example, Logback has a DBAppender that works on top of connection pools:
"..., sending 500 logging requests to the aforementioned MySQL database takes around 0.5 seconds, for an average of 1 millisecond per request, that is a tenfold improvement in performance." (source)
I'm connecting to a small PostgreSQL database on the local LAN using Java and running some queries, and I'm not sure what best practice is.
Should I close the connection after each query or keep the connection open for the next query?
You probably want to keep the connection open for as long as possible, but in practice it probably won't matter much unless you're making hundreds of little queries and reconnecting for every one.
Here's an overview of what's going on, assuming your network has ping times of around 1 ms:
making the socket connection: the TCP handshake should take <5 ms
your client will present authentication credentials, and the server will check them; this might take a while depending on how things are configured, but by default it takes <1 ms
your client sends a query; the server populates various internal data structures and caches (this takes a few milliseconds), and can then plan and execute the query, which will vary according to the query
If you reuse this connection, you save the cost of authentication and catalog loading, so it's recommended to reuse connections where possible. Depending on your usage this may or may not make much difference; for example (see the sketch after this list):
if you're running lots of queries that take a few milliseconds each then it'll certainly be helpful to reuse connections
if you're mostly running queries that take many seconds to execute then this cost will be negligible
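A rough way to observe the difference yourself: the URL and credentials below are placeholders, and the query is a trivial SELECT 1 so that connection setup dominates.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ReuseDemo {
    private static final String URL = "jdbc:postgresql://localhost:5432/test"; // placeholder

    public static void main(String[] args) throws Exception {
        int n = 200;

        // Variant 1: a fresh connection per query -- pays the handshake,
        // authentication, and catalog-cache setup every time.
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            try (Connection c = DriverManager.getConnection(URL, "user", "pass");
                 Statement s = c.createStatement()) {
                s.executeQuery("SELECT 1").close();
            }
        }
        System.out.printf("reconnecting: %d ms%n", (System.nanoTime() - t0) / 1_000_000);

        // Variant 2: one connection reused -- the setup cost is paid once.
        t0 = System.nanoTime();
        try (Connection c = DriverManager.getConnection(URL, "user", "pass");
             Statement s = c.createStatement()) {
            for (int i = 0; i < n; i++) {
                s.executeQuery("SELECT 1").close();
            }
        }
        System.out.printf("reusing:      %d ms%n", (System.nanoTime() - t0) / 1_000_000);
    }
}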
I have created a custom web API for submitting data to a MySQL server for my Java application. Within the application I need to add/update 200 rows, and I currently make these requests one at a time.
This can be pretty time-consuming, so:
Can I create threads for all of these different connections?
Should I limit the maximum number of connections made at a time? Maybe 10 at a time?
Will this cause any issues with MySQL if rows are added at exactly the same time? (No two rows would ever need to be changed at a single point in time.)
Inserting records over multiple connections will potentially speed up the 200 inserts, but you can only know for sure by testing and measuring after introducing multiple threads. I would also suggest trying a JDBC batch and sending all 200 inserts to the database in one go (if that is possible in your implementation), as that might provide a performance boost by saving round trips to the database.
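As a sketch of the JDBC batch idea, with a measurements table and int-pair rows standing in for your 200 records:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class BatchInsert {
    // 'rows' and the 'measurements' table are hypothetical stand-ins.
    static void insertAll(Connection conn, List<int[]> rows) throws Exception {
        String sql = "INSERT INTO measurements (sensor_id, value) VALUES (?, ?)";
        conn.setAutoCommit(false); // one transaction for the whole batch
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int[] row : rows) {
                ps.setInt(1, row[0]);
                ps.setInt(2, row[1]);
                ps.addBatch();       // queue the insert locally
            }
            ps.executeBatch();       // send all queued inserts together
            conn.commit();
        } catch (Exception e) {
            conn.rollback();
            throw e;
        }
    }
}

With MySQL Connector/J, adding rewriteBatchedStatements=true to the JDBC URL additionally lets the driver rewrite the batch into multi-row INSERT statements, which usually helps further.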
To create a connection pool, look at HikariCP, which is a JDBC connection pool implementation. It allows you to specify the min/max concurrent connections along with other settings. Your worker threads can then request connections from the pool and perform the inserts.
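If you do go multi-threaded, the shape might look roughly like this; the pool, the thread count of 10 (matching your suggested limit), and the table are assumptions to tune:

import com.zaxxer.hikari.HikariDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelInsert {
    static void insertAll(HikariDataSource pool, List<int[]> rows) throws InterruptedException {
        // Cap worker threads at the pool's maximum so no task blocks
        // for long waiting on a free connection.
        ExecutorService workers = Executors.newFixedThreadPool(10);
        for (int[] row : rows) {
            workers.submit(() -> {
                try (Connection conn = pool.getConnection();
                     PreparedStatement ps = conn.prepareStatement(
                             "INSERT INTO measurements (sensor_id, value) VALUES (?, ?)")) {
                    ps.setInt(1, row[0]);
                    ps.setInt(2, row[1]);
                    ps.executeUpdate();
                } catch (Exception e) {
                    e.printStackTrace(); // real code should collect and report failures
                }
            });
        }
        workers.shutdown();
        workers.awaitTermination(1, TimeUnit.MINUTES);
    }
}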
Inserting multiple records concurrently could have issues at the MySQL level if it acquires a table lock for each insert. In that case you would not get a speed improvement with multiple threads and might need some tuning at the database level to work around it. Here's a good article that should help: High rate insertion with MySQL
I am in the process of building a client-server application and I would really like some advice on how to design the server-database connection part.
Let's say the basic idea is the following:
Client authenticates itself with the server.
Client sends a request to server.
Server stores client's request to the local database.
In terms of Java Objects we have
Client Object
Server Object
Database Object
So when a client connects to the server, a session is created between them through which all the data is exchanged. Now what bothers me is whether I should create a database object/connection for each client session or whether I should create one database object that will handle all requests.
Thus the two options are:
Create one database object that handles all client requests
For each client-server session create a database object that is used exclusively for the client.
Going with option 1, I guess that all methods would have to become synchronized in order to avoid one client thread overwriting the variables of another. However, synchronizing will be time-consuming when there are many concurrent requests, as each request will be queued until the running one completes.
Going with option 2 seems more appropriate, but creating a database object for every client-server session consumes memory, and creating a database connection for each client could again become a problem when the number of concurrently connected users is large.
These are just my thoughts, so please add any comments that may help with the decision.
Thank you
Option 3: use a connection pool. Every time you want to connect to the database, you get a connection from the pool. When you're done with it, you close the connection to give it back to the pool.
That way, you can
have several clients accessing the database concurrently (your option 1 doesn't allow that)
have a reasonable number of connections open and avoid bringing the database to its knees or running out of available connections (your option 2 doesn't allow that)
avoid opening new database connections all the time (your option 2 doesn't allow that). Opening a connection is a costly operation.
Basically all server apps use this strategy. All Java EE servers come with a connection pool. You can also use one in Java SE applications by using a pool library (HikariCP, Tomcat connection pool, etc.).
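The borrow-and-return cycle described above is typically written with try-with-resources; on a pooled DataSource, close() returns the connection to the pool rather than closing it. A minimal sketch, where the requests table and query are invented:

import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class RequestHandler {
    private final DataSource pool; // injected; HikariCP, Tomcat pool, etc.

    public RequestHandler(DataSource pool) {
        this.pool = pool;
    }

    // Each client request borrows a connection and returns it immediately,
    // so a handful of pooled connections can serve many client sessions.
    public String loadRequestStatus(long requestId) throws SQLException {
        try (Connection conn = pool.getConnection();
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT status FROM requests WHERE id = ?")) {
            ps.setLong(1, requestId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("status") : null;
            }
        }
    }
}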
I would suggest a third option: database connection pooling. This way you create a specified number of connections and hand out the first free connection as soon as one becomes available. This gives you the best of both worlds: there will almost always be free connections available quickly, and you keep the number of connections to the database at a reasonable level. There are plenty of out-of-the-box Java connection pooling solutions, so have a look online.
Just use connection pooling and go with option 2. There are quite a few - C3P0, BoneCP, DBCP. I prefer BoneCP.
Neither option is a good solution.
Problem with Option 1:
You already stated the problems with synchronization when there are multiple threads. Apart from that, there are many other problems, such as transaction management (when are you going to commit the connection?) and security (all clients can see each other's uncommitted values), just to name a few.
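To make the transaction problem concrete, here is a deterministic replay of an interleaving that can occur when two request threads share one connection; the accounts and audit_log tables are invented:

import java.sql.Connection;
import java.sql.Statement;

public class SharedConnectionHazard {
    // JDBC transactions belong to the connection, not to the thread that
    // issued the statements, so two threads on one connection share one
    // transaction whether they want to or not.
    static void demo(Connection shared) throws Exception {
        shared.setAutoCommit(false);
        try (Statement st = shared.createStatement()) {
            // "Thread A" starts a unit of work...
            st.executeUpdate("UPDATE accounts SET balance = balance - 100 WHERE id = 1");

            // ..."Thread B" finishes an unrelated request and commits.
            st.executeUpdate("INSERT INTO audit_log (msg) VALUES ('request done')");
            shared.commit(); // also commits A's half-finished update!

            // "Thread A" hits an error and tries to roll back -- but its
            // update was already committed by B, so this is a no-op.
            shared.rollback();
        }
    }
}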
Problem with Option 2:
Two of the biggest problems with this are:
Creating a new connection every time takes a lot of time, so performance will become an issue.
Database connections are expensive resources and should be kept to a limited number. If you create a DB connection for every client, you will soon run out of them, even though most of the connections are not actively used, and you will see your application's performance drop.
The Connection Pooling Option
That is why almost all client-server applications go with the connection pooling solution. You have a set of connections in the pool which are obtained and released appropriately. Almost all Java frameworks have sophisticated connection pooling solutions.
If you are not using any JDBC framework (most use Spring JDBC or Hibernate), read the following article:
http://docs.oracle.com/javase/jndi/tutorial/ldap/connect/pool.html
If you are using any of the popular Java Frameworks like Spring, I would suggest you use Connection Pooling provided by the framework.
I have a Java app with more than 100 servers. Currently each server opens connections to 7 database schemas in a relational database (logging, this, that, the other). All schemas connect to the same DB cluster, but it's all effectively one database instance.
The server-managed connection pools open a handful of connections (1-5) on each database schema per instance, then double that on a redundant pool. So each server opens a minimum of 30 database connections and can grow to several hundred, and again, there are more than 100 servers.
All in all, the minimum number of database connections used is 3,000, and this can grow to ludicrous numbers.
Obviously this is just plain wrong. The database cluster can only efficiently handle X concurrent requests, and any number of requests > X introduces unnecessary contention and slows the whole lot down. (X is unknown, but it is way smaller than the 3,000 minimum concurrent connections.)
I want to lower the total connections used by implementing the following strategy:
Connect to one schema only (Application-X), have 6 connections per pool maximum.
Write a layer above the pool that switches to the schema I want. The getConnection(forSchema) function will take a parameter for the target schema (e.g. logging), get a connection that could last have been pointing at any schema, and issue a schema-switch SQL statement (set search_path to 'target_schema').
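As a rough sketch of that layer, assuming PostgreSQL (given set search_path) and a standard javax.sql.DataSource underneath; the class name and schema whitelist are invented:

import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Set;

public class SchemaRoutingPool {
    private final DataSource pool; // the single Application-X pool underneath

    // Whitelist the known schemas so the SET statement can't be abused.
    private static final Set<String> SCHEMAS =
            Set.of("logging", "schema_a", "schema_b"); // hypothetical names

    public SchemaRoutingPool(DataSource pool) {
        this.pool = pool;
    }

    public Connection getConnection(String forSchema) throws SQLException {
        if (!SCHEMAS.contains(forSchema)) {
            throw new IllegalArgumentException("unknown schema: " + forSchema);
        }
        Connection conn = pool.getConnection();
        // The connection may last have pointed at any schema; repoint it.
        try (Statement st = conn.createStatement()) {
            st.execute("SET search_path TO " + forSchema);
        } catch (SQLException e) {
            conn.close(); // return to the pool before propagating
            throw e;
        }
        return conn;
    }
}

Skipping the SET round trip when a borrowed connection already points at the target schema would require tracking per-connection state, e.g. a map keyed by the underlying connection.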
Please do not comment on whether this approach is right or wrong. Because 'it depends' needs to be considered, such comments will not add value.
My question is whether there is a DB pool implementation out there that already does this - allows me to have one set of connections and automatically places me at the right schema, or better yet - tracks whether a pooled connection is available for your target schema before making a decision to go ahead and switch the schema (saves a DB round trip).
I would also like to hear from anyone else who has a similar issue (real-world experience) if you solved it in a different way.
Having learned the hard way myself, the best way to stabilize the number of database connections between a web application and a database is to put a reverse proxy in front of the web application.
Here's why it works:
A slow part of a web request can be returning the data to the client. If there's a lot of data or the user is on a slow connection, the connection can remain open to the web server while the data dribbles out to the client slowly. Meanwhile, the application server continues to hold a database connection open to the backend server. While the database connection may only be needed for a fraction of the transaction, it's tied up until the client disconnects from the application server.
Now, consider what happens when a reverse proxy is added in front of the app server. When the app server has a response prepared, it can quickly reply to the reverse proxy in front of it and free up the database connection behind it. The reverse proxy then handles slowly dribbling out responses to users, without keeping a related database connection tied up.
Before I made this architectural change, there were a number of traffic spikes that resulted in death spirals: the database handle usage would spike to exhaustion, and things would go downhill from there.
After the change, the number of database handles required was both far lower and far more stable.
If a reverse proxy is not already part of your architecture, I recommend it as a first step to control the number of database connections you require.
I have a requirement to write a Java application (web based) that will connect to Oracle 11g databases (currently it connects to 10-12 different Oracle databases) and read some data from them (all SELECT queries).
While iterating over this ArrayList, I connect to each database, fire the SELECT query (the same query for all databases), get the records, put them in one global collection list, close the connection, and continue the same process through the loop.
Currently I am using Executors to connect to the multiple databases:
ExecutorService executor = Executors.newFixedThreadPool(20);
One thing surprises me: for the first database connection, the log appears immediately, but for the subsequent database connections the logs are printed after 60 seconds. Why is it taking 60 seconds or more for the other connections?
Logically, each connection should take about the same time as the first one.
Please suggest performance improvement for this application.
Opening a database connection is an expensive operation; if possible you should connect to each database once and reuse the connection for all queries made to that database (also known as connection pooling). It's not clear if this is what you're already doing.
Another thing that seems strange is that you have made the openConnection method synchronized. I see no need for doing that, and depending on how your code is structured it may mean that only one thread can be opening a connection at a given time. Since connecting takes a long time you'd want each thread to make its connection in parallel.
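A sketch of that shape: one small pool per database, created once at startup, and parallel, unsynchronized queries collected into one global list. HikariCP, the pool size, and the single-column result handling are assumptions; any pool library would do:

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class MultiDbFetcher {
    // Build one pool per database URL once, at startup -- not per query.
    static List<HikariDataSource> pools(List<String> jdbcUrls) {
        List<HikariDataSource> pools = new ArrayList<>();
        for (String url : jdbcUrls) {
            HikariConfig cfg = new HikariConfig();
            cfg.setJdbcUrl(url);
            cfg.setMaximumPoolSize(2); // a couple per database is plenty here
            pools.add(new HikariDataSource(cfg));
        }
        return pools;
    }

    // Run the same SELECT against every database in parallel; there is no
    // synchronized method, so connections are borrowed concurrently.
    static List<String> fetchAll(List<HikariDataSource> pools, String sql) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(pools.size());
        List<Future<List<String>>> futures = new ArrayList<>();
        for (HikariDataSource pool : pools) {
            futures.add(executor.submit(() -> {
                List<String> rows = new ArrayList<>();
                try (Connection c = pool.getConnection();
                     PreparedStatement ps = c.prepareStatement(sql);
                     ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        rows.add(rs.getString(1)); // first column only, for brevity
                    }
                }
                return rows;
            }));
        }
        List<String> all = new ArrayList<>(); // the "global collection list"
        for (Future<List<String>> f : futures) {
            all.addAll(f.get());
        }
        executor.shutdown();
        return all;
    }
}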