Apache Tomcat JDBC Connection Pool bad performance on batch / bulk inserts - java

I have recently incorporated the Apache Tomcat JDBC Connection Pool into my application (which uses a MySQL DB). I tried Apache DBCP before but didn't like its results, and the Tomcat implementation seemed to fit my needs, even though I run a standalone Java application and don't use Tomcat at all.
Recently, I encountered a huge performance problem when executing batch (aka bulk) insert queries.
I have a flow in which I insert ~2500 records into a table in a batched fashion. It takes forever when using the JDBC connection pool, compared to a few seconds when reverting to opening a connection for each query (no pooling).
I wrote a small application that inserts 30 rows into the same table. It takes 12 seconds when pooling, and ~800 ms when not pooling.
Prior to using the connection pool, I used com.mysql.jdbc.jdbc2.optional.MysqlDataSource as my DataSource. The connection was configured with the following line:
dataSource.setRewriteBatchedStatements(true);
I'm quite sure that this is the core difference between the two approaches, but couldn't find an equivalent parameter in jdbc-pool.

The MySQL JDBC driver does not support true batch operations; rewriteBatchedStatements is the best you can get. Here is the code from the MySQL driver's PreparedStatement.java:
try {
    statementBegins();
    clearWarnings();

    if (!this.batchHasPlainStatements && this.connection.getRewriteBatchedStatements()) {
        if (canRewriteAsMultiValueInsertAtSqlLevel()) {
            return executeBatchedInserts(batchTimeout);
        }

        if (this.connection.versionMeetsMinimum(4, 1, 0) && !this.batchHasPlainStatements && this.batchedArgs != null
                && this.batchedArgs.size() > 3 /* cost of option setting rt-wise */) {
            return executePreparedBatchAsMultiStatement(batchTimeout);
        }
    }

    return executeBatchSerially(batchTimeout);
} finally {
    this.statementExecuting.set(false);
    clearBatch();
}
It is one of the reasons why I do not like MySQL and prefer Postgres.
EDIT:
You should combine the connection pool, batch operations, and the rewriteBatchedStatements option. You can set rewriteBatchedStatements through a JDBC URL parameter: jdbc:mysql://localhost:3307/mydb?rewriteBatchedStatements=true
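For illustration, here is a minimal sketch of wiring the Tomcat JDBC pool to a MySQL URL that carries rewriteBatchedStatements=true and running a batched insert through it. The host, schema, table, and column names are placeholders, not taken from the question:

import java.sql.Connection;
import java.sql.PreparedStatement;
import org.apache.tomcat.jdbc.pool.DataSource;
import org.apache.tomcat.jdbc.pool.PoolProperties;

public class BatchInsertExample {
    public static void main(String[] args) throws Exception {
        PoolProperties p = new PoolProperties();
        // the option is passed straight through to the MySQL driver via the URL
        p.setUrl("jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true");
        p.setDriverClassName("com.mysql.jdbc.Driver");
        p.setUsername("user");
        p.setPassword("pass");
        DataSource ds = new DataSource(p);

        try (Connection conn = ds.getConnection();
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO my_table (col1, col2) VALUES (?, ?)")) {
            conn.setAutoCommit(false);
            for (int i = 0; i < 2500; i++) {
                ps.setInt(1, i);
                ps.setString(2, "value-" + i);
                ps.addBatch();
            }
            ps.executeBatch();   // with the URL flag set, the driver rewrites this into multi-value INSERTs
            conn.commit();
        }
    }
}

Without the flag in the URL, the same executeBatch() call falls through to executeBatchSerially() in the driver code quoted above, which is the slow path described in the question.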

Related

HikariCP failing to initialize pool: 'FATAL: sorry, too many clients already' [duplicate]

I am trying to connect to a PostgreSQL database and I am getting the following error:
Error:org.postgresql.util.PSQLException: FATAL: sorry, too many clients already
What does the error mean and how do I fix it?
My server.properties file is the following:
serverPortData=9042
serverPortCommand=9078
trackConnectionURL=jdbc:postgresql://127.0.0.1:5432/vTrack?user=postgres password=postgres
dst=1
DatabaseName=vTrack
ServerName=127.0.0.1
User=postgres
Password=admin
MaxConnections=90
InitialConnections=80
PoolSize=100
MaxPoolSize=100
KeepAliveTime=100
TrackPoolSize=120
TrackMaxPoolSize=120
TrackKeepAliveTime=100
PortNumber=5432
Logging=1
An explanation of the following error:
org.postgresql.util.PSQLException: FATAL: sorry, too many clients already.
Summary:
You opened up more than the allowed limit of connections to the database. You ran something like Connection conn = myconn.Open(); inside a loop and forgot to run conn.close();. Just because your class is destroyed and garbage collected does not mean the connection to the database is released. The quickest fix is to make sure that whatever class creates a connection contains the following code:
protected void finalize() throws Throwable {
    try {
        your_connection.close();
    }
    catch (SQLException e) {
        e.printStackTrace();
    }
    super.finalize();
}
Place that code in any class where you create a Connection. Then when your class is garbage collected, your connection will be released.
Run this SQL to see the PostgreSQL maximum allowed connections:
show max_connections;
The default is 100. PostgreSQL on good hardware can support a few hundred connections at a time. If you want to have thousands, you should consider using connection pooling software to reduce the connection overhead.
Take a look at exactly who/what/when/where is holding open your connections:
SELECT * FROM pg_stat_activity;
The number of connections currently used is:
SELECT COUNT(*) from pg_stat_activity;
Debugging strategy
You could give different usernames/passwords to the programs that might not be releasing the connections to find out which one it is, and then look in pg_stat_activity to find out which one is not cleaning up after itself.
Do a full exception stack trace when the connection could not be created and follow the code back up to where you create a new Connection; make sure every code path that creates a connection ends with connection.close();
How to set the max_connections higher:
max_connections in the postgresql.conf sets the maximum number of concurrent connections to the database server.
First find your postgresql.conf file
If you don't know where it is, query the database with the sql: SHOW config_file;
Mine is in: /var/lib/pgsql/data/postgresql.conf
Login as root and edit that file.
Search for the string: "max_connections".
You'll see a line that says max_connections=100.
Set that number higher, then restart the PostgreSQL database.
What's the maximum max_connections?
Use this query:
select min_val, max_val from pg_settings where name='max_connections';
I get the value 8388607, so in theory that's the most you are allowed to have, but then a runaway process can eat up thousands of connections, and surprise, your database is unresponsive until reboot. If you had a sensible max_connections like 100, the offending program would be denied a new connection and the database would stay safe.
We don't know what that server.properties file is, nor do we know what SimocoPoolSize means (do you?).
Let's guess you are using some custom pool of database connections. Then, I guess the problem is that your pool is configured to open 100 or 120 connections, but your PostgreSQL server is configured to accept MaxConnections=90. These settings seem to conflict. Try increasing MaxConnections to 120.
But you should first understand your DB layer infrastructure: know which pool you are using, whether you really need so many open connections in the pool, and, especially, whether you are gracefully returning the opened connections to the pool.
No need to increase MaxConnections and InitialConnections. Just close your connections after doing your work. For example, if you are creating a connection:
try {
    connection = DriverManager.getConnection(
            "jdbc:postgresql://127.0.0.1/" + dbname, user, pass);
} catch (SQLException e) {
    e.printStackTrace();
    return;
}
After doing your work, close the connection:
try {
    connection.commit();
    connection.close();
} catch (SQLException e) {
    e.printStackTrace();
}
The offending lines are the following:
MaxConnections=90
InitialConnections=80
You can increase the values to allow more connections.
You need to close all your connections. For example:
If you make an INSERT INTO statement, you need to close the statement and your connection in this way:
statement.close();
connection.close();
And if you make a SELECT statement, you need to close the statement, the connection, and the result set in this way:
resultset.close();
statement.close();
connection.close();
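As a hedged alternative (assuming Java 7 or later and the standard java.sql imports), try-with-resources closes the ResultSet, Statement, and Connection automatically even if an exception is thrown; the URL and query below are placeholders:

String url = "jdbc:postgresql://127.0.0.1:5432/vTrack";
try (Connection conn = DriverManager.getConnection(url, "postgres", "postgres");
     PreparedStatement ps = conn.prepareStatement("SELECT * FROM some_table WHERE id = ?")) {
    ps.setInt(1, 42);
    try (ResultSet rs = ps.executeQuery()) {
        while (rs.next()) {
            // read the columns here
        }
    }
} catch (SQLException e) {
    e.printStackTrace();
}
// all three resources are closed at this point, so the server-side connection slot is freed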
I did this and it worked
I had Postgres and other apps up in Docker. I was facing this problem when more than ten apps connected to the Postgres database. The solution was to increase the Postgres max_connections count, which is 100 by default. To increase this value, either find max_connections in the /var/lib/pgsql/data/postgresql.conf file and edit it, or add the command shown in the docker-compose.yml below and run it.
version: '3'
services:
  taxi-postgresql:
    container_name: my-postgresql
    image: postgres:13.3
    volumes:
      - ./postgres-volume:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=taxi-postgresql
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=user
      - POSTGRES_HOST_AUTH_METHOD=trust
    command: postgres -c 'max_connections=1000'
    ports:
      - 127.0.0.1:5432:5432
The same error appears in our microservices deployment, and it was solved by increasing the value below in the PostgreSQL container:
num_init_children

SQL execution time much slower in a Tomcat Servlet than in a normal Java program

For inexplicable reasons however, this morning the performance increased for two of my Queries that used to be slow. I have no idea why.
I have no authority over the server, maybe someone changed something.
The problem is no more.
In a nutshell:
s.executeQuery(sql) runs extremely slowly within a tomcat servlet on server
Same query runs fine without servlet (simple java program) on the same machine
Not all queries are slow within the servlet. Only a few of the bigger ones are
Same servlet runs fast on another machine
UPDATES
Please read the updates below the text!
I have a servlet that executes SQL requests and sends back the results via JSON. For some reason, some requests take a huge amount of time to execute, but when I run them in any Oracle SQL Client, they are executed in no time.
I am talking about a difference of 1 second vs 5 minutes for the same SQL (that is not that complex).
How can this be explained?
Is there a way to improve the performance of a java based SQL request ?
I am using the traditional way of executing queries:
java.sql.Connection conn = null;
java.sql.Statement s = null;
ResultSet rs = null;

String dbDriver = "oracle.jdbc.driver.OracleDriver";
String dbConnectionString = "jdbc:oracle:thin:@" + dbHost + ":" + dbPort + ":" + dbSid;

Class.forName(dbDriver).newInstance();
conn = DriverManager.getConnection(dbConnectionString, dbUser, dbPass);

s = conn.createStatement();
s.setQueryTimeout(9999);
rs = s.executeQuery(newStatement);
ResultSetMetaData rsmd = rs.getMetaData();

// Get the results
while (rs.next()) {
    // collect the results
}

// close connections
I tried with ojdbc14 and ojdbc6 but there was no difference.
UPDATE 1:
I tried the same SQL in a local Java project (not a servlet) on my client machine, and I get the results immediately. So I assume the problem is coming from my servlet or the Tomcat configuration?
UPDATE 2:
The culprit is indeed rs = s.executeQuery(mySql). I tried using a PreparedStatement instead, but there is no difference.
UPDATE 3:
I created a new servlet running on a local Tomcat and the query comes back fast. The problem therefore comes from my production server or Tomcat config. Any ideas what config items could affect this?
UPDATE 4:
I tried the same code in a normal Java program instead of a servlet (still on the same server) and the results come back fast. Ergo the problem comes from the servlet itself (or Tomcat?). Still don't know what to do, but I narrowed it down :)
UPDATE 5:
Jstack shows the following (It starts where my servlet is, I cut the rest)
"http-8080-3" daemon prio=3 tid=0x00eabc00 nid=0x2e runnable [0xaa9ee000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at oracle.net.ns.Packet.receive(Packet.java:311)
at oracle.net.ns.DataPacket.receive(DataPacket.java:105)
at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:305)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:249)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:171)
at oracle.net.ns.NetInputStream.read(NetInputStream.java:89)
at oracle.jdbc.driver.T4CSocketInputStreamWrapper.readNextPacket(T4CSocketInputStreamWrapper.java:123)
at oracle.jdbc.driver.T4CSocketInputStreamWrapper.read(T4CSocketInputStreamWrapper.java:79)
at oracle.jdbc.driver.T4CMAREngineStream.unmarshalUB1(T4CMAREngineStream.java:429)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:397)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:257)
at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:587)
at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:210)
at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:30)
at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:762)
at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:925)
at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1104)
at oracle.jdbc.driver.OracleStatement.executeQuery(OracleStatement.java:1309)
- locked <0xe7198808> (a oracle.jdbc.driver.T4CConnection)
at oracle.jdbc.driver.OracleStatementWrapper.executeQuery(OracleStatementWrapper.java:422)
So I am stuck at java.net.SocketInputStream.socketRead0(Native Method)?
In some cases (not sure if this applies to yours) setting fetchSize on the Statement object yields great performance improvements. It depends on the size of the ResultSet that is being fetched.
Try playing with it by setting it to something bigger than the default of 10 for Oracle (see this link).
See Statement.setFetchSize.
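A minimal sketch of what that looks like, reusing the Statement from the question's code (the value 500 is just an illustrative starting point, not a recommendation from the answer):

s = conn.createStatement();
s.setFetchSize(500);                 // Oracle's default is 10 rows per network round trip
rs = s.executeQuery(newStatement);
while (rs.next()) {
    // each rs.next() now needs far fewer round trips to the server
}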
Given your symptoms, I believe that your issue is not with your SQL client code and you are in fact looking at issues with your server. The stack shows that your client is waiting for a response. This tallies with the fact that you can run the client without any problem in a separate process.
So what you probably need to look at is systemic reasons why the SQL server is running slowly and how that may be tied to Tomcat. My experience in cases like this is that it's usually the disk, so I'd be inclined to check whether you are paging due to a lack of RAM when Tomcat is loaded, or suffering from much higher disk ops due to a reduced disk cache. Assuming you are running on a UNIX variant, I'd have a look at vmstat and iostat for a working case and a broken case to eliminate such issues.
For inexplicable reasons however, this morning the performance increased and my problem is no more. I have no idea why. I have no authority over the server, maybe someone changed something.
Since your thread is waiting on a socket read, which means it is waiting for a response from the database server, I would:
Check database performance; make sure neither the instance nor the query is getting impacted at some point in time during the day.
Check your network latencies between the Java and DB servers. Same as above. Probably traceroute?
Since you have not posted the query, I can give you a scenario where this is possible. If you use a function in your query, like to_char etc., then your table indexes won't be used when executing the query via JDBC, but it will work fine when you run it in a console. I don't know exactly why, but there's something about it with the JDBC driver. I had the exact same issue in DB2 and I resolved it by removing the use of functions.
The other scenario could be that a huge number of records is being fetched and proper batching is not implemented.
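To illustrate the function scenario above with a hedged sketch (table, column, and date values are hypothetical): wrapping an indexed column in a function tends to defeat the index, while binding values of the column's native type against the raw column lets it be used:

// likely slow: an index on created_date cannot be used once the column is wrapped in TO_CHAR
PreparedStatement slow = conn.prepareStatement(
        "SELECT * FROM orders WHERE TO_CHAR(created_date, 'YYYY-MM-DD') = ?");
slow.setString(1, "2013-05-01");

// usually fast: compare the raw column against bind variables of the matching type
PreparedStatement fast = conn.prepareStatement(
        "SELECT * FROM orders WHERE created_date >= ? AND created_date < ?");
fast.setTimestamp(1, java.sql.Timestamp.valueOf("2013-05-01 00:00:00"));
fast.setTimestamp(2, java.sql.Timestamp.valueOf("2013-05-02 00:00:00"));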

Behaviour regarding db pool and connection.setReadOnly() method

I have a Java application with the Hibernate framework (no Spring) that connects to a MySQL DB and manages connection pooling via c3p0.
I am trying to configure my application to read from the slave DB and write to the master DB. I have followed this link to some extent: Master/Slave load balance.
Let's say the application already has a session with a connection from the pool and it needs to execute a read-only method, like this:
public someReadOnlyMethod() {
    Session session = (get session from current Thread)

    // set read-only so that it reads from the slave db
    session.connection().setReadOnly(true);

    (...connect to db to do something...)

    // set it back in case this method is followed by a write method, so that it goes to the master db
    session.connection().setReadOnly(false);
}
Does the pool create a new connection to the DB each time for the read-only and write operations (if so, this will heavily impact performance), or is it smart enough to switch between already existing read-only and writable pooled connections?
Thanks for your advice.
So this has nothing to do with the pool; it's all in the MySQL driver. c3p0 will pass your call to setReadOnly (whether true or false) to the underlying Connection, and the Connection will route to the master or the slaves accordingly.
If you don't like how your Connections default (probably by default they are not read-only), you can set the read-only property in the onAcquire method of a c3p0 ConnectionCustomizer, and the value you set (true or false) will become the default that c3p0 resets Connections to.
good luck!
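A rough sketch of such a customizer, assuming c3p0's AbstractConnectionCustomizer base class; the class name and the read-only default chosen here are illustrative:

import java.sql.Connection;
import com.mchange.v2.c3p0.AbstractConnectionCustomizer;

public class ReadOnlyDefaultCustomizer extends AbstractConnectionCustomizer {
    @Override
    public void onAcquire(Connection c, String parentDataSourceIdentityToken) throws Exception {
        // every physically new Connection starts out read-only;
        // this also becomes the value c3p0 resets Connections to
        c.setReadOnly(true);
    }
}

// registration (programmatic; it can also go in c3p0.properties / c3p0-config.xml):
// ComboPooledDataSource cpds = new ComboPooledDataSource();
// cpds.setConnectionCustomizerClassName("com.example.ReadOnlyDefaultCustomizer");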
tl;dr: It will re-use existing connections whenever you switch setReadOnly(true/false).
JDBC will connect to all servers listed in your connection URL when you do ReplicationDriver().connect(url). Those connections will remain open for re-use no matter how many times you switch setReadOnly().
Source: I just tested Connector/J version 5.1.38 with com.mysql.jdbc.ReplicationDriver.
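For reference, a small test along those lines with Connector/J's ReplicationDriver; the host names, credentials, and schema are placeholders:

import java.sql.Connection;
import java.util.Properties;
import com.mysql.jdbc.ReplicationDriver;

public class ReplicationReadOnlyTest {
    public static void main(String[] args) throws Exception {
        ReplicationDriver driver = new ReplicationDriver();
        Properties props = new Properties();
        props.put("user", "app");
        props.put("password", "secret");

        // the first host is treated as the master, the remaining hosts as slaves
        Connection conn = driver.connect(
                "jdbc:mysql:replication://master-host:3306,slave-host:3306/mydb", props);

        conn.setReadOnly(false);   // statements now go to the master
        // ... do a write ...

        conn.setReadOnly(true);    // statements now go to a slave, over the same physical connections
        // ... do a read ...

        conn.close();
    }
}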

JDBC opening a new database session

I just want to make sure that if I use the following, I am opening a separate DB session and not reusing the same one (for testing I need individual sessions).
Connection connection = DriverManager.getConnection(URL,USER,PASSWORD);
Each time I run the above code, I run my query, then call connection.close().
So for example:
while (some condition) {
    Connection connection = DriverManager.getConnection(URL, USER, PASSWORD);
    // now use the connection to generate a ResultSet of some query
    connection.close();
}
So, each iteration of the loop (each query) needs its own session.
Is this properly opening separate sessions as I need (and if not, what would I need to add/change)? Thanks
The javadoc says:
Attempts to establish a connection to the given database URL
Slightly woolly language, and I suspect that this is up to the JDBC driver, but I'd be surprised if this did anything other than open a new connection.
I suppose it's possible for a JDBC driver to perform connection pooling under the hood, but I'd be surprised to see that.
In the case of the Oracle JDBC driver, this will open a new connection every time. This is a relatively slow process in Oracle, you may want to consider using a connection pool (e.g. Apache Commons DBCP, or c3p0) to improve performance.
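For comparison, a hedged sketch of what the loop would look like on top of Commons DBCP; the URL, credentials, and pool size are illustrative. Note that with a pool, successive iterations typically reuse the same physical session, which is exactly what the question wants to avoid for testing, so the plain DriverManager approach remains the right choice for that purpose:

import java.sql.Connection;
import org.apache.commons.dbcp.BasicDataSource;

public class PooledLoopExample {
    public static void main(String[] args) throws Exception {
        BasicDataSource ds = new BasicDataSource();
        ds.setDriverClassName("oracle.jdbc.driver.OracleDriver");
        ds.setUrl("jdbc:oracle:thin:@dbhost:1521:ORCL");
        ds.setUsername("scott");
        ds.setPassword("tiger");
        ds.setMaxActive(10);      // commons-dbcp 1.x; dbcp2 calls this setMaxTotal

        for (int i = 0; i < 100; i++) {
            Connection connection = ds.getConnection();  // borrowed from the pool
            // run the query here
            connection.close();                          // returns the connection to the pool
        }
        ds.close();
    }
}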

Connection pooling with Java and MySQL in Tomcat web application

I recently wrote and deployed a Java web application to a server and I'm finding an unusual problem which didn't appear during development or testing.
When a user logs in after a long period of inactivity and goes to display data from the database, the page indicates that there are no records to see. But upon page refresh, the first x records are shown according to the pagination rules.
Checking the logs, I find:
ERROR|19 09 2009|09 28 54|http-8080-4|myDataSharer.database_access.Database_Metadata_DBA| - Error getting types of columns of tabular Dataset 12
com.mysql.jdbc.CommunicationsException: Communications link failure due to underlying exception:
** BEGIN NESTED EXCEPTION **
java.io.EOFException
STACKTRACE:
java.io.EOFException
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:1956)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2368)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2867)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1616)
And so on for several hundred lines.
The application is currently set for about 100 users but is not yet in full use. It uses connection pooling between the Apache Tomcat servlets / jsps and a MySQL database with the following code example forming the general arrangement of a database operation, of which there are typically several per page:
// Gets a Dataset.
public static Dataset getDataset(int DatasetNo) {
    ConnectionPool_DBA pool = ConnectionPool_DBA.getInstance();
    Connection connection = pool.getConnection();
    PreparedStatement ps = null;
    ResultSet rs = null;
    String query = ("SELECT * " +
                    "FROM Dataset " +
                    "WHERE DatasetNo = ?;");
    try {
        ps = connection.prepareStatement(query);
        ps.setInt(1, DatasetNo);
        rs = ps.executeQuery();
        if (rs.next()) {
            Dataset d = new Dataset();
            d.setDatasetNo(rs.getInt("DatasetNo"));
            d.setDatasetName(rs.getString("DatasetName"));
            ...
            return d;
        }
        else {
            return null;
        }
    }
    catch (Exception ex) {
        logger.error("Error getting Dataset " + DatasetNo + "\n", ex);
        return null;
    }
    finally {
        DatabaseUtils.closeResultSet(rs);
        DatabaseUtils.closePreparedStatement(ps);
        pool.freeConnection(connection);
    }
}
Is anyone able to advise a way of correcting this problem?
I believe it is due to MySQL leaving connection pool connections open for up to eight hours, but am not certain.
Thanks
Martin O'Shea.
Just to clarify one point made about my method of connection pooling, it isn't Oracle that I'm using in my application but a class of my own as follows:
package myDataSharer.database_access;

import java.sql.*;
import javax.sql.DataSource;
import javax.naming.InitialContext;
import org.apache.log4j.Logger;

public class ConnectionPool_DBA {
    static Logger logger = Logger.getLogger(ConnectionPool_DBA.class.getName());
    private static ConnectionPool_DBA pool = null;
    private static DataSource dataSource = null;

    public synchronized static ConnectionPool_DBA getInstance() {
        if (pool == null) {
            pool = new ConnectionPool_DBA();
        }
        return pool;
    }

    private ConnectionPool_DBA() {
        try {
            InitialContext ic = new InitialContext();
            dataSource = (DataSource) ic.lookup("java:/comp/env/jdbc/myDataSharer");
        }
        catch (Exception ex) {
            logger.error("Error getting a connection pool's datasource\n", ex);
        }
    }

    public void freeConnection(Connection c) {
        try {
            c.close();
        }
        catch (Exception ex) {
            logger.error("Error terminating a connection pool connection\n", ex);
        }
    }

    public Connection getConnection() {
        try {
            return dataSource.getConnection();
        }
        catch (Exception ex) {
            logger.error("Error getting a connection pool connection\n", ex);
            return null;
        }
    }
}
I think the mention of Oracle is due to me using a similar name.
There are a few pointers on avoiding this situation, obtained from other sources, especially from the connection pool implementations of other drivers and from other application servers. Some of the information is already available in the Tomcat documentation on JNDI Data Sources.
Establish a cleanup/reaper schedule that will close connections in the pool, if they are inactive beyond a certain period. It is not good practice to leave a connection to the database open for 8 hours (the MySQL default). On most application servers, the inactive connection timeout value is configurable and is usually less than 15 minutes (i.e. connections cannot be left in the pool for more than 15 minutes unless they are being reused time and again). In Tomcat, when using a JNDI DataSource, use the removeAbandoned and removeAbandonedTimeout settings to do the same.
When a new connection is returned from the pool to the application, ensure that it is tested first. For instance, most application servers that I know of can be configured so that connections to an Oracle database are tested with an execute of "SELECT 1 FROM dual". In Tomcat, use the validationQuery property to set the appropriate query for MySQL - I believe this is "SELECT 1" (without quotes). The reason why setting the value of the validationQuery property helps is that if the query fails to execute, the connection is dropped from the pool, and a new one is created in its place.
As far as the behavior of your application is concerned, the user is probably seeing the result of the pool returning a stale connection to the application for the first time. The second time around, the pool probably returns a different connection that can service the application's queries.
Tomcat JNDI Data Sources are based on Commons DBCP, so the configuration properties applicable to DBCP will apply to Tomcat as well.
I'd wonder why you're using ConnectionPool_DBA in your code instead of letting Tomcat handle the pooling and simply looking up the connection using JNDI.
Why are you using an Oracle connection pool with MySQL? When I do JNDI lookups and connection pooling, I prefer the Apache DBCP library. I find that it works very well.
I'd also ask if your DatabaseUtils methods throw any exceptions, because if either of the calls prior to your call to pool.freeConnection() throws one, you'll never free up that connection.
I don't like your code much because a class that performs SQL operations should have its Connection instance passed into it, and should not have the dual responsibility of acquiring and using the Connection. A persistence class can't know if it's being used in a larger transaction. Better to have a separate service layer that acquires the Connection, manages the transaction, marshals the persistence classes, and cleans up when it's complete.
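A rough sketch of that separation, using hypothetical names (DatasetDao, DatasetService) and the Dataset class from the question: the DAO receives a Connection instead of acquiring one, and the service layer owns the connection and cleanup (the two classes would live in separate files):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

class DatasetDao {
    // the DAO only uses the Connection handed to it; it never opens or closes one
    Dataset findByNo(Connection connection, int datasetNo) throws SQLException {
        String query = "SELECT * FROM Dataset WHERE DatasetNo = ?";
        try (PreparedStatement ps = connection.prepareStatement(query)) {
            ps.setInt(1, datasetNo);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return null;
                }
                Dataset d = new Dataset();
                d.setDatasetNo(rs.getInt("DatasetNo"));
                d.setDatasetName(rs.getString("DatasetName"));
                return d;
            }
        }
    }
}

class DatasetService {
    private final DataSource dataSource;   // the JNDI-looked-up pool
    private final DatasetDao dao = new DatasetDao();

    DatasetService(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    Dataset getDataset(int datasetNo) throws SQLException {
        // the service acquires the Connection, could manage a transaction spanning
        // several DAO calls, and guarantees the Connection goes back to the pool
        try (Connection connection = dataSource.getConnection()) {
            return dao.findByNo(connection, datasetNo);
        }
    }
}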
UPDATE:
Google turned up the Oracle class with the same name as yours. Now I really don't like your code, because you wrote something of your own when a better alternative was easily available. I'd ditch yours right away and redo this using DBCP and JNDI.
This error indicates that the server closed the connection unexpectedly. This can occur in the following two cases:
MySQL closes an idle connection after a certain time (the default is 8 hours). When this occurs, no thread is responsible for closing the connection, so it goes stale. This is most likely the cause if the error only happens after a long idle period.
If you don't completely read all the responses, the connection may get returned to the pool in a busy state. The next time a command is sent to MySQL, it closes the connection because of the wrong state. If the error occurs quite frequently, this is probably the cause.
Meanwhile, setting up an eviction thread will help to alleviate the problem. Add something like this to the DataSource:
...
removeAbandoned="true"
removeAbandonedTimeout="120"
logAbandoned="true"
testOnBorrow="false"
testOnReturn="false"
timeBetweenEvictionRunsMillis="60000"
numTestsPerEvictionRun="5"
minEvictableIdleTimeMillis="30000"
testWhileIdle="true"
validationQuery="select now()"
Is there a router between the web server and the database that transparently closes idle TCP/IP connections?
If so, you must have your connection pool either discard unused-for-more-than-XX-minutes connections from the pool, or do some kind of ping every YY minutes on the connection to keep it active.
On the off chance you haven't found your answer: I've been dealing with this for the last day. I am essentially doing the same thing you are, except that I'm basing my pooling on apache.commons.pool. I saw the same exact EOF error you are seeing. Check your mysqld error log file, which is most likely in your data directory, and look for mysqld crashing. mysqld_safe will restart mysqld quickly if it crashes, so it won't be apparent that this is the case unless you look in its log file. /var/log is no help for this scenario.
Connections that were created before the crash will EOF after the crash.
