Connection Pool Size vs. Number of Concurrent Requests - Java

I have to develop a highly scalable web service, but the connection pool size (Oracle DB) is set to 50.
Does having this size mean that at most 50 concurrent requests will be served, since otherwise no new connections will be available?
But is it possible, by configuration, for the WebLogic or GlassFish server to accept more than 50 requests simultaneously?
I read that the server accepts requests, which are queued until a thread is free to handle them.
Regarding scalability I have a question mark as well, because the average DB call takes 1.2 sec; add the SOAP overhead and you end up with a 2-3 sec response time on each call.
Can I estimate how many concurrent users the server will support (WebLogic or GlassFish, 4 GB)?
Thank you

Having a maximum of 50 connections in the pool doesn't mean you can only handle 50 users at any one time. Each page request should generate queries that can interleave with each other: so while you can only have 50 queries running at any one time, you should be able to handle many more page requests. This can be helped by making sure you only connect to the database for short periods.
The use of connection pools is primarily to avoid the cost of setting up new connections all the time (plus prepared statements are cached etc.), so the intention is to re-use them as frequently as possible.
When you say the average DB call takes 1.2 secs: if this is a single query, I think you want to look at the query or table indexes to reduce this time (otherwise I'm afraid you are going to get scalability problems no matter what); but if it is multiple queries, then they should interleave with other requests quite happily.
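As a rough back-of-the-envelope estimate (a sketch only, assuming all 50 pooled connections stay busy and each call holds a connection for the full 1.2 sec): the database tier tops out around 50 / 1.2 ≈ 41 requests per second, and by Little's Law, at a 2.5 sec end-to-end response time that corresponds to roughly 41 × 2.5 ≈ 104 requests in flight. So on the order of 100 truly concurrent requests is the ceiling imposed by the pool; treat it as an upper bound to verify with a load test.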
As regards queuing: WebLogic will queue queries, but you can set a timeout so the query is returned unfulfilled after a set time. You can then decide to try again, or tell the user the system is busy and to perhaps try again later.
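To make "connect to the database for short periods" concrete, here is a minimal sketch (the DataSource, class, and query names are assumptions, not from the question) that borrows a pooled connection only for the duration of one query and returns it automatically:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    public class BookDao {
        private final DataSource pool; // e.g. the WebLogic/GlassFish JNDI data source

        public BookDao(DataSource pool) {
            this.pool = pool;
        }

        public int countBooks() throws SQLException {
            // Borrow a connection, use it briefly, and let try-with-resources
            // return it to the pool even if the query throws.
            try (Connection con = pool.getConnection();
                 PreparedStatement ps = con.prepareStatement("SELECT COUNT(*) FROM books");
                 ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getInt(1);
            }
        }
    }

The shorter the borrow, the more requests can interleave over the same 50 connections.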

When you are talking about a web service, you need to keep an optimum balance between your connection pool size and concurrent requests. For the concept, see: https://dzone.com/articles/optimum-database-connection-pool-size

Related

Postgres RDS database DB connections increasing infinitely on Saturdays causing "JDBCConnectionException" in Spring Boot Java API app

UPDATE: Added Read/Write Throughput, IOPS, and Queue-Depth graph metrics and marked the graphs at the time position where the errors I describe started.
NOTE: I am just looking for suggestions about what could possibly be causing this issue, from experienced DBAs or database developers (or anyone who has relevant knowledge). Some of the logs/data I have are sensitive, so I cannot repost them here, but I did my best to provide screenshots and data from my debugging so that people can help me. Thank you.
Hello, I have a Postgres RDS database (engine version 12.7) hosted on Amazon (AWS). This database is "hit", or called, by an API client (a Spring Boot/Web/Hibernate/JPA Java API) thousands of times per hour. It executes only one Hibernate SQL query on the backend, against a Postgres view across 5 tables. The DB instance (class = db.m5.2xlarge) specs are:
8 vCPU
32 GB RAM
Provisioned IOPS SSD Storage Type
800 GiB Storage
15000 Provisioned IOPS
The issue I am seeing is that on Saturdays I wake up to many logs of JDBCConnectionExceptions, and I notice that my API Docker containers (defined as a Service/Task on ECS), which are hosted on AWS Elastic Container Service (ECS), start failing and returning HTTP 503 errors, e.g.
org.springframework.dao.DataAccessResourceFailureException: Unable to acquire JDBC Connection; nested exception is org.hibernate.exception.JDBCConnectionException: Unable to acquire JDBC Connection
Upon checking the AWS RDS DB status, I can also see the sessions/connections increase dramatically, as seen in the image below, with ~600 connections. It keeps increasing, seemingly without stopping.
Upon checking the Postgres pg_locks and pg_stat_activity views when I started getting all these JDBCConnectionExceptions and the DB connections jumped to around ~400 (at this specific time), I did indeed see many of my API queries logged with interesting statuses. I exported the data to CSV and have included an excerpt below:
wait_event_type | wait_event       | state  | times in pg_stat_activity | query
----------------|------------------|--------|---------------------------|------------------------------------------------------
IO              | DataFileRead     | active | 480                       | SELECT * ... FROM ... (API query on the Postgres view)
IO              | DataFileRead     | idle   | 13                        | SELECT * ... FROM ... (API query on the Postgres view)
IO              | DataFilePrefetch | active | 57                        | SELECT * ... FROM ... (API query on the Postgres view)
IO              | DataFilePrefetch | idle   | 2                         | SELECT * ... FROM ... (API query on the Postgres view)
Client          | ClientRead       | idle   | 196                       | SELECT * ... FROM ... (API query on the Postgres view)
Client          | ClientRead       | active | 10                        | SELECT * ... FROM ... (API query on the Postgres view)
LWLock          | BufferIO         | idle   | 1                         | SELECT * ... FROM ... (API query on the Postgres view)
LWLock          | BufferIO         | active | 7                         | SELECT * ... FROM ... (API query on the Postgres view)
If I look at my pg_stat_activity table when my API and DB are running and stable, the majority of the rows from the API query are simply in the Client / ClientRead / idle status, so I feel something is wrong here.
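For reference, a summary like the excerpt above can be produced by grouping pg_stat_activity; a hedged JDBC sketch (the endpoint and credentials are placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class PgActivitySummary {
        public static void main(String[] args) throws Exception {
            String sql = "SELECT wait_event_type, wait_event, state, count(*) AS sessions "
                       + "FROM pg_stat_activity "
                       + "GROUP BY wait_event_type, wait_event, state "
                       + "ORDER BY sessions DESC";
            try (Connection con = DriverManager.getConnection(
                     "jdbc:postgresql://<rds-endpoint>:5432/mydb", "user", "pass");
                 Statement st = con.createStatement();
                 ResultSet rs = st.executeQuery(sql)) {
                while (rs.next()) {
                    System.out.printf("%-10s %-18s %-8s %d%n",
                        rs.getString("wait_event_type"), rs.getString("wait_event"),
                        rs.getString("state"), rs.getLong("sessions"));
                }
            }
        }
    }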
You can see the "performance metrics" below for the DB at the time this happened (i.e. roughly 19:55 UTC, or 2:55 PM CST): the DataFileRead and DataFilePrefetch counts are astronomically high and keep increasing, which backs up the pg_stat_activity data I posted above. Also, as I stated above, during normal, stable DB use, the API queries simply sit in the Client / ClientRead / idle status in pg_stat_activity, so the numerous DataFileRead/DataFilePrefetch IO waits and ExclusiveLocks confuse me.
I don't expect anyone to debug this for me, though I would appreciate it if a DBA or someone who has experienced something similar could help narrow down the issue. I honestly wasn't sure if it was an API query taking too long (which wouldn't make sense, because the API has been running stably for years), something running on the Postgres DB without my knowledge on Saturdays (I really suspect something like this is going on), or a bad PostgreSQL query coming into the DB that locks up the resources and causes a deadlock (which doesn't completely make sense to me, as I read that Postgres resolves deadlocks on its own). Also, as I stated before, all the API calls that make an SQL query on the backend are just doing SELECT ... FROM ... on a Postgres view, and from what I understand you can do concurrent SELECTs with ExclusiveLocks, so...
I would welcome any advice or suggestions for possible causes of this issue.
Read Throughput (the first JDBCConnectionException occurred around 2:58 PM CST, or 14:58, so I marked the graph where read throughput starts to drop, since the DB queries are timing out and the API containers are failing)
Write Throughput (the API only reads, so I'm assuming the spikes here are writes to the replica RDS to keep it in sync)
Total IOPS (IOPS gradually increase from the morning, i.e. 8 AM, but that is expected as API calls were increasing; the total counts of API calls match other days when there are zero issues, so this doesn't really point to the cause)
Queue Depth (you can see where I marked the graph; it spikes exactly around 14:58, or 2:58 PM, when the first JDBCConnectionExceptions start occurring, API queries start timing out, and DB connections start to increase exponentially)
EBS IO Balance (the burst balance basically dropped to 0 at this time as well)
Performance Insights (DataFileRead, DataFilePrefetch, buffer_io, etc.)
This just looks like your app server is getting more and more demanding and the database can't keep up. Most of the rest of your observations are just a natural consequence of that. Why it is happening is probably best investigated from the app server, not from the database server: either it is making more and more requests, or each one takes more IO to fulfill. (You could maybe fix this on the database side by making it more efficient, for instance by adding a missing index, but that would require you to share the query and/or its execution plan.)
It looks like your app server is configured to maintain 200 connections at all times, even if almost all of them are idle. So, that is what it does.
And that is what the ClientRead wait_event is: the connection is just sitting there idle, trying to read the next request from the client but not getting one. There are probably a handful of other connections which are actively receiving and processing requests, doing all the real work but occupying a small fraction of pg_stat_activity. All those extra idle connections aren't doing any good. But they probably aren't doing any real harm either, other than making pg_stat_activity look untidy and confusing you.
But once the app server starts generating requests faster than they can be serviced, the in-flight requests start piling up, and the app server is configured to keep adding more and more connections. You can't bully the disk drives into delivering more throughput just by opening more connections (at least not once you have passed the threshold where they are fully saturated). So the more active connections you have, the more they have to divide the same amount of IO between them, and the slower each one gets. Having these 700 extra connections all waiting isn't going to make the data arrive faster. More connections aren't doing any good, and are probably doing some harm, as they create contention, and dealing with contention is itself a resource drain.
The ExclusiveLocks you mention are probably the locks each active session has on its own transaction ID. They wouldn't be a cause of problems, just an indication you have a lot of active sessions.
BufferIO is what you get when two sessions want the exact same data at the same time: one asks for the data (DataFileRead) and the other asks to be notified when the first one is done (BufferIO).
Some things to investigate:
Query performance can degrade over time. The amount of data being requested can increase, especially with date-predicated queries. In Performance Insights you can see how many blocks are read (from disk/IO) versus hit (from the buffer); you want as many hits as possible. The loss of burst balance is a real indicator that this is happening. It's not an issue during the week because you have fewer requests.
Check the actual amount of shared buffers you have to service these queries; the default is 25% of RAM, and you could tweak this higher (some say 40%). It's a dark art, and you are unlikely to find an answer beyond tweak and test.
Vacuum and analyze your tables. The data comes from somewhere, right? With updates, deletes, and inserts, tables grow and fill up with garbage. At a certain point the autovacuum processes aren't enough at default levels. You can tweak them to be more aggressive, fire them manually at night (see the sketch after the links below), etc.
Index management: same as above.
Autovacuum docs
Resource Consumption
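For the manual vacuum mentioned in the list above, a hedged JDBC sketch (the table name is a placeholder; note that VACUUM cannot run inside a transaction block, so auto-commit must stay on):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class NightlyVacuum {
        public static void main(String[] args) throws Exception {
            try (Connection con = DriverManager.getConnection(
                     "jdbc:postgresql://<rds-endpoint>:5432/mydb", "user", "pass");
                 Statement st = con.createStatement()) {
                con.setAutoCommit(true); // VACUUM refuses to run inside a transaction block
                st.execute("VACUUM (ANALYZE) my_view_base_table"); // also refreshes planner statistics
            }
        }
    }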
Based on what you've shared, I would guess your connections are not being properly closed.
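If leaked or unbounded connections are the suspect, capping the pool and enabling leak detection will confirm it quickly. A sketch with HikariCP (the default pool in recent Spring Boot versions; the URL and sizes are illustrative assumptions):

    import com.zaxxer.hikari.HikariConfig;
    import com.zaxxer.hikari.HikariDataSource;

    public class PoolSetup {
        public static HikariDataSource buildPool() {
            HikariConfig cfg = new HikariConfig();
            cfg.setJdbcUrl("jdbc:postgresql://<rds-endpoint>:5432/mydb"); // placeholder
            cfg.setUsername("user");
            cfg.setPassword("pass");
            cfg.setMaximumPoolSize(50);             // hard cap: connections can no longer grow without bound
            cfg.setMinimumIdle(10);                 // don't keep ~200 idle sessions open
            cfg.setMaxLifetime(30 * 60 * 1000L);    // recycle each connection after 30 minutes
            cfg.setLeakDetectionThreshold(60_000L); // log a stack trace if a connection is held > 60 s
            return new HikariDataSource(cfg);
        }
    }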

MySQL - Single connection versus a Pool

Due to some previous questions that I've had answered about the synchronous nature of MySQL, I'm starting to question why people use connection pools, and whether in my scenario I should move to a pool.
Currently my application keeps a single connection active. There is only a single connection, statement, and result set being used in my application, and it is recycled. All of my database tasks are placed in a queue and executed back to back on a separate thread: one thread for database queries, one connection for database access. In the event the connection has an issue, it disposes of the connection and creates a new one.
From my understanding, regardless of how many queries are sent to MySQL to be processed, they will all be processed synchronously in the order they are received. It does not matter whether these queries come from a single source or multiple sources; they will be executed in the order received.
With this being said, what's the point in having multiple connections and threads pushing queries into the database's processing queue when it's going to process them one by one anyway? A query is not going to execute until the query before it has completed, and likewise, in my scenario without a pool, the next query is not executed until the previous one has completed.
Now you may say:
The amount of time spent on processing the results provided by the MySQL query will increase the amount of time between queries being executed.
That's obviously correct, which is why I have a worker thread that handles the results of a query. When a query completes, I convert the results into Map<> format, release the statement/result set from memory, and start processing the next query. The Map<> is sent off to a separate worker thread for processing, so it doesn't congest the query-execution thread.
Can anyone tell me whether the way I'm doing things is all right, and whether I should take the time to move to a pool of connections rather than a single persistent connection? The most important thing is why. I'm starting this thread strictly for informational purposes.
EDIT: 4/29/2016
I would like to add that I know what a connection pool is; however, I'm more curious about the benefits of using a pool over a single persistent connection, given that the table locks out requests from all connections during query processing to begin with.
Just trying this StackOverflow thing out but,
Every connection to a database is idle most of the time. When you execute a query over a connection to INSERT into or UPDATE a table, it locks the table, preventing concurrent edits. While this is good, preventing data overwriting or corruption, it means no other connection can make edits while the first connection/query is still running.
However, starting a new connection takes time, and in larger infrastructures that try to skim off all excess time wastage, this is not good. As such, a connection pool is a group of connections kept in the idle state, ready for the next query.
Lastly, if you are running a small project, there's usually no reason for a connection pool, but if you are running a large site with UPDATEs and INSERTs flying around every millisecond, a connection pool reduces overhead time.
Slightly related answer:
a pool can do additional "connection health checks" (by examining SQL exception codes)
and refresh connections to reduce memory usage (see the note on "maxLifeTime" in the answer).
But all those things might not outweigh the simpler approach of using one connection.
Another factor to consider is (blocking) network I/O times. Consider this (rough) scenario:
client prepares query --> client sends data over the network
--> server receives data from the network --> server executes query, prepares results
--> server sends data over the network --> client receives data from the network
--> client prepares resultset
If the database is local (on the same machine as the client) then network times are barely noticeable.
But if the database is remote, network I/O times can become measurable and impact performance.
Assuming the isolation level is at "read committed", running select-statements in parallel could become faster.
In my experience, using 4 connections at the same time instead of 1 generally improves performance (or throughput).
This does depend on your specific situation: if MySQL is indeed just mostly waiting on locks to get released,
adding additional connections will not do much in terms of speed.
And likewise, if the client is single-threaded, the client may not actually perceive any noticeable speed improvements.
This should be easy enough to test, though: compare execution times of one program with 1 thread using 1 connection to execute X select queries
(i.e. re-use your current program) against another program using 4 threads, each thread with its own separate connection,
executing the same X select queries divided over the 4 threads (or just run the first program 4 times in parallel).
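A hedged sketch of that comparison (plain JDBC with MySQL Connector/J assumed; the URL and query are placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class SelectBenchmark {
        static final String URL = "jdbc:mysql://localhost:3306/test?user=u&password=p"; // placeholder
        static final int TOTAL_QUERIES = 10_000;

        static void run(int threads) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            long start = System.nanoTime();
            for (int t = 0; t < threads; t++) {
                pool.submit(() -> {
                    // One private connection per thread, as in the test described above.
                    try (Connection con = DriverManager.getConnection(URL);
                         Statement st = con.createStatement()) {
                        for (int i = 0; i < TOTAL_QUERIES / threads; i++) {
                            try (ResultSet rs = st.executeQuery("SELECT 1")) { // placeholder query
                                rs.next();
                            }
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.MINUTES);
            System.out.printf("%d thread(s): %d ms%n", threads,
                (System.nanoTime() - start) / 1_000_000);
        }

        public static void main(String[] args) throws Exception {
            run(1); // baseline: the current single-connection design
            run(4); // 4 threads, each with its own connection
        }
    }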
One note on connection pools (like HikariCP): the pool must ensure that no transaction remains open when a connection is returned to the pool (closed), and this can mean a rollback is sent each time a connection is returned when auto-commit is off and no commit or rollback was sent previously. This in turn can increase network I/O times instead of reducing them. So make sure to test either with auto-commit on, or make sure to always send a commit or rollback after your query or set of queries is done.
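To keep the pool from having to roll anything back, end every unit of work explicitly before the connection is returned. A minimal pattern (sketch; the DataSource and table are assumptions):

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;
    import javax.sql.DataSource;

    public class TransferDao {
        private final DataSource dataSource;

        public TransferDao(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        public void transfer() throws SQLException {
            try (Connection con = dataSource.getConnection()) {
                con.setAutoCommit(false);
                try (Statement st = con.createStatement()) {
                    st.executeUpdate("UPDATE accounts SET balance = balance - 10 WHERE id = 1");
                    st.executeUpdate("UPDATE accounts SET balance = balance + 10 WHERE id = 2");
                    con.commit();   // explicit commit: no open transaction goes back to the pool
                } catch (SQLException e) {
                    con.rollback(); // explicit rollback on failure, so the pool needn't send one
                    throw e;
                }
            }
        }
    }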
A connection pool and a persistent connection are not the same thing. One concerns limiting the number of SQL connections; the other concerns single-pipe issues.
The problem is generally the time taken to transfer the SQL output from the server rather than the query execution time. So if you open two CLI SQL clients and fire two queries, one with a large output and one with a small output (in that sequence), the smaller one finishes first while the larger one is still scrolling its output.
The point here is that multiple connections do solve problems for cases like the above.
When you have multiple front-end requests asking for queries, you may also prefer persistent connections, because they give you the benefit of multiplexing over different connections (large versus small outputs) and avoid the overhead of session setup/teardown.
Connection pool APIs have built-in error checks and handling, but most APIs still expect you to declare explicitly whether you want a persistent connection or not.
So in effect there are 3 variables: pool, persistence, and config parameters via the API. One has to mix and match pool size, persistence, and the number of connections to suit one's environment.

Java pooling connection optimization

What are the common guidelines/advice for configuring, in Java, an HTTP connection pool to support a huge number of concurrent HTTP calls to the same server? I mean:
max total connections
max default connection per route
reuse strategy
keep alive strategy
keep alive duration
connection timeout
....
(I am using Apache HttpComponents 4.3, but I am open to exploring new solutions.)
To be clearer, this is my situation:
I developed a REST resource that needs to perform about 10 HTTP calls to AWS CloudSearch in order to obtain search results to be collected into a final result (one that I really cannot obtain through a single query).
The whole operation must take less than 0.25 seconds, so I run the HTTP calls in parallel in 10 different threads.
During a benchmarking test, I noticed that with a few concurrent requests (5), my objective is reached. But when increasing the concurrent requests to 30, there is a tremendous degradation in performance due to the connection time, which takes about 1 second. With few concurrent requests, instead, the connection time is about 150 ms (to be more precise, the first connection takes 1 second, and all the following connections take about 150 ms). I can ensure that CloudSearch returns its response in less than 15 ms, so there is a problem somewhere in my connection pool.
Thank you!
The number of threads/connections that works best depends on your implementation (which you did not post), but here are some guidelines as requested:
If those threads never block at all, you should have as many threads as cores (Runtime.getRuntime().availableProcessors(), which includes hyper-threaded cores), simply because more than 100% CPU usage isn't possible.
If your threads rarely block, cores * 2 is a good start for benchmarking.
If your threads frequently block, you absolutely need to benchmark your application with various settings to find the best solution for your implementation, OS and hardware.
Now the most optimal case is obviously the first one, but to get there you need to remove blocking from your code as much as you can. Java can do this for IO operations if you use the NIO package in non-blocking mode (which is not how the Apache package does it).
Then you have 1 thread that waits on a selector and wakes as soon as any data is ready to be sent or read. This thread then only copies the data from its source to its destination and returns to the selector. In the case of a read (incoming data), this destination is a blocking queue, on which as many threads as you have cores wait. One of those threads then pulls out the received data and processes it, now without any blocking.
You can then use the length of the blocking queue to adjust how many parallel requests are reasonable for your task and hardware.
The first connection takes >1 second because it actually has to look up the address via DNS. All other connections are put on hold for the moment, as there is no sense in doing this twice. You can circumvent that either by connecting to the IP directly (probably not good if you talk to a load balancer) or by "warming up" the connections with an initial request. Any new connection afterwards will use the cached DNS result, but it still needs to perform other initialization, so reusing connections as much as you can will reduce latency a lot. With NIO this is a very easy task.
In addition there are HTTP multi-requests: you make one connection but request several URLs in one request and get several responses over "the same line". This massively reduces connection overhead, but it needs to be supported by the server.
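For the Apache HttpComponents 4.3 mentioned in the question, a pool sized for the 30 parallel calls and warmed up as suggested above might look like this (a sketch; the limits, timeouts, and warm-up URL are assumptions):

    import org.apache.http.client.config.RequestConfig;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

    public class SearchHttpClient {
        public static CloseableHttpClient build() throws Exception {
            PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
            cm.setMaxTotal(40);           // >= peak parallel calls (30) plus headroom
            cm.setDefaultMaxPerRoute(40); // all calls go to the same CloudSearch route

            RequestConfig rc = RequestConfig.custom()
                .setConnectTimeout(200)   // fail fast: the budget is 250 ms end to end
                .setSocketTimeout(200)
                .build();

            CloseableHttpClient client = HttpClients.custom()
                .setConnectionManager(cm)
                .setDefaultRequestConfig(rc)
                .build();

            // Warm-up request: resolves DNS and opens a connection before real traffic.
            client.execute(new HttpGet("https://search-domain.example.com/ping")).close();
            return client;
        }
    }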

How to parallelize Multiple Requests to mongoDb?

I am using a single standalone MongoDB server with no special topology like replication or sharding. Currently I have an issue that MongoDB does not support more than 500 parallel requests. Note that I am using only one instance of MongoClient, and the remaining threads are used for inserts. I am using the Java executor framework to create the threads, and these threads are used to insert data into a collection (all inserts go into the same collection).
You should queue the requests before you issue them towards the database. There is no use in requesting 500 things from your database in parallel. Remember, a single request comes with costs, memory-wise, locking-wise, and so on. You are actually wasting resources by asking your database for too much at once (request-wise, that is, not data-wise).
So use a queue (or several) and pool up the requests. From that pool you feed your worker threads (let's say 5 or 10 are enough) and that's it.
Take a look at the Future interface in Java's concurrent package. Asynchronous processing here looks like the approach with the highest throughput and the lowest resource impact.
But check the MongoDB driver first; I would not be surprised if it already implements things this way. In that case you just have to limit yourself by using a queue so that only, let's say, 10 or 100 requests at once are being handled by the database driver. Do some performance checks, tweaking the number of actual requests sent to the database.
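A hedged sketch of that queue-plus-workers approach with the MongoDB sync driver (the URI, names, and thread count are placeholders to tune):

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import org.bson.Document;

    public class BoundedInserts {
        public static void main(String[] args) throws InterruptedException {
            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> col =
                    client.getDatabase("mydb").getCollection("mycollection");

                // One shared MongoClient, but only 10 worker threads actually talk to
                // the server; the executor's queue buffers the rest instead of firing
                // 500 parallel requests.
                ExecutorService workers = Executors.newFixedThreadPool(10);
                for (int i = 0; i < 100_000; i++) {
                    final int n = i;
                    workers.submit(() -> col.insertOne(new Document("n", n)));
                }
                workers.shutdown();
                workers.awaitTermination(1, TimeUnit.HOURS);
            }
        }
    }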

shared DB connection vs private DB connections

I'm trying to figure out how to manage/use long-living DB connections. I have very little experience of this kind, as I have used DBs only with small systems (up to some 150 concurrent users, each with their own DB user/pass, so there were up to 150 long-living DB connections at any time) or web pages (each page request has its own DB connection that lasts for less than a second, so the number of concurrent DB connections isn't huge).
This time there will be a Java server and a Flash client. Java connects to PostgreSQL. Connections are expected to be long-living, i.e., they're expected to start when the Flash client connects to the Java server and to end when the Flash client disconnects. Would it be better to share a single connection between all users (clients), or to make a private connection for every client? Or would some other solution be better?
*) Single/shared connection:
(+) pros
only one DB connection for whole system
(-) cons:
transactions can't be used (e.g., "user1.startTransaction(); user1.updateBooks(); user2.updateBooks(); user1.rollback();" on a single shared connection would roll back the changes done by user2)
long queries of one user might affect other users (not sure about this, though)
*) Private connections:
(+) pros
no problems with transactions :)
(-) cons:
a huge number of concurrent connections might be required; i.e., if there are 10,000 users online, 10,000 DB connections are required, which seems too high a number :) I don't know anything about the expected number of users yet, though, as we are still researching and planning.
One solution would be to introduce timeouts, i.e., if a DB connection is not used for 15/60/900(?) seconds, it gets disconnected. When the user needs the DB again, it gets reconnected. This seems like a good solution to me, but I would like to know what the reasonable limits might be, e.g., what the maximum number of concurrent DB connections should be, what timeout should be used, etc.
Another solution would be to group queries into two "types": one type that can safely use a single shared long-living connection (e.g., "update user set last_visit = now() where id = :user_id"), and another that needs a private short-living connection (e.g., something that can potentially do heavy work or use transactions). This solution does not seem appealing to me, though if that's the way it should be done, I could try it...
So... What do other developers do in such cases? Are there any other reasonable solutions?
I don't use long-lived connections. I use a connection pool to manage connections, and I keep them only for as long as it takes to perform an operation: get the connection, perform my SQL operation, return the connection to the pool. It's much more scalable and doesn't suffer from transaction problems.
Let the container manage the pool for you - that's what it's for.
By using a single connection, you also get very low performance, because the database server will only allocate one connection for you.
You definitely need a connection pool. If your app runs inside an application server, use the container's pool. Otherwise you can use a connection pool library like c3p0.
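A minimal c3p0 setup as suggested above (a sketch; the URL, credentials, and limits are assumptions). Note that setMaxIdleTime also covers the reconnect-after-inactivity idea from the question:

    import com.mchange.v2.c3p0.ComboPooledDataSource;
    import java.sql.Connection;

    public class Db {
        private static final ComboPooledDataSource POOL = new ComboPooledDataSource();

        static {
            POOL.setJdbcUrl("jdbc:postgresql://localhost:5432/mydb"); // placeholder
            POOL.setUser("user");
            POOL.setPassword("pass");
            POOL.setMinPoolSize(5);
            POOL.setMaxPoolSize(50);  // far fewer than one connection per client
            POOL.setMaxIdleTime(300); // seconds; closes connections idle for > 5 minutes
        }

        public static Connection borrow() throws Exception {
            return POOL.getConnection(); // calling close() returns it to the pool
        }
    }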
