We have an application that we run inside the JBoss EJB container. This application connects to MySQL and runs stored procedures on MySQL. We have observed that after a point in time JBoss stops responding to web connections to the web application hosted on it. After investigating, we have found the following issues.
The number of socket connections from JBoss keeps increasing, and once it goes above a thousand JBoss stops working completely because of the per-process limit on the number of open socket connections (i.e. 1024). We have cross-checked the code for socket connections, but as far as we can tell it only opens sockets to MySQL, so either this is the problem or something else is doing it; we can't find the actual cause. We have tried using netstat and lsof on Linux; any other suggestions for finding the root cause of the connection issue would be of great help.
We also checked SHOW PROCESSLIST on MySQL, but it shows only 8 to 10 active connections at any point in time, so no luck there.
There is also another interesting thing: we had reduced the connection timeout in our application from 86400 seconds to 30 seconds, and we had reduced the wait timeout on the MySQL database to 50 seconds, so there is a gap of 20 seconds. We have repeatedly cross-checked the database for any issues related to this, but it hardly affects anything. Any suggestions on this would also be helpful. We plan to reduce the difference to 5 seconds.
Update: We have subsequently changed the connection timeout from 30 to 170 seconds and the MySQL wait_timeout to 180 seconds.
We have tried making changes according to the JBoss forums, which say that on the cached-connection-manager tag we have to enable an attribute called debug="true". We tried this solution, but if there are transactions in flight it causes them to be dropped, which wreaked havoc in our application, so we subsequently reverted the change and are running without it. The application is still on the verge of disaster. We are still running clueless; JBoss seems to be at the core of our issues, and still no solution :(
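For what it's worth, the cached-connection-manager debug flag exists to report connections that the application never closes, so the usual application-side fix is to make sure every connection taken from the datasource is closed on every code path. A minimal sketch of that pattern, assuming Java 7+ try-with-resources, a placeholder JNDI name java:/MySqlDS and a placeholder procedure name:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.SQLException;
    import javax.naming.InitialContext;
    import javax.naming.NamingException;
    import javax.sql.DataSource;

    public class ProcedureRunner {

        // placeholder JNDI name; use the one from your *-ds.xml
        private static final String DS_JNDI_NAME = "java:/MySqlDS";

        public void runProcedure(int accountId) throws NamingException, SQLException {
            DataSource ds = (DataSource) new InitialContext().lookup(DS_JNDI_NAME);

            // try-with-resources guarantees the connection and statement are closed
            // (i.e. returned to the pool) even if the call throws
            try (Connection con = ds.getConnection();
                 CallableStatement cs = con.prepareCall("{call some_procedure(?)}")) {
                cs.setInt(1, accountId);
                cs.execute();
            }
        }
    }

If a connection is ever obtained and not closed on an exception path, the application keeps opening new sockets to MySQL while the old ones sit idle, which would match the symptom of the socket count creeping up towards the 1024 descriptor limit.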
We have a PostgreSQL 9.6 instance on an Ubuntu 18.04 machine. When we restart the Java services deployed in a Kubernetes cluster, the already existing idle connections do not get removed, and the services create new connections on each restart. Because of this we have hit the connection limit many times and have to terminate connections manually every time. The same service versions are deployed on other instances, but we do not see this scenario on those servers.
I have some questions regarding this:
Can it be a PostgreSQL configuration issue? However, I didn't find any timeout-related setting difference between the two instances (one is working fine and the other isn't).
If this is a Java service issue, then what should I check?
If it's neither a PostgreSQL issue nor a Java issue, then what should I look into?
If the client process dies without closing the database connection properly, it takes a while (2 hours by default) for the server to notice that the connection is dead.
The mechanism for that is provided by TCP and is called keepalive: after a certain idle time, the operating system starts sending keepalive packets. After a certain number of such packets without response, the TCP connection is closed, and the database backend process will die.
To make PostgreSQL detect dead connections faster, set the tcp_keepalives_idle parameter in postgresql.conf to less than 7200 seconds.
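On the client side, the PostgreSQL JDBC driver can also be asked to enable TCP keepalive on its own sockets, which complements the server-side setting above. A rough sketch, with the URL and credentials as placeholders:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class KeepAliveConnect {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.setProperty("user", "app_user");          // placeholder
            props.setProperty("password", "app_password");  // placeholder
            // ask the driver to set SO_KEEPALIVE on the socket; the idle time and
            // probe interval still come from the OS / server-side tcp_keepalives_* settings
            props.setProperty("tcpKeepAlive", "true");

            try (Connection con = DriverManager.getConnection(
                    "jdbc:postgresql://db-host:5432/appdb", props)) {
                System.out.println("connected: " + !con.isClosed());
            }
        }
    }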
I'm new to Java and Tomcat. I'm developing a website in Java using Spring MVC. It's deployed to a Linux server running Tomcat 8. Everything works fine when I deploy, and it connects to the database great. The issue is that the site seems to go idle very quickly. I haven't been able to time it exactly, but it seems like it only takes about a minute of inactivity for the entire site to go idle. Then the next request is extremely slow, loading in all my classes. I'm losing my sessions as well.
Is this a common occurrence? Does it sound like I'm doing something wrong in java? Tomcat? Both?
EDIT: In light of StuPointerException's comment, I've updated my database connection management. I'm now using Apache DBCP. I will update if this resolves the problem; I want to give my QA tester ample time to hit the site some more.
It's difficult to answer your question directly without more information about your server setup.
For what it's worth though, every time I see this kind of behaviour it's down to a mis-configured database connection pool. There can be a significant overhead in creating new database connections.
If you don't use a connection pool or you're allowing connections in the pool to die (due to missing validation queries/checks) then you will start to see performance problems over time due to connection timeouts.
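For example, with Apache Commons DBCP 2 (which you switched to in your edit), the pool can validate connections before handing them out so that stale ones are replaced rather than given to the application. A rough sketch, with the URL, credentials and sizing values as placeholders:

    import org.apache.commons.dbcp2.BasicDataSource;

    public class PoolConfig {

        public static BasicDataSource createDataSource() {
            BasicDataSource ds = new BasicDataSource();
            ds.setUrl("jdbc:mysql://localhost:3306/appdb"); // placeholder URL
            ds.setUsername("app_user");                     // placeholder
            ds.setPassword("app_password");                 // placeholder

            // keep a few idle connections warm so a request after a quiet period
            // doesn't pay the cost of opening a brand-new connection
            ds.setInitialSize(2);
            ds.setMinIdle(2);
            ds.setMaxTotal(20);

            // validate connections before use so dead ones are discarded
            // instead of being handed to the application
            ds.setValidationQuery("SELECT 1");
            ds.setTestOnBorrow(true);
            ds.setTestWhileIdle(true);
            return ds;
        }
    }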
I have a Java application requesting about 2.4 million records from a Microsoft SQL Server instance (SQL Server 2008 R2 SP3).
The application runs fine on all hosts, except one. On this host, the application is able to retrieve data on some occasions. But on some others, it hangs.
Monitoring the SQL Server instance indicates that the SPID associated with the query is in an ASYNC_NETWORK_IO wait state.
There are a few links online that talk about it.
https://blogs.msdn.microsoft.com/joesack/2009/01/08/troubleshooting-async_network_io-networkio/
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/6db233d5-8892-4f8a-88c7-b72d0fc59ca9/very-high-asyncnetworkio?forum=sqldatabaseengine
https://social.msdn.microsoft.com/Forums/sqlserver/en-US/1df2cab8-33ca-4870-9daf-ed333a64630c/network-packet-size-and-delay-by-sql-server-sending-data-to-client?forum=sqldatabaseengine
Based on the above, an ASYNC_NETWORK_IO wait usually means one of two things:
1. The application is slow to process the results.
2. The network between the application and the DB has some issues.
For #1 above, we analyzed tcpdumps and found that in the cases where the query goes into the ASYNC_NETWORK_IO state, the application server's TCP connection has a window size that oscillates between 0 and a small number, and eventually remains stuck at 0. Based on some more analysis, firewall-related aspects between the DB and the application have also been mostly ruled out.
So I am staring at #2, unable to understand what could possibly go wrong. It is all the more baffling because the same code has been running under similar data loads for more than a year now, and it also runs fine on other hosts.
The JDBC driver being used is sqljdbc4-4.0.jar.
By default it has an adaptive buffering feature, which buffers the result stream under the hood to reduce the application's memory usage.
We use the default fetch size of 128 (which I believe is not a good choice).
So I am going to experiment with overriding the default adaptive buffering behavior, even though the MS docs suggest that adaptive buffering is a good thing to have for large result sets.
I will change the connection setting to use selectMethod=cursor.
And also change the fetchSize to 1024.
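For reference, here is roughly how those changes would look with the sqljdbc4 driver; the host, database, credentials and query are placeholders, and my understanding is that adding responseBuffering=full to the URL would be the way to switch adaptive buffering off entirely if we go that far:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class LargeResultSetFetch {
        public static void main(String[] args) throws Exception {
            // selectMethod=cursor asks the driver to use a server-side cursor
            // instead of streaming the whole result set in one go
            String url = "jdbc:sqlserver://db-host:1433;databaseName=appdb;"
                       + "user=app_user;password=app_password;"
                       + "selectMethod=cursor";

            try (Connection con = DriverManager.getConnection(url);
                 PreparedStatement ps = con.prepareStatement("SELECT * FROM big_table")) {
                ps.setFetchSize(1024); // fetch rows in batches of 1024 instead of 128
                try (ResultSet rs = ps.executeQuery()) {
                    long count = 0;
                    while (rs.next()) {
                        count++; // consume rows promptly so the TCP window doesn't close
                    }
                    System.out.println("rows read: " + count);
                }
            }
        }
    }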
Now if it does not work:
What are some aspects of the problem that are worth investigating?
Assuming it's still an issue with the client, what other connection or network settings should be inspected/changed to make progress?
If it does work consistently, what is the impact of changing the connection setting to selectMethod=cursor:
On the application side?
On the database side?
Update: I tested the application with selectMethod=cursor added to the connection string. However, it results in the same issue as above.
Based on discussions with other administrators on the team, at this point the issue may be in the JDBC driver or in the OS (when it tries to handle the data coming off the network).
After a good amount of discussion with the system admin, network admin and database admin, it was agreed that somewhere in the OS-to-application stack the data from the network wasn't being handled. In the meantime, we tested a workaround in which we broke the query down into smaller result sets: five queries, each returning about 500k records.
When we ran these queries sequentially, we still ran into the same issue.
However, when we ran the queries in parallel, it was always successful.
Given that the workaround has always worked, we haven't bothered getting to the root cause of the problem anymore.
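The parallel approach was conceptually along the lines of the following sketch (not the actual code; the URL, table and the way the rows are split into five buckets are placeholders). Each task uses its own connection, so the five result sets are consumed independently:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ParallelFetch {

        // placeholder connection string
        private static final String URL = "jdbc:sqlserver://db-host:1433;databaseName=appdb;"
                                        + "user=app_user;password=app_password";

        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(5);
            List<Future<Long>> results = new ArrayList<>();

            // split the original query into 5 range queries of roughly 500k rows each
            for (int bucket = 0; bucket < 5; bucket++) {
                final int b = bucket;
                results.add(pool.submit(new Callable<Long>() {
                    @Override
                    public Long call() throws Exception {
                        long rows = 0;
                        try (Connection con = DriverManager.getConnection(URL);
                             PreparedStatement ps = con.prepareStatement(
                                     "SELECT * FROM big_table WHERE id % 5 = ?")) { // placeholder split
                            ps.setInt(1, b);
                            try (ResultSet rs = ps.executeQuery()) {
                                while (rs.next()) {
                                    rows++; // consume rows promptly
                                }
                            }
                        }
                        return rows;
                    }
                }));
            }

            long total = 0;
            for (Future<Long> f : results) {
                total += f.get(); // wait for all five tasks to finish
            }
            pool.shutdown();
            System.out.println("total rows: " + total);
        }
    }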
On another note, the hardware and software running the application were also outdated; it was running Red Hat 5, so that could well have something to do with it.
I'm currently developing a Java WebSocket application that is deployed on WildFly 10. I cannot post the code, but here's the logic:
Multiple threads poll a database every 5 seconds (a select query, reusing a PreparedStatement after closing the previous ResultSet) and send the results via WebSocket to all connected clients.
I have configured a datasource that connects to a MySQL server (localhost).
The application runs fine until, a while later, it crashes and the log fills up with 'Unable to get managed connection from datasource' errors. The WebSocket also fails with a 'ClosedChannelException'.
Services on the same server that open a connection and close it immediately work fine. However, there are 5-6 threads in the code in question that must use their connections again every 5 seconds, so each thread is given a dedicated connection that is only torn down when the application context is destroyed.
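Conceptually the polling logic is along the lines of the following sketch (not the actual code; the JNDI name, query and broadcast method are placeholders). Each poller keeps its dedicated connection and prepared statement for its whole lifetime and only closes the ResultSet between cycles; the dedicated connections are intentional, since these threads need their connection again every 5 seconds:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import javax.naming.InitialContext;
    import javax.sql.DataSource;

    public class DbPoller {

        public void start() throws Exception {
            // placeholder JNDI name for the configured MySQL datasource
            DataSource ds = (DataSource) new InitialContext().lookup("java:/MySqlDS");

            // dedicated connection and statement, held until the application context is destroyed
            final Connection con = ds.getConnection();
            final PreparedStatement ps =
                    con.prepareStatement("SELECT id, payload FROM events"); // placeholder query

            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(new Runnable() {
                @Override
                public void run() {
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            broadcast(rs.getString("payload")); // push to all connected WebSocket clients
                        }
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }, 0, 5, TimeUnit.SECONDS);
        }

        private void broadcast(String message) {
            // placeholder: send the message to every connected WebSocket session
        }
    }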
Another thing: once the application has failed, it works for a shorter time after a disable/enable cycle; only a reboot gets it working properly again.
The same project works without error on GlassFish.
Somehow, WildFly seems to periodically reset either the DB connections or all TCP connections altogether.
Is there a setting relevant to WildFly's behaviour towards threads? I have verified that only as many threads as intended are actually created.
Any help would be appreciated.
Edit: This application works well on my local machine. When I deploy it on the remote server, it works for a while (3 hours at most) before failing altogether.
I use NetBeans 8 to compile, if that helps.
I am having a strange issue with an existing servlet-based application (a little old one).
I cannot get any servlet to complete its operation if it runs a MySQL prepared statement slower than 10 seconds (I only have prepared statements in this app).
I don't have a problem with faster queries (less than 10 seconds).
catalina.out does not show any trace, nor does the browser interface. When I inspect the browser request through Fiddler, it shows:
"This session is not yet complete. Press F5 to refresh when session is
complete for updated statistics. Request Count: 1 Bytes Sent:
437 (headers:437; body:0) Bytes Received: 0 (headers:0; body:0)"
MySQL "SHOW FULL PROCESSLIST" command shows the prticular socket connection (of a slow query) running for about 12 seconds with COMMAND="Execute" and STATE="Sending Data", then changing to COMMAND="Sleep" and STATE="" and stays like this for a long time (more than 500 seconds).
Ideally this connection should not remain as a SLEEP for such a long time, but close after completion of sending data.
The netstat -ab command shows the same connection as ESTABLISHED under tomcat6.exe and mysqld.exe.
Ideally this should move to TIME_WAIT once the query execution is finished and stay there for the TcpTimedWaitDelay interval.
The expected payload of this query is pretty nominal (around 2 KB).
MySQL connect_timeout is 100 and Tomcat connectionTimeout is "100000" (for port 8000, on which I am running the app).
One thing I did not try to change is the JDBC DriverManager.setLoginTimeout setting, since it only applies while the connection is being established.
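For completeness, my understanding is that the login timeout only covers establishing the connection, whereas MySQL Connector/J's socketTimeout URL property (in milliseconds) bounds how long a read on an already-established connection may block. A rough sketch with placeholder credentials:

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class TimeoutDemo {
        public static void main(String[] args) throws Exception {
            // applies only while the connection is being established
            DriverManager.setLoginTimeout(10); // seconds

            // socketTimeout bounds blocking reads on an established connection,
            // connectTimeout bounds the TCP connect itself (both in milliseconds)
            String url = "jdbc:mysql://localhost:3306/appdb"
                       + "?user=app_user&password=app_password"
                       + "&connectTimeout=10000&socketTimeout=30000";

            try (Connection con = DriverManager.getConnection(url)) {
                System.out.println("connected: " + !con.isClosed());
            }
        }
    }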
This app was working quite well until a couple of days ago, but I do not have any statistics on the query execution times from those days.
I am running Windows Server 2008 R2 Standard Edition, Tomcat 6 and MySQL 5.5.
I cannot pinpoint this behaviour to any cause; I appreciate your help very much.
It seems to be an issue related to the MySQL connector. We were using an old connector (version 3.1.x); the latest stable version (5.1.20) fixed the issue.