I have a application that return a long request that returns a stream (a huge json)
The application is written in Java and I'm using Jetty as server.
The problem is after sometimes getting data, it stops. I made some tests and sometimes I got 10, 15, 40%.. doesn't matter.. Jetty interrupts the connection at some moment. I already isolated only one machine without other requests and it happens the same way.
I do not know how to debug, cause I didn't see any error. It only interrupts.
Any help is appreciate
What version of jetty?
Is this on a slow connection perhaps?
If so, you are likely encountering idle timeouts.
It can happen like this ....
Server has a large amount of data to send, it sends until there is TCP backpressure from the client telling it "woah!".
So the Server waits until the TCP layer says its ok to start sending again.
The client is slow.
This wait is longer than the configured Idle Timeout for that connector.
The server closes the connection.
Related
The problem:
I am having some strange behaviour from a Jetty server (rest over https) when some client connections are closed (client-side) before the server has had time to reply. Normally this is well managed and expected by a webserver/application server but in a specific instance something breaks the server that stops replying.
I am trying to reproduce programmatically and locally the issue, opening a client connection and closing it before the server has had time to reply, but I do not have much experience with a situation like this, normally the clients I write are expected to not die immediately.
I am not interested in the language/application I have to use to replicate my case, it can be a Java program, a netcat command, telnet, dotnetcore... The only limit I have is that it should run on a Kubernetes pod, if possible.
I am trying to use Java to open a socket then close it immediately, or to create an Http client and stop it immediately after a request sent, but with no luck at the moment.
At the same time I am looking at netcat, but I fear it's too low level for a rest request.
In my batch application when I am sending requests across a network using a Web Service and Java, after running about 30000 requests and receiving the responses, the program throws a java.net.connectexception connection timed out exception.
I am using WildFly, along with some Java code in the middle to configure the requests (Strings) before sending it across the network.
After research the possible reasons I found for this is that there is either a Firewall blocking my access, which can't be true since it ran 90% of the requests already.
I've also seen somewhere that says that I could have overloaded the server, although I'm not sure what that means exactly.
You have filled up the server's listen backlog queue. This is caused by creating new connections faster than the server can accept them. You should look into connection pooling at the client, and handling multiple requests per connection at the server.
Disclaimer: there is a lot of information on similar topics. In our case it works as expected without AWS ELB (Elastic Load Balancer), i.e. when the client drops, ServletOutputStream.flush() throws IOException.
Setup: we have an instance running Tomcat 7 (Java 1.7) behind ELB (HTTPS:443 -> HTTP:8080). The servlet streams data to the client through HTTP long lived connection.
Problem: when the client disconnects, the server keeps streaming data, i.e. ServletOutputStream.flush() or .write() does not throw IOException. The ELB kind of "buffers" the connection (we can see it with IpTraf monitor), so from the Tomcat side it appears as the client is still there. Without the ELB, IOException is thrown properly, so the servlet can stop streaming. We have disabled connection draining and reduced connection timeout to 1 sec, we also reduced all timeouts on Tomcat's HTTP Connector including KeepAlive to just few seconds. Nothing helps.
Question: is there anything we can do with the ELB configuration / Tomcat / Java side to allow disconnection detection in this setup?
We had the same kind of problem (but in .NET with IIS).
We seem to have solved it by switching from the classic ELB to the Application ELB. Now writing to the output stream of a closed connection gives an exception, where first (on classic ELB) it didn't.
Hope this helps anyone running into the same problem
I am using Elasticsearch 1.5.1 and Tomcat 7. Web application creates a TCP client instance as Singleton during server startup through Spring Framework.
Just noticed that I failed to close the client during server shutdown.
Through analysis on various tools like VisualVm, JConsole, MAT in Eclipse, it is evident that threads created by the elasticsearch client are live even after server(tomcat) shutdown.
Note: after introducing client.close() via Context Listener destroy methods, the threads are killed gracefully.
But my query here is,
how to check the memory occupied by these live threads?
Memory leak impact due to this thread?
We have got few Out of memory:Perm gen errors in PROD. This might be a reason but still I would like to measure and provide stats for this.
Any suggestions/help please.
Typically clients run in a different process than the services they communicate with. For example, I can open a web page in a web browser, and then shutdown the webserver, and the client will remain open.
This has to do with the underlying design choices of TCP/IP. Glossing over the details, under most cases a client only detects it's server is gone during the next request to the server. (Again generally speaking) it does not continually poll the server to see if it is alive, nor does the server generally send a "please disconnect" message on shutting down.
The reason that clients don't generally poll servers is because it allows the server to handle more clients. With a polling approach, the server is limited by the number of clients running, but without a polling approach, it is limited by the number of clients actively communicating. This allows it to support more clients because many of the running clients aren't actively communicating.
The reason that servers typically don't send an "I'm shutting down" message is because many times the server goes down uncontrollably (power outage, operating system crash, fire, short circuit, etc) This means that an protocol which requires such a message will leave the clients in a corrupt state if the server goes down in an uncontrolled manner.
So losing a connection is really a function of a failed request to the server. The client will still typically be running until it makes the next attempt to do something.
Likewise, opening a connection to a server often does nothing most of the time too. To validate that you really have a working connection to a server, you must ask it for some data and get a reply. Most protocols do this automatically to simplify the logic; but, if you ever write your own service, if you don't ask for data from the server, even if the API says you have a good "connection", you might not. The API can report back a good "connections" when you have all the stuff configured on your machine successfully. To really know if it works 100% on the other machine, you need to ask for data (and get it).
Finally servers sometimes lose their clients, but because they don't waste bandwidth chattering with clients just to see if they are there, often the servers will put a "timeout" on the client connection. Basically if the server doesn't hear from the client in 10 minutes (or the configured value) then it closes the cached connection information for the client (recreating the connection information as necessary if the client comes back).
From your description it is not clear which of the scenarios you might be seeing, but hopefully this general knowledge will help you understand why after closing one side of a connection, the other side of a connection might still think it is open for a while.
There are ways to configure the network connection to report closures more immediately, but I would avoid using them, unless you are willing to lose a lot of your network bandwidth to keep-alive messages and don't want your servers to respond as quickly as they could.
I am using Jersey 1.4, the ApacheHttpClient, and the Apache MultiThreadedHttpConnectionManager class to manage connections. For the HttpConnectionManager, I set staleCheckingEnabled to true, maxConnectionsPerHost to 1000 and maxTotalConnections to 1000. Everything else is default. We are running in Tomcat and making connections out to multiple external hosts using the Jersey client.
I have noticed that after after a short period of time I will begin to see sockets in a CLOSE_WAIT state that are associated with the Tomcat process. Some monitoring with tcpdump shows that the external hosts appear to be closing the connection after some time but it's not getting closed on our end. Usually there is some data in the socket read queue, often 24 bytes. The connections are using https and the data seems to be encrypted so I'm not sure what it is.
I have checked to be sure that the ClientRequest objects that get created are closed. The sockets in CLOSE_WAIT do seem to get recycled and we're not running out of any resources, at least at this time. I'm not sure what's happening on the external servers.
My question is, is this normal and should I be concerned?
Thanks,
John
This is likely to be a device such as the firewall or the remote server timing out the TCP session. You can analyze packet captures of HTTPS using Wireshark as described on their SSL page:
http://wiki.wireshark.org/SSL
The staleCheckingEnabled flag only issues the check when you go to actually use the connection so you aren't using network resources (TCP sessions) when they aren't needed.