AWS ELB servlet client disconnection detection


Disclaimer: there is a lot of information on similar topics. In our case everything works as expected without AWS ELB (Elastic Load Balancer), i.e. when the client drops, ServletOutputStream.flush() throws an IOException.
Setup: we have an instance running Tomcat 7 (Java 1.7) behind an ELB (HTTPS:443 -> HTTP:8080). The servlet streams data to the client over a long-lived HTTP connection.
Problem: when the client disconnects, the server keeps streaming data, i.e. ServletOutputStream.flush() or .write() does not throw an IOException. The ELB appears to buffer the connection (we can see this with the IPTraf monitor), so from the Tomcat side it looks as if the client is still there. Without the ELB, the IOException is thrown properly, so the servlet can stop streaming. We have disabled connection draining and reduced the connection timeout to 1 sec; we also reduced all timeouts on Tomcat's HTTP Connector, including KeepAlive, to just a few seconds. Nothing helps.
Question: is there anything we can do on the ELB configuration / Tomcat / Java side to allow disconnection detection in this setup?

We had the same kind of problem (but in .NET with IIS).
We seem to have solved it by switching from the Classic ELB to the Application ELB. Writing to the output stream of a closed connection now throws an exception, whereas on the Classic ELB it did not.
Hope this helps anyone running into the same problem.
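
For reference, the behaviour the question relies on when no load balancer is involved can be reproduced with plain sockets. The sketch below is only an illustration under assumptions: it uses java.net sockets on the loopback interface instead of a servlet container, and the class and method names are made up for this example. It shows that a write to a dropped peer fails with an IOException, usually not on the very first write but shortly after.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of Tomcat or the servlet API.
public class DisconnectDetection {

    /**
     * Streams small chunks to a client that has already gone away and
     * returns how many writes succeeded before an IOException surfaced,
     * or maxWrites if no failure was observed.
     */
    public static int writesBeforeFailure(int maxWrites) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // The "client": connects, then drops the connection without reading.
            Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
            try (Socket peer = server.accept()) {
                client.close();
                OutputStream out = peer.getOutputStream();
                byte[] chunk = new byte[1024];
                for (int i = 0; i < maxWrites; i++) {
                    try {
                        out.write(chunk);
                        out.flush();
                    } catch (IOException e) {
                        // Same kind of signal the servlet gets from write()/flush().
                        return i;
                    }
                    Thread.sleep(20); // give the peer's TCP reset time to arrive
                }
                return maxWrites;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("writes before IOException: " + writesBeforeFailure(500));
    }
}
```

The failure only appears because the writer's TCP connection ends at the client itself. If an intermediary holds its own connection to the server and keeps accepting bytes, as the question observes with the ELB, the writes keep succeeding.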

Related

Simulate an Http client disconnection before the server reply

The problem:
I am seeing some strange behaviour from a Jetty server (REST over HTTPS) when some client connections are closed (client-side) before the server has had time to reply. Normally a web server/application server expects and handles this well, but in one specific instance something breaks the server, which then stops replying.
I am trying to reproduce the issue programmatically and locally by opening a client connection and closing it before the server has had time to reply. I do not have much experience with a situation like this; the clients I normally write are not expected to die immediately.
I do not mind which language/application I use to replicate my case; it can be a Java program, a netcat command, telnet, dotnetcore... The only constraint is that it should run on a Kubernetes pod, if possible.
I have tried using Java to open a socket and close it immediately, and creating an HTTP client and stopping it immediately after sending a request, but with no luck so far.
I am also looking at netcat, but I fear it is too low-level for a REST request.
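
One way to approach this from Java is to skip the HTTP client entirely, write the request bytes on a raw socket, and close before reading anything. The following is a minimal sketch under assumptions: the class name, the "/slow" path and the command-line arguments are invented for illustration, and it speaks plain HTTP. For an HTTPS endpoint the socket would have to come from an SSLSocketFactory instead, which is not shown here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for reproducing an early client disconnect.
public class AbortingClient {

    /**
     * Sends a GET request and closes the socket without reading the reply.
     * With reset=true, SO_LINGER 0 makes close() abort the connection
     * (TCP RST) instead of closing it gracefully (TCP FIN).
     */
    public static void sendAndDrop(String host, int port, String path, boolean reset) {
        try (Socket socket = new Socket(host, port)) {
            if (reset) {
                socket.setSoLinger(true, 0);
            }
            String request = "GET " + path + " HTTP/1.1\r\n"
                    + "Host: " + host + "\r\n"
                    + "Accept: application/json\r\n"
                    + "\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // Leaving the try block closes the socket; the reply is never read.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Local smoke test: returns the request line a throwaway server received. */
    public static String requestLineSeenByLocalServer() {
        try (ServerSocket server = new ServerSocket(0)) {
            sendAndDrop("127.0.0.1", server.getLocalPort(), "/slow", false);
            try (Socket accepted = server.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(
                         accepted.getInputStream(), StandardCharsets.US_ASCII))) {
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 3) {
            // Usage (assumed): java AbortingClient <host> <port> <path>
            sendAndDrop(args[0], Integer.parseInt(args[1]), args[2], true);
        } else {
            System.out.println(requestLineSeenByLocalServer());
        }
    }
}
```

Running both the graceful and the reset variant is probably worthwhile, since a server can react differently to a FIN and to an RST.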

DropWizard: Why is it ignoring a connection if flooded with 100 requests?

We have Gatling tests (mis)used for integration testing.
One of the things it does is fire 100 HTTP requests at the same instant at the tested service, which is based on DropWizard 1.3.5.
Before upgrading from DropWizard 0.9.5, this worked fine.
But now these requests end up with a ConnectException on the client side, and I don't see any errors on the server side: the requests are silently ignored by DropWizard. I also looked at the pools over JMX and don't see them hanging anywhere. It looks as if the server lost track of the socket and never sent a response or hung up.
java.net.ConnectException: Connection timed out: somehost.mycompany.net/10.103.66.45:9000
I have looked at the DropWizard config reference. I don't see any fine tuning of what's rejected and when. Here is my current config:
server:
  maxThreads: 64
  maxQueuedRequests: 1024
  adminMaxThreads: 8
  ...
database:
  url: jdbc:postgresql://localhost:5432/mydb
  poolInitialSize: 20
  poolMaxSize: 100
I am including the DB pool size because it might be that it gathers the resources for the request handling IoC, but 100 should cover all Dw threads.
I would expect the extra requests to wait in the queue (which AFAIK should be handled on the OS side, since the new IO API is used) and taken by the 64 threads.
What's interesting is that before I set the maxThreads to 64, the default of 1024 applied; and then, sometimes the tests passed, sometimes not.
My question is:
Does DropWizard do this intentionally (a DoS detection)?
If yes, how do I configure it?
If not, well, then that's a bug.

java.net.connectexception when sending string request

In my batch application I send requests across a network using a web service and Java. After running about 30000 requests and receiving the responses, the program throws a java.net.ConnectException (connection timed out).
I am using WildFly, along with some Java code in the middle to configure the requests (Strings) before sending them across the network.
From my research, one possible reason is a firewall blocking my access, but that can't be true since 90% of the requests had already run.
I have also read that I could have overloaded the server, although I'm not sure exactly what that means.

You have filled up the server's listen backlog queue. This is caused by creating new connections faster than the server can accept them. You should look into connection pooling at the client, and handling multiple requests per connection at the server.
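
As a rough illustration of the client-side half of that advice, the sketch below sends many requests through one shared client so that connections can be reused instead of opened per request. It is an assumption-laden demo, not the asker's setup: it uses the JDK's java.net.http.HttpClient (Java 11+) and the JDK's built-in com.sun.net.httpserver.HttpServer as a stand-in server, and the class name and "/echo" path are invented.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo of reusing connections across many requests.
public class ReuseConnections {

    /**
     * Sends the given number of POSTs through one shared HttpClient and
     * returns how many distinct client ports (TCP connections) the
     * stand-in server saw.
     */
    public static int distinctConnections(int requests) {
        Set<Integer> clientPorts = ConcurrentHashMap.newKeySet();
        HttpServer server;
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext("/echo", exchange -> {
            clientPorts.add(exchange.getRemoteAddress().getPort());
            byte[] body = exchange.getRequestBody().readAllBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        try {
            // One client for the whole batch; it keeps idle connections for reuse.
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .build();
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/echo");
            for (int i = 0; i < requests; i++) {
                HttpRequest request = HttpRequest.newBuilder(uri)
                        .POST(HttpRequest.BodyPublishers.ofString("request-" + i))
                        .build();
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                if (response.statusCode() != 200) {
                    throw new IllegalStateException("unexpected status " + response.statusCode());
                }
            }
            return clientPorts.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        System.out.println("connections used for 50 requests: " + distinctConnections(50));
    }
}
```

The number of connections should come out well below the number of requests. Creating a new client or socket per request would instead open one connection per request, which is the pattern that can fill the listen backlog.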

Jetty interrupting connection

I have an application with a long-running request that returns a stream (a huge JSON).
The application is written in Java and I'm using Jetty as the server.
The problem is that after receiving data for a while, it stops. I ran some tests and sometimes got 10, 15, 40% of the data; it varies. Jetty interrupts the connection at some point. I isolated a single machine with no other requests and the same thing happens.
I do not know how to debug this, because I don't see any error. It just interrupts.
Any help is appreciated.

What version of Jetty?
Is this on a slow connection perhaps?
If so, you are likely encountering idle timeouts.
It can happen like this:
Server has a large amount of data to send, it sends until there is TCP backpressure from the client telling it "woah!".
So the Server waits until the TCP layer says it's OK to start sending again.
The client is slow.
This wait is longer than the configured Idle Timeout for that connector.
The server closes the connection.
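
The sequence above can be modelled with plain NIO channels. This is a sketch of the mechanism only, not Jetty's implementation: the class, the Outcome type and the timeout values are assumptions made for the example. A client connects and never reads, the server writes until the socket buffers are full (write() returns 0), and then gives up once no progress has been made for the idle timeout.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical model of backpressure followed by an idle timeout.
public class IdleTimeoutModel {

    /** Result of one streaming attempt. */
    public static final class Outcome {
        public final long bytesAccepted;
        public final boolean closedByIdleTimeout;

        Outcome(long bytesAccepted, boolean closedByIdleTimeout) {
            this.bytesAccepted = bytesAccepted;
            this.closedByIdleTimeout = closedByIdleTimeout;
        }
    }

    public static Outcome streamToStalledClient(long idleTimeoutMillis, long maxBytes) {
        try (ServerSocketChannel listener = ServerSocketChannel.open()) {
            listener.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            // The slow client: connects but never reads.
            try (SocketChannel slowClient = SocketChannel.open(listener.getLocalAddress());
                 SocketChannel server = listener.accept()) {
                server.configureBlocking(false);
                ByteBuffer chunk = ByteBuffer.allocate(64 * 1024);
                long sent = 0;
                long lastProgress = System.nanoTime();
                while (sent < maxBytes) {
                    chunk.clear(); // payload content is irrelevant here
                    int n = server.write(chunk);
                    if (n > 0) {
                        sent += n;
                        lastProgress = System.nanoTime();
                    } else if ((System.nanoTime() - lastProgress) / 1_000_000 >= idleTimeoutMillis) {
                        // No progress for too long: the server side gives up.
                        // Leaving the try block closes the connection.
                        return new Outcome(sent, true);
                    } else {
                        Thread.sleep(10); // backpressure: wait for buffer space
                    }
                }
                return new Outcome(sent, false);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Outcome o = streamToStalledClient(200, 1L << 30);
        System.out.println(o.bytesAccepted + " bytes accepted, idle timeout hit: "
                + o.closedByIdleTimeout);
    }
}
```

From the client's point of view this looks like the symptom in the question: part of the data arrives, then the connection is cut without an application-level error.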

Sockets in CLOSE_WAIT from Jersey Client

I am using Jersey 1.4, the ApacheHttpClient, and the Apache MultiThreadedHttpConnectionManager class to manage connections. For the HttpConnectionManager, I set staleCheckingEnabled to true, maxConnectionsPerHost to 1000 and maxTotalConnections to 1000. Everything else is default. We are running in Tomcat and making connections out to multiple external hosts using the Jersey client.
I have noticed that after a short period of time I begin to see sockets in a CLOSE_WAIT state that are associated with the Tomcat process. Some monitoring with tcpdump shows that the external hosts appear to be closing the connection after some time, but it is not getting closed on our end. Usually there is some data in the socket read queue, often 24 bytes. The connections use HTTPS and the data seems to be encrypted, so I'm not sure what it is.
I have checked to be sure that the ClientRequest objects that get created are closed. The sockets in CLOSE_WAIT do seem to get recycled and we're not running out of any resources, at least at this time. I'm not sure what's happening on the external servers.
My question is, is this normal and should I be concerned?
Thanks,
John

This is likely to be a device such as the firewall or the remote server timing out the TCP session. You can analyze packet captures of HTTPS using Wireshark as described on their SSL page:
http://wiki.wireshark.org/SSL
The staleCheckingEnabled flag only issues the check when you go to actually use the connection so you aren't using network resources (TCP sessions) when they aren't needed.
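
To make the idea of an on-demand stale check concrete, here is a minimal sketch in the same spirit. It is not the library's actual code, and the class and method names are invented: just before a pooled socket is reused, it attempts a read with a very short timeout. End-of-stream means the remote side has closed (locally the socket sits in CLOSE_WAIT until it is closed), while a timeout means the connection is merely idle.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical stale check, performed only when a connection is about to be reused.
public class StaleCheck {

    /** Returns true if the peer has closed; 'in' must wrap socket.getInputStream(). */
    public static boolean isStale(Socket socket, BufferedInputStream in) throws IOException {
        int previousTimeout = socket.getSoTimeout();
        try {
            socket.setSoTimeout(1);   // wait at most about 1 ms for data or EOF
            in.mark(1);
            int b = in.read();
            if (b == -1) {
                return true;          // EOF: remote end closed the connection
            }
            in.reset();               // real data is pending: put the byte back
            return false;
        } catch (SocketTimeoutException e) {
            return false;             // nothing to read: idle but still open
        } finally {
            socket.setSoTimeout(previousTimeout);
        }
    }

    /** Loopback demo: optionally lets the remote side close before checking. */
    public static boolean staleAfterPeerClose(boolean peerCloses) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket pooled = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket remote = server.accept()) {
            BufferedInputStream in = new BufferedInputStream(pooled.getInputStream());
            if (peerCloses) {
                remote.close();
                Thread.sleep(200);    // let the FIN reach the pooled socket
            }
            return isStale(pooled, in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("peer closed -> stale: " + staleAfterPeerClose(true));
        System.out.println("peer open   -> stale: " + staleAfterPeerClose(false));
    }
}
```

Because the check runs only at reuse time, a connection the remote side closed while it sat idle in the pool stays in CLOSE_WAIT until something touches it again, which is consistent with the sockets being recycled later.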
