In my batch application, I am sending requests across a network using a web service and Java. After running about 30,000 requests and receiving their responses, the program throws a java.net.ConnectException: Connection timed out.
I am using WildFly, along with some Java code in the middle that prepares the requests (Strings) before sending them across the network.
After researching possible causes, I found that a firewall could be blocking my access, but that can't be the case here since 90% of the requests have already gone through.
I've also seen it suggested that I could have overloaded the server, although I'm not sure what that means exactly.
You have filled up the server's listen backlog queue. This is caused by creating new connections faster than the server can accept them. You should look into connection pooling at the client, and handling multiple requests per connection at the server.
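For context, the listen backlog is the OS-level queue of completed TCP connections waiting to be accepted by the application. In plain Java it is the second constructor argument of java.net.ServerSocket; a minimal sketch (not your actual server code) of where that limit lives:

import java.net.ServerSocket;
import java.net.Socket;

public class BacklogDemo {
    public static void main(String[] args) throws Exception {
        // The second argument is the listen backlog: how many completed
        // connections the OS queues before new ones are refused or time out.
        ServerSocket server = new ServerSocket(8080, 50);
        while (true) {
            Socket client = server.accept(); // removes one entry from the backlog
            // Handle the request (ideally on a worker thread) so that
            // accept() keeps draining the queue faster than clients fill it.
            client.close();
        }
    }
}

If clients open connections faster than the accept loop drains that queue, they eventually see exactly the connection-timed-out error described above; pooling and reusing connections on the client side keeps the queue from filling.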
Disclaimer: there is a lot of information on similar topics. In our case it works as expected without AWS ELB (Elastic Load Balancer), i.e. when the client drops, ServletOutputStream.flush() throws IOException.
Setup: we have an instance running Tomcat 7 (Java 1.7) behind an ELB (HTTPS:443 -> HTTP:8080). The servlet streams data to the client over a long-lived HTTP connection.
Problem: when the client disconnects, the server keeps streaming data, i.e. ServletOutputStream.flush() or .write() does not throw IOException. The ELB effectively "buffers" the connection (we can see it with the IPTraf monitor), so from the Tomcat side it appears as if the client is still there. Without the ELB, IOException is thrown properly, so the servlet can stop streaming. We have disabled connection draining, reduced the connection timeout to 1 second, and reduced all timeouts on Tomcat's HTTP Connector, including keep-alive, to just a few seconds. Nothing helps.
Question: is there anything we can do with the ELB configuration / Tomcat / Java side to allow disconnection detection in this setup?
We had the same kind of problem (but in .NET with IIS).
We seem to have solved it by switching from the Classic ELB to the Application Load Balancer. Now writing to the output stream of a closed connection throws an exception, whereas before (on the Classic ELB) it didn't.
Hope this helps anyone running into the same problem
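For reference, the streaming pattern affected here relies on write()/flush() failing once the client is gone. A minimal sketch (names like nextChunk() are illustrative placeholders, not the original code):

// Inside the servlet's service method; nextChunk() is a hypothetical data source.
ServletOutputStream out = response.getOutputStream();
try {
    byte[] chunk;
    while ((chunk = nextChunk()) != null) {
        out.write(chunk);
        out.flush(); // behind the Classic ELB this kept succeeding after disconnect;
                     // behind an Application Load Balancer it throws IOException
    }
} catch (IOException clientGone) {
    // Client disconnected: stop streaming and release resources.
}

The switch to the Application Load Balancer matters precisely because it propagates the client's disconnect, so the IOException path above is actually reached.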
I have a RESTful service that works very fast. I am testing it on localhost. The client uses Spring's RestTemplate. I started with a naive approach:
RestTemplate restTemplate = new RestTemplate(Collections.singletonList(new GsonHttpMessageConverter()));
Result result = restTemplate.postForObject(url, payload, Result.class);
When I make a lot of these requests, I am getting the following exception:
Caused by: org.springframework.web.client.ResourceAccessException: I/O error on POST request for "http://localhost:8080/myservice":No buffer space available (maximum connections reached?): connect; nested exception is java.net.SocketException: No buffer space available (maximum connections reached?): connect
This is caused by connections not being closed and lingering in the TIME_WAIT state. The exception starts once the ephemeral ports are exhausted; execution then stalls until ports are freed again, so I see bursts of peak performance separated by long pauses. The rate I am getting is almost what I need, but of course these TIME_WAIT connections are not good. I tested on both Linux (Ubuntu 14) and Windows 7 with similar results; the exhaustion just occurs at different points because the ephemeral port ranges differ.
To fix this, I tried using an HttpClient built with HttpClientBuilder from the Apache HttpComponents library.
RestTemplate restTemplate = new RestTemplate(Collections.singletonList(new GsonHttpMessageConverter()));
// Pooled client: connections are kept alive and reused instead of opened per request
HttpClient httpClient = HttpClientBuilder.create()
        .setMaxConnTotal(TOTAL)         // upper bound on connections across all routes
        .setMaxConnPerRoute(PER_ROUTE)  // upper bound on connections per host:port
        .build();
restTemplate.setRequestFactory(new HttpComponentsClientHttpRequestFactory(httpClient));
Result result = restTemplate.postForObject(url, payload, Result.class);
With this client, I see no exceptions. The client is now using only a very limited number of ephemeral ports. But whatever settings I use (TOTAL and PER_ROUTE), I can't get the performance I need.
Using the netstat command, I see that not many connections are actually open to the server. I tried setting the limits to several thousand, but the client never seems to use that many.
Is there anything I can do to improve the performance, without opening too many connections?
UPDATE: I've tried setting the number of total and per-route connections to 5000 and 2500, but the client still does not appear to create more than about a hundred (judging from netstat -n | wc -l). The REST service is implemented with JAX-RS and runs on Jetty.
UPDATE 2: After tuning the server with some memory settings, I am getting really good throughput. The naive approach is still slightly faster, but I think that's just the small overhead of pooling on the client side.
Actually Spring Boot is not leaking connections. What you're seeing here is standard behavior of the Linux kernel (and every major OS). All sockets that are closed from the machine go to a TIME_WAIT state for some duration of time. This is to prevent the next socket that uses that ephemeral port from receiving packets that were actually intended for the previous socket on that port. The difference you're seeing between the two is a result of the connection pooling approaches each one takes.
More specifically, RestTemplate does not use connection pooling by default. This means every REST call opens a new local ephemeral port and a new connection to the server. If your service is very fast, it will blow through its available local port range in no time at all. With the Apache HttpClient, you are taking advantage of connection pooling, which prevents the problem you described. However, given that your service responds faster than the Linux kernel takes sockets out of TIME_WAIT, connection pooling will make your client slower no matter what you do (if it didn't slow anything down, you'd run out of local ephemeral ports again).
While it's possible to enable TIME_WAIT socket reuse in the Linux kernel, it can be dangerous: packets can be delayed, and ephemeral ports could receive stray packets they don't understand, causing all kinds of problems. The solution here is to use connection pooling, as you do in the second example, with limits high enough to get close to the performance you're looking for.
To tune your connection pool, you'll want to tweak the maxConnPerRoute and maxConnTotal parameters. maxConnPerRoute limits the number of connections made to a single IP:port pair, and maxConnTotal limits the total number of connections that will ever be open at once. In your case, since all requests appear to go to the same location, you could set them to the same (high) value.
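As an illustrative sketch (the 5000 figure is just the number from your update, not a recommendation), the same limits can also be set through PoolingHttpClientConnectionManager, which gives you a handle on the pool itself:

import org.apache.http.client.HttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;

PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
cm.setMaxTotal(5000);           // equivalent to setMaxConnTotal on the builder
cm.setDefaultMaxPerRoute(5000); // equivalent to setMaxConnPerRoute
HttpClient httpClient = HttpClientBuilder.create()
        .setConnectionManager(cm)
        .build();
// restTemplate as constructed in the question above
restTemplate.setRequestFactory(new HttpComponentsClientHttpRequestFactory(httpClient));

With only one route, the effective limit is min(maxTotal, defaultMaxPerRoute), which is why setting the two values to the same number is reasonable here.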
I am encountering an interesting issue wherein the TCP connection for an HTTP 1.1 POST request is being closed immediately following the request (i.e., before the response can be sent by the server).
A few details about the test environment:
Client - Windows XP, Internet Explorer 8, Flash player 12.
Server - Java 7
Prior to the aforementioned behaviour, we have several long-standing TCP connections, each reused for multiple HTTP requests: we open a long poll, and when that poll completes we open another. We see several hours of well-behaved, reused TCP connections opening new polls as the previous ones close.
Eventually, sometimes after 12 or more hours of normal behaviour, a poll on a long-standing connection will send the HTTP POST and immediately send a TCP FIN, before the server can write the response.
The client behaviour is to keep a poll open at all times, so at this point we try to open a new poll.
A new TCP connection is then opened by the client sending another HTTP POST, with the same behaviour; the request is sent, followed by a FIN from the client.
This behaviour can continue for several minutes, until the server can finally respond and kill the client. (The server detects the initial closed connection by encountering an IOException; the next time it can communicate with the client, its response tells the client to close.)
Edit: We are opening connections only through the Flash client and are not delving into low-level TCP code. While Steffen Ullrich is correct that a single-sided shutdown is possible and should be dealt with, what is not clear is why a single-sided shutdown occurs at this (seemingly arbitrary) point. We are not calling close from the application to instigate this behaviour.
My questions are:
Under what circumstances would a TCP connection for an HTTP request be terminated before the response is received? I understand this is bad behaviour, and an incomplete HTTP transaction, so presumably something lower down is terminating the connection for an unknown reason.
Are there any diagnostics that could be used to help understand the problem? (We are currently monitoring server and client side activity with Wireshark.)
Notes:
In Wireshark, the behaviour we see is:
1. Long-standing TCP connection (#1) serving multiple HTTP requests.
2. An HTTP request is made over #1.
3. The server ACKs the request.
4. The client sends a FIN to close connection #1. The server responds with FIN,ACK. (The expected traffic would be the server sending the HTTP response.) Around this point the server experiences an IOException.
5. The client opens connection #2 and sends an HTTP request.
6. Behaviour continues as from step 3.
Sending a request immediately followed by a FIN is not a connection close, but a shutdown of the write side, i.e. shutdown(socket, SHUT_WR). The client tells the server this way that it will not send any more data, but it might still receive data. It's not that uncommon.
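In Java, the write-side shutdown corresponds to Socket.shutdownOutput(). A small sketch of a deliberate half-close (requestBytes is a placeholder for an already-built HTTP request):

import java.net.Socket;

Socket socket = new Socket("example.com", 80);
socket.getOutputStream().write(requestBytes); // send the complete request
socket.getOutputStream().flush();
socket.shutdownOutput(); // sends FIN: "no more data from me"; the read side stays open
int b;
while ((b = socket.getInputStream().read()) != -1) {
    // the response can still be read after the half-close
}
socket.close();

A well-behaved server should treat this as end-of-request rather than an abort and still deliver the response; the open question in this thread is why the client stack issues the half-close at that seemingly arbitrary point.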
I have an application that serves a long-running request returning a stream (a huge JSON document).
The application is written in Java and I'm using Jetty as the server.
The problem is that after receiving data for a while, the stream stops. I ran some tests and the cutoff point varies: sometimes I get 10%, 15%, 40% of the data; it doesn't seem to matter. Jetty interrupts the connection at some point. I even isolated a single machine with no other requests, and the same thing happens.
I don't know how to debug this because I don't see any error; the stream is simply interrupted.
Any help is appreciated.
What version of Jetty?
Is this on a slow connection perhaps?
If so, you are likely encountering idle timeouts.
It can happen like this:
1. The server has a large amount of data to send; it sends until TCP backpressure from the client tells it "woah!".
2. The server waits until the TCP layer says it's OK to start sending again.
3. The client is slow, so this wait is longer than the configured idle timeout for that connector.
4. The server closes the connection.
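If that is what's happening, the connector-level idle timeout is the knob to check. Assuming Jetty 9 (the API differs in older versions), a sketch of raising it when configuring the server programmatically:

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;

Server server = new Server();
ServerConnector connector = new ServerConnector(server);
connector.setPort(8080);
connector.setIdleTimeout(300000); // milliseconds; raise it to tolerate slow clients
server.addConnector(connector);
server.start();

Raising the timeout only masks the symptom for very slow clients, of course; it does not speed up the transfer itself.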
I am using Jersey 1.4, the ApacheHttpClient, and the Apache MultiThreadedHttpConnectionManager class to manage connections. For the HttpConnectionManager, I set staleCheckingEnabled to true, maxConnectionsPerHost to 1000 and maxTotalConnections to 1000. Everything else is default. We are running in Tomcat and making connections out to multiple external hosts using the Jersey client.
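For reference, that setup looks roughly like this with Commons HttpClient 3.x, which Jersey 1.4's ApacheHttpClient wraps (a sketch under that assumption, not the exact code):

import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.commons.httpclient.params.HttpConnectionManagerParams;

MultiThreadedHttpConnectionManager connectionManager = new MultiThreadedHttpConnectionManager();
HttpConnectionManagerParams params = connectionManager.getParams();
params.setStaleCheckingEnabled(true);         // validate pooled connections before reuse
params.setDefaultMaxConnectionsPerHost(1000);
params.setMaxTotalConnections(1000);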
I have noticed that after a short period of time I begin to see sockets in the CLOSE_WAIT state associated with the Tomcat process. Some monitoring with tcpdump shows that the external hosts appear to be closing the connection after some time, but it's not getting closed on our end. Usually there is some data in the socket read queue, often 24 bytes. The connections use HTTPS and the data seems to be encrypted, so I'm not sure what it is.
I have checked to be sure that the ClientRequest objects that get created are closed. The sockets in CLOSE_WAIT do seem to get recycled and we're not running out of any resources, at least at this time. I'm not sure what's happening on the external servers.
My question is, is this normal and should I be concerned?
Thanks,
John
This is likely to be a device such as the firewall or the remote server timing out the TCP session. You can analyze packet captures of HTTPS using Wireshark as described on their SSL page:
http://wiki.wireshark.org/SSL
The staleCheckingEnabled flag only performs the check when you actually go to use the connection, so you aren't consuming network resources (TCP sessions) when they aren't needed.