Multiple complete HTTP requests stuck in TCP CLOSE_WAIT state - java

I have a Java and Tomcat-based server application which initiates many outbound HTTP requests to other web sites. We use the latest versions of Jakarta's HttpCore/HttpClient libraries.
At some point the server locks up because all its worker threads are stuck trying to close completed HTTP connections. Running 'lsof' reveals a bunch of sockets stuck in the TCP CLOSE_WAIT state.
This doesn't happen for all, or even most connections. In fact, I saw it before and resolved it by making sure to set the Connection: Close response header. So that makes me think it may be bad behavior of remote servers.
It may have come up again since I moved the app to a totally new service provider -- different OS, network situation.
But, I am still at a loss as to what I could do, if anything, to work around this. Some poking around on the internet didn't turn up anything I'm not already doing. Just thought I'd ask if anyone has seen and solved this?

I'm not sure how much you know about TCP. A TCP socket ends up in the CLOSE_WAIT state when it is in the ESTABLISHED state and receives a FIN from the peer. CLOSE_WAIT means the stack is waiting for a close from the application layer - in this case, that means it's waiting for close() to be called on the socket.
So, my guess would be that calling close() in the worker threads will fix the problem. Or are you already doing this?
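To illustrate (a minimal sketch with a placeholder host and port, not your actual code): whatever else happens in the worker thread, close() has to run, e.g. in a finally block.

    import java.io.IOException;
    import java.net.Socket;

    // Whatever happens while reading, close() must run, or the socket sits in
    // CLOSE_WAIT after the peer's FIN until the JVM exits.
    static void fetchAndClose() throws IOException {
        Socket socket = new Socket("example.com", 80);   // placeholder host/port
        try {
            // read from socket.getInputStream() / write to socket.getOutputStream()
        } finally {
            socket.close();   // this is the application-layer close that TCP is waiting for
        }
    }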

I believe I might have found a solution -- at least, these changes together seem to have made the problem go away.
Call HttpEntity.consumeContent() after you're done reading from its InputStream, to make sure the content is fully consumed and the framework releases the connection.
For good measure, I call ClientConnectionManager.closeExpiredConnections() and ClientConnectionManager.closeIdleConnections(0L, TimeUnit.MILLISECONDS) just after this, to push the framework to release anything it's finished with right away. This might be overkill.
OK, the above didn't quite do the trick. In the end, I had to use a SingleClientConnManager, and create and shut it down for every request. That did the trick; I assume this is what's needed to make sure the connection is actually closed.
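For reference, this is roughly what that ends up looking like with the HttpClient 4.0 API -- a sketch of the approach rather than the exact code from the app, with the URL passed in as a parameter:

    import org.apache.http.HttpEntity;
    import org.apache.http.HttpResponse;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.conn.ClientConnectionManager;
    import org.apache.http.conn.scheme.PlainSocketFactory;
    import org.apache.http.conn.scheme.Scheme;
    import org.apache.http.conn.scheme.SchemeRegistry;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.impl.conn.SingleClientConnManager;
    import org.apache.http.params.BasicHttpParams;
    import org.apache.http.params.HttpParams;

    // One connection manager per request, shut down unconditionally afterwards,
    // so the underlying socket is always closed.
    static void fetchOnce(String url) throws java.io.IOException {
        SchemeRegistry schemes = new SchemeRegistry();
        schemes.register(new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));
        HttpParams params = new BasicHttpParams();
        ClientConnectionManager connManager = new SingleClientConnManager(params, schemes);
        try {
            HttpClient client = new DefaultHttpClient(connManager, params);
            HttpResponse response = client.execute(new HttpGet(url));
            HttpEntity entity = response.getEntity();
            if (entity != null) {
                entity.consumeContent();   // make sure the content is fully read
            }
        } finally {
            connManager.shutdown();        // closes the underlying connection outright
        }
    }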

Related

Is it a good idea to destroy sockets after a single use?

I've been looking into making a simple Sockets-based game in Java, and read in multiple places that client sockets are destroyed after a single exchange. Is this good practice for continued connections? The server needs to maintain a connection with a client (i.e. not using socket.accept() every time it wants to tell a client about something), but can't wait every time for the client's response. I already have the server/client running in separate threads, but won't destroying the socket after every exchange mean re-acquiring (or failing to re-acquire) a connection to that client? I've seen so many conflicting websites about sockets in Java and how they should be implemented.
There are no hard and fast rules, but it does depend somewhat on what data rates you want to achieve.
For example, YouTube is a streaming video service, but the video data is delivered by the client fetching batches of video data over HTTPS. Inefficient, yes, but very easy to program for. There are lots of reasons to use HTTPS for an application like YouTube (firewalls, etc.), but ultimate power saving and network performance are not among them. The "proper" way would be to use a protocol like RTP, which uses UDP to deliver small packets of data that can then be reassembled into order; you also have to deal with missing frames at the codec level, and so on. Much less network traffic and friendlier to bandwidth-constrained network links, but significantly more difficult to deal with when traversing firewalls, in client software, etc.
So if your game is sending modest amounts of data, the only thing wrong with setting up and tearing down a whole socket connection for every message is the nagging feeling you yourself will have that it is somehow not the most efficient solution.
It sounds, though, like you have a conflict between the need to communicate between client and server and the need to process something else while waiting for the communication to complete. Here you're getting into asynchronous I/O territory. To make that easy, I strongly suggest you take a look at ZeroMQ -- it will make everything a whole lot simpler.
and read in multiple places that client sockets are destroyed after a single exchange.
Only in the places where that actually happens. There are numerous contexts where it doesn't, the outstanding example being HTTP, where every effort is made to reuse connections.
Is this good practice for continued connections?
The question is a contradiction in terms. A continued connection is a connection that isn't closed. A closed connection can't be continued.
The server needs to maintain a connection with a client (i.e. not using socket.accept() every time it wants to tell a client about something), but can't wait every time for the client's response.
The word you are groping for here is 'session'.
I already have the server/client running in separate threads, but won't destroying the socket after every exchange mean re-acquiring (or failing to re-acquire) a connection to that client?
Yes.
I've seen so many conflicting websites about sockets in Java and how they should be implemented.
You should use a connection pool at the client; a request loop at the server that looks for multiple requests per connection; a client-side facility that closes idle connections after some idle timeout; and a read timeout at the server that closes connections on which no request has been read within the timeout.
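A minimal sketch of the server side of that (the line-per-request protocol and the 30-second timeout are just assumptions for the example):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.net.SocketTimeoutException;

    // Serve many requests per connection, and close connections that go idle.
    static void serve(int port) throws IOException {
        try (ServerSocket server = new ServerSocket(port)) {
            while (true) {
                Socket client = server.accept();
                new Thread(() -> handle(client)).start();
            }
        }
    }

    static void handle(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(new InputStreamReader(c.getInputStream()));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
            c.setSoTimeout(30_000);                       // read timeout for idle clients
            String request;
            while ((request = in.readLine()) != null) {   // request loop: many requests per connection
                out.println("ack: " + request);
            }
        } catch (SocketTimeoutException idle) {
            // no request within the timeout: drop the connection
        } catch (IOException e) {
            // client went away; nothing more to do
        }
    }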

Reliable/persistent outbound sockets needed, options?

I have a Scala application which maintains (or tries to) TCP connections to various servers for hours (possibly > 24) at a time. Each server sends a short, ~30 character message about twice a second. These messages are fed into an iteratee where they are parsed and eventually end up making state changes to a database.
If any of these connections fail for any reason, my app needs to continually try to reconnect until I specify otherwise. Any messages getting lost is Bad. I have no control over the servers I connect to, or the protocols used.
It is conceivable there would be as many as 300 of these connections at once. Not exactly a high-load scenario, so I don't think NIO is needed, though it might be nice to have? Other bits of the app are high-load.
I'm looking for some sort of socket controller / manager which can keep these connections as reliably as possible. I am running my own blocking controller now, but as I'm inexperienced with socket coding (and all the various settings, options, timeouts, etc.) I doubt it will achieve the best possible uptime. Plus I may need SSL support at some point down the line.
Would NIO offer any real advantages?
Would Netty be the best choice here? I've seen the Uptime example here, and was thinking of simply duplicating it, but being new to lower-level networking I wasn't sure if there were better options.
However I'm uncertain of the best strategies for ensuring as few packets are lost as possible, and assumed this would be a "solved" problem in one library or another.
Yup. JMS is an example.
I suppose a lot of it would come down to a timeout guessing strategy? Close and re-open a socket too early and you've lost whatever packets were en-route.
That is correct. That approach is not going to be reliable, especially if connections go up and down regularly.
A real solution involves having the other end keep track of what it has received, and letting the sender know when the connection is re-established. If that can't be done, you have no real way of controlling how much gets lost. (This is what the reliable messaging services do ...)
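Purely as an illustration of the kind of cooperation I mean -- none of these names come from any library, and the servers you connect to would have to support something like it -- the receiving end could track a sequence number and report it back on reconnect:

    // Illustrative only: assumes each message carries a sequence number assigned
    // by the sender, which is not something the asker's servers necessarily provide.
    class ReceiverState {
        private long lastProcessedSeq = -1;

        synchronized void onMessage(long seq, String payload) {
            if (seq == lastProcessedSeq + 1) {   // the next expected message
                process(payload);
                lastProcessedSeq = seq;
            }                                    // duplicates are ignored; gaps wait for a resend
        }

        // Sent to the other end right after reconnecting, so it can resend
        // everything after this sequence number.
        synchronized long resumeFrom() {
            return lastProcessedSeq;
        }

        private void process(String payload) { /* parse and update the database, etc. */ }
    }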
I have no control over the servers I connect to. So unless there's another way to adapt JMS to a generic TCP stream I don't think it will work.
Yup. And the same applies if you try to implement this by hand. The other end has to cooperate.
I guess you could construct something where you run (say) a JMS end point on each of the remote servers, and have the endpoint use UNIX domain sockets or loopback (i.e. 127.0.0.1) to talk to the server. But you still have potential for message loss.

How to diagnose leaked http connections (org.apache.http.impl.conn.tsccm.ConnPoolByRoute)

I have a multithreaded Java program that runs on Amazon's EC2. It queries and fetches data items from a vendor via HttpPost and HttpGet, using an org.apache.http.impl.client.DefaultHttpClient. Concurrently, it pushes the retrieved data items into S3 using AWS's Java SDK.
After a few days of running, I get the symptoms that normally come with http connection leaks:
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection
at org.apache.http.impl.conn.tsccm.ConnPoolByRoute.getEntryBlocking(ConnPoolByRoute.java:417)
at org.apache.http.impl.conn.tsccm.ConnPoolByRoute$1.getPoolEntry(ConnPoolByRoute.java:300)
at org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager$1.getConnection(ThreadSafeClientConnManager.java:224)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:391)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
Since both AWS and my requests to the data vendor use HTTP connections, I am not quite sure where exactly I forget to call HttpEntity.consume() or S3ObjectInputStream.close() (unless it is yet something else...).
So here is my question: are there ways to monitor org.apache.http.impl.conn.tsccm.ConnPoolByRoute so that at least I can detect when I am starting to leak connections / entities not properly consumed / HTTP streams not closed? (I have a feeling it happens only under certain conditions, e.g. when certain exceptions are thrown, bypassing the logic in my code that consumes HttpEntities, closes streams, etc.) Any idea on how to diagnose what eventually causes all my HTTP connections to fail with that ConnectionPoolTimeoutException would be most welcome. I don't feel like waiting 4+ days between attempts to fix the root cause of the problem.
If you're using PoolingClientConnectionManager, note that it has the methods getTotalStats() and getStats(final HttpRoute route), which will give you a PoolStats object with the data you're looking to monitor.
Just fetch the ConnectionManager from your httpclient:
PoolingClientConnectionManager poolManager = (PoolingClientConnectionManager) httpClient.getConnectionManager();
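For example, you could poll those stats periodically; a sketch (the 30-second interval is arbitrary):

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import org.apache.http.impl.conn.PoolingClientConnectionManager;
    import org.apache.http.pool.PoolStats;

    // Log the pool state every 30 seconds. If "leased" keeps climbing toward
    // "max" and never comes back down, something is not releasing connections.
    static void monitor(final PoolingClientConnectionManager poolManager) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(new Runnable() {
            public void run() {
                PoolStats stats = poolManager.getTotalStats();
                System.out.printf("leased=%d pending=%d available=%d max=%d%n",
                        stats.getLeased(), stats.getPending(), stats.getAvailable(), stats.getMax());
            }
        }, 0, 30, TimeUnit.SECONDS);
    }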
If you can access the org.apache.http.impl.conn.tsccm.ConnPoolByRoute, then set its connTTL to a low enough value so that its WaitingThreadAborter will eventually terminate a connection. It will show a nice stack trace there. The other option is to use CGLIB or some other bytecode-manipulation framework to create a proxy class wrapping org.apache.http.impl.conn.tsccm.ConnPoolByRoute. Depending on your environment it might not be that easy to set up, but it's a rather valuable tool for debugging issues like yours. (And yes, if you happen to use Spring or just plain aspects, the setup will be super easy :) )

Detecting Server Crash With RMI

I'm new to Java and RMI, but I'm trying to write my app in such a way that there are many clients connecting to a single server. So far, so good....
But when I close the server (simulating a crash or communication issue), my clients remain unaware until I make my next call to the server. It is a requirement that my clients continue to work without the server in an 'offline mode', and the sooner I know that I'm offline, the better the user experience will be.
Is there an active connection that remains open that the client can detect a problem with or something similar - or will I simply have to wait until the next call fails? I figured I could have a 'health-check' ping the server but it seemed like it might not be the best approach.
Thanks for any help
Actually, I'm just trying to learn more about RMI and CORBA myself, and I'm not as far along as you are. All I know is that those systems are also built to be less expensive, and as far as I know an active connection is an expensive thing.
I would suggest using a multicast address to which your server periodically sends some kind of "I'm still here" message, but without using TCP connections; UDP should be enough for that purpose, and it's more efficient.
I looked into this a bit when I was writing an RMI app (uni assignment) but I didn't come across any inbuilt functionality for testing whether a remote system is alive. I would just use a UDP heartbeat mechanism for this.
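A rough sketch of what I mean by a heartbeat (the group address, port, and timings are arbitrary values chosen for the example):

    import java.net.DatagramPacket;
    import java.net.InetAddress;
    import java.net.MulticastSocket;
    import java.net.SocketTimeoutException;
    import java.nio.charset.StandardCharsets;

    // Server side: announce liveness once a second on a multicast group.
    static void heartbeatSender() throws Exception {
        InetAddress group = InetAddress.getByName("230.0.0.1");
        byte[] beat = "alive".getBytes(StandardCharsets.UTF_8);
        try (MulticastSocket sender = new MulticastSocket()) {
            while (true) {
                sender.send(new DatagramPacket(beat, beat.length, group, 4446));
                Thread.sleep(1000);
            }
        }
    }

    // Client side: if no heartbeat arrives within 3 seconds, assume the server
    // is down and switch to offline mode.
    static void heartbeatListener() throws Exception {
        try (MulticastSocket receiver = new MulticastSocket(4446)) {
            receiver.joinGroup(InetAddress.getByName("230.0.0.1"));
            receiver.setSoTimeout(3000);
            byte[] buf = new byte[64];
            while (true) {
                try {
                    receiver.receive(new DatagramPacket(buf, buf.length));
                } catch (SocketTimeoutException missed) {
                    // no heartbeat: flip the client into offline mode here
                    return;
                }
            }
        }
    }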
(Untested.) Have a separate thread repeatedly make an RMI call into the server that just does a "wait X seconds" and then returns; when the server is brought down, that call should fail promptly and tell the client that the server is gone.
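A sketch of that idea; the HealthCheck interface and waitAWhile() method are hypothetical, something you would add to your own remote service rather than anything RMI provides:

    import java.rmi.Remote;
    import java.rmi.RemoteException;

    // Hypothetical remote interface: the server-side implementation of waitAWhile()
    // just sleeps for X seconds and then returns.
    interface HealthCheck extends Remote {
        void waitAWhile() throws RemoteException;
    }

    // Client side: keep one call outstanding; when it fails, the server is gone.
    static void watch(final HealthCheck healthCheck, final Runnable goOffline) {
        Thread watcher = new Thread(new Runnable() {
            public void run() {
                while (true) {
                    try {
                        healthCheck.waitAWhile();
                    } catch (RemoteException serverGone) {
                        goOffline.run();   // flip the client into offline mode
                        return;
                    }
                }
            }
        });
        watcher.setDaemon(true);
        watcher.start();
    }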

closing a socket channel asynchronously

I have a single-threaded non-blocking socket IO server written in Java using nio.
When I have finished writing to a connection, I want to close it.
Does the closing of the channel mean blocking until all buffered writes have been acknowledged by the recipient?
It would be useful to know if, when asynchronously closing, it succeeded or not, but I could live with any errors in the closing being ignored.
Is there any way to configure this, e.g. with setSoLinger() (and what settings would be appropriate?)
(A general discussion beyond Java, about how Linux and other operating systems behave in this respect, would be useful too.)
Closing in non-blocking mode is non-blocking.
You could put the channel into blocking mode, set a positive linger timeout, and close, and that would block for up to the linger timeout while the socket send buffer was being emptied, but alas Java doesn't throw an exception if the linger timeout expires, so you can't know whether all the data has gone. I reported this bug ten or more years ago and it came back 'will not fix' because of compatibility concerns. If you can wait until Java 7 comes out, I believe the NIO.2 stuff has this fixed -- I certainly requested it -- but who knows when that will be?
And even if you have all that, all you know is that the data was sent. You don't know anything about it being received or processed by the recipient application. If you need that you have to build it into your application protocol.
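For what it's worth, the blocking-close-with-linger variant looks something like this (the 10-second linger value is arbitrary, and the channel must be deregistered from any Selector before it can be switched to blocking mode):

    import java.io.IOException;
    import java.nio.channels.SocketChannel;

    static void lingeringClose(SocketChannel channel) throws IOException {
        channel.configureBlocking(true);          // only legal once the channel is no longer registered with a Selector
        channel.socket().setSoLinger(true, 10);   // close() may now block for up to 10 seconds
        channel.close();                          // blocks while the send buffer drains -- but no error if the timeout expires
    }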
I'm not sure what really happens but I know that close() includes flush() (except in PrintStream and PrintWriter...).
So my approach would be to add the connections to close to a queue and process that queue in a second thread (including error handling).
I understand that your server is single-threaded, but a second thread doesn't cost that much, the complexity of the problem is low, and the solution will be easy to understand and maintain.
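Something along these lines, as a sketch (the class and method names are made up for the example):

    import java.io.IOException;
    import java.nio.channels.SocketChannel;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // The selector thread calls closeLater(channel) instead of channel.close();
    // this thread does the actual close and handles any errors, so the event
    // loop is never held up.
    class AsyncCloser extends Thread {
        private final BlockingQueue<SocketChannel> toClose = new LinkedBlockingQueue<SocketChannel>();

        void closeLater(SocketChannel channel) {
            toClose.add(channel);
        }

        public void run() {
            while (true) {
                try {
                    toClose.take().close();
                } catch (InterruptedException stop) {
                    return;
                } catch (IOException e) {
                    // log and move on; the connection is gone either way
                }
            }
        }
    }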
