Apache HttpClient 4.3 - setting connection idle timeout - java

What's the shortest way to configure connection idle timeout on Apache HttpClient 4.3 version?
I've looked in the documentation and couldn't find anything. My goal is to reduce open connections to a minimum post server-peak.
for example in Jetty Client 8.x you can set httpClient.setIdleTimeout: http://download.eclipse.org/jetty/stable-8/apidocs/org/eclipse/jetty/client/HttpClient.html#setIdleTimeout(long)

The timeout is set in the RequestConfig so you could set the default when the HttpClientBuilder is called.
For example assuming your timeout variable is in seconds to create your custom RequestConfig you could do something like this:
RequestConfig config = RequestConfig.custom()
.setSocketTimeout(timeout * 1000)
.setConnectTimeout(timeout * 1000)
.build();
You could then build your HttpClient setting the default RequestConfig like this:
HttpClients.custom()
.setDefaultRequestConfig(config);

You can't set an idle connection timeout in the config for Apache HTTP Client. The reason is that there is a performance overhead in doing so.
The documentation clearly states why, and gives an example of an idle connection monitor implementation you can copy. Essentially this is another thread that you run to periodically call closeIdleConnections on HttpClientConnectionManager
http://hc.apache.org/httpcomponents-client-ga/tutorial/html/connmgmt.html
One of the major shortcomings of the classic blocking I/O model is that the network socket can react to I/O events only when blocked in an I/O operation. When a connection is released back to the manager, it can be kept alive however it is unable to monitor the status of the socket and react to any I/O events. If the connection gets closed on the server side, the client side connection is unable to detect the change in the connection state (and react appropriately by closing the socket on its end).
HttpClient tries to mitigate the problem by testing whether the connection is 'stale', that is no longer valid because it was closed on the server side, prior to using the connection for executing an HTTP request. The stale connection check is not 100% reliable and adds 10 to 30 ms overhead to each request execution. The only feasible solution that does not involve a one thread per socket model for idle connections is a dedicated monitor thread used to evict connections that are considered expired due to a long period of inactivity. The monitor thread can periodically call ClientConnectionManager#closeExpiredConnections() method to close all expired connections and evict closed connections from the pool. It can also optionally call ClientConnectionManager#closeIdleConnections() method to close all connections that have been idle over a given period of time.

Related

Apache HttpClient connection configuration

I am trying to setup a HttpClient through the HttpClientBuilder. I also had a look at the HttpClientConnectionManager and here the confusion started.
On the ConnectionManager or more exactly the PoolingHttpClientConnectionManager there are methods to:
close expired connections
close idle connections
When is a connection considered expired?
When is it idle?
What happens when a connection from the pool is closed? Is it ensured, that there are connections recreated when needed?
HTTP is based on TCP, which manages that packages are sent and received in the correct order and requests retransmissions if packages got lost mid way. A TCP connection is started with a TCP-Handshake consisting of SYN, SYN-ACK and ACK messages while it is ended with a FIN, ACK-FIN, and ACK series as can be seen from this image taken from Wikipedia
While HTTP is a request-response protocol, opening and closing connections is quite costly and so HTTP/1.1 allowed to reuse existing connections. With the header Connection: keep-alive i.e. you tell your client (i.e. browser) to keep the connection open to a server. A server can have litterally thousands and thousands open connection at the same time. In order to avoid draining the server's resources connection are usually timely limited. Via socket timeouts idle connections or connections with certain connection issues (broken internet access, ...) are closed after some predefined time by the server automatically.
Plenty of HTTP implementations, such as Apaches HTTP client 4.4 and beyond, check the status of a connection only when it is about to use it.
The handling of stale connections was changed in version 4.4. Previously, the code would check every connection by default before re-using it. The code now only checks the connection if the elapsed time since the last use of the connection exceeds the timeout that has been set. The default timeout is set to 2000ms (Source)
If a connection therefore might not have been used for some time the client may not have read the ACK-FIN from the server and therefore still think the connection is open when it actually got already closed by the server some time ago. Such a connection is expired and usually called half-closed. It therefore may be collected by the pool.
Note that if you send requests including a Connection: close HTTP header, the connection should be closed right after the client received the response.
The state of open connections can be checked via netstat which should be present on most modern operation systems. I recently had to check one of our HTTP clients which was managed through a third party library that did not propagate the Connection: Close header properly and therefore led to plenty of half-closed connections.
According to: https://hc.apache.org/httpcomponents-client-4.5.x/current/tutorial/html/connmgmt.html#d5e418
HttpClient tries to mitigate the problem by testing whether the
connection is 'stale', that is no longer valid because it was closed
on the server side, prior to using the connection for executing an
HTTP request. The stale connection check is not 100% reliable. The
only feasible solution that does not involve a one thread per socket
model for idle connections is a dedicated monitor thread used to evict
connections that are considered expired due to a long period of
inactivity. The monitor thread can periodically call
ClientConnectionManager#closeExpiredConnections() method to close all
expired connections and evict closed connections from the pool. It can also optionally call ClientConnectionManager#closeIdleConnections() method to close all connections that have been idle over a given period of time.
The difference between expired and idle is that an expired connection has been closed on the server side, while the idle connection isn't necessarily closed on the server side, but it has been idle over a period of time. When a connection is closed, it becomes available again in the pool to be used.

How to keep connection alive in Java 11 http client

I'm using the Java 11 HttpClient with HTTP/2 and need to keep connection alive for few minutes, but the builder does not have the option to set it. Is there a way to specify this and make client keep the connection alive for some time?
If you build a standard HttpClient e.g. using HttpClient.newHttpClient(); by default a connection pool is created. This pool keeps the connections alive by default for 1200 seconds (20 minutes).
If you want to change the keep-alive timeout you can do so using the property jdk.httpclient.keepalive.timeout. However the value is only read once when the class jdk.internal.net.http.ConnectionPool is loaded. Afterwards it can't be changed anymore.
Therefore you have to set this property for the whole JVM:
-Djdk.httpclient.keepalive.timeout=99999
Or at runtime before the ConnectionPool class has been loaded:
System.setProperty("jdk.httpclient.keepalive.timeout", "99999");
A third option is to using a file named ${java.home}/conf/net.properties and set the property in there.
Both HTTP/2 and HTTP/1.1 connections are kept alive by default. There are some exceptions when several concurrent connections are opened to the same host - then only one of them will be kept alive.

How to determine when HttpClient creates a new connection

I'm using Apache's http library with JAVA to post multiple requests to the same server. I read in the documentation of HttpClient that it keeps connections alive and reuses them by default.
Is there any way to determine when the connection goes stale and a new one is set up ?
The simplest way to do so is by turning on connection management context logging as described here
http://hc.apache.org/httpcomponents-client-4.5.x/logging.html
This should provide you with detailed info about the internal state of the connection manager and the pool of persistent connections.

Apache HttpClient: How to auto close connections by server's keep-alive time?

Apache HttpClient 4.3b2, HttpCore 4.3.
I use PoolingHttpClientConnectionManager to manage 5 connections concurrently:
PoolingHttpClientConnectionManager connectionManager;
HttpClient httpclient;
connectionManager = new PoolingHttpClientConnectionManager();
connectionManager.setDefaultMaxPerRoute(5);
httpclient = HttpClientBuilder.create().setConnectionManager(connectionManager).build();
Server have 5 seconds keep-alive time.
When server initiate close connection process it is staying in FIN_WAIT2 state until I'll execute connectionManager.shutdown() or connectionManager.closeExpiredConnections() or connectionManager.closeIdleConnections(5, TimeUnit.SECONDS) manually. Server waits FIN package. How can I automatically close connections on client side after server start closing process?
When I do requests from Chrome browser, server stay in TIME_WAIT state when it try to close connection by keep-alive (FIN_WAIT2 state changes very quickly). How can I get the same behavior with Apache HttpClient?
This problem is explained in details in HttpClient tutorial
One of the major shortcomings of the classic blocking I/O model is that the network socket can react to I/O events only when blocked in an I/O operation. When a connection is released back to the manager, it can be kept alive however it is unable to monitor the status of the socket and react to any I/O events. If the connection gets closed on the server side, the client side connection is unable to detect the change in the connection state (and react appropriately by closing the socket on its end).
If you want expired connections to get pro-actively evicted from the connection pool there is no way around running an additional thread enforcing a connection eviction policy that suits your application.
In PoolingHttpClientConnectionManager class there is a method setValidateAfterInactivity that sets period of connection inactivity in milliseconds. If this period has been exceeded connection pool revalidates connection before passing it to HttpClient.
This method is available since v.4.4.
In prior versions RequestConfig.Builder.setStaleConnectionCheckEnabled method could have been used.
I found this question multiple times while working on an Apache HttpClient 5 based client implementation to figure out whether a idle http connection monitor is still required.
Apparently, since Apache HttpClient 4.4, there is org.apache.hc.client5.http.impl.IdleConnectionEvictor which does exactly the thing described in HttpClient tutorial (which isn't mentioned in the tutorial).
Thought it might be useful to be aware of this for others as well.

Why does DefaultHttpClient send data over a half-closed socket?

I'm using DefaultHttpClient with a ThreadSafeClientConnManager on Android (2.3.x) to send HTTP requests to a my REST server (embedded Jetty).
After ~200 seconds of idle time, the server closes the TCP connection with a [FIN]. The Android client responds with an [ACK]. This should and does leave the socket in a half-closed state (server is still listening, but can't send data).
I would expect that when the client tries to use that connection again (via HttpClient.execute), DefaultHttpClient would detect the half-closed state, close the socket on the client side (thus sending it's [FIN/ACK] to finalize the close), and open a new connection for the request. But, there's the rub.
Instead, it sends the new HTTP request over the half-closed socket. Only after sending is the half-closed state detected and the socket closed on the client-side (with the [FIN] sent to the server). Of course, the server can't respond to the request (it had already sent its [FIN]), so the client thinks the request failed and automatically retries via a new socket/connection.
The end result is that server sees and processes two copies of the request.
Any ideas on how to fix this? (My server does the correct thing with the second copy, but I'm annoyed that the payload is transmitted twice.)
Shouldn't DefaultHttpClient detect that the socket was closed when it first tries to write the new HTTP packet, close that socket immediately, and start a new one? I'm baffled as to how a new HTTP request is sent on a socket minutes after the server sent a [FIN].
This is a general limitation of the blocking I/O in Java. There is simply no way of finding out whether or not the opposite endpoint has closed connection other than by attempting to read from the socket. Apache HttpClient works this problem around by employing the so stale connection check which is essentially a very brief read operation. However, the check can and often is disabled. In fact it is often advisable to have it disabled due to extra latency the check introduces. I have no idea how exactly the version of HttpClient shipped with Android behaves in this regard but you could try explicitly enabling the check by using an appropriate config parameter.
A better solution to this problem might be evicting connections from the connection pool that have been idle over a particular period of time (say 150 seconds) after a period of inactivity.
http://hc.apache.org/httpcomponents-client-ga/tutorial/html/connmgmt.html#d5e652

Categories