I have an issue using the Vert.x HttpClient or WebClient class to call the WhatsApp API. If I make two API calls to WhatsApp back to back with the same HttpClient / WebClient object, I often get a connection reset error.
I suspect this happens because WhatsApp does not keep HTTP connections open for long, so the connections in the pool become stale. I tried setting the max pool size of the HttpClient to 1, but I still got the connection reset error. I tried setting the max pool size to 0, but that turned out not to be allowed.
How do I disable connection pooling in the Vert.x HttpClient / WebClient class so that each HTTP request uses a fresh connection?
Related
I am trying to setup a HttpClient through the HttpClientBuilder. I also had a look at the HttpClientConnectionManager and here the confusion started.
On the ConnectionManager or more exactly the PoolingHttpClientConnectionManager there are methods to:
close expired connections
close idle connections
When is a connection considered expired?
When is it idle?
What happens when a connection from the pool is closed? Is it guaranteed that connections are recreated when needed?
HTTP is based on TCP, which ensures that packets are delivered in the correct order and requests retransmission of packets lost along the way. A TCP connection is opened with a handshake of SYN, SYN-ACK and ACK messages and is closed with a FIN, FIN-ACK and ACK sequence (see the TCP connection diagram on Wikipedia).
HTTP is a request-response protocol, but opening and closing connections is quite costly, so HTTP/1.1 allows existing connections to be reused. With the header Connection: keep-alive you tell the client (e.g. a browser) to keep the connection to the server open. A server can have literally thousands of open connections at the same time. To avoid draining the server's resources, connections are usually time-limited: through socket timeouts, the server automatically closes idle connections, or connections with certain problems (broken internet access, ...), after some predefined time.
Plenty of HTTP implementations, such as Apache's HttpClient 4.4 and later, check the status of a connection only when they are about to use it.
The handling of stale connections was changed in version 4.4. Previously, the code would check every connection by default before re-using it. The code now only checks the connection if the elapsed time since the last use of the connection exceeds the timeout that has been set. The default timeout is set to 2000ms (Source)
If a connection has not been used for some time, the client may not have read the FIN from the server and may therefore still think the connection is open when the server actually closed it some time ago. Such a connection is expired and is usually called half-closed. It may therefore be collected by the pool.
Note that if you send requests including a Connection: close HTTP header, the connection should be closed right after the client received the response.
The state of open connections can be checked with netstat, which should be present on most modern operating systems. I recently had to check one of our HTTP clients, which was managed through a third-party library that did not propagate the Connection: close header properly and therefore left plenty of half-closed connections.
According to: https://hc.apache.org/httpcomponents-client-4.5.x/current/tutorial/html/connmgmt.html#d5e418
HttpClient tries to mitigate the problem by testing whether the connection is 'stale', that is no longer valid because it was closed on the server side, prior to using the connection for executing an HTTP request. The stale connection check is not 100% reliable. The only feasible solution that does not involve a one thread per socket model for idle connections is a dedicated monitor thread used to evict connections that are considered expired due to a long period of inactivity. The monitor thread can periodically call ClientConnectionManager#closeExpiredConnections() method to close all expired connections and evict closed connections from the pool. It can also optionally call ClientConnectionManager#closeIdleConnections() method to close all connections that have been idle over a given period of time.
The difference between expired and idle is that an expired connection has been closed on the server side, while an idle connection isn't necessarily closed on the server side but has simply gone unused for a period of time. When a connection is closed, its slot in the pool becomes available again, and a new connection is created when one is needed.
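To make the distinction concrete, here is a minimal sketch in plain Java. PooledConnection and its fields are hypothetical names, not Apache HttpClient classes. Since a client cannot directly observe a server-side close, the sketch assumes "expired" is approximated by a keep-alive deadline, while "idle" only looks at how long the connection has been unused.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical model of a pooled connection; not an Apache HttpClient class. */
final class PooledConnection {
    final Instant lastUsed;   // when the connection was last returned to the pool
    final Instant expiresAt;  // assumed keep-alive deadline agreed with the server

    PooledConnection(Instant lastUsed, Instant expiresAt) {
        this.lastUsed = lastUsed;
        this.expiresAt = expiresAt;
    }

    /** Expired: the keep-alive deadline has been reached, so the server has probably closed it. */
    boolean isExpired(Instant now) {
        return !now.isBefore(expiresAt);
    }

    /** Idle: unused for at least maxIdle, whether or not the server still holds it open. */
    boolean isIdleLongerThan(Duration maxIdle, Instant now) {
        return Duration.between(lastUsed, now).compareTo(maxIdle) >= 0;
    }
}
```

A connection can be idle without being expired (unused for a while, deadline not yet reached), which is why the two eviction calls are separate.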
I'm using Apache's HTTP library with Java to post multiple requests to the same server. I read in the HttpClient documentation that it keeps connections alive and reuses them by default.
Is there any way to determine when a connection goes stale and a new one is set up?
The simplest way to do so is by turning on connection management context logging as described here
http://hc.apache.org/httpcomponents-client-4.5.x/logging.html
This should provide you with detailed info about the internal state of the connection manager and the pool of persistent connections.
What's the shortest way to configure connection idle timeout on Apache HttpClient 4.3 version?
I've looked in the documentation and couldn't find anything. My goal is to reduce open connections to a minimum post server-peak.
For example, in Jetty Client 8.x you can call httpClient.setIdleTimeout: http://download.eclipse.org/jetty/stable-8/apidocs/org/eclipse/jetty/client/HttpClient.html#setIdleTimeout(long)
The timeout is set in the RequestConfig, so you can set the default when the client is built with HttpClientBuilder.
For example, assuming your timeout variable is in seconds, you could create a custom RequestConfig like this:
RequestConfig config = RequestConfig.custom()
.setSocketTimeout(timeout * 1000)
.setConnectTimeout(timeout * 1000)
.build();
You could then build your HttpClient with this RequestConfig as the default, like this:
CloseableHttpClient httpClient = HttpClients.custom()
.setDefaultRequestConfig(config)
.build();
You can't set an idle connection timeout in the config for Apache HTTP Client. The reason is that there is a performance overhead in doing so.
The documentation clearly states why, and gives an example of an idle connection monitor implementation you can copy. Essentially this is another thread that you run to periodically call closeIdleConnections on the HttpClientConnectionManager:
http://hc.apache.org/httpcomponents-client-ga/tutorial/html/connmgmt.html
One of the major shortcomings of the classic blocking I/O model is that the network socket can react to I/O events only when blocked in an I/O operation. When a connection is released back to the manager, it can be kept alive however it is unable to monitor the status of the socket and react to any I/O events. If the connection gets closed on the server side, the client side connection is unable to detect the change in the connection state (and react appropriately by closing the socket on its end).
HttpClient tries to mitigate the problem by testing whether the connection is 'stale', that is no longer valid because it was closed on the server side, prior to using the connection for executing an HTTP request. The stale connection check is not 100% reliable and adds 10 to 30 ms overhead to each request execution. The only feasible solution that does not involve a one thread per socket model for idle connections is a dedicated monitor thread used to evict connections that are considered expired due to a long period of inactivity. The monitor thread can periodically call ClientConnectionManager#closeExpiredConnections() method to close all expired connections and evict closed connections from the pool. It can also optionally call ClientConnectionManager#closeIdleConnections() method to close all connections that have been idle over a given period of time.
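The monitor thread described in the quote could look roughly like the sketch below. It is not the tutorial's implementation: EvictablePool is a hypothetical interface standing in for the two eviction methods, IdleConnectionMonitor is a made-up name, and the sketch uses a ScheduledExecutorService where a hand-written thread loop would work just as well.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the two eviction calls mentioned in the tutorial. */
interface EvictablePool {
    void closeExpiredConnections();
    void closeIdleConnections(long idleTime, TimeUnit unit);
}

/** Runs the eviction policy periodically on a single daemon thread. */
final class IdleConnectionMonitor implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "idle-connection-monitor");
            t.setDaemon(true); // do not keep the JVM alive just for eviction
            return t;
        });
    private final EvictablePool pool;
    private final long maxIdleSeconds;

    IdleConnectionMonitor(EvictablePool pool, long maxIdleSeconds) {
        this.pool = pool;
        this.maxIdleSeconds = maxIdleSeconds;
    }

    /** One eviction pass: first expired connections, then long-idle ones. */
    void evictOnce() {
        pool.closeExpiredConnections();
        pool.closeIdleConnections(maxIdleSeconds, TimeUnit.SECONDS);
    }

    /** Schedules repeated passes; an exception thrown by a pass cancels later runs. */
    void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(this::evictOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

With Apache HttpClient you would write a small adapter that forwards both EvictablePool methods to your connection manager, call start(...) once at startup, and call close() when the client is shut down.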
Apache HttpClient 4.3b2, HttpCore 4.3.
I use PoolingHttpClientConnectionManager to manage 5 connections concurrently:
PoolingHttpClientConnectionManager connectionManager;
HttpClient httpclient;
connectionManager = new PoolingHttpClientConnectionManager();
connectionManager.setDefaultMaxPerRoute(5);
httpclient = HttpClientBuilder.create().setConnectionManager(connectionManager).build();
The server has a 5-second keep-alive time.
When the server starts closing a connection, it stays in the FIN_WAIT2 state until I manually execute connectionManager.shutdown(), connectionManager.closeExpiredConnections() or connectionManager.closeIdleConnections(5, TimeUnit.SECONDS). The server is waiting for the FIN packet. How can I automatically close connections on the client side once the server has started the closing process?
When I make requests from the Chrome browser, the server ends up in the TIME_WAIT state when it closes the connection after the keep-alive period (the FIN_WAIT2 state passes very quickly). How can I get the same behavior with Apache HttpClient?
This problem is explained in detail in the HttpClient tutorial:
One of the major shortcomings of the classic blocking I/O model is that the network socket can react to I/O events only when blocked in an I/O operation. When a connection is released back to the manager, it can be kept alive however it is unable to monitor the status of the socket and react to any I/O events. If the connection gets closed on the server side, the client side connection is unable to detect the change in the connection state (and react appropriately by closing the socket on its end).
If you want expired connections to get pro-actively evicted from the connection pool there is no way around running an additional thread enforcing a connection eviction policy that suits your application.
The PoolingHttpClientConnectionManager class has a method, setValidateAfterInactivity, that sets a period of connection inactivity in milliseconds. If a connection has been inactive for longer than this period, the pool revalidates it before passing it to HttpClient.
This method is available since version 4.4.
In earlier versions, the RequestConfig.Builder.setStaleConnectionCheckEnabled method could be used instead.
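The idea behind validating only after a period of inactivity can be sketched in plain Java as follows. This is not the library's code: InactivityValidator and isUsable are hypothetical names, and the stale check is passed in as a callback so the sketch stays self-contained. It assumes the check is triggered once the inactivity reaches the configured period.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a "validate after inactivity" policy. */
final class InactivityValidator {
    private final long validateAfterInactivityMillis;

    InactivityValidator(long validateAfterInactivityMillis) {
        this.validateAfterInactivityMillis = validateAfterInactivityMillis;
    }

    /**
     * Returns true if the connection may be handed out to a request.
     * The comparatively expensive stale check runs only when the connection
     * has been inactive for at least the configured period.
     */
    boolean isUsable(long lastUsedMillis, long nowMillis, BooleanSupplier staleCheck) {
        long inactiveFor = nowMillis - lastUsedMillis;
        if (inactiveFor < validateAfterInactivityMillis) {
            return true; // recently used: skip the check and trust the connection
        }
        return !staleCheck.getAsBoolean(); // staleCheck returns true for a stale connection
    }
}
```

The trade-off is that a recently used connection is trusted without a check, so a server that closes connections faster than the configured period can still cause an occasional failed request.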
I found this question multiple times while working on an Apache HttpClient 5 based client implementation, trying to figure out whether an idle HTTP connection monitor is still required.
Apparently, since Apache HttpClient 4.4 there has been an IdleConnectionEvictor class (org.apache.hc.client5.http.impl.IdleConnectionEvictor in HttpClient 5) that does exactly what the HttpClient tutorial describes, although the tutorial does not mention it.
Thought it might be useful to be aware of this for others as well.
Env:
Java 6
Apache HttpClient 4.2.3
Question Detail:
According to the HttpClient manual, when I use DefaultHttpClient without configuring any connection manager, I need to shut down the connection manager myself.
But when I have many requests to many servers, I configure a PoolingClientConnectionManager as the connection manager. I can't find any reference on the Apache site for this case: should I do something to release connections for a specific HttpClient request, or will HttpClient do it automatically at the framework level?
Yes, you do. Connection managers allocate available connections to individual requests, but they have no way of knowing whether or not a particular connection is still in use. When processing a response, HttpClient only reads the message head into memory, while the message content is streamed directly from the underlying connection. It is the responsibility of the consumer to trigger release of the connection back to the manager by closing the content input stream associated with the response object.
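The mechanism can be illustrated with a small plain-Java sketch. LeaseCounter is a hypothetical stand-in for a connection manager, not an HttpClient class: it hands out a content stream and only gets the "connection" back when that stream is closed.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical pool that only counts how many connections are leased out. */
final class LeaseCounter {
    private final AtomicInteger leased = new AtomicInteger();

    /** Leases a "connection" and returns the content stream bound to it. */
    InputStream lease(byte[] body) {
        leased.incrementAndGet();
        return new FilterInputStream(new ByteArrayInputStream(body)) {
            private boolean released;

            @Override
            public void close() throws IOException {
                super.close();
                if (!released) { // release exactly once, even if close() is called again
                    released = true;
                    leased.decrementAndGet();
                }
            }
        };
    }

    /** Number of connections currently held by consumers. */
    int leasedCount() {
        return leased.get();
    }
}
```

In application code the usual way to guarantee the release is a try-with-resources block (or a finally block on Java 6) around the content stream, so the stream is closed even when reading fails.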