I have used Apache HttpClient 4.5 in production for a while now, but recently, with the addition of a new use case, the system started failing.
We have multiple services that communicate through REST web services; the client is a wrapper around Apache HttpClient 4.5.
Say I have service A communicating with service B. The communication works correctly until I restart service B. The next call I initiate from service A to service B fails due to a timeout. After doing some research I found that the underlying TCP connection is reused for performance reasons (no TCP three-way handshake on every request, etc.). Since the server has been restarted, the underlying TCP connection is stale.
After reading the documentation, I found out that I can expire my connection after n seconds. Say I restart service B; then calls will fail for the first n seconds, but after that the connection is rebuilt. This is the keepAliveStrategy I implemented:
connManager = new PoolingHttpClientConnectionManager();
connManager.setMaxTotal(100);
connManager.setDefaultMaxPerRoute(10);

ConnectionKeepAliveStrategy keepAliveStrategy = new DefaultConnectionKeepAliveStrategy() {
    @Override
    public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
        long keepAliveDuration = super.getKeepAliveDuration(response, context);
        if (keepAliveDuration == -1) {
            // No Keep-Alive header from the server: fall back to 45 seconds
            keepAliveDuration = 45 * 1000;
        }
        return keepAliveDuration;
    }
};

CloseableHttpClient closeableHttpClient = HttpClients.custom()
        .setConnectionManager(connManager)
        .setKeepAliveStrategy(keepAliveStrategy)
        .build();
I am just wondering if this is correct usage of this library. Is this the way it is meant to work, or am I making everything overly complex?
Not sure it's 100% the same scenario, but here are my 2 cents:
We had a similar issue (broken connections in the pool after a period of inactivity). When we were using an older version of HttpClient (3.x), we used the http.connection.stalecheck manager parameter, accepting a minor performance hit over the possibility of getting an IOException when a connection that had been closed server-side was used.
After upgrading to 4.4+, that approach was deprecated, so we started using setValidateAfterInactivity, which is a middle ground between per-call validation and the runtime-error scenario:
PoolingHttpClientConnectionManager poolingConnManager = new PoolingHttpClientConnectionManager();
poolingConnManager.setValidateAfterInactivity(5000);
void o.a.h.i.c.PoolingHttpClientConnectionManager.setValidateAfterInactivity(int ms)
Defines period of inactivity in milliseconds after which persistent connections must be re-validated prior to being leased to the consumer. Non-positive value passed to this method disables connection validation. This check helps detect connections that have become stale (half-closed) while kept inactive in the pool.
If you're also controlling the consumed API, you can adapt the keep-alive strategy to the timing your client uses. We're using AWS CloudFront + ELBs with connection draining for deregistered instances to ensure the kept-alive connections are fully closed when performing a rolling upgrade. I guess as long as the connections are guaranteed to be kept alive for, say, 30 seconds, any validation interval below that passed to the connection manager will ensure the validity check mitigates runtime I/O errors that are purely caused by stale/expired connections.
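To illustrate the combination, here's a minimal sketch (the 30-second keep-alive cap and 5-second validation interval are just the example numbers from this discussion, not recommendations):
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
cm.setValidateAfterInactivity(5000); // re-validate connections idle for more than 5 s

CloseableHttpClient client = HttpClients.custom()
        .setConnectionManager(cm)
        // never keep a connection alive longer than the infrastructure's 30 s guarantee
        .setKeepAliveStrategy((response, context) -> 30_000L)
        .build();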
I felt very confused after reading the Connection Management documentation of the Apache HttpComponents module, as well as a few other resources on connection keep-alive strategy and connection eviction policy.
There are a bunch of adjectives used there to describe the state of a connection, such as stale, idle, available, expired, and closed, but there isn't a lifecycle diagram describing how a connection transitions among these states.
My confusion mainly arose from the situation below.
I set a ConnectionKeepAliveStrategy that provides a KeepAliveDuration of 5 seconds via the code snippet below.
ConnectionKeepAliveStrategy keepAliveStrategy = (httpResponse, httpContext) -> {
    HeaderElementIterator iterator =
            new BasicHeaderElementIterator(httpResponse.headerIterator(HTTP.CONN_KEEP_ALIVE));
    while (iterator.hasNext()) {
        HeaderElement header = iterator.nextElement();
        if (header.getValue() != null && header.getName().equalsIgnoreCase("timeout")) {
            // Honor the server-provided Keep-Alive timeout (seconds -> milliseconds)
            return Long.parseLong(header.getValue(), 10) * 1000;
        }
    }
    return 5 * 1000; // default: 5 seconds
};
this.client = HttpAsyncClients.custom()
        .setDefaultRequestConfig(requestConfig)
        .setMaxConnTotal(500)
        .setMaxConnPerRoute(500)
        .setConnectionManager(this.cm)
        .setKeepAliveStrategy(keepAliveStrategy)
        .build();
The server I am talking to does support keeping connections alive. When I printed out the pool stats of the connection manager after executing ~200 requests asynchronously in a single batch, the info below was observed.
Total Stats:
-----------------
Available: 139
Leased: 0
Max: 500
Pending: 0
And after waiting for 30 seconds (by then the keep-alive timeout had long been exceeded), I started a new batch of the same HTTP calls. Upon inspecting the connection manager pool stats, the number of available connections is still 139.
Shouldn't it be zero since the keep-alive timeout had been reached? The PoolStats Java doc states that Available is "the number of idle persistent connections". Are idle persistent connections considered alive?
I think Apache HttpClient: How to auto close connections by server's keep-alive time is a close hit, but I hope some expert could give an insightful explanation of the lifecycle of a connection managed by PoolingHttpClientConnectionManager.
Some other general questions:
Does the default connection manager used in HttpAsyncClients.createDefault() handle connection keep-alive strategy and connection eviction on its own?
What are the requirements/limitations that could call for implementing them on a custom basis? Will they contradict each other?
Documenting some of my further findings, which might partially serve as an answer.
Whether or not a ConnectionKeepAliveStrategy is used to set a timeout on the keep-alive session, the connections end up in the TCP state ESTABLISHED, as inspected via netstat -apt. And I observed that they are automatically recycled after around 5 minutes in my Linux test environment.
When NOT using a ConnectionKeepAliveStrategy, upon a second request batch the established connections will be reused.
When using a ConnectionKeepAliveStrategy and its timeout has NOT been reached, upon a second request batch the established connections will be reused.
When using a ConnectionKeepAliveStrategy whose timeout has been exceeded, upon a second request batch the established connections will be recycled into the TIME_WAIT state, indicating that the client side has decided to close the connections.
This recycling can be actively triggered by calling connectionManager.closeExpiredConnections(); in a separate connection-evicting thread, which will lead the connections into the TIME_WAIT state.
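For illustration, a minimal sketch of such an eviction thread, assuming cm is the PoolingHttpClientConnectionManager in use (the 5-second cadence and 30-second idle cutoff are arbitrary choices):
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

ScheduledExecutorService evictor = Executors.newSingleThreadScheduledExecutor();
evictor.scheduleAtFixedRate(() -> {
    cm.closeExpiredConnections();                   // close connections past their keep-alive expiry
    cm.closeIdleConnections(30, TimeUnit.SECONDS);  // also close connections idle for more than 30 s
}, 5, 5, TimeUnit.SECONDS);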
I think the general observation is that ESTABLISHED connections are counted as Available by the connection pool stats, and a connection keep-alive strategy with a timeout does put the connections into expiry, but the expiry only takes effect when new requests are processed, or when we specifically instruct the connection manager to close expired connections.
TCP state diagram from Wikipedia for reference.
I am using Apache HttpClient in one of my projects, along with PoolingHttpClientConnectionManager.
I am confused about what these properties mean. I tried going through the documentation in the code, but I don't see any documentation around these variables, so I was not able to understand them:
setMaxTotal
setDefaultMaxPerRoute
setConnectTimeout
setSocketTimeout
setConnectionRequestTimeout
setStaleConnectionCheckEnabled
Below is how I am using them in my code:
RequestConfig requestConfig = RequestConfig.custom()
        .setConnectTimeout(5 * 1000)
        .setSocketTimeout(5 * 1000)
        .setStaleConnectionCheckEnabled(false)
        .build();

PoolingHttpClientConnectionManager poolingHttpClientConnectionManager = new PoolingHttpClientConnectionManager();
poolingHttpClientConnectionManager.setMaxTotal(200);
poolingHttpClientConnectionManager.setDefaultMaxPerRoute(20);

CloseableHttpClient httpClient = HttpClientBuilder.create()
        .setConnectionManager(poolingHttpClientConnectionManager)
        .setDefaultRequestConfig(requestConfig)
        .build();
Can anyone explain these properties to me so that I can understand them and decide what values to put in there? Also, are there any other properties I should use, apart from those shown above, to get better performance?
I am using http-client 4.3.1.
Some parameters are explained at http://hc.apache.org/httpclient-3.x/preference-api.html
Others must be gleaned from the source.
setMaxTotal
The maximum number of connections allowed across all routes.
setDefaultMaxPerRoute
The maximum number of connections allowed for a route that has not been configured otherwise by a call to setMaxPerRoute. Use setMaxPerRoute when you know the route ahead of time and setDefaultMaxPerRoute when you do not; a combined sketch follows this list.
setConnectTimeout
How long to wait for a connection to be established with the remote server before throwing a timeout exception.
setSocketTimeout
How long to wait for the server to respond to various calls before throwing a timeout exception. See http://docs.oracle.com/javase/1.5.0/docs/api/java/net/SocketOptions.html#SO_TIMEOUT for details.
setConnectionRequestTimeout
How long to wait when trying to checkout a connection from the connection pool before throwing an exception (the connection pool won't return immediately if, for example, all the connections are checked out).
setStaleConnectionCheckEnabled
Can be disabled for a slight performance improvement at the cost of potential IOExceptions. See http://hc.apache.org/httpclient-3.x/performance.html#Stale_connection_check
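Putting the pool limits and timeouts together, a minimal sketch (the host name, port, and all values are illustrative assumptions, not recommendations):
import org.apache.http.HttpHost;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.conn.routing.HttpRoute;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
cm.setMaxTotal(200);          // hard cap across all routes
cm.setDefaultMaxPerRoute(20); // cap for any route not configured explicitly
// A route we know will be hot gets a bigger slice (host name is made up):
cm.setMaxPerRoute(new HttpRoute(new HttpHost("backend.example.com", 8080)), 50);

RequestConfig requestConfig = RequestConfig.custom()
        .setConnectTimeout(5_000)           // max time to establish the TCP connection
        .setSocketTimeout(5_000)            // max gap between data packets from the server
        .setConnectionRequestTimeout(2_000) // max wait to check out a pooled connection
        .build();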
I have developed a standalone Java SE client which performs an EJB lookup to a remote server and executes its method. The server application is on EJB 3.0.
Under some strange, magical, but rare situations my program hangs indefinitely. On looking into the issue, it seems that while looking up the EJB on the server, I never get the response from the server and the call also never times out.
I would like to know if there is a property or any other way to set a lookup timeout on the client or the server side.
There is a very nice article that discusses ORB configuration best practices at DeveloperWorks here. I'm quoting the three settings that can be configured on the client side (that is, by you, while doing a lookup and executing a method on a remote server):
Connect timeout: Before the client ORB can even send a request to a server, it needs to establish an IIOP connection (or re-use an existing one). Under normal circumstances, the IIOP and underlying TCP connect operations should complete very fast. However, contention on the network or another unforeseen factor could slow this down. The default connect timeout is indefinite, but the ORB custom property com.ibm.CORBA.ConnectTimeout (in seconds) can be used to change the timeout.

Locate request timeout: Once a connection has been established and a client sends an RMI request to the server, then LocateRequestTimeout can be used to limit the time for the CORBA LocateRequest (a CORBA "ping") for the object. As a result, the LocateRequestTimeout should be less than or equal to the RequestTimeout because it is a much shorter operation in terms of data sent back and forth. Like the RequestTimeout, the LocateRequestTimeout defaults to 180 seconds.

Request timeout: Once the client ORB has an established TCP connection to the server, it will send the request across. However, it will not wait indefinitely for a response; by default it will wait for 180 seconds. This is the ORB request timeout interval. This can typically be lowered, but it should be in line with the expected application response times from the server.
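As a sketch, these IBM ORB custom properties are typically supplied as JVM system properties (the values in seconds are examples; setting them programmatically only works if done before the ORB is initialized):
// Usually passed on the command line:
//   -Dcom.ibm.CORBA.ConnectTimeout=10 -Dcom.ibm.CORBA.LocateRequestTimeout=30 -Dcom.ibm.CORBA.RequestTimeout=30
System.setProperty("com.ibm.CORBA.ConnectTimeout", "10");        // seconds to establish the IIOP connection
System.setProperty("com.ibm.CORBA.LocateRequestTimeout", "30");  // seconds for the CORBA LocateRequest
System.setProperty("com.ibm.CORBA.RequestTimeout", "30");        // seconds to wait for the RMI response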
You can try the following code, which performs the task and then waits at most the time specified.
Future<Object> future = executorService.submit(new Callable<Object>() {
    public Object call() {
        return lookup(JNDI_URL);
    }
});
try {
    Object result = future.get(20L, TimeUnit.SECONDS); // wait at most 20 seconds
} catch (TimeoutException ex) {
    logger.log(LogLevel.ERROR, "Lookup did not complete within 20 seconds");
    return;
} catch (InterruptedException | ExecutionException ex) {
    logger.log(LogLevel.ERROR, ex.getMessage());
    return;
}
Also, the task can be cancelled by future.cancel(true).
Remote JNDI uses the ORB, so the only option available is com.ibm.CORBA.RequestTimeout, but that will have an effect on all remote calls. As described in the 7.0 InfoCenter, the default value is 180 (3 minutes).
I have a web service which accepts a POST method with XML. It works fine, then on some random occasion it fails to communicate with the server, throwing an IOException with the message The target server failed to respond. The subsequent calls work fine.
It happens mostly when I make some calls and then leave my application idle for 10-15 minutes; the first call I make after that returns this error.
I tried a couple of things...
I set up a retry handler like so:
HttpRequestRetryHandler retryHandler = new HttpRequestRetryHandler() {
    @Override
    public boolean retryRequest(IOException e, int retryCount, HttpContext httpCtx) {
        if (retryCount >= 3) {
            Logger.warn(CALLER, "Maximum tries reached, exception would be thrown to outer block");
            return false;
        }
        if (e instanceof org.apache.http.NoHttpResponseException) {
            Logger.warn(CALLER, "No response from server on " + retryCount + " call");
            return true;
        }
        return false;
    }
};
httpPost.getParams().setParameter(HttpMethodParams.RETRY_HANDLER, retryHandler);
but this retry handler never gets called (yes, I am using the right instanceof check); while debugging, this class is never invoked.
I even tried setting HttpProtocolParams.setUseExpectContinue(httpClient.getParams(), false), but to no avail. Can someone suggest what I can do now?
IMPORTANT
Besides figuring out why I am getting the exception, one of my important concerns is: why isn't the retry handler working here?
Most likely persistent connections that are kept alive by the connection manager become stale. That is, the target server shuts down the connection on its end without HttpClient being able to react to that event, while the connection is being idle, thus rendering the connection half-closed or 'stale'. Usually this is not a problem. HttpClient employs several techniques to verify connection validity upon its lease from the pool. Even if the stale connection check is disabled and a stale connection is used to transmit a request message the request execution usually fails in the write operation with SocketException and gets automatically retried. However under some circumstances the write operation can terminate without an exception and the subsequent read operation returns -1 (end of stream). In this case HttpClient has no other choice but to assume the request succeeded but the server failed to respond most likely due to an unexpected error on the server side.
The simplest way to remedy the situation is to evict from the pool expired connections and connections that have been idle longer than, say, 1 minute. For details please see section 2.5, Connection eviction policy, of the HttpClient 4.5 tutorial.
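Since HttpClient 4.4, the builder itself can spawn that eviction thread for you; a minimal sketch (the 1-minute idle cutoff follows the suggestion above):
import java.util.concurrent.TimeUnit;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

CloseableHttpClient client = HttpClients.custom()
        .evictExpiredConnections()                  // background eviction of expired connections
        .evictIdleConnections(60, TimeUnit.SECONDS) // evict connections idle for more than 1 minute
        .build();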
The accepted answer is right but lacks a solution. To avoid this error, you can add setHttpRequestRetryHandler (or setRetryHandler for Apache HttpComponents 4.4) to your HTTP client, like in this answer.
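For completeness, a minimal sketch of that wiring with the stock DefaultHttpRequestRetryHandler (the retry count of 3 is an example; the second argument enables retrying requests that were already sent, which is only safe if your requests are idempotent):
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.DefaultHttpRequestRetryHandler;
import org.apache.http.impl.client.HttpClients;

CloseableHttpClient client = HttpClients.custom()
        // retry up to 3 times, including requests that were already sent
        .setRetryHandler(new DefaultHttpRequestRetryHandler(3, true))
        .build();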
HttpClient 4.4 suffered from a bug in this area related to validating possibly stale connections before returning them to the requestor: it didn't validate whether a connection was stale, and this resulted in an immediate NoHttpResponseException.
This issue was resolved in HttpClient 4.4.1. See this JIRA and the release notes.
Solution: change the ReuseStrategy to never
Since this problem is very complex and there are so many different factors that can fail, I was happy to find this solution in another post: How to solve org.apache.http.NoHttpResponseException
Never reuse connections:
configure in org.apache.http.impl.client.AbstractHttpClient:
httpClient.setReuseStrategy(new NoConnectionReuseStrategy());
The same can be configured on a org.apache.http.impl.client.HttpClientBuilder builder:
builder.setConnectionReuseStrategy(new NoConnectionReuseStrategy());
Although the accepted answer is right, IMHO it is just a workaround.
To be clear: it's a perfectly normal situation that a persistent connection may become stale. But unfortunately it's very bad when the HTTP client library cannot handle it properly.
Since this faulty behavior in Apache HttpClient was not fixed for many years, I definitely would prefer to switch to a library that can easily recover from the stale connection problem, e.g. OkHttp.
Why?
OkHttp pools HTTP connections by default.
It gracefully recovers from situations when an HTTP connection becomes stale, even when the request cannot be retried because it is not idempotent (e.g. POST). I cannot say the same about Apache HttpClient (see the NoHttpResponseException mentioned above).
It has supported HTTP/2.0 since early drafts and beta versions.
When I switched to OkHttp, my problems with NoHttpResponseException disappeared forever.
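For comparison, a minimal OkHttp (3.x) client sketch; pooling and stale-connection recovery are on by default, so the explicit settings here are purely illustrative:
import java.util.concurrent.TimeUnit;
import okhttp3.ConnectionPool;
import okhttp3.OkHttpClient;

OkHttpClient client = new OkHttpClient.Builder()
        // up to 5 idle connections, kept alive for 30 seconds (example values)
        .connectionPool(new ConnectionPool(5, 30, TimeUnit.SECONDS))
        .retryOnConnectionFailure(true) // silently recover from stale pooled connections
        .build();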
Nowadays, most HTTP connections are considered persistent unless declared otherwise. However, to save server resources the connection is rarely kept open forever; the default keep-alive timeout of many servers is rather short, for example 5 seconds for Apache httpd 2.2 and above.
The org.apache.http.NoHttpResponseException error most likely comes from a persistent connection that was closed by the server.
It's possible to set the maximum time to keep unused connections open in the Apache HttpClient pool, in milliseconds.
With Spring Boot, one way to achieve this:
public class RestTemplateCustomizers {
static public class MaxConnectionTimeCustomizer implements RestTemplateCustomizer {
#Override
public void customize(RestTemplate restTemplate) {
HttpClient httpClient = HttpClientBuilder
.create()
.setConnectionTimeToLive(1000, TimeUnit.MILLISECONDS)
.build();
restTemplate.setRequestFactory(
new HttpComponentsClientHttpRequestFactory(httpClient));
}
}
}
// In your service that uses a RestTemplate
public MyRestService(RestTemplateBuilder builder ) {
restTemplate = builder
.customizers(new RestTemplateCustomizers.MaxConnectionTimeCustomizer())
.build();
}
This can happen if disableContentCompression() is set on a pooling manager assigned to your HttpClient, and the target server is trying to use gzip compression.
Same problem for me on Apache HttpClient 4.5.5.
Adding the default header
Connection: close
resolved the problem.
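For reference, a minimal sketch of registering that default header on an HttpClient 4.x builder:
import java.util.Collections;
import org.apache.http.HttpHeaders;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicHeader;

CloseableHttpClient client = HttpClients.custom()
        // send "Connection: close" on every request, disabling connection reuse
        .setDefaultHeaders(Collections.singletonList(
                new BasicHeader(HttpHeaders.CONNECTION, "close")))
        .build();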
Use PoolingHttpClientConnectionManager instead of BasicHttpClientConnectionManager.
BasicHttpClientConnectionManager will make an effort to reuse the connection for subsequent requests with the same route. It will, however, close the existing connection and re-open it for the given route if the route of the persistent connection does not match that of the connection request.
I faced the same issue and resolved it by adding a "Connection: close" header via a WireMock response transformer extension.
Step 1: Create a new class, ConnectionCloseExtension:
import com.github.tomakehurst.wiremock.common.FileSource;
import com.github.tomakehurst.wiremock.extension.Parameters;
import com.github.tomakehurst.wiremock.extension.ResponseTransformer;
import com.github.tomakehurst.wiremock.http.HttpHeader;
import com.github.tomakehurst.wiremock.http.HttpHeaders;
import com.github.tomakehurst.wiremock.http.Request;
import com.github.tomakehurst.wiremock.http.Response;

public class ConnectionCloseExtension extends ResponseTransformer {

    @Override
    public Response transform(Request request, Response response, FileSource files, Parameters parameters) {
        return Response.Builder
                .like(response)
                .headers(HttpHeaders.copyOf(response.getHeaders())
                        .plus(new HttpHeader("Connection", "Close")))
                .build();
    }

    @Override
    public String getName() {
        return "ConnectionCloseExtension";
    }
}
Step 2: Set the extension class in the WireMockServer, like below:
final WireMockServer wireMockServer = new WireMockServer(options()
        .extensions(ConnectionCloseExtension.class)
        .port(httpPort));
My server uses data from an internal web service to construct its response, on a per request basis. I'm using Apache HttpClient 4.1 to make the requests. Each initial request will result in about 30 requests to the web service. Of these, 4 - 8 will end up with sockets stuck in CLOSE_WAIT, which never get released. Eventually these stuck sockets exceed my ulimit and my process runs out of file descriptors.
I don't want to just raise my ulimit (1024), because that will just mask the problem.
The reason I've moved to HttpClient is that java.net.HttpUrlConnection was behaving the same way.
I have tried moving to a SingleClientConnManager per request, and calling client.getConnectionManager().shutdown() on it, but sockets still end up stuck.
Should I be trying to solve this so that I end up with 0 open sockets while there are no running requests, or should I be concentrating on request persistence and pooling?
For clarity I'm including some details which may be relevant:
OS: Ubuntu 10.10
JRE: 1.6.0_22
Language: Scala 2.8
Sample code:
val cleaner = Executors.newScheduledThreadPool(1)

private val client = {
  val ssl_ctx = SSLContext.getInstance("TLS")
  val managers = Array[TrustManager](TrustingTrustManager)
  ssl_ctx.init(null, managers, new java.security.SecureRandom())
  val sslSf = new org.apache.http.conn.ssl.SSLSocketFactory(ssl_ctx, SSLSocketFactory.ALLOW_ALL_HOSTNAME_VERIFIER)
  val schemeRegistry = new SchemeRegistry()
  schemeRegistry.register(new Scheme("https", 443, sslSf))
  val connection = new ThreadSafeClientConnManager(schemeRegistry)

  // Periodically evict expired and idle connections from the pool
  object clean extends Runnable {
    override def run = {
      connection.closeExpiredConnections
      connection.closeIdleConnections(30, SECONDS)
    }
  }
  cleaner.scheduleAtFixedRate(clean, 10, 10, SECONDS)

  val httpClient = new DefaultHttpClient(connection)
  httpClient.getCredentialsProvider().setCredentials(new AuthScope(AuthScope.ANY), new UsernamePasswordCredentials(username, password))
  httpClient
}

val get = new HttpGet(uri)
val entity = client.execute(get).getEntity
val stream = entity.getContent
val justForTheExample = IOUtils.toString(stream)
stream.close()
Test: netstat -a | grep {myInternalWebServiceName} | grep CLOSE_WAIT
(Lists sockets for my process that are in CLOSE_WAIT state)
Post comment discussion:
This code now demonstrates correct usage.
One needs to proactively evict expired / idle connections from the connection pool, because in the blocking I/O model connections cannot react to I/O events unless they are being read from or written to. For details see
http://hc.apache.org/httpcomponents-client-dev/tutorial/html/connmgmt.html#d4e631
I've marked oleg's answer as correct, as it highlights an important usage point about HttpClient's connection pooling.
To answer my specific original question, though, which was "Should I be trying to solve for 0 unused sockets or trying to maximize pooling?"
Now that the pooling solution is in place and working correctly, application throughput has increased by about 150%. I attribute this to not having to renegotiate SSL and repeat TCP handshakes, instead reusing persistent connections in accordance with HTTP 1.1.
It is definitely worth working to utilize pooling as intended, rather than trying to hack around it by calling ThreadSafeClientConnManager.shutdown() after each request, etc. If, on the other hand, you were calling arbitrary hosts and not reusing routes the way I am, you might easily find that that sort of hackery becomes necessary, as the JVM might surprise you with the long life of CLOSE_WAIT sockets if you're not garbage collecting very often.
I had the same issue and solved it using the suggestion found here. The author touches on some TCP basics:
When a TCP connection is about to close, its finalization is negotiated by both parties. Think of it as breaking a contract in a civilized manner. Both parties sign the paper and it’s all good. In geek talk, this is done via the FIN/ACK messages. Party A sends a FIN message to indicate it wants to close the socket. Party B sends an ACK saying it received the message and is considering the demand. Party B then cleans up and sends a FIN to Party A. Party A responds with the ACK and everyone walks away.
The problem comes in when B doesn't send its FIN. A is kinda stuck waiting for it. It has initiated its finalization sequence and is waiting for the other party to do the same.
He then mentions RFC 2616, 14.10 to suggest setting an HTTP header to solve this issue:
postMethod.addHeader("Connection", "close");
Honestly, I don't really know the implications of setting this header. But it did stop CLOSE_WAIT from happening in my unit tests.