I have a web service that accepts POST requests with an XML body. It works fine most of the time, but on random occasions it fails to communicate with the server, throwing an IOException with the message "The target server failed to respond". Subsequent calls work fine.
It happens mostly when I make some calls and then leave my application idle for 10-15 minutes. The first call I make after that returns this error.
I tried a couple of things...
I set up the retry handler like this:
HttpRequestRetryHandler retryHandler = new HttpRequestRetryHandler() {
public boolean retryRequest(IOException e, int retryCount, HttpContext httpCtx) {
if (retryCount >= 3){
Logger.warn(CALLER, "Maximum tries reached, exception would be thrown to outer block");
return false;
}
if (e instanceof org.apache.http.NoHttpResponseException){
Logger.warn(CALLER, "No response from server on "+retryCount+" call");
return true;
}
return false;
}
};
httpPost.getParams().setParameter(HttpMethodParams.RETRY_HANDLER, retryHandler);
But this retry handler never got called (yes, I am using the right instanceof clause). While debugging, I could see that this class is never invoked.
I even tried setting HttpProtocolParams.setUseExpectContinue(httpClient.getParams(), false); but to no avail. Can someone suggest what I can do now?
IMPORTANT
Besides figuring out why I am getting the exception, one of my main concerns is: why isn't the retry handler working here?
Most likely, persistent connections that are kept alive by the connection manager become stale. That is, the target server shuts down the connection on its end while the connection is idle, without HttpClient being able to react to that event, thus rendering the connection half-closed or 'stale'. Usually this is not a problem. HttpClient employs several techniques to verify connection validity upon its lease from the pool. Even if the stale connection check is disabled and a stale connection is used to transmit a request message, the request execution usually fails in the write operation with a SocketException and gets automatically retried. However, under some circumstances the write operation can terminate without an exception and the subsequent read operation returns -1 (end of stream). In this case HttpClient has no other choice but to assume the request succeeded but the server failed to respond, most likely due to an unexpected error on the server side.
The simplest way to remedy the situation is to evict expired connections, and connections that have been idle longer than, say, 1 minute, from the pool after a period of inactivity. For details, please see section 2.5, "Connection eviction policy", of the HttpClient 4.5 tutorial.
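The eviction idea above can be sketched without any HttpClient classes. This is only an illustration under stated assumptions: IdleConnectionTracker is a hypothetical stand-in for a connection pool (it is not an HttpClient type), connections are identified by plain strings, and timestamps are passed in explicitly so the logic is easy to follow.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a connection pool; not part of HttpClient. */
final class IdleConnectionTracker {
    // connection id -> time (ms) at which the connection was last released to the pool
    private final Map<String, Long> lastReleased = new ConcurrentHashMap<>();

    /** Records that a connection went back to the pool at the given time. */
    void release(String connectionId, long nowMillis) {
        lastReleased.put(connectionId, nowMillis);
    }

    /** Removes and returns every connection that has been idle longer than maxIdleMillis. */
    List<String> evictIdle(long nowMillis, long maxIdleMillis) {
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastReleased.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > maxIdleMillis) {
                evicted.add(entry.getKey());
                it.remove(); // a real pool would also close the socket here
            }
        }
        return evicted;
    }

    /** Number of connections still tracked as pooled. */
    int size() {
        return lastReleased.size();
    }
}
```

With a real HttpClient 4.x connection manager, a periodic monitor thread would presumably call closeExpiredConnections() and closeIdleConnections(...) instead of maintaining its own map; the sketch only shows the decision being made.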
The accepted answer is right but lacks a solution. To avoid this error, you can call setHttpRequestRetryHandler (or setRetryHandler for Apache HttpComponents 4.4) on your HTTP client, like in this answer.
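As a rough, library-independent sketch of what such a retry policy does: the class below retries a call only when it fails with a "no response" error. NoResponseException here is a hypothetical stand-in for org.apache.http.NoHttpResponseException, and RetryOnNoResponse is an illustrative name, not an HttpClient API. Note that the server may have processed the first attempt, so a blind retry is only safe when the request is idempotent.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical stand-in for org.apache.http.NoHttpResponseException. */
class NoResponseException extends IOException {
    NoResponseException(String message) {
        super(message);
    }
}

final class RetryOnNoResponse {
    /**
     * Runs the call and retries it only when the server sent no response at all.
     * Any other exception is propagated immediately.
     */
    static <T> T execute(Callable<T> call, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (NoResponseException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and let the caller handle it
                }
                // otherwise loop: the next attempt would normally use a fresh connection
            }
        }
    }
}
```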
HttpClient 4.4 suffered from a bug in this area, relating to the validation of possibly stale connections before they are returned to the requester. It didn't validate whether a connection was stale, which then resulted in an immediate NoHttpResponseException.
This issue was resolved in HttpClient 4.4.1. See this JIRA and the release notes.
Solution: change the ReuseStrategy to never
Since this problem is very complex and there are so many different factors which can fail I was happy to find this solution in another post: How to solve org.apache.http.NoHttpResponseException
Never reuse connections:
configure in org.apache.http.impl.client.AbstractHttpClient:
httpClient.setReuseStrategy(new NoConnectionReuseStrategy());
The same can be configured on a org.apache.http.impl.client.HttpClientBuilder builder:
builder.setConnectionReuseStrategy(new NoConnectionReuseStrategy());
Although the accepted answer is right, IMHO it is just a workaround.
To be clear: it's a perfectly normal situation that a persistent connection may become stale. But unfortunately it's very bad when the HTTP client library cannot handle it properly.
Since this faulty behavior in Apache HttpClient was not fixed for many years, I definitely would prefer to switch to a library that can easily recover from a stale connection problem, e.g. OkHttp.
Why?
OkHttp pools http connections by default.
It gracefully recovers from situations where an HTTP connection becomes stale and the request cannot be retried because it is not idempotent (e.g. POST). I cannot say the same about Apache HttpClient (see the NoHttpResponseException mentioned above).
Supports HTTP/2.0 from early drafts and beta versions.
When I switched to OkHttp, my problems with NoHttpResponseException disappeared forever.
Nowadays, most HTTP connections are considered persistent unless declared otherwise. However, to save server resources, the connection is rarely kept open forever. The default connection timeout for many servers is rather short, for example 5 seconds for Apache httpd 2.2 and above.
The org.apache.http.NoHttpResponseException error comes most likely from one persistent connection that was closed by the server.
It's possible to set the maximum time to keep unused connections open in the Apache Http client pool, in milliseconds.
With Spring Boot, one way to achieve this:
public class RestTemplateCustomizers {
static public class MaxConnectionTimeCustomizer implements RestTemplateCustomizer {
@Override
public void customize(RestTemplate restTemplate) {
HttpClient httpClient = HttpClientBuilder
.create()
.setConnectionTimeToLive(1000, TimeUnit.MILLISECONDS)
.build();
restTemplate.setRequestFactory(
new HttpComponentsClientHttpRequestFactory(httpClient));
}
}
}
// In your service that uses a RestTemplate
public MyRestService(RestTemplateBuilder builder ) {
restTemplate = builder
.customizers(new RestTemplateCustomizers.MaxConnectionTimeCustomizer())
.build();
}
This can happen if disableContentCompression() is set on a pooling manager assigned to your HttpClient, and the target server is trying to use gzip compression.
I had the same problem with Apache HttpClient 4.5.5.
Adding the default header
Connection: close
resolved the problem.
Use PoolingHttpClientConnectionManager instead of BasicHttpClientConnectionManager
BasicHttpClientConnectionManager will make an effort to reuse the connection for subsequent requests with the same route. It will, however, close the existing connection and re-open it for the given route.
I faced the same issue and resolved it by adding "Connection: close" as an extension.
Step 1: create a new class ConnectionCloseExtension
import com.github.tomakehurst.wiremock.common.FileSource;
import com.github.tomakehurst.wiremock.extension.Parameters;
import com.github.tomakehurst.wiremock.extension.ResponseTransformer;
import com.github.tomakehurst.wiremock.http.HttpHeader;
import com.github.tomakehurst.wiremock.http.HttpHeaders;
import com.github.tomakehurst.wiremock.http.Request;
import com.github.tomakehurst.wiremock.http.Response;
public class ConnectionCloseExtension extends ResponseTransformer {
@Override
public Response transform(Request request, Response response, FileSource files, Parameters parameters) {
return Response.Builder
.like(response)
.headers(HttpHeaders.copyOf(response.getHeaders())
.plus(new HttpHeader("Connection", "Close")))
.build();
}
@Override
public String getName() {
return "ConnectionCloseExtension";
}
}
Step 2: set the extension class on the WireMockServer, like below:
final WireMockServer wireMockServer = new WireMockServer(options()
.extensions(ConnectionCloseExtension.class)
.port(httpPort));
Performing millions of HTTP requests with different Java libraries leaves me with threads hung in:
java.net.SocketInputStream.socketRead0()
which is a native function.
I tried to set up Apache HttpClient and RequestConfig with timeouts on (I hope) everything possible, but I still get (probably infinite) hangs in socketRead0. How can I get rid of them?
The hang ratio is about 1 per 10000 requests (to 10000 different hosts), and a hang can probably last forever (I've confirmed a thread was still hung after 10 hours).
JDK 1.8 on Windows 7.
My HttpClient factory:
SocketConfig socketConfig = SocketConfig.custom()
.setSoKeepAlive(false)
.setSoLinger(1)
.setSoReuseAddress(true)
.setSoTimeout(5000)
.setTcpNoDelay(true).build();
HttpClientBuilder builder = HttpClientBuilder.create();
builder.disableAutomaticRetries();
builder.disableContentCompression();
builder.disableCookieManagement();
builder.disableRedirectHandling();
builder.setConnectionReuseStrategy(new NoConnectionReuseStrategy());
builder.setDefaultSocketConfig(socketConfig);
return builder.build();
My RequestConfig factory:
HttpGet request = new HttpGet(url);
RequestConfig config = RequestConfig.custom()
.setCircularRedirectsAllowed(false)
.setConnectionRequestTimeout(8000)
.setConnectTimeout(4000)
.setMaxRedirects(1)
.setRedirectsEnabled(true)
.setSocketTimeout(5000)
.setStaleConnectionCheckEnabled(true).build();
request.setConfig(config);
return request;
OpenJDK socketRead0 source
Note: Actually I have a "trick": I can schedule .getConnectionManager().shutdown() in another thread and cancel the Future if the request finishes properly, but it is deprecated and it also kills the whole HttpClient, not only that single request.
Though this question mentions Windows, I have the same problem on Linux. It appears there is a flaw in the way the JVM implements blocking socket timeouts:
https://bugs.openjdk.java.net/browse/JDK-8049846
https://bugs.openjdk.java.net/browse/JDK-8075484
To summarize, timeout for blocking sockets is implemented by calling poll on Linux (and select on Windows) to determine that data is available before calling recv. However, at least on Linux, both methods can spuriously indicate that data is available when it is not, leading to recv blocking indefinitely.
From poll(2) man page BUGS section:
See the discussion of spurious readiness notifications under the BUGS section of select(2).
From select(2) man page BUGS section:
Under Linux, select() may report a socket file descriptor as "ready
for reading", while nevertheless a subsequent read blocks. This could
for example happen when data has arrived but upon examination has
wrong checksum and is discarded. There may be other circumstances
in which a file descriptor is spuriously reported as ready. Thus it
may be safer to use O_NONBLOCK on sockets that should not block.
The Apache HTTP Client code is a bit hard to follow, but it appears that connection expiration is only set for HTTP keep-alive connections (which you've disabled) and is indefinite unless the server specifies otherwise. Therefore, as pointed out by oleg, the Connection eviction policy approach won't work in your case and can't be relied upon in general.
As Clint said, you should consider a non-blocking HTTP client, or (seeing that you are using Apache HttpClient) implement multithreaded request execution to prevent possible hangs of the main application thread (this does not solve the problem, but it is better than restarting your app because it froze). Anyway, you set the setStaleConnectionCheckEnabled property, but the stale connection check is not 100% reliable. From the Apache HttpClient tutorial:
One of the major shortcomings of the classic blocking I/O model is
that the network socket can react to I/O events only when blocked in
an I/O operation. When a connection is released back to the manager,
it can be kept alive however it is unable to monitor the status of the
socket and react to any I/O events. If the connection gets closed on
the server side, the client side connection is unable to detect the
change in the connection state (and react appropriately by closing the
socket on its end).
HttpClient tries to mitigate the problem by testing whether the
connection is 'stale', that is no longer valid because it was closed
on the server side, prior to using the connection for executing an
HTTP request. The stale connection check is not 100% reliable and adds
10 to 30 ms overhead to each request execution.
The Apache HttpComponents crew recommends the implementation of a connection eviction policy:
The only feasible solution that does not involve a one thread per
socket model for idle connections is a dedicated monitor thread used
to evict connections that are considered expired due to a long period
of inactivity. The monitor thread can periodically call
ClientConnectionManager#closeExpiredConnections() method to close all
expired connections and evict closed connections from the pool. It can
also optionally call ClientConnectionManager#closeIdleConnections()
method to close all connections that have been idle over a given
period of time.
Take a look at the sample code in the "Connection eviction policy" section and try to implement it in your application along with multithreaded request execution. I think implementing both mechanisms will prevent your undesired hangs.
You should consider a Non-blocking HTTP client like Grizzly or Netty which do not have blocking operations to hang a thread.
I have more than 50 machines that make about 200k requests/day/machine. They are running Amazon Linux AMI 2017.03. I previously had jdk1.8.0_102, now I have jdk1.8.0_131. I am using both apacheHttpClient and OKHttp as scraping libraries.
Each machine was running 50 threads, and sometimes the threads get lost. After profiling with the YourKit Java profiler, I got:
ScraperThread42 State: RUNNABLE CPU usage on sample: 0ms
java.net.SocketInputStream.socketRead0(FileDescriptor, byte[], int, int, int) SocketInputStream.java (native)
java.net.SocketInputStream.socketRead(FileDescriptor, byte[], int, int, int) SocketInputStream.java:116
java.net.SocketInputStream.read(byte[], int, int, int) SocketInputStream.java:171
java.net.SocketInputStream.read(byte[], int, int) SocketInputStream.java:141
okio.Okio$2.read(Buffer, long) Okio.java:139
okio.AsyncTimeout$2.read(Buffer, long) AsyncTimeout.java:211
okio.RealBufferedSource.indexOf(byte, long) RealBufferedSource.java:306
okio.RealBufferedSource.indexOf(byte) RealBufferedSource.java:300
okio.RealBufferedSource.readUtf8LineStrict() RealBufferedSource.java:196
okhttp3.internal.http1.Http1Codec.readResponse() Http1Codec.java:191
okhttp3.internal.connection.RealConnection.createTunnel(int, int, Request, HttpUrl) RealConnection.java:303
okhttp3.internal.connection.RealConnection.buildTunneledConnection(int, int, int, ConnectionSpecSelector) RealConnection.java:156
okhttp3.internal.connection.RealConnection.connect(int, int, int, List, boolean) RealConnection.java:112
okhttp3.internal.connection.StreamAllocation.findConnection(int, int, int, boolean) StreamAllocation.java:193
okhttp3.internal.connection.StreamAllocation.findHealthyConnection(int, int, int, boolean, boolean) StreamAllocation.java:129
okhttp3.internal.connection.StreamAllocation.newStream(OkHttpClient, boolean) StreamAllocation.java:98
okhttp3.internal.connection.ConnectInterceptor.intercept(Interceptor$Chain) ConnectInterceptor.java:42
okhttp3.internal.http.RealInterceptorChain.proceed(Request, StreamAllocation, HttpCodec, Connection) RealInterceptorChain.java:92
okhttp3.internal.http.RealInterceptorChain.proceed(Request) RealInterceptorChain.java:67
okhttp3.internal.http.BridgeInterceptor.intercept(Interceptor$Chain) BridgeInterceptor.java:93
okhttp3.internal.http.RealInterceptorChain.proceed(Request, StreamAllocation, HttpCodec, Connection) RealInterceptorChain.java:92
okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(Interceptor$Chain) RetryAndFollowUpInterceptor.java:124
okhttp3.internal.http.RealInterceptorChain.proceed(Request, StreamAllocation, HttpCodec, Connection) RealInterceptorChain.java:92
okhttp3.internal.http.RealInterceptorChain.proceed(Request) RealInterceptorChain.java:67
okhttp3.RealCall.getResponseWithInterceptorChain() RealCall.java:198
okhttp3.RealCall.execute() RealCall.java:83
I found out that they have a fix for this
https://bugs.openjdk.java.net/browse/JDK-8172578
in JDK 8u152 (early access). I have installed it on one of our machines. Now I am waiting to see some good results.
Given no one else responded so far, here is my take
Your timeout settings look perfectly OK to me. The reason why certain requests appear to be constantly blocked in a java.net.SocketInputStream#socketRead0() call is likely a combination of misbehaving servers and your local configuration. The socket timeout defines a maximum period of inactivity between two consecutive I/O read operations (or, in other words, between two consecutive incoming packets). Your socket timeout setting is 5,000 milliseconds. As long as the opposite endpoint keeps sending a packet every 4,999 milliseconds for a chunk-encoded message, the request will never time out and will end up spending most of its time blocked in java.net.SocketInputStream#socketRead0(). You can find out whether or not this is the case by running HttpClient with wire logging turned on.
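The point that a socket timeout bounds the gap between packets rather than the total request time can be reproduced with plain JDK sockets. The following is a small demo under assumed parameters (SlowDripDemo and its arguments are illustrative names): a local server sends one byte at a time, and the client reads them all without a timeout even though the whole exchange takes longer than SO_TIMEOUT.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class SlowDripDemo {
    /**
     * Reads byteCount bytes from a local server that sends one byte every dripMillis.
     * Returns the total elapsed time in milliseconds. A SocketTimeoutException is thrown
     * only if a single gap between bytes exceeds soTimeoutMillis.
     */
    static long readSlowResponse(int byteCount, long dripMillis, int soTimeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread dripper = new Thread(() -> {
                try (Socket peer = server.accept(); OutputStream out = peer.getOutputStream()) {
                    for (int i = 0; i < byteCount; i++) {
                        Thread.sleep(dripMillis); // stay just under the client's socket timeout
                        out.write('x');
                        out.flush();
                    }
                } catch (IOException | InterruptedException ignored) {
                    // demo server: nothing to do beyond closing the sockets
                }
            });
            dripper.setDaemon(true);
            dripper.start();

            long start = System.nanoTime();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.setSoTimeout(soTimeoutMillis); // applies to each blocking read separately
                InputStream in = client.getInputStream();
                int read = 0;
                while (read < byteCount && in.read() != -1) {
                    read++;
                }
            }
            return (System.nanoTime() - start) / 1_000_000;
        }
    }
}
```

For example, readSlowResponse(12, 100, 1000) should complete normally after somewhat more than a second, even though the socket timeout is 1000 ms, because no single read ever waits that long.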
For Apache HTTP Client (blocking), I found the best solution is to call getConnectionManager() and shut it down.
So in a high-reliability solution I just schedule a shutdown in another thread, and in case the request does not complete, I shut it down from that other thread.
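The "schedule an abort from another thread" idea can be sketched with the JDK alone. In this sketch RequestWatchdog is a hypothetical helper, and the abort hook is whatever unblocks the request in your setup (shutting down the connection manager as described above, or aborting a single request); the sketch does not assume any particular HttpClient call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs a blocking request and aborts it if it overruns a hard deadline. */
final class RequestWatchdog {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread t = new Thread(runnable, "request-watchdog");
                t.setDaemon(true); // never keep the JVM alive just for the watchdog
                return t;
            });

    /**
     * Runs the request on the calling thread. If it has not finished after timeoutMillis,
     * the abort hook is invoked from the watchdog thread to unblock it.
     */
    <T> T run(Callable<T> request, Runnable abort, long timeoutMillis) throws Exception {
        ScheduledFuture<?> pendingAbort =
                scheduler.schedule(abort, timeoutMillis, TimeUnit.MILLISECONDS);
        try {
            return request.call();
        } finally {
            pendingAbort.cancel(false); // request finished (or failed): the abort is not needed
        }
    }

    void shutdown() {
        scheduler.shutdownNow();
    }
}
```

Unlike a socket timeout, the deadline here covers the whole request, so a server that keeps dripping data cannot hold the thread indefinitely.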
I bumped into the same issue using the Apache HTTP client.
There's a pretty simple workaround (which doesn't require shutting the connection manager down):
In order to reproduce it, one needs to execute the request from the question in a new thread, paying attention to these details:
run the request in a separate thread, close the request and release its connection from a different thread, then interrupt the hanging thread
don't run EntityUtils.consumeQuietly(response.getEntity()) in the finally block (because it hangs on a 'dead' connection)
First, add the interface
interface RequestDisposer {
void dispose();
}
Execute an HTTP request in a new thread
final AtomicReference<RequestDisposer> requestDisposer = new AtomicReference<>(null);
final Thread thread = new Thread(() -> {
    final HttpGet request = new HttpGet("http://my.url");
    // Aborts the request and returns its connection; safe to call from any thread
    final RequestDisposer disposer = () -> {
        request.abort();
        request.releaseConnection();
    };
    requestDisposer.set(disposer);
    try (final CloseableHttpResponse response = httpClient.execute(request)) {
        ...
    } catch (IOException e) {
        // the request failed or was aborted from another thread
    } finally {
        disposer.dispose();
    }
});
thread.start();
Call dispose() in the main thread to close hanging connection
requestDisposer.get().dispose(); // better check if it's not null first
thread.interrupt();
thread.join();
That fixed the issue for me.
My stacktrace looked like this:
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:139)
at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:155)
at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:284)
at org.apache.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:253)
at org.apache.http.impl.io.ChunkedInputStream.nextChunk(ChunkedInputStream.java:227)
at org.apache.http.impl.io.ChunkedInputStream.read(ChunkedInputStream.java:186)
at org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:137)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
For anyone interested, it is easily reproducible: interrupt the thread without aborting the request and releasing the connection (the ratio is about 1/100).
Windows 10, version 10.0.
jdk8.151-x64.
I feel that all these answers are way too specific.
We have to note that this is probably a real JVM bug. It should be possible to get the file descriptor and close it. All this timeout-talk is too high level. You do not want a timeout to the extent that the connection fails, what you want is an ability to hard break this stuck thread and stop or interrupt it.
The way the JVM should have implemented the SocketInputStream.socketRead function is to set some internal default timeout, which could be as low as 1 second, and then, when the timeout expires, immediately loop back into socketRead0. While that is happening, the Thread.interrupt and Thread.stop commands can take effect.
The even better way of doing this, of course, is not to do any blocking wait at all, but instead to use the select(2) system call with a list of file descriptors and, when any one has data available, let it perform the read operation.
Just look all over the internet all these people having trouble with threads stuck in java.net.SocketInputStream#socketRead0, it's the most popular topic about java.net.SocketInputStream hands down!
So, while the bug is not fixed, I wonder about the most dirty trick I can come up with to break up this situation. Something like connecting with the debugger interface to get to the stack frame of the socketRead call and grab the FileDescriptor and then break into that to get the int fd number and then make a native close(2) call on that fd.
Do we have a chance to do that? (Don't tell me "it's not good practice") -- if so, let's do it!
I faced the same issue today. Based on @Sergei Voitovich's answer, I've tried to make it work while still using Apache HttpClient.
Since I am using Java 8, it's simpler to use a timeout to abort the connection.
Here is a draft of the implementation:
private HttpResponse executeRequest(Request request){
InterruptibleRequestExecution requestExecution = new InterruptibleRequestExecution(request, executor);
ExecutorService executorService = Executors.newSingleThreadExecutor();
try {
return executorService.submit(requestExecution).get(<your timeout in milliseconds>, TimeUnit.MILLISECONDS);
} catch (TimeoutException | ExecutionException e) {
// Your request timed out, you can throw an exception here if you want
throw new UsefulExceptionForYourApplication(e);
} catch (InterruptedException e) {
// Always remember to call interrupt after catching InterruptedException
Thread.currentThread().interrupt();
throw new UsefulExceptionForYourApplication(e);
} finally {
// This method forces to stop the Thread Pool (with single thread) created by Executors.newSingleThreadExecutor() and makes the pending request to abort inside the thread. So if the request is hanging in socketRead0 it will stop and also the thread will be terminated
forceStopIdleThreadsAndRequests(requestExecution, executorService);
}
}
private void forceStopIdleThreadsAndRequests(InterruptibleRequestExecution execution,
ExecutorService executorService) {
execution.abortRequest();
executorService.shutdownNow();
}
The code above will create a new Thread to execute the request using org.apache.http.client.fluent.Executor. Timeout can be easily configured.
The execution of the thread is defined in InterruptibleRequestExecution which you can see below.
private static class InterruptibleRequestExecution implements Callable<HttpResponse> {
private final Request request;
private final Executor executor;
private final RequestDisposer disposer;
public InterruptibleRequestExecution(Request request, Executor executor) {
this.request = request;
this.executor = executor;
this.disposer = request::abort;
}
@Override
public HttpResponse call() {
try {
return executor.execute(request).returnResponse();
} catch (IOException e) {
throw new UsefulExceptionForYourApplication(e);
} finally {
disposer.dispose();
}
}
public void abortRequest() {
disposer.dispose();
}
@FunctionalInterface
interface RequestDisposer {
void dispose();
}
}
The results are really good. We've had times where some connections were hanging in socketRead0 for 7 hours! Now a request never exceeds the defined timeout, and it's working in production with millions of requests per day without any problems.
I'm creating a (well behaved) web spider and I notice that some servers are causing Apache HttpClient to give me a SocketException -- specifically:
java.net.SocketException: Connection reset
The code that causes this is:
// Execute the request
HttpResponse response;
try {
response = httpclient.execute(httpget); //httpclient is of type HttpClient
} catch (NullPointerException e) {
return;//deep down in apache http sometimes throws a null pointer...
}
For most servers it's just fine. But for others, it immediately throws a SocketException.
Example of site that causes immediate SocketException: http://www.bhphotovideo.com/
Works great (as do most websites): http://www.google.com/
Now, as you can see, www.bhphotovideo.com loads fine in a web browser. It also loads fine when I don't use Apache's HTTP Client. (Code like this:)
HttpURLConnection c = (HttpURLConnection)url.openConnection();
BufferedInputStream in = new BufferedInputStream(c.getInputStream());
Reader r = new InputStreamReader(in);
int i;
while ((i = r.read()) != -1) {
source.append((char) i);
}
So, why don't I just use this code instead? Well there are some key features in Apache's HTTP Client that I need to use.
Does anyone know what causes some servers to cause this exception?
Research so far:
Problem occurs on my local Mac dev machines AND an AWS EC2 Instance, so it's not a local firewall.
It seems the error isn't caused by the remote machine because the exception doesn't say "by peer"
This Stack Overflow question seems relevant: "java.net.SocketException: Connection reset", but the answers don't show why this would happen only with Apache HTTP Client and not with other approaches.
Bonus question: I'm doing a fair amount of crawling with this system. Is there generally a better Java class for this other than Apache HTTP Client? I've found a number of issues (such as the NullPointerException I have to catch in the code above). It seems that HTTPClient is very picky about server communications -- more picky than I'd like for a crawler that can't just break when a server doesn't behave.
Thanks all!
Solution
Honestly, I don't have a perfect solution, but it works, so that's good enough for me.
As pointed out by oleg below, Bixo has created a crawler that customizes HttpClient to be more forgiving to servers. To "get around" the issue more than fix it, I just used SimpleHttpFetcher provided by Bixo here:
(link removed - SO thinks I'm a spammer, so you'll have to google it yourself)
SimpleHttpFetcher fetch = new SimpleHttpFetcher(new UserAgent("botname","contact@yourcompany.com","ENTER URL"));
try {
FetchedResult result = fetch.fetch("ENTER URL");
System.out.println(new String(result.getContent()));
} catch (BaseFetchException e) {
e.printStackTrace();
}
The downside to this solution is that there are a lot of dependencies for Bixo, so this may not be a good workaround for everyone. However, you can always just work through their use of DefaultHttpClient and see how they instantiated it to get it to work. I decided to use the whole class because it handles some things for me that are helpful, like automatically following redirects (and reporting the final destination URL).
Thanks for the help all.
Edit: TinyBixo
Hi all. So, I loved how Bixo worked, but didn't like that it had so many dependencies (including all of Hadoop). So, I created a vastly simplified Bixo, without all the dependencies. If you're running into the problems above, I would recommend using it (and feel free to make pull requests if you'd like to update it!)
It's available here: https://github.com/juliuss/TinyBixo
First, to answer your question:
The connection reset was caused by a problem on the server side. Most likely the server failed to parse the request or was unable to process it and dropped the connection as a result without returning a valid response. There is likely something in the HTTP requests generated by HttpClient that causes server side logic to fail, probably due to a server side bug. Just because the error message does not say 'by peer' does not mean the connection reset took place on the client side.
A few remarks:
(1) Several popular web crawlers, such as bixo http://openbixo.org/, use HttpClient without major issues, but pretty much all of them had to tweak HttpClient behavior to make it more lenient about common HTTP protocol violations. By default, HttpClient is rather strict about HTTP protocol compliance.
(2) Why didn't you report the NPE problem, or any other problem you have been experiencing, to the HttpClient project?
These two settings will sometimes help:
client.getParams().setParameter("http.socket.timeout", new Integer(0));
client.getParams().setParameter("http.connection.stalecheck", new Boolean(true));
The first sets the socket timeout to be infinite.
Try getting a network trace using Wireshark, and augment that with log4j logging of HttpClient. That should show why the connection is being reset.
I have used apache httpclient 4.5 in production for a while now, but recently, with the addition of a new use case, the system started failing.
We have multiple services that communicate through REST webservices, the client is a wrapper around apache httpclient 4.5.
Say I have service A communicating with service B. The communication works correctly until I restart service B. The next call I initiate from service A to service B fails due to a timeout. After doing some research, I found that the underlying TCP connection is reused for performance reasons (no repeated TCP handshake, etc.). Since the server has been restarted, the underlying TCP connection is stale.
After reading the documentation, I found out that I can expire my connection after n seconds. Say I restart service B: calls will fail for the first n seconds, but after that the connection is rebuilt. This is the keepAliveStrategy I implemented:
connManager = new PoolingHttpClientConnectionManager();
connManager.setMaxTotal(100);
connManager.setDefaultMaxPerRoute(10);
ConnectionKeepAliveStrategy keepAliveStrategy = new DefaultConnectionKeepAliveStrategy() {
public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
long keepAliveDuration = super.getKeepAliveDuration(response, context);
if (keepAliveDuration == -1) {
keepAliveDuration = 45 * 1000; // 45 seconds
}
return keepAliveDuration;
}
};
CloseableHttpClient closeableHttpClient = HttpClients.custom()
.setConnectionManager(connManager)
.setKeepAliveStrategy(keepAliveStrategy)
.build();
I am just wondering if this is the correct usage of this library. Is this the way it is meant to work, or am I making everything overly complex?
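To make the fallback in the strategy above easier to reason about, here is a standalone sketch of the same decision: use the timeout the server advertises in its Keep-Alive header if there is one, otherwise fall back to 45 seconds. KeepAliveDurations and fromHeader are hypothetical names, and the parsing is a simplified assumption about the header format (for example "timeout=5, max=100"), not HttpClient's own implementation.

```java
/** Hypothetical helper mirroring the fallback logic of the keep-alive strategy. */
final class KeepAliveDurations {
    /** Fallback used when the server does not advertise a timeout (45 seconds). */
    static final long DEFAULT_KEEP_ALIVE_MILLIS = 45_000L;

    /**
     * Parses a Keep-Alive header value such as "timeout=5, max=100" and returns the
     * advertised timeout in milliseconds, or the default when it is absent or malformed.
     */
    static long fromHeader(String keepAliveHeader) {
        if (keepAliveHeader != null) {
            for (String part : keepAliveHeader.split(",")) {
                String[] keyValue = part.trim().split("=", 2);
                if (keyValue.length == 2 && keyValue[0].trim().equalsIgnoreCase("timeout")) {
                    try {
                        return Long.parseLong(keyValue[1].trim()) * 1000L; // header is in seconds
                    } catch (NumberFormatException ignored) {
                        // malformed value: fall through to the default
                    }
                }
            }
        }
        return DEFAULT_KEEP_ALIVE_MILLIS;
    }
}
```

The fallback only helps if it is shorter than the time the server actually keeps idle connections open; otherwise the client may still pick up a connection the server has already closed.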
Not sure it's 100% the same scenario, but here's my 2 cents:
We had a similar issue (broken connections in the pool after a period of inactivity). When we were using an older version of HttpClient (3.x), we used the http.connection.stalecheck manager parameter, accepting a minor performance hit to avoid the possibility of an IOException when using a connection that had been closed server-side.
After upgrading to 4.4+, where this approach is deprecated, we started using setValidateAfterInactivity, which is a middle ground between per-call validation and the runtime-error scenario:
PoolingHttpClientConnectionManager poolingConnManager = new PoolingHttpClientConnectionManager();
poolingConnManager.setValidateAfterInactivity(5000);
void o.a.h.i.c.PoolingHttpClientConnectionManager.setValidateAfterInactivity(int ms)
Defines period of inactivity in milliseconds after which persistent connections must be re-validated prior to being leased to the consumer. Non-positive value passed to this method disables connection validation. This check helps detect connections that have become stale (half-closed) while kept inactive in the pool.
If you're also controlling the consumed API, you can adapt the keep-alive strategy to the timing your client uses. We're using AWS Cloudfront + ELB's with connection draining for deregistered instances to ensure the kept-alive connections are fully closed, when performing a rolling upgrade. I guess as long as the connections are guaranteed to be kept alive for, say 30 seconds, any value passed to the connection manager below that will always ensure the validity check will mitigate any runtime I/O errors which are purely related to stale/expired connections.
I have a JUnit test of a JAX-RS web service. The test launches embedded tomcat, and then talks to it via the Apache CXF JAX-RS client.
Consider this backtrace:
Caused by: java.net.SocketException: Socket Closed
at java.net.PlainSocketImpl.getOption(PlainSocketImpl.java:286)
at java.net.Socket.getSoTimeout(Socket.java:1032)
at sun.net.www.http.HttpClient.available(HttpClient.java:356)
at sun.net.www.http.HttpClient.New(HttpClient.java:273)
at sun.net.www.http.HttpClient.New(HttpClient.java:310)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:987)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:923)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:841)
at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1031)
This fails only on CentOS 4.8. The same unit test (which launches an embedded tomcat and then talks to a web service in it) works just fine on a wide variety of other systems. Note the extreme oddity of this backtrace: HttpURLConnection has called HttpClient to get a new connection, and that latter class has apparently closed its own socket before the connection has been returned to where any code of mine could get to it.
Further, the test has friends that do the same server setup of the same service and talk to it without issues.
Even further, the following incantation (slightly abbreviated) is a workaround:
@Before
public void pingServiceToWorkAroundCentos() {
try {
/* ... code to make a connection to the service and close it ... */
} catch (Throwable t) {
// do nothing
}
}
In other words, if I arrange for an extra throwaway connection before running each of the test cases, that uses up whatever this problem is.
What could this be?
Since there is only a backtrace and no code here, I am assuming there is some sort of race condition or bug where another thread closes the socket while the current thread is attempting to get the OutputStream.
Looking at the source for the JDK I see this...
public Object getOption(int opt) throws SocketException {
if (isClosedOrPending()) {
throw new SocketException("Socket Closed");
}
... snip ...
The isClosedOrPending method checks whether the internal file descriptor is null or a close is pending, i.e. close has been called on the socket.
Good luck tracking it down.
Nothing mysterious about it. You have closed the socket and then continued to use it.
Closing either the input or the output stream of the socket closes the other stream and the socket.
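Here is a minimal sketch of that behaviour using only the JDK. It shows the mechanism described above, not proof that it is what happens in the question; the class and method names are made up for the example, and it assumes a loopback connection can be opened on the machine running it.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;

final class ClosedStreamDemo {
    /** Returns true if using the socket after closing its output stream throws SocketException. */
    static boolean failsAfterStreamClose() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // Closing the stream also closes the socket it belongs to.
            client.getOutputStream().close();
            if (!client.isClosed()) {
                return false;
            }
            try {
                client.getSoTimeout(); // the same call that appears in the backtrace
                return false;
            } catch (SocketException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(failsAfterStreamClose());
    }
}
```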
I am pretty sure this is a JDK bug.
HttpClient was modified in a recent commit:
http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/diff/e6dc1d9bc70b/src/share/classes/sun/net/www/http/HttpClient.java
The getSoTimeout() call needs to be in a try/catch block. For now, unfortunately, the only real option is to downgrade the JDK.
This looks similar to an issue we ran into, where the HttpClient pooled connections were kept alive longer than the corresponding server-side connections in Tomcat. This results in stale connections in the HttpClient connection pool, and requests that try to use them fail. I believe HttpClient actually recovers from this using the standard retry handler.
The solution is to double-check your timeout settings on both the client and the server side, as well as your retry policy.
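To make the retry-policy part concrete, here is a rough JDK-only sketch of retrying a call that fails with an IOException. It is not HttpClient's retry handler: RetryOnIo, IoCall and withRetry are hypothetical names invented for this example, and the "stale connection" failure is simulated.

```java
import java.io.IOException;

final class RetryOnIo {
    /** Hypothetical functional interface for an I/O call; not part of HttpClient. */
    interface IoCall<T> {
        T run() throws IOException;
    }

    /** Runs the call, retrying on IOException until maxAttempts attempts have been made. */
    static <T> T withRetry(IoCall<T> call, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.run();
            } catch (IOException e) {
                last = e; // e.g. a stale pooled connection; the next attempt may get a fresh one
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        int[] calls = {0};
        // Simulated request: fails once, then succeeds.
        String result = withRetry(() -> {
            if (calls[0]++ == 0) {
                throw new IOException("stale connection");
            }
            return "ok";
        }, 3);
        System.out.println(result + " after " + calls[0] + " attempts"); // ok after 2 attempts
    }
}
```

Keep in mind that blindly retrying is only safe for idempotent requests; repeating a POST may apply the operation twice.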
I'm creating a (well behaved) web spider and I notice that some servers are causing Apache HttpClient to give me a SocketException -- specifically:
java.net.SocketException: Connection reset
The code that causes this is:
// Execute the request
HttpResponse response;
try {
response = httpclient.execute(httpget); //httpclient is of type HttpClient
} catch (NullPointerException e) {
return;//deep down in apache http sometimes throws a null pointer...
}
For most servers it's just fine. But for others, it immediately throws a SocketException.
Example of site that causes immediate SocketException: http://www.bhphotovideo.com/
Works great (as do most websites): http://www.google.com/
Now, as you can see, www.bhphotovideo.com loads fine in a web browser. It also loads fine when I don't use Apache's HTTP Client. (Code like this:)
HttpURLConnection c = (HttpURLConnection)url.openConnection();
BufferedInputStream in = new BufferedInputStream(c.getInputStream());
Reader r = new InputStreamReader(in);
int i;
while ((i = r.read()) != -1) {
source.append((char) i);
}
So, why don't I just use this code instead? Well there are some key features in Apache's HTTP Client that I need to use.
Does anyone know what causes some servers to cause this exception?
Research so far:
Problem occurs on my local Mac dev machines AND an AWS EC2 Instance, so it's not a local firewall.
It seems the error isn't caused by the remote machine because the exception doesn't say "by peer"
This Stack Overflow question seems relevant: java.net.SocketException: Connection reset. But the answers don't show why this would happen only from Apache HTTP Client and not other approaches.
Bonus question: I'm doing a fair amount of crawling with this system. Is there generally a better Java class for this other than Apache HTTP Client? I've found a number of issues (such as the NullPointerException I have to catch in the code above). It seems that HTTPClient is very picky about server communications -- more picky than I'd like for a crawler that can't just break when a server doesn't behave.
Thanks all!
Solution
Honestly, I don't have a perfect solution, but it works, so that's good enough for me.
As pointed out by oleg below, Bixo has created a crawler that customizes HttpClient to be more forgiving to servers. To "get around" the issue more than fix it, I just used SimpleHttpFetcher provided by Bixo here:
(link removed - SO thinks I'm a spammer, so you'll have to google it yourself)
SimpleHttpFetcher fetch = new SimpleHttpFetcher(new UserAgent("botname","contact@yourcompany.com","ENTER URL"));
try {
FetchedResult result = fetch.fetch("ENTER URL");
System.out.println(new String(result.getContent()));
} catch (BaseFetchException e) {
e.printStackTrace();
}
The downside to this solution is that there are a lot of dependencies for Bixo, so this may not be a good workaround for everyone. However, you can always just work through their use of DefaultHttpClient and see how they instantiated it to get it to work. I decided to use the whole class because it handles some things for me, like automatic redirect following (and reporting the final destination URL), that are helpful.
Thanks for the help all.
Edit: TinyBixo
Hi all. So, I loved how Bixo worked, but didn't like that it had so many dependencies (including all of Hadoop). So, I created a vastly simplified Bixo, without all the dependencies. If you're running into the problems above, I would recommend using it (and feel free to make pull requests if you'd like to update it!)
It's available here: https://github.com/juliuss/TinyBixo
First, to answer your question:
The connection reset was caused by a problem on the server side. Most likely the server failed to parse the request or was unable to process it and dropped the connection as a result without returning a valid response. There is likely something in the HTTP requests generated by HttpClient that causes server side logic to fail, probably due to a server side bug. Just because the error message does not say 'by peer' does not mean the connection reset took place on the client side.
A few remarks:
(1) Several popular web crawlers such as bixo http://openbixo.org/ use HttpClient without major issues, but pretty much all of them had to tweak HttpClient behavior to make it more lenient about common HTTP protocol violations. By default HttpClient is rather strict about HTTP protocol compliance.
(2) Why did you not report the NPE problem, or any other problem you have been experiencing, to the HttpClient project?
These two settings will sometimes help:
client.getParams().setParameter("http.socket.timeout", Integer.valueOf(0));
client.getParams().setParameter("http.connection.stalecheck", Boolean.TRUE);
The first sets the socket timeout to infinite. The second enables the stale connection check, so a pooled connection is tested before it is reused.
Try getting a network trace using Wireshark, and augment that with log4j logging of HttpClient. That should show why the connection is being reset.
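If setting up log4j is inconvenient, a rough alternative is to switch on wire and header logging through Commons Logging's SimpleLog, which needs no configuration file. Note that this swaps SimpleLog in for log4j and assumes HttpClient 4.x logger names (org.apache.http.wire, org.apache.http.headers) with Commons Logging on the classpath; WireLogSetup is just a name made up for the example.

```java
final class WireLogSetup {
    /** Call this before the first logger is created, in practice before creating the HttpClient. */
    static void enableWireLogging() {
        // Route Commons Logging to its built-in SimpleLog (writes to stderr).
        System.setProperty("org.apache.commons.logging.Log",
                "org.apache.commons.logging.impl.SimpleLog");
        System.setProperty("org.apache.commons.logging.simplelog.showdatetime", "true");
        // Assumed HttpClient 4.x logger names: raw bytes on the wire and HTTP headers.
        System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.http.wire", "DEBUG");
        System.setProperty("org.apache.commons.logging.simplelog.log.org.apache.http.headers", "DEBUG");
    }

    public static void main(String[] args) {
        enableWireLogging();
        // ... create the HttpClient and execute the failing request here ...
    }
}
```

Comparing the logged request against what a browser sends (as seen in Wireshark) is usually the quickest way to spot what the server objects to.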