I'm trying to port a network speed test from JavaScript to Java running under Android. The way the speed test works is by hitting a CGI script, requesting a given amount of data, and timing how long the data takes to transfer. The amount requested is changed dynamically to provide a relatively constant update rate.
But when I try this under Android, the time it takes for the response to arrive doesn't seem to be proportional to the amount of data requested. I am doing something like this:
final HttpParams params = new BasicHttpParams();
HttpClient httpclient = new DefaultHttpClient(ccm, params);
URI url;
try {
    url = new URI("https://myserver.com/randomfile.php?pages=250");
} catch (Exception e) {
    return -1;
}
HttpParams p = httpclient.getParams();
int timeout = 5000;
p.setIntParameter(CoreConnectionPNames.SO_TIMEOUT, timeout);
p.setIntParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, timeout);
HttpGet request = new HttpGet();
request.setURI(url);
try {
    long t0, t1, dt;
    int rc;
    t0 = System.currentTimeMillis();
    HttpResponse response = httpclient.execute(request);
    t1 = System.currentTimeMillis();
    dt = t1 - t0;
    rc = response.getStatusLine().getStatusCode();
    Log.d(logtag, "Get response code=" + rc + ", t=" + dt);
} catch (Exception e) {
    return -1;
}
From this, I'm guessing that the call to httpclient.execute() returns as soon as it has read the response headers, but before all the data has been transferred. I'm looking for the minimum amount of work required to know when the data has been completely received. I don't care what the data is (I'm happy to just throw it away; it's just random bytes), and I don't want to waste extra time processing it, to avoid skewing the reported transfer rate.
What's the minimum I need to do to accomplish this?
Also, it seems like there is some extra overhead just in setting up the call. For instance, whether I request 4096 bytes or 1 MB, I see about a 500 ms delay either way. I'm not sure where this extra delay comes from; is there some way to get rid of it? It is going to skew the results a lot more than a few milliseconds spent pulling data out of buffers.
You can simply skip over the returned content using the skip() method of the InputStream obtained by calling getContent(). Note that skip() may skip fewer bytes than requested, so it must be called in a loop:
InputStream instr;
t0 = System.currentTimeMillis();
HttpResponse response = httpclient.execute(request);
rc = response.getStatusLine().getStatusCode();
instr = response.getEntity().getContent();
// skip() may skip fewer bytes than requested, so loop until
// the full payload has been consumed
long remaining = 250 * 4096;
while (remaining > 0) {
    long skipped = instr.skip(remaining);
    if (skipped <= 0) break; // nothing left to skip
    remaining -= skipped;
}
t1 = System.currentTimeMillis();
As for the call overhead: with a larger transfer it becomes a smaller fraction of the overall time.
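If you'd rather read the data than skip it, fully draining the stream before stopping the timer also works. A minimal sketch of the drain loop, shown here against an in-memory stream rather than a live HTTP response (the class and method names are illustrative, not from HttpClient):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class DrainDemo {
    // Read until end of stream, discarding the bytes; a single read() call
    // is not enough because it may return fewer bytes than requested.
    static long drain(InputStream in) throws IOException {
        long total = 0;
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Simulate the 250-page payload (250 * 4096 bytes) from the question.
        byte[] data = new byte[250 * 4096];
        long count = drain(new ByteArrayInputStream(data));
        System.out.println(count); // prints 1024000
    }
}
```

With a real response you would call drain(response.getEntity().getContent()) between the two timestamps; the returned count also lets you verify the server sent what you asked for.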
Related
I have observed that one of my APIs takes much more time when called from Java (URLConnection, Apache HttpClient, or OkHttp) the first time. For subsequent calls, the time is much lower.
Postman or curl.exe, however, takes very little time (comparable to the second Java iteration).
On my machine, the first-time overhead is around 2 seconds, but on some machines it rises to around 5-6 seconds. Thereafter it is around 300 ms per round trip.
Here is my sample code:
public static String DoPostUsingURLConnection(String s_uri) throws Exception {
    try {
        URL uri = new URL(s_uri);
        HttpURLConnection connection = (HttpURLConnection) uri.openConnection();
        connection.setRequestMethod("POST");
        connection.setRequestProperty("Content-Type", "application/json");
        connection.setDoOutput(true);
        connection.setRequestProperty("Authorization", authorizationHeader);
        // Create the request body
        try (OutputStream os = connection.getOutputStream()) {
            byte[] input = jsonRequestBody.getBytes("utf-8");
            os.write(input, 0, input.length);
        }
        int responseCode = connection.getResponseCode();
        InputStream is;
        if (responseCode == HttpURLConnection.HTTP_OK) {
            is = connection.getInputStream();
        } else {
            is = connection.getErrorStream();
        }
        BufferedReader in = new BufferedReader(new InputStreamReader(is));
        String inputLine;
        StringBuffer response = new StringBuffer();
        while ((inputLine = in.readLine()) != null) {
            response.append(inputLine).append("\n");
        }
        in.close();
        return response.toString();
    } catch (Exception ex) {
        return ex.getMessage();
    }
}
You can investigate where the time is going by logging OkHttp connection events:
https://square.github.io/okhttp/events/
This is particularly relevant if you are getting both an IPv4 and an IPv6 address and one is timing out while the other succeeds.
This is just a guess, but it follows from the way HTTP connections work: when you invoke a request for the first time, the connection has to be established, and that takes time. After that, the connection isn't closed for a while, in the expectation that more requests will come and the connection can be re-used. In your case you do send subsequent requests, and they re-use the previously created connection rather than re-establishing it, which is expensive. I have written my own open-source library that includes a simplistic HTTP client, and I noticed the same effect: the first request takes much longer than subsequent ones. That doesn't explain why Postman and curl don't show the same effect, though. Anyway, if you want to solve this problem and you know your URL in advance, send a request upon your app's initialization (you can even do it on a separate thread). That will solve your problem.
If you are interested in looking at my library, here is the Javadoc link. You can find it as a Maven artifact here and on GitHub here. An article about the library covering a partial list of its features is here.
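To illustrate the warm-up suggestion above: fire a throwaway request during initialization on a background daemon thread, so the first real call doesn't pay for DNS lookup, TCP connect, and TLS handshake. A hedged sketch (the `WarmUp` class is hypothetical, and the Runnable is a stand-in for whatever client call you actually make, e.g. a HEAD request to your known URL):

```java
import java.util.concurrent.CountDownLatch;

public class WarmUp {
    // Run the throwaway request off the main thread so startup isn't blocked;
    // a daemon thread won't keep the JVM alive if the app exits early.
    static Thread start(Runnable request) {
        Thread t = new Thread(request, "http-warmup");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        // Stand-in for e.g. issuing a HEAD/GET to the known URL.
        start(done::countDown);
        done.await();
        System.out.println("warm-up finished");
    }
}
```

By the time the user triggers the first real request, the connection pool already holds an established (and possibly TLS-negotiated) connection to re-use.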
We have been discussing with one of our data providers the issue that some of our HTTP requests are intermittently failing with "Connection Reset" exceptions, though we have also seen "The target server failed to respond" exceptions.
Many Stack Overflow posts point to some potential solutions, namely
It's a pooling configuration issue; try reaping idle connections
An HttpClient version issue; downgrading to HttpClient 4.5.1 (often from 4.5.3) reportedly fixes it. I'm using 4.5.12: https://mvnrepository.com/artifact/org.apache.httpcomponents/httpclient
The target server is actually failing to process the request (or CloudFront in front of the origin server is)
I'm hoping this question will help me get to the bottom of the root cause.
Context
It's a Java web application hosted in AWS Elastic Beanstalk with 2-4 servers depending on load. The Java WAR file uses HttpClient 4.5.12 to communicate. Over the last few months we have seen:
45 x Connection Reset (only 3 were timeouts over 30 s; the others failed within 20 ms)
To put this into context, we perform in the region of 10,000 requests to this supplier, so the error rate isn't excessive, but it is very inconvenient because our customers pay for the service that then subsequently fails.
Right now we are trying to focus on eliminating the "connection reset" scenarios and we have been recommended to try the following:
1) Restart our app servers (a desperate just-in-case measure)
2) Change the DNS servers to Google's 8.8.8.8 & 8.8.4.4 (so our requests take a different path)
3) Assign a static IP to each server (so they can let us communicate without going through their CloudFront distribution)
We will work through those suggestions, but at the same time I want to understand where our HttpClient implementation might not be quite right.
Typical usage
User Request --> Our server (JAX-RS request) --> HttpClient to 3rd party --> Response received e.g. JSON/XML --> Massaged response is sent back (Our JSON format)
Technical details
Tomcat 8 with Java 8 running on 64bit Amazon Linux
4.5.12 HttpClient
4.4.13 HttpCore <-- the Maven dependencies show HttpClient 4.5.12 requires 4.4.13
4.5.12 HttpMime
Typically an HTTP request will take anywhere between 200 ms and 10 seconds, with timeouts set around 15-30 s depending on the API we are invoking. I also use a connection pool, and given that most requests should complete within 30 seconds, I felt it was safe to evict anything older than double that period.
Any advice on whether these are sensible values is appreciated.
// max 200 requests in the connection pool
CONNECTIONS_MAX = 200;
// each 3rd party API can only use up to 50, so worst case 4 APIs can be flooded before the pool is exhausted
CONNECTIONS_MAX_PER_ROUTE = 50;
// as our timeouts are typically 30s, I'm assuming it's safe to clean up connections
// that are double that age
// Connection timeouts are 30s; wasn't sure whether to close at 31s or wait 2 x typical = 60s
CONNECTION_CLOSE_IDLE_MS = 60000;
// If a connection hasn't been used for 60s then we aren't busy and can remove it from the pool
CONNECTION_EVICT_IDLE_MS = 60000;
// Is this per request or per packet? All requests should finish within 30s anyway
CONNECTION_TIME_TO_LIVE_MS = 60000;
// Ensure connections are validated if in the pool but unused for at least 500ms
CONNECTION_VALIDATE_AFTER_INACTIVITY_MS = 500; // was 30000 (not tested 500ms yet)
Additionally we tend to set the three timeouts to 30s, but I'm sure we can fine-tune these...
// The client tries to connect to the server. This denotes the time allowed for the
// connection to be established (i.e. for the server to respond to the connection request).
.setConnectTimeout(...) // typically 30s - I guess this could be 5s (if we can't connect by then, the remote server is stuffed/busy)
// Used when requesting a connection from the connection manager (pooling)
// The time allowed to fetch a connection from the connection pool
.setConnectionRequestTimeout(...) // typically 30s - I guess this only applies if our pool is saturated, in which case it means how long to wait for a connection?
// After establishing the connection, the client socket waits for a response after sending the request.
// This is the period of inactivity to wait for packets to arrive
.setSocketTimeout(...) // typically 30s - I believe this is the main one we care about: if we don't get our payload within 30s, give up
I have copied and pasted the main code we use for all GET/POST requests, but stripped out the unimportant aspects such as our retry logic and pre/post caching.
We are using a single PoolingHttpClientConnectionManager with a single CloseableHttpClient; they're both configured as follows...
private static PoolingHttpClientConnectionManager createConnectionManager() {
    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    cm.setMaxTotal(CONNECTIONS_MAX); // 200
    cm.setDefaultMaxPerRoute(CONNECTIONS_MAX_PER_ROUTE); // 50
    cm.setValidateAfterInactivity(CONNECTION_VALIDATE_AFTER_INACTIVITY_MS); // was 30000, now 500
    return cm;
}

private static CloseableHttpClient createHttpClient() {
    httpClient = HttpClientBuilder.create()
            .setConnectionManager(cm)
            .disableAutomaticRetries() // our code does the retries
            .evictIdleConnections(CONNECTION_EVICT_IDLE_MS, TimeUnit.MILLISECONDS) // 60000
            .setConnectionTimeToLive(CONNECTION_TIME_TO_LIVE_MS, TimeUnit.MILLISECONDS) // 60000
            .setRedirectStrategy(LaxRedirectStrategy.INSTANCE)
            // .setKeepAliveStrategy() - the default implementation looks solely at the 'Keep-Alive' header's timeout token
            .build();
    return httpClient;
}
Every minute, a thread tries to reap connections:
public static PoolStats performIdleConnectionReaper(Object source) {
    synchronized (source) {
        final PoolStats totalStats = cm.getTotalStats();
        Log.info(source, "max:" + totalStats.getMax() + " avail:" + totalStats.getAvailable()
                + " leased:" + totalStats.getLeased() + " pending:" + totalStats.getPending());
        cm.closeExpiredConnections();
        cm.closeIdleConnections(CONNECTION_CLOSE_IDLE_MS, TimeUnit.MILLISECONDS); // 60000
        return totalStats;
    }
}
This is the custom method that performs all HttpClient GET/POST requests. It handles stats, pre-caching, post-caching and other useful stuff, but I've stripped all of that out; this is the typical outline performed for each request. I've tried to follow the pattern from the HttpClient docs that tells you to consume the entity and close the response. Note that I don't close the httpClient, because one instance is used for all requests.
public static HttpHelperResponse execute(HttpHelperParams params) {
    // ret is initialised in the stripped-out setup code
    boolean abortRetries = false;
    while (!abortRetries && ret.getAttempts() <= params.getMaxRetries()) {
        // 1 Create HttpClient
        // This is done once in the static init: CloseableHttpClient httpClient = createHttpClient(params);
        // 2 Create one of the methods, e.g. HttpGet / HttpPost - this also adds HTTP headers
        // (see separate method below)
        HttpRequestBase request = createRequest(params);
        // 3 Tell HttpClient to execute the command
        CloseableHttpResponse response = null;
        HttpEntity entity = null;
        try {
            response = httpClient.execute(request);
            if (response == null) {
                throw new Exception("Null response received");
            } else {
                final StatusLine statusLine = response.getStatusLine();
                ret.setStatusCode(statusLine.getStatusCode());
                ret.setReasonPhrase(statusLine.getReasonPhrase());
                if (ret.getStatusCode() == 429) {
                    try {
                        final int delay = (int) (Math.random() * params.getRetryDelayMs());
                        Thread.sleep(500 + delay); // minimum 500ms + random amount up to the delay specified
                    } catch (Exception e) {
                        Log.error(false, params.getSource(), "HttpHelper Rate-limit sleep exception", e, params);
                    }
                } else {
                    // 4 Read the response and do something useful with the body
                    entity = response.getEntity();
                    if (entity == null) {
                        throw new Exception("Null entity received");
                    } else {
                        ret.setRawResponseAsString(EntityUtils.toString(entity, params.getEncoding()));
                        ret.setSuccess();
                        if (response.getAllHeaders() != null) {
                            for (Header header : response.getAllHeaders()) {
                                ret.addResponseHeader(header.getName(), header.getValue());
                            }
                        }
                    }
                }
            }
        } catch (Exception ex) {
            if (ret.getAttempts() >= params.getMaxRetries()) {
                Log.error(false, params.getSource(), ex);
            } else {
                Log.warn(params.getSource(), ex.getMessage());
            }
            ret.setError(ex); // if we subsequently get a response then the error will be cleared
        } finally {
            ret.incrementAttempts();
            // Any HTTP 2xx is considered successful, so stop retrying; also stop if
            // a specific HTTP code has been flagged as do-not-retry
            if (ret.getStatusCode() >= 200 && ret.getStatusCode() <= 299) {
                abortRetries = true;
            } else if (params.getDoNotRetryStatusCodes().contains(ret.getStatusCode())) {
                abortRetries = true;
            }
            if (entity != null) {
                try {
                    // ensure the entity is fully consumed - this hands the connection back to the pool
                    EntityUtils.consume(entity);
                } catch (IOException ex) {
                    Log.error(false, params.getSource(), "HttpHelper Was unable to consume entity", params);
                }
            }
            if (response != null) {
                try {
                    // The underlying HTTP connection is still held by the response object
                    // to allow the response content to be streamed directly from the network socket.
                    // To ensure correct deallocation of system resources,
                    // the user MUST call CloseableHttpResponse#close() from a finally clause.
                    // Note that if the response content is not fully consumed the underlying
                    // connection cannot be safely re-used and will be shut down and discarded
                    // by the connection manager.
                    response.close();
                } catch (IOException ex) {
                    Log.error(false, params.getSource(), "HttpHelper Was unable to close a response", params);
                }
            }
            // When using connection pooling we don't want to close the client, otherwise
            // the connection pool would be closed too
            // if (httpClient != null) {
            //     try {
            //         httpClient.close();
            //     } catch (IOException ex) {
            //         Log.error(false, params.getSource(), "HttpHelper Was unable to close httpClient", params);
            //     }
            // }
        }
    }
    return ret;
}
private static HttpRequestBase createRequest(HttpHelperParams params) {
    ...
    request.setConfig(RequestConfig.copy(RequestConfig.DEFAULT)
            // The time allowed for the connection to the remote host to be established
            .setConnectTimeout(...) // typically 30s
            // The time allowed to fetch a connection from the connection manager's pool
            .setConnectionRequestTimeout(...) // typically 30s
            // After establishing the connection and sending the request, the period of
            // inactivity to wait for response packets to arrive
            .setSocketTimeout(...) // typically 30s
            .build()
    );
    return request;
}
I am new to Netty, and I followed this example to write a static file server with it. But whenever the server serves a large JS file, it runs into a ClosedChannelException.
The following is my code, where I write a ChunkedFile as the HTTP response.
When a large JS file is being served, I get a ClosedChannelException and the raf file is also closed.
Could you help me figure out what I have done wrong here? Also, is there a simple tutorial where I can get to understand the basic flow of control in Netty?
// Write the content.
ChannelFuture writeFuture = null;
try {
    long fileLength = raf.length();
    HttpResponse response = new DefaultHttpResponse(
            HttpVersion.HTTP_1_1, HttpResponseStatus.OK);
    response.setHeader(HttpHeaders.Names.CONTENT_LENGTH, fileLength);
    Channel c = ctx.getChannel();
    // Write the initial line and the header.
    c.write(response);
    writeFuture = c.write(new ChunkedFile(raf, 0, fileLength, 8192));
} finally {
    raf.close();
    if (writeFuture != null) {
        writeFuture.addListener(ChannelFutureListener.CLOSE);
    }
}
Calling raf.close() in the finally block is wrong, because the write is asynchronous and may not have completed yet when the block runs. In fact, Netty will take care of closing the file after the write is complete.
I am using Apache HttpClient 4 to connect to Twitter's streaming API with default-level access. It works perfectly well at the beginning, but after a few minutes of retrieving data it bails out with this error:
2012-03-28 16:17:00,040 DEBUG org.apache.http.impl.conn.SingleClientConnManager: Get connection for route HttpRoute[{tls}->http://myproxy:80->https://stream.twitter.com:443]
2012-03-28 16:17:00,040 WARN com.cloudera.flume.core.connector.DirectDriver: Exception in source: TestTwitterSource
java.lang.IllegalStateException: Invalid use of SingleClientConnManager: connection still allocated.
Make sure to release the connection before allocating another one.
    at org.apache.http.impl.conn.SingleClientConnManager.getConnection(SingleClientConnManager.java:216)
    at org.apache.http.impl.conn.SingleClientConnManager$1.getConnection(SingleClientConnManager.java:190)
I understand why I am facing this issue. I am trying to use this HttpClient in a flume cluster as a flume source. The code looks like this:
public Event next() throws IOException, InterruptedException {
    try {
        HttpHost target = new HttpHost("stream.twitter.com", 443, "https");
        new BasicHttpContext();
        HttpPost httpPost = new HttpPost("/1/statuses/filter.json");
        StringEntity postEntity = new StringEntity("track=birthday",
                "UTF-8");
        postEntity.setContentType("application/x-www-form-urlencoded");
        httpPost.setEntity(postEntity);
        HttpResponse response = httpClient.execute(target, httpPost,
                new BasicHttpContext());
        BufferedReader reader = new BufferedReader(new InputStreamReader(
                response.getEntity().getContent()));
        String line = null;
        StringBuffer buffer = new StringBuffer();
        while ((line = reader.readLine()) != null) {
            buffer.append(line);
            if (buffer.length() > 30000) break;
        }
        return new EventImpl(buffer.toString().getBytes());
    } catch (IOException ie) {
        throw ie;
    }
}
I am trying to buffer 30,000 characters from the response stream into a StringBuffer and then return that as the received data. I am obviously not closing the connection, but I don't want to close it just yet, I guess. Twitter's dev guide talks about this here. It reads:
Some HTTP client libraries only return the response body after the
connection has been closed by the server. These clients will not work
for accessing the Streaming API. You must use an HTTP client that will
return response data incrementally. Most robust HTTP client libraries
will provide this functionality. The Apache HttpClient will handle
this use case, for example.
It clearly states that HttpClient will return response data incrementally. I've gone through the examples and tutorials, but I haven't found anything that comes close to doing this. If you have used an HTTP client (Apache's or otherwise) to read Twitter's streaming API incrementally, please let me know how you achieved it. Those who haven't, please feel free to contribute answers. TIA.
UPDATE
I tried the following: 1) I moved obtaining the stream handle to the open method of the Flume source. 2) I use a plain InputStream and read data into a byte buffer. Here is what the method body looks like now:
byte[] buffer = new byte[30000];
while (true) {
    int count = instream.read(buffer);
    if (count == -1)
        continue;
    else
        break;
}
return new EventImpl(buffer);
This works to an extent: I get tweets, and they are nicely written to a destination. The problem is with the instream.read(buffer) return value. Even when there is no data on the stream, the buffer, still holding its default \u0000 bytes, all 30,000 of them, gets written to the destination. So the destination file looks like this: "tweets..tweets..tweets..\u0000\u0000\u0000\u0000\u0000\u0000\u0000...tweets..tweets...". I understand the count will never be -1 because this is a never-ending stream, so how do I figure out whether the buffer has new content from the read call?
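For reference, InputStream.read(byte[]) returns the number of bytes it actually placed in the buffer (or -1 at end of stream), so the event can be trimmed to exactly that count and the \u0000 padding never written. A small illustrative sketch (the `readChunk` helper is hypothetical, not part of Flume or HttpClient):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class ReadTrim {
    // read() reports how many bytes were actually filled; copying only that
    // many means trailing zero bytes never reach the destination.
    static byte[] readChunk(InputStream in, int max) throws IOException {
        byte[] buffer = new byte[max];
        int count = in.read(buffer);
        if (count <= 0) {
            return new byte[0]; // end of stream or nothing read
        }
        return Arrays.copyOf(buffer, count);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("tweets".getBytes());
        System.out.println(readChunk(in, 30000).length); // prints 6
    }
}
```

In the Flume source this would become new EventImpl(readChunk(instream, 30000)) instead of handing over the whole fixed-size buffer.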
The problem is that your code is leaking connections. Please make sure that, no matter what, you either close the content stream or abort the request.
InputStream instream = response.getEntity().getContent();
try {
    BufferedReader reader = new BufferedReader(
            new InputStreamReader(instream));
    String line = null;
    StringBuffer buffer = new StringBuffer();
    while ((line = reader.readLine()) != null) {
        buffer.append(line);
        if (buffer.length() > 30000) {
            httpPost.abort();
            // connection will not be re-used
            break;
        }
    }
    return new EventImpl(buffer.toString().getBytes());
} finally {
    // if the request is not aborted the connection can be re-used
    try {
        instream.close();
    } catch (IOException ex) {
        // log or ignore
    }
}
It turns out that it was a Flume issue. Flume is optimized to transfer events of up to 32 KB; anything beyond that and Flume bails out. (The workaround is to tune the event size to be greater than 32 KB.) So I've changed my code to buffer at least 20,000 characters. It kind of works, but it is not foolproof: it can still fail if the buffered length exceeds 32 KB. However, it hasn't failed in an hour of testing so far; I believe that's because Twitter doesn't send a lot of data on its public stream.
while ((line = reader.readLine()) != null) {
    buffer.append(line);
    if (buffer.length() > 20000) break;
}
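One caveat with the workaround above: Flume's limit is in bytes, while buffer.length() counts chars, so a stream heavy in multibyte characters could still blow past 32 KB. A hedged variant that checks the encoded size instead (the `exceedsCap` helper is illustrative only):

```java
import java.nio.charset.StandardCharsets;

public class ByteCap {
    // Check the UTF-8 encoded size rather than the char count, since
    // the 32 KB event limit applies to bytes, not characters.
    static boolean exceedsCap(StringBuffer buffer, int capBytes) {
        return buffer.toString().getBytes(StandardCharsets.UTF_8).length > capBytes;
    }

    public static void main(String[] args) {
        StringBuffer b = new StringBuffer("héllo"); // 5 chars, 6 UTF-8 bytes
        System.out.println(exceedsCap(b, 5)); // prints true
        System.out.println(exceedsCap(b, 6)); // prints false
    }
}
```

The read loop would then use if (exceedsCap(buffer, 20000)) break; which leaves comfortable headroom below 32 KB regardless of the characters Twitter sends.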
I'm new to Java development, so please bear with me. Also, I hope I'm not the champion of tl;dr :).
I'm using HttpClient to make requests over HTTP (duh!) and I'd gotten it to work for a simple servlet that receives a URL as a query-string parameter. I realized that my code could use some refactoring, so I decided to write my own HttpResponseHandler to clean up the code, make it reusable, and improve exception handling.
I currently have something like this:
public class HttpResponseHandler implements ResponseHandler<InputStream> {
    public InputStream handleResponse(HttpResponse response)
            throws ClientProtocolException, IOException {
        int statusCode = response.getStatusLine().getStatusCode();
        InputStream in = null;
        if (statusCode != HttpStatus.SC_OK) {
            throw new HttpResponseException(statusCode, null);
        } else {
            HttpEntity entity = response.getEntity();
            if (entity != null) {
                in = entity.getContent();
                // This works
                // for (int i; (i = in.read()) >= 0;) System.out.print((char) i);
            }
        }
        return in;
    }
}
And in the method where I make the actual request:
HttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet(target);
ResponseHandler<InputStream> httpResponseHandler = new HttpResponseHandler();
try {
    InputStream in = httpclient.execute(httpget, httpResponseHandler);
    // This doesn't work
    // for (int i; (i = in.read()) >= 0;) System.out.print((char) i);
    return in;
} catch (HttpResponseException e) {
    throw new HttpResponseException(e.getStatusCode(), null);
}
The problem is that the InputStream returned from the handler is closed. I have no idea why, but I've verified it with the prints in my code (and no, I haven't used them both at the same time :). While the first print works, the second one gives a closed-stream error.
I need InputStreams, because all my other methods expect an InputStream and not a String. Also, I want to be able to retrieve images (or maybe other types of files), not just text files.
I can work around this pretty easily by giving up on the response handler (I have a working implementation that doesn't use one), but I'm pretty curious about the following:
Why does it do what it does?
How do I open the stream, if something closes it?
What's the right way to do this, anyway :)?
I've checked the docs and couldn't find anything useful on this issue. To save you a bit of Googling, here's the Javadoc and here's the HttpClient tutorial (section 1.1.8, Response handlers).
Thanks,
Alex
It closes the stream because a ResponseHandler must handle the whole response. Even if you got an open stream back, it would be positioned at the end of the stream.
The stream is closed by BasicHttpEntity's consumeContent() call, to ensure you don't read from the stream again.
In your case, you don't really need a ResponseHandler.
The automatic resource management closes the stream for you, to make sure all resources are freed and ready for the next task.
If you want streams, your best bet is to copy the content to a byte array and return a ByteArrayInputStream, provided the content is relatively modest.
If the content is not modest, then you'll have to do the resource management yourself and not use the ResponseHandler.
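A minimal sketch of that copy-to-memory approach, assuming the content fits comfortably in memory (the `StreamBuffering` class and `buffer` helper are illustrative; inside a ResponseHandler you would call it on entity.getContent() and return the result):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamBuffering {
    // Copy the entity stream into memory while the connection is still open;
    // the returned ByteArrayInputStream survives the handler's cleanup,
    // so callers can read it after the response has been released.
    static InputStream buffer(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new ByteArrayInputStream(out.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        InputStream copy = buffer(
                new ByteArrayInputStream("payload".getBytes()));
        System.out.println(copy.available()); // prints 7
    }
}
```

Because the copy is plain bytes, this works for images and other binary files just as well as for text, which matches the requirement of returning InputStreams rather than Strings.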