I need a monitor class that regularly checks whether a given HTTP URL is available. I can take care of the "regularly" part using the Spring TaskExecutor abstraction, so that's not the topic here. The question is: What is the preferred way to ping a URL in Java?
Here is my current code as a starting point:
try {
final URLConnection connection = new URL(url).openConnection();
connection.connect();
LOG.info("Service " + url + " available, yeah!");
available = true;
} catch (final MalformedURLException e) {
throw new IllegalStateException("Bad URL: " + url, e);
} catch (final IOException e) {
LOG.info("Service " + url + " unavailable, oh no!", e);
available = false;
}
Is this any good at all (will it do what I want)?
Do I have to somehow close the connection?
I suppose this is a GET request. Is there a way to send HEAD instead?
Is this any good at all (will it do what I want?)
You can do so. Another feasible way is using java.net.Socket.
public static boolean pingHost(String host, int port, int timeout) {
try (Socket socket = new Socket()) {
socket.connect(new InetSocketAddress(host, port), timeout);
return true;
} catch (IOException e) {
return false; // Either timeout or unreachable or failed DNS lookup.
}
}
There's also the InetAddress#isReachable():
boolean reachable = InetAddress.getByName(hostname).isReachable(timeout); // timeout in milliseconds; there is no no-arg overload
This, however, doesn't explicitly test port 80. You risk getting false negatives due to a firewall blocking other ports.
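If you start from a URL rather than a host/port pair, the pingHost idea above needs the port that the URL implies. Here is a minimal sketch of that, assuming the URL is http or https; the class and method names (HostPortPing, effectivePort, pingUrlHost) are made up for illustration. Like pingHost, it only tells you that something accepts TCP connections on that port, not that the web application answers.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;

class HostPortPing {

    /** Returns the port named in the URL, or the scheme default (443 for https, otherwise 80). */
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    /** Opens a TCP connection to the URL's host and effective port; true if it connects in time. */
    static boolean pingUrlHost(String url, int timeoutMillis) {
        // Assumption: the URL has a host part; URI.create throws IllegalArgumentException on invalid input.
        URI uri = URI.create(url);
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(uri.getHost(), effectivePort(uri)), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // Timeout, connection refused, or failed DNS lookup.
        }
    }
}
```

For example, pingUrlHost("http://example.com/app", 2000) would try example.com on port 80 with a two second connect timeout.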
Do I have to somehow close the connection?
No, you don't need to close it explicitly. It's handled and pooled under the hood.
I suppose this is a GET request. Is there a way to send HEAD instead?
You can cast the obtained URLConnection to HttpURLConnection and then use setRequestMethod() to set the request method. However, you need to take into account that some poorly written webapps or homegrown servers may return an HTTP 405 error for a HEAD (i.e. not available, not implemented, not allowed) while a GET works perfectly fine. Using GET is more reliable if you intend to verify links/resources rather than domains/hosts.
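One way to get the cheapness of HEAD without the false negatives is to try HEAD first and repeat the check with GET only when the server rejects the method. The following is a rough sketch of that idea, not a definitive implementation; the class and method names are hypothetical, and treating 501 the same as 405 is my own assumption.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class HeadWithGetFallback {

    /** True for the status codes that suggest the server does not support HEAD (assumed: 405 and 501). */
    static boolean headUnsupported(int status) {
        return status == HttpURLConnection.HTTP_BAD_METHOD        // 405
            || status == HttpURLConnection.HTTP_NOT_IMPLEMENTED;  // 501
    }

    /** Sends one request with the given method and returns the status code. */
    static int request(String url, String method, int timeoutMillis) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setConnectTimeout(timeoutMillis);
        connection.setReadTimeout(timeoutMillis);
        connection.setRequestMethod(method);
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    /** Tries HEAD, falls back to GET if HEAD is rejected, and accepts 200-399 as available. */
    static boolean isAvailable(String url, int timeoutMillis) {
        try {
            int status = request(url, "HEAD", timeoutMillis);
            if (headUnsupported(status)) {
                status = request(url, "GET", timeoutMillis);
            }
            return 200 <= status && status <= 399;
        } catch (IOException e) {
            return false; // Includes malformed URLs, timeouts and connection failures.
        }
    }
}
```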
Testing the server for availability is not enough in my case, I need to test the URL (the webapp may not be deployed)
Indeed, connecting to a host only tells you whether the host is available, not whether the content is available. It may well happen that a webserver has started without problems, but the webapp failed to deploy during the server's start. This will, however, usually not cause the entire server to go down. You can detect that case by checking whether the HTTP response code is 200.
HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
connection.setRequestMethod("HEAD");
int responseCode = connection.getResponseCode();
if (responseCode != 200) {
// Not OK.
}
// < 100 is undetermined.
// 1nn is informational (shouldn't happen on a GET/HEAD)
// 2nn is success
// 3nn is redirect
// 4nn is client error
// 5nn is server error
For more detail about response status codes, see RFC 2616 section 10 (since superseded by RFC 9110 section 15). By the way, calling connect() is not needed if you're reading the response data; it will connect implicitly.
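If you want to log something more descriptive than the bare number, the ranges listed above can be turned into a small helper. This is only a sketch; the class name, method name and label strings are made up. Note that getResponseCode() returns -1 when the response is not valid HTTP, which falls into the "undetermined" bucket here.

```java
class HttpStatusClass {

    /** Maps a status code to the range it belongs to, following the ranges listed above. */
    static String describe(int code) {
        if (code < 100 || code > 599) {
            return "undetermined"; // e.g. -1 from getResponseCode()
        }
        switch (code / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirect";
            case 4: return "client error";
            default: return "server error"; // only 5nn can reach this branch
        }
    }
}
```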
For future reference, here's a complete example in the form of a utility method, which also takes timeouts into account:
/**
* Pings a HTTP URL. This effectively sends a HEAD request and returns <code>true</code> if the response code is in
* the 200-399 range.
* @param url The HTTP URL to be pinged.
* @param timeout The timeout in millis for both the connection timeout and the response read timeout. Note that
* the total timeout is effectively two times the given timeout.
* @return <code>true</code> if the given HTTP URL has returned response code 200-399 on a HEAD request within the
* given timeout, otherwise <code>false</code>.
*/
public static boolean pingURL(String url, int timeout) {
url = url.replaceFirst("^https", "http"); // Otherwise an exception may be thrown on invalid SSL certificates.
try {
HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
connection.setConnectTimeout(timeout);
connection.setReadTimeout(timeout);
connection.setRequestMethod("HEAD");
int responseCode = connection.getResponseCode();
return (200 <= responseCode && responseCode <= 399);
} catch (IOException exception) {
return false;
}
}
Instead of using URLConnection, use HttpURLConnection by casting the result of calling openConnection() on your URL object.
Then getResponseCode() will give you the HTTP response code once you've read from the connection.
Here is the code:
HttpURLConnection connection = null;
try {
URL u = new URL("http://www.google.com/");
connection = (HttpURLConnection) u.openConnection();
connection.setRequestMethod("HEAD");
int code = connection.getResponseCode();
System.out.println("" + code);
// Decide based on the HTTP status code received; 200 is success.
} catch (MalformedURLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} finally {
if (connection != null) {
connection.disconnect();
}
}
Also see the similar question "How to check if a URL exists or returns 404 with Java?"
Hope this helps.
You could also use HttpURLConnection, which allows you to set the request method (to HEAD for example). Here's an example that shows how to send a request, read the response, and disconnect.
The following code performs a HEAD request to check whether the website is available or not.
public static boolean isReachable(String targetUrl) throws IOException
{
HttpURLConnection httpUrlConnection = (HttpURLConnection) new URL(
targetUrl).openConnection();
httpUrlConnection.setRequestMethod("HEAD");
try
{
int responseCode = httpUrlConnection.getResponseCode();
return responseCode == HttpURLConnection.HTTP_OK;
} catch (UnknownHostException noInternetConnection)
{
return false;
}
}
public boolean isOnline() {
Runtime runtime = Runtime.getRuntime();
try {
Process ipProcess = runtime.exec("/system/bin/ping -c 1 8.8.8.8");
int exitValue = ipProcess.waitFor();
return (exitValue == 0);
} catch (IOException | InterruptedException e) { e.printStackTrace(); }
return false;
}
Possible questions
Is this really fast enough? Yes, very fast!
Couldn't I just ping my own page, which I want to request anyway? Sure! You could even check both, if you want to differentiate between "internet connection available" and your own servers being reachable.
What if the DNS is down? Google DNS (e.g. 8.8.8.8) is the largest public DNS service in the world. As of 2013 it serves 130 billion requests a day. Let's just say your app not responding would probably not be the talk of the day.
Read the link; it seems very good.
EDIT:
In my experience of using it, it's not as fast as this method:
public boolean isOnline() {
NetworkInfo netInfo = connectivityManager.getActiveNetworkInfo();
return netInfo != null && netInfo.isConnectedOrConnecting();
}
The two are a bit different, but if you just want to check the connection to the internet, the first method may be slow depending on the state of the connection.
Consider using the Restlet framework, which has great semantics for this sort of thing. It's powerful and flexible.
The code could be as simple as:
Client client = new Client(Protocol.HTTP);
Response response = client.get(url);
if (response.getStatus().isError()) {
// uh oh!
}
Related
We have been discussing with one of our data providers the issue that some of our HTTP requests are intermittently failing due to "Connection Reset" exceptions, but we have also seen "The target server failed to respond" exceptions.
Many Stack Overflow posts point to some potential solutions, namely
It's a pooling configuration issue, try reaping
HttpClient version issue - suggesting downgrading to HttpClient 4.5.1 (often from 4.5.3) fixes it. I'm using 4.5.12 https://mvnrepository.com/artifact/org.apache.httpcomponents/httpclient
The target server is actually failing to process the request (or cloudfront before the origin server).
I'm hoping this question will help me get to the bottom of the root cause.
Context
It's a Java web application hosted in AWS Elastic Beanstalk with 2..4 servers based on load. The Java WAR file uses HttpClient 4.5.12 to communicate. Over the last few months we have seen
45 x Connection Reset (only 3 were timeouts over 30s, the others failed within 20ms)
To put this into context, we perform in the region of 10,000 requests to this supplier, so the error rate isn't excessive, but it is very inconvenient because our customers pay for the service that then subsequently fails.
Right now we are trying to focus on eliminating the "connection reset" scenarios and we have been recommended to try the following:
1) Restart our app servers (a desperate just-in-case scenario)
2) Change the DNS servers to use Google 8.8.8.8 & 8.8.4.4 (so our requests take a different path)
3) Assign a static IP to each server (so they can enable us to communicate without going through their CloudFront distribution)
We will work through those suggestions, but at the same time I want to understand where our HttpClient implementation might not be quite right.
Typical usage
User Request --> Our server (JAX-RS request) --> HttpClient to 3rd party --> Response received e.g. JSON/XML --> Massaged response is sent back (Our JSON format)
Technical details
Tomcat 8 with Java 8 running on 64bit Amazon Linux
4.5.12 HttpClient
4.4.13 HttpCore <-- Maven dependencies shows HttpClient 4.5.12 requires 4.4.13
4.5.12 HttpMime
Typically a HTTP request will take anywhere between 200ms and 10 seconds, with timeouts set around 15-30s depending on the API we are invoking. I also use a connection pool and given that most requests should be complete within 30 seconds I felt it was safe to evict anything older than double that period.
Any advice on whether these are sensible values is appreciated.
// max 200 requests in the connection pool
CONNECTIONS_MAX = 200;
// each 3rd party API can only use up to 50, so worst case 4 APIs can be flooded before exhausted
CONNECTIONS_MAX_PER_ROUTE = 50;
// as our timeouts are typically 30s I'm assuming it's safe to clean up connections
// that are double that
// Connection timeouts are 30s, wasn't sure whether to close 31s or wait 2xtypical = 60s
CONNECTION_CLOSE_IDLE_MS = 60000;
// If the connection hasn't been used for 60s then we aren't busy and we can remove from the connection pool
CONNECTION_EVICT_IDLE_MS = 60000;
// Is this per request or each packet, but all requests should finish within 30s
CONNECTION_TIME_TO_LIVE_MS = 60000;
// To ensure connections are validated if in the pool but hasn't been used for at least 500ms
CONNECTION_VALIDATE_AFTER_INACTIVITY_MS = 500; // WAS 30000 (not tested with 500ms yet)
Additionally we tend to set the three timeouts to 30s, but I'm sure we can fine-tune these...
// client tries to connect to the server. This denotes the time elapsed before the connection established or Server responded to connection request.
// The time to establish a connection with the remote host
.setConnectTimeout(...) // typical 30s - I guess this could be 5s (if we can't connect by then the remote server is stuffed/busy)
// Used when requesting a connection from the connection manager (pooling)
// The time to fetch a connection from the connection pool
.setConnectionRequestTimeout(...) // typical 30s - I guess only applicable if our pool is saturated, then this means how long to wait to get a connection?
// After establishing the connection, the client socket waits for response after sending the request.
// This is the time of inactivity to wait for packets to arrive
.setSocketTimeout(...) // typical 30s - I believe this is the main one that we care about, if we don't get our payload in 30s then give up
I have copied and pasted the main code we use for all GET/POST requests but stripped out the unimportant aspects such as our retry logic, pre-cache and post-cache.
We are using a single PoolingHttpClientConnectionManager with a single CloseableHttpClient, they're both configured as follows...
private static PoolingHttpClientConnectionManager createConnectionManager() {
PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
cm.setMaxTotal(CONNECTIONS_MAX); // 200
cm.setDefaultMaxPerRoute(CONNECTIONS_MAX_PER_ROUTE); // 50
cm.setValidateAfterInactivity(CONNECTION_VALIDATE_AFTER_INACTIVITY_MS); // Was 30000 now 500
return cm;
}
private static CloseableHttpClient createHttpClient() {
httpClient = HttpClientBuilder.create()
.setConnectionManager(cm)
.disableAutomaticRetries() // our code does the retries
.evictIdleConnections(CONNECTION_EVICT_IDLE_MS, TimeUnit.MILLISECONDS) // 60000
.setConnectionTimeToLive(CONNECTION_TIME_TO_LIVE_MS, TimeUnit.MILLISECONDS) // 60000
.setRedirectStrategy(LaxRedirectStrategy.INSTANCE)
// .setKeepAliveStrategy() - The default implementation looks solely at the 'Keep-Alive' header's timeout token.
.build();
return httpClient;
}
Every minute I have a thread that tries to reap connections
public static PoolStats performIdleConnectionReaper(Object source) {
synchronized (source) {
final PoolStats totalStats = cm.getTotalStats();
Log.info(source, "max:" + totalStats.getMax() + " avail:" + totalStats.getAvailable() + " leased:" + totalStats.getLeased() + " pending:" + totalStats.getPending());
cm.closeExpiredConnections();
cm.closeIdleConnections(CONNECTION_CLOSE_IDLE_MS, TimeUnit.MILLISECONDS); // 60000
return totalStats;
}
}
This is the custom method that performs all HttpClient GET/POST, it does stats, pre-cache, post-cache and other useful stuff, but I've stripped all of that out and this is the typical outline performed for each request. I've tried to follow the pattern as per the HttpClient docs that tell you to consume the entity and close the response. Note I don't close the httpClient because one instance is being used for all requests.
public static HttpHelperResponse execute(HttpHelperParams params) {
boolean abortRetries = false;
while (!abortRetries && ret.getAttempts() <= params.getMaxRetries()) {
// 1 Create HttpClient
// This is done once in the static init CloseableHttpClient httpClient = createHttpClient(params);
// 2 Create one of the methods, e.g. HttpGet / HttpPost - Note this also adds HTTP headers
// (see separate method below)
HttpRequestBase request = createRequest(params);
// 3 Tell HTTP Client to execute the command
CloseableHttpResponse response = null;
HttpEntity entity = null;
boolean alreadyStreamed = false;
try {
response = httpClient.execute(request);
if (response == null) {
throw new Exception("Null response received");
} else {
final StatusLine statusLine = response.getStatusLine();
ret.setStatusCode(statusLine.getStatusCode());
ret.setReasonPhrase(statusLine.getReasonPhrase());
if (ret.getStatusCode() == 429) {
try {
final int delay = (int) (Math.random() * params.getRetryDelayMs());
Thread.sleep(500 + delay); // minimum 500ms + random amount up to delay specified
} catch (Exception e) {
Log.error(false, params.getSource(), "HttpHelper Rate-limit sleep exception", e, params);
}
} else {
// 4 Read the response
// 6 Deal with the response
// do something useful with the response body
entity = response.getEntity();
if (entity == null) {
throw new Exception("Null entity received");
} else {
ret.setRawResponseAsString(EntityUtils.toString(entity, params.getEncoding()));
ret.setSuccess();
if (response.getAllHeaders() != null) {
for (Header header : response.getAllHeaders()) {
ret.addResponseHeader(header.getName(), header.getValue());
}
}
}
}
}
} catch (Exception ex) {
if (ret.getAttempts() >= params.getMaxRetries()) {
Log.error(false, params.getSource(), ex);
} else {
Log.warn(params.getSource(), ex.getMessage());
}
ret.setError(ex); // If we subsequently get a response then the error will be cleared.
} finally {
ret.incrementAttempts();
// Any HTTP 2xx is considered successful, so stop retrying, or if
// a specific HTTP code has been passed to stop retrying
if (ret.getStatusCode() >= 200 && ret.getStatusCode() <= 299) {
abortRetries = true;
} else if (params.getDoNotRetryStatusCodes().contains(ret.getStatusCode())) {
abortRetries = true;
}
if (entity != null) {
try {
// and ensure it is fully consumed - hand it back to the pool
EntityUtils.consume(entity);
} catch (IOException ex) {
Log.error(false, params.getSource(), "HttpHelper Was unable to consume entity", params);
}
}
if (response != null) {
try {
// The underlying HTTP connection is still held by the response object
// to allow the response content to be streamed directly from the network socket.
// In order to ensure correct deallocation of system resources
// the user MUST call CloseableHttpResponse#close() from a finally clause.
// Please note that if response content is not fully consumed the underlying
// connection cannot be safely re-used and will be shut down and discarded
// by the connection manager.
response.close();
} catch (IOException ex) {
Log.error(false, params.getSource(), "HttpHelper Was unable to close a response", params);
}
}
// When using connection pooling we don't want to close the client, otherwise the connection
// pool will also be closed
// if (httpClient != null) {
// try {
// httpClient.close();
// } catch (IOException ex) {
// Log.error(false, params.getSource(), "HttpHelper Was unable to close httpClient", params);
// }
// }
}
}
return ret;
}
private static HttpRequestBase createRequest(HttpHelperParams params) {
...
request.setConfig(RequestConfig.copy(RequestConfig.DEFAULT)
// client tries to connect to the server. This denotes the time elapsed before the connection established or Server responded to connection request.
// The time to establish a connection with the remote host
.setConnectTimeout(...) // typical 30s
// Used when requesting a connection from the connection manager (pooling)
// The time to fetch a connection from the connection pool
.setConnectionRequestTimeout(...) // typical 30s
// After establishing the connection, the client socket waits for response after sending the request.
// This is the time of inactivity to wait for packets to arrive
.setSocketTimeout(...) // typical 30s
.build()
);
return request;
}
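As a side note on the retry loop above: the two decisions buried in it (when to stop retrying, and how long to sleep after a 429) can be pulled out into small functions that are easy to test on their own. This is only a sketch that mirrors the logic already shown, 500 ms plus a random amount below the configured retry delay; the class and method names are hypothetical, and passing in a Random is my own choice to make the delay testable.

```java
import java.util.Random;
import java.util.Set;

class RetryDecisions {

    static final int MIN_RATE_LIMIT_DELAY_MS = 500;

    /** Stop on any 2xx, or on a status code the caller explicitly marked as not retryable. */
    static boolean shouldStopRetrying(int statusCode, Set<Integer> doNotRetryStatusCodes) {
        boolean success = statusCode >= 200 && statusCode <= 299;
        return success || doNotRetryStatusCodes.contains(statusCode);
    }

    /** Minimum 500 ms plus a random amount in [0, retryDelayMs). */
    static long rateLimitDelayMs(int retryDelayMs, Random random) {
        int jitter = retryDelayMs > 0 ? random.nextInt(retryDelayMs) : 0;
        return MIN_RATE_LIMIT_DELAY_MS + jitter;
    }
}
```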
I'm making a POST request using Java 8 like this:
URL url = new URL("http://target.server.com/doIt");
URLConnection connection = url.openConnection();
HttpURLConnection httpConn = (HttpURLConnection) connection;
byte[] soapBytes = soapRequest.getBytes();
httpConn.setRequestProperty("Host", "target.host.com");
httpConn.setRequestProperty("Content-Length", soapBytes.length+"");
httpConn.setRequestProperty("Content-Type", "application/soap+xml; charset=utf-8");
httpConn.setRequestMethod("POST");
httpConn.setConnectTimeout(5000);
httpConn.setReadTimeout(35000);
httpConn.setDoOutput(true);
httpConn.setDoInput(true);
OutputStream out = httpConn.getOutputStream();
out.write(soapBytes);
out.close();
int statusCode;
try {
statusCode = httpConn.getResponseCode();
} catch (IOException e) {
InputStream stream = httpConn.getErrorStream();
if (stream == null) {
throw e;
} else {
// this never happens
}
}
My soap request contains a document ID and the target server (which hosts a third-party service that I do not own or have access to) returns a PDF document that matches the supplied ID.
Most of the time, the server returns a PDF doc and occasionally the status code is 500 when the document is not available. However, sometimes the call to getResponseCode() throws an IOException with "Invalid Http response".
I thought that a server would always have some response code to return, no matter what happens.
Does this mean that server is returning complete garbage that doesn't
match the expected format of a HTTP response?
Is there a way to get any more information about the actual response?
Is there a way to retrieve the raw textual response (if any)?
As AxelH points out, there must be something wrong when connecting with the remote server, and in this case you just can't get a valid response.
If you are in a testing environment, you can monitor the connection at the TCP level (not at the HTTP level): put a monitor between your client and the remote server that records all the TCP traffic exchanged between the two peers. If you are using Eclipse, you can create a TCP monitor.
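If setting up a TCP monitor is not an option, you could also send the request over a plain Socket and look at the first bytes the server sends back, bypassing the HTTP parsing in HttpURLConnection. Below is a rough sketch under several assumptions: plain HTTP (no TLS), a request you serialize yourself (request line, headers including Host, Content-Length and "Connection: close", then the body), and made-up names throughout. The looksLikeStatusLine check is only my approximation of what a well-formed HTTP/1.x status line looks like.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class RawHttpProbe {

    /** Rough check that a line has the shape "HTTP/x.y NNN ..." of an HTTP/1.x status line. */
    static boolean looksLikeStatusLine(String line) {
        return line != null && line.matches("HTTP/\\d\\.\\d \\d{3}( .*)?");
    }

    /** Writes the raw request bytes and returns up to maxBytes of whatever the server answers. */
    static String exchangeRaw(String host, int port, byte[] rawRequest,
                              int timeoutMillis, int maxBytes) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis); // read timeout
            OutputStream out = socket.getOutputStream();
            out.write(rawRequest);
            out.flush();
            InputStream in = socket.getInputStream();
            byte[] buffer = new byte[maxBytes];
            int total = 0;
            int read;
            while (total < maxBytes && (read = in.read(buffer, total, maxBytes - total)) != -1) {
                total += read;
            }
            // ISO-8859-1 maps every byte to a character, so nothing is lost when printing garbage.
            return new String(buffer, 0, total, StandardCharsets.ISO_8859_1);
        }
    }
}
```

Logging the first line of the returned text and running it through looksLikeStatusLine should show whether the server really sent something that is not HTTP, or sent nothing at all before closing the connection.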
I am developing code for a project, part of which checks whether each URL in a list of websites is live and confirms it.
So far everything is working as planned, except for some pages in this list that are Moved Permanently with status 301. In case of a 301 I need to get the new URL and pass it to a method before returning true.
The following example just moves to https, but other examples could move to another URL. So if you call this site:
http://en.wikipedia.org/wiki/HTTP_301
it moves to
https://en.wikipedia.org/wiki/HTTP_301
Which is fine, I just need to get the new Url.
Is this possible and how?
This is my working code part so far:
boolean isUrlOk(String urlInput) {
HttpURLConnection connection = null;
try {
URL url = new URL(urlInput);
connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("GET");
connection.connect();
urlStatusCode = connection.getResponseCode();
} catch (IOException e) {
// other error types to be reported
e.printStackTrace();
}
if (urlStatusCode == 200) {
return true;
} else if (urlStatusCode == 301) {
// call a method with the correct url name
// before returning true
return true;
}
return false;
}
You can get the new URL with
String newUrl = connection.getHeaderField("Location");
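Two details are worth keeping in mind here. HttpURLConnection follows same-protocol redirects by default, so you only see the 301 yourself if you turn that off with setInstanceFollowRedirects(false) (a redirect from http to https is not followed automatically, which is why the Wikipedia example shows up as 301). Also, the Location header may be relative, so it is safer to resolve it against the original URL. A minimal sketch, with hypothetical class and method names, and with handling of 302 added as my own assumption:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

class RedirectTarget {

    /** Resolves a possibly relative Location header against the URL that was requested. */
    static String resolveLocation(String requestUrl, String locationHeader) {
        return URI.create(requestUrl).resolve(locationHeader).toString();
    }

    /** Returns the redirect target for a 301/302 response, or null if the response is not a redirect. */
    static String findRedirect(String requestUrl) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(requestUrl).openConnection();
        connection.setInstanceFollowRedirects(false); // we want to see the 3xx ourselves
        connection.setRequestMethod("GET");
        try {
            int status = connection.getResponseCode();
            String location = connection.getHeaderField("Location");
            boolean redirect = status == HttpURLConnection.HTTP_MOVED_PERM   // 301
                    || status == HttpURLConnection.HTTP_MOVED_TEMP;          // 302
            return (redirect && location != null) ? resolveLocation(requestUrl, location) : null;
        } finally {
            connection.disconnect();
        }
    }
}
```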
Hi I am writing a program that goes through many different URLs and just checks if they exist or not. I am basically checking if the error code returned is 404 or not. However as I am checking over 1000 URLs, I want to be able to do this very quickly. The following is my code, I was wondering how I can modify it to work quickly (if possible):
final URL url = new URL("http://www.example.com");
HttpURLConnection huc = (HttpURLConnection) url.openConnection();
int responseCode = huc.getResponseCode();
if (responseCode != 404) {
System.out.println("GOOD");
} else {
System.out.println("BAD");
}
Would it be quicker to use JSoup?
I am aware some sites give the code 200 and have their own error page; however, I know the links that I am checking don't do this, so handling that is not needed.
Try sending a "HEAD" request instead of a GET request. That should be faster since the response body is not downloaded.
huc.setRequestMethod("HEAD");
Also, instead of checking that the response status is not 404, check that it is 200. That is, check for the positive instead of the negative. 404, 403, 402: all 40x statuses are nearly equivalent to an invalid or non-existent URL.
You may make use of multi-threading to make it even faster.
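To illustrate the multi-threading suggestion, here is a sketch that runs the checks on a fixed-size thread pool so that the number of simultaneous connections stays bounded. The names are made up, and the actual check is passed in as a Predicate so you can plug in the HEAD request from above (or a fake for testing); a checker that throws is counted as "not OK", which is my own choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

class ParallelUrlCheck {

    /** Runs checker on every URL using at most 'threads' worker threads; results keep input order. */
    static Map<String, Boolean> checkAll(List<String> urls, int threads, Predicate<String> checker) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<Boolean> task = () -> checker.test(url);
                futures.add(pool.submit(task));
            }
            Map<String, Boolean> results = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                try {
                    results.put(urls.get(i), futures.get(i).get());
                } catch (ExecutionException e) {
                    results.put(urls.get(i), false); // the checker threw an exception
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    results.put(urls.get(i), false);
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

With 1000 URLs, something like checkAll(urls, 20, url -> responseIs200(url)) would keep 20 requests in flight at a time, where responseIs200 stands for whatever single-URL check you already have.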
Try querying the DNS server first:
import java.net.InetAddress;
import java.net.UnknownHostException;

class DNSLookup
{
    public static void main(String args[])
    {
        String host = "stackoverflow.com";
        try
        {
            InetAddress inetAddress = InetAddress.getByName(host);
            // show the Internet Address as name/address
            System.out.println(inetAddress.getHostName() + " " + inetAddress.getHostAddress());
        }
        catch (UnknownHostException exception)
        {
            // thrown when no IP address for the host could be found
            // (getByName does not throw NamingException, so that catch block would not compile)
            System.err.println("ERROR: No DNS record for '" + host + "'");
            exception.printStackTrace();
        }
    }
}
It seems you can set the timeout property; make sure its value is acceptable. And if you have many URLs to test, check them in parallel; it will be much faster. Hope this is helpful.
I have the following code (Android 4):
private HttpURLConnection conn = null;
private synchronized String downloadUrl(String myurl) {
InputStream is = null;
BufferedReader _bufferReader = null;
try {
URL url_service = new URL(.....);
System.setProperty("http.keepAlive", "false");
System.setProperty("http.maxConnections", "5");
conn = (HttpURLConnection) url_service.openConnection();
conn.setReadTimeout(DataHandler.TIME_OUT);
conn.setConnectTimeout(DataHandler.TIME_OUT);
conn.setRequestMethod("POST");
conn.setDoInput(true);
conn.setDoOutput(true);
conn.setRequestProperty("connection", "close");
conn.setInstanceFollowRedirects(false);
conn.connect();
StringBuilder total = null;
if (conn.getResponseCode() == HttpURLConnection.HTTP_OK) {
is = conn.getInputStream();
_bufferReader = new BufferedReader(new InputStreamReader(is));
total = new StringBuilder();
String line;
while ((line = _bufferReader.readLine()) != null) {
total.append(line);
}
} else {
onDomainError();
}
return total.toString();
} catch (SocketTimeoutException ste) {
onDomainError();
} catch (Exception e) {
onDomainError();
} finally {
if (is != null) {
try {
is.close();
} catch (IOException e) {
// TODO Auto-generated catch block
}
}
if (_bufferReader != null) {
try {
_bufferReader.close();
} catch (Exception e) {
// TODO: handle exception
}
}
if (conn != null)
conn.disconnect();
conn = null;
}
return null;
}
.disconnect() is used, keep-alive is set to false and max connections is set to 5. However, if a SocketTimeoutException occurs, connections are not closed and the device soon runs out of memory. How is this possible?
Also, according to http://developer.android.com/reference/java/net/HttpURLConnection.html, HttpURLConnection should close connections on disconnect() if keep-alive is set to false and reuse it when keep-alive is true. Neither of these approaches work for me. Any ideas what could be wrong?
One possibility is that you are not setting the properties soon enough. According to the javadoc, the "keepalive" property needs to be set to false before issuing any HTTP requests. And that might actually mean before the URL protocol drivers are initialized.
Another possibility is that your OOME is not caused by this at all. It could be caused by what your app does with the content it has downloaded.
There are some other problems with your code too.
The variable names url_service, _bufferReader and myurl are all violations of Java's identifier naming conventions.
The conn variable should be a local variable. Making it a field makes the downloadUrl method non-reentrant. (And that might be contributing to your problems ... if multiple threads are sharing one instance of this object!)
You don't need to close the buffered reader and the input stream. Just close the reader, and it will close the stream. This probably doesn't matter for a reader, but if you do that for a buffered writer AND you close the output stream first, you are liable to get exceptions.
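To make the last two points concrete, here is a sketch of the reading part with the reader as a local variable, closed through try-with-resources, so that closing the reader also closes the underlying stream. The class and method names are made up, UTF-8 is an assumption (the original code used the platform default charset), and on Android try-with-resources needs a sufficiently recent API level, so treat this as illustrative.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ResponseReader {

    /** Reads all lines and concatenates them without separators, like the original loop does. */
    static String readAllLines(InputStream in) {
        // Closing the BufferedReader closes the InputStreamReader and the InputStream as well.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            StringBuilder total = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                total.append(line);
            }
            return total.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In downloadUrl this would be called as readAllLines(conn.getInputStream()), with conn declared as a local variable inside the method rather than as a field.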
UPDATE
So we definitely have lots of non-garbage HttpURLConnectionImpl instances, and we probably have multiple threads running this code via AsyncTask.
If you try to connect to a non-responding site (e.g. one where the TCP/IP connect requests are black-holing ...) then the conn.connect() call is going to block for a long time and eventually throw an exception. If the connect timeout is long enough, and your code is doing a potentially unbounded number of these calls in parallel, then you are liable to have lots of these instances.
If this theory is correct, then your problem is nothing to do with keep-alives and connections not being closed. The problem is at the other end ... connections that are never properly established in the first place clogging up memory, and each one tying up a thread / thread stack:
Try reducing the connect timeout.
Try running these requests using an Executor with a bounded thread pool.
Note what it says in the AsyncTask javadoc:
"AsyncTask is designed to be a helper class around Thread and Handler and does not constitute a generic threading framework. AsyncTasks should ideally be used for short operations (a few seconds at the most.) If you need to keep threads running for long periods of time, it is highly recommended you use the various APIs provided by the java.util.concurrent pacakge such as Executor, ThreadPoolExecutor and FutureTask."