Getting new Url if Moved Permanently - java

I am writing code for a project in which one part checks whether each URL (web site) in a list is live and confirms it.
So far everything is working as planned, except for some pages in the list that answer with 301 Moved Permanently. In the case of a 301 I need to get the new URL and pass it to a method before returning true.
The following example just moves to https, but other pages could move to a different URL. If you call this site:
http://en.wikipedia.org/wiki/HTTP_301
it moves to
https://en.wikipedia.org/wiki/HTTP_301
Which is fine, I just need to get the new Url.
Is this possible and how?
This is my working code part so far:
boolean isUrlOk(String urlInput) {
    HttpURLConnection connection = null;
    int urlStatusCode = 0; // stays 0 if the request fails
    try {
        URL url = new URL(urlInput);
        connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("GET");
        connection.connect();
        urlStatusCode = connection.getResponseCode();
    } catch (IOException e) {
        // other error types to be reported
        e.printStackTrace();
    }
    if (urlStatusCode == 200) {
        return true;
    } else if (urlStatusCode == 301) {
        // call a method with the correct url name
        // before returning true
        return true;
    }
    return false;
}

You can get the new URL with
String newUrl = connection.getHeaderField("Location");
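A minimal sketch of how that header might be used, assuming automatic redirect following is switched off so that every 3xx response stays visible to the caller. The Location value can be relative, so it is resolved against the request URL. The class and method names (RedirectLookup, isRedirect, resolveLocation, findRedirectTarget) are made up for illustration and are not part of the question's code.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

// Hypothetical helper; names are illustrative only.
final class RedirectLookup {

    // True for the status codes that normally carry a Location header.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    // Location may be relative (e.g. "/wiki/HTTP_301"), so resolve it
    // against the URL that was requested.
    static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    // Returns the redirect target, or null if the response is not a redirect.
    static String findRedirectTarget(String urlInput) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) new URL(urlInput).openConnection();
        try {
            // Assumption: we want to see the 3xx response ourselves.
            connection.setInstanceFollowRedirects(false);
            connection.setRequestMethod("GET");
            int status = connection.getResponseCode();
            String location = connection.getHeaderField("Location");
            if (isRedirect(status) && location != null) {
                return resolveLocation(urlInput, location);
            }
            return null;
        } finally {
            connection.disconnect();
        }
    }
}
```

In isUrlOk() the 301 branch could then pass the result of findRedirectTarget(urlInput) to whatever method needs the new URL.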

Related

How to make sure URLConnection is valid and will successfully connect in Java?

How can you ensure that a URL you receive from users is valid and not just something like http://nothing.com?
My code looks like this:
String urlFromUser = getUrlFromUser(); // might return: http://www.notARealSite.com
URL url = new URL(urlFromUser);
// this might fail
URLConnection urlConnection = url.openConnection();
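A try/catch is not enough; I want to make sure the site is real.
You can also use:
// InetSocketAddress expects a host name, not a full URL, so extract the host first.
String host = new URL(urlFromUser).getHost();
if (new InetSocketAddress(host, 80).isUnresolved()) {
    // The host name could not be resolved
} else {
    // The host name resolves (this alone does not prove a web server answers on port 80)
}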

Socrata URL works from Chrome, not from Android app

I'm trying to use the open data sets that data.LACity.org publishes using Socrata software.
They have a Java API for it, but first I tried to just build and send a URL, as
a variant on the 'Sunshine' app several people have learned from on Udacity.
Now I'm actually building a URL, and then sending it out, but then I get a FileNotFoundException, as follows:
java.io.FileNotFoundException: http://data.lacity.org/resource/yv23-pmwf.json?$select=zip_code, issue_date, address_start, address_end, street_name, street_suffix, work_description, valuation&$where=issue_date >= '2015-02-27T00:00:00' AND zip_code = 90291
Here's the pisser: That whole URL is, as a final attempt, hardwired as a complete string, not built from pieces. The URL works if I plug it into Chrome, but not from my app.
If I instead plug in the URL string that the Sunshine sample app builds (copied from logcat after a Sunshine run) in place of the lacity URL, the call works and returns the JSON data.
So I'm doing something wrong when I call the LACity URL for Socrata data from my Android app. I've tried this both as https and http, and both failed. But the same code works when I call the weathermap data from the sample app.
Here are the two URLs:
http://api.openweathermap.org/data/2.5/forecast/daily?q=94043&mode=json&units=metric&cnt=7 <<< this works, both in Chrome and from Android
https://data.lacity.org/resource/yv23-pmwf.json?$select=zip_code, issue_date, address_start, address_end, street_name, street_suffix, work_description, valuation&$where=issue_date >= '2015-02-27T00:00:00' AND zip_code = 90291
This works in Chrome but not from Android.
Any suggestions would be appreciated. I'm going to try again to make heads or tails of the Socrata Soda2 Java API (and why, in this case, it might be necessary.)
Thanks
-k-
The immediate code fragment (pardon my newness to Android/Java):
final String PERMIT_BASE_URL = "one of the url strings above";
Uri builtUri = Uri.parse(PERMIT_BASE_URL).buildUpon()
.build();
URL url = new URL(builtUri.toString());
Log.v(LOG_TAG, "Build URL: " + url.toString());
urlConnection = (HttpURLConnection) url.openConnection();
urlConnection.setRequestMethod("GET");
urlConnection.connect();
InputStream inputStream = urlConnection.getInputStream();
StringBuffer buffer = new StringBuffer();
if (inputStream == null) {
return null;
}
reader = new BufferedReader(new InputStreamReader(inputStream));
String line;
while ((line = reader.readLine()) != null) {
//simplify debugging
buffer.append(line + "\n");
}
if (buffer.length() == 0) {
return null;
}
permitJsonStr = buffer.toString();
Log.v(LOG_TAG, "Permit JSON string: " + permitJsonStr);
} catch (IOException e) {
Log.e(LOG_TAG, "Error on Xoom", e);
// Nothing to parse.
return null;
} finally{
if (urlConnection != null) {
urlConnection.disconnect();
}
if (reader != null) {
try {
reader.close();
} catch (final IOException e) {
Log.e(LOG_TAG, "Error closing stream on Xoom", e);
}
}
Figured this out from the way this page highlighted the URLs in my question.
Spaces.
The call out of Android seems to cough because of the spaces in the URL string.
I closed them all up, but then the 'AND' caused issues.
Replaced it with &, now it works, hardwired.
I'll work on constructing it from input parameters, as intended, but I think this is OK.
As Emily Litella would say...
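For building the URL from input parameters, one option is to percent-encode each query value instead of removing the spaces by hand. The following is only a sketch: the SodaQuery class and its methods are hypothetical, not part of the Socrata Java API, and it assumes the server decodes standard percent-encoding in $select and $where.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of the Socrata API.
final class SodaQuery {

    // Percent-encodes one query value. URLEncoder writes spaces as '+',
    // so they are rewritten to %20 afterwards.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Joins the base URL with the encoded $select and $where values.
    static String buildUrl(String base, String select, String where) {
        return base + "?$select=" + encode(select) + "&$where=" + encode(where);
    }

    public static void main(String[] args) {
        String url = buildUrl(
                "https://data.lacity.org/resource/yv23-pmwf.json",
                "zip_code, issue_date, street_name",
                "issue_date >= '2015-02-27T00:00:00' AND zip_code = 90291");
        System.out.println(url); // contains no raw spaces
    }
}
```

On Android, Uri.Builder.appendQueryParameter() encodes values as well, which may fit the Sunshine-style code better than a hand-built string.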

Download web page only if it has been modified

I'm building an app for an assessment and I need to download a web page only if it has been modified since the last time I downloaded it. I need to store the date of the last change as a long, so the method getDate() returns a long.
I tried to use HttpURLConnection and URLConnection, but I couldn't manage to achieve a solution.
Among my attempts, I tried to use:
If-Modified-Since, but somehow I never received the 304 response code, only 200. The code:
HttpURLConnection huc = null;
try {
URL url = new URL(pages.get(0).getUrl());
huc = (HttpURLConnection) url.openConnection();
huc.setIfModifiedSince(pages.get(0).getDate());
huc.connect();
Log.d("App", "Since: " + huc.getIfModifiedSince());
Log.d("App", "Response: " + huc.getResponseCode());
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
//output
since: 1354320000000 - which is the return of the getDate method.
Response: 200
HTML ETags, but I couldn't retrieve the information from the response, because the server doesn't send the Last-Modified header.
Thanks in advance
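For reference, a sketch of how the two validators mentioned above could be combined. It assumes the ETag from the previous download was stored next to the date; the ConditionalFetch class and its method names are hypothetical. A server that ignores If-Modified-Since and If-None-Match simply answers 200, which would match the output shown above, and if it sends no ETag either, this sketch always reports a change.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch; class and method names are illustrative.
final class ConditionalFetch {

    // The page counts as unchanged if the server answered 304, or if it sent
    // the same ETag that was stored after the previous download.
    static boolean isUnchanged(int status, String storedEtag, String currentEtag) {
        if (status == HttpURLConnection.HTTP_NOT_MODIFIED) {
            return true;
        }
        return storedEtag != null && storedEtag.equals(currentEtag);
    }

    // Sends both validators and reports whether the page should be downloaded again.
    static boolean hasChanged(String pageUrl, long lastDownloadMillis, String storedEtag)
            throws IOException {
        HttpURLConnection huc = (HttpURLConnection) new URL(pageUrl).openConnection();
        try {
            huc.setIfModifiedSince(lastDownloadMillis);
            if (storedEtag != null) {
                huc.setRequestProperty("If-None-Match", storedEtag);
            }
            int status = huc.getResponseCode();
            return !isUnchanged(status, storedEtag, huc.getHeaderField("ETag"));
        } finally {
            huc.disconnect();
        }
    }
}
```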

Jsoup malformed url

I'm having trouble with connecting to a url with JSoup.
The URL I am trying to test is www.xbox.com/en-US/security, which is a 302 (I think) redirect to http://www.xbox.com/en-US/Live/Account-Security. I have set up Jsoup not to follow redirects automatically and I get the new URL using .header("location"). The URL returned is /en-US/Live/Account-Security. I'm not sure how to handle it; my code is below:
while (i < retries){
try {
response = Jsoup.connect(checkUrl)
.userAgent("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.21 (KHTML, like Gecko) Chrome/19.0.1042.0 Safari/535.21")
.followRedirects(false)
.timeout(10000)
.execute();
success = true;
break;
} catch (SocketTimeoutException ex){
timeout = true;
} catch (MalformedURLException ep){
malformedUrl = true;
}catch (IOException e) {
statusCode = 404;
}
}
private void getStatus(){
if (success){
statusCode = response.statusCode();
success = false;
}
if (statusCode >= 300 && statusCode <= 399){
//System.out.println("redirect: " +statusCode + " " +checkUrl);
checkUrl = response.header("location");
//System.out.println(checkUrl);
connect();
getStatus();
}
}
Has anyone got suggestions on how to handle this? Or should I do a check on my checkUrl = response.header("location"); to see if it is a valid url and if not don't test it?
First things first: if you try to access "www.xbox.com/en-US/security" without a protocol, Jsoup will throw a MalformedURLException and never reach the redirect.
Then there's the issue that I'd use only the boolean variable success, and set it to false if any exception is caught. Then again, I don't know if you're using the timeout or malformedUrl variables for anything.
After that, I'd say that the line right after the IOException catch is never useful. Again, I can't tell for sure, since I can't see the full code.
Now, to your question: the returned string is a path on the same host as the first URL you provided. It goes simply like this:
//The scheme and host stay the same, so keep them in a constant.
final String BASE_URL = "http://www.xbox.com";
//Whatever piece of processing here
//Some tests just to make sure you'll get what you're
//fetching:
String newUrl = "";
if (checkUrl.startsWith("/"))
    newUrl = BASE_URL + checkUrl; //relative path: prepend scheme and host only
else if (checkUrl.startsWith("http://") || checkUrl.startsWith("https://"))
    newUrl = checkUrl;
else if (checkUrl.startsWith("www"))
    newUrl = "http://" + checkUrl;
This piece of code basically makes sure you can navigate through URLs without getting a MalformedURLException. I'd suggest putting a manageUrl() method somewhere and testing whether the fetched URL is within the domain you're searching, or else you might end up on e-commerce or advertising websites.
Hope it helps =)
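As a rough sketch of the manageUrl() idea, java.net.URI can do the resolving and the host comparison. The RedirectUrls class and its method names are hypothetical, and treating a value that starts with "www." as a host name is an assumption carried over from the checks above.

```java
import java.net.URI;

// Hypothetical helper along the lines of the manageUrl() idea.
final class RedirectUrls {

    // Turns a Location header value into an absolute URL.
    static String toAbsolute(String requestUrl, String location) {
        // Assumption: a scheme-less value starting with "www." is a host name.
        if (location.startsWith("www.")) {
            return "http://" + location;
        }
        // Handles absolute URLs as well as paths such as
        // "/en-US/Live/Account-Security".
        return URI.create(requestUrl).resolve(location).toString();
    }

    // True if both URLs point at the same host (case-insensitive).
    static boolean sameHost(String firstUrl, String secondUrl) {
        String first = URI.create(firstUrl).getHost();
        String second = URI.create(secondUrl).getHost();
        return first != null && first.equalsIgnoreCase(second);
    }
}
```

In getStatus(), checkUrl could be set to toAbsolute(checkUrl, response.header("location")) and only followed while sameHost() still holds.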

Preferred Java way to ping an HTTP URL for availability

I need a monitor class that regularly checks whether a given HTTP URL is available. I can take care of the "regularly" part using the Spring TaskExecutor abstraction, so that's not the topic here. The question is: What is the preferred way to ping a URL in java?
Here is my current code as a starting point:
try {
final URLConnection connection = new URL(url).openConnection();
connection.connect();
LOG.info("Service " + url + " available, yeah!");
available = true;
} catch (final MalformedURLException e) {
throw new IllegalStateException("Bad URL: " + url, e);
} catch (final IOException e) {
LOG.info("Service " + url + " unavailable, oh no!", e);
available = false;
}
Is this any good at all (will it do what I want)?
Do I have to somehow close the connection?
I suppose this is a GET request. Is there a way to send HEAD instead?
Is this any good at all (will it do what I want?)
You can do so. Another feasible way is using java.net.Socket.
public static boolean pingHost(String host, int port, int timeout) {
try (Socket socket = new Socket()) {
socket.connect(new InetSocketAddress(host, port), timeout);
return true;
} catch (IOException e) {
return false; // Either timeout or unreachable or failed DNS lookup.
}
}
There's also InetAddress#isReachable(), which takes a timeout in milliseconds:
boolean reachable = InetAddress.getByName(hostname).isReachable(timeout);
This however doesn't explicitly test port 80. You risk getting false negatives due to a firewall blocking other ports.
Do I have to somehow close the connection?
No, you don't explicitly need to. It's handled and pooled under the hood.
I suppose this is a GET request. Is there a way to send HEAD instead?
You can cast the obtained URLConnection to HttpURLConnection and then use setRequestMethod() to set the request method. However, you need to take into account that some poor webapps or homegrown servers may return HTTP 405 error for a HEAD (i.e. not available, not implemented, not allowed) while a GET works perfectly fine. Using GET is more reliable in case you intend to verify links/resources not domains/hosts.
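One way to get the lighter HEAD request while still coping with servers that refuse it is to fall back to GET only when HEAD itself is rejected. This is a sketch under assumptions: the HeadThenGet class is hypothetical, and treating 405 and 501 as "HEAD not supported" is a heuristic, not something every server follows.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch; class and method names are illustrative.
final class HeadThenGet {

    // Status codes suggesting the server rejects HEAD itself, not the resource.
    static boolean headNotSupported(int status) {
        return status == HttpURLConnection.HTTP_BAD_METHOD            // 405
                || status == HttpURLConnection.HTTP_NOT_IMPLEMENTED;  // 501
    }

    // Sends one request with the given method and returns the status code.
    static int status(String url, String method, int timeoutMillis) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        try {
            connection.setConnectTimeout(timeoutMillis);
            connection.setReadTimeout(timeoutMillis);
            connection.setRequestMethod(method);
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    // Tries HEAD first and repeats the check with GET only if HEAD was refused.
    static boolean isAvailable(String url, int timeoutMillis) {
        try {
            int code = status(url, "HEAD", timeoutMillis);
            if (headNotSupported(code)) {
                code = status(url, "GET", timeoutMillis);
            }
            return code >= 200 && code <= 399;
        } catch (IOException e) {
            return false; // malformed URL, timeout, DNS failure, refused connection
        }
    }
}
```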
Testing the server for availability is not enough in my case, I need to test the URL (the webapp may not be deployed)
Indeed, connecting to a host only tells you whether the host is available, not whether the content is available. It may just as well happen that a webserver has started without problems, but the webapp failed to deploy during the server's start. This will however usually not cause the entire server to go down. You can determine that by checking if the HTTP response code is 200.
HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
connection.setRequestMethod("HEAD");
int responseCode = connection.getResponseCode();
if (responseCode != 200) {
// Not OK.
}
// < 100 is undetermined.
// 1nn is informational (shouldn't happen on a GET/HEAD)
// 2nn is success
// 3nn is redirect
// 4nn is client error
// 5nn is server error
For more detail about response status codes see RFC 2616 section 10. Calling connect() is by the way not needed if you're determining the response data. It will implicitly connect.
For future reference, here's a complete example in the form of a utility method that also takes timeouts into account:
/**
 * Pings a HTTP URL. This effectively sends a HEAD request and returns <code>true</code> if the response code is in
 * the 200-399 range.
 * @param url The HTTP URL to be pinged.
 * @param timeout The timeout in millis for both the connection timeout and the response read timeout. Note that
 * the total timeout is effectively two times the given timeout.
 * @return <code>true</code> if the given HTTP URL has returned response code 200-399 on a HEAD request within the
 * given timeout, otherwise <code>false</code>.
 */
public static boolean pingURL(String url, int timeout) {
url = url.replaceFirst("^https", "http"); // Otherwise an exception may be thrown on invalid SSL certificates.
try {
HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
connection.setConnectTimeout(timeout);
connection.setReadTimeout(timeout);
connection.setRequestMethod("HEAD");
int responseCode = connection.getResponseCode();
return (200 <= responseCode && responseCode <= 399);
} catch (IOException exception) {
return false;
}
}
Instead of using URLConnection, use HttpURLConnection by calling openConnection() on your URL object and casting the result.
Then getResponseCode() will give you the HTTP response code once you've read from the connection.
Here is the code:
HttpURLConnection connection = null;
try {
URL u = new URL("http://www.google.com/");
connection = (HttpURLConnection) u.openConnection();
connection.setRequestMethod("HEAD");
int code = connection.getResponseCode();
System.out.println("" + code);
// Decide based on the HTTP status code received. 200 is success.
} catch (MalformedURLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} finally {
if (connection != null) {
connection.disconnect();
}
}
Also check similar question How to check if a URL exists or returns 404 with Java?
Hope this helps.
You could also use HttpURLConnection, which allows you to set the request method (to HEAD for example). Here's an example that shows how to send a request, read the response, and disconnect.
The following code performs a HEAD request to check whether the website is available or not.
public static boolean isReachable(String targetUrl) throws IOException
{
HttpURLConnection httpUrlConnection = (HttpURLConnection) new URL(
targetUrl).openConnection();
httpUrlConnection.setRequestMethod("HEAD");
try
{
int responseCode = httpUrlConnection.getResponseCode();
return responseCode == HttpURLConnection.HTTP_OK;
} catch (UnknownHostException noInternetConnection)
{
return false;
}
}
public boolean isOnline() {
Runtime runtime = Runtime.getRuntime();
try {
Process ipProcess = runtime.exec("/system/bin/ping -c 1 8.8.8.8");
int exitValue = ipProcess.waitFor();
return (exitValue == 0);
} catch (IOException | InterruptedException e) { e.printStackTrace(); }
return false;
}
Possible Questions
Is this really fast enough? Yes, very fast!
Couldn't I just ping my own page, which I want to request anyways? Sure! You could even check both, if you want to differentiate between "internet connection available" and your own servers being reachable.
What if the DNS is down? Google DNS (e.g. 8.8.8.8) is the largest public DNS service in the world. As of 2013 it serves 130 billion requests a day. Let's just say, your app not responding would probably not be the talk of the day.
Read the link; it seems very good.
EDIT:
In my experience of using it, it's not as fast as this method:
public boolean isOnline() {
NetworkInfo netInfo = connectivityManager.getActiveNetworkInfo();
return netInfo != null && netInfo.isConnectedOrConnecting();
}
The two are a bit different, but if you only need to check for an internet connection, the first method may be slow, depending on the state of the connection.
Consider using the Restlet framework, which has great semantics for this sort of thing. It's powerful and flexible.
The code could be as simple as:
Client client = new Client(Protocol.HTTP);
Response response = client.get(url);
if (response.getStatus().isError()) {
// uh oh!
}
