HTTP Request to Wikipedia gives no result - java
I try to fetch HTML per Code. When fetching from "http://www.google.com" for example it works perfect. When trying to fetch from "http://en.wikipedia.org/w/api.php" I do not get any results.
Does someone have any idea ?
Code:
String sURL="http://en.wikipedia.org/w/api.php?action=query&generator=categorymembers&gcmtitle=Category:Countries&prop=info&gcmlimit=500&format=json";
String sText=readfromURL(sURL);
public static String readfromURL(String sURL){
URL url = null;
try {
url = new URL(sURL);
} catch (MalformedURLException e1) {
e1.printStackTrace();
}
URLConnection urlconnect = null;
try {
urlconnect = url.openConnection();
urlconnect.setRequestProperty("User-Agent","Mozilla/5.0 (Windows NT 5.1; rv:19.0) Gecko/20100101 Firefox/19.0");
} catch (IOException e) {
e.printStackTrace();
}
BufferedReader in = null;
try {
in = new BufferedReader(new InputStreamReader(urlconnect.getInputStream()));
} catch (IOException e) {
e.printStackTrace();
}
String inputLine;
String sEntireContent="";
try {
while ((inputLine = in.readLine()) != null) {
System.out.println(inputLine);
sEntireContent=sEntireContent+inputLine;
}
} catch (IOException e) {
e.printStackTrace();
}
try {
in.close();
} catch (IOException e) {
e.printStackTrace();
}
return sEntireContent;
}
It looks like the request limit. Try to check the response code.
From the documentation (https://www.mediawiki.org/wiki/API:Etiquette):
If you make your requests in series rather than in parallel (i.e. wait
for the one request to finish before sending a new request, such that
you're never making more than one request at the same time), then you
should definitely be fine.
Be sure that you do not do few request at a time
Update
I did verification on my local your code - you are correct it does not work. Fix - you need to use https, so it would work:
https://en.wikipedia.org/w/api.php?action=query&generator=categorymembers&gcmtitle=Category:Countries&prop=info&gcmlimit=500&format=json
result:
{"batchcomplete":"","query":{"pages":{"5165":{"pageid":5165,"ns":0,"title":"Country","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T20:09:05Z","lastrevid":686706429,"length":12695},"5112305":{"pageid":5112305,"ns":14,"title":"Category:Countries by continent","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-18T17:31:54Z","lastrevid":681415612,"length":133},"14353213":{"pageid":14353213,"ns":14,"title":"Category:Countries by form of government","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-08-13T23:33:29Z","lastrevid":675984011,"length":261},"5112467":{"pageid":5112467,"ns":14,"title":"Category:Countries by international organization","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-18T05:11:12Z","lastrevid":686245148,"length":123},"4696391":{"pageid":4696391,"ns":14,"title":"Category:Countries by language","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T01:17:18Z","lastrevid":675966601,"length":333},"5112374":{"pageid":5112374,"ns":14,"title":"Category:Countries by status","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-08-13T21:05:47Z","lastrevid":675966630,"length":30},"708617":{"pageid":708617,"ns":14,"title":"Category:Lists of countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-18T05:08:45Z","lastrevid":681553760,"length":256},"46624537":{"pageid":46624537,"ns":14,"title":"Category:Caspian littoral states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-23T08:40:34Z","lastrevid":663549987,"length":50},"18066512":{"pageid":18066512,"ns":14,"title":"Category:City-states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-29T20:14:14Z","lastrevid":679367764,"length":145},"2019528":{"pageid":2019528,"ns":14,"title":"Category:Country classifications","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-25T09:09:13Z","lastrevid":675966465,"length":182},"935240":{"pageid":935240,"ns":14,"title":"Category:Country codes","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T06:05:53Z","lastrevid":546489724,"length":222},"36819536":{"pageid":36819536,"ns":14,"title":"Category:Countries in fiction","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-03T06:09:16Z","lastrevid":674147667,"length":169},"699787":{"pageid":699787,"ns":14,"title":"Category:Fictional countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T18:43:25Z","lastrevid":610289877,"length":356},"804303":{"pageid":804303,"ns":14,"title":"Category:Former countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-21T09:58:52Z","lastrevid":668632882,"length":403},"7213567":{"pageid":7213567,"ns":14,"title":"Category:Island countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-22T22:10:37Z","lastrevid":648502876,"length":157},"3046541":{"pageid":3046541,"ns":14,"title":"Category:Landlocked countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-04T00:45:24Z","lastrevid":648502892,"length":54},"743058":{"pageid":743058,"ns":14,"title":"Category:Middle Eastern countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-12T14:41:59Z","lastrevid":677900732,"length":495},"41711462":{"pageid":41711462,"ns":14,"title":"Category:Mongol states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-23T07:36:21Z","lastrevid":687093637,"length":121},"30645082":{"pageid":30645082,"ns":14,"title":"Category:Country names","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-03-07T06:33:19Z","lastrevid":561256656,"length":94},"21218559":{"pageid":21218559,"ns":14,"title":"Category:Outlines of countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-07T18:04:29Z","lastrevid":645312408,"length":248},"37943702":{"pageid":37943702,"ns":14,"title":"Category:Proposed countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T02:30:25Z","lastrevid":668630396,"length":130},"15086044":{"pageid":15086044,"ns":14,"title":"Category:Turkic states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T06:23:35Z","lastrevid":677424552,"length":114},"32809189":{"pageid":32809189,"ns":14,"title":"Category:Works about countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T08:45:32Z","lastrevid":620016516,"length":153},"27539189":{"pageid":27539189,"ns":14,"title":"Category:Wikipedia books on countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-04-11T05:12:25Z","lastrevid":546775798,"length":203},"35317198":{"pageid":35317198,"ns":14,"title":"Category:Wikipedia categories named after countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T18:35:14Z","lastrevid":641689352,"length":202}}}}
{"batchcomplete":"","query":{"pages":{"5165":{"pageid":5165,"ns":0,"title":"Country","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T20:09:05Z","lastrevid":686706429,"length":12695},"5112305":{"pageid":5112305,"ns":14,"title":"Category:Countries by continent","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-18T17:31:54Z","lastrevid":681415612,"length":133},"14353213":{"pageid":14353213,"ns":14,"title":"Category:Countries by form of government","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-08-13T23:33:29Z","lastrevid":675984011,"length":261},"5112467":{"pageid":5112467,"ns":14,"title":"Category:Countries by international organization","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-18T05:11:12Z","lastrevid":686245148,"length":123},"4696391":{"pageid":4696391,"ns":14,"title":"Category:Countries by language","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T01:17:18Z","lastrevid":675966601,"length":333},"5112374":{"pageid":5112374,"ns":14,"title":"Category:Countries by status","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-08-13T21:05:47Z","lastrevid":675966630,"length":30},"708617":{"pageid":708617,"ns":14,"title":"Category:Lists of countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-18T05:08:45Z","lastrevid":681553760,"length":256},"46624537":{"pageid":46624537,"ns":14,"title":"Category:Caspian littoral states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-23T08:40:34Z","lastrevid":663549987,"length":50},"18066512":{"pageid":18066512,"ns":14,"title":"Category:City-states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-29T20:14:14Z","lastrevid":679367764,"length":145},"2019528":{"pageid":2019528,"ns":14,"title":"Category:Country classifications","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-25T09:09:13Z","lastrevid":675966465,"length":182},"935240":{"pageid":935240,"ns":14,"title":"Category:Country codes","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T06:05:53Z","lastrevid":546489724,"length":222},"36819536":{"pageid":36819536,"ns":14,"title":"Category:Countries in fiction","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-03T06:09:16Z","lastrevid":674147667,"length":169},"699787":{"pageid":699787,"ns":14,"title":"Category:Fictional countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T18:43:25Z","lastrevid":610289877,"length":356},"804303":{"pageid":804303,"ns":14,"title":"Category:Former countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-21T09:58:52Z","lastrevid":668632882,"length":403},"7213567":{"pageid":7213567,"ns":14,"title":"Category:Island countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-22T22:10:37Z","lastrevid":648502876,"length":157},"3046541":{"pageid":3046541,"ns":14,"title":"Category:Landlocked countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-04T00:45:24Z","lastrevid":648502892,"length":54},"743058":{"pageid":743058,"ns":14,"title":"Category:Middle Eastern countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-12T14:41:59Z","lastrevid":677900732,"length":495},"41711462":{"pageid":41711462,"ns":14,"title":"Category:Mongol states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-23T07:36:21Z","lastrevid":687093637,"length":121},"30645082":{"pageid":30645082,"ns":14,"title":"Category:Country names","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-03-07T06:33:19Z","lastrevid":561256656,"length":94},"21218559":{"pageid":21218559,"ns":14,"title":"Category:Outlines of countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-07T18:04:29Z","lastrevid":645312408,"length":248},"37943702":{"pageid":37943702,"ns":14,"title":"Category:Proposed countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T02:30:25Z","lastrevid":668630396,"length":130},"15086044":{"pageid":15086044,"ns":14,"title":"Category:Turkic states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T06:23:35Z","lastrevid":677424552,"length":114},"32809189":{"pageid":32809189,"ns":14,"title":"Category:Works about countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T08:45:32Z","lastrevid":620016516,"length":153},"27539189":{"pageid":27539189,"ns":14,"title":"Category:Wikipedia books on countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-04-11T05:12:25Z","lastrevid":546775798,"length":203},"35317198":{"pageid":35317198,"ns":14,"title":"Category:Wikipedia categories named after countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T18:35:14Z","lastrevid":641689352,"length":202}}}}
The reason you didn't receive the response back is due to a HTTP Redirect 3XX. Wikipedia redirects your HTTP Request. Please try the below source code to fetch Response from Redirected URL. Please refer How to send HTTP request GET/POST in Java
public static String readfromURLwithRedirect(String url) {
String response = "";
try {
URL obj = new URL(url);
HttpURLConnection conn = (HttpURLConnection) obj.openConnection();
conn.setReadTimeout(5000);
conn.addRequestProperty("Accept-Language", "en-US,en;q=0.8");
conn.addRequestProperty("User-Agent", "Mozilla");
conn.addRequestProperty("Referer", "google.com");
System.out.println("Request URL ... " + url);
boolean redirect = false;
// normally, 3xx is redirect
int status = conn.getResponseCode();
if (status != HttpURLConnection.HTTP_OK) {
if (status == HttpURLConnection.HTTP_MOVED_TEMP
|| status == HttpURLConnection.HTTP_MOVED_PERM
|| status == HttpURLConnection.HTTP_SEE_OTHER) {
redirect = true;
}
}
System.out.println("Response Code ... " + status);
if (redirect) {
// get redirect url from "location" header field
String newUrl = conn.getHeaderField("Location");
// get the cookie if need, for login
String cookies = conn.getHeaderField("Set-Cookie");
// open the new connnection again
conn = (HttpURLConnection) new URL(newUrl).openConnection();
conn.setRequestProperty("Cookie", cookies);
conn.addRequestProperty("Accept-Language", "en-US,en;q=0.8");
conn.addRequestProperty("User-Agent", "Mozilla");
conn.addRequestProperty("Referer", "google.com");
System.out.println("Redirect to URL : " + newUrl);
}
BufferedReader in = new BufferedReader(
new InputStreamReader(conn.getInputStream()));
String inputLine;
StringBuffer responseBuffer = new StringBuffer();
while ((inputLine = in.readLine()) != null) {
responseBuffer.append(inputLine);
}
in.close();
System.out.println("URL Content... \n" + responseBuffer.toString());
response = responseBuffer.toString();
System.out.println("Done");
} catch (Exception e) {
e.printStackTrace();
}
return response;
}
Related
Android HttpURLConnection: HTTP response always 403
public static class GetTitleTask extends AsyncTask<String, Void, String> { protected String doInBackground(String... urls) { try { String youtubeUrl = urls[0]; if (youtubeUrl != null) { URL url = new URL("http://www.youtube.com/oembed?url=" + youtubeUrl + "&format=json"); HttpURLConnection connection = null; BufferedReader reader = null; try { connection = (HttpURLConnection) url.openConnection(); connection.addRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:221.0) Gecko/20100101 Firefox/31.0"); connection.connect(); if (connection.getResponseCode() != HttpURLConnection.HTTP_OK) { Log.d(TAG, "HTTP response code is " + connection.getResponseCode()); return null; } InputStream stream = connection.getInputStream(); reader = new BufferedReader(new InputStreamReader(stream)); StringBuffer buffer = new StringBuffer(); String line = ""; while ((line = reader.readLine()) != null) { buffer.append(line+"\n"); Log.d("Response: ", "> " + line); } return buffer.toString(); } catch (MalformedURLException e) { e.printStackTrace(); } catch (IOException e) { e.printStackTrace(); } finally { if (connection != null) { connection.disconnect(); } try { if (reader != null) { reader.close(); } } catch (IOException e) { e.printStackTrace(); } } return null; } } catch (Exception e) { e.printStackTrace(); } return null; } protected void onPostExecute(String title) { Log.v(TAG, "onPostExecute: " + title); super.onPostExecute(title); // ... } } Input for example: https://www.youtube.com/watch?v=hT_nvWreIhg This always returns null. The HTTP response code is 403. That is why I tried adding a user agent, but it didn't help. Is there any way to fix this? I would like to get the video title without using the YouTube API.
I tried fetching it with curl. Your URL uses http instead of https://..., change it to https://www.youtube.com/oembed?url=. Trying to fetch the URL when it starts with just http resulted in a 403 status for me. But https was successful. URL url = new URL("http://www.youtube.com/oembed?url=" + youtubeUrl + "&format=json"); Should be changed to: URL url = new URL("https://www.youtube.com/oembed?url=" + youtubeUrl + "&format=json");
The server does not agree to fulfil the request (error code 403). This is probably due to the protocol. Try using HttpsURLConnection instead of HttpURLConnection.
Android cUrl api
i am new to programming in android, i wanted to get json data from the website https://api.clashofclans.com/v1/clans in my android app. How can i request the data in json format curl -X GET --header 'Accept: application/json' --header "authorization: Bearer " 'https://api.clashofclans.com/v1/clans' . i dont know how to add api key in the combination and open network connection. thanks in advance. urlConnection = (HttpURLConnection) url.openConnection(); urlConnection.setReadTimeout(10000 /* milliseconds */); urlConnection.setConnectTimeout(15000 /* milliseconds */); urlConnection.setRequestMethod("GET"); urlConnection.addRequestProperty("Accept","application/json"); urlConnection.addRequestProperty("authorization","my api-key here"); urlConnection.connect(); // If the request was successful (response code 200), // then read the input stream and parse the response. if (urlConnection.getResponseCode() == 200) { inputStream = urlConnection.getInputStream(); jsonResponse = readFromStream(inputStream); } else { Log.e(LOG_TAG, "Error response code: " + urlConnection.getResponseCode()); } i am getting "Error response code: 403", can u tell me where my code is wrong the api key is based on ip address.
Im sending it to my method this is where i return it as String: public void getJson() { String json = getJSONFromURLConnection("https://api.clashofclans.com/v1/clans?name=illuminati"); if (json != null) { Log.i("JSON", json.toString()); } } Here getting all info: private String getJSONFromURLConnection(String urlString) { BufferedReader reader = null; try { URL url = new URL(urlString); urlConnection = (HttpURLConnection) url.openConnection(); String apikey = YOUR_KEY; urlConnection.setRequestMethod("GET"); urlConnection.addRequestProperty("Authorization", "Bearer " + apikey); urlConnection.setRequestProperty("Accept", "application/json"); if (urlConnection.getResponseCode() == 200) { InputStream stream = urlConnection.getInputStream(); reader = new BufferedReader(new InputStreamReader(stream)); StringBuffer buffer = new StringBuffer(); String line; while ((line = reader.readLine()) != null) { buffer.append(line); } return buffer.toString(); } else { Log.e("Error", "Error response code: " + urlConnection.getResponseCode()); } } catch (IOException e) { e.printStackTrace(); } finally { if (urlConnection != null) urlConnection.disconnect(); } return null; } Created and tested, works for me hope you get the result you need.
Not able to get Jsonstring from this link
String fileURL="https://tools.keycdn.com/geo.json?host=192.168.6.9"; URL url = new URL(fileURL); HttpURLConnection httpConn = (HttpURLConnection) url.openConnection(); int responseCode = httpConn.getResponseCode(); System.out.println(responseCode); Full code: try { String url = "http://tools.keycdn.com/geo.json?host=192.168.6.9"; URL obj = new URL(url); HttpURLConnection conn = (HttpURLConnection) obj.openConnection(); conn.setInstanceFollowRedirects(true); HttpURLConnection.setFollowRedirects(true); conn.setReadTimeout(5000); conn.addRequestProperty("Accept-Language", "en-US,en;q=0.8"); conn.addRequestProperty("User-Agent", "Mozilla"); conn.addRequestProperty("Referer", "google.com"); System.out.println("Request URL ... " + url); boolean redirect = false; // normally, 3xx is redirect int status = conn.getResponseCode(); if (status != HttpURLConnection.HTTP_OK) { if (status == HttpURLConnection.HTTP_MOVED_TEMP || status == HttpURLConnection.HTTP_MOVED_PERM || status == HttpURLConnection.HTTP_SEE_OTHER) redirect = true; } System.out.println("Response Code ... " + status); if (redirect) { // get redirect url from "location" header field String newUrl =conn.getHeaderField("Location"); newUrl=newUrl.replace("https://", "http://"); // get the cookie if need, for login String cookies = conn.getHeaderField("Set-Cookie"); // open the new connnection again conn = (HttpURLConnection) new URL(newUrl).openConnection(); conn.setRequestProperty("Cookie", cookies); conn.addRequestProperty("Accept-Language", "en-US,en;q=0.8"); conn.addRequestProperty("User-Agent", "Mozilla"); conn.addRequestProperty("Referer", "google.com"); System.out.println("Redirect to URL : " + newUrl); } BufferedReader in = new BufferedReader( new InputStreamReader(conn.getInputStream())); String inputLine; StringBuffer html = new StringBuffer(); while ((inputLine = in.readLine()) != null) { html.append(inputLine); } in.close(); System.out.println("URL Content... \n" + html.toString()); System.out.println("Done"); } catch (Exception e) { e.printStackTrace(); } } OUTPUT: Request URL ... http://tools.keycdn.com/geo.json?host=192.168.6.9 Response Code ... 301 Redirect to URL : http://tools.keycdn.com/geo.json?host=192.168.6.9 URL Content... 301 Moved Permanently301 Moved Permanentlynginx Done Iam not able to get json from this link, It gives 301 response code, I tried to get the redirected URL from HTTP header part, eventhough, It returns same URL and same 301 response code, please give java code solution to get JSON string from this URL.
Try this : String fileURL="https://tools.keycdn.com/geo.json?host=192.168.6.9"; URL url; try { final String USER_AGENT = "Mozilla/5.0"; url = new URL(fileURL); HttpURLConnection httpConn = (HttpURLConnection) url.openConnection(); // optional default is GET httpConn.setRequestMethod("GET"); //add request header httpConn.setRequestProperty("User-Agent", USER_AGENT); int responseCode = httpConn.getResponseCode(); System.out.println(responseCode); BufferedReader in = new BufferedReader( new InputStreamReader(httpConn.getInputStream())); String inputLine; StringBuffer response = new StringBuffer(); while ((inputLine = in.readLine()) != null) { response.append(inputLine); } in.close(); //print result System.out.println(response.toString()); } catch (Exception e) { e.printStackTrace(); }
HttpUrlConnection BadRequest - Statuscode 400
I have implemented a class using HttpUrlConnection to get some data from the google geocoding api. When I'm using this code on android, it works properly. But as soon as I am using this code in another "normal" java program, I am getting the status-code 400 (BadRequest) sometimes. Here is my code: HttpURLConnection c = null; StringBuilder sb = new StringBuilder(); try { URL u = new URL(url); c = (HttpURLConnection) u.openConnection(); c.setRequestMethod("GET"); c.setRequestProperty("Content-length", "0"); c.setUseCaches(false); c.setAllowUserInteraction(false); c.setConnectTimeout(timeout); c.setReadTimeout(timeout); c.connect(); int status = c.getResponseCode(); switch (status) { case HttpURLConnection.HTTP_OK: case HttpURLConnection.HTTP_CREATED: BufferedReader br = new BufferedReader(new InputStreamReader(c.getInputStream())); String line; while ((line = br.readLine()) != null) { sb.append(line + "\n"); } br.close(); } } catch (SocketTimeoutException ex){ // Handle ... } catch (MalformedURLException ex) { // Handle ... } catch (IOException ex) { // Handle ... } finally { if (c != null) { try { c.disconnect(); } catch (Exception ex) { } } } I have a reliable internet connection and also the URL I am using to receive the data works, whenever I try it with my web browser. Thanks in advance!
Bad Request is often caused by inadequat URLs. As you mentioned not every URL gives this error, only a view of them. So it has to be something to do with that. Try the following code to ensure the correct encoding of the URL you are using: String url = ...; // your url url = URLEncoder.encode(url,"UTF-8"); // Use 'url' ...
URLConnection is not allowing me to access data on Http errors (404,500,etc)
I am making a crawler, and need to get the data from the stream regardless if it is a 200 or not. CURL is doing it, as well as any standard browser. The following will not actually get the content of the request, even though there is some, an exception is thrown with the http error status code. I want the output regardless, is there a way? I prefer to use this library as it will actually do persistent connections, which is perfect for the type of crawling I am doing. package test; import java.net.*; import java.io.*; public class Test { public static void main(String[] args) { try { URL url = new URL("http://github.com/XXXXXXXXXXXXXX"); URLConnection connection = url.openConnection(); DataInputStream inStream = new DataInputStream(connection.getInputStream()); String inputLine; while ((inputLine = inStream.readLine()) != null) { System.out.println(inputLine); } inStream.close(); } catch (MalformedURLException me) { System.err.println("MalformedURLException: " + me); } catch (IOException ioe) { System.err.println("IOException: " + ioe); } } } Worked, thanks: Here is what I came up with - just as a rough proof of concept: import java.net.*; import java.io.*; public class Test { public static void main(String[] args) { //InputStream error = ((HttpURLConnection) connection).getErrorStream(); URL url = null; URLConnection connection = null; String inputLine = ""; try { url = new URL("http://verelo.com/asdfrwdfgdg"); connection = url.openConnection(); DataInputStream inStream = new DataInputStream(connection.getInputStream()); while ((inputLine = inStream.readLine()) != null) { System.out.println(inputLine); } inStream.close(); } catch (MalformedURLException me) { System.err.println("MalformedURLException: " + me); } catch (IOException ioe) { System.err.println("IOException: " + ioe); InputStream error = ((HttpURLConnection) connection).getErrorStream(); try { int data = error.read(); while (data != -1) { //do something with data... //System.out.println(data); inputLine = inputLine + (char)data; data = error.read(); //inputLine = inputLine + (char)data; } error.close(); } catch (Exception ex) { try { if (error != null) { error.close(); } } catch (Exception e) { } } } System.out.println(inputLine); } }
Simple: URLConnection connection = url.openConnection(); InputStream is = connection.getInputStream(); if (connection instanceof HttpURLConnection) { HttpURLConnection httpConn = (HttpURLConnection) connection; int statusCode = httpConn.getResponseCode(); if (statusCode != 200 /* or statusCode >= 200 && statusCode < 300 */) { is = httpConn.getErrorStream(); } } You can refer to Javadoc for explanation. The best way I would handle this is as follows: URLConnection connection = url.openConnection(); InputStream is = null; try { is = connection.getInputStream(); } catch (IOException ioe) { if (connection instanceof HttpURLConnection) { HttpURLConnection httpConn = (HttpURLConnection) connection; int statusCode = httpConn.getResponseCode(); if (statusCode != 200) { is = httpConn.getErrorStream(); } } }
You need to do the following after calling openConnection. Cast the URLConnection to HttpURLConnection Call getResponseCode If the response is a success, use getInputStream, otherwise use getErrorStream (The test for success should be 200 <= code < 300 because there are valid HTTP success codes apart from than 200.) I am making a crawler, and need to get the data from the stream regardless if it is a 200 or not. Just be aware that it if the code is a 4xx or 5xx, then the "data" is likely to be an error page of some kind. The final point that should be made is that you should always respect the "robots.txt" file ... and read the Terms of Service before crawling / scraping the content of a site whose owners might care. Simply blatting off GET requests is likely to annoy site owners ... unless you've already come to some sort of "arrangement" with them.