HttpUrlConnection BadRequest - Statuscode 400 - java

I have implemented a class that uses HttpURLConnection to get some data from the Google Geocoding API. When I use this code on Android, it works properly. But when I use the same code in a regular desktop Java program, I sometimes get status code 400 (Bad Request). Here is my code:
HttpURLConnection c = null;
StringBuilder sb = new StringBuilder();
try {
URL u = new URL(url);
c = (HttpURLConnection) u.openConnection();
c.setRequestMethod("GET");
c.setRequestProperty("Content-length", "0");
c.setUseCaches(false);
c.setAllowUserInteraction(false);
c.setConnectTimeout(timeout);
c.setReadTimeout(timeout);
c.connect();
int status = c.getResponseCode();
switch (status) {
case HttpURLConnection.HTTP_OK:
case HttpURLConnection.HTTP_CREATED:
BufferedReader br = new BufferedReader(new InputStreamReader(c.getInputStream()));
String line;
while ((line = br.readLine()) != null) {
sb.append(line + "\n");
}
br.close();
}
} catch (SocketTimeoutException ex){
// Handle ...
} catch (MalformedURLException ex) {
// Handle ...
} catch (IOException ex) {
// Handle ...
} finally {
if (c != null) {
try {
c.disconnect();
} catch (Exception ex) {
}
}
}
I have a reliable internet connection, and the URL I am using works whenever I try it in my web browser.
Thanks in advance!

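A Bad Request response is often caused by a malformed URL. As you mentioned, not every URL gives this error, only a few of them, so the problem probably lies in those particular URLs (for example, spaces or other special characters in a parameter). Make sure the query parameter values are encoded correctly. Note that URLEncoder.encode should be applied to each parameter value, not to the whole URL; otherwise the "://", "?" and "&" separators are escaped as well:
String value = ...; // a raw query parameter value, e.g. the address
String url = baseUrl + "?address=" + URLEncoder.encode(value, "UTF-8"); // baseUrl: the part of your URL before '?'
// Use 'url' ...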
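As a minimal sketch of per-parameter encoding: the helper below appends query parameters to a base URL and encodes every name and value separately, so the URL's own separators stay intact. The class name QueryUrlBuilder, the method buildUrl, and the example.com endpoint are hypothetical and only serve as illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryUrlBuilder {

    // Hypothetical helper: appends the given parameters to baseUrl,
    // encoding each name and value on its own.
    public static String buildUrl(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl);
        char separator = '?';
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                sb.append(separator)
                  .append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
                separator = '&';
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("address", "1600 Amphitheatre Parkway, Mountain View");
        params.put("sensor", "false");
        // The endpoint is a placeholder, not the real geocoding URL.
        System.out.println(buildUrl("http://example.com/geocode/json", params));
    }
}
```

URLEncoder produces form encoding, so a space becomes "+" rather than "%20". Most servers accept this in a query string, but that is worth confirming for the API in question.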

Related

HTTP Request to Wikipedia gives no result

I am trying to fetch HTML from code. Fetching from "http://www.google.com", for example, works perfectly, but when I fetch from "http://en.wikipedia.org/w/api.php" I do not get any result.
Does anyone have an idea?
Code:
String sURL="http://en.wikipedia.org/w/api.php?action=query&generator=categorymembers&gcmtitle=Category:Countries&prop=info&gcmlimit=500&format=json";
String sText=readfromURL(sURL);
public static String readfromURL(String sURL){
URL url = null;
try {
url = new URL(sURL);
} catch (MalformedURLException e1) {
e1.printStackTrace();
}
URLConnection urlconnect = null;
try {
urlconnect = url.openConnection();
urlconnect.setRequestProperty("User-Agent","Mozilla/5.0 (Windows NT 5.1; rv:19.0) Gecko/20100101 Firefox/19.0");
} catch (IOException e) {
e.printStackTrace();
}
BufferedReader in = null;
try {
in = new BufferedReader(new InputStreamReader(urlconnect.getInputStream()));
} catch (IOException e) {
e.printStackTrace();
}
String inputLine;
String sEntireContent="";
try {
while ((inputLine = in.readLine()) != null) {
System.out.println(inputLine);
sEntireContent=sEntireContent+inputLine;
}
} catch (IOException e) {
e.printStackTrace();
}
try {
in.close();
} catch (IOException e) {
e.printStackTrace();
}
return sEntireContent;
}
It looks like you may be hitting the request limit. Check the response code.
From the documentation (https://www.mediawiki.org/wiki/API:Etiquette):
If you make your requests in series rather than in parallel (i.e. wait
for the one request to finish before sending a new request, such that
you're never making more than one request at the same time), then you
should definitely be fine.
Make sure that you do not send several requests at the same time.
Update
I verified your code locally and you are correct, it does not work. The fix is to use https; then it works:
https://en.wikipedia.org/w/api.php?action=query&generator=categorymembers&gcmtitle=Category:Countries&prop=info&gcmlimit=500&format=json
result:
{"batchcomplete":"","query":{"pages":{"5165":{"pageid":5165,"ns":0,"title":"Country","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T20:09:05Z","lastrevid":686706429,"length":12695},"5112305":{"pageid":5112305,"ns":14,"title":"Category:Countries by continent","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-18T17:31:54Z","lastrevid":681415612,"length":133},"14353213":{"pageid":14353213,"ns":14,"title":"Category:Countries by form of government","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-08-13T23:33:29Z","lastrevid":675984011,"length":261},"5112467":{"pageid":5112467,"ns":14,"title":"Category:Countries by international organization","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-18T05:11:12Z","lastrevid":686245148,"length":123},"4696391":{"pageid":4696391,"ns":14,"title":"Category:Countries by language","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T01:17:18Z","lastrevid":675966601,"length":333},"5112374":{"pageid":5112374,"ns":14,"title":"Category:Countries by status","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-08-13T21:05:47Z","lastrevid":675966630,"length":30},"708617":{"pageid":708617,"ns":14,"title":"Category:Lists of countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-18T05:08:45Z","lastrevid":681553760,"length":256},"46624537":{"pageid":46624537,"ns":14,"title":"Category:Caspian littoral states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-23T08:40:34Z","lastrevid":663549987,"length":50},"18066512":{"pageid":18066512,"ns":14,"title":"Category:City-states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-29T20:14:14Z","lastrevid":679367764,"length":145},"2019528":{"pageid":2019528,"ns":14,"title":"Category:Country 
classifications","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-25T09:09:13Z","lastrevid":675966465,"length":182},"935240":{"pageid":935240,"ns":14,"title":"Category:Country codes","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T06:05:53Z","lastrevid":546489724,"length":222},"36819536":{"pageid":36819536,"ns":14,"title":"Category:Countries in fiction","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-03T06:09:16Z","lastrevid":674147667,"length":169},"699787":{"pageid":699787,"ns":14,"title":"Category:Fictional countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T18:43:25Z","lastrevid":610289877,"length":356},"804303":{"pageid":804303,"ns":14,"title":"Category:Former countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-09-21T09:58:52Z","lastrevid":668632882,"length":403},"7213567":{"pageid":7213567,"ns":14,"title":"Category:Island countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-22T22:10:37Z","lastrevid":648502876,"length":157},"3046541":{"pageid":3046541,"ns":14,"title":"Category:Landlocked countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-04T00:45:24Z","lastrevid":648502892,"length":54},"743058":{"pageid":743058,"ns":14,"title":"Category:Middle Eastern countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-12T14:41:59Z","lastrevid":677900732,"length":495},"41711462":{"pageid":41711462,"ns":14,"title":"Category:Mongol states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-23T07:36:21Z","lastrevid":687093637,"length":121},"30645082":{"pageid":30645082,"ns":14,"title":"Category:Country names","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-03-07T06:33:19Z","lastrevid":561256656,"length":94},"21218559":{"pageid":21218559,"ns":14,"title":"Category:Outlines of 
countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-07T18:04:29Z","lastrevid":645312408,"length":248},"37943702":{"pageid":37943702,"ns":14,"title":"Category:Proposed countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T02:30:25Z","lastrevid":668630396,"length":130},"15086044":{"pageid":15086044,"ns":14,"title":"Category:Turkic states","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-20T06:23:35Z","lastrevid":677424552,"length":114},"32809189":{"pageid":32809189,"ns":14,"title":"Category:Works about countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T08:45:32Z","lastrevid":620016516,"length":153},"27539189":{"pageid":27539189,"ns":14,"title":"Category:Wikipedia books on countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-04-11T05:12:25Z","lastrevid":546775798,"length":203},"35317198":{"pageid":35317198,"ns":14,"title":"Category:Wikipedia categories named after countries","contentmodel":"wikitext","pagelanguage":"en","touched":"2015-10-17T18:35:14Z","lastrevid":641689352,"length":202}}}}
You did not receive a response because of an HTTP 3xx redirect: Wikipedia redirects your HTTP request. Try the source code below to fetch the response from the redirected URL (see also "How to send HTTP request GET/POST in Java").
public static String readfromURLwithRedirect(String url) {
String response = "";
try {
URL obj = new URL(url);
HttpURLConnection conn = (HttpURLConnection) obj.openConnection();
conn.setReadTimeout(5000);
conn.addRequestProperty("Accept-Language", "en-US,en;q=0.8");
conn.addRequestProperty("User-Agent", "Mozilla");
conn.addRequestProperty("Referer", "google.com");
System.out.println("Request URL ... " + url);
boolean redirect = false;
// normally, 3xx is redirect
int status = conn.getResponseCode();
if (status != HttpURLConnection.HTTP_OK) {
if (status == HttpURLConnection.HTTP_MOVED_TEMP
|| status == HttpURLConnection.HTTP_MOVED_PERM
|| status == HttpURLConnection.HTTP_SEE_OTHER) {
redirect = true;
}
}
System.out.println("Response Code ... " + status);
if (redirect) {
// get redirect url from "location" header field
String newUrl = conn.getHeaderField("Location");
// get the cookie if needed, for login
String cookies = conn.getHeaderField("Set-Cookie");
// open the new connection
conn = (HttpURLConnection) new URL(newUrl).openConnection();
conn.setRequestProperty("Cookie", cookies);
conn.addRequestProperty("Accept-Language", "en-US,en;q=0.8");
conn.addRequestProperty("User-Agent", "Mozilla");
conn.addRequestProperty("Referer", "google.com");
System.out.println("Redirect to URL : " + newUrl);
}
BufferedReader in = new BufferedReader(
new InputStreamReader(conn.getInputStream()));
String inputLine;
StringBuffer responseBuffer = new StringBuffer();
while ((inputLine = in.readLine()) != null) {
responseBuffer.append(inputLine);
}
in.close();
System.out.println("URL Content... \n" + responseBuffer.toString());
response = responseBuffer.toString();
System.out.println("Done");
} catch (Exception e) {
e.printStackTrace();
}
return response;
}
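One detail the redirect handling above leaves open: a Location header may hold a relative reference rather than an absolute URL, in which case passing it straight to new URL(...) fails. As far as I know, the standard JDK HttpURLConnection follows redirects by itself only when the protocol stays the same, which would explain why the http to https redirect here has to be handled manually. The following is a small sketch of resolving the Location value against the original request URL; the class and method names are hypothetical.

```java
import java.net.URI;

public class RedirectResolver {

    // Resolves a Location header value against the URL that was requested.
    // An absolute Location is returned unchanged; a relative one inherits
    // scheme and host from the request URL.
    public static String resolveLocation(String requestUrl, String location) {
        return URI.create(requestUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        String request = "http://en.wikipedia.org/w/api.php?action=query";
        System.out.println(resolveLocation(request, "https://en.wikipedia.org/w/api.php?action=query"));
        System.out.println(resolveLocation(request, "/wiki/Main_Page"));
    }
}
```

The resolved string can then be passed to new URL(...) in place of the raw header value.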

FileNotFoundException on HttpsURLConnection with POST and unmutable doOutput variable

I'm trying to POST some data to an https url, in my android application, in order to get a json format response.
I'm facing two problems:
is = conn.getInputStream();
throws
java.io.FileNotFoundException
I don't understand whether I am doing something wrong with HttpsURLConnection.
The second problem arose when I debugged the code (using Eclipse); I set a breakpoint after
conn.setDoOutput(true);
and, when inspecting the values of conn, I see that the doOutput variable remains set to false and the request method is still GET.
My method for https POST is the following, where POSTData is a class extending ArrayList<NameValuePair>
private static String httpsPOST(String urlString, POSTData postData, List<HttpCookie> cookies) {
String result = null;
HttpsURLConnection conn = null;
OutputStream os = null;
InputStream is = null;
try {
URL url = new URL(urlString);
conn = (HttpsURLConnection) url.openConnection();
conn.setReadTimeout(10000);
conn.setConnectTimeout(15000);
conn.setRequestMethod("POST");
conn.setUseCaches (false);
conn.setDoInput(true);
conn.setDoOutput(true);
if(cookies != null)
conn.setRequestProperty("Cookie",
TextUtils.join(";", cookies));
os = conn.getOutputStream();
BufferedWriter writer = new BufferedWriter(
new OutputStreamWriter(os, "UTF-8"));
writer.write(postData.getPostData());
writer.flush();
writer.close();
is = conn.getInputStream();
BufferedReader r = new BufferedReader(
new InputStreamReader(is));
StringBuilder total = new StringBuilder();
String line;
while ((line = r.readLine()) != null) {
total.append(line);
}
result = total.toString();
} catch (UnsupportedEncodingException e) {
e.printStackTrace();
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (ProtocolException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (os != null) {
try {
os.close();
} catch (IOException e) {
}
}
if (is != null) {
try {
is.close();
} catch (IOException e) {
}
}
if (conn != null) {
conn.disconnect();
}
}
return result;
}
A little update: apparently the Eclipse debugger lied to me; running and debugging in NetBeans shows a POST connection. The error seems to be related to the parameters I'm passing to the URL.
FileNotFoundException means that the URL you posted to doesn't exist, or couldn't be mapped to a servlet. It is the result of an HTTP 404 status code.
Don't worry about what you see in the debugger if it doesn't agree with how the program behaves. If doOutput really wasn't enabled, you would get an exception obtaining the output stream.
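The claim about doOutput can be checked without a debugger. Below is a minimal sketch, assuming the standard JDK implementation, where getOutputStream() is rejected with a ProtocolException while doOutput is false (other implementations, such as Android's, may word the error differently). The class and method names are hypothetical. No request is sent, because the check happens before any connection is made.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URL;

public class DoOutputCheck {

    // Returns true if asking for the output stream fails because
    // setDoOutput(true) was never called.
    public static boolean writeRejectedWithoutDoOutput(String urlString) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(urlString).openConnection();
            // doOutput defaults to false and is deliberately left untouched.
            conn.getOutputStream().close();
            return false;
        } catch (ProtocolException expected) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    // Reads the flag back through the public getter instead of the debugger view.
    public static boolean doOutputAfterSetter(String urlString) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(urlString).openConnection();
            conn.setDoOutput(true);
            return conn.getDoOutput();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Logging conn.getDoOutput() and conn.getRequestMethod() right before the write is usually more trustworthy than inspecting fields in a debugger, since the debugger may show a field of a wrapper object rather than of the delegate that does the work.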

Sending SMS through low end API

I have a message constructor method of the form:
public static String constructMsg(CustomerInfo customer) {
... snipped
String msg = String.format("Snipped code encapsulated by customer object");
return msg;
}
The API link is:
http://xxx.xxx.xx.xx:8080/bulksms?username=xxxxxxx &password=xxxx &type=0 &dlr=1&destination=10digitno & source=xxxxxx& message=xxxxx
In my main method I have:
List<CustomerInfo> customer = dao.getSmsDetails(userDate);
theLogger.info("Total No : " + customer.size() );
if (!customer.isEmpty()) {
for (CustomerInfo cust : customer) {
String message = constructMsg(cust);
// Add link and '?' and query string
// use URLConnection's connect method
}
}
So I am using the connect method of URLConnection. The API does not have any documentation. Is there any way to check the response?
My other question: I have been advised to use ThreadPoolExecutor. How would I use it here?
This method uses HttpURLConnection to perform a GET request and returns the response as a String. There are many ways to do it; this one is not particularly brilliant, but it is quite readable.
public String getResponse(String url, int timeout) {
HttpURLConnection c = null; // must be initialized, because the finally block reads it
try {
URL u = new URL(url);
c = (HttpURLConnection) u.openConnection();
c.setRequestMethod("GET");
c.setRequestProperty("Content-length", "0");
c.setUseCaches(false);
c.setAllowUserInteraction(false);
c.setConnectTimeout(timeout);
c.setReadTimeout(timeout);
c.connect();
int status = c.getResponseCode();
switch (status) {
case 200:
case 201:
BufferedReader br = new BufferedReader(new InputStreamReader(c.getInputStream()));
StringBuilder sb = new StringBuilder();
String line;
while ((line = br.readLine()) != null) {
sb.append(line+"\n");
}
br.close();
return sb.toString();
default:
return "HTTP CODE: "+status;
}
} catch (MalformedURLException ex) {
Logger.getLogger(DebugServer.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
Logger.getLogger(DebugServer.class.getName()).log(Level.SEVERE, null, ex);
} finally{
if(c!=null) c.disconnect();
}
return null;
}
Call this method like this:
getResponse("http://xxx.xxx.xx.xx:8080/bulksms?username=xxxxxxx&password=xxxx&type=0&dlr=1&destination=10digitno&source=xxxxxx&message=xxxxx",2000);
I assume the whitespaces in your URL are not supposed to be there.
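For the ThreadPoolExecutor part of the question, here is a rough sketch of one possible approach, not the only one. It uses Executors.newFixedThreadPool, which is backed by a ThreadPoolExecutor, and takes the actual HTTP call as a function parameter. The class name BulkSender, the method sendAll, and the pool size are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

public class BulkSender {

    // Submits one task per URL to a fixed-size pool and collects the
    // responses in the same order as the input list.
    public static List<String> sendAll(List<String> urls,
                                       Function<String, String> sender,
                                       int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<String> task = () -> sender.apply(url);
                futures.add(pool.submit(task));
            }
            List<String> responses = new ArrayList<>();
            for (Future<String> future : futures) {
                try {
                    responses.add(future.get()); // blocks until this task is done
                } catch (ExecutionException e) {
                    responses.add("FAILED: " + e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    responses.add("INTERRUPTED");
                }
            }
            return responses;
        } finally {
            pool.shutdown(); // stop accepting tasks; submitted ones still finish
        }
    }
}
```

In the loop from the question, you would build one URL per customer, collect them in a list, and call something like sendAll(urls, url -> getResponse(url, 2000), 4). Whether sending in parallel is acceptable depends on the SMS gateway, which is undocumented here, so a small pool size is the cautious choice.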

Java, FileNotfound Exception, While reading conn.getInputStream()

Could someone please tell me how to resolve this problem?
Sometimes I get a FileNotFoundException and sometimes this code works fine.
Below is my code:
public String sendSMS(String data, String url1) {
URL url;
String status = "Somthing wrong ";
try {
url = new URL(url1);
URLConnection conn = url.openConnection();
conn.setDoOutput(true);
conn.setRequestProperty("User-Agent","Mozilla/5.0 ( compatible ) ");
conn.setRequestProperty("Accept","*/*");
OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
wr.write(data);
wr.flush();
// Get the response
try {
BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
String s;
while ((s = rd.readLine()) != null) {
status = s;
}
rd.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
wr.close();
} catch (MalformedURLException e) {
status = "MalformedURLException Exception in sendSMS";
e.printStackTrace();
} catch (IOException e) {
status = "IO Exception in sendSMS";
e.printStackTrace();
}
return status;
}
Rewrite like this and let me know how you go... (note closing of reading and writing streams, also the cleanup of streams if an exception is thrown).
public String sendSMS(String data, String url1) {
URL url;
OutputStreamWriter wr = null;
BufferedReader rd = null;
String status = "Somthing wrong ";
try {
url = new URL(url1);
URLConnection conn = url.openConnection();
conn.setDoOutput(true);
conn.setRequestProperty("User-Agent","Mozilla/5.0 ( compatible ) ");
conn.setRequestProperty("Accept","*/*");
wr = new OutputStreamWriter(conn.getOutputStream());
wr.write(data);
wr.flush();
wr.close();
rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
String s;
while ((s = rd.readLine()) != null) {
status = s;
}
rd.close();
} catch (Exception e) {
if (wr != null) try { wr.close(); } catch (Exception x) {/*cleanup*/}
if (rd != null) try { rd.close(); } catch (Exception x) {/*cleanup*/}
e.printStackTrace();
}
return status;
}
This issue seems to be known, but it has been reported for different reasons, so it is not clear why it happened in your case.
Some threads recommend closing the OutputStreamWriter, because flushing it is not enough. I would therefore close it directly after flushing, since you do not use it between the flush and the close.
Other threads report that using a different connection class such as HttpURLConnection avoids the problem.
Another article suggests using the static encode method of the URLEncoder class. This method takes a string and encodes it into a form that is safe to put in a URL.
Some similar questions:
URL is accessable with browser but still FileNotFoundException with URLConnection
URLConnection FileNotFoundException for non-standard HTTP port sources
URLConnection throwing FileNotFoundException
Wish you good luck.
getInputStream() throws FileNotFoundException when the server responds to the HTTP request with status code 404.
Check your URL.

URLConnection is not allowing me to access data on Http errors (404,500,etc)

I am making a crawler and need to get the data from the stream regardless of whether the status is 200 or not. cURL does this, as does any standard browser.
The following does not actually get the content of the response even though there is some; instead, an exception is thrown with the HTTP error status code. I want the output regardless. Is there a way? I would prefer to use this library because it supports persistent connections, which is perfect for the type of crawling I am doing.
package test;
import java.net.*;
import java.io.*;
public class Test {
public static void main(String[] args) {
try {
URL url = new URL("http://github.com/XXXXXXXXXXXXXX");
URLConnection connection = url.openConnection();
DataInputStream inStream = new DataInputStream(connection.getInputStream());
String inputLine;
while ((inputLine = inStream.readLine()) != null) {
System.out.println(inputLine);
}
inStream.close();
} catch (MalformedURLException me) {
System.err.println("MalformedURLException: " + me);
} catch (IOException ioe) {
System.err.println("IOException: " + ioe);
}
}
}
Worked, thanks: Here is what I came up with - just as a rough proof of concept:
import java.net.*;
import java.io.*;
public class Test {
public static void main(String[] args) {
//InputStream error = ((HttpURLConnection) connection).getErrorStream();
URL url = null;
URLConnection connection = null;
String inputLine = "";
try {
url = new URL("http://verelo.com/asdfrwdfgdg");
connection = url.openConnection();
DataInputStream inStream = new DataInputStream(connection.getInputStream());
while ((inputLine = inStream.readLine()) != null) {
System.out.println(inputLine);
}
inStream.close();
} catch (MalformedURLException me) {
System.err.println("MalformedURLException: " + me);
} catch (IOException ioe) {
System.err.println("IOException: " + ioe);
InputStream error = ((HttpURLConnection) connection).getErrorStream();
try {
int data = error.read();
while (data != -1) {
//do something with data...
//System.out.println(data);
inputLine = inputLine + (char)data;
data = error.read();
//inputLine = inputLine + (char)data;
}
error.close();
} catch (Exception ex) {
try {
if (error != null) {
error.close();
}
} catch (Exception e) {
}
}
}
System.out.println(inputLine);
}
}
Simple:
URLConnection connection = url.openConnection();
InputStream is;
if (connection instanceof HttpURLConnection) {
HttpURLConnection httpConn = (HttpURLConnection) connection;
int statusCode = httpConn.getResponseCode();
// getInputStream() throws for 4xx/5xx, so choose the stream only after checking the code
if (statusCode >= 400) {
is = httpConn.getErrorStream();
} else {
is = httpConn.getInputStream();
}
} else {
is = connection.getInputStream();
}
You can refer to Javadoc for explanation. The best way I would handle this is as follows:
URLConnection connection = url.openConnection();
InputStream is = null;
try {
is = connection.getInputStream();
} catch (IOException ioe) {
if (connection instanceof HttpURLConnection) {
HttpURLConnection httpConn = (HttpURLConnection) connection;
int statusCode = httpConn.getResponseCode();
if (statusCode != 200) {
is = httpConn.getErrorStream();
}
}
}
You need to do the following after calling openConnection.
Cast the URLConnection to HttpURLConnection
Call getResponseCode
If the response is a success, use getInputStream, otherwise use getErrorStream
(The test for success should be 200 <= code < 300 because there are valid HTTP success codes other than 200.)
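The steps above can be sketched roughly as follows. One deliberate deviation, which is an assumption about the standard JDK implementation: the sketch switches to getErrorStream() at status 400 rather than 300, because as far as I know the error stream is only populated for 4xx and 5xx responses, while getInputStream() still works for a 3xx response that was not followed. The class and method names are hypothetical, and the body is assumed to be UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class AnyStatusFetcher {

    // True for the status codes where the body has to come from getErrorStream().
    public static boolean isErrorStatus(int status) {
        return status >= 400;
    }

    // Returns the response body whether the request succeeded or not.
    public static String fetchBody(String urlString) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(urlString).openConnection();
        try {
            int status = conn.getResponseCode();
            InputStream in = isErrorStatus(status)
                    ? conn.getErrorStream()
                    : conn.getInputStream();
            if (in == null) {
                return ""; // getErrorStream() returns null if the server sent no body
            }
            StringBuilder body = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    body.append(line).append('\n');
                }
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }
}
```

Checking the status code first avoids using the exception from getInputStream() as control flow, and the null check matters because an error response is not guaranteed to carry a body.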
I am making a crawler, and need to get the data from the stream regardless if it is a 200 or not.
Just be aware that if the code is a 4xx or 5xx, then the "data" is likely to be an error page of some kind.
The final point that should be made is that you should always respect the "robots.txt" file ... and read the Terms of Service before crawling / scraping the content of a site whose owners might care. Simply blatting off GET requests is likely to annoy site owners ... unless you've already come to some sort of "arrangement" with them.
