import java.net.URL;
import java.io.*;
import java.net.MalformedURLException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class Test {

    public static void main(String[] args) {
        try {
            processHTMLFromLink(new URL("http://fwallpapers.com"));
        } catch (MalformedURLException ex) {
            Logger.getLogger(Test.class.getName()).log(Level.SEVERE, null, ex);
        }
    }

    public static int processHTMLFromLink(URL url) {
        InputStream is = null;
        DataInputStream dis;
        String line;
        int count = 0;
        try {
            BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        } catch (MalformedURLException mue) {
            System.out.println(mue.toString());
        } catch (IOException ioe) {
            System.out.println(ioe.toString());
        } finally {
            try {
                is.close();
            } catch (IOException ioe) {
                // nothing to see here
            }
        }
        return count;
    }
}
error:
java.io.IOException: Server returned HTTP response code: 403 for URL: http://fwallpapers.com
Exception in thread "main" java.lang.NullPointerException
at Test.processHTMLFromLink(Test.java:38)
at Test.main(Test.java:15)
Java Result: 1
It is working fine in the browser, but I am getting null pointer exceptions. This code works fine with other links. Can anyone help me out with this? How can I get the content while I am getting a 403 error?
This is an old post, but for people who want to know how this works:
A 403 means access denied.
There is a workaround for this.
If you want to be able to do this, you have to set a User-Agent parameter to 'fool' the website.
This is what my old method looked like:
private InputStream read() {
    try {
        return url.openStream();
    } catch (IOException e) {
        String error = e.toString();
        throw new RuntimeException(e);
    }
}
Changed it to: (And it works for me!)
private InputStream read() {
    try {
        HttpURLConnection httpcon = (HttpURLConnection) url.openConnection();
        httpcon.addRequestProperty("User-Agent", "Mozilla/4.0");
        return httpcon.getInputStream();
    } catch (IOException e) {
        String error = e.toString();
        throw new RuntimeException(e);
    }
}
Your mistake is swallowing the exception.
When I run my code, I get an HTTP 403 - "forbidden". The web server won't allow you to do this.
My code works perfectly for http://www.yahoo.com.
Here's how I do it:
package url;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;

/**
 * UrlReader
 * @author Michael
 * @since 3/20/11
 */
public class UrlReader {

    public static void main(String[] args) {
        UrlReader urlReader = new UrlReader();
        for (String url : args) {
            try {
                String contents = urlReader.readContents(url);
                System.out.printf("url: %s contents: %s\n", url, contents);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    public String readContents(String address) throws IOException {
        StringBuilder contents = new StringBuilder(2048);
        BufferedReader br = null;
        try {
            URL url = new URL(address);
            br = new BufferedReader(new InputStreamReader(url.openStream()));
            String line;
            while ((line = br.readLine()) != null) {
                contents.append(line);
            }
        } finally {
            close(br);
        }
        return contents.toString();
    }

    private static void close(Reader br) {
        try {
            if (br != null) {
                br.close();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
This is now a completely different question so I have edited your title.
According to your edit, you aren't getting null pointer exceptions, you are getting HTTP 403 status, which means 'Forbidden', which means you can't access that resource.
Related
I am new to Java and currently struggling a bit.
I am able to read the XML files and authentication is also completed, but I am not able to download the file on the client.
For example, the URL looks like this: "http://myworld.com/436789.xml".
Destination: "/infa_shared/cache".
File name to be saved at the destination: 436789.xml.
I need to pass these three as variables.
Below is the code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.Authenticator;
import java.net.MalformedURLException;
import java.net.PasswordAuthentication;
import java.net.URL;

public class MyMain {

    public static void main(String[] args) {
        URL url;
        InputStream is = null;
        BufferedReader br;
        String line;

        // Install Authenticator
        MyAuthenticator.setPasswordAuthentication("User_name", "Password");
        Authenticator.setDefault(new MyAuthenticator());

        try {
            url = new URL("http://myworld.com/436789.xml");
            is = url.openStream(); // throws an IOException
            br = new BufferedReader(new InputStreamReader(is));
            while ((line = br.readLine()) != null) {
                System.out.println(line);
            }
        } catch (MalformedURLException mue) {
            mue.printStackTrace();
        } catch (IOException ioe) {
            ioe.printStackTrace();
        } finally {
            try {
                if (is != null)
                    is.close();
            } catch (IOException ioe) {
                // nothing to see here
            }
        }
    }
}

class MyAuthenticator extends Authenticator {
    private static String username = "";
    private static String password = "";

    protected PasswordAuthentication getPasswordAuthentication() {
        return new PasswordAuthentication(MyAuthenticator.username, MyAuthenticator.password.toCharArray());
    }

    public static void setPasswordAuthentication(String username, String password) {
        MyAuthenticator.username = username;
        MyAuthenticator.password = password;
    }
}
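What the posted code is still missing is the step that saves the stream to the destination instead of printing it. A minimal sketch of that download step, assuming the MyAuthenticator class above is on the classpath and treating the URL, destination directory, and file name as the three variables (java.nio.file.Files.copy does the byte copying):

import java.io.IOException;
import java.io.InputStream;
import java.net.Authenticator;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class XmlDownloader {

    public static void main(String[] args) throws IOException {
        // the three values the question wants to pass in as variables
        String address = "http://myworld.com/436789.xml";
        String destination = "/infa_shared/cache";
        String fileName = "436789.xml";

        // same authentication setup as in the code above
        MyAuthenticator.setPasswordAuthentication("User_name", "Password");
        Authenticator.setDefault(new MyAuthenticator());

        Path target = Paths.get(destination, fileName);
        try (InputStream is = new URL(address).openStream()) {
            // copy the raw bytes to the destination file
            Files.copy(is, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}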
I am trying to use an API from https://us.mc-api.net/ for a project and I have made this as a test.
public static void main(String[] args) {
    try {
        URL url = new URL("http://us.mc-api.net/v3/uuid/193nonaxishsl/csv/");
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line);
        }
        in.close();
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
        System.out.println("I/O Error");
    }
}
And this is giving me an IOException, but whenever I open the same page in my web browser I get
false,Unknown-Username
which is what I want to get from the code. I am new and don't really know why this is happening.
EDIT: StackTrace
java.io.FileNotFoundException: http://us.mc-api.net/v3/uuid/193nonaxishsl/csv/
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at java.net.URL.openStream(Unknown Source)
at com.theman1928.Test.Main.main(Main.java:13)
The URL is returning status code 404, so (mild guess here) the input stream is not being created and is therefore null. Sort out the status code and you should be OK.
Ran it with this CSV and it is fine: other csv
If the error code is important to you then you can use HttpURLConnection:
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
System.out.println("code:"+conn.getResponseCode());
In that way you can process the response code before proceeding with a quick if-then-else check.
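If you also want the body that the browser shows even on the 404 (the false,Unknown-Username line), a sketch like this should work: HttpURLConnection exposes a non-2xx response body through getErrorStream() rather than getInputStream().

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class ErrorBodyDemo {
    public static void main(String[] args) throws IOException {
        URL url = new URL("http://us.mc-api.net/v3/uuid/193nonaxishsl/csv/");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        int code = conn.getResponseCode();
        // getInputStream() throws FileNotFoundException on a 404;
        // the body the browser shows is available via getErrorStream()
        InputStream body = code >= 400 ? conn.getErrorStream() : conn.getInputStream();
        if (body == null) { // some servers send no error body at all
            System.out.println("HTTP " + code + " with no body");
            return;
        }
        try (BufferedReader br = new BufferedReader(new InputStreamReader(body))) {
            String line;
            while ((line = br.readLine()) != null) {
                System.out.println(line); // "false,Unknown-Username" despite the 404
            }
        }
    }
}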
I tried it with the Apache HTTP libraries. The API endpoint seems to return a status code of 404, hence your error. Code I used is below.
public static void main(String[] args) throws URISyntaxException, ClientProtocolException, IOException {
    HttpClient httpclient = HttpClients.createDefault();
    URIBuilder builder = new URIBuilder("http://us.mc-api.net/v3/uuid/193nonaxishsl/csv/");
    URI uri = builder.build();
    HttpGet request = new HttpGet(uri);
    HttpResponse response = httpclient.execute(request);
    System.out.println(response.getStatusLine().getStatusCode()); // 404
}
Switching out the http://us.mc-api.net/v3/uuid/193nonaxishsl/csv/ for www.example.com or whatever returns a status code of 200, which further points to an error with the API endpoint. You can take a look at the Apache HttpComponents library here.
This has to do with how the java.net classes handle the wire protocol compared with an actual browser. A browser is going to be much more sophisticated than the simple java.net API you are using.
If you want to get the equivalent response value in Java, then you need to use a richer HTTP API.
This code will give you the same response as the browser; however, you need to download the Apache HttpComponents jars.
The code:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpUriRequest;
import org.apache.http.impl.client.HttpClients;

public class TestDriver
{
    public static void main(String[] args)
    {
        try
        {
            String url = "http://us.mc-api.net/v3/uuid/193nonaxishsl/csv";
            HttpGet httpGet = new HttpGet(url);
            getResponseFromHTTPReq(httpGet, url);
        }
        catch (Throwable e)
        {
            e.printStackTrace();
        }
    }

    private static String getResponseFromHTTPReq(HttpUriRequest httpReq, String url)
    {
        HttpClient httpclient = HttpClients.createDefault();

        // Execute and get the response.
        HttpResponse response = null;
        HttpEntity entity = null;
        try
        {
            response = httpclient.execute(httpReq);
            entity = response.getEntity();
        }
        catch (IOException ioe)
        {
            throw new RuntimeException(ioe);
        }

        if (entity == null)
        {
            String errMsg = "No response entity back from " + url;
            throw new RuntimeException(errMsg);
        }

        String returnRes = null;
        InputStream is = null;
        BufferedReader buf = null;
        try
        {
            is = entity.getContent();
            buf = new BufferedReader(new InputStreamReader(is, "UTF-8"));
            System.out.println("Response Code : " + response.getStatusLine().getStatusCode());

            StringBuilder sb = new StringBuilder();
            String s;
            while ((s = buf.readLine()) != null)
            {
                sb.append(s);
            }
            returnRes = sb.toString();
            System.out.println("Response: [" + returnRes + "]");
        }
        catch (UnsupportedOperationException | IOException e)
        {
            throw new RuntimeException(e);
        }
        finally
        {
            if (buf != null)
            {
                try
                {
                    buf.close();
                }
                catch (IOException e)
                {
                }
            }
            if (is != null)
            {
                try
                {
                    is.close();
                }
                catch (IOException e)
                {
                }
            }
        }
        return returnRes;
    }
}
Outputs:
Response Code : 404
Response: [false,Unknown-Username]
What I want to do is get the content of this URL:
https://www.aviationweather.gov/adds/dataserver_current/httpparam?dataSource=metars&requestType=retrieve&format=xml&stationString=CYQB&hoursBeforeNow=2
and copy it to a file so I can parse it and use the elements.
Here is what I have so far :
package test;

import java.io.*;
import java.net.*;
import org.apache.commons.io.FileUtils;

public class JavaGetUrl {

    @SuppressWarnings("deprecation")
    public static void main(String[] args) throws FileNotFoundException {
        URL u;
        InputStream is = null;
        DataInputStream dis;
        String s = null;
        try {
            u = new URL(
                    "https://www.aviationweather.gov/adds/dataserver_current/httpparam?dataSource=metars&requestType=retrieve&format=xml&stationString=CYQB&hoursBeforeNow=2");
            is = u.openStream(); // throws an IOException
            dis = new DataInputStream(new BufferedInputStream(is));
            while ((s = dis.readLine()) != null) {
                System.out.println(s);
                FileUtils.writeStringToFile(new File("input.txt"), s);
            }
        } catch (MalformedURLException mue) {
            System.out.println("Ouch - a MalformedURLException happened.");
            mue.printStackTrace();
            System.exit(1);
        } catch (IOException ioe) {
            System.out.println("Oops- an IOException happened.");
            ioe.printStackTrace();
            System.exit(1);
        } finally {
            try {
                is.close();
            } catch (IOException ioe) {
            }
        }
    }
}
The problem is that the content of s does not show up in input.txt.
If I replace s with any other string, it works. So I guess it's a problem with the data of s. Is it because it's XML?
Thank you all for the help.
The file is probably getting overwritten on each iteration.
You should use "append" mode so that each line read is appended to the file:
public static void writeStringToFile(File file, String data, boolean append)
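For example, inside the read loop (assuming a commons-io version that still provides this overload; the third argument selects append mode):

// true = append instead of overwriting input.txt on every iteration
FileUtils.writeStringToFile(new File("input.txt"), s + "\n", true);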
As you are already using Apache commons-io, you can also simply use
FileUtils.copyURLToFile(URL, File)
see https://commons.apache.org/proper/commons-io/javadocs/api-2.4/org/apache/commons/io/FileUtils.html#copyURLToFile(java.net.URL,%20java.io.File)
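Applied to the code in the question, this collapses the whole readLine/write loop into a single call (u is the URL variable from the question):

// downloads the METAR XML straight into input.txt
FileUtils.copyURLToFile(u, new File("input.txt"));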
Hi, I am using the following code to read a URL:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class JavaHttpUrlConnectionReader
{
    public static void main(String[] args)
        throws Exception
    {
        new JavaHttpUrlConnectionReader();
    }

    public JavaHttpUrlConnectionReader()
    {
        try
        {
            String myUrl = "http://epaperbeta.timesofindia.com/NasData/PUBLICATIONS/THETIMESOFINDIA/Delhi/2015/06/09/PageIndex/09_06_2015.xml";
            // if your url can contain weird characters you will want to
            // encode it here, something like this:
            // myUrl = URLEncoder.encode(myUrl, "UTF-8");
            String results = doHttpUrlConnectionAction(myUrl);
            System.out.println(results);
        }
        catch (Exception e)
        {
            // deal with the exception in your "controller"
        }
    }

    /**
     * Returns the output from the given URL.
     */
    private String doHttpUrlConnectionAction(String desiredUrl)
        throws Exception
    {
        URL url = null;
        BufferedReader reader = null;
        StringBuilder stringBuilder;

        try
        {
            // create the HttpURLConnection
            url = new URL(desiredUrl);
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();

            // just want to do an HTTP GET here
            connection.setRequestMethod("GET");

            // uncomment this if you want to write output to this url
            //connection.setDoOutput(true);

            // give it 35 seconds to respond
            connection.setReadTimeout(35 * 1000);
            connection.connect();

            // read the output from the server
            reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
            stringBuilder = new StringBuilder();

            String line = null;
            while ((line = reader.readLine()) != null)
            {
                stringBuilder.append(line + "\n");
            }
            return stringBuilder.toString();
        }
        catch (Exception e)
        {
            e.printStackTrace();
            throw e;
        }
        finally
        {
            // close the reader; this can throw an exception too, so
            // wrap it in another try/catch block.
            if (reader != null)
            {
                try
                {
                    reader.close();
                }
                catch (IOException ioe)
                {
                    ioe.printStackTrace();
                }
            }
        }
    }
}
It gives me the following error:
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)
at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1072)
at JavaHttpUrlConnectionReader.doHttpUrlConnectionAction(JavaHttpUrlConnectionReader.java:77)
at JavaHttpUrlConnectionReader.<init>(JavaHttpUrlConnectionReader.java:33)
at JavaHttpUrlConnectionReader.main(JavaHttpUrlConnectionReader.java:21)
Kindly tell me the reason why it occurs, and a solution for it.
When I run this code outside of my office LAN it works fine, but not on the office LAN.
Thanks & Regards
Abhishek
Your URL:
http://epaperbeta.timesofindia.com/NasData/PUBLICATIONS/THETIMESOFINDIA/Delhi/2015/06/09/PageIndex/09_06_2015.xml
is not accessible without a proxy (for example, I can't access it from here), so it is no wonder it cannot be read from the stream.
Check your proxy settings. You could try the URL in the browser with/without the proxy and see the difference.
As @Jens commented, look at this.
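If the office LAN does require a proxy, one way is to set the standard JVM proxy properties before opening the connection. This is only a sketch: proxy.mycompany.com and 8080 are placeholders for your actual proxy host and port.

// placeholders: substitute your office proxy host and port
System.setProperty("http.proxyHost", "proxy.mycompany.com");
System.setProperty("http.proxyPort", "8080");
// set these before url.openConnection() / connection.connect() is called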
First of all, I am but a lowly web-programmer so have very little experience with actual programming.
I have been given a list of 30,000 urls and I am not going to waste my time clicking each one to check if they are valid - is there a way to read through the text file that they are in and have a program check each line?
The code I currently have is in Java, as that's really all I know, so if there's a better language, again, please let me know.
Here is what I have so far:
public class UrlCheck {
    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.google.com");
        //Need to change this to make it read from text file
        try {
            InputStream inp = null;
            try {
                inp = url.openStream();
            } catch (UnknownHostException ex) {
                System.out.println("Invalid");
            }
            if (inp != null) {
                System.out.println("Valid");
            }
        } catch (MalformedURLException exc) {
            exc.printStackTrace();
        }
    }
}
First you read the file line by line using a BufferedReader and check each line. The code below should work. It is up to you to decide what to do when you encounter an invalid URL: you could just print it, as I show, or write it to another file.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.UnknownHostException;

public class UrlCheck {

    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader("_filename"));
        String line;
        while ((line = br.readLine()) != null) {
            if (checkUrl(line)) {
                System.out.println("URL " + line + " was OK");
            } else {
                System.out.println("URL " + line + " was not VALID"); //handle error as you like
            }
        }
        br.close();
    }

    private static boolean checkUrl(String pUrl) throws IOException {
        URL url = new URL(pUrl);
        try {
            InputStream inp = null;
            try {
                inp = url.openStream();
            } catch (UnknownHostException ex) {
                System.out.println("Invalid");
                return false;
            }
            if (inp != null) {
                System.out.println("Valid");
                return true;
            }
        } catch (MalformedURLException exc) {
            exc.printStackTrace();
            return false;
        }
        return true;
    }
}
The checkUrl method can be simplified as below as well
private static boolean checkUrl(String pUrl) {
    URL url = null;
    InputStream inp = null;
    try {
        url = new URL(pUrl);
        inp = url.openStream();
        return inp != null;
    } catch (IOException e) {
        e.printStackTrace();
        return false;
    } finally {
        try {
            if (inp != null) {
                inp.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
You could just use HttpURLConnection. If the URL is not valid you won't get anything back.
HttpURLConnection connection = null;
try {
    URL myurl = new URL("http://www.myURL.com");
    connection = (HttpURLConnection) myurl.openConnection();
    //Set request method to HEAD to reduce load
    connection.setRequestMethod("HEAD");
    int code = connection.getResponseCode();
    System.out.println("" + code);
} catch (IOException e) {
    //Handle invalid URL
}
I am unsure of your experience, but a multi-threaded solution is possible here. As you read through the text file, store the URLs in a thread-safe structure and let a number of threads attempt to open these connections. This makes for a more efficient solution, since testing 30,000 URLs may take a while if you check each one as you read it in.
Check out a producer-consumer example if you are unsure:
http://www.journaldev.com/1034/java-blockingqueue-example-implementing-producer-consumer-problem
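A rough sketch of that idea with an ExecutorService, reusing the HEAD-request check from the answer above (the pool size of 20 and the one-hour wait are arbitrary choices):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelUrlCheck {

    public static void main(String[] args) throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(20); // arbitrary pool size
        try (BufferedReader br = new BufferedReader(new FileReader("_filename"))) {
            String line;
            while ((line = br.readLine()) != null) {
                final String url = line; // effectively final copy for the lambda
                pool.submit(() -> System.out.println(
                        "URL " + url + (checkUrl(url) ? " was OK" : " was not VALID")));
            }
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS); // wait for the checks to finish
    }

    private static boolean checkUrl(String pUrl) {
        try {
            HttpURLConnection connection = (HttpURLConnection) new URL(pUrl).openConnection();
            connection.setRequestMethod("HEAD"); // no body needed, just the status
            return connection.getResponseCode() == HttpURLConnection.HTTP_OK;
        } catch (IOException e) {
            return false; // malformed URL, unknown host, timeout, ...
        }
    }
}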
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

public class UrlCheck {

    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.google.com");
            //Open the HTTP connection
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            //Get the HTTP response code
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                //The HTTP response code is 200 OK, so the URL is valid
                System.out.println("Valid");
            } else {
                //Else the URL is not valid
                System.out.println("Invalid");
            }
        } catch (MalformedURLException ex) {
            System.out.println("Invalid");
        } catch (IOException ex) {
            System.out.println("Invalid");
        }
    }
}