How to read a JavaScript response from a URL in Java

I need to write a simple Java function that takes a URL and processes the response, which is in JavaScript. I tried using HttpURLConnection, but it could not handle it. Is there a Java library for handling a JavaScript response?
Thanks.
EDIT: My code:
URL url = new URL("https://login.live.com/oauth20_authorize.srf");
HttpURLConnection con = (HttpURLConnection) url.openConnection();
BufferedReader reader = new BufferedReader(new InputStreamReader(con.getInputStream()));
String line;
// Read each line exactly once; calling readLine() twice per iteration skips every other line.
while ((line = reader.readLine()) != null) {
    System.out.println(line);
}
Response:
<html dir="..... Windows Live ID requires JavaScript to sign in. This web browser either does not support JavaScript, or scripts are being blocked......<body onload="evt_LoginHostMobile_onload(event);">
But I want to read the response that this JavaScript produces. Is that possible in Java?

I found a way: HtmlUnit does this, since it can handle JavaScript responses.
Thanks all those negative raters .....

Related

Does the InputStream contain the data of an iframe?

URL url = new URL("https://www.cs.tut.fi/~jkorpela/html/iframe-pdf.html");
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
InputStream in = connection.getInputStream();
After calling getInputStream, I turn all the bytes into a string. But why am I not seeing any sign of the data in the iframe?
My goal is to download the PDF.
If you request a URL, you will only get the contents of that file. An iframe is effectively a separate page, so you need to request it separately. A browser normally does all this transparently.
I would recommend using a library such as jsoup, which has many methods for parsing HTML. You will need that to get the URL of the iframe (and from it the URL of the PDF).
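To illustrate the "find the iframe URL, then request it separately" step, here is a minimal sketch. It deliberately swaps jsoup for a plain regular expression so that it runs with the JDK alone; a real HTML parser is more robust. The class name IframeSources and the file name test.pdf used in the usage note are hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IframeSources {
    // Matches src="..." or src='...' inside an <iframe ...> tag, case-insensitively.
    // Assumption: the attribute value is quoted, which is not required by HTML.
    private static final Pattern IFRAME_SRC = Pattern.compile(
            "<iframe\\b[^>]*?\\bsrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the absolute URL of every iframe found in the given HTML text.
    static List<String> find(String pageUrl, String html) {
        URI base = URI.create(pageUrl);
        List<String> result = new ArrayList<>();
        Matcher m = IFRAME_SRC.matcher(html);
        while (m.find()) {
            // A relative src is resolved against the URL of the page that contains it.
            result.add(base.resolve(m.group(2).trim()).toString());
        }
        return result;
    }
}
```

For example, calling IframeSources.find with the page URL from the question and HTML containing an iframe whose src is test.pdf yields a URL ending in /~jkorpela/html/test.pdf, which can then be opened with a second HttpURLConnection to download the PDF.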

Taking text from a response web page using Java

I am sending commands to a server using HTTP, and I currently need to parse the response that the server sends back (I am sending the command via the command line, and the server's response appears in my browser).
There are a lot of resources such as this: Saving a web page to a file in Java, that clearly illustrate how to scrape a page such as cnn.com. However, since this is a response page that is only generated when the camera receives a specific command, my attempts to use the method described by Mike Deck (in the link above) have met with failure. (Specifically, when my program requests the page again the server returns a 401 error.)
The response from the server opens a new tab in my browser. Essentially, I need to know how to save the current web page using Java, since reading in a file is probably the simplest way to approach this. Do any of you know how to do this?
TL;DR How do you save the current webpage to a webpage.html or webpage.txt file using Java?
EDIT: I used Base64 from Apache Commons Codec, which solved my 401 authentication issue. However, I am still getting a 400 error when I attempt to open my InputStream (see below). Does this mean a connection isn't being established in the first place?
URL url = new URL("http://" + ipAddress + "/axis-cgi/record/record.cgi?diskid=SD_DISK");
byte[] encodedBytes = Base64.encodeBase64("root:pass".getBytes());
String encoding = new String(encodedBytes);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("POST");
connection.setDoInput(true);
connection.setRequestProperty("Authorization", "Basic " + encoding);
connection.connect();
InputStream content = connection.getInputStream();
BufferedReader in = new BufferedReader(new InputStreamReader(content));
String line;
while ((line = in.readLine()) != null) {
    System.out.println(line);
}
EDIT 2: Changing the request to a GET resolved the issue.
So while scrutinizing my code above, I decided to change
connection.setRequestMethod("POST");
to
connection.setRequestMethod("GET");
This solved my problem. In hindsight, I think the server rejected the request because this endpoint is not set up to handle what comes along with a POST.
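For reference, the working combination (Basic authentication plus GET) can be sketched with the JDK alone. This version swaps Apache Commons Codec for java.util.Base64, which is available from Java 8; the class name BasicAuth and the fetch helper are hypothetical, and reading the error stream on failure is an addition that is not in the code above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuth {
    // Builds the value of the Authorization header for HTTP Basic authentication.
    static String header(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    // Sends a GET request with Basic credentials and returns the response body as text.
    static String fetch(String address, String user, String password) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(address).openConnection();
        con.setRequestMethod("GET");
        con.setRequestProperty("Authorization", header(user, password));
        // On 4xx/5xx the body, if any, is only available through the error stream.
        InputStream in = con.getResponseCode() < 400 ? con.getInputStream() : con.getErrorStream();
        if (in == null) {
            return "";
        }
        StringBuilder body = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
        }
        return body.toString();
    }
}
```

The returned text can then be written to webpage.html or parsed directly, without going through the browser.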

Download file programmatically

I am trying to download a vCalendar file using a Java application, but I can't download it from one specific link.
My code is:
URL uri = new URL("http://codebits.eu/s/calendar.ics");
InputStream in = uri.openStream();
int r = in.read();
while (r != -1) {
    System.out.print((char) r);
    r = in.read();
}
When I try to download from another link it works (e.g. http://www.mysportscal.com/Files_iCal_CSV/iCal_AUTO_2011/f1_2011.ics). Something is preventing the download and I can't figure out what; when I try with the browser it works.
I'd follow this approach: get the response code for the connection. If it's a redirect (e.g. 301 in this case), read the Location header and attempt to access the file using that URL.
Simplistic Example:
URL uri = new URL("http://codebits.eu/s/calendar.ics");
HttpURLConnection con = (HttpURLConnection)uri.openConnection();
System.out.println(con.getResponseCode());
System.out.println(con.getHeaderField("Location"));
uri = new URL(con.getHeaderField("Location"));
con = (HttpURLConnection)uri.openConnection();
InputStream in = con.getInputStream();
You should check what that link actually provides. For example, it might be a page that has moved, which gives you back an HTTP 301 code. Your browser will automatically know to go and fetch it from the new URL, but your program won't.
You might want to try, for example, Wireshark to sniff the actual traffic when you do the browser request.
I also think there is a redirect. The browser downloads from the SSL-secured https://codebits.eu/s/calendar.ics. Try using an HttpURLConnection, which follows redirects automatically as long as the protocol stays the same. It does not follow a redirect from http to https, so in that case request the https URL directly:
HttpURLConnection con = (HttpURLConnection)uri.openConnection();
InputStream in = con.getInputStream();
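Since HttpURLConnection does not follow a redirect that switches from http to https on its own, the manual approach from the first answer can be wrapped in a small loop. This is a sketch under assumptions: the class name RedirectFollower, the hop limit, and the set of status codes treated as redirects are choices made for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

class RedirectFollower {
    // Status codes that carry a Location header pointing at the new address.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307 || status == 308;
    }

    // The Location header may be relative, so resolve it against the URL that returned it.
    static String resolve(String current, String location) {
        return URI.create(current).resolve(location).toString();
    }

    // Opens the address and follows up to maxHops redirects, including http -> https.
    static HttpURLConnection open(String address, int maxHops) throws IOException {
        String current = address;
        for (int hop = 0; hop <= maxHops; hop++) {
            HttpURLConnection con = (HttpURLConnection) new URL(current).openConnection();
            con.setInstanceFollowRedirects(false); // every hop is handled in this loop
            int status = con.getResponseCode();
            String location = con.getHeaderField("Location");
            if (!isRedirect(status) || location == null) {
                return con; // final response; the caller reads getInputStream()
            }
            con.disconnect();
            current = resolve(current, location);
        }
        throw new IOException("Too many redirects, last URL: " + current);
    }
}
```

Usage would look like RedirectFollower.open("http://codebits.eu/s/calendar.ics", 5).getInputStream(), followed by the same read loop as in the question.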

403 error when accessing a URL, but it works fine in browsers

String url = "http://maps.googleapis.com/maps/api/directions/xml?origin=Chicago,IL&destination=Los+Angeles,CA&waypoints=Joplin,MO|Oklahoma+City,OK&sensor=false";
URL google = new URL(url);
HttpURLConnection con = (HttpURLConnection) google.openConnection();
When I use a BufferedReader to print the content, I get a 403 error.
The same URL works fine in the browser. Could anyone suggest a fix?
The reason it works in a browser but not in java code is that the browser adds some HTTP headers which you lack in your Java code, and the server requires those headers. I've been in the same situation - and the URL worked both in Chrome and the Chrome plugin "Simple REST Client", yet didn't work in Java. Adding this line before the getInputStream() solved the problem:
connection.addRequestProperty("User-Agent", "Mozilla/4.0");
..even though I have never used Mozilla. Your situation might require a different header. It might be related to cookies ... I was getting text in the error stream advising me to enable cookies.
Note that you might get more information by looking at the error text. Here's my code:
try {
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    connection.addRequestProperty("User-Agent", "Mozilla/4.0");
    InputStream input;
    // getResponseCode() must be called before getErrorStream() works
    if (connection.getResponseCode() == 200) {
        input = connection.getInputStream();
    } else {
        input = connection.getErrorStream();
    }
    BufferedReader reader = new BufferedReader(new InputStreamReader(input));
    String msg;
    while ((msg = reader.readLine()) != null) {
        System.out.println(msg);
    }
} catch (IOException e) {
    System.err.println(e);
}
HTTP 403 is the Forbidden status code. You would have to read HttpURLConnection.getErrorStream() to see the response from the server, if any, which can tell you why you were given an HTTP 403.
This code should work fine. If you have been making a number of requests, it is possible that Google is just throttling you. I have seen Google do this before. You can try using a proxy to verify.
Most browsers automatically encode URLs when you enter them, but the Java URL class doesn't.
You should encode the query parameter values (not the whole URL) with URLEncoder.
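A minimal sketch of that encoding step follows. URLEncoder produces form encoding, so it is applied to each parameter name and value separately; applying it to the whole URL would also escape the ':' and '/' characters. The class name QueryBuilder is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryBuilder {
    // Encodes one name or value; a space becomes '+', ',' becomes %2C and '|' becomes %7C.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    // Appends the encoded parameters to a base URL that has no query string yet.
    static String build(String base, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(base);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            sb.append(separator)
              .append(encode(entry.getKey()))
              .append('=')
              .append(encode(entry.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("origin", "Chicago,IL");
        params.put("destination", "Los Angeles,CA");
        params.put("waypoints", "Joplin,MO|Oklahoma City,OK");
        params.put("sensor", "false");
        System.out.println(build("http://maps.googleapis.com/maps/api/directions/xml", params));
    }
}
```

Whether the unencoded '|' is what triggers the 403 in this particular case is not confirmed by the thread; the sketch only shows how to produce a URL that contains no unescaped special characters.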
I know this is a bit late, but the easiest way to get the contents of a URL is to use the Apache HttpComponents HttpClient project: http://hc.apache.org/httpcomponents-client-ga/index.html
Your original page (with the link) and the linked target page are not in the same domain: call them original-domain and target-domain.
I found that the difference is in the request header.
With the 403 Forbidden error, the request header has one line:
Referer: http://original-domain/json2tree/ipfs/ipfsList.html
When I enter the URL directly, there is no 403 Forbidden, and the request header does NOT have the above Referer line for original-domain.
I finally figured out how to fix this error!
On your original-domain web page, you have to add
<meta name="referrer" content="no-referrer" />
It prevents the Referer header from being sent, and works both for links and for Ajax requests.

How can I download comments on a webpage (Android)

Usually I use this code to download a webpage source:
URL myURL = new URL("http://mysite.com/index.html");
StringBuffer all = new StringBuffer("");
URLConnection ucon = myURL.openConnection();
InputStream is = ucon.getInputStream();
BufferedReader page = new BufferedReader(new InputStreamReader(is, "ISO-8859-15"));
String linea;
while ((linea = page.readLine()) != null) {
    all.append(linea.trim());
}
It works fine over a Wi-Fi connection, because the downloaded string includes comments such as <!-- it's a comment -->. But when I use the mobile connection on my phone, the comments are not downloaded. Is there a way to include the comments when downloading the webpage source?
Thanks for any replies ;)
It is possible that your service provider is compressing the pages on their side to reduce the data sent. I've not heard of this being done for HTML, but it is frequently done for JPG, so it's easy to imagine that's what's happening. This compression would be very likely to remove comments.
It would be nice if there was some HTTP convention to tell the stack 'never compress', but (as far as I know) there is not. So you're probably out of luck.
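One thing that may be worth trying: HTTP does define a Cache-Control: no-transform directive, which asks intermediaries not to modify the payload. Whether a given mobile carrier's proxy honors it is an assumption that would have to be tested on the actual network. A minimal sketch of setting it, with the hypothetical class name NoTransformRequest:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;

class NoTransformRequest {
    // Prepares a connection that asks proxies not to alter the response body.
    // Nothing is sent over the network until the caller reads from the connection.
    static URLConnection prepare(String address) {
        try {
            URLConnection connection = new URL(address).openConnection();
            connection.setRequestProperty("Cache-Control", "no-transform");
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned connection is used like ucon in the question: call getInputStream() on it and read the page line by line.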
