Get some, not all, content of HTML in Java - java

The following code prints the entire HTML of a page to the console. How can I show only part of that HTML, for example the section containing the stock prices?
import java.io.InputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.util.Scanner;
public class ShowStock {
public static void main(String[] args) throws IOException {
String urlString;
if(args.length == 1)
urlString = args[0];
else
{
urlString = "https://www.google.com/finance/historical?cid=22144&startdate=Jan+1%2C+2014&enddate=Dec+31%2C+2015&num=30&ei=m-JzVqm2L9fJUaOphsAF";
System.out.println("Reading data from " + urlString );
}
// Open connection
URL u = new URL(urlString);
URLConnection connection = u.openConnection();
// check to make sure the page exists
HttpURLConnection httpConnection = (HttpURLConnection) connection;
int code = httpConnection.getResponseCode();
String message = httpConnection.getResponseMessage();
System.out.println(code + " " + message);
if (code != HttpURLConnection.HTTP_OK)
return;
// Read server response
InputStream instream = connection.getInputStream();
Scanner in = new Scanner(instream);
// display server response to console
while (in.hasNextLine())
{
String input = in.nextLine();
System.out.println(input);
}
}
}

If the page is XHTML (HTML that is also well-formed XML), you can use any of the many XML libraries.
If not, use an HTML parser such as jsoup or HtmlCleaner.
See also:
Which HTML Parser is the best?
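As a rough illustration of the XHTML route, the sketch below uses the JDK's built-in XPath support to pull only the cells of one table out of a well-formed page. The sample markup, the table id "prices", and the class name PriceExtractor are assumptions made up for this example; the real page's structure has to be inspected first, and markup that is not well-formed XML needs an HTML parser instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PriceExtractor {

    // Returns the trimmed text of every <td> inside the table with the given id.
    // The id is hypothetical; use whatever identifies the part of the page you need.
    public static List<String> cellsOfTable(String xhtml, String tableId) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expression = "//table[@id='" + tableId + "']//td";
        try {
            NodeList cells = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xhtml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < cells.getLength(); i++) {
                result.add(cells.item(i).getTextContent().trim());
            }
            return result;
        } catch (XPathExpressionException e) {
            // Thrown for a bad expression or for input that is not well-formed XML.
            throw new IllegalArgumentException("Could not evaluate XPath on input", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample page: only the "prices" table should be reported.
        String xhtml = "<html><body><p>ignored</p>"
                + "<table id=\"prices\"><tr><td>Dec 31, 2015</td><td>758.88</td></tr></table>"
                + "</body></html>";
        System.out.println(cellsOfTable(xhtml, "prices"));
    }
}
```

If the page declares the XHTML namespace on its root element, the unprefixed names in the expression will most likely not match, and the XPath object would need a NamespaceContext.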

Related

Bugzilla-Query using Java - Getting HTML instead of XML

When I type the following URL into my browser, Bugzilla answers with XML:
http://bugzilla.mycompany.local/buglist.cgi?ctype=rdf&bug_status=CONFIRMED&product=MyProduct
I want to process this XML in a Java program. But when I use the exact same URL in my Java program, Bugzilla answers with HTML instead of XML.
This is my program:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
public class Test {
public static void main(String[] args)
throws IOException {
URL url = new URL("http://bugzilla.mycompany.local/buglist.cgi?ctype=rdf&bug_status=CONFIRMED&product=MyProduct");
URLConnection connection = url.openConnection();
final StringBuilder response = new StringBuilder(1024);
try(InputStreamReader isr = new InputStreamReader(connection.getInputStream())) {
try(BufferedReader reader = new BufferedReader(isr)) {
String inputLine = null;
while((inputLine = reader.readLine()) != null) {
response.append(inputLine);
response.append('\n');
}
}
}
System.out.println(response);
}
}
What am I doing wrong?
The resulting HTML is not the result of the query. It's Bugzilla's log-in form. Duh!

java.lang.reflect.InvocationTargetException while using jsoup

I'm trying to parse an HTML document in a web service. According to Google, jsoup seems to be the fastest and easiest HTML parser, so I included it in my project, but I get the exception "Exception: java.lang.reflect.InvocationTargetException Message: java.lang.reflect.InvocationTargetException". I have tried everything, but nothings give results. Please help.
I added jsoup.jar to my project's library classpath.
I am using Eclipse Luna on Windows XP, with Java 1.7 and Apache Tomcat 7.0.
This is my code:
try {
url = new URL("http://consulta.muniguate.com/emetra/despliega.php?tplaca="+tplaca+"&nplaca="+nplaca);
conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("GET");
rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
while ((line = rd.readLine()) != null) {
result += line;
}
Document doc = Jsoup.connect(result).get();
String title= doc.title();
System.out.println(title);
rd.close();
} catch (IOException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
}
}
This is the full code:
package clases;
import java.io.BufferedReader;
import javax.jws.WebService;
import javax.jws.WebMethod;
import javax.jws.WebParam;
import java.io.IOException;
import java.io.InputStreamReader;
import java.lang.reflect.InvocationTargetException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import org.jsoup.Connection.Method;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
@WebService(serviceName = "Transito")
public class Transito {
@WebMethod(operationName = "consultar_saldo")
public String consultar_saldo(String tplaca, int nplaca) throws InvocationTargetException {
String result = "";
try {
Document doc= Jsoup.connect("http://www.muniguate.com/utilities/remisiones.htm?tplaca="+tplaca+"&nplaca="+nplaca).userAgent("Mozilla").get();
result = doc.title();
System.out.println(result);
} catch (Exception e){
e.getCause();
}
return result;
}
}
Jsoup.connect() accepts a URL string, not the response content.
"I have tried everything, but nothings give results."
In that case you are doomed since there is nothing left for us to try.
But let's assume that you didn't try everything. Let's assume that the documentation of Jsoup.connect() is telling the truth: this method only creates a Connection to the resource that should be parsed; it does not parse anything itself. It is the job of the Connection's get() method to connect to the resource, parse it, and return it as a Document.
So instead of the HTML text of the resource, this method needs the information required for the connection, such as the URL.
So instead of manually creating an HttpURLConnection and reading the HTML yourself, pass the string representing the URL to Jsoup.connect() and then call get() to connect to the resource and parse it.
So instead of
URL url = new URL("http://consulta.muniguate.com/emetra/despliega.php?tplaca="+tplaca+"&nplaca="+nplaca);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("GET");
BufferedReader rd = new BufferedReader(new InputStreamReader(
conn.getInputStream()));
String line = null;
String result = "";
while ((line = rd.readLine()) != null) {
result += line;
}
Simply use
Document doc = Jsoup.connect("http://consulta.muniguate.com/emetra/despliega.php?tplaca="+tplaca+"&nplaca="+nplaca).get();
Now you should be able to use
String title = doc.title();
System.out.println(title);

How to use a Base64.java file in my code?

I am trying this
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
public class HttpBasicAuth {
public static void downloadFileWithAuth(String urlStr, String user, String pass, String outFilePath) {
try {
// URL url = new URL ("http://ip:port/download_url");
URL url = new URL(urlStr);
String authStr = user + ":" + pass;
String authEncoded = Base64.encodeBytes(authStr.getBytes());
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("GET");
connection.setDoOutput(true);
connection.setRequestProperty("Authorization", "Basic " + authEncoded);
File file = new File(outFilePath);
InputStream in = (InputStream) connection.getInputStream();
OutputStream out = new BufferedOutputStream(new FileOutputStream(file));
for (int b; (b = in.read()) != -1;) {
out.write(b);
}
out.close();
in.close();
}
catch (Exception e) {
e.printStackTrace();
}
}
}
Everything else is fine, but the compiler reports the error "Cannot find symbol error Base64Encoder".
I downloaded the Base64.java file,
but I don't know how to use this file in my project to get rid of the error.
Can you please tell me how to use the Base64.java file to remove the error?
Thanks in anticipation.
You could just use the Base64 encode/decode capability that is present in the JDK itself. The package javax.xml.bind includes a class DatatypeConverter that provides methods to print/parse to various forms including
static byte[] parseBase64Binary(String lexicalXSDBase64Binary)
static String printBase64Binary(byte[] val)
Just import javax.xml.bind.DatatypeConverter and use the provided methods.
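On Java 8 or later, java.util.Base64 is another built-in option, and it is worth knowing because javax.xml.bind is no longer part of the JDK from Java 11 onward. The following is a minimal sketch of building the Basic authentication header with it; the class name BasicAuthHeader and the sample credentials are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeader {

    // Builds the value for an HTTP "Authorization" header using Basic authentication.
    public static String of(String user, String pass) {
        String credentials = user + ":" + pass;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Sample credentials only; prints "Basic dXNlcjpwYXNz".
        System.out.println(of("user", "pass"));
    }
}
```

In the question's code this would replace the Base64.encodeBytes call, for example connection.setRequestProperty("Authorization", BasicAuthHeader.of(user, pass)).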
You need to import Base64 into your code. The exact import depends on the package in which your Base64 source file lives.
Apache Commons Codec has a solid implementation of Base64.
example:
import org.apache.commons.codec.binary.Base64;

How can I make a SOAP request and read the response in Java?

How can I call a SOAP web service and print the data it returns?
Currently I am using the following code:
package com.appulento.pack;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.URL;
public class SimpleHTTPRequest
{
public static void main(String[] args) throws Exception {
final String url =
"http://**********:8000/sap/bc/srt/rfc/sap/zmaterials_details/" +
"800/zmaterials_details/zmaterials_details_bind",
soapAction ="urn:sap-com:document:sap:soap:functions:mc-style/ZMATERIALS_DETAILS",
envelope1="<?xml version=\"1.0\" encoding=\"utf-8\"?>" +
"<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\"" +
" xmlns:urn=\"urn:sap-com:document:sap:soap:functions:mc-style\">" +
"<soapenv:Header/>"+
"<soapenv:Body>"+
"<urn:ZMATERIALS_DETAILS>"+
"<Language>D</Language>"+
"<MaterialGroup>00208</MaterialGroup>"+
"</urn:ZMATERIALS_DETAILS>"+
"</soapenv:Body>"+
"</soapenv:Envelope>" ;
HttpURLConnection connection = null;
try {
final URL serverAddress = new URL("http://*********:8000/sap/bc/srt/wsdl/"+
"srvc_14DAE9C8D79F1EE196F1FC6C6518A345/wsdl11/allinone/ws_policy/" +
"document?sap-client=800&sap-user=************&sap-password=****");
connection = (HttpURLConnection)serverAddress.openConnection();
connection.setRequestProperty("SOAPAction", soapAction);
connection.setRequestMethod("POST");
connection.setDoOutput(true);
final OutputStreamWriter writer = new OutputStreamWriter(connection.getOutputStream());
writer.append(envelope1);
writer.close();
final BufferedReader rd =
new BufferedReader(new InputStreamReader(connection.getInputStream()));
String line;
while ((line = rd.readLine()) != null) System.out.println(line);
} finally { connection.disconnect(); }
}
}
I want to send XML as the request and display the response as XML too.
It is possible to send the HTTP request with HttpURLConnection and parse the response yourself, as you do.
But this has already been written by other people: use the wsimport tool with the -keep option. It will generate the Java artifacts for sending the request over SOAP for you.
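If you do stay with the manual route, the response can be read with the JDK's DOM parser instead of being printed line by line. The sketch below is only an illustration under assumptions: the class name SoapBodyReader, the response element name, and the field names in the sample are invented here, since the real response structure is defined by the service's WSDL.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SoapBodyReader {

    // Namespace of the SOAP 1.1 envelope, the same one used in the request.
    private static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    // Returns element name -> text for each child of the first element inside <Body>.
    public static Map<String, String> payloadFields(String soapXml) {
        Map<String, String> fields = new LinkedHashMap<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(soapXml)));
            NodeList bodies = doc.getElementsByTagNameNS(SOAP_NS, "Body");
            if (bodies.getLength() == 0) {
                return fields;
            }
            Element payload = firstElementChild(bodies.item(0));
            if (payload == null) {
                return fields;
            }
            for (Node n = payload.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) {
                    String name = n.getLocalName() != null ? n.getLocalName() : n.getNodeName();
                    fields.put(name, n.getTextContent().trim());
                }
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Response is not a readable SOAP message", e);
        }
    }

    private static Element firstElementChild(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical response; the real element names come from the WSDL.
        String response = "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_NS + "\">"
                + "<soapenv:Body><ExampleResponse>"
                + "<Material>M-01</Material><Description>Pump</Description>"
                + "</ExampleResponse></soapenv:Body></soapenv:Envelope>";
        System.out.println(payloadFields(response));
    }
}
```

Note also that a SOAP 1.1 request sent by hand normally carries a Content-Type header of text/xml, which the posted code does not set.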

How can my java applet use my PHP authenticated session?

I have set up a login area on my PHP server. The members.php file requires login; after I log in, a session is created, and it lasts for a while. While the session is still valid, I want the Java applet to be able to access the members.php page.
I have embedded the Java applet into the members.php page. It makes a HttpURLConnection request, however when I get the response I find that it was redirected by the PHP server to the login page.
How do I set this up correctly?
Here is the Java Applet code:
import java.applet.Applet;
import java.awt.Dimension;
import java.io.BufferedReader;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.util.List;
import java.util.Map;
import javax.swing.JFrame;
import javax.swing.JTextArea;
public class phpConnectApplet extends Applet {
private static final long serialVersionUID = 1L;
public void init() {
URL url = null;
try {
url = new URL("http://www.example.com/members.php");
URLConnection urlConn = url.openConnection();
HttpURLConnection httpConn = (HttpURLConnection) urlConn;
httpConn.setDoOutput(true);
httpConn.setDoInput(true);
httpConn.setUseCaches(false);
httpConn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
DataOutputStream output = new DataOutputStream(httpConn.getOutputStream());
String content = "action=blah"; //just to test the PHP file
output.writeBytes(content);
output.flush();
output.close();
DataInputStream in = new DataInputStream(urlConn.getInputStream());
BufferedReader input = new BufferedReader(new InputStreamReader(in));
String str, result = "";
while ((str = input.readLine()) != null) {
result = result + str + "\n";
}
input.close();
Map<String, List<String>> headers = httpConn.getHeaderFields();
List<String> values = headers.get("Set-Cookie");
String cookieValue = null;
for (String v:values) {
if (cookieValue == null)
cookieValue = v;
else
cookieValue = cookieValue + ";" + v;
}
System.out.println(cookieValue);
JFrame f = new JFrame("App Title");
f.add(new JTextArea(result));
f.setMaximumSize(new Dimension(400,300));
f.pack();
f.setLocationRelativeTo(null);
f.setVisible(true);
} catch (MalformedURLException me) {
me.printStackTrace();
} catch(IOException ie) {
ie.printStackTrace();
} catch(Exception e) {
e.printStackTrace();
}
}
}
The output of this is a JFrame with one JTextArea, which contains the HTML of the login page.
You need to capture the value of your PHPSESSID cookie (or whatever you are using for the cookie name) and add it to the request in the HttpURLConnection. This MAY come through as a system parameter, but if not, you can embed the session ID as an applet attribute on the page launching the applet. (I haven't experimented with this part specifically)
Here's a tutorial that explains how to send cookies in the URLConnection class: http://www.hccp.org/java-net-cookie-how-to.html
Specifically see the section titled Setting a cookie value in a request.
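A minimal sketch of the request side might look like this, assuming the session cookie is called PHPSESSID and that the session ID has already been obtained somehow. The class name SessionRequest and both method names are made up for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class SessionRequest {

    // Formats one cookie the way it appears in a "Cookie" request header.
    public static String cookieHeader(String name, String value) {
        return name + "=" + value;
    }

    // Prepares (but does not yet send) a request that carries the PHP session cookie.
    // "PHPSESSID" is PHP's default session cookie name; adjust it if the server uses another.
    public static HttpURLConnection open(URL url, String sessionId) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Cookie", cookieHeader("PHPSESSID", sessionId));
        return conn;
    }
}
```

Inside the applet the ID could come from getParameter("sessionId"), assuming members.php writes a matching param element into the applet tag, for example with the value of PHP's session_id(). The parameter name is an assumption of this sketch.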
