Java program to read HTML from a website address - java

Hi, I'm working on this program but keep getting null as the answer. Any help would be appreciated. I can't use any external methods for this and have to declare a static method. Here is my code:
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Scanner;
public class Links {
private static ArrayList<String> links;
public static ArrayList<String> getHTMLLinksFromPage(String location) {
String webpage = location;
for(int i = 0; i<webpage.length()-6; i++) {
if(webpage.charAt(i) == 'h' && webpage.charAt(i+1) == 'r') {
for(int k = i; k<webpage.length();k++ ){
if(webpage.charAt(k) == '>'){
String link = webpage.substring(i+6,k-1);
links.add(link);
// Break the loop
k = webpage.length();
}
}
}
}
return links;
}
public static void main(String[] args) throws IOException{
String address = "http://horstmann.com/index.html.";
URL pageLocation = new URL(address);
Scanner in = new Scanner(pageLocation.openStream());
String webpage = in.next();
ArrayList<String> x = getHTMLLinksFromPage(webpage);
System.out.println(x);
}
}

There are a few issues with your code:
First, you did not initialize your ArrayList called links.
Second, there is an extra . at the end of your URL, which causes a FileNotFoundException.
Third, you are reading the web page incorrectly. You call scanner.next() only once, which reads just the first token of the web page, in this case <?xml. To read the whole web page you need to keep calling scanner.next() until there are no more tokens.
Also, to break out of a for loop you should use the break statement.
Instead of using a Scanner, though, I believe that using an InputStreamReader and a BufferedReader would be faster.
So your code should instead look like this:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
public class Links {
public static void main(String[] args) throws IOException {
URL pageLocation = new URL("http://horstmann.com/index.html");
HttpURLConnection urlConnection = (HttpURLConnection) pageLocation.openConnection();
InputStreamReader inputStreamReader = new InputStreamReader(urlConnection.getInputStream());
BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
StringBuilder response = new StringBuilder();
String line;
while((line = bufferedReader.readLine()) != null) {
response.append(line);
}
bufferedReader.close();
System.out.println(getHTMLLinksFromPage(response.toString()));
}
private static List<String> getHTMLLinksFromPage(final String webPage) {
List<String> links = new ArrayList<>();
for(int i = 0; i < webPage.length()-6; i++) {
if(webPage.charAt(i) == 'h' && webPage.charAt(i+1) == 'r') {
for(int j = i; j < webPage.length(); j++ ){
if(webPage.charAt(j) == '>'){
String link = webPage.substring(i+6,j-1);
links.add(link);
break;
}
}
}
}
return links;
}
}
Output:
[styles.css" rel="stylesheet" type="text/css", mailto:cay@horstmann.com, http://horstmann.com/caypubkey.txt, http://www.uni-kiel.de/, an-Albrechts-Universität</, http://www.kiel.de/, http://www.syr.edu/, http://www.umich.edu/, http://www.mathcs.sjsu.edu/, unblog/index.html, https://plus.google.com/117406678785944293188/posts, http://www.sjsu.edu/people/cay.horstmann, http://horstmann.com/quotes.html, http://horstmann.com/resume.html, http://www.family-horstmann.net/, http://horstmann.com/javaimpatient/index.html, http://horstmann.com/java8/index.html, http://horstmann.com/scala/index.html, http://horstmann.com/python4everyone.html, http://horstmann.com/bigjava.html, http://horstmann.com/bigjava.html, http://horstmann.com/bjlo.html, http://horstmann.com/bjlo.html, http://www.wiley.com/college/sc/horstmann/, http://horstmann.com/bigcpp.html, http://horstmann.com/cpp4everyone/index.html, http://horstmann.com/corejava.html, http://corejsf.com/, http://horstmann.com/design_and_patterns.html, http://horstmann.com/PracticalOO.html, http://horstmann.com/mood.html, http://horstmann.com/mcpp.html, http://codecheck.it, http://horstmann.com/violet/index.html, http://horstmann.com/safestl.html, http://www.stlport.org/, http://horstmann.com/cpp/pitfalls.html, http://horstmann.com/cpp/iostreams.html, http://code.google.com/p/twisty/, http://frotz.sourceforge.net/, http://www.vaxdungeon.com/Infocom/, http://horstmann.com/applets/RoadApplet/RoadApplet.html, http://horstmann.com/applets/Retire/Retire.html, http://horstmann.com/corejava.html, http://horstmann.com/applets/WeatherApplet/WeatherApplet.html, http://horstmann.com/corejava.html, http://horstmann.com/applets/Intersection/Intersection.html]
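The output above shows two side effects of matching any 'h' followed by 'r': the first entry still carries the trailing rel and type attributes, and an-Albrechts-Universität</ is ordinary page text that happened to contain "hr". A minimal sketch of a stricter scan follows. It assumes every link of interest is written as href="..." with double quotes; the class and method names are made up for illustration, and this is still plain string searching rather than a real HTML parser.

```java
import java.util.ArrayList;
import java.util.List;

public class HrefExtractor {
    // Collects the value of every href="..." attribute found in the given HTML text.
    public static List<String> extractHrefs(String html) {
        List<String> links = new ArrayList<>();
        String marker = "href=\"";
        int from = 0;
        while (true) {
            int start = html.indexOf(marker, from);
            if (start < 0) {
                break; // no more href attributes
            }
            int valueStart = start + marker.length();
            int valueEnd = html.indexOf('"', valueStart);
            if (valueEnd < 0) {
                break; // unterminated attribute, stop scanning
            }
            links.add(html.substring(valueStart, valueEnd));
            from = valueEnd + 1;
        }
        return links;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not the real page.
        String sample = "<link href=\"styles.css\" rel=\"stylesheet\"/>"
                + "<p>Christian-Albrechts</p>"
                + "<a href=\"http://horstmann.com/quotes.html\">quotes</a>";
        System.out.println(extractHrefs(sample));
    }
}
```

Searching for the whole marker and stopping at the closing quote keeps each entry down to the attribute value itself.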

Related

StringTokenizer doesn't read the first line of the file.txt

I'm trying to take every single word from a text file and put them into an ArrayList, but the StringTokenizer doesn't read the first line of the text file. What's wrong?
public class BufferReader {
public static void main(String[] args) throws FileNotFoundException, IOException {
BufferedReader reader = new BufferedReader(new FileReader("C://Java-projects//EsameJava//prova.txt"));
String line = reader.readLine();
List<String> str = new ArrayList<>();
while ((line = reader.readLine()) != null) {
StringTokenizer token = new StringTokenizer(line);
while (token.hasMoreTokens()) {
str.add(token.nextToken());
}
}
System.out.println(str);
The only solution I found is to start the text file from the second line, but that's not what I want.
This is how you could marry the (very) old and the new(er) to provide a collection of words:
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;
import java.nio.file.Files;
import java.nio.file.Paths;
public class WordCollector {
public static void main(String[] args) {
try {
List<String> words = WordCollector.getWords(Files.lines(Paths.get(args[0])));
System.out.println(words);
} catch (Throwable t) {
t.printStackTrace();
}
}
public static List<String> getWords(Stream<String> lines) {
List<String> result = new ArrayList<>();
BreakIterator boundary = BreakIterator.getWordInstance();
lines.forEach(line -> {
boundary.setText(line);
int start = boundary.first();
for (int end = boundary.next(); end != BreakIterator.DONE; start = end, end = boundary.next()) {
String candidate = line.substring(start, end).replaceAll("\\p{Punct}", "").trim();
if (candidate.length() > 0) {
result.add(candidate);
}
}
});
return result;
}
}
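As for why the first line goes missing in the question's code: reader.readLine() is called once before the loop, and the loop condition then calls it again, so the first line is read and overwritten before it is ever tokenized. A minimal sketch that keeps StringTokenizer and simply drops the extra call is below. The class name TokenizeLines and the StringReader sample are assumptions for illustration; a FileReader could be passed in the same way.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizeLines {
    // Reads every line from the source and splits each one into whitespace-separated tokens.
    public static List<String> readWords(Reader source) throws IOException {
        List<String> words = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line; // note: no readLine() call before the loop
            while ((line = reader.readLine()) != null) {
                StringTokenizer tokens = new StringTokenizer(line);
                while (tokens.hasMoreTokens()) {
                    words.add(tokens.nextToken());
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for the text file.
        System.out.println(readWords(new StringReader("first line\nsecond line")));
    }
}
```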

Finding a specific sequence of characters in a string

I'm trying to write code to get the latitude and longitude of a city entered by the user.
I've written code that fetches a web page and stores its HTML in a string.
I now want to search the string (the HTML of the website) for the values.
I've looked into APIs and HTML parsers, but they are all too complicated for me (I'm still in school, just starting out), so please don't recommend those unless it's absolutely impossible to do without them.
Code:
import java.net.*;
import java.io.*;
import java.util.Scanner;
import static java.lang.System.*;
class websearch {
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
//gets the city
out.println("enter city, add plus between multiple words");
String term = sc.nextLine();
try {URL url = new URL("http://www.geonames.org/search.html?q=" + term + "&country=");
URLConnection ucl = url.openConnection();
InputStream stream = ucl.getInputStream();
int i;
//the string in which the html code will be stored
String code = " ";
while ((i=stream.read())!= -1) {
code += Character.toString((char)i);
}
//printing the html, only for testing
System.out.print(code);
} catch(Exception e) {
System.out.println("error");
}
}
}
This code prints a string too large to be pasted here, but the values I want to find look like this:
<td nowrap="">N 40° 42' 51''</td>
<td nowrap="">W 74° 0' 21''</td>
How could I find this sequence of characters, and then store only the latitude and longitude in a variable?
This tutorial should be of some use to you. It goes over how to get the geocode data without actually using the Google Maps API. If you follow what it says, you should have a fairly easy time implementing it.
The end result is that you will enter the street address of the location you want, and it will return the latitude and longitude in searchable variables.
Primary working class
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;
import org.apache.log4j.Logger;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.JSONValue;
import org.json.simple.parser.JSONParser;
public class OpenStreetMapUtils {
public final static Logger log = Logger.getLogger("OpenStreeMapUtils");
private static OpenStreetMapUtils instance = null;
private JSONParser jsonParser;
public OpenStreetMapUtils() {
jsonParser = new JSONParser();
}
public static OpenStreetMapUtils getInstance() {
if (instance == null) {
instance = new OpenStreetMapUtils();
}
return instance;
}
private String getRequest(String url) throws Exception {
final URL obj = new URL(url);
final HttpURLConnection con = (HttpURLConnection) obj.openConnection();
con.setRequestMethod("GET");
if (con.getResponseCode() != 200) {
return null;
}
BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
String inputLine;
StringBuffer response = new StringBuffer();
while ((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
in.close();
return response.toString();
}
public Map<String, Double> getCoordinates(String address) {
Map<String, Double> res;
StringBuffer query;
String[] split = address.split(" ");
String queryResult = null;
query = new StringBuffer();
res = new HashMap<String, Double>();
query.append("http://nominatim.openstreetmap.org/search?q=");
if (split.length == 0) {
return null;
}
for (int i = 0; i < split.length; i++) {
query.append(split[i]);
if (i < (split.length - 1)) {
query.append("+");
}
}
query.append("&format=json&addressdetails=1");
log.debug("Query:" + query);
try {
queryResult = getRequest(query.toString());
} catch (Exception e) {
log.error("Error when trying to get data with the following query " + query);
}
if (queryResult == null) {
return null;
}
Object obj = JSONValue.parse(queryResult);
log.debug("obj=" + obj);
if (obj instanceof JSONArray) {
JSONArray array = (JSONArray) obj;
if (array.size() > 0) {
JSONObject jsonObject = (JSONObject) array.get(0);
String lon = (String) jsonObject.get("lon");
String lat = (String) jsonObject.get("lat");
log.debug("lon=" + lon);
log.debug("lat=" + lat);
res.put("lon", Double.parseDouble(lon));
res.put("lat", Double.parseDouble(lat));
}
}
return res;
}
}
How to call the above working class:
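import java.util.Map;
public class GetCoordinates {
static String address = "The White House, Washington DC";
public static void main(String[] args) {
// getCoordinates returns null when the request fails, so check before using the map
Map<String, Double> coords = OpenStreetMapUtils.getInstance().getCoordinates(address);
if (coords == null) {
System.out.println("lookup failed");
return;
}
System.out.println("latitude :" + coords.get("lat"));
System.out.println("longitude:" + coords.get("lon"));
}
}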
Using XPath is an easy way.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(<uri_as_string>);
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile(<xpath_expression>);
As for <xpath_expression>, try something like the following:
/xxx/td[@nowrap='']/text()
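Since the question asks for plain string searching, here is a minimal sketch using only String.indexOf and substring. It assumes the page keeps writing the coordinate cells exactly as <td nowrap="">...</td>, as in the two lines quoted in the question, and that the first such cell is the latitude and the second the longitude. The class name CoordinateFinder is made up for illustration, and the approach will break if the site changes its markup.

```java
import java.util.ArrayList;
import java.util.List;

public class CoordinateFinder {
    // Returns the text of every <td nowrap="">...</td> cell, in page order.
    public static List<String> extractCells(String html) {
        String open = "<td nowrap=\"\">";
        String close = "</td>";
        List<String> cells = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = html.indexOf(open, from);
            if (start < 0) {
                break; // no further matching cells
            }
            int textStart = start + open.length();
            int end = html.indexOf(close, textStart);
            if (end < 0) {
                break; // cell never closed, stop scanning
            }
            cells.add(html.substring(textStart, end).trim());
            from = end + close.length();
        }
        return cells;
    }

    public static void main(String[] args) {
        // Sample mirroring the two lines quoted in the question (\u00B0 is the degree sign).
        String code = "<td nowrap=\"\">N 40\u00B0 42' 51''</td>\n"
                + "<td nowrap=\"\">W 74\u00B0 0' 21''</td>";
        List<String> cells = extractCells(code);
        if (cells.size() >= 2) {
            String latitude = cells.get(0);  // assumed to come first on the page
            String longitude = cells.get(1); // assumed to come second
            System.out.println(latitude + ", " + longitude);
        }
    }
}
```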

Using Arraylist in different classes

The code below finds words in a document, as specified by the word input. It counts the number of times the words occur in each sentence, then stores that count in the ArrayLists at the bottom, labelled A for cone and B for ctwo.
I want to use the ArrayLists in another class but can't seem to find a way to do it.
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class exc {
public exc() {
}
public static void main(String[] args) throws Exception {
cone aa = new cone();
ctwo bb = new ctwo();
// after this I'm stuck
}
}
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Arrays;
import java.util.List;
public class cone {
public void cone() throws Exception {
BufferedReader e = new BufferedReader(new FileReader("words to be read.txt"));
String o;
while((o = e.readLine()) != null){
String[] sentences = o.split("\\b[.!?]\\s+");
//System.out.println(o);
String [] h = sentences;
{
BufferedReader t = new BufferedReader(new FileReader("Text to be scan.txt"));
String g;
while((g = t.readLine()) != null){
String[] set=g.split(" ");
List<String> list = Arrays.asList(set);
// System.out.println(Arrays.toString(set));
//System.out.println(o);
int sentenceNumb=1;
for (String sentence: h) {
int counter=0;
String[] words = sentence.replace(".", "").split(" ");
for(String word: words) {
if (list.contains(word)) {
counter++;
}
}
List<Integer> A = Arrays.asList(counter++);
}
}
}
}
}
}
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Arrays;
import java.util.List;
public class ctwo {
public void ctwo() throws Exception {
BufferedReader e = new BufferedReader(new FileReader("words to be read.txt"));
String o;
while((o = e.readLine()) != null){
String[] sentences = o.split("\\b[.!?]\\s+");
//System.out.println(o);
String [] h = sentences;
{
BufferedReader t = new BufferedReader(new FileReader("Text to be scan.txt"));
String g;
while((g = t.readLine()) != null){
String[] set=g.split(" ");
List<String> list = Arrays.asList(set);
// System.out.println(Arrays.toString(set));
//System.out.println(o);
int sentenceNumb=1;
for (String sentence: h) {
int counter=0;
String[] words = sentence.replace(".", "").split(" ");
for(String word: words) {
if (list.contains(word)) {
counter++;
}
}
List<Integer> B= Arrays.asList(counter++);
}
}
}
}
}
}
Best approach: keep both ArrayLists in main() and pass them as parameters to the functions (from any class) that need them.
Not so good approach: store the ArrayLists as package-private static class variables in the cone and ctwo classes. You can then access them as cone.A and ctwo.B.
Alternatively, pass the same ArrayList to the constructors of both classes.
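A minimal sketch of the first approach follows. Two things in the question's code get in the way of any of these options: public void cone() has a return type, so Java treats it as an ordinary method rather than a constructor and new cone() never runs it, and each Arrays.asList(counter++) creates a fresh one-element list that is local to the loop. The names SharedCounts, SentenceCounter, and total below are hypothetical stand-ins, and the file reading is replaced by in-memory data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SharedCounts {
    // Hypothetical stand-in for cone/ctwo: keeps its results in a field and exposes them.
    public static class SentenceCounter {
        private final List<Integer> counts = new ArrayList<>();

        public void count(String[] sentences, List<String> wordsToFind) {
            for (String sentence : sentences) {
                int counter = 0;
                for (String word : sentence.replace(".", "").split(" ")) {
                    if (wordsToFind.contains(word)) {
                        counter++;
                    }
                }
                counts.add(counter); // one entry per sentence
            }
        }

        public List<Integer> getCounts() {
            return counts;
        }
    }

    // Hypothetical consumer standing in for "another class": it only needs the list as a parameter.
    public static int total(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        SentenceCounter a = new SentenceCounter();
        a.count(new String[] {"the cat sat.", "a dog ran"}, Arrays.asList("cat", "dog", "ran"));
        System.out.println(a.getCounts() + " total=" + total(a.getCounts()));
    }
}
```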
Your program seems weird.
I would suggest reading the words and adding each distinct word to a HashMap, with the word as the key and its count as the value.
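The HashMap suggestion could be sketched roughly as follows. The class name WordCounts is made up, and lower-casing the text and splitting on non-word characters are assumptions about what should count as the same word.

```java
import java.util.HashMap;
import java.util.Map;

public class WordCounts {
    // Maps each distinct word to the number of times it occurs in the text.
    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) { // split can yield an empty first element
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("The cat saw the dog. The dog ran!");
        System.out.println("the=" + counts.get("the") + ", dog=" + counts.get("dog"));
    }
}
```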

I'm trying to read a CSV file and display it in Excel as a CSV file

I'm getting an error that states:
Error: Main method not found in class Main, please define the main method as:
public static void main(String[] args)
I solved the previous issue, but I'm now getting this error: C:\Program Files\Java\jre1.8.0_60\bin\javaw.exe (Nov 8, 2015, 7:41:12 PM)
when attempting to run this code:
package filtermovingaverage;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
public class Main {
//public static void main(String[] args){
private static double freqS = 100;
static ArrayList<Double> sec = null;
private static double[] pressure = new double[20481];
void func() throws IOException
{
BufferedReader read = new BufferedReader(new FileReader(new File("C:\\Users\\KwakuK\\Downloads\\smith2.csv")));
String currentLine = new String();
currentLine = read.readLine();
int i = 0;
//make some computation
while((currentLine = read.readLine()) != null)
{
String[] numbers = currentLine.split(","); // split the string into sub strings
if(numbers.length >= 3)
{
System.out.println("currentLine: " + " " + currentLine);
pressure[i++] = Double.parseDouble(numbers[2]); // when you do the 2, it's the third column which is the pressure
}
}
}
public static void setupFirstPlot() throws FileNotFoundException{
sec = new ArrayList<Double>();
double ws = 1/freqS;
double n = (pressure.length)*ws;
for(double i = 0; i < n; i = i + ws){
sec.add(i);
}
PrintWriter pw = new PrintWriter(new File("plot13.csv"));
for(int i = 0; i < pressure.length; i++){
pw.write(sec.get(i)+","+pressure[i]+"\n");
}
pw.close();
}
public static void main(String[] args) throws FileNotFoundException{
setupFirstPlot();
System.out.println();
}
}
Try adding package filtermovingaverage; to the top of this java file.

How to create a regex that verifies the existence of a number in an array in Java

I want to verify whether a number, for example 701234567, is an element of my array in Java. To do this, my code checks whether my number, which begins with 7 and has 9 digits, is an element of my array from "numbercall.txt", which has 5 elements. This is my text file:
numbercall.txt [ 702345678, 714326578, 701234567, 791234567,751234567]
This is my code:
import java.io.BufferedReader;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class TestNumberLt {
static String[] arr= null;
String filename = "fichiers/numbercall.txt";
static String a = null ;
static List<String> list = new ArrayList<String>();
public static void main(String [] args) throws IOException{
FileInputStream fstream_school = new FileInputStream(filename);
DataInputStream data_input = new DataInputStream(fstream_school);
BufferedReader buffer = new BufferedReader(new InputStreamReader(data_input));
String str_line;
while ((str_line = buffer.readLine()) != null)
{
str_line = str_line.trim();
if ((str_line.length()!=0))
{
list.add(str_line);
}
}
int b = 773214576;
//convert the arraylist to a array
arr = (String[])list.toArray(new String[list.size()]);
Pattern p = Pattern.compile("^7[0|6|7][0-9]{7}$");
Matcher m ;
//a loop for verify if a number exist in this array
for (int j = 0; j < list.size();)
{
System.out.print(" "+list.get(j)+ " ");
m = p.matcher(list.get(j));
/*while(m.find())
System.out.println(m.group());*/
if(list.get(j).equals(b))
{
System.out.println("Trouvé "+list.get(j));
break;
}
else
{
System.out.println("ce numéro ("+b+") n'existe pas!");
}
break;
}
}
}
Do it simply like this
String str_line= "702345678,714326578,701234567,791234567,751234567";
String[] strArray = str_line.split(",");
String key = "702345678";
for(String v:strArray) {
if(v.equals(key)) {
System.out.println("found");
}
}
I'm not really sure what you want, but if you just need the index of b in your array, just do this:
public static void main(String [] args) throws IOException{
...
int b = 773214576;
int tmp = list.indexOf(b+"");
if(tmp!=-1) {
System.out.println("Trouvé "+ b + " à l'index " + tmp);
} else {
System.out.println("Ce numéro ("+b+") n'existe pas!");
}
...
}
Another answer, using Guava:
(In this case there really is no need; you could simply use the split() method of String, but I like Guava's readability and return types.)
package stackoverflow;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import com.google.common.base.Splitter;
public class RegexExample {
String filename = "numbercall.txt";
public boolean isInList(String numberToCheck) throws IOException {
BufferedReader file = loadFile();
for (String number : extractNumberListFrom(file)) {
if (number.trim().equals(numberToCheck)) {
return true;
}
}
return false;
}
private Iterable<String> extractNumberListFrom(BufferedReader buffer) throws IOException {
StringBuilder numberList = new StringBuilder();
String line;
while ((line = buffer.readLine()) != null) {
numberList.append(line);
}
return Splitter.on(",").split(numberList.toString());
}
private BufferedReader loadFile() {
InputStream fstream_school = RegexExample.class.getResourceAsStream(filename);
BufferedReader buffer = new BufferedReader(new InputStreamReader(fstream_school));
return buffer;
}
}
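For completeness, here is a minimal sketch that combines the two checks described in the question using only the standard library: a regex for "starts with 7 and has 9 digits", and a membership test against the comma-separated numbers. Two details in the question's code are worth knowing: inside a character class such as [0|6|7] the | is a literal character, not an alternation, and list.get(j).equals(b) compares a String with an int, which is never true. The class name NumberLookup and the pattern 7[0-9]{8} are assumptions based on the question's description, not on its original pattern.

```java
import java.util.regex.Pattern;

public class NumberLookup {
    // Nine digits in total, the first of which is 7 (assumed from the question's wording).
    private static final Pattern NINE_DIGITS_FROM_7 = Pattern.compile("7[0-9]{8}");

    public static boolean isValid(String number) {
        return NINE_DIGITS_FROM_7.matcher(number).matches();
    }

    // Checks whether the number appears in a comma-separated list of numbers.
    public static boolean contains(String commaSeparated, String number) {
        for (String part : commaSeparated.split(",")) {
            if (part.trim().equals(number)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String content = "702345678, 714326578, 701234567, 791234567,751234567";
        String wanted = "701234567";
        if (isValid(wanted) && contains(content, wanted)) {
            System.out.println("Found " + wanted);
        } else {
            System.out.println("Number (" + wanted + ") does not exist");
        }
    }
}
```

Comparing strings with strings avoids the String-versus-int mismatch, and matches() already anchors the pattern to the whole input.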
