I am new to Java and am trying to build a scraper in Java that can do the following things:
Read data from a CSV file.
Use the URIs in that file to scrape the complete app info from the Google Play Store.
Export the scraped data and other metadata from the CSV file into an XML file.
Can anyone guide me on how to proceed from here?
So far I have written the following three classes.
main.java (this contains the main method, from which I call the other two classes)
import java.io.IOException;
public class main {
public static void main(String[] args) throws IOException {
ReadCVS obj = new ReadCVS();
obj.run();
AppInfo obj1 = new AppInfo();
obj1.readFile();
}
}
ReadCVS.java (this file reads the CSV file and writes the output to a text file)
import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintStream;
public class ReadCVS {
public void run() {
// Replace the file path to the appropriate path.
String csvFile = "\\Desktop\\https---play_google_com-store-apps-details-id=.csv";
BufferedReader br = null;
String line = "";
String cvsSplitBy = ";";
try {
File file = new File("\\Desktop\\output.txt");
FileOutputStream fos = new FileOutputStream(file);
PrintStream ps = new PrintStream(fos);
System.setOut(ps);
br = new BufferedReader(new FileReader(csvFile));
while ((line = br.readLine()) != null) {
// use semicolon as separator (see cvsSplitBy above)
String[] country = line.split(cvsSplitBy);
System.out.println("URL = " + country[0] + " "
);
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
System.out.println("Done");
}
}
AppInfo.java (this file reads the saved output.txt and tries to print it to the console, but it is currently not working)
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
public class AppInfo {
public void readFile(){
String fileName = "\\Desktop\\output.txt";
//read file into stream, try-with-resources
try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
String line;
while ((line = br.readLine()) != null) {
System.out.println(line);
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
The problem is that whenever I run this code, the program hangs and never terminates.
Can anyone help me with this problem?
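A likely cause of the hang is that System.setOut(ps) in ReadCVS.run() redirects every later System.out.println call into output.txt. AppInfo.readFile() then appends each line it reads to the very file it is reading, so it never reaches the end. The sketch below writes through its own PrintWriter and leaves System.out alone. The class name UrlExtractor and the method extractFirstColumn are hypothetical; the ";" separator is taken from the question, and UTF-8 input is an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: copies the first CSV column into a text file
// without redirecting System.out.
final class UrlExtractor {

    static List<String> extractFirstColumn(Path csv, Path out, String separator) throws IOException {
        List<String> urls = new ArrayList<>();
        // try-with-resources closes both files even if an exception is thrown
        try (BufferedReader br = Files.newBufferedReader(csv, StandardCharsets.UTF_8);
             PrintWriter pw = new PrintWriter(Files.newBufferedWriter(out, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                // split() takes a regex; a plain ";" needs no escaping
                String first = line.split(separator, -1)[0];
                urls.add(first);
                pw.println(first);
            }
        }
        return urls;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder paths; pass real ones on the command line.
        Path csv = Paths.get(args.length > 0 ? args[0] : "input.csv");
        Path out = Paths.get(args.length > 1 ? args[1] : "output.txt");
        if (!Files.isRegularFile(csv)) {
            System.err.println("No such file: " + csv);
            return;
        }
        // System.out still points at the console here.
        extractFirstColumn(csv, out, ";").forEach(System.out::println);
    }
}
```

Because the method also returns the values as a list, the later scraping step can use them directly instead of re-reading output.txt.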
Related
I can't wrap my head around why I get zero output... The code looks correct to me, and it compiles with no problem (except for the lack of output). I have tried with absolute path. The text file is stored in the same folder as the class. Am I missing something obvious?
public class File {
public static void main(String[] args) throws FileNotFoundException {
String filename = "./inputD2.txt";
readFile(filename);
System.out.println( readFile(filename));
}
private static List<String> readFile(String filename) {
List<String> records = new ArrayList<>();
try {
BufferedReader reader = new BufferedReader(new FileReader(filename));
String line;
while ((line = reader.readLine()) != null) {
records.add(line);
}
reader.close();
return records;
}
catch (Exception e) {
System.err.format("Exception occurred trying to read '%s'.", filename);
e.printStackTrace();
return null;
}
}
}
package com.test;
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
public class FileReaderTest {
public static void main(String[] args) {
String filename = "F:\\Sixth_workspace\\Sampleproject\\src\\main\\resources\\try.txt";
System.out.println("Reading from the text file" + " " + readFile(filename));
}
private static List<String> readFile(String filename) {
List<String> records;
try {
records = new ArrayList<String>();
BufferedReader reader = new BufferedReader(new FileReader(filename));
String line;
while ((line = reader.readLine()) != null) {
records.add(line);
}
reader.close();
return records;
} catch (Exception e) {
System.err.format("Exception occurred trying to read '%s'.%n", filename);
e.printStackTrace();
return null;
}
}
}
I modified your code and got the desired output. Use the full path of the text file; here
F:\\Sixth_workspace\\Sampleproject\\src\\main\\resources\\try.txt
is my full path (written with doubled backslashes, as in the Java string literal).
Changes:
Changed the class name
Gave the full path of the text file
Used Java 1.8 (generics require Java 1.5 or later)
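When a relative path such as ./inputD2.txt seems to yield nothing, it can help to print where the JVM actually resolves it. Relative paths are resolved against the working directory, which in an IDE is often the project root rather than the folder holding the class. A minimal sketch, with LineReader and readLines as illustrative names that are not part of the original code:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class LineReader {

    // Reads all lines of a text file. The IOException propagates instead of
    // being turned into a null return, so a wrong path cannot go unnoticed.
    static List<String> readLines(String filename) throws IOException {
        List<String> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(filename))) {
            String line;
            while ((line = reader.readLine()) != null) {
                records.add(line);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String filename = args.length > 0 ? args[0] : "./inputD2.txt";
        // Shows the absolute location the relative path resolves to.
        System.out.println("Resolved to: " + new File(filename).getAbsolutePath());
        try {
            System.out.println(readLines(filename));
        } catch (IOException e) {
            System.err.println("Could not read '" + filename + "': " + e.getMessage());
        }
    }
}
```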
I'm sort of new to Java and I'm learning about input validation methods but I'm struggling with an assignment that I'm trying to complete. Can someone help me? The following code is reading a file somewhere on your computer. I'm supposed to verify that the file path is correct with an input validation method. This is what I have so far:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;
public class readFile {
public static void main(String[] args) {
Scanner scan = new Scanner(System.in);
System.out.print("Enter the name of your File: ");
String fileName = scan.nextLine();
File inputFile = new File(fileName);
BufferedReader reader = null;
try {
String sCurrentLine;
reader = new BufferedReader(new FileReader(inputFile));
while ((sCurrentLine = reader.readLine()) != null) {
System.out.println(sCurrentLine);
}
} catch (IOException e) {
e.printStackTrace();
System.out.print(e.getMessage());
} finally {
try {
if (reader != null)reader.close();
} catch (IOException ex) {
System.out.println(ex.getMessage());
ex.printStackTrace();
}
}
}
}
The easiest way to tell if the file path given is correct is to simply check if it exists:
if (inputFile.exists() && !inputFile.isDirectory()) {
// inputFile has a valid path.
}
Use the following check:
File f = new File(filePathString);
if(f.exists() && !f.isDirectory()) {
// do something
}
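The exists/isDirectory check from both answers can be wrapped in a validation method that keeps prompting until the input is usable. This is a sketch rather than the assignment's required design; PathValidator, isReadableFile and promptForFile are made-up names, and it uses java.nio.file.Files instead of java.io.File for the check.

```java
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Scanner;

final class PathValidator {

    // True only if the text names an existing, readable, regular file.
    static boolean isReadableFile(String text) {
        if (text == null || text.trim().isEmpty()) {
            return false;
        }
        try {
            Path p = Paths.get(text.trim());
            return Files.isRegularFile(p) && Files.isReadable(p);
        } catch (InvalidPathException e) {
            return false; // characters that are illegal in a path on this OS
        }
    }

    // Keeps asking until the input passes the check; returns null if input ends.
    static String promptForFile(Scanner scan) {
        while (true) {
            System.out.print("Enter the name of your File: ");
            if (!scan.hasNextLine()) {
                return null;
            }
            String candidate = scan.nextLine().trim();
            if (isReadableFile(candidate)) {
                return candidate;
            }
            System.out.println("'" + candidate + "' is not a readable file, try again.");
        }
    }

    public static void main(String[] args) {
        String fileName = promptForFile(new Scanner(System.in));
        System.out.println(fileName == null ? "No input." : "Using " + fileName);
    }
}
```

The returned name can then be passed to the existing BufferedReader code unchanged.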
I'm writing code in which part of a file's data has to be replaced with the content of another file.
I know how to use the String replace() function, but the problem here is that I want to replace a string with entirely new data read from a file.
I'm able to append the content (in private static void writeDataofFootnotes(File temp, File fout)), but I don't know how to replace it.
Below is my code.
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.URL;
public class BottomContent {
public static void main(String[] args) throws Exception {
String input = "C:/Users/u0138039/Desktop/Proview/TEST/Test/src.html";
String fileName = input.substring(input.lastIndexOf("/") + 1);
URL url = new URL("file:///" + input);
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
File fout = new File("C:/Users/u0138039/Desktop/TEST/Test/OP/" + fileName);
File temp = new File("C:/Users/u0138039/Desktop/TEST/Test/OP/temp.txt");
if (!fout.exists()) {
fout.createNewFile();
}
if (!temp.exists()) {
temp.createNewFile();
}
FileOutputStream fos = new FileOutputStream(fout);
FileOutputStream tempOs = new FileOutputStream(temp);
BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(fos));
BufferedWriter tempWriter = new BufferedWriter(new OutputStreamWriter(tempOs));
String inputLine;
String footContent = null;
int i = 0;
while ((inputLine = in.readLine()) != null) {
if (inputLine.contains("class=\"para\" id=\"")) {
footContent = inputLine.replaceAll(
"<p class=\"para\" id=\"(.*)_(.*)\" style=\"text-indent: (.*)%;\">(.*)(.)(.*)</p>",
"<div class=\"tr_footnote\">\n<div class=\"footnote\">\n<sup><a name=\"ftn.$2\" href=\"#f$2\" class=\"tr_ftn\">$4</a></sup>\n"
+ "<div class=\"para\">" + "$6" + "\n</div>\n</div>\n</div>");
inputLine = inputLine.replaceAll(
"<p class=\"para\" id=\"(.*)_(.*)\" style=\"text-indent: (.*)%;\">(.*)(.)(.*)</p>",
"");
tempWriter.write(footContent);
tempWriter.newLine();
}
inputLine = inputLine.replace("</body>", "<hr/></body>");
bw.write(inputLine);
bw.newLine();
}
tempWriter.close();
bw.close();
in.close();
writeDataofFootnotes(temp, fout);
}
private static void writeDataofFootnotes(File temp, File fout) throws IOException {
FileReader fr = null;
FileWriter fw = null;
try {
fr = new FileReader(temp);
fw = new FileWriter(fout, true);
int c = fr.read();
while (c != -1) {
fw.write(c);
c = fr.read();
}
} catch (IOException e) {
e.printStackTrace();
} finally {
close(fr);
close(fw);
}
}
public static void close(Closeable stream) {
try {
if (stream != null) {
stream.close();
}
} catch (IOException e) {
// ...
}
}
}
Here I search for a particular string and save the matches in a separate txt file. Once that is done, I want to replace the <hr /> tag with the entire contents of that txt file.
How can I achieve this?
I'd modify your processing loop as follows:
while ((inputLine = in.readLine()) != null) {
// Stop translation when we reach end of document.
if (inputLine.contains("</body>")) {
break;
}
if (inputLine.contains("class=\"para\" id=\"")) {
// No changes in this block
}
bw.write(inputLine);
bw.newLine();
}
// Close temporary file
tempWriter.close();
// Open temporary file, and copy verbatim to output
// (needs: import java.nio.file.Files;)
BufferedReader temp_in = Files.newBufferedReader(temp.toPath());
String footnotes;
while ((footnotes = temp_in.readLine()) != null) {
bw.write(footnotes);
bw.newLine();
}
temp_in.close();
// Finish document
bw.write(inputLine);
bw.newLine();
while ((inputLine = in.readLine()) != null) {
bw.write(inputLine);
bw.newLine();
}
// ... and close all open files
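If the files are small enough to hold in memory, another way to approach the same goal is to read the footnote file first and substitute it wherever the marker appears. This is a sketch under assumptions, not the method from the answer above: PlaceholderFiller and fill are hypothetical names, the marker "<hr/>" is the form written by the question's code, and UTF-8 is assumed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

final class PlaceholderFiller {

    // Returns the lines of 'source' with every occurrence of 'marker'
    // replaced by the full text of 'insert'.
    static List<String> fill(Path source, Path insert, String marker) throws IOException {
        String replacement = String.join(System.lineSeparator(),
                Files.readAllLines(insert, StandardCharsets.UTF_8));
        List<String> result = new ArrayList<>();
        for (String line : Files.readAllLines(source, StandardCharsets.UTF_8)) {
            // String.replace is literal, so '$' or '\' in the inserted text is safe
            result.add(line.replace(marker, replacement));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: PlaceholderFiller <source> <footnotes> <output>");
            return;
        }
        List<String> filled = fill(Paths.get(args[0]), Paths.get(args[1]), "<hr/>");
        Files.write(Paths.get(args[2]), filled, StandardCharsets.UTF_8);
    }
}
```

The literal replace is chosen over replaceAll because the footnote HTML may contain characters that replaceAll would treat as group references.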
I have created a few files for temporary use and used them as inputs for some methods. I called
deleteOnExit()
on all the files I created, but one file still remains.
I assume this is because the file is still in use, but doesn't execution move to the next line only after the current line has finished (the program is single-threaded)?
In practice it is not a problem, because the file is overwritten on every run, so only one file is ever left over. Still, I would like to understand why this happens, and whether I could work around it with
Thread.sleep(sometime);
Edit:
File x = new File("x.txt");
new class1().method1();
After creating all the files (5), I just added this line:
x.deleteOnExit(); y.deleteOnExit(); and so on...
All the files except the last one are deleted.
Make sure that every stream writing to the file is closed. If a stream is not closed, the file stays locked and delete will return false. That was an issue I had. Hopefully that helps.
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
public class Test {
public static void main(String[] args) {
File reportNew = null;
File writeToDir = null;
BufferedReader br = null;
BufferedWriter bw = null;
StringWriter sw = null;
List<File> fileList = new ArrayList<File>();
// MM = month, HH = hour (0-23), mm = minute, ss = second, SSS = millisecond
SimpleDateFormat ft = new SimpleDateFormat("yyyyMMdd_HH_mm_ss_SSS");
try {
//Read report.new file
reportNew = new File("c:\\temp\\report.new");
//Create temp directory for newly created files
writeToDir = new File("c:\\temp");
//tempDir.mkdir();
//Separate report.new into many files separated by a token
br = new BufferedReader(new FileReader(reportNew));
sw = new StringWriter();
String line;
int fileCount = 0;
while (true) {
line=br.readLine();
if (line == null || line.contains("%PDF")) {
if (!sw.toString().isEmpty()) {
fileCount++;
File _file = new File(writeToDir.getPath()
+ File.separator
+ fileCount
+ "_"
+ ft.format(new Date())
+ ".htm");
_file.deleteOnExit();
fileList.add(_file);
bw = new BufferedWriter(new FileWriter(_file));
bw.write(sw.toString());
bw.flush();
bw.close();
sw.getBuffer().setLength(0);
System.out.println("File "
+ _file.getPath()
+ " exists "
+ _file.exists());
}
if (line == null)
break;
else
continue;
}
sw.write(line);
sw.write(System.getProperty("line.separator"));
}
} catch ( Exception e) {
e.printStackTrace();
} finally {
if (bw != null) {
try {
bw.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
}
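The advice about closing streams before deleting can be sketched as follows. TempFileDemo and writeTemp are illustrative names; the point is only that try-with-resources closes the writer before anything tries to delete the file, and that deleteOnExit() remains as a fallback for normal JVM shutdown.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

final class TempFileDemo {

    // Writes text to a new temporary file and closes the writer before returning.
    static File writeTemp(String text) throws IOException {
        File tmp = File.createTempFile("report_", ".htm");
        tmp.deleteOnExit(); // fallback: removed at normal JVM shutdown
        try (BufferedWriter bw = new BufferedWriter(new FileWriter(tmp))) {
            bw.write(text);
        } // writer is closed here, so no open handle keeps the file locked
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        File tmp = writeTemp("<html></html>");
        System.out.println("exists after write: " + tmp.exists());
        // With every stream closed, an explicit delete can be attempted directly.
        System.out.println("deleted: " + tmp.delete());
    }
}
```

If one file still survives in the real program, a reader or writer on that particular file that is never closed is the first thing worth checking.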
In order to close the file that you have opened in your program, try creating an explicit termination method.
Therefore, try writing the following:
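import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
public class ClassThatUsesFile {
private String filename;
private BufferedReader reader;
// FileReader throws FileNotFoundException if the file cannot be opened
public ClassThatUsesFile (String afile) throws FileNotFoundException {
this.filename = afile;
this.reader = new BufferedReader(new FileReader(afile));
}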
// try-finally block guarantees execution of termination method
protected void terminate() {
try {
// Do what must be done with your file before it needs to be closed.
} finally {
// Here is where your explicit termination method should be located.
// Close or delete your file and close or delete your buffer reader.
}
}
}
I have a number of files in the directory C:\Users\Mahady\Desktop\Java 31122011\src\register\
named like this:
100100545.txt
100545454.txt
and so on. In each file, the data is laid out line by line like this:
Bob
1234
4834
London
9852
1
My question is: how do I read the files in the directory one by one and, for each file, read all lines except line 3? I would then like to merge this data into Word to create letters. Thanks.
Detailed answer:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
public class FileRead {
public static void main(String[] args) {
File folder = new File("C:/Users/Mahady/Desktop/Java 31122011/src/register/");
// listFiles() returns null if the path is not a readable directory
File[] files = folder.listFiles();
if (files == null) {
return;
}
for (File file : files) {
// try-with-resources closes each reader before the next file is opened
try (BufferedReader bufferedReader = new BufferedReader(new FileReader(file))) {
String line;
int lineCount = 0;
while (null != (line = bufferedReader.readLine())) {
lineCount++;
// print every line except the third
if (3 != lineCount) {
System.out.println(line);
}
}
} catch (IOException e) {
// also covers FileNotFoundException, e.g. for subdirectories
e.printStackTrace();
}
}
}
}
Hope this helps.
Try this:
File dir = new File("C:\\Users\\Mahady\\Desktop\\Java 31122011\\src\\register\\");
// listFiles() returns File objects that include the directory; list() returns bare names only
for (File f : dir.listFiles()) {
FileInputStream fstream = new FileInputStream(f);
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
String strLine;
while ((strLine = br.readLine()) != null) {
System.out.println (strLine);
}
// closing the reader also closes the underlying stream
br.close();
}
Obviously, you will need to add exception handling code around this skeletal implementation.
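For the mail-merge step it may be more convenient to collect the fields per file rather than print them. A minimal sketch using java.nio.file, assuming the files are UTF-8 text with a .txt extension; RegisterReader and readAll are hypothetical names, and the line to skip is passed as a 1-based number.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RegisterReader {

    // Maps each *.txt file name in 'dir' to its lines, omitting line 'skip' (1-based).
    static Map<String, List<String>> readAll(Path dir, int skip) throws IOException {
        Map<String, List<String>> records = new TreeMap<>(); // sorted by file name
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*.txt")) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // ignore directories that happen to match the pattern
                }
                List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
                List<String> kept = new ArrayList<>();
                for (int i = 0; i < lines.size(); i++) {
                    if (i + 1 != skip) {
                        kept.add(lines.get(i));
                    }
                }
                records.put(file.getFileName().toString(), kept);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory; pass the register folder as an argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        readAll(dir, 3).forEach((name, fields) -> System.out.println(name + " -> " + fields));
    }
}
```

Each map entry then holds one letter's worth of data, keyed by the file name.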