Extracting Strings of Data from a .txt file - java

When I read a file and try to extract the strings and doubles in it, I also end up extracting formatting information about the text. I added a System.out.println to my while loop to print each line of the file, and it printed extra lines of formatting information as well. I want only the text written in the file, ignoring lines that look like:
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural"
I'm doing this so I can build string arrays from the information in the file.
The purpose of this program is to read a file with rows of information (last name (string), first name (string), account balance (double)), extract each row separately, place each row string in an ArrayList, sort the list (by last name, then by first name), and then write the sorted rows to a file named output.txt. Each output row is formatted as last name, first name, then account balance, with a single space between them. The number of rows can vary.
Input example (from a .txt file):
Smith Charles 200.000
Allen Drake 5000.00
Allen Trey 300.00
Burbis Zeik 400.00
Zan Rick 6000.00
Output example (written to a file output.txt):
Allen Drake 5000.00
Allen Trey 300.00
Burbis Zeik 400.00
Smith Charles 200.000
Zan Rick 6000.00
Thanks!
public static void main(String[] args) throws IOException {
    Scanner fileName = new Scanner(System.in);
    String file = fileName.next();
    String input;
    Scanner fileinput = null;
    // File inFile = new File("c:\\csc2310\\test.txt");
    File inFile = new File(file);
    int i = 0;
    try {
        fileinput = new Scanner(inFile);
        // hasNextLine() pairs with nextLine(), so every line is read exactly once
        while (fileinput.hasNextLine()) {
            i++;
            System.out.println(i);
            input = fileinput.nextLine();
            System.out.println(input);
        }
    } catch (FileNotFoundException e) {
        System.out.println(e);
        System.exit(1);
    } finally {
        // close once, and only if the Scanner was actually opened
        if (fileinput != null) {
            fileinput.close();
        }
    }
}

The problem had to do with the way that TextEdit (the Mac OS text editor) saves .txt files: it had saved the file as rich text, so the extra lines were formatting data rather than my own text. I typed the same information into jEdit and saved it with the same extension, and all the lines of formatting information were gone. Thank you for your time.

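Your string is not actually spoilt. Java does not add anything to the \ characters while reading a file; a backslash only has to be doubled (\\) when it appears inside a string literal in Java source code. Written as a literal, the line would look like
"\\pard\\tx720\\tx1440\\tx2160\\tx2880\\tx3600\\tx4320\\tx5040\\tx5760\\tx6480\\tx7200\\tx7920\\tx8640\\ql\\qnatural\\pardirnatural"
The backslash lines are part of the file's contents. The only thing you might be able to do on the Java side is read the data (for example as a byte array) and remove the parts that begin with \ . Check whether that is possible for your file.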
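For the program described in the question, one possible approach is to skip anything that is not a data row while reading, then sort what is left. The sketch below is not the asker's code. It assumes that every valid row has exactly three whitespace-separated fields with a numeric third field, and the names AccountSorter, isDataRow and sortRows are made up for illustration. Real rich-text files can also mix formatting into the data lines themselves, so saving the file as plain text is still the more reliable fix.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class AccountSorter {

    // Assumption: a data row is "lastName firstName balance"; anything else
    // (such as formatting lines starting with a backslash) is ignored.
    static boolean isDataRow(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 3) {
            return false;
        }
        try {
            Double.parseDouble(parts[2]);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Keeps only the data rows and sorts them by last name, then first name.
    static List<String> sortRows(List<String> lines) {
        List<String> rows = new ArrayList<>();
        for (String line : lines) {
            if (isDataRow(line)) {
                rows.add(line.trim());
            }
        }
        rows.sort(Comparator
                .comparing((String row) -> row.split("\\s+")[0])
                .thenComparing(row -> row.split("\\s+")[1]));
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample standing in for the lines of the input file.
        List<String> lines = Arrays.asList(
                "\\pard\\tx720\\ql\\qnatural",
                "Smith Charles 200.000",
                "Allen Trey 300.00",
                "Allen Drake 5000.00");
        // Writes one sorted row per line to output.txt.
        Files.write(Paths.get("output.txt"), sortRows(lines));
    }
}
```

To use it on a real file, the sample list could be replaced with Files.readAllLines(Paths.get(fileName)).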

Related

Importing two CSV files into Java and then parsing them. The first one works, the second doesn't

I'm working on code that imports two CSV files and then parses them.
//Importing CSV File for betreuen
String filename = "betreuen_4.csv";
File file = new File(filename);
//Importing CSV File for lieferant
String filename1 = "lieferant.csv";
File file1 = new File(filename1);
I then proceed to parse them. For the first csv file everything works fine. The code is
try {
    Scanner inputStream = new Scanner(file);
    while (inputStream.hasNext()) {
        String data = inputStream.next();
        String[] values = data.split(",");
        int PInummer = Integer.parseInt(values[1]);
        String MNummer = values[0];
        String KundenID = values[2];
        //System.out.println(MNummer);
        //create the caring object with the required parameters
        //Caring caring = new Caring(MNummer,PInummer,KundenID);
        //betreuen.add(caring);
    }
    inputStream.close();
} catch (FileNotFoundException d) {
    d.printStackTrace();
}
I then proceed to parse the other CSV file. The code is
// parsing csv file lieferant
try {
    Scanner inputStream1 = new Scanner(file1);
    while (inputStream1.hasNext()) {
        String data1 = inputStream1.next();
        String[] values1 = data1.split(",");
        int LIDnummer = Integer.parseInt(values1[0]);
        String citynames = values1[1];
        System.out.println(LIDnummer);
        String firmanames = values1[2];
        //create the suppliers object with the required parameters
        //Suppliers suppliers = new Suppliers(LIDnummer,citynames,firmanames);
        //lieferant.add(suppliers);
    }
    inputStream1.close();
} catch (FileNotFoundException d) {
    d.printStackTrace();
}
The first error I get is
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2
at Verbindung.main(Verbindung.java:61)
So I look at the array element at line 61, which is firmanames, and I think it's impossible for it to be out of range: my CSV file has three columns, and index 2 (which I know is the third column in the CSV file) holds my list of company names. I know the array is not empty because when I wrote
`System.out.println(firmanames)`
it would print out three of the first company names. So in order to see if there is something else causing the problem I commented line 61 out and I ran the code again. I get the following error
Exception in thread "main" java.lang.NumberFormatException: For input string: "Ridge"
at java.lang.NumberFormatException.forInputString(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at Verbindung.main(Verbindung.java:58)
I googled these errors, and the explanation was that I'm trying to parse something into an Integer that cannot be an integer. But the only thing I am trying to parse into an Integer is the code
int LIDnummer = Integer.parseInt(values1[0]);
Which indeed is a column containing only Integers.
My second column is indeed just a column of city names in the USA. The only thing about that column is that some town names contain spaces, like Middle brook, but I don't think that would cause problems for a String type. Also, my company column has names like AT&T, but I would think that the & symbol would not cause problems for a String either. I don't know where I am going wrong here.
I can't include the CSV file, but here is a picture of part of it. Each column has 1000 entries.
[Image: part of the CSV file]
Scanner by default splits its input by whitespace (docs). Whitespace means spaces, tabs and newlines.
So your code will, I think, split the whole input file at every space and every newline, which is not what you want.
So, the first three elements your code will read are
5416499,Prairie
Ridge,NIKE
1765368,Edison,Cartier
I suggest using method readLine of BufferedReader then calling split on that.
The alternative is to explicitly tell Scanner how you want it to split the input
Scanner inputStream1 = new Scanner(file1).useDelimiter("\n");
but I think this is not the best use of Scanner when a simpler class (BufferedReader) will do.
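The readLine approach could look like the sketch below. It assumes the three-column layout described in the question (number, city, company) with no quoted fields. The class name SupplierCsv and the helper parseLine are invented for this example, and the sample data is read from a string so the snippet runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class SupplierCsv {

    // Splits one CSV line into its columns; -1 keeps trailing empty columns.
    static String[] parseLine(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) throws IOException {
        String sample = "5416499,Prairie Ridge,NIKE\n1765368,Edison,Cartier\n";
        // For a real file, wrap new FileReader("lieferant.csv") instead of the StringReader.
        // try-with-resources closes the reader even if parsing fails.
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] values = parseLine(line);
                int lidNummer = Integer.parseInt(values[0].trim());
                String cityName = values[1];   // may contain spaces, e.g. "Prairie Ridge"
                String firmaName = values[2];
                System.out.println(lidNummer + " | " + cityName + " | " + firmaName);
            }
        }
    }
}
```

Because readLine returns a whole line at a time, a space inside a city name no longer splits the row.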
First of all, I would highly suggest you try and use an existing CSV parser, for example this one.
But if you really want to use your own, you are going to need to do some simple debugging. I don't know how large your file is, but the symptoms you are describing lead me to believe that somewhere in the csv there may be a missing comma or an accidental escape character. You need to find out what line it is. So run this code and check its output before it crashes:
int line = 1;
try {
    Scanner inputStream1 = new Scanner(file1);
    while (inputStream1.hasNext()) {
        String data1 = inputStream1.next();
        String[] values1 = data1.split(",");
        int LIDnummer = Integer.parseInt(values1[0]);
        String citynames = values1[1];
        System.out.println(LIDnummer);
        String firmanames = values1[2];
        line++;
    }
} catch (ArrayIndexOutOfBoundsException e) {
    System.err.println("The issue in the csv is at line:" + line);
}
Once you find what line it is, the answer should be obvious. If not, post a picture of that line and we'll see...
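If the rows are read whole (for example with nextLine() or readLine() rather than next()), a check along these lines could report every malformed row instead of stopping at the first one. This is only a sketch: CsvLineCheck and findBadLines are hypothetical names, and it assumes three columns with an integer in the first, as described in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CsvLineCheck {

    // Returns the 1-based numbers of lines that do not have exactly three
    // columns or whose first column is not an integer.
    static List<Integer> findBadLines(List<String> lines) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String[] values = lines.get(i).split(",", -1);
            boolean ok = values.length == 3;
            if (ok) {
                try {
                    Integer.parseInt(values[0].trim());
                } catch (NumberFormatException e) {
                    ok = false;
                }
            }
            if (!ok) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "5416499,Prairie Ridge,NIKE",
                "Ridge,NIKE",
                "1765368,Edison,Cartier");
        System.out.println(findBadLines(sample)); // prints [2]
    }
}
```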

Words coming up in individual lines when parsing csv file in java

I am trying to parse a CSV file in Java and am running into a problem. When I try to split the CSV up like so:
public static void main(String[] args) {
    String nameOfFile = "KingstonNorthWard2016Distribution.csv";
    File file = new File(nameOfFile);
    try {
        Scanner inputStream = new Scanner(file);
        while (inputStream.hasNext()) {
            String data = inputStream.next();
            String[] values = data.split(",");
            System.out.println(Arrays.toString(values));
        }
        inputStream.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
}
My console returns this:
[Distribution]
[Report]
[]
[Print]
[Date/Time:]
[31/10/2016]
[07:27:10PM]
[Bayside]
[City]
[Council]
[2016]
Despite my csv looking like this:
Distribution Report ,,,,,,,,,,,,,,,,,,,,,,,,,,
Print Date/Time: 31/10/2016 07:27:10PM,,,,,,,,,,,,,,,,,,,,,,,,,,
Bayside City Council 2016,,,,,,,,,,,,,,,,,,,,,,,,,,
Central Ward,,,,,,,,,,,,,,,,,,,,,,,,,,
What I can't understand is why my console doesn't look like this:
[Distribution report]
[Print Date/Time: 31/10/2016 07:27:10PM]
[Bayside City Council 2016]
[Central Ward]
If anyone could help, that would be great. For extra points, the CSV later goes on to list names like "Smith, John", so bear that in mind if my split needs to change. Thanks in advance.
hasNext and next iterate over words; you want hasNextLine and nextLine.
As for the fields which contain your delimiter, we'd have to look at a sample from your dataset to try and see if there is a rule we can define which shows a delimiter can be ignored by split.
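As a rough illustration of both points, the sketch below reads whole lines with hasNextLine/nextLine and splits only on commas that sit outside double quotes. It assumes quoted fields use balanced double quotes with no escaped quotes inside them; the class name QuotedCsvSplit and the helper splitLine are made up here, and a dedicated CSV library would be more robust for real data.

```java
import java.util.Arrays;
import java.util.Scanner;

class QuotedCsvSplit {

    // Matches a comma only when an even number of double quotes follows it,
    // i.e. when the comma is outside a quoted field.
    static final String COMMA_OUTSIDE_QUOTES = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    // Splits one line into fields; -1 keeps the trailing empty fields.
    static String[] splitLine(String line) {
        return line.split(COMMA_OUTSIDE_QUOTES, -1);
    }

    public static void main(String[] args) {
        // A String stands in for the file so the example runs on its own.
        Scanner in = new Scanner("Distribution Report ,,\n\"Smith, John\",Central Ward,12\n");
        while (in.hasNextLine()) {
            System.out.println(Arrays.toString(splitLine(in.nextLine())));
        }
        in.close();
    }
}
```

The quotes stay part of the field here; they could be stripped afterwards if needed.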

Java: Need Help Reading a large file into an array list by paragraph

OK, here is my question. I am in an introductory Java course, so I cannot use any advanced code. I need to read in a large text file and store each paragraph as an element of an ArrayList, so I need to read in the file and split on the carriage return. What I have so far is posted below. Thanks in advance.
public static void fileReader(String x) throws FileNotFoundException {
    String fileName = (x + ".txt");
    File input = new File(fileName);
    Scanner in = new Scanner(input);
    ArrayList<String> linesInFile = new ArrayList<>();
    while (in.hasNextLine()) {
        if ( != '/n') { //this is where i'm losing it
            String line = in.nextLine();
            linesInFile.add(line);
        }
    }
    in.close();
}
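If each paragraph in the text file sits on a single line (that is, there are no line breaks inside a paragraph), then you don't have to check for "\n" at all. nextLine() already returns one line at a time, with the line terminator removed.
while (in.hasNextLine()) {
    String line = in.nextLine();
    linesInFile.add(line);
}
This would suffice.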
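If the paragraphs do span several lines and are separated by blank lines, one way to handle it with the same basic tools is sketched below. That file layout is an assumption, since the question does not show the file, and the names ParagraphReader and readParagraphs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ParagraphReader {

    // Collects consecutive non-blank lines into one paragraph; a blank line
    // ends the current paragraph.
    static List<String> readParagraphs(Scanner in) {
        List<String> paragraphs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        while (in.hasNextLine()) {
            String line = in.nextLine();
            if (line.trim().isEmpty()) {
                if (current.length() > 0) {
                    paragraphs.add(current.toString());
                    current.setLength(0);
                }
            } else {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(line.trim());
            }
        }
        // The last paragraph may not be followed by a blank line.
        if (current.length() > 0) {
            paragraphs.add(current.toString());
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        Scanner in = new Scanner("First line.\nStill first.\n\nSecond paragraph.\n");
        System.out.println(readParagraphs(in).size()); // prints 2
        in.close();
    }
}
```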

How do I create/modify the contents of a file before I create it in a Java program I have made?

Hey, here is the code I have created. It's a simple file creation program, as I have only been using Java for the past 2 days. I'm only 13, so please keep it simple :)
import java.io.File;
import java.io.IOException;
import java.util.Scanner;
public class Filecreator
{
    public static void main( String[] args )
    {
        Scanner read = new Scanner (System.in);
        String y;
        String u;
        try {
            System.out.println("Please enter the name of your file!");
            y = read.next();
            while (y.contains(".") || y.contains(",") || y.contains("{") || y.contains("}") || y.contains("#")){
                System.out.println("Your Filename contains an incorrect character you may only use Number 0-9 And Letters A-Z");
                System.out.println("Please Re-enter your file name");
                y = read.next();
            }
            System.out.println("Please enter the file type name");
            u = read.next();
            while (u.contains(".") || u.contains(",") || u.contains("{") || u.contains("}") || u.contains("#") ){
                System.out.println("Your File-type name contains an incorrect character you may only use Number 0-9 And Letters A-Z");
                System.out.println("Please Re-enter your file-type name");
                u = read.next();
            }
            File file = new File( y + "." + u );
            if (file.createNewFile()){
                System.out.println("File is created!");
                // file.getName() already holds the full name, so y is not printed twice
                System.out.println("The name of the file you have created is called " + file.getName());
            }else{
                System.out.println("File already exists.");
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
If you run it in an IDE such as Eclipse you will see the output. But I want to be able to edit the file's contents before I finally choose the name and the file type and then save it. Is there any way I can do this? Thanks - George
Currently you are printing everything out to the console; this is what System.out.println(...) does.
What you can do is write the output somewhere else. How can you do this? The easiest way is to use StringBuilder:
http://docs.oracle.com/javase/7/docs/api/java/lang/StringBuilder.html
Here is a code sample:
StringBuilder sb = new StringBuilder();
sb.append("one ");
sb.append("two");
String output = sb.toString(); // output contains string "one two"
Now you have your whole output in one string. If you look at the StringBuilder documentation (linked above) you can see that there are other useful methods, like insert or delete, that help you modify your output before you convert it to a string (with the toString method). Once all your modifications are done you can write this string to a file.
For writing a String to a file, this could be helpful:
How do I save a String to a text file using Java?
This approach is good enough if you are writing small files (up to a few MB). If you want to write bigger files, you shouldn't store the whole string in memory before you write it to a file. In such scenarios you should create smaller strings and write them to the file one at a time. There is a good tutorial for that:
http://www.mkyong.com/java/how-to-write-to-file-in-java-bufferedwriter-example/
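Putting the two steps together for a small file might look like this. It is only a sketch: the class name BuildThenWrite and the helper buildContent are invented for the example, and it writes to a temporary file so it does not overwrite anything.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class BuildThenWrite {

    // Builds the text in memory first, so it can still be changed before saving.
    static String buildContent() {
        StringBuilder sb = new StringBuilder();
        sb.append("one ");
        sb.append("two");
        sb.insert(0, "zero ");  // modify the text before it is written
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // A temporary file is used here; a real program would use the chosen file name.
        Path target = Files.createTempFile("example", ".txt");
        Files.write(target, buildContent().getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote " + target);
    }
}
```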
Read lines until a line contains a special end-of-file marker. E.g., the old Unix program 'mail' appends all lines until a line consisting of a single '.' is read.
// insert this before reading the filename
StringBuilder content = new StringBuilder();
// nextLine() reads a whole line; next() would read only one word
String s = read.nextLine();
while (!s.equals(".")) {
    content.append(s);
    content.append(String.format("%n"));
    s = read.nextLine();
}
Then look at this: write a string into file (better answer than the other) and use that as an example of how to actually write the contents to the file, after it has been created.
p.s.
Pro-tip: I'm sure you have noticed that you have duplicated exactly the same line for checking if a filename is valid. A good programmer would notice this too and "extract" a method that does the logic in one place.
Example:
static boolean isValidFilename( String s ) {
    return !(s.contains(".") || s.contains(",") || s.contains("{") || s.contains("}") || s.contains("#"));
}
You may then replace the checks:
while (!isValidFilename( u )) {
    System.out.println("Your File-type name contains an incorrect character...etc");
    u = read.next(); // ask again, otherwise the loop never ends
}
This is good since you don't have to repeat tricky code, which means there are fewer places to make errors. By the way, the method has a positive name and is negated (!) at the call site on purpose: with a negative name such as isInvalid you can end up with a double negation (not invalid = true), which can be a bit confusing.
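Since the error message says that only digits and letters are allowed, a whitelist may be simpler than listing forbidden characters. This is an alternative to the check above rather than the same logic: it rejects every character that is not a letter or digit, which is an assumption based on the wording of the message. The class name FilenameCheck is made up.

```java
class FilenameCheck {

    // Accepts only the characters the prompt mentions: letters and digits.
    // An empty string is rejected because "+" requires at least one character.
    static boolean isValidFilename(String s) {
        return s.matches("[A-Za-z0-9]+");
    }

    public static void main(String[] args) {
        System.out.println(isValidFilename("notes2"));   // prints true
        System.out.println(isValidFilename("my.file"));  // prints false
    }
}
```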
You can't edit a file before you create it. You can make changes to the data you are going to write to the file once you create it; since you control both that data and the creation of the file, there should be no problem.

Is there any way to return an entire line of a .txt document?

I'm currently working on a java program that requires you to use several scanners on several txt files to return definitions of names, as well as a few numbers that go along with them about their popularity. Currently, I'm having difficulty returning this data back to the original program.
An example of the .txt file with the definitions is as follows:
ADAMO m Italian Italian form of ADAM
ADAN mf (no meaning found)
ADANNA f Igbo Means "father's daughter" in Igbo.
ADANNAYA f Igbo Means "her father's daughter" in Igbo.
ADAOIN f Irish Modern form of TAN
My program currently looks like this:
import java.io.*;
import java.util.Scanner; //For reading the file.
public class BabyNamesProject {
    //This program will use user input to scan several text documents for information regarding names.
    //It will give the popularity of the name in the last few decades, its meaning, and a Drawing Panel chart with its popularity.
    public static void main(String[] args)
            throws FileNotFoundException {
        File f = new File("Names.txt");
        File g = new File("Meanings.txt");
        File h = new File("Names2.txt");
        Scanner nameCheck = new Scanner(f);
        Scanner meaningCheck = new Scanner(g);
        Scanner popularityCheck = new Scanner(h);
        Scanner Ask = new Scanner(System.in);
        String userName = "";
        String upperUserName = "";
        String meaning = "";
        System.out.println("Hello! This application will allow you to check how popular certain names are compared to others.");
        System.out.println(" ");
        System.out.println("Please type in your name!");
        userName = Ask.next();
        upperUserName = userName.toUpperCase();
        if (meaningCheck.hasNext(""+ upperUserName))
            meaning = meaningCheck.next("" + upperUserName);
        System.out.println("" + meaning);
        System.out.println("" + userName +"! That's one the rarest cards in all of duel monsters!");
    }
}
As you may notice, it's nowhere near finished, but what I can't seem to figure out is how to search the .txt file using the uppercase name the user inputs, and how to make it return the entire line. So far, plenty of searches on here have led me to carriage returns; I at least understand how they work, but not how to use one. If anyone has time to explain something I could do to solve this, it would be greatly appreciated!
The method is called nextLine():
File f = new File("text.txt");
Scanner s = new Scanner(f);
String line = s.nextLine();
You can use a BufferedReader to read the text file line-by-line and do whatever you please with it.
Have a look here: http://www.mkyong.com/java/how-to-read-file-from-java-bufferedreader-example/
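To find the line for a particular name, one option is to read line by line and compare the first word of each line with the name. The sketch below assumes the layout shown in the question, where each line starts with the name in upper case; MeaningLookup and findLine are hypothetical names. It compares the whole first word rather than using startsWith, so that ADAN does not also match ADANNA.

```java
import java.util.Scanner;

class MeaningLookup {

    // Returns the first line whose first word equals the given name,
    // or null when no line matches.
    static String findLine(Scanner in, String upperName) {
        while (in.hasNextLine()) {
            String line = in.nextLine();
            String[] parts = line.trim().split("\\s+", 2);
            if (parts[0].equals(upperName)) {
                return line;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // A String stands in for Meanings.txt so the example runs on its own.
        Scanner in = new Scanner("ADAMO m Italian Italian form of ADAM\nADAN mf (no meaning found)\n");
        System.out.println(findLine(in, "ADAN")); // prints ADAN mf (no meaning found)
        in.close();
    }
}
```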
