reading data from file and breaking it into tokens - java

I am using the following code, after opening a text file, to break the file's input into tokens:
String input;
while(true)
{
input = bin.readLine();
if (input == null)
{
System.out.println( "No data found in the file");
return 0;
}
break;
}
StringTokenizer tokenizer = new StringTokenizer(input);
Then:
for (int i=0; i < numAtt; i++)
{
attributeN[i] = tokenizer.nextToken();
}
I cannot understand why attributeNames gets only the tokens from the first line of the text file. Doesn't while(true) keep reading the whole file? Also, is there a way to avoid while(true) and the break used to terminate it?

The break after the if (input == null) { } block exits the while loop on the first pass, so your code reads only one line.
"Also is there a way to avoid the while(true) and using break to terminate it?"
Do it this way:
while ((input = bin.readLine()) != null) {
//split input line here
}
Also, consider using String#split() to split the line into tokens. For example, with , as the separator:
String attributeNames[] = input.split(",");
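Putting the two suggestions above together, a minimal sketch could look like the following. The class name LineTokenizer, the method readAllTokens and the sample data are illustrative assumptions, not part of the original code; a StringReader stands in for the file so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads every line and splits each one on commas.
class LineTokenizer {
    static List<String[]> readAllTokens(Reader source) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader bin = new BufferedReader(source)) {
            String input;
            // readLine() returns null at the end of the input, which ends the loop.
            while ((input = bin.readLine()) != null) {
                rows.add(input.split(","));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Sample data standing in for the text file (an assumption).
        for (String[] row : readAllTokens(new StringReader("a,b,c\nd,e,f\n"))) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

With a real file, a FileReader could be passed in place of the StringReader.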

A simpler way is to use String#split():
String splittedString[] = input.split("the separator");
The StringTokenizer documentation itself recommends split() or the java.util.regex package for new code.
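One practical difference between the two approaches is how they treat empty fields. The sketch below compares them on the same line; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical comparison of StringTokenizer and String#split on one line.
class SplitVersusTokenizer {
    // Collects the tokens StringTokenizer produces for a comma-separated line.
    static List<String> withTokenizer(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, ",");
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    // Splits the same line with String#split.
    static List<String> withSplit(String line) {
        return Arrays.asList(line.split(","));
    }

    public static void main(String[] args) {
        String line = "a,,c";
        System.out.println(withTokenizer(line)); // [a, c]    (empty field skipped)
        System.out.println(withSplit(line));     // [a, , c]  (empty field kept)
    }
}
```

StringTokenizer treats consecutive delimiters as one, so column positions can shift when a field is empty; split() keeps interior empty fields in place.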

Are you just trying to get tokens from the first line? If so, you don't need the while loop at all. Just remove while(true) { from the beginning and break; } from the end.

Related

Ignoring blank lines in CSV file in Java

I am trying to iterate through a CSV file in Java. It iterates through the entire file, but will get to the end of the file and try to read the next blank line and throw an error. My code is below.
public class Loop {
public static void main(String[] args) {
BufferedReader br = null;
String line = "";
try {
HashMap<Integer, Integer> changeData = new HashMap<Integer, Integer>();
br = new BufferedReader(new FileReader("C:\\xxxxx\\xxxxx\\xxxxx\\the_file.csv"));
String headerLine = br.readLine();
while ((line = br.readLine()) != null) {
String[] data = line.split(",");
/*Below is my latest attempt at fixing this,*/
/*but I've tried other things too.*/
if (data[0].equals("")) { break; }
System.out.println(data[0] + " - " + data[6]);
int changeId = Integer.parseInt(data[0]);
int changeCv = Integer.parseInt(data[6]);
changeData.put(changeId, changeCv);
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
As I said, this works fine until it gets to the end of the file, where I get the error Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0 at com.ucg.layout.ShelfTableUpdates.main(ShelfTableUpdates.java:23). I've stepped through the code in the Spring Tool Suite debugger. The error comes up whenever I reference data[0] or data[6], likely because there is nothing in that line. Which leads me back to my original question: why is it even trying to read that line in the first place?
It was my understanding that while ((line = br.readLine()) != null) would detect the end of the file, but it doesn't seem to. I've tried re-opening the file and deleting all of the blank rows, but that did not work.
Any idea how I can detect the end of the file so I don't get an error in this code?
ANSWER:
Credit goes to user @quemeraisc. I was also able to strip the commas from the line and treat an empty result as the end of the file; in my case, there are no blank rows before the end of the file. Neither solution truly detects the end of the file, though: if I had blank rows in between my data, they would be treated as the end as well.
Solution 1:
if (data.length < 7) {
System.out.println(data.length);
break;
}
Solution 2:
if (line.replace(",", "").isEmpty()) {
System.out.println(line.replace(",", ""));
break;
}
Just skip all blank lines:
while ((line = br.readLine()) != null) {
if( line.trim().isEmpty() ) {
continue;
}
....
....
The last line may contain invisible characters. String#trim() strips leading and trailing characters up to and including U+0020, which covers spaces, tabs, newlines and carriage returns, but it leaves others such as a non-breaking space (U+00A0) or a byte order mark (U+FEFF) in place. See this answer for how to remove them: How can I remove all control characters from a Java string?
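public String readLine() returns every line of your file, including empty ones. When you split an empty line, as in String[] data = line.split(",");, you get an array of size 1 holding a single empty string. A line made up only of commas is worse: split() drops trailing empty strings, so the array has size 0, which matches the ArrayIndexOutOfBoundsException: 0 you are seeing.
Why not try: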
if (data.length >= 7)
{
System.out.println(data[0] + " - " + data[6]);
int changeId = Integer.parseInt(data[0]);
int changeCv = Integer.parseInt(data[6]);
changeData.put(changeId, changeCv);
}
which will make sure there are at least 7 elements in your array before proceeding.
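To see why the length check matters, the following small sketch prints how many elements String#split returns for a few kinds of line. The class and method names are illustrative only.

```java
// Hypothetical demo of how String#split sizes its result for odd lines.
class SplitLengths {
    // Default split: trailing empty strings are removed from the result.
    static int fieldCount(String line) {
        return line.split(",").length;
    }

    // A negative limit keeps trailing empty strings.
    static int fieldCountKeepingTrailing(String line) {
        return line.split(",", -1).length;
    }

    public static void main(String[] args) {
        System.out.println(fieldCount(""));                      // 1 (one empty string)
        System.out.println(fieldCount(",,,,,,"));                // 0 (everything is trailing and empty)
        System.out.println(fieldCount("1,a,b,c,d,e,7"));         // 7
        System.out.println(fieldCountKeepingTrailing(",,,,,,")); // 7
    }
}
```

A row of bare commas, which spreadsheet exports sometimes append, therefore yields an empty array, and data[0] fails before any equals("") test can run.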
To skip blank lines you could try:
while ((line = reader.readLine()) != null) {
if(line.length() > 0) {
String[] data = line.split(",");
/*Below is my latest attempt at fixing this,*/
/*but I've tried other things too.*/
if (data[0] == null || data[0].equals("")) { break; }
System.out.println(data[0] + " - " + data[6]);
int changeId = Integer.parseInt(data[0]);
int changeCv = Integer.parseInt(data[6]);
changeData.put(changeId, changeCv);
}
}
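The loop above still stops at the first row whose first field is empty. A variant that skips unusable rows with continue instead of break might look like this sketch; ChangeDataReader, its read method and the sample rows are assumptions made for illustration, and it assumes columns 0 and 6 hold integers as in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical reader that ignores blank, comma-only and short rows.
class ChangeDataReader {
    static Map<Integer, Integer> read(Reader source) {
        Map<Integer, Integer> changeData = new HashMap<>();
        try (BufferedReader br = new BufferedReader(source)) {
            br.readLine(); // skip the header line, as the question does
            String line;
            while ((line = br.readLine()) != null) {
                String[] data = line.split(",");
                // Blank lines give 1 element, comma-only lines give 0: skip both.
                if (data.length < 7 || data[0].trim().isEmpty()) {
                    continue;
                }
                // Assumes columns 0 and 6 are integers; otherwise parseInt throws.
                changeData.put(Integer.parseInt(data[0].trim()),
                               Integer.parseInt(data[6].trim()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return changeData;
    }

    public static void main(String[] args) {
        String csv = "id,a,b,c,d,e,cv\n1,x,x,x,x,x,10\n\n,,,,,,\n2,x,x,x,x,x,20\n";
        System.out.println(read(new StringReader(csv)));
    }
}
```

Because rows are skipped rather than treated as the end of the file, data that follows a blank row is still read.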
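Another suggestion is to use replaceAll instead of replace. Note that replace(",", "") already removes every comma; replaceAll only differs in treating its first argument as a regular expression, so the result is the same here.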

How to read a ;-separated CSV in Java that can contain an unknown number of elements

I know there are a lot of questions about reading CSV files, but I simply can't find one that fits my needs.
I am trying to get keywords from a keywords.csv that can look like this. The delimiter is always ";".
SAP;BI; Business Intelligence;
ERP;
SOA;
SomethingElse;
I already looked into openCSV and so on, but I can't find a working example of how to do this (simple) task.
I tried this:
public void getKeywords()
{
try {
int rowCount = 0;
CSVReader reader = new CSVReader(new FileReader(csvFilename), ';');
String[] row = null;
while((row = reader.readNext()) != null) {
System.out.println(row[rowCount]);
rowCount++;
}
//...
reader.close();
}
catch (IOException e) {
System.out.println("File Read Error");
}
}
But it only prints the first element. I don't know what I'm doing wrong. I'm new to coding, as you may have noticed :)
EDIT: Got what I wanted, thanks for your help!
while((row = reader.readNext()) != null) {
for (int i = 0; i < row.length; i++)
{
System.out.println(row[i]);
}
}
Please help an old man out.
Thank you!
Using openCSV, you could use this code:
CSVReader reader = new CSVReader(new FileReader("yourfile.csv"), ';');
That will open the .csv file, read it in, and use a ; as the delimiter. A similar example can be found on the openCSV home page.
Once you have the file read in, you can use the data with something like the following:
String [] nextLine;
// Read from the csv sequentially until all the lines have been read.
while ((nextLine = reader.readNext()) != null) {
// nextLine[] is an array of values from the line
System.out.println(nextLine[0] + nextLine[1] + "etc...");
}
Where nextLine is a line from the file, and nextLine[0] will be the first element of the line, nextLine[1] will be the second, etc.
Edit:
In your comment below, you mentioned that you don't know how many elements will be in each row. You can handle that by using nextLine.length and figuring out how many elements are in that row.
For example, change the above code to something like:
String [] nextLine;
while ((nextLine = reader.readNext()) != null) {
if(nextLine.length == 1) {
// Do something with the first element, nextLine[0]
System.out.println(nextLine[0]);
}
else if(nextLine.length == 2) {
// Do something with both nextLine[0] and nextLine[1]
System.out.println(nextLine[0] + ", " + nextLine[1]);
}
// Continue depending on how you want to handle the different rows.
}
You can read the file using the nextLine() method of the Scanner class, which returns one line of the input file. You can then use String.split(";") to get the individual elements, move on to the next line with Scanner, and continue from there.
You will get a number of arrays, one for each line of the input file. You can just combine them to get what you want.
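The Scanner approach described above can be sketched with the standard library only. KeywordReader and readKeywords are hypothetical names, and the sample text mirrors the keywords.csv excerpt from the question; trimming and dropping empty fields is an added assumption about what the asker wants.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical reader that collects keywords from ;-separated lines of any length.
class KeywordReader {
    static List<String> readKeywords(Scanner scanner) {
        List<String> keywords = new ArrayList<>();
        while (scanner.hasNextLine()) {
            // Each line may hold any number of fields, so loop over all of them.
            for (String field : scanner.nextLine().split(";")) {
                String keyword = field.trim();
                if (!keyword.isEmpty()) {
                    keywords.add(keyword);
                }
            }
        }
        return keywords;
    }

    public static void main(String[] args) {
        String sample = "SAP;BI; Business Intelligence;\nERP;\nSOA;\nSomethingElse;\n";
        System.out.println(readKeywords(new Scanner(sample)));
    }
}
```

For a real file, new Scanner(new File("keywords.csv")) could replace the in-memory sample. Plain split(";") does not handle quoted fields that contain the delimiter; a CSV library is the safer choice for such input.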

java.util.NoSuchElementException: No line found when line exists in file

I have code that scans lines using the Scanner class and loops until there are no lines left.
My code looks like this:
File file = new File(filePath);
if (file.exists()) {
Scanner s = new Scanner(file);
String tmp = null;
int result = 0;
try {
while (true) {
tmp = s.nextLine();
if (tmp != null && tmp.equals("")) {
result += Integer.parseInt(tmp);
}
System.out.println(runSequence(Integer.parseInt(tokens[0])));
}
} catch (Exception e) {
e.printStackTrace();
}
System.out.println(result);
}
It gives the error at
tmp = s.nextLine();
java.util.NoSuchElementException: No line found
Which is odd because earlier the same code was working fine.
Why is this line giving an error?
Edit:
My mistake, I did not state the question correctly. I deliberately left the try/catch block outside the while loop so that the loop would exit when the lines ran out. My question is why I am not able to read any of the lines: I have about 3-4 lines in the txt file, yet it reads none of them and throws the exception on the very first read.
I think a better way is to put a condition in your while loop using Scanner#hasNextLine(). It makes sure the loop body runs only while the file still has a line to read.
while (s.hasNextLine()) {
tmp = s.nextLine();
if (tmp != null && tmp.equals("")) {
result += Integer.parseInt(tmp);
}
}
if (tmp != null && tmp.equals(""))
should be (if you are trying to check given string is not empty string)
if (tmp != null && !tmp.isEmpty())
I think you have reached the end of the file, where no line remains, and because your condition is while(true) the loop tries to read again. That is why you get NoSuchElementException (no line was found).
So it is better to change your while loop to:
while (s.hasNextLine()){
tmp = s.nextLine();
// then do something
}
while (s.hasNextLine())
{
//...
}
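Combining the hasNextLine() loop with the corrected emptiness check gives something like the sketch below. LineSummer and sumLines are illustrative names, and the code assumes every non-blank line holds a single integer.

```java
import java.util.Scanner;

// Hypothetical helper: sums the integers found one per line, skipping blank lines.
class LineSummer {
    static int sumLines(Scanner s) {
        int result = 0;
        // hasNextLine() is checked before every read, so nextLine() cannot
        // throw NoSuchElementException at the end of the input.
        while (s.hasNextLine()) {
            String tmp = s.nextLine().trim();
            if (!tmp.isEmpty()) {
                result += Integer.parseInt(tmp); // assumes the line is a valid integer
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sumLines(new Scanner("1\n2\n\n3\n"))); // 6
    }
}
```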

no line found exception

Help again, guys: why do I always get this error when using Scanner, even though I'm sure that the file exists?
java.util.NoSuchElementException: No line found
I am trying to count the number of occurrences of a using a for loop. The text file contains lines of sentences. At the same time, I want to print the sentences exactly as they appear.
Scanner scanLine = new Scanner(new FileReader("C:/input.txt"));
while (scanLine.nextLine() != null) {
String textInput = scanLine.nextLine();
char[] stringArray = textInput.toCharArray();
for (char c : stringArray) {
switch (c) {
case 'a':
default:
break;
}
}
}
I'd say the problem is here:
while(scanLine.nextLine() != null) {
String textInput = scanLine.nextLine();
}
Each pass calls nextLine() twice: once in the while condition and once in the loop body, so every other line is discarded. When the condition consumes the last line, the call in the body has nothing left to read and throws NoSuchElementException (nextLine() never returns null at the end of the input). Either change the loop condition to scanLine.hasNextLine() or try another approach to reading files.
Another way of reading the txt file can be like this:
BufferedReader reader = new BufferedReader(new InputStreamReader(new BufferedInputStream(new FileInputStream(new File("text.txt")))));
String line = null;
while ((line = reader.readLine()) != null) {
// do something with your read line
}
reader.close();
or this:
byte[] bytes = Files.readAllBytes(Paths.get("text.txt"));
String text = new String(bytes, StandardCharsets.UTF_8);
You should use scanner.hasNextLine() instead of scanner.nextLine() in the while condition.
Scanner implements the Iterator interface, which works by this pattern:
See if there is a next item (hasNext())
Retrieve the next item (next())
To count the number of occurrences of "a", or of any other string within a string, you can use StringUtils from Apache Commons Lang:
System.out.println(StringUtils.countMatches(textInput,"a"));
I think it will be more efficient than converting the string to character array and then looping over the whole array to find the number of occurrences of "a". Moreover, StringUtils methods are null safe
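If adding a library is not an option, the original goal (print each line unchanged and count the letter) can be sketched with Scanner alone. LetterCounter and countChar are made-up names, and the input here is an in-memory string rather than C:/input.txt.

```java
import java.util.Scanner;

// Hypothetical helper: echoes every line and counts one character across all lines.
class LetterCounter {
    static int countChar(Scanner scanLine, char target) {
        int count = 0;
        while (scanLine.hasNextLine()) {
            // Read each line exactly once per iteration.
            String textInput = scanLine.nextLine();
            System.out.println(textInput); // print the sentence as it appears
            for (char c : textInput.toCharArray()) {
                if (c == target) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int total = countChar(new Scanner("banana\nalpha\n"), 'a');
        System.out.println("Occurrences of a: " + total);
    }
}
```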

JAVA - import CSV to ArrayList

I'm trying to import a CSV file into an ArrayList using StringTokenizer:
public class Test
{
public static void main(String [] args)
{
List<ImportedXls> datalist = new ArrayList<ImportedXls>();
try
{
FileReader fr = new FileReader("c:\\temp.csv");
BufferedReader br = new BufferedReader(fr);
String stringRead = br.readLine();
while( stringRead != null )
{
StringTokenizer st = new StringTokenizer(stringRead, ",");
String docNumber = st.nextToken( );
String note = st.nextToken( ); /** PROBLEM */
String index = st.nextToken( ); /** PROBLEM */
ImportedXls temp = new ImportedXls(docNumber, note, index);
datalist.add(temp);
// read the next line
stringRead = br.readLine();
}
br.close( );
}
catch(IOException ioe){...}
for (ImportedXls item : datalist) {
System.out.println(item.getDocNumber());
}
}
}
I don't understand how nextToken works: if I initialize all three variables (docNumber, note and index) with nextToken(), it fails with:
Exception in thread "main" java.util.NoSuchElementException
at java.util.StringTokenizer.nextToken(Unknown Source)
at _test.Test.main(Test.java:32)
If I keep docNumber only, it works. Could you help me?
It seems that some rows of your input file have fewer than 3 comma-separated fields. You should always check whether the tokenizer has more tokens (StringTokenizer.hasMoreTokens()) unless you are 100% sure your input is correct.
Parsing CSV files correctly is not a trivial task. Why not use a library that does it well, such as http://opencsv.sourceforge.net/ ?
Seems like your code is getting to a line that the Tokenizer is only breaking up into 1 part instead of 3. Is it possible to have lines with missing data? If so, you need to handle this.
Most probably your input file doesn't contain another element delimited by , in at least one line. Please show us your input - if possible the line that fails.
However, you don't need to use StringTokenizer. Using String#split() might be easier:
...
while( stringRead != null )
{
String[] elements = stringRead.split(",");
if(elements.length < 3) {
throw new RuntimeException("line too short"); //handle missing entries
}
String docNumber = elements[0];
String note = elements[1];
String index = elements[2];
ImportedXls temp = new ImportedXls(docNumber, note, index);
datalist.add(temp);
// read the next line
stringRead = br.readLine();
}
...
You should be able to check your tokens using the hasMoreTokens() method. If it returns false, the line you've read may not contain anything (i.e., an empty string).
It would be better, though, to use String.split(): the StringTokenizer documentation calls it a legacy class whose use is discouraged in new code, although it has not been formally deprecated.
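One way to act on this advice is sketched below. It uses split(",", -1) so that empty fields, including trailing ones, keep their positions, and it skips rows with fewer than three fields instead of throwing. CsvRows and parse are hypothetical names, and plain String[] rows stand in for the ImportedXls class, whose definition is not shown in the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser that keeps rows with at least three fields.
class CsvRows {
    static List<String[]> parse(Reader source) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String stringRead;
            while ((stringRead = br.readLine()) != null) {
                // Limit -1 keeps empty fields, so "4,n," still has three elements.
                String[] elements = stringRead.split(",", -1);
                if (elements.length < 3) {
                    continue; // blank or short line: skipped here, could also be logged
                }
                rows.add(elements);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        String csv = "1,note,idx\n2,,idx2\n3\n\n4,n,\n";
        for (String[] row : parse(new StringReader(csv))) {
            System.out.println(row[0] + " / " + row[1] + " / " + row[2]);
        }
    }
}
```

StringTokenizer would return only two tokens for the row 2,,idx2, which is one way the third nextToken() call can fail even when a line looks complete.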
