Scanner reading only half the no. of lines in a file - java

I am trying to read a file using Scanner Object with the following code -
public void read(){
Scanner scanner = new Scanner(dataFile).useDelimiter("\n");
String line;
int i = 0;
while(scanner.hasNext()){
line = scanner.next();
i++;
}
System.out.println(i);
}
The file I am trying to read has 117000 lines, of which the Scanner reads only the first 59550 or so. It does not throw any exception and simply returns.
When I change the implementation to use a BufferedReader it reads all 117000 lines -
public void read(){
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(dataFile)));
String line;
int i=0;
while((line = br.readLine())!= null){
i++;
}
System.out.println(i);
}
Can anyone explain why scanner doesn't read all lines ?

One possible reason is that Scanner's initial buffer (1024 characters) is smaller than BufferedReader's default buffer (8192 characters).

The following program works for me:
Scanner scanner = new Scanner(dataFile);
String line;
int i = 0;
while(scanner.hasNextLine()){
line = scanner.nextLine();
// System.out.println(line); // remove comment for debug
i++;
}
System.out.println(i);
scanner.close();
The changes from the original program are:
Changed hasNext() and next() to hasNextLine() and nextLine(). In this case the default delimiter is fine
Fixed a typo - system.out.println should be System.out.println
Added a comment to print line (and check if the delimiter is OK)
Added scanner.close()

It's probably something to do with the line ending, delimiter used by Scanner.
You should use the methods :
hasNextLine() and nextLine()

Can anyone explain why scanner doesn't read all lines ?
br.readLine() also treats a lone \r as a line terminator (in addition to \n and \r\n). That is one important difference from your Scanner, which, with the delimiter set to "\n", splits only on \n.
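A minimal sketch of that difference, run on an in-memory string rather than the asker's file. The class and method names are made up for illustration. It counts the same text three ways: with the delimiter fixed to "\n", with hasNextLine()/nextLine(), and with BufferedReader.readLine().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;

// Hypothetical demo class, not part of the original question.
final class LineTerminatorDemo {

    // Counts tokens the way the question does: only "\n" separates them.
    static int countWithNewlineDelimiter(String text) {
        int count = 0;
        try (Scanner scanner = new Scanner(text).useDelimiter("\n")) {
            while (scanner.hasNext()) {
                scanner.next();
                count++;
            }
        }
        return count;
    }

    // Counts lines using Scanner's built-in line handling.
    static int countWithNextLine(String text) {
        int count = 0;
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                scanner.nextLine();
                count++;
            }
        }
        return count;
    }

    // Counts lines with BufferedReader.readLine(), which accepts \n, \r and \r\n.
    static int countWithReadLine(String text) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            while (reader.readLine() != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Mixed terminators: four logical lines, but only three "\n" characters.
        String text = "a\rb\nc\r\nd\n";
        System.out.println(countWithNewlineDelimiter(text)); // 3
        System.out.println(countWithNextLine(text));         // 4
        System.out.println(countWithReadLine(text));         // 4
    }
}
```

If the real file contains lines that end in a bare \r, the "\n" delimiter would merge them into their neighbours in the same way, which is one way to end up with a lower count.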

Related

Counting the number of items in a file using java

I have a text file:
2|BATH BENCH|19.00
20|ORANGE BELL|1.42
04|BOILER ONION|1.78
I need to get the number of items which is 3 here using JAVA. This is my code:
int Flag=0;
File file = new File("/Users/a0r01ox/Documents/costl-tablet-automation/src/ItemUPC/ItemUPC.txt");
Scanner sc = new Scanner(file);
while (sc.hasNextLine()) {
Flag=Flag+1;
}
It is going in an infinite loop.
Can someone please help? Thank you.
You must get the next line to avoid an endless loop.
int Flag = 0;
File file = new File("/Users/a0r01ox/Documents/costl-tablet-automation/src/ItemUPC/ItemUPC.txt");
Scanner sc = new Scanner(file);
while (sc.hasNextLine()) {
sc.nextLine();
Flag++;
}
while (sc.hasNextLine()) {
Flag=Flag+1;
String line = sc.nextLine(); //Do whatever with line
}
In the code you have written
int Flag=0;
File file = new File("/Users/a0r01ox/Documents/costl-tablet-automation/src/ItemUPC/ItemUPC.txt");
Scanner sc = new Scanner(file);
while (sc.hasNextLine()) { // this line is just checking whether there is next line or not.
Flag=Flag+1;
}
When you write while (sc.hasNextLine()){} it only checks whether there is a next line; it does not consume anything.
e.g. line 1: abcdefg
line 2: hijklmnop
Here your code stays on line 1 forever and keeps telling you that yes, there is a next line.
Whereas when you write
while(sc.hasNextLine()){
sc.nextLine();
Flag++;
}
The Scanner reads line 1, and each call to sc.nextLine() moves it on to the following line. Once line 2 has been read, sc.hasNextLine() returns false and the loop ends.
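The same idea as a small self-contained sketch. It reads from a string holding the three sample items instead of the file on disk; CountLinesDemo and countLines are names made up for the example.

```java
import java.util.Scanner;

// Hypothetical demo class; the original code reads from a File instead.
final class CountLinesDemo {

    // Counts lines: hasNextLine() only peeks, nextLine() actually consumes.
    static int countLines(String text) {
        int count = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextLine()) {
                sc.nextLine(); // without this call the loop never advances
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String items = "2|BATH BENCH|19.00\n"
                + "20|ORANGE BELL|1.42\n"
                + "04|BOILER ONION|1.78\n";
        System.out.println(countLines(items)); // 3
    }
}
```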

Java Scanner does not ignore new lines (\n)

I know that by default, the Scanner skips over whitespaces and newlines.
There is something wrong with my code because my Scanner does not ignore "\n".
For example: the input is "this is\na test." and the desired output is "this is a test."
This is what I did so far:
Scanner scan = new Scanner(System.in);
String token = scan.nextLine();
String[] output = token.split("\\s+");
for (int i = 0; i < output.length; i++) {
if (hashmap.containsKey(output[i])) {
output[i] = hashmap.get(output[i]);
}
System.out.print(output[i]);
if (i != output.length - 1) {
System.out.print(" ");
}
}
nextLine() ignores the specified delimiter (as optionally set by useDelimiter()), and reads to the end of the current line.
Since input is two lines:
this is
a test.
only the first line (this is) is returned.
You then split that on whitespace, so output will contain [this, is].
Since you never use the scanner again, the second line (a test.) will never be read.
In essence, your title is right on point: Java Scanner does not ignore new lines (\n)
It specifically processed the newline when you called nextLine().
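One way to act on this is to keep calling nextLine() until the input is exhausted and join the words afterwards. The sketch below assumes the input arrives as a string so that it runs on its own; JoinLinesDemo and joinLines are illustrative names, and the hashmap lookup from the question is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class; swap the String for System.in in a real program.
final class JoinLinesDemo {

    // Reads every line, splits it on whitespace and joins all words with one space.
    static String joinLines(String input) {
        List<String> words = new ArrayList<>();
        try (Scanner scan = new Scanner(input)) {
            while (scan.hasNextLine()) {
                String line = scan.nextLine().trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines so they add no empty words
                }
                for (String word : line.split("\\s+")) {
                    words.add(word);
                }
            }
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        System.out.println(joinLines("this is\na test.")); // this is a test.
    }
}
```

With System.in the loop only ends at end of input (Ctrl+D on Unix-like terminals, Ctrl+Z then Enter on Windows), so this fits piped or redirected input best.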
You don't have to use a Scanner to do this
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
String result = in.lines().collect(Collectors.joining(" "));
Or if you really want to use a Scanner this should also work
Scanner scanner = new Scanner(System.in);
Spliterator<String> si = Spliterators.spliteratorUnknownSize(scanner, Spliterator.ORDERED);
String result = StreamSupport.stream(si, false).collect(Collectors.joining(" "));

How to determine the end of a line with a Scanner?

I have a scanner in my program that reads in parts of the file and formats them for HTML. When I am reading my file, I need to know how to make the scanner know that it is at the end of a line and start writing to the next line.
Here is the relevant part of my code, let me know if I left anything out :
//scanner object to read the input file
Scanner sc = new Scanner(file);
//filewriter object for writing to the output file
FileWriter fWrite = new FileWriter(outFile);
//Reads in the input file 1 word at a time and decides how to
////add it to the output file
while (sc.hasNext() == true)
{
String tempString = sc.next();
if (colorMap.containsKey(tempString) == true)
{
String word = tempString;
String color = colorMap.get(word);
String codeOut = colorize(word, color);
fWrite.write(codeOut + " ");
}
else
{
fWrite.write(tempString + " ");
}
}
//closes the files
reader.close();
fWrite.close();
sc.close();
I found out about sc.nextLine(), but I still don't know how to determine when I am at the end of a line.
If you want to use only Scanner, read each line with nextLine() into a temporary string and create a second Scanner over that string. The second Scanner sees only that one line, so its hasNext() returns false at the end of the line instead of running on into the next one (that is not really a false positive, since skipping line breaks is what hasNext() is meant to do, but in your situation it behaves like one). Keep calling nextLine() on the first Scanner, and for each new line replace the temporary string and the second Scanner.
Lines are usually delimited by \n or \r (or both, as \r\n), so you could check for those characters yourself, though I'm not sure why you would want to, since nextLine() already reads a whole line.
There is Scanner.hasNextLine() if you are worried about hasNext() not working for your specific case (not sure why it wouldn't though).
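A rough sketch of the two-Scanner idea, assuming the input is available as a string. LineAwareWords and normalize are names invented for the example, and the colorMap lookup from the question is omitted. The outer Scanner hands over one line at a time, and the inner Scanner walks the words of just that line.

```java
import java.util.Scanner;

// Hypothetical demo class illustrating the "second Scanner per line" approach.
final class LineAwareWords {

    // Re-emits each line's words separated by single spaces, keeping the line breaks.
    static String normalize(String input) {
        StringBuilder out = new StringBuilder();
        try (Scanner lines = new Scanner(input)) {
            while (lines.hasNextLine()) {
                String currentLine = lines.nextLine();
                // The inner scanner sees only this line, so hasNext()
                // turns false exactly at the end of the line.
                try (Scanner words = new Scanner(currentLine)) {
                    boolean first = true;
                    while (words.hasNext()) {
                        if (!first) {
                            out.append(' ');
                        }
                        out.append(words.next());
                        first = false;
                    }
                }
                out.append('\n'); // end of the current input line
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(normalize("int  x\n\nreturn x;"));
    }
}
```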
You can use the method hasNextLine() to iterate over the file line by line instead of word by word, then split each line on whitespace and process the words.
Here is the same code using hasNextLine() and split():
//scanner object to read the input file
Scanner sc = new Scanner(file);
//filewriter object for writing to the output file
FileWriter fWrite = new FileWriter(outFile);
//get the line separator for the current platform
String newLine = System.getProperty("line.separator");
//Reads the input file one line at a time, splits it into words
//and decides how to add each word to the output file
while (sc.hasNextLine())
{
// split the line on runs of whitespace [ \t\n\x0B\f\r]
String[] words = sc.nextLine().split("\\s+");
for(String word : words)
{
if (colorMap.containsKey(word))
{
String color = colorMap.get(word);
String codeOut = colorize(word, color);
fWrite.write(codeOut + " ");
}
else
{
fWrite.write(word + " ");
}
}
fWrite.write(newLine);
}
//closes the files
reader.close();
fWrite.close();
sc.close();
Wow I've been using java for 10 years and have never heard of scanner!
It appears to use white space delimiters by default so you can't tell when an end of line occurs.
Looks like you can change the delimiters of the scanner - see the example at Scanner Class:
String input = "1 fish 2 fish red fish blue fish";
Scanner s = new Scanner(input).useDelimiter("\\s*fish\\s*");
System.out.println(s.nextInt());
System.out.println(s.nextInt());
System.out.println(s.next());
System.out.println(s.next());
s.close();

Conflicting character counts

I'm trying to find the number of characters in a given text file.
I've tried using both a scanner and a BufferedReader, but I get conflicting results. With the use of a scanner I concatenate every line after I append a new line character. E.g. like this:
FileReader reader = new FileReader("sampleFile.txt");
Scanner lineScanner = new Scanner(reader);
String totalLines = "";
while (lineScanner.hasNextLine()){
String line = lineScanner.nextLine()+'\n';
totalLines += line;
}
System.out.println("Count "+totalLines.length());
This returns the true character count for my file, which is 5799
Whereas when I use:
BufferedReader reader = new BufferedReader(new FileReader("sample.txt"));
int i;
int count = 0;
while ((i = reader.read()) != -1) {
count++;
}
System.out.println("Count "+count);
I get 5892.
I know using the lineScanner will be off by one if there is only one line, but for my text file I get the correct output.
Also, in Notepad++ the file length in bytes is 5892, but the character count without blanks is 5706.
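Your file may have lines terminated with \r\n rather than \n. That could cause your discrepancy: nextLine() strips the whole terminator and your code adds back only a single '\n', so every \r\n line is counted one character short. The difference of 93 (5892 - 5799) would then be the number of such lines.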
You have to consider the newline and carriage-return characters in a text file. These also count as characters.
I would suggest using the BufferedReader, since read() counts every character, including the line terminators.
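To see the effect in isolation, here is a small sketch that counts the characters of one in-memory string both ways. CharCountDemo and its method names are invented for the example, and a StringReader stands in for the FileReader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Scanner;

// Hypothetical demo class comparing the two counting strategies from the question.
final class CharCountDemo {

    // Scanner approach: nextLine() drops the terminator, then one '\n' is added back.
    static int countViaLines(String text) {
        int total = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextLine()) {
                total += sc.nextLine().length() + 1;
            }
        }
        return total;
    }

    // Reader approach: every character is counted, including '\r' and '\n'.
    static int countViaRead(String text) {
        int count = 0;
        try (Reader reader = new StringReader(text)) {
            while (reader.read() != -1) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "ab\r\ncd\r\n"; // two lines with Windows line endings
        System.out.println(countViaLines(text)); // 6
        System.out.println(countViaRead(text));  // 8
    }
}
```

The two results differ by exactly the number of \r\n lines, which matches the pattern described above.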

Java parsing text file and preserving line breaks?

I have been researching how to do this and becoming a bit confused, I have tried so far with Scanner but that does not seem to preserve line breaks and I can't figure out how to make it determine if a line is a line break. I would appreciate if anyone has any advice. I have been using the Scanner class as below but am not sure how to even check if the line is a new line. Thanks
for (String fileName : f.list()) {
fileCount++;
Scanner sc = new Scanner(new File(f, fileName));
int count = 0;
String outputFileText = "";
//System.out.println(fileCount);
String text="";
while (sc.hasNext()) {
String line = sc.nextLine();
}
}
If you're just trying to read the file, I would suggest using LineNumberReader instead.
LineNumberReader lnr = new LineNumberReader(new FileReader(f));
String line;
// readLine() returns null at end of file; an empty line comes back as ""
while ((line = lnr.readLine()) != null) {
/* do stuff */
}
Java's Scanner class already splits it into lines for you, even if the line is an empty String. You just have to scan through the lines again to get your values:
Scanner lineScanner;
while(sc.hasNextLine())
{
String nextInputLine = sc.nextLine();
lineScanner = new Scanner(nextInputLine);
while(lineScanner.hasNext())
{
//read the values
}
}
You probably want to use BufferedReader#readLine.
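A short sketch of how that could look, assuming the text is already in a string so that the example runs without a file. PreserveBlankLines and readAllLines are illustrative names. readLine() returns an empty string for a blank line and null only at the end of the input, so blank lines are easy to detect.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; wrap a FileReader instead of a StringReader for real files.
final class PreserveBlankLines {

    // Collects every line, keeping blank lines as empty strings.
    static List<String> readAllLines(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line); // line.isEmpty() marks a blank line
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = readAllLines("first\n\nthird\n");
        System.out.println(lines.size());           // 3
        System.out.println(lines.get(1).isEmpty()); // true
    }
}
```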
