Scanner class skips over whitespace - java

I am using a nested Scanner loop to extract the digit from a string line (from a text file) as follows:
String str = testString;
Scanner scanner = new Scanner(str);
while (scanner.hasNext()) {
String token = scanner.next();
// Here each token will be used
}
The problem is that this code skips all the spaces (" "), but I need those spaces too. Can Scanner return the spaces, or do I need to use something else?
My text file could contain something like this:
0
011
abc
d2d
sdwq
sda
Those blank lines contain one " " each, and those " " are what I need returned.

Use Scanner's hasNextLine() and nextLine() methods. They read whole lines, so empty and whitespace-only lines are returned instead of being skipped.

By default, a scanner uses white space to separate tokens.
Use the Scanner#nextLine method, which advances this scanner past the current line and returns the input that was skipped. It returns the rest of the current line, excluding any line separator at the end, and the position is set to the beginning of the next line.
To use a different token separator, invoke useDelimiter(), specifying
a regular expression. For example, suppose you wanted the token
separator to be a comma, optionally followed by white space. You would
invoke,
scanner.useDelimiter(",\\s*");
Read more from http://docs.oracle.com/javase/tutorial/essential/io/scanning.html
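As a minimal sketch of the comma delimiter from the tutorial quote above (the class name CommaDelimiterDemo and the sample input are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class CommaDelimiterDemo {
    // Collects every token produced with a comma-plus-optional-whitespace delimiter.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(",\\s*");
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The space inside "d e" is kept because a space alone is no longer a delimiter.
        System.out.println(tokens("a, b,c,   d e")); // prints [a, b, c, d e]
    }
}
```

Once the delimiter is replaced, plain whitespace that does not follow a comma stays inside the token.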

You have to understand what a token is. From the Scanner documentation:
A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace.
You could use the nextLine() method to get the whole line without dropping any whitespace.
Better still, you could define what a token is with the useDelimiter method.

This will work for you
Scanner scanner = new Scanner(new File("D:\\sample.txt"));
while (scanner.hasNextLine()) {
String token = scanner.nextLine();
System.out.println(token);
}
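The same loop can be tried on an in-memory string, which makes it easy to see that whitespace-only lines survive. This is only a sketch: the class name WhitespaceLinesDemo and the sample text are assumptions, not part of the original question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class WhitespaceLinesDemo {
    // Reads the text line by line; nextLine() does not tokenize, so spaces are kept.
    static List<String> lines(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                out.add(scanner.nextLine());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Brackets make the single-space lines visible in the output.
        for (String line : lines("0\n \n011\nabc\n \nd2d")) {
            System.out.println("[" + line + "]");
        }
    }
}
```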

To use a more functional approach (Java 9 or later, where Scanner#tokens() is available), you could use something like this:
String fileContent = new Scanner(new File("D:\\sample.txt"))
.useDelimiter("")
.tokens()
.reduce("", String::concat);

Related

Multiple input.next() inside of a while(input.hasNext()) triggers NoSuchElementException in Java

I have a text file
Jim,A,94
Pam,B-4,120
Michael,CC,3
I want to save each token delimited by a comma to a string variable. I have this code:
File f = new File(filename);
Scanner input = new Scanner(f);
input.useDelimiter(",");
while (input.hasNext())
{
String s1 = input.next();
String s2 = input.next();
String s3 = input.next();
}
input.close();
But this keeps triggering NoSuchElementException. What am I doing wrong here?
This is a subtle quirk of how Scanner#next() works. That method looks for a "token", which is everything between occurrences of the delimiter pattern. The default delimiter is "\\p{javaWhitespace}+", which includes \n and \r.
You changed the delimiter to a comma, and ONLY the comma. To your Scanner the input looks like this:
Jim,A,94\nPam,B-4,120\nMichael,CC,3
The tokens as seen by your scanner are:
Jim
A
94\nPam
B-4
120\nMichael
CC
3
That's only 7 tokens, so you hit the end of input before satisfying all the next() invocations.
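The simplest fix is to set the delimiter to "[," + System.lineSeparator() + "]", so that line breaks separate tokens as well. (Where the line separator is \r\n, that character class treats \r and \n as two separate delimiters, so an empty token can appear between them; a pattern such as ",|\\R" matches the whole line break at once.)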
However, Scanner is the cause of endless errors and frustration for new developers, and should be deprecated and banished (that won't happen because of the existing code base). What you should be doing is reading entire lines and then parsing them yourself.
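The token count described above can be checked with a small sketch. The class name DelimiterTokensDemo is hypothetical, and the second delimiter, "[,\\r\\n]+", is a swapped-in variant of the suggested fix that matches runs of commas and line-break characters (so it would also collapse empty fields).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DelimiterTokensDemo {
    // Returns all tokens the Scanner sees for the given delimiter regex.
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(delimiterRegex);
            while (scanner.hasNext()) {
                out.add(scanner.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String data = "Jim,A,94\nPam,B-4,120\nMichael,CC,3";
        // Comma only: the newline stays inside tokens such as "94\nPam".
        System.out.println(tokens(data, ",").size());          // prints 7
        // Comma or line break: each field becomes its own token.
        System.out.println(tokens(data, "[,\\r\\n]+").size()); // prints 9
    }
}
```

With 7 tokens, the third pass through the loop runs out of input after a single next() call, which is where the NoSuchElementException comes from.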
You have a custom delimiter, the comma. So your input string is delimited by , and everything before and after that character constitutes a token returned by .next().
The \n escape character, which is present in your input as a newline, is also a character:
\n (linefeed LF, Unicode \u000a)
and it's being read as part of your token.
Here is another way to achieve what you are trying to do:
File f = new File(filename);
Scanner input = new Scanner(f);
while (input.hasNextLine()) {
String[] sarray = input.nextLine().split(",");
for (String s : sarray ) {
System.out.println(s);
}
}
Of course, that can be improved, but basically I suggest that you use the Scanner to read the file line by line and then you split by comma delimiter.
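One possible shape for that line-by-line approach is sketched below. It assumes plain comma-separated fields with no quoting or embedded commas; the class name CsvLinesDemo and the parse helper are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class CsvLinesDemo {
    // Reads each line with nextLine() and splits it on commas into one row.
    static List<String[]> parse(String text) {
        List<String[]> rows = new ArrayList<>();
        try (Scanner input = new Scanner(text)) {
            while (input.hasNextLine()) {
                String line = input.nextLine();
                if (line.isEmpty()) {
                    continue; // skip blank lines instead of creating a one-field row
                }
                rows.add(line.split(","));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : parse("Jim,A,94\nPam,B-4,120\nMichael,CC,3")) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Because each row is split on its own, a short or malformed line affects only that row rather than shifting every later field.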

How do I split an input string in Java?

The user enters a string in Java, and I have to split it into different components.
Scanner scanner = new Scanner(System.in);
String test = scanner.next();
// split the test variable using the split method
String [] parts = test.split(" ,", 3);
s[i].setFirstName(parts[0].trim());
s[i].setlastName(parts[1].trim());
s[i].setID(Integer.parseInt(parts[2].trim()));
s[i].setgrade(Integer.parseInt(parts[3].trim()));
but it's not working. I can only get the first word to show up.
Regarding your comment:
I can get only one word to show up. it doesn't read any proceeding words.
Use nextLine() instead of next().
next() returns only the next token, that is, what comes before the next whitespace.
nextLine() automatically moves the scanner down after returning the current line.
name = scanner.nextLine();
Scanner doc
Using nextLine() rather than next() should fix the issue.
For further reference, take a look at the docs.
Change
String test = scanner.next();
to
String test = scanner.nextLine();
scanner.next() reads a word up to the next blank space. nextLine() reads the whole line.
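The difference between the two calls can be seen in a short sketch. The input string and the class name NextVsNextLineDemo are assumptions made for illustration; the asker's real input format is not shown.

```java
import java.util.Scanner;

public class NextVsNextLineDemo {
    // next() stops at the first whitespace with the default delimiter.
    static String firstToken(String input) {
        try (Scanner scanner = new Scanner(input)) {
            return scanner.next();
        }
    }

    // nextLine() returns everything up to the line break.
    static String wholeLine(String input) {
        try (Scanner scanner = new Scanner(input)) {
            return scanner.nextLine();
        }
    }

    public static void main(String[] args) {
        String input = "Jane Doe, 17, 90";
        System.out.println(firstToken(input)); // prints Jane
        System.out.println(wholeLine(input));  // prints Jane Doe, 17, 90
    }
}
```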

How to split a string with space being the delimiter using Scanner

I am trying to split the input sentence based on space between the words. It is not working as expected.
public static void main(String[] args) {
Scanner scaninput=new Scanner(System.in);
String inputSentence = scaninput.next();
String[] result=inputSentence.split("-");
// for(String iter:result) {
// System.out.println("iter:"+iter);
// }
System.out.println("result.length: "+result.length);
for (int count=0;count<result.length;count++) {
System.out.println("==");
System.out.println(result[count]);
}
}
It gives the output below when I use "-" in split:
fsfdsfsd-second-third
result.length: 3
==
fsfdsfsd
==
second
==
third
When I replace "-" with space " ", it gives the below output.
first second third
result.length: 1
==
first
Any suggestions as to what is the problem here? I have already referred to the stackoverflow post How to split a String by space, but it does not work.
Using split("\\s+") gives this output:
first second third
result.length: 1
==
first
Change
scanner.next()
To
scanner.nextLine()
From the javadoc
A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace.
Calling next() returns the next word.
Calling nextLine() returns the next line.
The next() method of Scanner already splits the string on whitespace; that is, it returns the next token, the string up to the next whitespace. So, if you add an appropriate println, you will see that inputSentence is equal to the first word, not the entire string.
Replace scaninput.next() with scaninput.nextLine().
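A minimal sketch of that change, reading the full line first and only then splitting on whitespace. The class name SplitWordsDemo and the words helper are hypothetical; trim() is added so leading spaces do not produce an empty first element.

```java
import java.util.Scanner;

public class SplitWordsDemo {
    // Reads one full line, then splits it on runs of whitespace.
    static String[] words(String input) {
        try (Scanner scaninput = new Scanner(input)) {
            String inputSentence = scaninput.nextLine();
            return inputSentence.trim().split("\\s+");
        }
    }

    public static void main(String[] args) {
        String[] result = words("first second   third");
        System.out.println("result.length: " + result.length); // prints result.length: 3
        for (String word : result) {
            System.out.println(word);
        }
    }
}
```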
The problem is that scaninput.next() will only read until the first whitespace character, so it's only pulling in the word first. So the split afterward accomplishes nothing.
Instead of using Scanner, I suggest using java.io.BufferedReader, which will let you read an entire line at once.
Another alternative is the BufferedReader class, which works well.
String inputSentence;
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
// readLine() returns the whole line and can throw IOException
inputSentence=br.readLine();
String[] result=inputSentence.split("\\s+");
System.out.println("result.length: "+result.length);
for(int count=0;count<result.length;count++)
{
System.out.println("==");
System.out.println(result[count]);
}

Delimiter is not working for Scanner

The user will enter a=(number here). I then want it to cut off the a= and retain the number. It works when I use s.next() but of course it makes me enter it two times which I don't want. With s.nextLine() I enter it once and the delimiter does not work. Why is this?
Scanner s = new Scanner(System.in);
s.useDelimiter("a=");
String n = s.nextLine();
System.out.println(n);
Because nextLine() doesn't care about delimiters. The delimiters only affect Scanner when you tell it to return tokens. nextLine() just returns whatever is left on the current line without caring about tokens.
A delimiter is not the way to go here; the purpose of delimiters is to tell the Scanner what can come between tokens, but you're trying to use it for a purpose it wasn't intended for. Instead:
String n = s.nextLine().replaceFirst("^a=","");
This inputs a line, then strips off a= if it appears at the beginning of the string (i.e. it replaces it with the empty string ""). replaceFirst takes a regular expression, and ^ means that it only matches if the a= is at the beginning of the string. This won't check to make sure the user actually entered a=; if you want to check this, your code will need to be a bit more complex, but the key thing here is that you want to use s.nextLine() to return a String, and then do whatever checking and manipulation you need on that String.
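If the input should also be validated, one possible sketch is below. The pattern "^a=(-?\\d{1,9})$", the class name StripPrefixDemo and both helpers are assumptions for illustration, not something the original answer specifies.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StripPrefixDemo {
    // Assumed format: "a=" followed by an optional minus sign and 1 to 9 digits.
    private static final Pattern ASSIGNMENT = Pattern.compile("^a=(-?\\d{1,9})$");

    // Same idea as the answer: drop a leading "a=" if present, no validation.
    static String strip(String line) {
        return line.replaceFirst("^a=", "");
    }

    // Stricter variant: returns the number only when the whole line matches.
    static OptionalInt parse(String line) {
        Matcher m = ASSIGNMENT.matcher(line.trim());
        if (!m.matches()) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(Integer.parseInt(m.group(1)));
    }

    public static void main(String[] args) {
        System.out.println(strip("a=42"));             // prints 42
        System.out.println(parse("a=42").getAsInt());  // prints 42
        System.out.println(parse("b=42").isPresent()); // prints false
    }
}
```

The digit limit keeps Integer.parseInt from overflowing; a different numeric type would need a different pattern.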
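Try StringTokenizer if Scanner#useDelimiter() is not suitable for your case. Note that StringTokenizer treats each character of "a=" as a separate delimiter, not the two-character string as a whole.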
Scanner s = new Scanner(System.in);
String n = s.nextLine();
StringTokenizer tokenizer = new StringTokenizer(n, "a=");
while (tokenizer.hasMoreTokens()) {
System.out.println(tokenizer.nextToken());
}
or try the String#split() method (which also returns a leading empty string here, because the input starts with the delimiter):
for (String str : n.split("a=")) {
System.out.println(str);
}
input:
a=123a=546a=78a=9
output:
123
546
78
9
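The two approaches are not interchangeable, which a short comparison makes visible. This is a sketch with a hypothetical class name, TokenizerVsSplitDemo, and a made-up second input that contains a stray "a".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerVsSplitDemo {
    // StringTokenizer: every character in "a=" is a delimiter on its own.
    static List<String> tokenize(String n) {
        List<String> out = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(n, "a=");
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken());
        }
        return out;
    }

    // String#split: "a=" is a regex matching the two-character sequence.
    static List<String> split(String n) {
        return Arrays.asList(n.split("a="));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a=123a=546a=78a=9")); // prints [123, 546, 78, 9]
        System.out.println(split("a=123a=546a=78a=9"));    // prints [, 123, 546, 78, 9]
        System.out.println(tokenize("a=1a2"));             // prints [1, 2]
        System.out.println(split("a=1a2"));                // prints [, 1a2]
    }
}
```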

Java Scanner with regex delimiter

Why does the following code return false?
Scanner sc = new Scanner("-v ");
sc.useDelimiter("-[a-zA-Z]\\s+");
System.out.println(sc.hasNext());
The weird thing is -[a-zA-Z]//s+ will return true.
I also can't understand why this returns true:
Scanner sc = new Scanner(" -v");
sc.useDelimiter("-[a-zA-Z]\\s+");
System.out.println(sc.hasNext());
A scanner is used to break up a string into tokens. Delimiters are the separators between tokens; the scanner discards them rather than returning them. You're telling the scanner that -[a-zA-Z]\\s+ is a delimiter, and since "-v " matches that regex in full, the scanner skips it and no token is left.
If you're looking for a string that matches the regex, use String.matches().
If your goal really is to split a string into tokens then you might also consider String.split(), which is sometimes more convenient to use.
Thanks John Kugelman, I think you're right.
Scanner can use a custom delimiter to split input into tokens.
The default delimiter is whitespace.
When the delimiter doesn't match anything in the input, the whole input becomes a single token:
Scanner sc = new Scanner("-v");
sc.useDelimiter( "-[a-zA-Z]\\s+");
if(sc.hasNext())
System.out.println(sc.next());
In the code above, the delimiter doesn't match anything, so the whole input "-v" is the single token.
hasNext() reports whether there is a next token.
Scanner sc = new Scanner( "-v ");
sc.useDelimiter( "-[a-zA-Z]\\s+");
if(sc.hasNext())
System.out.println(sc.next());
Here the whole input matches the delimiter, so the split produces zero tokens and hasNext() is false.
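The cases discussed in this thread can be put side by side in one sketch. The class name RegexDelimiterDemo and the extra input "-v file" are assumptions added for illustration.

```java
import java.util.Scanner;

public class RegexDelimiterDemo {
    private static final String DELIMITER = "-[a-zA-Z]\\s+";

    // True when at least one token remains after delimiters are skipped.
    static boolean hasToken(String input) {
        try (Scanner sc = new Scanner(input)) {
            sc.useDelimiter(DELIMITER);
            return sc.hasNext();
        }
    }

    // Returns the first token; callers should check hasToken first.
    static String firstToken(String input) {
        try (Scanner sc = new Scanner(input)) {
            sc.useDelimiter(DELIMITER);
            return sc.next();
        }
    }

    public static void main(String[] args) {
        System.out.println(hasToken("-v "));       // prints false: the input is all delimiter
        System.out.println(hasToken(" -v"));       // prints true: no whitespace after "-v", so no match
        System.out.println(firstToken("-v"));      // prints -v
        System.out.println(firstToken("-v file")); // prints file
    }
}
```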
