Reading text with Java Scanner next(Pattern pattern) - java

I am trying to use the Scanner class to read a line using the next(Pattern pattern) method to capture the text before the colon and then after the colon so that s1 = textbeforecolon and s2 = textaftercolon.
The line looks like this:
something:somethingelse

There are two ways of doing this, depending on specifically what you want.
If you want to split the entire input by colons, then you can use the useDelimiter() method, like others have pointed out:
// You could also say "scanner.useDelimiter(Pattern.compile(":"))", but
// that's the exact same thing as saying "scanner.useDelimiter(":")".
scanner.useDelimiter(":");
// Examines each token one at a time
while (scanner.hasNext())
{
String token = scanner.next();
// Do something with token here...
}
If you want to split each line by a colon, then it would be much easier to use String's split() method:
while (scanner.hasNextLine())
{
String[] parts = scanner.nextLine().split(":");
// The parts array now contains ["something", "somethingelse"]
}
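For the s1/s2 case in the question, a minimal sketch of the line-based approach could look like the following. The class name ColonSplitDemo, the helper splitAtColon and the String input are made up for illustration; the limit of 2 passed to split() is an assumption that any further colons should stay in the second part.

```java
import java.util.Scanner;

class ColonSplitDemo {
    // Splits one line at the first colon and returns {before, after}.
    // The limit of 2 keeps any later colons inside the second element.
    static String[] splitAtColon(String line) {
        return line.split(":", 2);
    }

    public static void main(String[] args) {
        // A String source stands in for the real input (hypothetical sample data).
        Scanner scanner = new Scanner("something:somethingelse\nkey:value:more");
        while (scanner.hasNextLine()) {
            String[] parts = splitAtColon(scanner.nextLine());
            String s1 = parts[0];
            // If the line has no colon, there is no second part.
            String s2 = parts.length > 1 ? parts[1] : "";
            System.out.println(s1 + " | " + s2);
        }
        scanner.close();
    }
}
```

Without the limit argument, "key:value:more" would be split into three parts instead of two.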

I've never used Pattern with Scanner.
I've always just changed the delimiter with a string.
http://java.sun.com/j2se/1.5.0/docs/api/java/util/Scanner.html#useDelimiter(java.lang.String)

File file = new File("someFileWithLinesContainingYourExampleText.txt");
Scanner s = new Scanner(file);
// Treat colons and line breaks alike as delimiters,
// so each token is the text before or after a colon
s.useDelimiter("[:\\r\\n]+");
while (s.hasNext()) {
String text = s.next();
System.out.println(text);
}
s.close();

Related

Multiple input.next() inside of a while(input.hasNext()) triggers NoSuchElementException in Java

I have a text file
Jim,A,94
Pam,B-4,120
Michael,CC,3
I want to save each token delimited by a comma to a string variable. I have this code:
File f = new File(filename);
Scanner input = new Scanner(f);
input.useDelimiter(",");
while (input.hasNext())
{
String s1 = input.next();
String s2 = input.next();
String s3 = input.next();
}
input.close();
But this keeps triggering NoSuchElementException. What am I doing wrong here?
This is a subtle quirk of how Scanner#next() works. That method looks for a "token", which is everything between matches of the delimiter pattern. The default delimiter is "\\p{javaWhitespace}+", which matches \n and \r as well as spaces and tabs.
You changed the delimiter to a comma, and ONLY the comma. To your Scanner the input looks like this:
Jim,A,94\nPam,B-4,120\nMichael,CC,3
The tokens as seen by your scanner are:
Jim
A
94\nPam
B-4
120\nMichael
CC
3
That's only 7 tokens, so you hit the end of input before satisfying all the next() invocations.
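The simplest fix is to set the delimiter to "[," + System.lineSeparator() + "]+". The trailing + makes a \r\n pair count as a single delimiter instead of producing an empty token between the two characters.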
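To see the token counts for yourself, here is a small sketch that collects the tokens for a given delimiter. The class name DelimiterTokensDemo and the helper tokens are invented for this illustration, and the pattern "[,\\r\\n]+" is one possible way to treat commas and line breaks alike.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterTokensDemo {
    // Collects every token the Scanner produces for the given delimiter regex.
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> result = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            sc.useDelimiter(delimiterRegex);
            while (sc.hasNext()) {
                result.add(sc.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "Jim,A,94\nPam,B-4,120\nMichael,CC,3";
        // Comma only: line breaks stay inside tokens such as "94\nPam".
        System.out.println(tokens(input, ",").size());          // prints 7
        // Commas or line breaks: three tokens per line.
        System.out.println(tokens(input, "[,\\r\\n]+").size()); // prints 9
    }
}
```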
However, Scanner is the cause of endless errors and frustration for new developers, and should be deprecated and banished (that won't happen because of the existing code base). What you should be doing is reading entire lines and then parsing them yourself.
You have set a custom delimiter, the comma. Your input is therefore delimited only by commas, and everything between two commas is a token returned by next().
The newline in your input is also just a character:
\n (linefeed LF, Unicode \u000a)
and it's being read as part of your token.
Here is another way to achieve what you are trying to do:
File f = new File(filename);
Scanner input = new Scanner(f);
while (input.hasNextLine()) {
String[] sarray = input.nextLine().split(",");
for (String s : sarray ) {
System.out.println(s);
}
}
Of course, that can be improved, but basically I suggest that you use the Scanner to read the file line by line and then you split by comma delimiter.
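One way this could be improved is to check the number of fields before assigning them to variables. The sketch below assumes every line has exactly three comma-separated fields, as in the sample file; CsvLineDemo and parseLine are hypothetical names, and a String source replaces the file.

```java
import java.util.Scanner;

class CsvLineDemo {
    // Splits one line on commas and insists on exactly three fields.
    static String[] parseLine(String line) {
        String[] fields = line.split(",");
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + line);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Stand-in for new Scanner(new File(filename)).
        Scanner input = new Scanner("Jim,A,94\nPam,B-4,120\nMichael,CC,3");
        while (input.hasNextLine()) {
            String[] fields = parseLine(input.nextLine());
            String s1 = fields[0];
            String s2 = fields[1];
            String s3 = fields[2];
            System.out.println(s1 + " / " + s2 + " / " + s3);
        }
        input.close();
    }
}
```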

Java input parsing with delimiter | (pipe)

I know pipe is a special character and I need to use:
Scanner input = new Scanner(System.in);
String line = input.next();
String[] columns = line.split("\\|");
to use the pipe as a delimiter. But it doesn't work as desired when I parse from the command line.
e.g.
Parsing from a file just works. However, when the input typed on the command line contains a space, I get an out-of-bounds error, because the word is split into two array elements.
input
a|5|Hello|3
output:
columns[0] = "a";
columns[1] = "5";
columns[2] = "Hello";
columns[3] = "3";
bug:
input:
a|5|Hello World|3;
output:
columns[0] = "a";
columns[1] = "5";
columns[2] = "Hello";
columns[3] = "World";
columns[4] = "3";
I want columns[2] as "Hello World". How can I fix this?
I think you should read the user's input with nextLine() instead of next(). That works fine in my case.
next() reads input only up to the first whitespace, so it cannot read two words separated by a space. Also, next() leaves the cursor on the same line after reading the input. nextLine() reads the input including the spaces between words (that is, it reads to the end of the line, \n).
A Scanner breaks its input into tokens using a delimiter pattern,
which by default matches whitespace.
To overcome this, use the nextLine() method instead:
String line = input.nextLine();
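The difference between next() and nextLine() for this input can be sketched as follows. PipeSplitDemo and columns are illustrative names, and a String source stands in for System.in.

```java
import java.util.Scanner;

class PipeSplitDemo {
    // Splits a line on the literal pipe character (escaped because | is a regex operator).
    static String[] columns(String line) {
        return line.split("\\|");
    }

    public static void main(String[] args) {
        String typed = "a|5|Hello World|3";

        // next() stops at the first whitespace, so the line is cut short.
        Scanner viaNext = new Scanner(typed);
        System.out.println(viaNext.next()); // prints a|5|Hello
        viaNext.close();

        // nextLine() returns the whole line, spaces included.
        Scanner viaNextLine = new Scanner(typed);
        String[] cols = columns(viaNextLine.nextLine());
        System.out.println(cols[2]); // prints Hello World
        viaNextLine.close();
    }
}
```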

Split a string you’ve entered with scanner

I have a problem. I want to type in a string of two words (with java.util.Scanner). Then I want the program to split the entered string at the whitespace, save each substring in a separate variable, and print the result at the end.
I know that you can split strings with
String s = "Hello World";
String[] parts = s.split(" ");
But it doesn't seem to work when your String is read like this:
Scanner sc = new Scanner(System.in);
String s = sc.nextLine();
Any help?
Thank you very much
s.split("\\s+"); will split your string, even if you have multiple whitespace characters (also tab, newline..)
You can also use StringTokenizer from the java.util package:
StringTokenizer tokens = new StringTokenizer(s.trim());
String word;
while (tokens.hasMoreTokens()) {
word = tokens.nextToken();
}
Or from Apache Commons Lang
StringUtils.split(s)
Your code works for me:
String s;
s=sc.nextLine();
String[] words=s.split(" ");
for(String w:words){
System.out.print(w+" ");
}
input: "Hello world"
output: "Hello world"
You may also want to try this way of splitting String that you get from user input:
Scanner sc = new Scanner(System.in);
String[] strings = sc.nextLine().split("\\s+");
If you simply want to print the array containing these separated strings, you can do it without a loop:
System.out.println(Arrays.toString(strings));
If you want the output formatted differently, you can loop over the array and print each element, or build the output with the StringBuilder class and its append() method, which may be faster than printing each element separately for longer arrays.
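Putting the pieces from this thread together, a sketch might look like this. WhitespaceSplitDemo and words are made-up names, the trim() call is an added assumption that leading and trailing spaces should be ignored, and String.join is shown as one further formatting option alongside Arrays.toString.

```java
import java.util.Arrays;
import java.util.Scanner;

class WhitespaceSplitDemo {
    // Splits on any run of whitespace after removing leading and trailing spaces.
    static String[] words(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Stand-in for new Scanner(System.in).
        Scanner sc = new Scanner("Hello   World");
        String[] parts = words(sc.nextLine());
        String first = parts[0];
        String second = parts[1];
        System.out.println(Arrays.toString(parts));     // prints [Hello, World]
        System.out.println(String.join(" / ", parts));  // prints Hello / World
        System.out.println(first + " and " + second);
        sc.close();
    }
}
```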

Delimiter is not working for scanner

The user will enter a=(number here). I then want it to cut off the a= and retain the number. It works when I use s.next() but of course it makes me enter it two times which I don't want. With s.nextLine() I enter it once and the delimiter does not work. Why is this?
Scanner s = new Scanner(System.in);
s.useDelimiter("a=");
String n = s.nextLine();
System.out.println(n);
Because nextLine() doesn't care about delimiters. The delimiters only affect Scanner when you tell it to return tokens. nextLine() just returns whatever is left on the current line without caring about tokens.
A delimiter is not the way to go here; the purpose of delimiters is to tell the Scanner what can come between tokens, but you're trying to use it for a purpose it wasn't intended for. Instead:
String n = s.nextLine().replaceFirst("^a=","");
This inputs a line, then strips off a= if it appears at the beginning of the string (i.e. it replaces it with the empty string ""). replaceFirst takes a regular expression, and ^ means that it only matches if the a= is at the beginning of the string. This won't check to make sure the user actually entered a=; if you want to check this, your code will need to be a bit more complex, but the key thing here is that you want to use s.nextLine() to return a String, and then do whatever checking and manipulation you need on that String.
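If you do want to check that the user really typed a= followed by a number, one possible sketch uses a regular expression with a capture group. The names PrefixParseDemo and parse are hypothetical, and the pattern assumes the number is a non-negative integer of at most nine digits so that Integer.parseInt cannot overflow.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrefixParseDemo {
    // "a=" followed by 1 to 9 digits; the digits are captured as group 1.
    private static final Pattern ASSIGNMENT = Pattern.compile("a=(\\d{1,9})");

    // Returns the number after "a=", or an empty OptionalInt if the line does not match.
    static OptionalInt parse(String line) {
        Matcher m = ASSIGNMENT.matcher(line.trim());
        if (!m.matches()) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(Integer.parseInt(m.group(1)));
    }

    public static void main(String[] args) {
        System.out.println(parse("a=42"));  // prints OptionalInt[42]
        System.out.println(parse("42"));    // prints OptionalInt.empty
    }
}
```

In a real program the argument would come from s.nextLine(), and an empty result could be used to ask the user to try again.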
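Try StringTokenizer if Scanner#useDelimiter() is not suitable for your case.
Scanner s = new Scanner(System.in);
String n = s.nextLine();
// Note: StringTokenizer treats "a=" as a set of delimiter characters ('a' and '='), not as the literal string "a="
StringTokenizer tokenizer = new StringTokenizer(n, "a=");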
while (tokenizer.hasMoreTokens()) {
System.out.println(tokenizer.nextToken());
}
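or try the String#split() method (for this input, split() also returns an empty first element, because the string starts with the delimiter):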
for (String str : n.split("a=")) {
System.out.println(str);
}
input:
a=123a=546a=78a=9
output:
123
546
78
9
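The two approaches are not quite equivalent, which the following sketch tries to make visible. TokenizerVsSplitDemo and its two helpers are illustrative names only, and "data=1" is a made-up input chosen to show where the results diverge.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerVsSplitDemo {
    // StringTokenizer: every 'a' and every '=' is a delimiter on its own.
    static List<String> viaTokenizer(String s) {
        List<String> out = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(s, "a=");
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken());
        }
        return out;
    }

    // split: the regex "a=" only matches the two characters together.
    static List<String> viaSplit(String s) {
        return Arrays.asList(s.split("a="));
    }

    public static void main(String[] args) {
        System.out.println(viaTokenizer("a=123a=546")); // prints [123, 546]
        System.out.println(viaSplit("a=123a=546"));     // prints [, 123, 546]
        System.out.println(viaTokenizer("data=1"));     // prints [d, t, 1]
        System.out.println(viaSplit("data=1"));         // prints [dat, 1]
    }
}
```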

Scanner class skips over whitespace

I am using a nested Scanner loop to extract the digit from a string line (from a text file) as follows:
String str = testString;
Scanner scanner = new Scanner(str);
while (scanner.hasNext()) {
String token = scanner.next();
// Here each token will be used
}
The problem is this code will skip all the spaces " ", but I also need to use those "spaces" too. So can Scanner return the spaces or I need to use something else?
My text file could contain something like this:
0
011
abc
d2d
sdwq
sda
Each of those blank lines contains a single " ", and those " " are what I need returned.
Use Scanner's hasNextLine() and nextLine() methods and you'll find your solution since this will allow you to capture empty or white-space lines.
By default, a scanner uses white space to separate tokens.
Use the Scanner#nextLine method, which advances this scanner past the current line and returns the input that was skipped. This method returns the rest of the current line, excluding any line separator at the end. The position is set to the beginning of the next line.
To use a different token separator, invoke useDelimiter(), specifying
a regular expression. For example, suppose you wanted the token
separator to be a comma, optionally followed by white space. You would
invoke,
scanner.useDelimiter(",\\s*");
Read more from http://docs.oracle.com/javase/tutorial/essential/io/scanning.html
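As a quick illustration of the ",\\s*" delimiter from the tutorial, here is a sketch with a made-up input string; CommaSpaceDelimiterDemo and tokens are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class CommaSpaceDelimiterDemo {
    // Tokenizes the input using a comma optionally followed by whitespace as the separator.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(",\\s*");
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Spaces inside a token are kept; only the ones right after a comma are swallowed.
        System.out.println(tokens("red fish,  blue fish,green")); // prints [red fish, blue fish, green]
    }
}
```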
You have to understand what a token is. Read the documentation of Scanner:
A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace.
You could use the nextLine() method to get the whole line without ignoring any whitespace.
Better still, you could define what a token is with the useDelimiter method.
This will work for you
Scanner scanner = new Scanner(new File("D:\\sample.txt"));
while (scanner.hasNextLine()) {
String token = scanner.nextLine();
System.out.println(token);
}
To use a more functional approach (Scanner.tokens() requires Java 9 or later) you could use something like this:
String fileContent = new Scanner(new File("D:\\sample.txt"))
.useDelimiter("")
.tokens()
.reduce("", String::concat);
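To check that the line-based approach really keeps whitespace-only lines, a sketch like the following can help. KeepBlankLinesDemo and lines are invented names, and the String literal is an assumed reconstruction of the kind of file described in the question (a single space on every other line).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class KeepBlankLinesDemo {
    // Returns every line of the text, including lines that hold only whitespace.
    static List<String> lines(String text) {
        List<String> out = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                out.add(scanner.nextLine());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> result = lines("0\n \n011\n \nabc");
        for (String line : result) {
            // Brackets make the single-space lines visible in the output.
            System.out.println("[" + line + "]");
        }
    }
}
```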
