I'm helping my sisters with a simple Java program and I'm stumped. They've only learned the Scanner class for reading file contents, so I think they're supposed to use it. Each line contains letters and potentially a blank space, and we're hoping to store each line in an array. This works fine and dandy until one of the lines contains something like:
abcde f (the blank space after f should be read in as part of the line).
However, scanner.nextLine() seems to disregard this last blank space. I figured I could set my scanner delimiter to \n like so:
scanner.useDelimiter("\n")
and then use scanner.next() from there, but this still doesn't seem to work. I've googled around and taken a look at a few Stack Overflow questions. This question seems to suggest this is not easily done with the Scanner class: How to read whitespace with scanner.next()
Any ideas? I feel like there's an easy way I'm overlooking.
This is how I'm reading in the lines:
while (scanner.hasNextLine()) {
    String nextLine = scanner.nextLine();
}
Using the above example, my string would read abcde f with the empty space at the end removed.
I've also tried to use hasNext and next.
Pardon my formatting, I'm editing on a phone.
Save your text file as ANSI encoding and try again.
By rights, scanner.nextLine() will capture everything in the line, including whitespace.
scanner.next() will not capture whitespace as the delimiter is whitespace by default.
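A minimal sketch of the behaviour described above, assuming the text is supplied as a String rather than a file (the class name, method name, and sample input are made up for illustration). It stores each line as nextLine() returns it and prints the lines in brackets so that a trailing space is visible:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class TrailingSpaceDemo {
    // Collects every line as nextLine() returns it: the line terminator is
    // removed, but spaces inside and at the end of the line are kept.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical input standing in for the file; note the space after "f".
        for (String line : readLines("abcde f \nxyz\n")) {
            System.out.println("[" + line + "]"); // brackets make trailing spaces visible
        }
    }
}
```

If the brackets show no trailing space for a real file, the space is probably not in the file as read, which is worth checking before blaming the Scanner.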
I am wondering how you can use the split method with this example, considering the fact that there is a line break in the file.
g3,g3,g3,c4-,a3-,g4-,r,r,r,g3,g3,g3,c4-,a3-,a4,g4-,r,r,r,c4,c4,c4,e4,r
g4,r,a4,r,r,b4b,r,a4,f4,r,g4,r,r,g4#,r,g4,d4#,r,g4
I read the Pattern api and tutorials and think it should be like so.
line.split("(,\n)");
I also tried
line.split([,\n]);
and
line.split("[,\n]");
Lines may be separated by \r, by \n, by both (\r\n), or even by some other characters. Since Java 8 you can use \\R to represent any line separator, so you could try using
String[] arr = yourText.split(",|\\R");
As Pshemo notes, the 3rd option str.split("[,\n]") should work assuming the file ends each line with \n and not \r\n.
Additionally, how you read the file may affect your split argument.
If you are reading the file in with a BufferedReader, then going line by line with the readLine() method will automatically exclude any line-termination characters.
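As a rough illustration of the \\R suggestion, the sketch below splits a shortened, made-up sample that uses a Windows-style \r\n line break. The class and method names are hypothetical, and it assumes the whole file content is already in one String:

```java
import java.util.Arrays;

class SplitNotesDemo {
    // Splits on a comma or on any line break; \R requires Java 8 or later
    // and treats "\r\n" as a single separator.
    static String[] splitNotes(String text) {
        return text.split(",|\\R");
    }

    public static void main(String[] args) {
        // Hypothetical shortened sample with a Windows line break in the middle.
        String text = "g3,g3,c4-,r\r\ng4,r,a4";
        System.out.println(Arrays.toString(splitNotes(text)));
    }
}
```

With "[,\n]" instead, the \r from a \r\n line break would stay attached to the last token of each line, which is why the choice of pattern depends on how the file was written and read.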
So I have to get words from a text file, change them, and put them into a new text file.
The problem I'm having is, let's say the first line of the file is
hello my name is bob
the modified result should be:
ellohay myay amenay isay bobay
but instead, the result ends up being
ellomynameisbobhay
So Scanner has .nextLine(), but I want a method like .nextWord() or something, so that it treats everything up to the next space as a word. How can I create this?
nextLine() gives you the whole line.
What you should use is just next(), that will give you the next word.
Also see String.split() or StringTokenizer if you want to post-process whole lines. It sounds as though in your situation just using the Scanner is fine, but I thought I'd mention them because I assumed you'd have used those methods if you had known about them.
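A small sketch of the word-by-word approach, assuming each line is already available as a String. The transform here is only a placeholder that appends "ay"; it is not the asker's actual rule, and the class and method names are made up:

```java
import java.util.Scanner;
import java.util.StringJoiner;

class WordByWordDemo {
    // Placeholder transform (an assumption, not the asker's rule):
    // it simply appends "ay" to the word.
    static String transform(String word) {
        return word + "ay";
    }

    // Reads one whitespace-delimited word at a time with next() and
    // rejoins the transformed words with single spaces.
    static String transformLine(String line) {
        StringJoiner out = new StringJoiner(" ");
        try (Scanner words = new Scanner(line)) {
            while (words.hasNext()) {
                out.add(transform(words.next()));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(transformLine("hello my name is bob"));
    }
}
```

Transforming each word separately and rejoining with spaces avoids the run-together result shown in the question.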
I've been having lots of trouble trying to get either a scanner or a buffered reader to try and detect a blank line. For example if I have a file that contains:
there
cat
dog
(BLANK LINE)
If I do this:
while (scan.hasNextLine())
{
    String line = scan.nextLine();
    ...
    ...
}
The scanner doesn't pick up the blank line. I tried to use a buffered reader also but I run into this issue. Is there some way the scanner can just return a "" whenever it finds a blank line like that? Cheers
Your input has as many lines as it has \n characters. Given the input
"there\ncat\ndog\n"
the next-lines will be correctly divided as
"there\n"
"cat\n"
"dog\n"
(In other words, there is no fourth blank line, since it is not terminated by a \n.)
Put differently, after the "dog\n" has been read, the scanner (or buffered reader for that matter) has reached EOF and there's not even an empty line to return. (Note that when the lines are returned, the new-line character is stripped off.)
So, since this is the expected behavior, I don't know what the easiest fix is. I suspect that the best way to solve this is simply to append a \n to the input, so that the loop runs an extra iteration.
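To make the counting concrete, here is a sketch that runs the same loop over in-memory strings instead of a file (the class and method names are hypothetical). It compares the input ending in one \n with the same input after one more \n has been appended:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class BlankLineDemo {
    // Same hasNextLine()/nextLine() loop as in the question, collecting the lines.
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scan = new Scanner(text)) {
            while (scan.hasNextLine()) {
                lines.add(scan.nextLine());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // One "\n" per line: three lines, no trailing empty line.
        System.out.println(readLines("there\ncat\ndog\n").size());
        // One extra "\n" appended: the loop runs once more and yields "".
        System.out.println(readLines("there\ncat\ndog\n\n").size());
    }
}
```

An empty line in the middle of the input is returned as "" in the same way, since it is terminated by its own \n.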
I've got some very basic code like
while (scan.hasNextLine())
{
    String temp = scan.nextLine();
    System.out.println(temp);
}
where scan is a Scanner over a file.
However, on one particular line, which is about 6k chars long, temp cuts out after something like 2470 characters. There's nothing special about when it cuts out; it's in the middle of the word "Australia." If I delete characters from the line, the place where it cuts out changes; e.g. if I delete characters 0-100 in the file then Scanner will get what was previously 100-2570.
I've used Scanner for larger strings before. Any idea what could be going wrong?
At a guess, you may have a rogue character at the cut-off point: look at the file in a hex editor instead of just a text editor. Perhaps there's an embedded null character, or possibly \r in the middle of the string? It seems unlikely to me that Scanner.nextLine() would just chop it arbitrarily.
As another thought, are you 100% sure that it's not all there? Perhaps System.out.println is chopping the string - again due to some "odd" character embedded in it? What happens if you print temp.length()?
EDIT: I'd misinterpreted the bit about what happens if you cut out some characters. Sorry about that. A few other things to check:
If you read the lines with BufferedReader.readLine() instead of Scanner, does it get everything?
Are you specifying the right encoding? I can't see why this would show up in this particular way, but it's something to think about...
If you replace all the characters in the line with "A" (in the file) does that change anything?
If you add an extra line before this line (or remove a line before it) does that change anything?
Failing all of this, I'd just debug into Scanner.nextLine() - one of the nice things about Java is that you can debug into the standard libraries.
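As one illustration of how a rogue character could produce this symptom, the sketch below embeds U+0085 (NEXT LINE) in the middle of a word. Scanner.nextLine() treats that character as a line separator, while BufferedReader.readLine() only breaks on \n, \r, or \r\n. This is a guess at one possible cause, not a diagnosis of the asker's file, and the class and method names are made up:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Scanner;

class RogueCharDemo {
    // First line as seen by Scanner, which also breaks on U+0085, U+2028 and U+2029.
    static String firstLineWithScanner(String text) {
        try (Scanner scan = new Scanner(text)) {
            return scan.hasNextLine() ? scan.nextLine() : "";
        }
    }

    // First line as seen by BufferedReader, which breaks only on \n, \r or \r\n.
    static String firstLineWithReader(String text) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line = reader.readLine();
            return line == null ? "" : line;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical line with an embedded U+0085 in the middle of a word.
        String text = "Aus" + (char) 0x85 + "tralia\n";
        System.out.println(firstLineWithScanner(text).length());
        System.out.println(firstLineWithReader(text).length());
    }
}
```

If the two lengths differ on the real file, the character at the cut-off index is the one to inspect in a hex editor.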
I'm trying to scan a file that has the DOS ^M as end-of-line using something like:
Scanner file = new Scanner(new File(saveToFilePath)).useDelimiter("(?=\^M)")
In other words, I want to read the text line by line but also keep the ^M that marks the end of the line. This would be easy with \n but I'm not good with regexes and the DOS end-of-line is driving me crazy.
After some research I finally got it. The following is the correct regex for finding and keeping ^M. I didn't know that it meant CTRL-M, so some of your responses helped with that. For some reason, the "M" is not included in the regex and I'm not sure why it works, but it does. This gives us a delimiter for lines that includes the delimiter (with a lookahead regex) when searching for the elusive "^M".
Scanner file = new Scanner(source).useDelimiter("(?=\\p{Cntrl})");
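Some background on why the "M" is not needed: ^M is caret notation for the carriage return character (\r, U+000D), which is a control character, so \p{Cntrl} matches it (it also matches \n, tabs, and other control characters). A narrower variant, sketched here as an assumption rather than as the asker's method, uses a zero-width lookbehind so that each token ends with its own \r\n. The class and method names are hypothetical, and the input is assumed to use \r\n line endings throughout:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class KeepCarriageReturnDemo {
    // The delimiter is the empty position right after each "\r\n" (zero-width
    // lookbehind), so the terminator stays attached to the end of its line.
    static List<String> linesWithTerminators(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text).useDelimiter("(?<=\r\n)")) {
            while (scanner.hasNext()) {
                lines.add(scanner.next());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : linesWithTerminators("one\r\ntwo\r\n")) {
            // Show the otherwise invisible terminators as escape sequences.
            System.out.println(line.replace("\r", "\\r").replace("\n", "\\n"));
        }
    }
}
```

With a lookahead such as (?=\r) the split point falls before the \r instead, so the terminator ends up at the start of the following token rather than at the end of its own line.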
Thank you, everyone.