Scanner unable to read text file - java

I have a bunch of .txt files I am trying to read, but many of them cannot be read. The ones that fail appear to start with a blank line before the text. For example, the following throws a NoSuchElementException:
public static void main(String[] args) throws FileNotFoundException {
    Scanner input = new Scanner(new File("documentSets/med_doc_set/bmu409.shtml.txt"));
    System.out.println(input.next());
}
where the text file being read begins with a blank line and then some text. I've also tried using input.skip("[\\s]*") to skip any leading whitespace, but it throws the same error. Is there some way to fix this?
EDIT:
The file is hosted on Google Docs. If you download it and view it in a text editor, you can see the empty line it starts with.

The Scanner type is weirdly inconsistent when it comes to handling input. It swallows I/O exceptions - consumers should test for these explicitly - so it is lax in informing readers of errors. But the type is strict when decoding character data - incorrectly encoded text or use of the wrong encoding will cause an IOException to be raised, which the type promptly swallows.
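To make the swallowed exception visible, here is a minimal sketch. The class name, the probe helper, and the temp-file setup are hypothetical and not from the question; it assumes a file containing the byte 0xE9 (the letter é in ISO-8859-1), which is not valid UTF-8 when followed by a newline.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Scanner;

// Hypothetical demo class; not part of the original question or answers.
final class SwallowedDecodingError {

    // Returns the exception Scanner swallowed while looking for the first token,
    // or null if the file could be read and decoded with the given charset name.
    static IOException probe(File file, String charsetName) throws IOException {
        try (Scanner scanner = new Scanner(file, charsetName)) {
            scanner.hasNext();             // triggers the first read and decode attempt
            return scanner.ioException();  // non-null if reading or decoding failed
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("latin1", ".txt");
        file.deleteOnExit();
        // 0xE9 followed by '\n' is a malformed UTF-8 sequence.
        Files.write(file.toPath(), new byte[] {(byte) 0xE9, '\n', 'a', 'b', 'c'});
        System.out.println("as UTF-8: " + probe(file, "UTF-8"));
        System.out.println("as ISO-8859-1: " + probe(file, "ISO-8859-1"));
    }
}
```

If the UTF-8 probe reports an exception while the ISO-8859-1 probe does not, the file is most likely not encoded the way the Scanner was told to expect.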
This code reads all lines in a text file with error checking:
public static List<String> readAllLines(File file, Charset encoding)
        throws IOException {
    List<String> lines = new ArrayList<>();
    try (Scanner scanner = new Scanner(file, encoding.name())) {
        while (scanner.hasNextLine()) {
            lines.add(scanner.nextLine());
        }
        if (scanner.ioException() != null) {
            throw scanner.ioException();
        }
    }
    return lines;
}
This code reads the lines and converts byte sequences the decoder doesn't understand to the replacement character (U+FFFD, often displayed as a question mark):
public static List<String> readAllLinesSloppy(File file, Charset encoding)
        throws IOException {
    List<String> lines = new ArrayList<>();
    try (InputStream in = new FileInputStream(file);
         Reader reader = new InputStreamReader(in, encoding);
         Scanner scanner = new Scanner(reader)) {
        while (scanner.hasNextLine()) {
            lines.add(scanner.nextLine());
        }
        if (scanner.ioException() != null) {
            throw scanner.ioException();
        }
    }
    return lines;
}
Both of these methods require you to provide the encoding explicitly rather than relying on the default encoding, which is frequently not Unicode (see also the standard constants in StandardCharsets).
Code is Java 7 syntax and is untested.

The file starts with a blank line, and your code only prints the first token. Change it to:
public static void main(String[] args) throws FileNotFoundException {
    Scanner input = new Scanner(new File("documentSets/med_doc_set/bmu409.shtml.txt"));
    while (input.hasNextLine()) {
        System.out.println(input.nextLine());
    }
}

Scanner reads all the words or numbers up to the end of the line. At this point you need to call nextLine(). If you want to avoid getting an Exception you need to call one of the hasNextXxxx() methods to determine if that type can be read.
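As a small illustration of guarding a read with a hasNext check, here is a sketch that reads from an in-memory string rather than a file. The class and method names are made up for this example.

```java
import java.util.Optional;
import java.util.Scanner;

// Hypothetical helper; the names are illustrative only.
final class FirstToken {

    // Returns the first whitespace-delimited token, or an empty Optional
    // instead of letting next() throw NoSuchElementException.
    static Optional<String> firstToken(String text) {
        try (Scanner scanner = new Scanner(text)) {
            return scanner.hasNext() ? Optional.of(scanner.next()) : Optional.empty();
        }
    }
}
```

With the default delimiter, next() skips leading whitespace, including blank lines, so a blank first line by itself does not prevent a token from being found.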

Related

What am I missing? NumberFormatException error

I want to read from a txt file which contains just numbers. The file is in UTF-8, and the numbers are separated only by new lines (no spaces or anything else). Whenever I call Integer.valueOf(myString), I get the exception.
This exception is really strange, because if I create a predefined string, such as "56\n", and use .trim(), it works perfectly. In my code that is not the case, and the exception text says that what it couldn't convert was "54856". I have tried to introduce a new line there, and then the error text says it couldn't convert "54856
"
With that out of the question, what am I missing?
File ficheroEntrada = new File("C:\\in.txt");
FileReader entrada =new FileReader(ficheroEntrada);
BufferedReader input = new BufferedReader(entrada);
String s = input.readLine();
System.out.println(s);
Integer in;
in = Integer.valueOf(s.trim());
System.out.println(in);
The exception text reads as follows:
Exception in thread "main" java.lang.NumberFormatException: For input string: "54856"
at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:68)
at java.base/java.lang.Integer.parseInt(Integer.java:658)
at java.base/java.lang.Integer.valueOf(Integer.java:989)
at Quicksort.main(Quicksort.java:170)
The file in.txt consists of:
54856
896
54
53
2
5634
Well, apparently it had to do with Windows and the \r characters it uses... I just tried running it on a Linux VM and it worked. Thanks to everyone who answered!!
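The thread attributes the failure to Windows line endings. Since trim() already removes a trailing \r, another cause worth ruling out is an invisible byte-order mark (U+FEFF) at the start of a UTF-8 file, which trim() does not remove. This is an assumption; nothing in the question confirms it. A minimal sketch with hypothetical names:

```java
// Hypothetical helper; a BOM at the start of the first line is an assumed cause,
// not something the original thread confirms.
final class BomSafeParse {

    // Parses one line as an int after dropping a leading byte-order mark
    // and any surrounding whitespace (including '\r').
    static int parseLine(String line) {
        String cleaned = line;
        if (!cleaned.isEmpty() && cleaned.charAt(0) == '\uFEFF') {
            cleaned = cleaned.substring(1);
        }
        return Integer.parseInt(cleaned.trim());
    }
}
```

Only the first line of a file can carry a BOM, so applying the helper to every line is harmless but only matters for the first one.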
Try reading the file with the Scanner class and use its hasNextInt() method to check whether what you are reading is an integer. This will help you find out which string or character is causing the issue:
public static void main(String[] args) throws Exception {
    File ficheroEntrada = new File("C:\\in.txt");
    Scanner scan = new Scanner(ficheroEntrada);
    while (scan.hasNext()) {
        if (scan.hasNextInt()) {
            System.out.println("found integer" + scan.nextInt());
        } else {
            System.out.println("not integer" + scan.next());
        }
    }
}
If you want to make sure a string can be parsed, you could match it against a regex Pattern first:
Pattern intPattern = Pattern.compile("\\-?\\d+");
Matcher matcher = intPattern.matcher(input);
if (matcher.find()) {
    int value = Integer.parseInt(matcher.group(0));
    // ... do something with the result.
} else {
    // ... handle unparsable line.
}
This pattern accepts any run of digits, optionally preceded by a minus sign (without whitespace). It should definitely parse unless the number is too long to fit in an int, in which case Integer.parseInt throws a NumberFormatException. Your example seems to contain only short integers, so this should not matter.
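The regex approach can be combined with handling for values that do not fit in an int. The following is a sketch under that assumption; the class and method names are hypothetical.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class SafeIntParse {

    private static final Pattern INT_PATTERN = Pattern.compile("-?\\d+");

    // Returns the first integer found in the line, or an empty OptionalInt
    // if there is none or if it does not fit in an int.
    static OptionalInt firstInt(String line) {
        Matcher matcher = INT_PATTERN.matcher(line);
        if (!matcher.find()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(matcher.group()));
        } catch (NumberFormatException e) {
            // The digits matched the pattern but overflow the int range.
            return OptionalInt.empty();
        }
    }
}
```

Returning OptionalInt keeps the "no number" and "number too large" cases out of the exception path; a caller that needs to tell them apart would need a different return type.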
Most probably you have leading or trailing whitespace in your input, something like:
String s = " 5436";
System.out.println(s);
Integer in;
in = Integer.valueOf(s.trim());
System.out.println(in);
Use trim() on the string to get rid of it.
UPDATE 2:
If your file contains something like:
54856\n
896
54\n
53
2\n
5634
then use the following code for it:
....your code
FileReader enter = new FileReader(file);
BufferedReader input = new BufferedReader(enter);
String currentLine;
while ((currentLine = input.readLine()) != null) {
    Integer in;
    // get rid of non-digit characters
    in = Integer.valueOf(currentLine.replaceAll("\\D+", ""));
    System.out.println(in);
}
...your code

How to use scanner to read accented characters from file in correct way?

I have this method:
public static void readFile(String input)
        throws FileNotFoundException, IOException {
    try (Scanner sc = new Scanner(new File(input));) {
        while (sc.hasNextLine()) {
            String currentLine = sc.nextLine();
            if (sc.hasNextLine()) {
                String nextLine = sc.nextLine();
                System.out.println("currentLine\t" + currentLine);
                System.out.println("nextLine\t" + nextLine);
            }
        }
    }
}
It runs without any errors, but it does not print any content from the file. The file contains some basic "lorem ipsum" text, just to see how this works.
If the file contains only Latin characters, this works correctly, but if there are any other characters (e.g. áéűőúüöó), it does not print any content from the file. Where could the problem be? How can I resolve this?
You probably have just a single line, and the bug is here:
while (sc.hasNextLine()) {
    String currentLine = sc.nextLine();
    if (sc.hasNextLine()) {
You get the current line but print it only when there is a next line.
This way the last line will be missing.
Please remove the if condition.
EDIT:
After the question was edited: please try specifying the proper encoding, e.g. for UTF-8:
new Scanner(new File(fileName), StandardCharsets.UTF_8.name());

Write to files using Java

I am trying to use lists for the first time. I have a txt file in which I am searching for a string, and I must write the search results to a new file.
Check the image attached
My task is to copy the two checked lines of the input file to the output file.
And this is my code:
import java.io.*;
import java.util.Scanner;

public class TestingReport1 {
    public static void main(String[] args) throws Exception {
        File test = new File("E:\\test2.txt");
        File Result = new File("E:\\Result.txt");
        Scanner scanner = new Scanner(test);
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            if (line.contains("Visit Count") || line.contains("Title")) {
                System.out.println(line);
            }
        }
    }
}
What should I do?!
Edit: How can I write the result of this code to a text file?
Edit2:
Now using the following code:
public static void main(String[] args) throws Exception {
    // TODO code application logic here
    File test = new File("E:\\test2.txt");
    FileOutputStream Result = new FileOutputStream("E:\\Result.txt");
    Scanner scanner = new Scanner(test);
    while (scanner.hasNextLine()) {
        String line = scanner.nextLine();
        if (line.contains("Visit Count") || line.contains("Title")) {
            System.out.println(line);
            Files.write(Paths.get("E:\\Result.txt"), line.getBytes(), StandardOpenOption.APPEND);
        }
    }
}
I got the result back as Visit Count:1, and I want to get this number as an integer. Is that possible?
Have a look at Files, especially readAllLines as well as write. Filter the input between those two method calls, that's it:
// Read.
List<String> input = Files.readAllLines(Paths.get("E:\\test2.txt"));
// Filter.
String output = input.stream()
        .filter(line -> line.matches("^(Title.*|Visit Count.*)"))
        .collect(Collectors.joining("\n"));
// Write.
Files.write(Paths.get("E:\\Result.txt"), output.getBytes());
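For the follow-up about getting the number out of a line such as Visit Count:1, one possible sketch is to take everything after the colon and parse it. The class and method names are hypothetical, and the format "label:number" is assumed from the single example in the question.

```java
// Hypothetical helper; assumes lines look like "Visit Count:1".
final class VisitCountParser {

    // Returns the integer that follows the first ':' in the line.
    static int parseCount(String line) {
        int colon = line.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("no ':' in line: " + line);
        }
        return Integer.parseInt(line.substring(colon + 1).trim());
    }
}
```

If the text after the colon is not a plain integer, Integer.parseInt throws a NumberFormatException, which the caller may want to handle.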

How to solve no input in StreamTokenizer

I have the following lines of code :
public static void main(String[] args) throws IOException {
    InputStreamReader inputStreamReader = new InputStreamReader(System.in);
    StreamTokenizer t = new StreamTokenizer(inputStreamReader);
    while (t.nextToken() != StreamTokenizer.TT_EOF) {
        // process here
    }
}
I run it with: java example.java < input.txt
However, I can't handle the "no input file" situation when I run: java example
It seems to run forever.
If you don't redirect anything to stdin (System.in) such as "input.txt" in your example command line then your program will expect you to type data into the console window.
Perhaps you should refactor your program to expect a command line argument (e.g. by checking that "args.length >= 1") and interpret it as the name of the file to read. If no file name is given, you can print an error message. Additionally, you could interpret the special pseudo-filename "-" (a single hyphen) to mean stdin so you can still redirect data.
For example:
public static void main(String[] args) throws IOException {
    if (args.length < 1) throw new IllegalArgumentException("no filename given");
    InputStream in = ("-".equals(args[0])) ? System.in : new FileInputStream(args[0]);
    InputStreamReader inputStreamReader = new InputStreamReader(in);
    StreamTokenizer t = new StreamTokenizer(inputStreamReader);
    while (t.nextToken() != StreamTokenizer.TT_EOF) {
        // ...
However, don't forget to close the FileInputStream, e.g. in a finally block.
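As an alternative to a finally block, try-with-resources closes the stream automatically. The sketch below assumes the same "-" convention for stdin; the class name and the token-counting body are placeholders for the real processing.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StreamTokenizer;

// Hypothetical example class; counting tokens stands in for the real work.
final class TokenCounter {

    // Counts the tokens StreamTokenizer produces until end of input.
    static int countTokens(Reader reader) throws IOException {
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        int count = 0;
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: TokenCounter <file | ->");
            return;
        }
        // Both resources are closed when the block exits, even on an exception.
        try (InputStream in = "-".equals(args[0]) ? System.in : new FileInputStream(args[0]);
             Reader reader = new InputStreamReader(in)) {
            System.out.println(countTokens(reader));
        }
    }
}
```

Keeping the tokenizing loop in a method that takes a Reader also makes it easy to exercise with a StringReader instead of a real file.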

How can I read text appended to the end of a file I already read?

I want to read a text file in Java. After I finish, some text will be appended by another application, and then I want to read that. Let's say there are ten lines. When the other app appends one more line, I don't want to read the whole file again, just the new line. How can I do this?
Something like this could work:
BufferedReader reader = .. // create a reader on the input file without locking it
while (otherAppWritesToFile) {
    String line = reader.readLine();
    while (line != null) {
        processLine(line);
        line = reader.readLine();
    }
    Thread.sleep(100);
}
Exception handling has been left out for the sake of simplicity.
Once you get an EOF indication, wait a little bit and then try reading again.
Edit: Here is some code to support this solution. You can try it and then change the control flow mechanisms as needed.
public static void main(final String[] args) throws IOException {
    final Scanner keyboard = new Scanner(System.in);
    final BufferedReader input = new BufferedReader(new FileReader("input.txt"));
    boolean cont = true;
    while (cont) {
        String line = input.readLine();
        while (line != null) {
            System.out.println(line);
            line = input.readLine();
        }
        System.out.println("EOF reached, add more input and type 'y' to continue.");
        final String in = keyboard.nextLine();
        cont = in.equalsIgnoreCase("y");
    }
}
EDIT: Thanks for adding some code, Tim. Personally, I would just sleep instead of waiting for user input. That would more closely match the user's requirements.
You could try using a RandomAccessFile.
Open the file and call length() to get its current length, then use readLine() to read your data. The next time you open the file, use seek() to jump to the previous end of the file, read the new lines, and save the new length of the file.
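The RandomAccessFile idea can be sketched as follows. The class and method names are hypothetical, and the sketch uses getFilePointer() after reading, rather than length(), to remember where to resume.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.List;

// Hypothetical helper; the names are illustrative only.
final class TailReader {

    // Reads all lines from the given byte offset to the current end of the file,
    // appends them to 'out', and returns the offset to resume from next time.
    // Note: RandomAccessFile.readLine() maps each byte to one char, so text that
    // is not plain ASCII needs extra decoding.
    static long readFrom(File file, long offset, List<String> out) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(offset);
            String line;
            while ((line = raf.readLine()) != null) {
                out.add(line);
            }
            return raf.getFilePointer();
        }
    }
}
```

A caller would start with offset 0, store the returned value, and pass it back in after the other application has appended more text. If the other application is in the middle of writing a line, the last line returned may be incomplete.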
