Java - Parsing a text file with integers and strings

I have a text file with the following contents (delimiter is a single space):
1231 2134 143 wqfdfv -89 rwq f 8 qer q2
sl;akfj salfj 3 sl 123
My objective is to read the integers and strings separately. Once I know how to parse them, I will create another output file to save them (but my question is only about how to parse this text file).
I tried using Scanner, and I am NOT able to get beyond the first integer:
Scanner s = new Scanner (new File ("a.txt")).useDelimiter("");
while (s.hasNext()){
System.out.print(s.nextInt());}
and the output is
1231
How can I also get other integers from both the lines?
My desired output is:
1231
2134
143
-89
8
3
123

The delimiter should be something else, such as one or more whitespace characters, and each token should be checked before it is read as an int:
Scanner s = new Scanner (new File ("a.txt")).useDelimiter("\\s+");
while (s.hasNext()) {
if (s.hasNextInt()) { // check if next token is an int
System.out.println(s.nextInt()); // display the found integer on its own line
} else {
s.next(); // else read the next token
}
}
And I have to admit that the solution from gotuskar is the better one in this simple case.

When reading data from the file, read every token as a String. Then test whether it is a number by parsing it with Integer.parseInt(). If that throws an exception, the token is a string; otherwise it is a number.
while (s.hasNext()) {
    String str = s.next();
    int b;
    try {
        b = Integer.parseInt(str);
    } catch (NumberFormatException e) { // only catch the specific exception
        // it's a string, do what you need to do with it here
        continue;
    }
    // it's a number, now available in b
}
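Putting the two answers together, here is a minimal sketch that collects the integers and the remaining tokens into two separate lists. The class name TokenSorter and its fields are made up for illustration, and it reads from any Readable (a StringReader here, a FileReader in practice) so that it can be run without a.txt:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

// Hypothetical helper: sorts whitespace-separated tokens into ints and non-ints.
class TokenSorter {
    final List<Integer> numbers = new ArrayList<>();
    final List<String> words = new ArrayList<>();

    static TokenSorter sort(Readable source) {
        TokenSorter result = new TokenSorter();
        // The default Scanner delimiter is already "one or more whitespace characters".
        try (Scanner s = new Scanner(source).useLocale(Locale.ROOT)) {
            while (s.hasNext()) {
                if (s.hasNextInt()) {
                    result.numbers.add(s.nextInt()); // token is an int
                } else {
                    result.words.add(s.next());      // anything else is kept as a string
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "1231 2134 143 wqfdfv -89 rwq f 8 qer q2\nsl;akfj salfj 3 sl 123";
        TokenSorter r = sort(new StringReader(sample));
        System.out.println("ints:    " + r.numbers);
        System.out.println("strings: " + r.words);
    }
}
```

To read the real file, pass new FileReader("a.txt") instead of the StringReader.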

Related

Can't read integer file

I'm trying to read data from a file that contains integers, but the Scanner doesn't read anything from that file.
I've tried to read the file from the Scanner :
// switch() blablabla
case POPULATION:
try {
while (sc.hasNextInt()) {
this.listePops.add(sc.nextInt());
}
} catch (Exception e) {
System.err.println("~ERREUR~ : " + e.getMessage());
}
break;
And if I try to print each sc.nextInt() to the console, it just prints a blank line and then stops.
Now when I read the same file as a String:
?652432
531345
335975
164308
141220
1094283
328278
270582
// (Rest of the data)
So, I guess it can't read the file as a list of integers since there's a question mark at the beginning, but the problem is that this question mark doesn't appear anywhere in my file, so I can't remove it. What am I supposed to do?
If the first character read from the file is a question mark (?) that does not appear in your editor, it is usually the UTF-8 Byte Order Mark (BOM). This means the file was saved as UTF-8 with a BOM. Microsoft Notepad, for example, can add a BOM when a text file is saved as UTF-8 instead of ANSI. There are also BOMs for UTF-16, UTF-32, etc.
Reading the text file as String doesn't look like a bad idea now. Changing the save format of the file can work too, but that BOM may serve an intended purpose for another application, so that may not be a viable option. Let's read the file as String lines (read the comments in the code):
// Variable to hold the value of the UTF-8 BOM:
final String UTF8_BOM = "\uFEFF";
// List to hold the Integer numbers in file.
List<Integer> listePops = new ArrayList<>();
// 'Try With Resources' used to auto-close the file and free resources.
// The charset is named explicitly so that the BOM decodes to the single char U+FEFF.
try (Scanner reader = new Scanner(new File("data.txt"), "UTF-8")) {
String line;
int lineCount = 0;
while (reader.hasNextLine()) {
line = reader.nextLine();
line = line.trim();
// Skip blank lines (if any):
if (line.isEmpty()) {
continue;
}
lineCount++;
/* Is this the first line and is there a BOM at the
start of this line? If so, then remove it. */
if (lineCount == 1 && line.startsWith(UTF8_BOM)) {
line = line.substring(1);
}
// Validate Line Data:
// Is the line a String representation of an Integer Number?
if (line.matches("\\d+")) {
// Yes... then convert that line to Integer and add it to the List.
listePops.add(Integer.parseInt(line));
}
// Move onto next file line...
}
}
catch (FileNotFoundException ex) {
// Do what you want with this exception (but don't ignore it):
System.err.println(ex.getMessage());
}
// Display the gathered List contents:
for (Integer ints : listePops) {
System.out.println(ints);
}
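For comparison, here is a minimal sketch of the same idea with a BufferedReader and an explicit UTF-8 decoder. The class and method names are made up for illustration, and the input is built in memory so the sketch runs without data.txt. It relies on the fact that Java's UTF-8 decoder passes the BOM through as the character U+FEFF instead of removing it:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads one integer per line and drops a leading UTF-8 BOM.
class BomAwareIntReader {
    static List<Integer> readInts(InputStream in) {
        List<Integer> values = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                if (first && line.startsWith("\uFEFF")) {
                    line = line.substring(1); // the BOM is a single char after decoding
                }
                first = false;
                line = line.trim();
                if (line.matches("-?\\d+")) { // skip blank or non-numeric lines
                    values.add(Integer.parseInt(line));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        // "\uFEFF" is encoded as the bytes EF BB BF, i.e. the UTF-8 BOM.
        byte[] data = "\uFEFF652432\n531345\n335975\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readInts(new ByteArrayInputStream(data)));
    }
}
```

For a real file, new FileInputStream("data.txt") can be passed in place of the ByteArrayInputStream.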

How to read certain lines of a txt file in java?

I want to read only the parts I need. For example, my text file looks like this:
Name Age Gender
=====================
Donald 13 Male
John 14 Non-binary
Pooh 42 Female
I only want to read the data rows, but I don't know how, because my code reads the .txt file line by line:
try {
File myObj = new File("database.txt");
Scanner myReader = new Scanner(myObj);
while (myReader.hasNextLine()) { //to read each line of the file
String data = myReader.nextLine();
String [] array = data.split(" "); //store the words in the file line by line
if(array.length ==5){ // to check if data has all five parameter
people.add(new Person(array[0], array[1],array[2], Double.parseDouble(array[3]), Double.parseDouble(array[4])));
}
}
JOptionPane.showMessageDialog(null, "Successfully Read File","Javank",JOptionPane.INFORMATION_MESSAGE);
myReader.close();
} catch (FileNotFoundException e) {
System.out.println("An error occurred.");
e.printStackTrace();
}
You can simply call myReader.nextLine() twice before entering your loop to skip the first two lines.
Another approach is to use a RandomAccessFile object instead of a Scanner to process your input. If you know how many bytes precede your data in the file (for plain ASCII text, that is the number of characters), you can use the RandomAccessFile object's seek method to jump to the beginning of your data, e.g. if there are 50 bytes before your data you can call randomAccessFile.seek(50) and then read the lines with randomAccessFile.readLine().
However, I would recommend the first method of skipping 2 lines, because it is simpler and more robust.
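A minimal sketch of the skip-two-lines approach, assuming the three-column layout shown in the question. HeaderSkipper and readRows are made-up names, and the sample text stands in for database.txt:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper: discards header lines, then splits data rows on whitespace.
class HeaderSkipper {
    static List<String[]> readRows(Scanner in, int headerLines) {
        for (int i = 0; i < headerLines && in.hasNextLine(); i++) {
            in.nextLine(); // throw away the title row and the "=====" rule
        }
        List<String[]> rows = new ArrayList<>();
        while (in.hasNextLine()) {
            String line = in.nextLine().trim();
            if (!line.isEmpty()) {
                rows.add(line.split("\\s+")); // one or more spaces between columns
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String sample = "Name Age Gender\n=====================\n"
                + "Donald 13 Male\nJohn 14 Non-binary\nPooh 42 Female\n";
        try (Scanner in = new Scanner(sample)) {
            for (String[] row : readRows(in, 2)) {
                int age = Integer.parseInt(row[1]);
                System.out.println(row[0] + " / " + age + " / " + row[2]);
            }
        }
    }
}
```

With a file, the Scanner would be created from new File("database.txt") as in the question.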

What am I missing? NumberFormatException error

I want to read from a txt file which contains just numbers. The file is in UTF-8, and the numbers are separated only by new lines (no spaces or anything else). Whenever I call Integer.valueOf(myString), I get the exception.
This exception is really strange, because if I create a predefined string, such as "56\n", and use .trim(), it works perfectly. But in my code that is not the case, and the exception text says that what it couldn't convert was "54856". I have tried to introduce a new line there, and then the error text says it couldn't convert "54856
"
With that out of the question, what am I missing?
File ficheroEntrada = new File("C:\\in.txt");
FileReader entrada =new FileReader(ficheroEntrada);
BufferedReader input = new BufferedReader(entrada);
String s = input.readLine();
System.out.println(s);
Integer in;
in = Integer.valueOf(s.trim());
System.out.println(in);
The exception text reads as follows:
Exception in thread "main" java.lang.NumberFormatException: For input string: "54856"
at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:68)
at java.base/java.lang.Integer.parseInt(Integer.java:658)
at java.base/java.lang.Integer.valueOf(Integer.java:989)
at Quicksort.main(Quicksort.java:170)
The file in.txt consists of:
54856
896
54
53
2
5634
Well, apparently it had to do with Windows and those \r characters that it uses... I just tried executing it on a Linux VM and it worked. Thanks to everyone that answered!!
Try reading the file with the Scanner class and use its hasNextInt() method to check whether what you are reading is an integer or not. This will help you find out which String/character is causing the issue:
public static void main(String[] args) throws Exception {
File ficheroEntrada = new File(
"C:\\in.txt");
Scanner scan = new Scanner(ficheroEntrada);
while (scan.hasNext()) {
if (scan.hasNextInt()) {
System.out.println("found integer" + scan.nextInt());
} else {
System.out.println("not integer" + scan.next());
}
}
}
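To see exactly which character breaks the parse, another option is to print the code points of the offending string. The sketch below is only an illustration: the class name is made up, and the BOM in the sample string is an assumption about what such a file might contain, not something the thread confirms. It does show that trim() only removes characters up to U+0020, so a leading U+FEFF survives it and the exception message still looks like plain digits:

```java
// Hypothetical diagnostic: renders each code point as U+XXXX so invisible characters show up.
class InvisibleCharDump {
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String suspect = "\uFEFF54856\r"; // assumed content: BOM, digits, carriage return
        System.out.println("raw:     " + codePoints(suspect));
        System.out.println("trimmed: " + codePoints(suspect.trim())); // \r is gone, U+FEFF is not
        try {
            Integer.valueOf(suspect.trim());
        } catch (NumberFormatException e) {
            System.out.println("still fails: " + e.getMessage());
        }
    }
}
```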
If you want to ensure that a string is parsable, you could check it against a regular expression with Pattern and Matcher:
Pattern intPattern = Pattern.compile("\\-?\\d+");
Matcher matcher = intPattern.matcher(input);
if (matcher.find()) {
int value = Integer.parseInt(matcher.group(0));
// ... do something with the result.
} else {
// ... handle unparsable line.
}
This pattern allows any run of digits, optionally preceded by a minus sign (without whitespace in between). The matched text should definitely parse unless it is too long to fit in an int, in which case Integer.parseInt throws a NumberFormatException. Your example seems to contain only short integers, so this should not matter.
Most probably you have leading or trailing whitespace in your input, something like:
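As a self-contained sketch of the pattern-based approach, the helper below returns the first integer found in a line and treats an out-of-range number the same as "no number". The names FirstIntFinder and firstInt are made up for illustration:

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper built around the "-?\d+" pattern.
class FirstIntFinder {
    private static final Pattern INT_PATTERN = Pattern.compile("-?\\d+");

    // Returns the first integer in the text, or empty if there is none
    // or if the digits do not fit into an int.
    static OptionalInt firstInt(String text) {
        Matcher matcher = INT_PATTERN.matcher(text);
        if (!matcher.find()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(matcher.group()));
        } catch (NumberFormatException tooLarge) {
            return OptionalInt.empty(); // e.g. 99999999999 overflows int
        }
    }

    public static void main(String[] args) {
        System.out.println(firstInt("54856\r"));      // stray characters around the digits are ignored
        System.out.println(firstInt("x -42 y"));
        System.out.println(firstInt("99999999999"));
        System.out.println(firstInt("no digits"));
    }
}
```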
String s = " 5436";
System.out.println(s);
Integer in;
in = Integer.valueOf(s.trim());
System.out.println(in);
Use trim() on the string to get rid of it.
UPDATE 2:
If your file contains something like:
54856\n
896
54\n
53
2\n
5634
then use following code for it:
....your code
FileReader enter = new FileReader(file);
BufferedReader input = new BufferedReader(enter);
String currentLine;
while ((currentLine = input.readLine()) != null) {
Integer in;
// get rid of everything that is not a digit (note: this also drops a minus sign)
in = Integer.valueOf(currentLine.replaceAll("\\D+",""));
System.out.println(in);
}
...your code

Java how to read a line from a text file that has multiple strings and double values?

I want to create a program that reads from a text file with three different parts and then outputs the name. E.g. text file:
vanilla 12 24
chocolate 23 20
chocolate chip 12 12
However, there is a bit of an issue on the third line, as there is a space in the name. So far, my code works for the first two lines, but then throws an InputMismatchException on the third one. How do I make it read both words from one line and then output them? My relevant code:
while (in.hasNext())
{
iceCreamFlavor = in.next();
iceCreamRadius = in.nextDouble();
iceCreamHeight = in.nextDouble();
out.println("Ice Cream: " + iceCreamFlavor);
}
In your input file, is the separator between fields made up of several spaces?
If so, you can simply use the split method of the String class.
You read a line.
You split it on runs of two or more whitespace characters to obtain a String array.
String[] splitString = myString.split("\\s{2,}");
The first element (index 0) is the name; the other two can be parsed as doubles.
This could look like:
try (BufferedReader br = new BufferedReader(new FileReader("path/to/the/file.txt"))) {
    String line;
    while ((line = br.readLine()) != null) {
        // A single space inside a name such as "chocolate chip" is not a separator.
        String[] lineSplitted = line.split("\\s{2,}");
        String label = lineSplitted[0];
        double d1 = Double.parseDouble(lineSplitted[1]);
        double d2 = Double.parseDouble(lineSplitted[2]);
    }
} catch (IOException e) {
    e.printStackTrace();
}
You can use scanner.useDelimiter to change the delimiter, or use a regular expression to parse the line.
// sets the delimiter to 2 or more consecutive whitespace characters
Scanner s = new Scanner(input).useDelimiter("\\s{2,}");
Check the Scanner Javadoc for examples.
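If the columns turn out to be separated by single spaces, so that the name cannot be told apart by the width of the gap, one alternative (not taken from the answers above) is to parse each line from the right: the last two tokens are the numbers and everything before them is the name. A minimal sketch, with the class and field names made up for illustration:

```java
import java.util.Arrays;

// Hypothetical value class for one line such as "chocolate chip 12 12".
class IceCreamLine {
    final String flavor;
    final double radius;
    final double height;

    IceCreamLine(String flavor, double radius, double height) {
        this.flavor = flavor;
        this.radius = radius;
        this.height = height;
    }

    // Treats the last two tokens as numbers and joins the rest back into the name.
    static IceCreamLine parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 3) {
            throw new IllegalArgumentException("Expected a name and two numbers: " + line);
        }
        int n = tokens.length;
        double radius = Double.parseDouble(tokens[n - 2]);
        double height = Double.parseDouble(tokens[n - 1]);
        String flavor = String.join(" ", Arrays.copyOfRange(tokens, 0, n - 2));
        return new IceCreamLine(flavor, radius, height);
    }

    public static void main(String[] args) {
        for (String line : new String[] {"vanilla 12 24", "chocolate 23 20", "chocolate chip 12 12"}) {
            System.out.println("Ice Cream: " + parse(line).flavor);
        }
    }
}
```

This only works as long as the name itself never ends in something that looks like the two trailing numbers.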

How to read only integers from file?

I have a problem when trying to read ints from a text file.
I'm using this kind of code:
import java.util.Scanner;
import java.io.*;
File fileName =new File( "D:\\input.txt");
try {
Scanner in = new Scanner(fileName);
c = in.nextInt();
n = in.nextInt();
} catch(Exception e){
System.out.println("File not Found!!!");
}
If my text file looks like this:
30
40
then it works (meaning c=30, n=40).
But if I edit the text file to look like this:
c=30
n=40
my code does not work.
How can I change my code to read only the numbers and ignore the "c=" and "n="
or any other characters besides the numbers?
You need to read your lines using Scanner.nextLine, split each line on =, and then convert the 2nd part to integer.
Remember to do the check - Scanner.hasNextLine before you read any line. So, you need to use a while loop to read each line.
A simple implementation, which you can extend according to your needs:
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
String[] tokens = line.split("=");
try {
System.out.println(Integer.parseInt(tokens[1]));
} catch (NumberFormatException e) {
e.printStackTrace();
}
}
Now if you want to use those numbers later on, you can also add them in an ArrayList<Integer>.
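A minimal sketch of that idea, collecting the values into a list and skipping lines that have no "=" or no integer after it. The class and method names are made up for illustration, and the Scanner is built from a String so the sketch runs without input.txt:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper: gathers the integer after '=' from each "key=value" line.
class KeyValueInts {
    static List<Integer> readValues(Scanner scanner) {
        List<Integer> values = new ArrayList<>();
        while (scanner.hasNextLine()) {
            String[] tokens = scanner.nextLine().split("=", 2);
            if (tokens.length < 2) {
                continue; // no '=' on this line
            }
            try {
                values.add(Integer.parseInt(tokens[1].trim()));
            } catch (NumberFormatException e) {
                // the right-hand side is not an integer, so skip the line
            }
        }
        return values;
    }

    public static void main(String[] args) {
        try (Scanner scanner = new Scanner("c=30\nn=40\n")) {
            List<Integer> values = readValues(scanner);
            int c = values.get(0);
            int n = values.get(1);
            System.out.println("c=" + c + ", n=" + n);
        }
    }
}
```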
Given the format you want to use in the input file, it would be better to make use of java.util.Properties. You won't need to handle the parsing yourself.
Properties props = new Properties();
props.load(new FileInputStream(new File("D:\\input.txt")));
c = Integer.parseInt(props.getProperty("c"));
n = Integer.parseInt(props.getProperty("n"));
You can read more about the simple line-oriented format.
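The same approach as a runnable sketch, with the stream closed through try-with-resources and a missing key reported explicitly. PropsInts and intProperty are made-up names, and a StringReader stands in for the file (Properties.load also accepts a Reader):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: loads key=value pairs and returns one key as an int.
class PropsInts {
    static int intProperty(Reader source, String key) {
        Properties props = new Properties();
        try (Reader reader = source) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty(key);
        if (raw == null) {
            throw new IllegalArgumentException("Missing key: " + key);
        }
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        String content = "c=30\nn=40\n";
        int c = intProperty(new StringReader(content), "c");
        int n = intProperty(new StringReader(content), "n");
        System.out.println("c=" + c + ", n=" + n);
    }
}
```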
You could read line by line (Scanner.nextLine) and check every character in the line with Character.isDigit().
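A rough sketch of the character-by-character idea; the class and method names are made up. Note its limits: it drops a minus sign and glues together all digits on the line, so "a1b2" would come out as 12:

```java
import java.util.OptionalInt;

// Hypothetical helper: keeps only the digit characters of a line and parses them.
class DigitFilter {
    static OptionalInt digitsOf(String line) {
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char ch = line.charAt(i);
            if (Character.isDigit(ch)) {
                digits.append(ch);
            }
        }
        if (digits.length() == 0) {
            return OptionalInt.empty(); // no digits on this line
        }
        try {
            return OptionalInt.of(Integer.parseInt(digits.toString()));
        } catch (NumberFormatException tooLarge) {
            return OptionalInt.empty(); // more digits than fit into an int
        }
    }

    public static void main(String[] args) {
        System.out.println(digitsOf("c=30"));
        System.out.println(digitsOf("n=40"));
        System.out.println(digitsOf("c="));
    }
}
```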
If your data lines are always in the same format, x=12345, use a regex to extract the numeric value from the line.
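One way such a regex could look, as a sketch: the pattern below assumes a word-like name, an equals sign, and an optionally negative integer, with optional spaces around each part. The class name and the exact pattern are illustrative choices, not the only possibility:

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for lines of the form name=12345.
class AssignmentLine {
    private static final Pattern LINE =
            Pattern.compile("\\s*\\w+\\s*=\\s*(-?\\d+)\\s*");

    // Returns the number after '=', or empty if the line has a different shape.
    static OptionalInt valueOf(String line) {
        Matcher matcher = LINE.matcher(line);
        if (!matcher.matches()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(matcher.group(1)));
        } catch (NumberFormatException tooLarge) {
            return OptionalInt.empty(); // digits do not fit into an int
        }
    }

    public static void main(String[] args) {
        System.out.println(valueOf("c=30"));
        System.out.println(valueOf(" n = -7 "));
        System.out.println(valueOf("c=abc"));
    }
}
```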
