I want to read from a text file that contains only numbers. The file is UTF-8, and the numbers are separated only by newlines (no spaces or anything else). Whenever I call Integer.valueOf(myString), I get an exception.
This exception is really strange, because if I create a predefined string such as "56\n" and call .trim() on it, parsing works perfectly. In my code that is not the case, and the exception text says that what it couldn't convert was "54856". I tried adding a newline there, and then the error text says it couldn't convert "54856
"
With that out of the question, what am I missing?
File ficheroEntrada = new File("C:\\in.txt");
FileReader entrada =new FileReader(ficheroEntrada);
BufferedReader input = new BufferedReader(entrada);
String s = input.readLine();
System.out.println(s);
Integer in;
in = Integer.valueOf(s.trim());
System.out.println(in);
The exception text reads as follows:
Exception in thread "main" java.lang.NumberFormatException: For input string: "54856"
at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:68)
at java.base/java.lang.Integer.parseInt(Integer.java:658)
at java.base/java.lang.Integer.valueOf(Integer.java:989)
at Quicksort.main(Quicksort.java:170)
The file in.txt consists of:
54856
896
54
53
2
5634
Well, apparently it had to do with Windows and the \r that it uses. I just tried running it on a Linux VM and it worked. Thanks to everyone who answered!
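One way to find out which invisible character is actually in the string is to print every code point rather than the string itself. This is only a sketch: the class and method names are made up for illustration, and the byte-order mark in the sample input is an assumption about what such a file might start with, not something confirmed by the question.

```java
public class ShowCodePoints {
    // Render each character of s as U+XXXX so invisible ones become visible.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical first line: a byte-order mark followed by the digits.
        String line = "\uFEFF54856";
        System.out.println(describe(line));
        // trim() only strips characters up to U+0020, so U+FEFF survives it.
        System.out.println(describe(line.trim()));
    }
}
```

If the dump shows U+000D, the culprit is a carriage return; U+FEFF points to a BOM, and U+0000 to a null character.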
Try reading the file with the Scanner class and use its hasNextInt() method to check whether the next token is an integer. This will help you find out which string or character is causing the issue.
public static void main(String[] args) throws Exception {
File ficheroEntrada = new File(
"C:\\in.txt");
Scanner scan = new Scanner(ficheroEntrada);
while (scan.hasNext()) {
if (scan.hasNextInt()) {
System.out.println("found integer" + scan.nextInt());
} else {
System.out.println("not integer" + scan.next());
}
}
}
If you want to make sure a string is parsable, you could match it against a regex Pattern first.
Pattern intPattern = Pattern.compile("\\-?\\d+");
Matcher matcher = intPattern.matcher(input);
if (matcher.find()) {
int value = Integer.parseInt(matcher.group(0));
// ... do something with the result.
} else {
// ... handle unparsable line.
}
This pattern accepts any run of digits, optionally preceded by a minus sign (with no whitespace in between). It should definitely parse, unless it is too long. I don't know how it handles that, but your example seems to contain mostly short integers, so this should not matter.
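For the too-long case: Integer.parseInt throws a NumberFormatException when the digits do not fit into an int, so it can be handled in the same place as a line with no match. A rough sketch of the same regex idea wrapped in a helper follows; the name parseFirstInt and the OptionalInt return type are illustrative choices, not part of the original answer.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstIntParser {
    private static final Pattern INT_PATTERN = Pattern.compile("-?\\d+");

    // Returns the first integer found in line, or empty if there is none
    // or if the matched digits do not fit into an int.
    static OptionalInt parseFirstInt(String line) {
        Matcher matcher = INT_PATTERN.matcher(line);
        if (!matcher.find()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(matcher.group()));
        } catch (NumberFormatException tooLarge) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseFirstInt("\uFEFF54856"));     // digits after an invisible prefix
        System.out.println(parseFirstInt("99999999999999")); // too large for an int
        System.out.println(parseFirstInt("abc"));            // no digits at all
    }
}
```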
Most probably you have leading or trailing whitespace in your input, something like:
String s = " 5436";
System.out.println(s);
Integer in;
in = Integer.valueOf(s.trim());
System.out.println(in);
Use trim() on the string to get rid of it.
UPDATE 2:
If your file contains something like:
54856\n
896
54\n
53
2\n
5634
then use the following code for it:
....your code
FileReader enter = new FileReader(file);
BufferedReader input = new BufferedReader(enter);
String currentLine;
while ((currentLine = input.readLine()) != null) {
Integer in;
//get rid of non-numbers
in = Integer.valueOf(currentLine.replaceAll("\\D+",""));
System.out.println(in);
...your code
I have a file in the following format: records are separated by newlines, but some records contain a line feed themselves, as shown below. I need to get each record and process it separately. The file could be a few MB in size.
<?aaaaa>
<?bbbb
bb>
<?cccccc>
I have the code:
FileInputStream fs = new FileInputStream(FILE_PATH_NAME);
Scanner scanner = new Scanner(fs);
scanner.useDelimiter(Pattern.compile("<\\?"));
if (scanner.hasNext()) {
String line = scanner.next();
System.out.println(line);
}
scanner.close();
But the result I get has the beginning <? removed:
aaaaa>
bbbb
bb>
cccccc>
I know the Scanner consumes any input that matches the delimiter pattern. All I can think of is to add the delimiter back to each record manually.
Is there a way to NOT have the delimiter pattern removed?
Break on a newline only when preceded by a ">" char:
scanner.useDelimiter("(?<=>)\\R"); // Note you can pass a string directly
\R is a system independent newline
(?<=>) is a look behind that asserts (without consuming) that the previous char is a >
Plus it's cool because <=> looks like Darth Vader's TIE fighter.
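A small self-contained sketch of the lookbehind delimiter, run against an in-memory string instead of the file. The class name RecordSplitter and the sample text are made up for illustration; \R needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class RecordSplitter {
    // Split text into records, breaking only on newlines that follow '>'.
    // The lookbehind does not consume the '>', so it stays in the record.
    static List<String> records(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter("(?<=>)\\R");
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "<?aaaaa>\n<?bbbb\nbb>\n<?cccccc>\n";
        for (String record : records(sample)) {
            // Brackets make the embedded line break in the second record visible.
            System.out.println("[" + record + "]");
        }
    }
}
```

Because nothing but the newline is consumed, each record keeps both its leading <? and its closing >.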
I'm assuming you want to ignore the newline character '\n' everywhere.
I would read the whole file into a String and then remove all of the '\n's in the String. The part of the code this question is about looks like this:
String fileString = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
fileString = fileString.replace("\n", "");
Scanner scanner = new Scanner(fileString);
... //your code
Feel free to ask any further questions you might have!
Here is one way of doing it by using a StringBuilder:
public static void main(String[] args) throws FileNotFoundException {
Scanner in = new Scanner(new File("C:\\test.txt"));
StringBuilder builder = new StringBuilder();
String input = null;
while (in.hasNextLine() && null != (input = in.nextLine())) {
for (int x = 0; x < input.length(); x++) {
builder.append(input.charAt(x));
if (input.charAt(x) == '>') {
System.out.println(builder.toString());
builder = new StringBuilder();
}
}
}
in.close();
}
Input:
<?aaaaa>
<?bbbb
bb>
<?cccccc>
Output:
<?aaaaa>
<?bbbb bb>
<?cccccc>
Here's the .txt file I'm trying to read from:
20,Dan,09/05/1990,3,Here
5,Danezo,04/09/1990,99,There
And here's how I'm doing it. Whenever the .txt file has only one line, it reads from the file fine. Whenever more than one line is read, I get this error:
Exception in thread "main" java.lang.NumberFormatException: For input string: "Danezo"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at AttackMonitor.readFromFile(AttackMonitor.java:137)
at AttackMonitor.monitor(AttackMonitor.java:57)
at MonsterAttackDriver.main(MonsterAttackDriver.java:14)
Java Result: 1
Here's the readfromfile code.
private void readFromFile() throws FileNotFoundException, IOException
{
monsterAttacks.clear();
Scanner read = new Scanner(new File("Attacks.txt"));
read.useDelimiter(",");
String fullDateIn = "";
int attackIdIn = 0;
int attackVictimsIn = 0;
String monsterNameIn= "";
String attackLocationIn= "";
while (read.hasNext())
{
attackIdIn = Integer.parseInt(read.next());
monsterNameIn = read.next();
fullDateIn = read.next();
attackVictimsIn = Integer.parseInt(read.next());
attackLocationIn = read.next();
monsterAttacks.add(new MonsterAttack(fullDateIn, attackIdIn, attackVictimsIn, monsterNameIn, attackLocationIn));
}
read.close();
}
What is happening is that at the end of each line there is a newline character, which is currently not a delimiter. So your code is attempting to read it as the first integer of the next line, which it is not. This is causing the parse exception.
To remedy this, you can try adding newline to the list of delimiters for which to scan:
Scanner read = new Scanner(new File("Attacks.txt"));
read.useDelimiter("[,\r\n]+"); // use just \n on Linux
An alternative to this would be to just read in each entire line from the file and split on comma:
String[] parts = read.nextLine().split(",");
attackIdIn = Integer.parseInt(parts[0]);
monsterNameIn = parts[1];
fullDateIn = parts[2];
attackVictimsIn = Integer.parseInt(parts[3]);
attackLocationIn = parts[4];
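Put together, the line-by-line variant might look like the sketch below. The Attack class is a hypothetical stand-in for the question's MonsterAttack, and the sample data is read from a string rather than Attacks.txt so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class AttackLineParser {
    // Hypothetical stand-in for the question's MonsterAttack class.
    static final class Attack {
        final int id;
        final String monster;
        final String date;
        final int victims;
        final String location;

        Attack(int id, String monster, String date, int victims, String location) {
            this.id = id;
            this.monster = monster;
            this.date = date;
            this.victims = victims;
            this.location = location;
        }
    }

    // Parse one "id,name,date,victims,location" line.
    static Attack parseLine(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 5) {
            throw new IllegalArgumentException("Expected 5 fields: " + line);
        }
        return new Attack(Integer.parseInt(parts[0].trim()), parts[1], parts[2],
                Integer.parseInt(parts[3].trim()), parts[4]);
    }

    // Read every non-blank line; readLine() already drops the line terminator.
    static List<Attack> parseAll(BufferedReader reader) throws IOException {
        List<Attack> attacks = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue;
            }
            attacks.add(parseLine(line));
        }
        return attacks;
    }

    public static void main(String[] args) throws IOException {
        String sample = "20,Dan,09/05/1990,3,Here\n5,Danezo,04/09/1990,99,There\n";
        for (Attack a : parseAll(new BufferedReader(new StringReader(sample)))) {
            System.out.println(a.id + " " + a.monster + " " + a.victims + " " + a.location);
        }
    }
}
```

Since each line is split separately, the newline never ends up inside a field.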
You can use Biegeleisen's suggestion, or else you can do as follows.
In your while loop you are using hasNext() as the condition. Instead, you can use while (read.hasNextLine()), get the next line inside the loop, split it on your delimiter, and then do the processing. That would be a more appropriate approach.
e.g.
while (read.hasNextLine()) {
String[] values = read.nextLine().split(",");
// do the rest of your logic
}
Put the body of the while loop in a try/catch and catch NumberFormatException. Whenever the catch block runs, you know you tried to convert a non-numeric string to an int.
I could help more if you explained what the program is supposed to do.
attackLocationIn = read.next(); reads the value "Here\n5", because there is no comma between Here and 5, only a newline character.
So in the second iteration, in attackIdIn = Integer.parseInt(read.next());, read.next() returns "Danezo", which is a String that cannot be parsed as an Integer. That's why you are getting this exception.
What I suggest is to use a BufferedReader to read line by line and split each line on commas. It will also be fast.
Another solution: add a comma at the end of each line and use read.next().trim() in your code. That works with minimal changes to your current code, as long as the file has no newline after the last comma.
So I am trying to change the format of a text file that has line numbers every couple of lines, just to make it cleaner and easier to read. I made a simple program that replaces the first three characters of each line with spaces; those three character positions are where the numbers can be. The actual text doesn't start until a few more spaces in. When I do this and print the end result, some characters come out as a diamond with a question mark in it, and I'm assuming that this is the result of missing characters. Most of the missing characters seem to be apostrophes. If anyone could let me know how to fix it, I would really appreciate it :)
public class Conversion {
public static void main(String args[]) throws IOException {
BufferedReader scan = null;
try {
scan = new BufferedReader(new FileReader(new File("C:\\Users\\Nasir\\Desktop\\Beowulftesting.txt")));
} catch (FileNotFoundException e) {
System.out.println("failed to read file");
}
String finalVersion = "";
String currLine;
while( (currLine = scan.readLine()) !=null){
if(currLine.length()>3)
currLine = " "+ currLine.substring(3);
finalVersion+=currLine+"\n";
}
scan.close();
System.out.println(finalVersion);
}
}
Instead of using FileReader, use an InputStreamReader with the correct text encoding. I think the strange characters are appearing because you're reading the file with the wrong encoding.
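A minimal sketch of reading with an explicit charset. Which encoding the Beowulf file really uses is not known from the question; windows-1252 is only a guess here, picked because it stores the curly apostrophe as the single byte 0x92, which is not valid UTF-8 on its own. The helper takes an InputStream so the demo can run on in-memory bytes; for a real file you would pass something like new FileInputStream(file).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReadWithCharset {
    // Read the first line of the stream, decoding bytes with the given charset.
    static String firstLine(InputStream in, Charset charset) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed encoding of the input; not confirmed by the question.
        Charset cp1252 = Charset.forName("windows-1252");
        byte[] bytes = "Hrothgar\u2019s hall\n".getBytes(cp1252);

        // Decoded with the matching charset: the apostrophe comes through.
        System.out.println(firstLine(new ByteArrayInputStream(bytes), cp1252));
        // Decoded as UTF-8: byte 0x92 is malformed and is replaced by U+FFFD.
        System.out.println(firstLine(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8));
    }
}
```

U+FFFD is the replacement character that many consoles draw as a diamond with a question mark.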
By the way, don't use += with strings in a loop, like you have. Instead, use a StringBuilder:
StringBuilder finalVersion = new StringBuilder();
String currLine;
while ((currLine = scan.readLine()) != null) {
if (currLine.length() > 3) {
finalVersion.append(" ").append(currLine.substring(3));
} else {
finalVersion.append(currLine);
}
finalVersion.append('\n');
}
public void loadFromFile(String filename) {
File file = new File(filename);
BufferedReader br;
try {
br = new BufferedReader(new FileReader(file));
numberOfAttributes = Integer.parseInt(br.readLine());
}
...
}
Above is my program: I am trying to read from a txt file where the first line is the number 22 and nothing more. I don't know why the program gives me an exception.
Try stripping any whitespace from the string:
numberOfAttributes = Integer.parseInt(br.readLine().trim());
I think you might have a UTF-8 BOM (byte-order mark) at the start of your file.
Here's a class that reproduces the error:
import java.io.*;
public class BomTest {
public static void main(String[] args) throws Exception {
File file = new File("example.txt");
// Write out UTF-8 BOM, followed by the number 22 and a newline.
byte[] bs = { (byte)0xef, (byte)0xbb, (byte)0xbf, (byte)'2', (byte)'2', 10 };
FileOutputStream fos = new FileOutputStream(file);
fos.write(bs);
fos.close();
BufferedReader r = new BufferedReader(new FileReader(file));
String s = r.readLine();
System.out.println(Integer.parseInt(s));
}
}
When I run this class, I get the following output:
luke#computer:~$ java BomTest
Exception in thread "main" java.lang.NumberFormatException: For input string: "22"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:481)
at java.lang.Integer.parseInt(Integer.java:514)
at BomTest.main(BomTest.java:15)
There isn't really an easy way to deal with UTF-8 BOMs in Java; it's best not to generate them in the first place. See also this answer.
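If you cannot control how the file is produced, one workaround is to drop the BOM by hand after reading the first line. The sketch below assumes the file is decoded as UTF-8, in which case the three BOM bytes arrive as the single character U+FEFF; with another decoder they would show up as different characters and this check would not match. The helper name stripBom is made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class BomStripper {
    // Remove a single leading U+FEFF, which is what a UTF-8 BOM decodes to.
    static String stripBom(String s) {
        if (s != null && !s.isEmpty() && s.charAt(0) == '\uFEFF') {
            return s.substring(1);
        }
        return s;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the first line of a BOM-prefixed file decoded as UTF-8.
        BufferedReader r = new BufferedReader(new StringReader("\uFEFF22\n"));
        String first = stripBom(r.readLine());
        System.out.println(Integer.parseInt(first));
    }
}
```

Only the first line of the file needs this treatment; later lines cannot start with the BOM.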
br.readLine() returns the line without its line terminator, but any other whitespace on the line is kept. Apart from the solution suggested by James, you can use Scanner#nextInt().
try with numberOfAttributes = Integer.parseInt(br.readLine().trim());
public String trim()
Returns a copy of the string, with leading and trailing whitespace
omitted. If this String object represents an empty character sequence,
or the first and last characters of character sequence represented by
this String object both have codes greater than '\u0020' (the space
character), then a reference to this String object is returned.
Otherwise, if there is no character with a code greater than '\u0020'
in the string, then a new String object representing an empty string
is created and returned.
This happens because you have a space in the input line. Look at these:
int i1 = Integer.parseInt("22 ");
int i2 = Integer.parseInt("22a");
int i3 = Integer.parseInt("2 2");
int i4 = Integer.parseInt("22\n");
All of them throw an exception. I suggest you trim, tokenize, or substitute. But in general, reading a number from a file this way doesn't sound like a good solution to me.
If you really need to store data, why don't you create an object ad hoc and serialize/deserialize it?
You might have a null character in your string. Extract just the digits using the regex "\d+".
NumberFormatException is raised because the input string is not in the expected number format. Generally, you can see the offending input string in the error message and easily identify the bug. But in your case, the catch is that the error message does not display the input string completely (because the null character is not displayed).
Check the below output and the code.
public class TestParseInt{
private static final Pattern pattern = Pattern.compile("\\d+");
public static void main(String []args){
String a = "22\0";
try {
System.out.println("Successfull parse a: " + Integer.parseInt(a));
} catch(NumberFormatException e) {
System.out.println("Error:" +e.getMessage());
}
try {
Matcher matcher = pattern.matcher(a);
if(matcher.find()) {
System.out.println("Succesfull parse a: " +
Integer.parseInt(matcher.group(0)));
}
} catch(NumberFormatException e) {
System.out.println("Error" + e.getMessage());
}
}
}
Output:
Error:For input string: "22"
Succesfull parse a: 22
I have been trying to figure this out for a couple of hours now and I hope one of you can help me. I have a file (actually two, but that's not important) that has some rows and columns of numbers with blank spaces between them. I'm trying to read those with BufferedReader, and that works great: I can print out the strings and chars however I want. But when I try to parse those strings and chars I get the following error:
Exception in thread "main" java.lang.NumberFormatException: For input string: ""
at java.lang.NumberFormatException.forInputString(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at java.lang.Integer.parseInt(Unknown Source)
at FileProcess.processed(FileProcess.java:30)
at DecisionTree.main(DecisionTree.java:16)
From what I have found with Google, I think the error is in how I read my file.
public class ReadFiles {
private BufferedReader read;
public ReadFiles(BufferedReader startRead) {
read = startRead;
}
public String readFiles() throws IOException {
try {
String readLine = read.readLine().trim();
String readStuff = "";
while(readLine != null) {
readStuff += (readLine + "\n");
readLine = read.readLine();
}
return readStuff;
}
catch(NumberFormatException e) {
return null;
}
}
And for the parsing bit
public class FileProcess {
public String processed() throws IOException {
fileSelect fs = new fileSelect();
ReadFiles tr = new ReadFiles(fs.traning());
String training = tr.readFiles();
ReadFiles ts = new ReadFiles(fs.test());
String test = ts.readFiles();
List liste = new List(14,test.length());
String[] test2 = test.split("\n");
for(int i = 0; i<test2[0].length(); i++) {
char tmp = test.charAt(i);
String S = Character.toString(tmp).trim();
//int i1 = Integer.parseInt(S);
System.out.print(S);
}
This isn't the actual code for what I'm planning to do with the output, but the error appears at the line that is commented out. My string output is as follows:
12112211
That looks fine to parse as integers, but it does not work. I tried to manually check what's at char positions 0 and 1: for 0 I get 1, but for 1 I get nothing, i.e. "". So how can I remove the ""? I hope you guys can help me out; let me know if you need more info, but I think I have covered what's needed.
Thanks in advance :)
Oh, and another thing: if I replace "" with "0" it works, but then I get all those zeros, which I can't find a clever way to remove. Is it possible to skip them while parsing or something? My files only hold 1 and 2, so it wouldn't interfere with anything if that is possible.
The string "" is returned if two of the splitting characters are next to each other (i.e. \n\n), or if a whitespace character is passed into the trim() call. So ignore empty strings and carry on.
You could use the Scanner class to parse ints, skipping whitespace:
java.util.Scanner sc = new java.util.Scanner(line);
int value = sc.nextInt();
Another idea is to trim the line, split it, and parse the parts:
String lin = line.trim();
String[] words = lin.split(" +");
for (String si : words)
Integer.parseInt(si);
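A slightly fuller sketch of the trim/split/parse idea that also skips the empty token a blank line produces. The class and method names are made up, and the sample line assumes the digits in the file are separated by whitespace, as the question describes.

```java
import java.util.ArrayList;
import java.util.List;

public class DigitLineParser {
    // Parse all whitespace-separated integers on one line.
    static List<Integer> parseLine(String line) {
        List<Integer> values = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // a blank line splits into a single empty token
            }
            values.add(Integer.parseInt(token));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("1 2 1 1 2 2 1 1"));
        System.out.println(parseLine("   "));
    }
}
```

Splitting on "\\s+" rather than " +" also copes with tabs between the columns.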