Scanner difficulties with different escape characters - java

First off, I know I'm not the only one who has run into this issue, and I have spent the last couple of hours researching how to fix it. Sadly, I still can't get my scanner to work. I'm new to Java, so I don't follow the more complicated explanations in answers to similar questions.
Here is a rundown:
I'm trying to read from a file that contains Unicode escape sequences for card suits. Here is a short version (faces 2 and 3 of the 4 suits):
\u26602,2
\u26652,2
\u26662,2
\u26632,2
\u26603,3
\u26653,3
\u26663,3
\u26633,3
This is the format: (suit)(face),(value). An example:
\u2663 = suit
3 = face
3 = value
This is the code I'm using for reading it:
File file = new File("Cards.txt");
try {
Scanner scanner = new Scanner(file);
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
String[] temp = line.split(",");
cards.add(new Card(temp[0], Integer.parseInt(temp[1])));
}
scanner.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
The ArrayList cards should contain 52 cards after this, each with a name (suit and face) and a value. When I try to print a name, this is the output:
\u26633
While it should be:
♣3
Can anyone give me pointers towards a solution? I really need this issue resolved. I don't want you to write my code for me.
Thanks in advance

Simply store the suit characters themselves in Cards.txt, saved with UTF-8 encoding, instead of their Unicode escape sequences. The \uXXXX notation is only translated by the Java compiler when it appears in source code. When the same six characters are read from a file at runtime, you get the literal String "\u2660", not the corresponding Unicode character.
Its content would then be something like:
♠2,2
...
Another way is to keep the file as it is and use StringEscapeUtils.unescapeJava(String input) from Apache Commons to unescape the Unicode sequences.
The code change would then be:
cards.add(new Card(StringEscapeUtils.unescapeJava(temp[0]), Integer.parseInt(temp[1])));
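If adding a library is not an option, the same idea can be sketched by hand. The class and method names below are hypothetical (they are not from the posts above), and the sketch only handles the four-hex-digit backslash-u form, not the other Java escapes that unescapeJava covers.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns literal backslash-u sequences read from a file
// into the characters they stand for.
final class UnicodeUnescape {
    // Matches a backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    static String unescape(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Parse the hex digits and convert them to a single char.
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslash makes this the same six literal characters
        // that Scanner would return from the file.
        String line = "\\u26633,3";
        String[] temp = line.split(",");
        System.out.println(unescape(temp[0]) + " has value " + temp[1]);
    }
}
```

In the question's loop this would replace temp[0] with unescape(temp[0]) before constructing the Card.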

You'll have to save your file with UTF-8 encoding and then read the file using the same encoding.
♥,1
♥,2
♥,3
Here is the code snippet:
BufferedReader buff = new BufferedReader(new InputStreamReader(
new FileInputStream("Cards.txt"), "UTF-8"));
String input = null;
while (null != (input = buff.readLine())) {
System.out.println(input);
String[] temp = input.split(",");
cards.add(new Card(temp[0], Integer.parseInt(temp[1])));
}
buff.close();
Also, you need to make sure that your console is enabled to support UTF-8. Look at this answer to read more about it.
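For readers who would rather keep the Scanner from the question, here is a minimal sketch of the same approach, assuming the file really is saved as UTF-8. The class name and the temporary file are illustrative only; the Card class from the question is left out so the example stays self-contained.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper: reads "name,value" lines from a UTF-8 file.
final class Utf8CardReader {
    // Returns the name part (suit and face) of every non-empty line.
    static List<String> readNames(Path file) throws IOException {
        List<String> names = new ArrayList<>();
        // Passing the charset name stops Scanner from using the platform default.
        try (Scanner scanner = new Scanner(file, "UTF-8")) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                if (line.isEmpty()) {
                    continue;
                }
                names.add(line.split(",")[0]);
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for Cards.txt, written as UTF-8 for the demonstration.
        Path tmp = Files.createTempFile("cards", ".txt");
        Files.write(tmp, "\u26602,2\n\u26652,2\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readNames(tmp));
        Files.delete(tmp);
    }
}
```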

Related

Double.valueOf() java.lang.NumberFormatException

I encountered the following issue when trying to read numbers from a CSV file.
The numbers are well formed and decimal points are also correct (dots):
10;111.1;0.94
9.5;111.1;0.94
9;111.4;0.94
8.5;110.7;0.94
I read the file line by line and split each of them into three tokens, free of white spaces etc. (e.g. "10","111.1","0.94"). In spite of this I got the exception when calling a parsing function:
Double pwr = Double.parseDouble(tokens[1]);
Double control = Double.parseDouble(tokens[0]);
Double cos = Double.parseDouble(tokens[2]);
java.lang.NumberFormatException: For input string: "10"
When I change the order of the lines, e.g. swap lines 1 and 2, the problem persists, but now I get java.lang.NumberFormatException: For input string: "9.5"
Interestingly, every time I make the above calls from the debugger, I obtain correct values with no exception. It looks like a problem related to the first line of the file.
Do you have any idea what the source of the problem is?
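It's probably a non-printable character at the start of the line.
To remove it, you can simply use the replaceAll method with the regex \\P{Print}, which matches every character that is not printable. Note that by default \p{Print} covers only printable ASCII, so this also strips any non-ASCII characters.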
BufferedReader br = new BufferedReader(new FileReader(""));
String str=br.readLine();
str=str.replaceAll("\\P{Print}", "");
After applying this regex you should be able to parse the value.
===========================================================================
To see which character it is you can try this.
1) Read the line and print it as it is like this.
public class Test {
public static void main(String[] args) {
try {
BufferedReader br = new BufferedReader(new FileReader("/path/to/some.csv"));
String str=br.readLine();
System.out.println(str);
} catch (Exception e) {
e.printStackTrace();
}
}
}
OUTPUT:
2) Now copy output as it is and paste it inside a ""(double quotes)
As you can see the special character is visible now
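An alternative to copying and pasting the output is to print the code of every character that is not printable ASCII. This is a small sketch with hypothetical names; the sample string with a leading U+FEFF is an assumption about what such a line could contain, not data from the question.

```java
// Hypothetical helper: makes invisible characters visible.
final class ShowHiddenChars {
    // Keeps printable ASCII as is and renders every other character
    // as a backslash-u escape with four hex digits.
    static String reveal(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c >= 0x20 && c < 0x7f) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed example: a first CSV line that starts with a byte order mark.
        String firstLine = "\uFEFF10;111.1;0.94";
        System.out.println(reveal(firstLine));
    }
}
```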
[SOLVED] Presumably it was a problem of some "hidden" zero-length character at the beginning of the file (BTW, thank you for the helpful suggestions!). I changed the file encoding to UTF-8 (Menu > File > File Encoding) and that resolved the issue.
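The "hidden" zero-length character described here is consistent with a byte order mark (U+FEFF) at the start of the file, although the post does not confirm that. Under that assumption, a sketch that strips it in code instead of re-saving the file could look like this (class and method names are hypothetical):

```java
// Hypothetical helper: parses one semicolon-separated row of numbers,
// tolerating a byte order mark in front of the first value.
final class BomSafeParse {
    // Removes a single leading U+FEFF if present.
    static String stripBom(String line) {
        if (!line.isEmpty() && line.charAt(0) == '\uFEFF') {
            return line.substring(1);
        }
        return line;
    }

    static double[] parseRow(String line) {
        String[] tokens = stripBom(line).split(";");
        double[] values = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Double.parseDouble(tokens[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        double[] row = parseRow("\uFEFF10;111.1;0.94");
        System.out.println(row[0] + " " + row[1] + " " + row[2]);
    }
}
```

Only the first line of a file can carry the mark, which would match the observation that the exception always came from whichever line was first.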

Scanner unable to capture last newline character if last line is empty

I am taking a programming class where we have to compress a file using a Huffman tree and then decompress it.
I am running into a problem where I am unable to capture the last newline character of a txt file.
E.G.
This is a line
This is a secondline
//empty line
So if I compress and decompress the above text in a file, I end up with a file with this
This is a line
This is a secondline
Right now I'm doing
while (file.hasNextLine()) {
char[] cArr = file.nextLine().toCharArray();
// count how many times each character appears, using a HashMap
if (file.hasNextLine()) {
// add an occurrence of \n to the HashMap
}
}
I understand the problem is that after the last line hasNextLine() returns false, since I just consumed the last '\n' of the file with the nextLine() call.
Upon realizing that, I tried useDelimiter("") with Scanner.next() instead of Scanner.nextLine(), but both lead to similar problems.
So is there a way to fix this?
Thanks in advance.
Without completely changing your code or approach, reading the file into a StringBuilder seems to work well.
File testfile = new File("test.txt");
StringBuilder stringBuffer = new StringBuilder();
try{
BufferedReader reader = new BufferedReader(new FileReader(testfile));
char[] buff = new char[500];
for (int charsRead; (charsRead = reader.read(buff)) != -1; ) {
stringBuffer.append(buff, 0, charsRead);
}
}
catch(Exception e){
System.out.print(e);
}
System.out.println(stringBuffer);
Checking the total number of characters read:
System.out.println(stringBuffer.length()); // 51
File size listing:
ls -l .
51 test.txt
The counts match (in a single-byte encoding each character is one byte), so it appears to have read all lines, including the blank line.
Note: I use Java 6; modify to suit your version.
Hope this helps.
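Since the goal in the question is a character count for a Huffman tree, the same idea can be taken one step further by counting while reading character by character, so that every '\n' (including the final one) is counted like any other character. This is a sketch with hypothetical names; it uses Map.merge, which needs Java 8 or later rather than the Java 6 mentioned above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts how often each character occurs in a stream.
final class CharFrequency {
    static Map<Character, Integer> count(Reader reader) throws IOException {
        Map<Character, Integer> freq = new TreeMap<>();
        int c;
        // read() returns -1 at end of input; newlines are returned like any other char.
        while ((c = reader.read()) != -1) {
            freq.merge((char) c, 1, Integer::sum);
        }
        return freq;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the file contents; a FileReader wrapped in a
        // BufferedReader could be passed instead.
        String sample = "This is a line\nThis is a secondline\n";
        Map<Character, Integer> freq = count(new StringReader(sample));
        System.out.println("newlines: " + freq.get('\n'));
    }
}
```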

Java charset - How to get correct input from System.in?

My first post here.
Well, I'm building a simple app for messaging through the console (cmd and terminal), just for learning, but I've got a problem when reading and writing text with a charset.
Here is my initial code for sending a message; Main.CHARSET is set to UTF-8:
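Scanner teclado = new Scanner(System.in, Main.CHARSET);
BufferedWriter saida = new BufferedWriter(new OutputStreamWriter(new BufferedOutputStream(cliente.getOutputStream()), Main.CHARSET));
saida.write(nick + " conectado!");
saida.newLine(); // the receiver uses readLine(), so each message needs a line terminator
saida.flush();
while (teclado.hasNextLine()) {
String s = teclado.nextLine(); // read the line typed by the user
saida.write(nick + ": " + s);
saida.newLine();
saida.flush();
}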
And the receiving code:
try (BufferedReader br = new BufferedReader(new InputStreamReader(servidor,Main.CHARSET))){
String s;
while ((s = br.readLine()) != null) {
System.out.println(s);
}
}
When I send "olá" or anything like "ÁàçÇõÉ" (Brazilian Portuguese), I get just blank spaces in the Windows cmd (not tested on Linux).
So I tested the following code:
Scanner s = new Scanner(System.in,Main.CHARSET);
System.out.println(s.nextLine());
For the input "olá", it printed "ol ".
The question is: how do I read from the console so that the input is read correctly, can be transmitted to another user, and is displayed correctly to them?
If you just want to write Portuguese text to a file, it is easy.
The only thing you have to take care of is viewing it with UTF-8 encoding.
You can use a really simple approach like:
String text = "olá";
FileWriter fw = new FileWriter("hello.txt");
fw.write(text);
fw.close();
Then open hello.txt in Notepad or any text tool that supports UTF-8, or change your tool's default encoding to UTF-8.
If you want to show it on the console, I think pvg has already answered you.
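One caveat about the file-writing snippet: FileWriter(String) uses the platform default charset on Java versions before 18, so the file is only UTF-8 if that default happens to be UTF-8. A sketch that names the encoding explicitly, with a hypothetical class name and a temporary file standing in for hello.txt, could look like this:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: writes text to a file as UTF-8 regardless of the platform default.
final class Utf8FileWrite {
    static void write(Path file, String text) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("hello", ".txt");
        // The escape stands for the accented a, so this source file
        // compiles the same way under any source encoding.
        write(file, "ol\u00e1");
        System.out.println(Files.size(file) + " bytes written");
        Files.delete(file);
    }
}
```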
OK, it seems you are still confused about it.
Here is some simple code you can try.
Scanner userInput = new Scanner(System.in); // type olá
String text = userInput.next();
System.out.println((int) text.charAt(2)); // the output is 63
char word = 'á'; // this character converted to int is 225
int a = 225;
System.out.println((int) word); // output 225
System.out.println((char) a); // output á
So, what is the conclusion?
If you type Portuguese text into the console and then read it, you get a completely different character (code 63, which is '?'), not gibberish.
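The effect can be reproduced without a console by decoding bytes with a charset other than the one that produced them. The sketch below uses ISO-8859-1 as a stand-in for the console's code page; that choice is an assumption for illustration, since the actual code page of the asker's cmd window is not given, and the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration of an encoding mismatch between writer and reader.
final class CharsetMismatch {
    static String decodeAsUtf8(byte[] bytes) {
        // Malformed input is replaced by the decoder's replacement character.
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "ol\u00e1"; // the escape stands for the accented a
        // Pretend the console delivered the text in a single-byte encoding.
        byte[] consoleBytes = original.getBytes(StandardCharsets.ISO_8859_1);

        String wrong = decodeAsUtf8(consoleBytes);
        String right = new String(consoleBytes, StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8 matches: " + original.equals(wrong));
        System.out.println("decoded as ISO-8859-1 matches: " + original.equals(right));
    }
}
```

If this is what happens in the question, the Scanner on System.in would need the charset the console actually uses, while the socket streams can stay on UTF-8 at both ends.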

How to read only integers from file?

I have a problem when trying to read ints from a text file.
I'm using this kind of code:
import java.util.Scanner;
import java.io.*;
File fileName =new File( "D:\\input.txt");
try {
Scanner in = new Scanner(fileName);
c = in.nextInt();
n = in.nextInt();
} catch(Exception e){
System.out.println("File not Found!!!");
}
If my text file looks like this
30
40
it works (meaning c=30, n=40).
But if I edit the text file to look like this
c=30
n=40
my code does not work.
How can I change my code to read only the numbers and ignore the "c=" and "n=",
or any other characters besides the numbers?
You need to read your lines using Scanner.nextLine, split each line on =, and then convert the second part to an integer.
Remember to check Scanner.hasNextLine before you read a line, so you need a while loop to read each line.
A simple implementation, which you can extend according to your needs:
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
String[] tokens = line.split("=");
try {
System.out.println(Integer.parseInt(tokens[1]));
} catch (NumberFormatException e) {
e.printStackTrace();
}
}
Now if you want to use those numbers later on, you can also add them in an ArrayList<Integer>.
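As a variation on keeping the numbers for later, the sketch below stores each value under its name in a Map instead of a plain list, so c and n can be looked up afterwards. The class name is hypothetical, and the input is passed as a String so the example runs without D:\input.txt; lines that do not fit the name=number pattern are skipped, which is a design choice of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical helper: parses "name=number" lines into a map.
final class KeyValueInts {
    static Map<String, Integer> parse(String content) {
        Map<String, Integer> values = new LinkedHashMap<>();
        try (Scanner scanner = new Scanner(content)) {
            while (scanner.hasNextLine()) {
                String[] tokens = scanner.nextLine().split("=", 2);
                if (tokens.length != 2) {
                    continue; // no '=' on this line
                }
                try {
                    values.put(tokens[0].trim(), Integer.parseInt(tokens[1].trim()));
                } catch (NumberFormatException e) {
                    // the right-hand side is not an integer; skip the line
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, Integer> values = parse("c=30\nn=40\n");
        System.out.println("c=" + values.get("c") + ", n=" + values.get("n"));
    }
}
```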
Given the format you want to use in the input file, it would be better to use java.util.Properties. You won't need to handle the parsing yourself.
Properties props = new Properties();
props.load(new FileInputStream(new File("D:\\input.txt")));
c = Integer.parseInt(props.getProperty("c"));
n = Integer.parseInt(props.getProperty("n"));
You can read more about the simple line-oriented format.
You could read line by line (Scanner.nextLine) and check every character in the line with Character.isDigit().
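A minimal sketch of the isDigit() idea follows, with a hypothetical class name. It assumes each line contains exactly one non-negative number: a minus sign is ignored, digits from separate numbers on the same line would be glued together, and a line with no digits makes Integer.parseInt throw NumberFormatException.

```java
// Hypothetical helper: keeps only the digits of a line and parses them.
final class DigitsOnly {
    static int digitsToInt(String line) {
        StringBuilder digits = new StringBuilder();
        for (char ch : line.toCharArray()) {
            if (Character.isDigit(ch)) {
                digits.append(ch);
            }
        }
        return Integer.parseInt(digits.toString());
    }

    public static void main(String[] args) {
        System.out.println(digitsToInt("c=30") + digitsToInt("n=40"));
    }
}
```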
If your data lines are always in the same format, x=12345, use a regex to get the numeric value from the line.
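One possible shape for the regex approach is sketched below. The pattern and the names are assumptions made for illustration: it looks for an equals sign followed by an optionally negative integer and returns an empty OptionalInt when the line does not match.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the integer after '=' from a line such as x=12345.
final class RegexNumber {
    private static final Pattern VALUE = Pattern.compile("=\\s*(-?\\d+)");

    static OptionalInt extract(String line) {
        Matcher m = VALUE.matcher(line);
        if (m.find()) {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(extract("x=12345").getAsInt());
        System.out.println(extract("no number here").isPresent());
    }
}
```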

How to use escape chars when reading a file?

I have a method that reads a random joke from a file in the raw resources and then displays it, but I can't figure out how to insert a new line.
Each joke in the file is on one line, but obviously the jokes themselves have more than one line, so I use \n. For instance, the line says "Hi! \n Hi to you too". When I use my code, instead of:
Hi!
Hi to you too
it gives me
Hi! \n Hi to you too
I tried to append it, and that didn't work with the code below. I also tried to put the joke in an array and then display it from that array, which didn't work either. Any ideas would be much appreciated.
InputStreamReader inputStream = new InputStreamReader
(getResources().openRawResource(R.raw.vicove));
BufferedReader br = new BufferedReader(inputStream);
int numLines = 1;
Random r = new Random();
int desiredLine = r.nextInt(numLines);
String theLine="";
int lineCtr = 0;
try {
while ((theLine = br.readLine()) != null) {
if (lineCtr == desiredLine) {
break;
}
lineCtr++;
}
} catch (IOException e) {
e.printStackTrace();
}
textGenerateNumber.setText(String.valueOf(theLine));
I've used the same code to read files with no escape characters and it works perfectly...
P.S. Not only does \n not work, but neither does \" and probably any other escape.
try this:
String source = "<br>"+String.valueOf(theLine);
textGenerateNumber.append(Html.fromHtml(source));
It's because the \n in the file is two ordinary characters (a backslash and an n), not a newline.
You can use the Apache Commons library to unescape these sequences: StringEscapeUtils.unescapeJava.
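If pulling in Apache Commons is not practical, a hand-written version can cover the few escapes mentioned in the question. The sketch below uses hypothetical names and handles only backslash followed by n, t, a double quote or another backslash; any other sequence is left unchanged, so it is not a full replacement for unescapeJava.

```java
// Hypothetical helper: converts a few two-character escapes read from a file
// into the single characters they stand for.
final class SimpleUnescape {
    static String unescape(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < raw.length()) {
                char next = raw.charAt(++i);
                switch (next) {
                    case 'n': out.append('\n'); break;
                    case 't': out.append('\t'); break;
                    case '"': out.append('"'); break;
                    case '\\': out.append('\\'); break;
                    default: out.append('\\').append(next); // unknown escape: keep as is
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslash mimics what readLine() returns from the file.
        String theLine = "Hi! \\n Hi to you too";
        System.out.println(unescape(theLine));
    }
}
```

In the question's code the result of unescape(theLine) would be passed to setText instead of theLine.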
