Scanner and BufferedReader not showing double quotation marks (Java, Android)

I'm trying to load a .txt file. Like most .txt files, it's UTF-8 encoded, and the double quotation marks display correctly when I open it in Eclipse.
The problem is that when I read the file through a BufferedReader (also set to UTF-8), double quotation marks and a few other characters turn into question-mark boxes on my device.
I can't figure out what the problem could be; the results I find here and on Google are all about Arabic characters. Please help.
edit2: I'm displaying them inside a TextView.
The following is from a method. Scanner wasn't working either so I used this:
InputStream in;
in = getResources().openRawResource(R.raw.text); // raw resources are addressed via R.raw, not R.id
BufferedReader br = new BufferedReader(new InputStreamReader(in,Charset.forName("UTF-8")));
ArrayList<String> letters = new ArrayList<String>(25);
try {
String line="";
while((line = br.readLine()) != null){
String[] splited = line.split("\\s+");
int m = 0;
String word="";
while(m<splited.length){
m++; // analyze the word and do some other stuff here
letters.add(word);
}
}
in.close();
br.close();
} catch (IOException e) {
e.printStackTrace();
}
Here is where I display my text inside a handler:
final TextView txt = (TextView) rootView.findViewById(R.id.something);
// handler stuff, then inside the handler:
txt.setText(word, BufferType.SPANNABLE);
Spannable s = (Spannable)txt.getText();
s.setSpan(new ForegroundColorSpan(0xFFFFFFFF),2,3,Spannable.SPAN_EXCLUSIVE_EXCLUSIVE);
I removed the spannable, no dice.

To see which encoding your default Notepad saves .txt files in, go to File > Save As; along the bottom, or in some kind of options menu, there should be a setting for character encoding.
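Rather than relying on what the editor reports, another way to check is to decode the file's bytes strictly as UTF-8 and see whether that fails. The following is only a sketch: the class and method names are made up for illustration, and on Android the bytes would come from the raw resource stream rather than from literal arrays. The second sample assumes the file was saved in Windows code page 1252, where curly double quotes are the single bytes 0x93 and 0x94.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original post.
final class Utf8Check {

    // Returns true only if the bytes form well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Curly quotes around "hi", encoded as UTF-8 (three bytes per quote).
        byte[] utf8 = "\u201Chi\u201D".getBytes(StandardCharsets.UTF_8);
        // The same text as code page 1252 would store it (assumed encoding).
        byte[] cp1252 = {(byte) 0x93, 'h', 'i', (byte) 0x94};

        System.out.println("UTF-8 sample valid:  " + isValidUtf8(utf8));
        System.out.println("cp1252 sample valid: " + isValidUtf8(cp1252));
    }
}
```

If the check fails for the real file, re-saving it as UTF-8 or passing the file's actual charset to InputStreamReader are the two options worth trying.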

Related

assigning properties to strings in text file

Hopefully my explanation does me some justice. I am pretty new to Java. I have a text file that looks like this:
Java
The Java Tutorials
http://docs.oracle.com/javase/tutorial/
Python
Tutorialspoint Java tutorials
http://www.tutorialspoint.com/python/
Perl
Tutorialspoint Perl tutorials
http://www.tutorialspoint.com/perl/
I have properties for language name, website description, and website url. Right now, I just want to list the information from the text file exactly how it looks, but I need to assign those properties to them.
The problem I am getting is "index 1 is out of bounds for length 1"
try {
BufferedReader in = new BufferedReader(new FileReader("Tutorials.txt"));
while (in.readLine() != null) {
TutorialWebsite tw = new TutorialWebsite();
str = in.readLine();
String[] fields = str.split("\\r?\\n");
tw.setProgramLanguage(fields[0]);
tw.setWebDescription(fields[1]);
tw.setWebURL(fields[2]);
System.out.println(tw);
}
} catch (IOException e) {
e.printStackTrace();
}
I wanted to test something, so I removed the new lines, put commas in instead, and changed the call to str.split(","), which printed everything just fine. But I'm sure I would get points taken off if I changed the format.
readLine returns a "string containing the contents of the line, not including any line-termination characters", so why are you trying to split each line on "\\r?\\n"?
Where is str declared? Why are you reading two lines for each iteration of the loop, and ignoring the first one?
I suggest you start from
String str;
while ((str = in.readLine()) != null) {
System.out.println(str);
}
and work from there.
The first readLine gets the language, the second gets the description, and the third gets the URL, and then the pattern repeats. There is nothing to stop you calling readLine three times in each iteration of the while loop.
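A minimal sketch of that idea, reading three lines per loop iteration. The class name, the method name, and the use of a String[3] per record are assumptions made for illustration; in the asker's program each record would be fed into the TutorialWebsite setters instead. The sample data is read from a StringReader so the example runs without a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original posts.
final class ThreeLineRecords {

    // Reads records of three consecutive lines: language, description, URL.
    // A trailing partial record (fewer than three lines) is dropped.
    static List<String[]> readRecords(BufferedReader in) throws IOException {
        List<String[]> records = new ArrayList<>();
        String language;
        while ((language = in.readLine()) != null) {
            String description = in.readLine();
            String url = in.readLine();
            if (description == null || url == null) {
                break; // the input ended in the middle of a record
            }
            records.add(new String[] {language, description, url});
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        String sample = "Java\nThe Java Tutorials\nhttp://docs.oracle.com/javase/tutorial/\n"
                + "Perl\nTutorialspoint Perl tutorials\nhttp://www.tutorialspoint.com/perl/\n";
        try (BufferedReader in = new BufferedReader(new StringReader(sample))) {
            for (String[] record : readRecords(in)) {
                System.out.println(String.join(" | ", record));
            }
        }
    }
}
```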
You can read the whole file into a String like this:
// str will hold all the file contents; it is declared outside the try so it can be used afterwards
StringBuilder str = new StringBuilder();
// try-with-resources, to make sure the BufferedReader is closed safely
try (BufferedReader in = new BufferedReader(new FileReader("Tutorials.txt"))) {
String line;
while ((line = in.readLine()) != null) {
str.append(line);
str.append("\n");
}
} catch (IOException e) {
e.printStackTrace();
}
Later you can split the string with
String[] fields = str.toString().split("[\\n\\r]+");
Why not try it like this:
allocate a List to hold the TutorialWebsite instances.
use try-with-resources to open the file, read the lines, and trim any white space.
put the lines in an array.
then iterate over the array, filling in the class instance.
then print the list.
The loop bound rounds the array length down to a multiple of nFields, discarding any remainder. So if your total number of lines is not divisible by nFields, the remainder of the file will not be read. You would still have to adjust the setters if additional fields were added.
int nFields = 3;
List<TutorialWebsite> list = new ArrayList<>();
try (BufferedReader in = new BufferedReader(new FileReader("tutorials.txt"))) {
String[] lines = in.lines().map(String::trim).toArray(String[]::new);
for (int i = 0; i < (lines.length/nFields)*nFields; i+=nFields) {
TutorialWebsite tw = new TutorialWebsite();
tw.setProgramLanguage(lines[i]);
tw.setWebDescription(lines[i+1]);
tw.setWebURL(lines[i+2]);
list.add(tw);
}
} catch (IOException ioe) {
ioe.printStackTrace();
}
list.forEach(System.out::println);
An improvement would be to use a constructor and pass the strings to it when each instance is created.
And remember the file name as specified is relative to the directory in which the program is run.
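The constructor variant might look like the sketch below. The asker's real TutorialWebsite class is not shown, so the nested class here is only a stand-in with assumed field names and an assumed toString format; the fromLines helper is likewise made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real TutorialWebsite class is not shown in the question.
final class ConstructorSketch {

    // Stand-in for the asker's class, with a constructor instead of setters.
    static final class TutorialWebsite {
        private final String programLanguage;
        private final String webDescription;
        private final String webURL;

        TutorialWebsite(String programLanguage, String webDescription, String webURL) {
            this.programLanguage = programLanguage;
            this.webDescription = webDescription;
            this.webURL = webURL;
        }

        @Override
        public String toString() {
            return programLanguage + "\n" + webDescription + "\n" + webURL;
        }
    }

    // Builds one instance per complete group of three lines; leftovers are ignored.
    static List<TutorialWebsite> fromLines(String[] lines) {
        final int nFields = 3;
        List<TutorialWebsite> list = new ArrayList<>();
        for (int i = 0; i + nFields <= lines.length; i += nFields) {
            list.add(new TutorialWebsite(lines[i], lines[i + 1], lines[i + 2]));
        }
        return list;
    }

    public static void main(String[] args) {
        String[] lines = {
            "Java", "The Java Tutorials", "http://docs.oracle.com/javase/tutorial/"
        };
        fromLines(lines).forEach(System.out::println);
    }
}
```

Making the fields final means an instance can never be observed half-filled, which is the main benefit over calling three setters.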

Android's BreakIterator considers line breaks as sentence delimiters

I have a unix text file that I want to read in my Android app and split it into sentences. However I noticed that BreakIterator considers some line break characters as sentence delimiters.
I use the following code to read the file and split it into sentences (only the first sentence is output, for presentation purposes):
File file = new File...
String text = "";
BreakIterator sentenceIterator = BreakIterator.getSentenceInstance(Locale.US);
try {
FileInputStream inputStream = new FileInputStream(file);
InputStreamReader inputStreamReader = new InputStreamReader(inputStream);
BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
String line;
StringBuilder stringBuilder = new StringBuilder();
while ((line = bufferedReader.readLine()) != null) {
stringBuilder.append(line);
stringBuilder.append('\n');
}
inputStream.close();
text = stringBuilder.toString();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
sentenceIterator.setText(text);
int end = sentenceIterator.next();
System.out.println(end);
System.out.println(text.substring(0, end));
But if I compile and run the code from Eclipse as a Desktop app the text is split correctly. I don't understand why it doesn't do the same on Android app.
I tried to convert the text file to dos format, I even tried to read the file and preserve original line breaks:
Pattern pat = Pattern.compile(".*\\R|.+\\z");
StringBuilder stringBuilder = new StringBuilder();
try (Scanner in = new Scanner(file, "UTF-8")) {
String line;
while ((line = in.findWithinHorizon(pat, 0)) != null) {
stringBuilder.append(line);
}
text = stringBuilder.toString();
sentenceIterator.setText(text);
int end = sentenceIterator.next();
System.out.println(end);
System.out.println(text.substring(0, end));
}
but without success. Any ideas?
You can download an excerpt from the file (unix format) here: http://dropmefiles.com/TZgBp
I've just noticed that it can be reproduced without download of this file. Just create a string that has line breaks inside sentences (e.g. "Hello, \nworld!") and run an instrumented test. If BreakIterator is used in a usual test then it splits correctly.
I expect 2 sentences:
sentence 1:
Foreword
IF a colleague were to say to you, Spouse of me this night today
manufactures the unusual meal in a home.
sentence 2:
You will join?
Yes, they don't look great but at least you know why it is so (sentence delimiters are ?. etc.). But if the code runs on Android it creates a sentence even from
Foreword
for some reason...
I'm not sure whether it is a bug, or whether there is a workaround for it. But in my eyes it makes the Android version of BreakIterator useless as a sentence splitter, since it is normal for sentences in books to spread over multiple lines.
In all the experiments I've used the same import java.text.BreakIterator;
This is not really an answer, but it might give you some insights.
It is not a file encoding issue: I tried it this way and got the same faulty behaviour.
BreakIterator sentenceIterator = BreakIterator.getSentenceInstance(Locale.US);
String text = "Foreword\nIf a colleague were to say to you, Spouse of me this night today manufactures the unusual meal in a home. You will join?";
sentenceIterator.setText(text);
Android does not use the same Java implementation as your computer.
I noticed this when I printed out the class of the sentenceIterator object:
sentenceIterator.getClass()
I get different classes when running with IntelliJ and when running on Android:
Running with IntelliJ:
sun.util.locale.provider.RuleBasedBreakIterator
Running on Android:
java.text.RuleBasedBreakIterator
sun.util.locale.provider.RuleBasedBreakIterator has the behaviour you want.
I don't know how to get Android to use the good RuleBasedBreakIterator class. I don't even know if it is possible.
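One possible workaround, offered as an untested assumption rather than a confirmed fix: if the Android implementation breaks at every line feed (Unicode's sentence-boundary rules do treat a line feed as a paragraph separator), the hard-wrapped lines can be joined before the text reaches BreakIterator. The sketch below replaces each single '\n' with a space and keeps blank-line paragraph breaks. The class and method names are made up, and the text is assumed to use Unix line endings, as in the question.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical workaround sketch, not verified on an Android device.
final class SentenceSplitSketch {

    // Replaces each single '\n' with a space and leaves "\n\n" paragraph breaks
    // alone. One char is swapped for one char, so every offset in the result
    // still points at the same position in the original text.
    static String joinWrappedLines(String text) {
        return text.replaceAll("(?<!\n)\n(?!\n)", " ");
    }

    public static void main(String[] args) {
        String text = "Foreword\nIf a colleague were to say to you, Spouse of me this night "
                + "today manufactures the unusual meal in a home. You will join?";

        BreakIterator sentenceIterator = BreakIterator.getSentenceInstance(Locale.US);
        sentenceIterator.setText(joinWrappedLines(text));

        // Boundaries found in the joined text can be applied to the original text
        // because both strings have the same length.
        int end = sentenceIterator.next();
        System.out.println(text.substring(0, end));
    }
}
```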

Java charset - How to get correct input from System.in?

This is my first post here.
I'm building a simple app for messaging through the console (cmd and terminal), just for learning, but I have a problem when reading and writing text with a charset.
Here is my initial code for sending messages; Main.CHARSET is set to UTF-8:
Scanner teclado = new Scanner(System.in,Main.CHARSET);
BufferedWriter saida = new BufferedWriter(new OutputStreamWriter(new BufferedOutputStream(cliente.getOutputStream()),Main.CHARSET));
saida.write(nick + " conectado!");
saida.flush();
while (teclado.hasNextLine()) {
String s = teclado.nextLine(); // read the line that was just typed
saida.write(nick +": "+ s);
saida.flush();
}
And the receiving code:
try (BufferedReader br = new BufferedReader(new InputStreamReader(servidor,Main.CHARSET))){
String s;
while ((s = br.readLine()) != null) {
System.out.println(s);
}
}
When I send "olá" or anything like "ÁàçÇõÉ" (Brazilian Portuguese), I get just blank spaces in the Windows cmd (not tested on Linux).
So I tested the following code:
Scanner s = new Scanner(System.in,Main.CHARSET);
System.out.println(s.nextLine());
For the input "olá", it printed "ol ".
The question is: how do I read from the console so that the input is read correctly, can be transmitted to another user, and is displayed correctly to them?
If you just want to write Portuguese text to a text file, it is easy.
The only thing you have to take care of is viewing it with UTF-8 encoding.
You can do it in a really simple way, like this:
String text = "olá";
FileWriter fw = new FileWriter("hello.txt");
fw.write(text);
fw.close();
Then open hello.txt in Notepad or any text tool that supports UTF-8,
or change your tool's default encoding to UTF-8.
If you want to show it on the console, I think pvg has already answered you.
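One caveat with the FileWriter version: FileWriter(String) encodes with the JVM's default charset, which is not guaranteed to be UTF-8 on every system. A sketch that names the charset explicitly is below; the class name, the method name, and the temporary file are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original answer.
final class WriteUtf8 {

    // Writes the text as UTF-8 regardless of the platform's default charset.
    static void writeUtf8(Path file, String text) throws IOException {
        try (Writer writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("hello", ".txt");
        // \u00E1 is 'á'; the escape keeps this source file independent of its own encoding.
        writeUtf8(file, "ol\u00E1");
        System.out.println("Wrote " + Files.size(file) + " bytes to " + file);
    }
}
```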
OK, it seems you are still confused about it.
Here is some simple code you can try.
Scanner userInput = new Scanner(System.in); // type olá
String text = userInput.next();
System.out.println((int)text.charAt(2)); // the output here is 63
char word = 'á'; // this character's code is 225
int a = 225;
System.out.println((int)word); // output 225
System.out.println((char)a); // output á
So, what is the conclusion?
If you type Portuguese text into the console and read it back, you get a completely different character (code 63, which is '?', instead of 225), not random gibberish.
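The same effect can be reproduced without a console, which makes the cause easier to see: a Scanner only returns the right characters when the charset it is given matches the bytes it actually receives. In the sketch below a byte array stands in for the console input, and ISO-8859-1 is used only because every JVM supports it; the encoding a real Windows console sends may well be a different code page. The class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical demonstration, not part of the original answer.
final class ConsoleCharsetSketch {

    // Reads the first line from raw bytes, decoding them with the given charset,
    // the same way new Scanner(System.in, charsetName) decodes console input.
    static String firstLine(byte[] rawInput, Charset charset) {
        try (Scanner scanner = new Scanner(new ByteArrayInputStream(rawInput), charset.name())) {
            return scanner.nextLine();
        }
    }

    public static void main(String[] args) {
        // "olá" plus a newline, as a single-byte encoding would send it.
        byte[] typed = "ol\u00E1\n".getBytes(StandardCharsets.ISO_8859_1);

        String matching = firstLine(typed, StandardCharsets.ISO_8859_1);
        String mismatched = firstLine(typed, StandardCharsets.UTF_8);

        System.out.println("matching charset, last char code:   " + (int) matching.charAt(2));
        System.out.println("mismatched charset, last char code: " + (int) mismatched.charAt(2));
    }
}
```

So the fix for the original program is less about forcing UTF-8 on System.in and more about finding out which charset the console really uses, then passing that one to the Scanner.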

Android Reading File with Accents

I have a problem. I am using a file to load some strings to use them in my App.
I have this function:
public void lecturaFichero(){
String linea = null;
try {
InputStream in = cntx.getAssets().open("cc.txt");
if (in != null) {
InputStreamReader input = new InputStreamReader(in,Charset.forName("iso-8859-1"));
BufferedReader buffreader = new BufferedReader(input);
while ((linea = buffreader.readLine()) != null) {
rellenaCodigo(linea);
}
in.close();
}
} catch (IOException e) {
e.printStackTrace();
}
}
The problem is that when I run the app it crashes right away (reading the file is the first thing I do).
If I do this instead of the above:
InputStreamReader input = new InputStreamReader(in); // without specifying the charset
It does work and does not crash, but the app shows a replacement symbol where the special characters should be.
I need to solve this, I'd appreciate a solution in which I can read special characters in my app.
Thanks in advance.
PS: Android can print special characters because when I type a String by hand and print it on the screen it shows the character, the problem is when it comes to reading from the .txt.

How to use escape chars when reading a file?

I have a method that reads a random joke from a file in the raw folder and then displays it, but I can't figure out how to insert a new line.
Each joke is stored on a single line of the file, but obviously a joke spans more than one displayed line, so I use \n. For instance, the line says "Hi! \n Hi to you too". When I run my code, instead of:
Hi!
Hi to you too
it gives me
Hi! \n Hi to you too
I tried appending it, which didn't work with the code below. I also tried putting the joke in an array and displaying it from that array, which didn't work either. Any ideas would be much appreciated.
InputStreamReader inputStream = new InputStreamReader
(getResources().openRawResource(R.raw.vicove));
BufferedReader br = new BufferedReader(inputStream);
int numLines = 1;
Random r = new Random();
int desiredLine = r.nextInt(numLines);
String theLine="";
int lineCtr = 0;
try {
while ((theLine = br.readLine()) != null) {
if (lineCtr == desiredLine) {
break;
}
lineCtr++;
}
} catch (IOException e) {
e.printStackTrace();
}
textGenerateNumber.setText(String.valueOf(theLine));
I've used the same code to read files with no escape characters and it works perfectly...
P.S. Not only \n wouldn't work, but also \" and probably any other.
try this:
String source = "<br>"+String.valueOf(theLine);
textGenerateNumber.append(Html.fromHtml(source));
It's because the \n in the file is two ordinary characters (a backslash followed by n), not a newline.
You can use the Apache Commons library to unescape these sequences: StringEscapeUtils.unescapeJava.
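If pulling in a library is not an option, a small hand-written unescape covers the cases mentioned in the question. This is only a sketch with made-up names, and it handles just \n, \t, \" and \\; anything else after a backslash is left as written. StringEscapeUtils.unescapeJava covers many more cases.

```java
// Hypothetical helper, not part of the original answers.
final class UnescapeSketch {

    // Turns the two-character sequences \n, \t, \" and \\ into the single
    // characters they stand for. Unknown sequences are copied unchanged.
    static String unescape(String line) {
        StringBuilder out = new StringBuilder(line.length());
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && i + 1 < line.length()) {
                char next = line.charAt(++i);
                switch (next) {
                    case 'n':  out.append('\n'); break;
                    case 't':  out.append('\t'); break;
                    case '"':  out.append('"');  break;
                    case '\\': out.append('\\'); break;
                    default:   out.append('\\').append(next); // keep unknown escapes
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal below contains a backslash followed by 'n',
        // exactly what readLine() returns for the joke file.
        String fromFile = "Hi! \\n Hi to you too";
        System.out.println(unescape(fromFile));
    }
}
```

In the asker's code this would be applied to theLine just before the setText call.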
