How can I write UTF-8 characters in a Java application? - java

I want to write
ısı
to a CSV file from a Java application in NetBeans. It works fine when I debug the code, but when I clean and build the project and run the .jar application, the CSV contains
?s?
How can I solve this?
Thanks in advance.
EDIT
I use this to write :
PrintWriter csvWriter = new PrintWriter(new File("myfile.csv")) ;
csvWriter.println("ısı") ;

With this code:
PrintWriter csvWriter = new PrintWriter(new File("myfile.csv")) ;
csvWriter.println("ısı") ;
you are using the default character encoding of your system, which may or may not be UTF-8. If you want to use UTF-8, you have to specify that:
PrintWriter csvWriter = new PrintWriter(new File("myfile.csv"), "UTF-8");
Note that even if you do this, you might still see unexpected output. If that's the case, then you will need to check if whatever program you use to display the output (the Windows command prompt, or a text editor, or ...) understands that the file is in UTF-8 and displays it correctly.
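A minimal sketch of the same idea using java.nio, assuming Java 7 or later. The class name Utf8CsvWriter and the method writeLine are hypothetical names for illustration. The text is written with Unicode escapes so the result does not depend on how the compiler decodes the source file.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Utf8CsvWriter {

    // Writes one line to the file, always encoded as UTF-8,
    // regardless of the platform default charset.
    static void writeLine(Path file, String line) throws IOException {
        try (PrintWriter csvWriter = new PrintWriter(
                Files.newBufferedWriter(file, StandardCharsets.UTF_8))) {
            csvWriter.println(line);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get("myfile.csv");
        // "\u0131s\u0131" is the string from the question, written with escapes.
        writeLine(file, "\u0131s\u0131");

        // Read the bytes back as UTF-8 to confirm the round trip.
        String readBack = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        System.out.println(readBack.trim().equals("\u0131s\u0131"));
    }
}
```

If the read-back check passes but a viewer still shows wrong characters, the viewer's encoding setting is the next thing to check.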

Related

OpenCSV writing unescaped escape char

I am using OpenCSV 2.3 to read and write data files. When I switch the Windows PC to Japanese, I notice that the OpenCSV write method, which uses a PrintWriter internally, converts the yen character to \
As a result, the CSV file that is created ends up with an unescaped \, and reading such a file with CSVReader fails.
How can I fix this problem?
After investigating further, I noticed that this is not a problem with the CSVWriter write method, which works fine.
So where is the problem?
Previously I was using FileWriter, which uses the system default encoding. (In other words, if we use FileWriter, the encoding used when writing and reading files is at the mercy of the writer's default.)
So I now use
csvReader = new CSVReader(new BufferedReader(new InputStreamReader(new FileInputStream(inputFile), "UTF-8")));
to tell the reader and the writer to read and write files in a specified encoding rather than the system default.

can not save utf8 file in windows server with java

I have a simple Java application that saves some strings in UTF-8 encoding.
But when I open the file with Notepad and choose Save As, it shows the encoding as ANSI. I don't know where the problem is.
My code that saves the file is
File fileDir = new File("c:\\Sample.txt");
Writer out = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream(fileDir), "UTF8"));
out.append("kodehelp UTF-8").append("\r\n");
out.append("??? UTF-8").append("\r\n");
out.append("???? UTF-8").append("\r\n");
out.flush();
out.close();
The characters you are writing to the file, as they appear in the code snippet, are in the basic ASCII subset of UTF-8. Notepad is likely auto-detecting the format and, seeing nothing outside the ASCII range, decides the file is ANSI.
If you want to force a different decision, write characters such as 字 or õ, which are well outside the ASCII range.
It is possible that the ??? strings in your example were intended to be non-ASCII text. If so, make sure your IDE and/or build tool recognizes the source files as UTF-8, and that the files are indeed UTF-8 encoded. If you provide more information about your build system, then we can help further.
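The point about auto-detection can be checked directly. The sketch below compares the bytes a string produces in UTF-8 and in windows-1252, which is what Notepad calls ANSI on a Western European system. The class and method names are hypothetical, and it assumes the JDK ships the windows-1252 charset, which common builds do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class AsciiVersusUtf8 {

    // True when the text encodes to identical bytes in UTF-8 and windows-1252,
    // in which case a viewer cannot tell the two encodings apart.
    static boolean sameBytesInBothEncodings(String text) {
        Charset ansi = Charset.forName("windows-1252");
        byte[] asUtf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] asAnsi = text.getBytes(ansi);
        return Arrays.equals(asUtf8, asAnsi);
    }

    public static void main(String[] args) {
        // ASCII-only text, including literal question marks: identical bytes.
        System.out.println(sameBytesInBothEncodings("kodehelp UTF-8")); // true
        System.out.println(sameBytesInBothEncodings("??? UTF-8"));      // true
        // "\u00f5" is the letter o with tilde: two bytes in UTF-8, one in windows-1252.
        System.out.println(sameBytesInBothEncodings("\u00f5 UTF-8"));   // false
    }
}
```

Writing non-ASCII text as \u escapes, as in the last call, also sidesteps the question of which encoding the compiler assumes for the source file.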

How to convert strange character from web page?

On the web page, the text is "Why don't we".
But when I parse the webpage and save it to a text file, it becomes this under eclipse:
Why don鈥檛 we
More information about my implementation:
The webpage is: utf-8
I use jSoup to parse, the file is saved as a txt.
I use FileWriter f = new FileWriter() to write to file.
UPDATE:
I actually solved the display problem in Eclipse by changing Eclipse's encoding to UTF-8.
FileWriter is a utility class that uses the current platform's default encoding. That is non-portable, and probably incorrect here.
BufferedWriter f = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream(file), StandardCharsets.UTF_8));
f.write("\uFEFF"); // Redundant BOM character might be written to be sure
// the text is read as UTF-8
...

Exporting CSV in French shows junk characters

I am having a problem in exporting a csv file using au.com.bytecode.opencsv.CSVWriter. I did something like:
File file = File.createTempFile("UserDetails_", ".csv");
CSVWriter writer = new CSVWriter(new OutputStreamWriter(
new FileOutputStream(file), "UTF-8"),
',');
and then when I export the .csv file, it shows junk characters for French letters. (The data to be saved in the .csv contains French characters.)
But previously I was doing something like:
CSVWriter writer = new CSVWriter(new FileWriter(file));
and then all French characters showed perfectly in the Windows environment, but in the Prod environment (Linux) they showed as junk. So I decided to use the UTF-8 character set for the exported file.
How can I get rid of the problem?
Please Suggest!!
Thanks in advance!
Hypothesis: you use Excel to open your CSVs under Windows.
Unfortunately for you, Excel is crap at reading UTF-8. Even though it should not be required, Excel expects to have a byte order mark at the beginning of the CSV if it uses any UTF-* encoding, otherwise it will try and read it using Windows 1252!
Solution? Errr... Don't use Excel?
Anyway, with your old way:
CSVWriter writer = new CSVWriter(new FileWriter(file));
this would use the JVM's default encoding, which is typically windows-1252 on a Western European Windows system and usually UTF-8 on Linux.
Note that Apache's commons-io has a BOMInputStream class which may help when reading files that start with a BOM.
Another solution would be (ewwww) to always read/write using Windows-1252.
Other note: if you use Java 7, use the Files.newBuffered{Reader,Writer}() methods -- and the try-with-resources statement.
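A minimal sketch combining the points above, assuming Java 7 or later: it uses Files.newBufferedWriter with try-with-resources and writes a byte order mark first so that Excel can recognize the file as UTF-8. The class name BomCsvExport, the method writeWithBom, and the sample rows are hypothetical and not taken from the question.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class BomCsvExport {

    // Writes the rows as UTF-8, preceded by U+FEFF. In UTF-8 that character
    // is encoded as the three bytes EF BB BF, the byte order mark.
    static void writeWithBom(Path file, List<String> rows) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            out.write('\uFEFF');
            for (String row : rows) {
                out.write(row);
                out.write("\r\n"); // CSV files conventionally use CRLF line endings
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("UserDetails_", ".csv");
        // Illustrative data only; accented letters are written as escapes.
        writeWithBom(file, Arrays.asList("nom,ville", "Ren\u00e9e,Orl\u00e9ans"));
        System.out.println("Wrote " + file);
    }
}
```

The rows here are passed through unchanged. With OpenCSV, the same BOM could be written to the underlying writer before handing it to CSVWriter, so that quoting and escaping stay with the library.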

Format of the output

I am using Eclipse to run my program. My program gives 1000 lines as output, and I write the output to a text file successfully. The problem is that the output in the text file is not the same as on the console. On the console the lines are separate, but in the text file all lines are appended as one line.
How can I get the same format as the console in a text file?
You will have to make sure of the following:
When writing a line to a file, you include the line separator character(s). You can get the platform's line separator using the following
System.getProperty("line.separator");
When viewing the text file, some apps (like Notepad) may not display newline characters the same way as others
The app you are using to view the file will need to be set to view in a monospaced font (such as Courier New)
Completely guessing what you are doing, but I think you need to do this.
BufferedWriter bw = new BufferedWriter(new FileWriter(f, false));
while ( rs.next() ) {
// code to write a line.
bw.write("\r\n");
}
use
bw.write("\r\n");
instead of
bw.newLine();
This is for Windows systems; POSIX systems do newlines differently, I believe.
\n is the newline character, just remember that.
Well if you are using a PrintWriter I would simply do
PrintWriter pw = new PrintWriter(file);
while(...you still have data){
pw.println(<yourString>);
}
You can also append the string "\n" to create a new line manually.
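A minimal sketch of writing one record per line, assuming Java 7 or later. The names LinePerRecord and writeLines are hypothetical. BufferedWriter.newLine() writes the platform's line separator, the same value that System.getProperty("line.separator") returns. If the file must have Windows line endings on every platform, writing "\r\n" explicitly is the alternative.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class LinePerRecord {

    // Writes each string on its own line, terminated by the platform separator.
    static void writeLines(Path file, List<String> lines) throws IOException {
        try (BufferedWriter bw = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                bw.write(line);
                bw.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("output", ".txt");
        writeLines(file, Arrays.asList("first", "second", "third"));
        // Reading the file back line by line shows the records stayed separate.
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8).size());
    }
}
```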
