I'm developing a small project and I'd like to use internationalization for it. The problem is that when I try to use a .properties file with Cyrillic symbols inside, the text is displayed as rubbish. When I hard-code the strings, they are displayed just fine.
Here is my code:
ResourceBundle labels = ResourceBundle.getBundle("Labels");
btnQuit = new JButton(labels.getString("quit"));
And in my .properties file:
quit = Изход
And I get rubbish. When I try
btnQuit = new JButton("Изход");
It is displayed correctly. As far as I am aware, UTF-8 is the encoding used for the files.
Any ideas?
AnyEdit is an Eclipse plugin that allows you to easily convert your properties files to and from Unicode notation (avoiding the use of command-line tools like native2ascii).
If you were using the Properties class alone (without resource bundle), since Java 1.6 you have the option to load the file with a custom encoding, using a Reader (rather than an InputStream)
I'd guess you can also use new PropertyResourceBundle(reader), rather than ResourceBundle.getBundle(..), where reader is:
Reader reader = new BufferedReader(new InputStreamReader(
        getClass().getResourceAsStream("messages.properties"), "utf-8"));
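As a minimal sketch of the PropertyResourceBundle(reader) idea above: the class name Utf8BundleDemo and the helper loadUtf8 are made up for illustration, and an in-memory byte array stands in for a UTF-8 encoded Labels.properties. The Cyrillic text is written with Unicode escapes in the Java source so the example does not depend on the compiler's source encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class Utf8BundleDemo {

    // Hypothetical helper: builds a bundle from a stream whose bytes are UTF-8.
    public static ResourceBundle loadUtf8(InputStream in) throws IOException {
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes of a UTF-8 encoded Labels.properties file.
        String fileText = "quit = \u0418\u0437\u0445\u043e\u0434\n";
        byte[] fileBytes = fileText.getBytes(StandardCharsets.UTF_8);

        ResourceBundle labels = loadUtf8(new ByteArrayInputStream(fileBytes));
        System.out.println(labels.getString("quit"));
    }
}
```

In a real project the stream would come from getClass().getResourceAsStream(...) as in the snippet above, and the returned string can be passed straight to new JButton(...).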
Properties are ISO-8859-1 encoded by default. You must use native2ascii to convert your UTF-8 properties to a valid ISO-8859-1 properties file containing unicode escape sequences for all non-ISO-8859-1 characters.
I have property with string that contains some special characters. After I save it to properties file I have:
BB\u0161BB=0
I don't like the character being represented as \u0161. Why can't it be saved as the character I see on the screen and type on the keyboard?
UPD
What is the easiest way to read an ini-style file that contains special characters?
That's how Properties files are defined to behave. Any other system is likely to use UTF-8 which might not be readable either.
As your character is outside the range of the ISO-8859-1 encoding, it has to be written as a Unicode escape instead.
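To illustrate the behaviour described above, here is a small sketch (the class and method names are invented for this example). It stores a key containing š through the OutputStream overload, shows the escaped text that ends up in the file, and loads it back to confirm that nothing is lost.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class StoreEscapeDemo {

    // Stores one property through store(OutputStream, ...) and returns the file text.
    public static String storeAsLatin1(String key, String value) throws IOException {
        Properties props = new Properties();
        props.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        props.store(out, null); // writes ISO-8859-1 and escapes everything else
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Loads the text back through load(InputStream), which decodes the escapes.
    public static Properties load(String fileText) throws IOException {
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(fileText.getBytes(StandardCharsets.ISO_8859_1)));
        return props;
    }

    public static void main(String[] args) throws IOException {
        String key = "BB\u0161BB"; // the key from the question, with a real s-caron
        String fileText = storeAsLatin1(key, "0");
        System.out.println(fileText);                       // contains the escaped key
        System.out.println(load(fileText).getProperty(key)); // the value is found again
    }
}
```

The escape only affects how the file looks on disk; the key and value seen by the program are unchanged after a store and load round trip.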
You can!
Encode your file in unicode (UTF-8, for instance) using another application (Notepad++ is nice, for instance) and read your properties file like this:
File file = ... ;
Properties properties = new Properties();
try (FileInputStream in = new FileInputStream(file);
     Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
    properties.load(reader);
}
// use properties
There you go, properties file in any charset that you can read and use.
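The same approach should work in the other direction too. A rough sketch, assuming Java 7 or later and using made-up names (Utf8StoreDemo, storeUtf8, loadUtf8): writing through a Writer keeps the character as typed instead of escaping it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class Utf8StoreDemo {

    // store(Writer, ...) does not escape non-ASCII characters; the Writer picks the charset.
    public static byte[] storeUtf8(Properties props) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            props.store(writer, null);
        }
        return out.toByteArray();
    }

    // load(Reader) reads characters as decoded by the Reader, here UTF-8.
    public static Properties loadUtf8(byte[] bytes) throws IOException {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.setProperty("BB\u0161BB", "0");
        byte[] bytes = storeUtf8(props);
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
        System.out.println(loadUtf8(bytes).getProperty("BB\u0161BB"));
    }
}
```

A file written this way is only readable by code that also goes through a Reader with the same charset; load(InputStream) would misread it.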
You can't. From javadoc:
...the input/output stream is encoded in ISO 8859-1 character
encoding. Characters that cannot be directly represented in this
encoding can be written using Unicode escapes as defined in section
3.3 of The Java™ Language Specification; only a single 'u' character is allowed in an escape sequence.
Please refer to this How to use UTF-8 in resource properties with ResourceBundle
Properties files are ISO-8859-1 encoded, so characters outside of that set need to be \uXXXX escaped.
This is my first time using FreeMarker in a Java project, and I'm stuck on getting Chinese characters to work.
I tried a lot of examples to fix the code, like the one below, but I still can't get it to work.
// Free-marker configuration object
Configuration conf = new Configuration();
conf.setTemplateLoader(new ClassTemplateLoader(getClass(), "/"));
conf.setLocale(Locale.CHINA);
conf.setDefaultEncoding("UTF-8");
// Load template from source folder
Template template = conf.getTemplate(templatePath);
template.setEncoding("UTF-8");
// Get Free-Marker output value
Writer output = new StringWriter();
template.process(input, output);
// Map Email Full Content
EmailNotification email = new EmailNotification();
email.setSubject(subject);
.......
I saw some examples that say to change freemarker.properties, but I don't have this file. I just imported the .jar file and use it.
Please advise what I should do to make it display Chinese characters.
What exactly is the problem?
Anyway, cfg.setDefaultEncoding("UTF-8"); should be enough, assuming your template files are indeed in UTF-8. But another place where you have to ensure proper encoding is where you convert the template output back to "binary" from Unicode text. FreeMarker sends its output into a Writer, so everything is Unicode so far, but then you will have an OutputStreamWriter or something like that, and that has to use a charset (UTF-8 probably) that can encode Chinese characters.
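The second point can be shown without FreeMarker at all. In this sketch (names invented, plain JDK only) a string stands in for the text that template.process wrote into the StringWriter, and the only thing that varies is the charset of the OutputStreamWriter.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class OutputEncodingDemo {

    // Converts already-rendered text to bytes with the given charset.
    public static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the template output: two Chinese characters.
        String rendered = "\u4f60\u597d";

        byte[] utf8 = encode(rendered, StandardCharsets.UTF_8);
        byte[] latin1 = encode(rendered, StandardCharsets.ISO_8859_1);

        // UTF-8 can represent the text; ISO-8859-1 replaces each character with '?'.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(rendered));
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1));
    }
}
```

If the rendered text goes into an e-mail, as in the question, the charset declared for the message body would need to match the one used here; that part depends on the mail API and is not shown.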
You need to change your file encoding of your .ftl template files by saving over them in your IDE or notepad, and changing the encoding in the save dialog.
There should be an Encoding dropdown at the bottom of the save dialog.
I have a project where everything is in UTF-8. I was using the Properties.load(Reader) method to read properties files in this encoding. But now, I need to make the project compatible with Java 1.5, and the mentioned method doesn't exist in Java 1.5. There is only a load method that takes an InputStream as a parameter, which is assumed to be in ISO-8859-1.
Is there any simple way to make my project 1.5-compatible without having to change all the .properties files to ISO-8859-1? I don't really want to have a mix of encodings in my project (encodings are already a time sink one at a time, let alone when you mix them) or change all my project to ISO-8859-1.
With "a simple way" I mean "without creating a custom Properties class from scratch".
Could you use xml-properties instead? As I understand by the spec .properties files should be in ISO-8859-1, if you want other characters, they should be quoted, using the native2ascii tool.
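For the XML route, a minimal sketch using only calls that exist since Java 1.5 (storeToXML and loadFromXML); the class and method names are made up. The encoding is declared in the XML header, so the loader does not have to guess it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Properties;

public class XmlPropertiesDemo {

    // Writes the properties as an XML document encoded in UTF-8.
    public static byte[] toXml(Properties props) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        props.storeToXML(out, null, "UTF-8");
        return out.toByteArray();
    }

    // Reads the XML document back; the encoding comes from the XML declaration.
    public static Properties fromXml(byte[] xml) throws IOException {
        Properties props = new Properties();
        props.loadFromXML(new ByteArrayInputStream(xml));
        return props;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.setProperty("quit", "\u0418\u0437\u0445\u043e\u0434");
        byte[] xml = toXml(props);
        System.out.println(new String(xml, "UTF-8"));
        System.out.println(fromXml(xml).getProperty("quit"));
    }
}
```

The cost is that every .properties file has to be converted to the XML format once.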
One strategy that might work for this situation is as follows:
Read the bytes of the Reader into a ByteArrayOutputStream.
Once that is completed, call toByteArray() See below.
With the byte[] construct a ByteArrayInputStream
Use the ByteArrayInputStream in Properties.load(InputStream)
As pointed out, the above failed to actually convert the character set from UTF-8 to ISO-8859-1. To fix that, a tweak.
After the BAOS has been filled, instead of calling toByteArray()..
Call toString("UTF-8") to decode the UTF-8 bytes into a String. Then look to..
Call String.getBytes("ISO-8859-1") to get the byte[]. Characters that ISO-8859-1 cannot represent are replaced with '?' in this step.
What you can do is open a thread that would read data using a BufferedReader then write out the data to a PipedOutputStream which is then linked by a PipedInputStream that load uses.
PipedOutputStream pos = new PipedOutputStream();
PipedInputStream pis = new PipedInputStream(pos);
ReaderRunnable reader = new ReaderRunnable(pos, new File("utfproperty.properties"));
Thread t = new Thread(reader);
t.start();
properties.load(pis);
t.join();
The BufferedReader will read the data one character at a time, and if a character is not within the US-ASCII (i.e. low 7-bit) range, it writes "\u" + the character code into the PipedOutputStream.
ReaderRunnable would be a class that looks like:
public class ReaderRunnable implements Runnable {
    public ReaderRunnable(OutputStream os, File f) {
        this.os = os;
        this.f = f;
    }
    private final OutputStream os;
    private final File f;
    public void run() {
        // open file
        // read file, escape any non US-ASCII characters
    }
}
Now after writing all that I was thinking that someone should've had this problem before and solved it, and the best place to look for these things is in Apache Commons. Fortunately, they have an implementation there.
https://commons.apache.org/io/apidocs/org/apache/commons/io/input/ReaderInputStream.html
The implementation from Apache is not without flaws though. Your input file even if it is UTF-8 must only contain the characters from the ISO-8859-1 character set. The design I had provided above can handle that situation.
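For files small enough to hold in memory, the escaping idea can also be tried without a second thread. The following is only a sketch under that assumption; EscapingPropertiesLoader and its methods are hypothetical names, and it sticks to APIs that exist in Java 1.5.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.Properties;

public class EscapingPropertiesLoader {

    // Replaces every character above the printable ASCII range with a
    // four-digit backslash-u escape, the notation load(InputStream) understands.
    public static String escapeNonAscii(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7E) {
                String hex = Integer.toHexString(c);
                sb.append("\\u");
                for (int pad = hex.length(); pad < 4; pad++) {
                    sb.append('0');
                }
                sb.append(hex);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Reads UTF-8 text, escapes it, and feeds the pure-ASCII result to load(InputStream).
    public static Properties load(InputStream utf8In) throws IOException {
        Reader reader = new InputStreamReader(utf8In, "UTF-8");
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[4096];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            text.append(buffer, 0, n);
        }
        byte[] ascii = escapeNonAscii(text.toString()).getBytes("US-ASCII");
        Properties props = new Properties();
        props.load(new ByteArrayInputStream(ascii));
        return props;
    }
}
```

Usage would be along the lines of EscapingPropertiesLoader.load(new FileInputStream("utfproperty.properties")), with the caller closing the stream. Because every non-ASCII character is escaped, the input is not limited to the ISO-8859-1 character set.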
Depending on your build engine you can \uXXXX-escape the properties into the build target directory. Maven can filter them via the native2ascii-maven-plugin.
What I personally do in my projects is I keep my properties in UTF-8 files with an extension .uproperties and I convert them to ISO at the build time to .properties files using native2ascii.exe. This allows me to maintain my properties in UTF-8 and the Ant script does everything else for me.
What I just experienced: make all .java files UTF-8 encoded as well (not only the properties file where you store UTF-8 characters). That way there is no need to use an InputStreamReader. Also, make sure to compile with UTF-8 encoding.
This has worked for me without any added parameter of UTF-8.
To test this, write a simple stub program in Eclipse and change the encoding of that Java file: open the file's properties and, in the Resource section, set the encoding to UTF-8.
I want to write "Arabic" in the message resource bundle (properties) file but when I try to save it I get this error:
"Save couldn't be completed
Some characters cannot be mapped using "ISO-8859-1" character encoding. Either change encoding or remove the character ..."
Can anyone guide please?
I want to write:
global.username = اسم المستخدم
How should I write the Arabic for "username" in the properties file so that internationalization works?
http://sourceforge.net/projects/eclipse-rbe/
You can use the above plugin for eclipse IDE to make the Unicode conversion for you.
As described in the class reference for "Properties"
The load(Reader) / store(Writer, String) methods load and store properties from and to
a character based stream in a simple line-oriented format specified below.
The load(InputStream) / store(OutputStream, String) methods work the same way as the
load(Reader)/store(Writer, String) pair, except the input/output stream is encoded in
ISO 8859-1 character encoding. Characters that cannot be directly represented in this
encoding can be written using Unicode escapes ; only a single 'u' character is allowed
in an escape sequence. The native2ascii tool can be used to convert property files to
and from other character encodings.
Properties-based resource bundles must be encoded in ISO-8859-1 to use the default loading mechanism, but I have successfully used this code to allow the properties files to be encoded in UTF-8:
private static class ResourceControl extends ResourceBundle.Control {
    @Override
    public ResourceBundle newBundle(String baseName, Locale locale,
            String format, ClassLoader loader, boolean reload)
            throws IllegalAccessException, InstantiationException,
            IOException {
        String bundlename = toBundleName(baseName, locale);
        String resName = toResourceName(bundlename, "properties");
        InputStream stream = loader.getResourceAsStream(resName);
        return new PropertyResourceBundle(new InputStreamReader(stream,
                "UTF-8"));
    }
}
Then of course you have to change the encoding of the file itself to UTF-8 in your IDE, and can use it like this:
ResourceBundle bundle = ResourceBundle.getBundle(
"package.Bundle", new ResourceControl());
new String(ret.getBytes("ISO-8859-1"), "UTF-8"); worked for me.
Property file saved in ISO-8859-1 encoding.
If you are using Eclipse, you can choose "Window-->Preferences" and then filter on "content types". Then you should be able to set the default encoding. There's a screen shot showing this at the top of this post.
This is mainly an editor configuration issue. If you're working in Windows, you can edit the text in an editor that supports UTF-8. Notepad or the Eclipse built-in editor should be more than enough, provided you've saved the file as UTF-8. In Linux, I've used gedit and emacs successfully. In Notepad, you can do this by clicking the 'Save As' button and choosing 'UTF-8' encoding. Other editors should have a similar feature. Some editors might require a font change in order to display letters correctly, but it seems that you don't have this issue.
Having said that, there are other steps to consider when performing i18n for Arabic. You can find some useful links below. Make sure to run native2ascii on the properties file before using it, otherwise it might not work. I spent a lot of time until I figured this one out.
Tomcat webapps
Using native2ascii with properties files
Besides the native2ascii tool mentioned in other answers, there is an open-source Java library that provides the conversion functionality for use in code.
The MgntUtils library has a utility that converts Strings in any language (including special characters and emojis) to Unicode sequences and vice versa:
String result = "Hello World";
result = StringUnicodeEncoderDecoder.encodeStringToUnicodeSequence(result);
System.out.println(result);
result = StringUnicodeEncoderDecoder.decodeUnicodeSequenceToString(result);
System.out.println(result);
The output of this code is:
\u0048\u0065\u006c\u006c\u006f\u0020\u0057\u006f\u0072\u006c\u0064
Hello World
The library can be found at Maven Central or on GitHub. It comes as a Maven artifact with sources and javadoc.
Here is javadoc for the class StringUnicodeEncoderDecoder
I've recently had to switch encoding of webapp I'm working on from ISO-xx to utf8. Everything went smooth, except properties files. I added -Dfile.encoding=UTF-8 in eclipse.ini and normal files work fine. Properties however show some strange behaviour.
If I copy utf8 encoded properties from Notepad++ and paste them in Eclipse, they show and work fine. When I reopen properties file, I see some Unicode characters instead of proper ones, like:
Zur\u00EF\u00BF\u00BDck instead of Zurück
but app still works fine.
If I start to edit properties, add some special characters and save, they display correctly, however they don't work and all previously working special characters don't work any more.
When I compare local version with CVS I can see special characters correctly on remote file and after update I'm at start again: app works, but Eclipse displays Unicode chars.
I tried changing file encoding by right clicking it and selecting „Other: UTF8” but it didn't help. It also said: „determined from content: ISO-8859-1”
I'm using Java 6 and Jboss Developer based on Eclipse 3.3
I can live with it by editing properties in Notepad++ and pasting them in Eclipse, but I would be grateful if someone could help me with fixing this in Eclipse.
Answer for "pre-Java-9" is below. As of Java 9, properties files loaded through ResourceBundle are read as UTF-8 by default, falling back to ISO-8859-1 if an invalid UTF-8 byte sequence is detected; Properties.load(InputStream) and Properties.store(OutputStream, String) still use ISO-8859-1. See the Java 9 release notes for details.
Properties files are ISO-8859-1 by definition - see the docs for the Properties class.
Spring has a replacement which can load with a specified encoding, using PropertiesFactoryBean.
EDIT: As Laurence noted in the comments, Java 1.6 introduced overloads for load and store which take a Reader/Writer. This means you can create a reader for the file with whatever encoding you want, and pass it to load. Unfortunately FileReader still doesn't let you specify the encoding in the constructor (aargh) so you'll be stuck with chaining FileInputStream and InputStreamReader together. However, it'll work.
For example, to read a file using UTF-8:
Properties properties = new Properties();
InputStream inputStream = new FileInputStream("path/to/file");
try {
    Reader reader = new InputStreamReader(inputStream, "UTF-8");
    try {
        properties.load(reader);
    } finally {
        reader.close();
    }
} finally {
    inputStream.close();
}
Don't waste your time, you can use Resource Bundle plugin in Eclipse
Old Sourceforge page
It is not a problem with Eclipse. If you are using the Properties class to read and store the properties file, the class will escape all special characters.
From the class documentation:
When saving properties to a stream or loading them from a stream, the ISO 8859-1 character encoding is used. For characters that cannot be directly represented in this encoding, Unicode escapes are used; however, only a single 'u' character is allowed in an escape sequence. The native2ascii tool can be used to convert property files to and from other character encodings.
From the API, store() method:
Characters less than \u0020 and characters greater than \u007E are written as \uxxxx for the appropriate hexadecimal value xxxx.
Properties props = new Properties();
URL resource = getClass().getClassLoader().getResource("data.properties");
props.load(new InputStreamReader(resource.openStream(), "UTF8"));
Works like a charm
:-)
There are too many points in the process you describe where errors can occur, so I won't try to guess what you're doing wrong, but I think I know what's happening under the hood.
EF BF BD is the UTF-8 encoded form of U+FFFD, the standard replacement character that's inserted by decoders when they encounter malformed input. It sounds like your text is being saved as ISO-8859-1, then read as if it were UTF-8, then saved as UTF-8, then converted to the Properties format using native2ascii using the platform default encoding (e.g., windows-1252).
ü => 0xFC // save as ISO-8859-1
0xFC => U+FFFD // read as UTF-8
U+FFFD => 0xEF 0xBF 0xBD // save as UTF-8
0xEF 0xBF 0xBD => \u00EF\u00BF\u00BD // native2ascii
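The chain above can be replayed in a few lines. This is a sketch with invented names; ISO-8859-1 is used as a stand-in for the platform default in the last step, which is an assumption (windows-1252 maps these three bytes to the same characters), and toEscapes only imitates what native2ascii does for non-ASCII characters.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {

    // Replays the four steps on an in-memory string.
    public static String mangle(String original) {
        byte[] savedAsLatin1 = original.getBytes(StandardCharsets.ISO_8859_1); // save as ISO-8859-1
        String readAsUtf8 = new String(savedAsLatin1, StandardCharsets.UTF_8); // read as UTF-8
        byte[] savedAsUtf8 = readAsUtf8.getBytes(StandardCharsets.UTF_8);      // save as UTF-8
        return new String(savedAsUtf8, StandardCharsets.ISO_8859_1);           // read byte-per-char
    }

    // Imitates native2ascii: non-ASCII characters become backslash-u escapes.
    public static String toEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c > 0x7E) {
                sb.append(String.format("\\u%04X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "Zur\u00FCck"; // the word from the question, with a real u-umlaut
        System.out.println(toEscapes(mangle(original)));
    }
}
```

Running it should print the same Zur\u00EF\u00BF\u00BDck sequence that shows up in the question, which supports the explanation that the damage happens when ISO-8859-1 bytes are decoded as UTF-8.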
I suggest you leave the "file.encoding" property alone. Like "file.separator" and "line.separator", it's not nearly as useful as you would expect it to be. Instead, get into the habit of always specifying an encoding when reading and writing text files.
Properties props = new Properties();
URL resource = getClass().getClassLoader().getResource("data.properties");
props.load(new InputStreamReader(resource.openStream(), "UTF8"));
This works well in Java 1.6. How can I do this in 1.5, since the Properties class there does not have a load method that accepts an InputStreamReader?
There is a much easier way:
props.load(new InputStreamReader(new FileInputStream("properties_file"), "UTF8"));
Just another Eclipse plugin for *.properties files:
Properties Editor
You can define UTF-8 .properties files to store your translations and use ResourceBundle, to get values. To avoid problems you can change encoding:
String value = RESOURCE_BUNDLE.getString(key);
return new String(value.getBytes("ISO-8859-1"), "UTF-8");
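Why the re-decoding trick can work is easier to see on a concrete value. The sketch below (class and method names are hypothetical) assumes the file on disk really is UTF-8 and that it was read through the ISO-8859-1 path, as Properties.load(InputStream) does.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class Latin1RoundTripDemo {

    // Undoes an ISO-8859-1 decode of bytes that were actually UTF-8.
    public static String repair(String misdecoded) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a UTF-8 file containing one Cyrillic letter as the value.
        byte[] utf8File = "name=\u044f\n".getBytes(StandardCharsets.UTF_8);

        Properties props = new Properties();
        props.load(new ByteArrayInputStream(utf8File)); // bytes are decoded as ISO-8859-1

        String raw = props.getProperty("name");
        System.out.println(raw.length());          // two characters, one per UTF-8 byte
        System.out.println(repair(raw).length());  // one character after the repair
    }
}
```

ISO-8859-1 maps every byte to exactly one character, so the original bytes can be recovered. The trick breaks if the value was already decoded correctly (for example by a loader that reads UTF-8 itself), because getBytes("ISO-8859-1") then turns unmappable characters into '?'.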
This seems to work only for some characters, including the special characters of German, Portuguese and French. However, I ran into trouble with Russian, Hindi and Mandarin characters. These are not converted to the Properties format by native2ascii; instead they get saved as ?? ?? ??
The only way I could get my app to display these characters correctly was by putting them in the properties file as Unicode escapes - \u0915 instead of क, or \u044F instead of я.
Any advice?
I recommend using Attesoro (http://attesoro.org/). It is simple and easy to use, and it is written in Java.
I found a solution to this problem. You need to write the *.properties file using the standard Properties class, for example:
Properties properties = new Properties();
properties.put("DB_DRIVER", "com.mysql.cj.jdbc.Driver");
properties.put("DB_URL", "jdbc:mysql://localhost:3306/world");
properties.put("DB_USERNAME", "root");
properties.put("DB_PASSWORD", "1111");
properties.put("DB_AUTO_RECONNECT", "true");
properties.put("DB_CHARACTER_ENCODING", "UTF-8");
properties.put("DB_USE_UNICODE", "true");
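try (FileWriter writer = new FileWriter("src/connectionDB/base/db.properties")) {
    // try-with-resources closes the writer, which store() itself does not do
    properties.store(writer, "Comment writes");
} catch (IOException e) {
    System.out.println(e.getMessage());
}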
then, you can read file without mistakes:
try {
properties.load(new FileReader("src\\connectionDB\\base\\db.properties"));
properties.list(System.out);
} catch (IOException ex) {
System.out.println(ex.getMessage());
}
or
try {
String str = new String(Files.readAllBytes(Paths.get("src/connectionDB/base/db.properties")), StandardCharsets.UTF_8);
properties.load(new StringReader(str));
properties.list(System.out);
} catch (IOException e) {
System.out.println(e.getMessage());
}
or
InputStream inputStream = getClass().getClassLoader().getResourceAsStream("connectionDB/base/db.properties");
try {
Reader reader = new InputStreamReader(inputStream, "UTF-8");
try {
properties.load(reader);
properties.list(System.out);
} catch (IOException e) {
System.out.println(e.getMessage());
}
} catch (UnsupportedEncodingException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
never mind....
Then remove the code that creates this file and use the *.properties file.
If the properties are for XML or HTML, it's safest to use XML entities. They're uglier to read, but it means that the properties file can be treated as straight ASCII, so nothing will get mangled.
Note that HTML has entities that XML doesn't, so I keep it safe by using straight XML: http://www.w3.org/TR/html4/sgml/entities.html
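One way to produce such ASCII-only values is to use numeric character references, which both XML and HTML accept. A small sketch with made-up names, not tied to any particular library:

```java
public class XmlEntityEscaper {

    // Replaces every code point above printable ASCII with a decimal
    // numeric character reference such as &#252;
    public static String toNumericEntities(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (codePoint > 0x7E) {
                sb.append("&#").append(codePoint).append(';');
            } else {
                sb.appendCodePoint(codePoint);
            }
            i += Character.charCount(codePoint);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toNumericEntities("Zur\u00fcck")); // Zur&#252;ck
    }
}
```

The result contains only ASCII, so it survives any of the encodings discussed above, but it is only appropriate when the value ends up in XML or HTML output and is not escaped a second time on the way there.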