Java: How to write "Arabic" in properties file? - java

I want to write "Arabic" in the message resource bundle (properties) file but when I try to save it I get this error:
"Save couldn't be completed
Some characters cannot be mapped using "ISO-8859-1" character encoding. Either change encoding or remove the character ..."
Can anyone guide please?
I want to write:
global.username = اسم المستخدم
How should I write the Arabic for "username" in the properties file so that internationalization works?
BR
SC

http://sourceforge.net/projects/eclipse-rbe/
You can use the above plugin for eclipse IDE to make the Unicode conversion for you.

As described in the class reference for "Properties"
The load(Reader) / store(Writer, String) methods load and store properties from and to
a character based stream in a simple line-oriented format specified below.
The load(InputStream) / store(OutputStream, String) methods work the same way as the
load(Reader)/store(Writer, String) pair, except the input/output stream is encoded in
ISO 8859-1 character encoding. Characters that cannot be directly represented in this
encoding can be written using Unicode escapes ; only a single 'u' character is allowed
in an escape sequence. The native2ascii tool can be used to convert property files to
and from other character encodings.
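As a rough illustration of the Unicode-escape route described above, the sketch below escapes the Arabic text from the question in a way similar to what native2ascii produces, then loads the result through load(InputStream). The class and method names (EscapedPropertiesDemo, toUnicodeEscapes, loadUsername) are invented for this example and are not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class EscapedPropertiesDemo {

    // The Arabic text from the question, written with Java escapes so the
    // sketch does not depend on the encoding of this source file.
    static final String USERNAME_AR =
            "\u0627\u0633\u0645 \u0627\u0644\u0645\u0633\u062a\u062e\u062f\u0645";

    // Replaces every character outside US-ASCII with a backslash-u escape
    // followed by four hex digits (one escape per UTF-16 code unit).
    static String toUnicodeEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c < 128) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    // Builds one escaped property line and loads it the way the default
    // mechanism does: as an ISO-8859-1 byte stream.
    static String loadUsername() {
        String line = "global.username = " + toUnicodeEscapes(USERNAME_AR);
        // The escaped line is pure ASCII, so ISO-8859-1 can encode it losslessly.
        byte[] bytes = line.getBytes(StandardCharsets.ISO_8859_1);
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props.getProperty("global.username");
    }

    public static void main(String[] args) {
        System.out.println(toUnicodeEscapes(USERNAME_AR)); // what goes into the file
        System.out.println(loadUsername());                // what the application sees
    }
}
```

The first printed line is what would be pasted into the .properties file; the second is the value the application reads back.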

Properties-based resource bundles must be encoded in ISO-8859-1 to use the default loading mechanism, but I have successfully used this code to allow the properties files to be encoded in UTF-8:
private static class ResourceControl extends ResourceBundle.Control {
    @Override
    public ResourceBundle newBundle(String baseName, Locale locale,
            String format, ClassLoader loader, boolean reload)
            throws IllegalAccessException, InstantiationException,
            IOException {
        String bundlename = toBundleName(baseName, locale);
        String resName = toResourceName(bundlename, "properties");
        InputStream stream = loader.getResourceAsStream(resName);
        if (stream == null) {
            // No properties file for this locale; let the lookup fall back.
            return null;
        }
        try {
            // Decode the file as UTF-8 instead of the default ISO-8859-1.
            return new PropertyResourceBundle(new InputStreamReader(stream,
                    "UTF-8"));
        } finally {
            stream.close();
        }
    }
}
Then of course you have to change the encoding of the file itself to UTF-8 in your IDE, and can use it like this:
ResourceBundle bundle = ResourceBundle.getBundle(
"package.Bundle", new ResourceControl());

new String(ret.getBytes("ISO-8859-1"), "UTF-8"); worked for me.
The properties file was saved in ISO-8859-1 encoding.

If you are using Eclipse, you can choose "Window > Preferences" and then filter on "content types". Then you should be able to set the default encoding. There's a screenshot showing this at the top of this post.

This is mainly an editor configuration issue. If you're working on Windows, you can edit the text in any editor that supports UTF-8. Notepad or the built-in Eclipse editor should be more than enough, provided you save the file as UTF-8. On Linux, I've used gedit and emacs successfully. In Notepad, you can do this by choosing 'Save As' and selecting the 'UTF-8' encoding. Other editors should have a similar feature. Some editors might require a font change to display the letters correctly, but it seems that you don't have this issue.
Having said that, there are other steps to consider when performing i18n for Arabic. You can find some useful links below. Make sure to run native2ascii on the properties file before using it, otherwise it might not work. I spent a lot of time before I figured this one out.
Tomcat webapps
Using native2ascii with properties files

Besides the native2ascii tool mentioned in other answers, there is an open-source Java library that provides the conversion functionality for use in code.
The MgntUtils library has a utility that converts strings in any language (including special characters and emojis) to a Unicode sequence and vice versa:
String result = "Hello World";
result = StringUnicodeEncoderDecoder.encodeStringToUnicodeSequence(result);
System.out.println(result);
result = StringUnicodeEncoderDecoder.decodeUnicodeSequenceToString(result);
System.out.println(result);
The output of this code is:
\u0048\u0065\u006c\u006c\u006f\u0020\u0057\u006f\u0072\u006c\u0064
Hello World
The library can be found at Maven Central or on GitHub. It comes as a Maven artifact with sources and javadoc.
Here is javadoc for the class StringUnicodeEncoderDecoder

Related

FreeMarker Not Able Display Chinese Character

This is my first time using FreeMarker in a Java project, and I am stuck on configuring it for Chinese characters.
I tried many examples to fix the code below, but it still does not work.
// Free-marker configuration object
Configuration conf = new Configuration();
conf.setTemplateLoader(new ClassTemplateLoader(getClass(), "/"));
conf.setLocale(Locale.CHINA);
conf.setDefaultEncoding("UTF-8");
// Load template from source folder
Template template = conf.getTemplate(templatePath);
template.setEncoding("UTF-8");
// Get Free-Marker output value
Writer output = new StringWriter();
template.process(input, output);
// Map Email Full Content
EmailNotification email = new EmailNotification();
email.setSubject(subject);
.......
I saw some examples that ask for changes in freemarker.properties, but I don't have this file. I just imported the .jar file and used it.
Kindly advise what I should do to make it display Chinese characters.
What exactly is the problem?
Anyway, cfg.setDefaultEncoding("UTF-8"); should be enough, assuming your template files are indeed in UTF-8. But another place where you have to ensure proper encoding is where you convert the template output from Unicode text back to bytes. FreeMarker sends its output into a Writer, so everything is Unicode up to that point, but then you will have an OutputStreamWriter or something similar, and that has to use a charset (probably UTF-8) that can encode Chinese characters.
You need to change the file encoding of your .ftl template files by re-saving them in your IDE or Notepad and choosing the encoding in the save dialog.
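To make the Writer-to-bytes step concrete, here is a minimal sketch using only the JDK. The string stands in for whatever the template produced; Utf8WriterDemo and encode are hypothetical names, and no FreeMarker API is involved.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8WriterDemo {

    // Pushes text through an OutputStreamWriter, which is the point where
    // characters become bytes and where the charset choice matters.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // Two Chinese characters, standing in for rendered template output.
        String rendered = "\u4f60\u597d";

        byte[] utf8 = encode(rendered, StandardCharsets.UTF_8);
        byte[] latin1 = encode(rendered, StandardCharsets.ISO_8859_1);

        // UTF-8 round-trips; ISO-8859-1 cannot represent the characters
        // and substitutes question marks.
        System.out.println(new String(utf8, StandardCharsets.UTF_8));
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1));
    }
}
```

If the final stream (mail body, HTTP response, file) is written with a charset that lacks the characters, they are lost at this step no matter how the template itself was configured.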
There should be an Encoding dropdown at the bottom of the save dialog.

Java .properties file gibberish when trying to get value in Hebrew

I'm trying to read Hebrew values from a .properties file and I get gibberish. I've tried a couple of approaches, including changing the file's encoding (Cp1255, ISO-8859-8, UTF-8) and adding -Dfile.encoding to the JVM arguments, but nothing helped.
This issue came up during our migration from IAS (OC4J container) to WebLogic. I noticed that the JavaScript messages (which are read from a .properties file) appear as ???? ???, which does not happen on OC4J. However, this only applies to data read from .properties files; everything else is shown fine.
I've been googling for a couple of days now and I haven't been able to come up with a solution.
EDIT:
What I tried at home
ResourceBundle rb = ResourceBundle.getBundle("test");
System.out.println(rb.getString("test"));
This is what test.properties looks like:
test שלום
Output is: ùìåí
Since you have a ResourceBundle instance, you can get the String, which was decoded as ISO-8859-1, with:
String strISO = rb.getString("test");
You can then re-decode its bytes as UTF-8 and print the result:
System.out.println(new String(strISO.getBytes("ISO-8859-1"), "UTF-8"));
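A small self-contained sketch of this re-decoding trick follows. It assumes the file bytes really are UTF-8; RedecodeDemo and redecode are hypothetical names. The output shown in the question (ùìåí) appears to match Cp1255 bytes read as ISO-8859-1, in which case the second argument would need to be the windows-1255 charset rather than UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RedecodeDemo {

    // Undoes a wrong ISO-8859-1 decode: recover the original bytes, then
    // decode them with the charset the file was actually saved in.
    static String redecode(String misread, Charset actual) {
        byte[] originalBytes = misread.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, actual);
    }

    public static void main(String[] args) {
        // The Hebrew word from the question, written with escapes.
        String shalom = "\u05e9\u05dc\u05d5\u05dd";

        // Simulate the default loader: UTF-8 file bytes decoded as ISO-8859-1.
        String misread = new String(
                shalom.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        System.out.println(misread);                                    // gibberish
        System.out.println(redecode(misread, StandardCharsets.UTF_8));  // original text
    }
}
```

The trick works because ISO-8859-1 maps every byte value to exactly one character, so the wrong decode loses no information. It is still a workaround applied at every lookup, which is why the other answers prefer fixing the file or the loader.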
Files for ResourceBundle should be in ISO 8859-1 encoding or in \uXXXX format, just as for Properties. See more at How to use UTF-8 in resource properties with ResourceBundle

Reading UTF-8 .properties files in Java 1.5?

I have a project where everything is in UTF-8. I was using the Properties.load(Reader) method to read properties files in this encoding. But now, I need to make the project compatible with Java 1.5, and the mentioned method doesn't exist in Java 1.5. There is only a load method that takes an InputStream as a parameter, which is assumed to be in ISO-8859-1.
Is there any simple way to make my project 1.5-compatible without having to change all the .properties files to ISO-8859-1? I don't really want to have a mix of encodings in my project (encodings are already a time sink one at a time, let alone when you mix them) or change all my project to ISO-8859-1.
With "a simple way" I mean "without creating a custom Properties class from scratch".
Could you use XML properties instead? As I understand the spec, .properties files should be in ISO-8859-1; if you want other characters, they should be escaped using the native2ascii tool.
One strategy that might work for this situation is as follows:
Read the bytes of the Reader into a ByteArrayOutputStream.
Once that is completed, call toByteArray() (but see below).
With the byte[], construct a ByteArrayInputStream.
Use the ByteArrayInputStream in Properties.load(InputStream).
As pointed out, the above fails to actually convert the character set from UTF-8 to ISO-8859-1. To fix that, a tweak:
After the BAOS has been filled, instead of calling toByteArray():
Call toString("ISO-8859-1") to get an ISO-8859-1 encoded String, then
Call String.getBytes() to get the byte[].
What you can do is start a thread that reads the data using a BufferedReader and writes it to a PipedOutputStream, which is connected to the PipedInputStream that load uses.
PipedOutputStream pos = new PipedOutputStream();
PipedInputStream pis = new PipedInputStream(pos);
ReaderRunnable reader = new ReaderRunnable(pos, new File("utfproperty.properties"));
Thread t = new Thread(reader);
t.start();
properties.load(pis);
t.join();
The BufferedReader reads the data one character at a time, and whenever it finds a character outside the US-ASCII (i.e. low 7-bit) range, it writes "\u" followed by the character's four-digit hexadecimal code into the PipedOutputStream instead.
ReaderRunnable would be a class that looks like:
public class ReaderRunnable implements Runnable {
public ReaderRunnable(OutputStream os, File f) {
this.os = os;
this.f = f;
}
private final OutputStream os;
private final File f;
public void run() {
// open file
// read file, escape any non US-ASCII characters
}
}
Now after writing all that I was thinking that someone should've had this problem before and solved it, and the best place to look for these things is in Apache Commons. Fortunately, they have an implementation there.
https://commons.apache.org/io/apidocs/org/apache/commons/io/input/ReaderInputStream.html
The Apache implementation is not without flaws, though: your input file, even if it is UTF-8, must contain only characters from the ISO-8859-1 character set. The design I provided above can handle that situation.
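The escaping idea from this answer can also be sketched without threads or pipes, by buffering the escaped bytes in memory before handing them to load(InputStream). This is only one possible shape, not the answerer's ReaderRunnable; EscapingLoader and loadUtf8 are made-up names, and the code sticks to APIs that already existed in Java 1.5.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.util.Properties;

public class EscapingLoader {

    // Reads a UTF-8 stream, rewrites every non-ASCII character as a
    // backslash-u escape, and loads the ASCII-only result with the
    // InputStream-based load method.
    static Properties loadUtf8(InputStream utf8Stream) {
        Properties props = new Properties();
        try {
            Reader reader = new BufferedReader(new InputStreamReader(utf8Stream, "UTF-8"));
            ByteArrayOutputStream escaped = new ByteArrayOutputStream();
            int c;
            while ((c = reader.read()) != -1) {
                if (c < 128) {
                    escaped.write(c);
                } else {
                    // read() yields one UTF-16 code unit, so four hex digits suffice.
                    String hex = Integer.toHexString(c);
                    escaped.write('\\');
                    escaped.write('u');
                    for (int i = hex.length(); i < 4; i++) {
                        escaped.write('0');
                    }
                    for (int i = 0; i < hex.length(); i++) {
                        escaped.write(hex.charAt(i));
                    }
                }
            }
            reader.close();
            props.load(new ByteArrayInputStream(escaped.toByteArray()));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a UTF-8 encoded properties file.
        byte[] file = "greeting = \u05e9\u05dc\u05d5\u05dd".getBytes("UTF-8");
        Properties props = loadUtf8(new ByteArrayInputStream(file));
        System.out.println(props.getProperty("greeting"));
    }
}
```

Buffering the whole file trades memory for simplicity; for typical properties files that is unlikely to matter, while the piped design above avoids the buffer at the cost of a second thread.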
Depending on your build engine you can \uXXXX-escape the properties into the build target directory. Maven can filter them via the native2ascii-maven-plugin.
What I personally do in my projects is keep my properties in UTF-8 files with the extension .uproperties and convert them to ISO-8859-1 .properties files at build time using native2ascii.exe. This lets me maintain my properties in UTF-8, and the Ant script does everything else for me.
What I just experienced: make all .java files UTF-8 encoded as well (not only the properties files where you store UTF-8 characters). This way there is no need to use an InputStreamReader. Also, make sure to compile with UTF-8 encoding.
This has worked for me without any added parameter of UTF-8.
To test this, write a simple stub program in Eclipse and change the encoding of the Java file by opening the file's Properties, going to the Resource section, and setting the encoding to UTF-8.

Why is text in Swedish from a resource bundle showing up as gibberish? [duplicate]

Possible Duplicate:
How to use UTF-8 in resource properties with ResourceBundle
I want to add internationalization to my Java Swing application. I use a bundle file to keep all the labels.
As a test I tried to set a Swedish title to a JButton. So in the bundle file I wrote:
nextStepButton=nästa
And in the Java code I wrote:
nextStepButton.setText(bundle.getString("nextStepButton"));
But the title characters of the button appear wrong at runtime:
I am using the Tahoma font, which supports Unicode.
When I set the button title manually through code it appears fine:
nextStepButton.setText("nästa");
Any idea why it fails with the bundle file?
Edit: Encoding the title:
I have tried encoding the text coming from the bundle file using the code:
nextStepButton.setText(new String(bundle.getString("nextStepButton").getBytes("UTF-8")));
And still the result is:
As per the javadoc, properties files are read using ISO-8859-1.
.. the input/output stream is encoded in ISO 8859-1 character encoding. Characters that cannot be directly represented in this encoding can be written using Unicode escapes ; only a single 'u' character is allowed in an escape sequence. The native2ascii tool can be used to convert property files to and from other character encodings.
Apart from using the native2ascii tool to convert UTF-8 properties files to ISO-8859-1 properties files, you can also use a custom ResourceBundle.Control so that you can control the loading of properties files and use UTF-8 there. Here's a kickoff example:
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;
import java.util.ResourceBundle.Control;

public class UTF8Control extends Control {
    public ResourceBundle newBundle
        (String baseName, Locale locale, String format, ClassLoader loader, boolean reload)
            throws IllegalAccessException, InstantiationException, IOException
    {
        // The below is a copy of the default implementation.
        String bundleName = toBundleName(baseName, locale);
        String resourceName = toResourceName(bundleName, "properties");
        ResourceBundle bundle = null;
        InputStream stream = null;
        if (reload) {
            URL url = loader.getResource(resourceName);
            if (url != null) {
                URLConnection connection = url.openConnection();
                if (connection != null) {
                    connection.setUseCaches(false);
                    stream = connection.getInputStream();
                }
            }
        } else {
            stream = loader.getResourceAsStream(resourceName);
        }
        if (stream != null) {
            try {
                // Only this line is changed to make it to read properties files as UTF-8.
                bundle = new PropertyResourceBundle(new InputStreamReader(stream, "UTF-8"));
            } finally {
                stream.close();
            }
        }
        return bundle;
    }
}
Use it as follows:
ResourceBundle bundle = ResourceBundle.getBundle("com.example.i18n.text", new UTF8Control());
This way you don't need to bother with the native2ascii tool, and you end up with more maintainable properties files.
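The one changed line in that control can be tried in isolation. The sketch below feeds in-memory UTF-8 bytes (a stand-in for the real bundle file) through the same PropertyResourceBundle(Reader) constructor; Utf8BundleDemo, readUtf8Bundle, and sampleFile are names invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class Utf8BundleDemo {

    // Builds a bundle from a stream that is decoded as UTF-8 rather than
    // the ISO-8859-1 used by the default loading mechanism.
    static ResourceBundle readUtf8Bundle(InputStream stream) {
        try (Reader reader = new InputStreamReader(stream, StandardCharsets.UTF_8)) {
            return new PropertyResourceBundle(reader);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stand-in for a UTF-8 encoded bundle file holding the Swedish label
    // from the question (the escape is resolved by the Java compiler, so
    // the bytes below contain the real two-byte UTF-8 sequence).
    static InputStream sampleFile() {
        String content = "nextStepButton=n\u00e4sta\n";
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        ResourceBundle bundle = readUtf8Bundle(sampleFile());
        System.out.println(bundle.getString("nextStepButton"));
    }
}
```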
See also:
Unicode - How to get the characters right?
Take a look at Java Internationalization FAQ. If you've put non ASCII characters in your .properties file, you must convert it using the native2ascii tool. Then everything should work.
The problem is that the resource bundle properties file is encoded in UTF-8 but your application is loading it using Latin-1.
If you take "LATIN SMALL LETTER A WITH DIAERESIS" (E4 in Latin-1, or U+00E4 as a Unicode code point) and represent it as UTF-8, you get C3 A4. If you then treat those as Latin-1 bytes, you get "LATIN CAPITAL LETTER A WITH TILDE" and the "CURRENCY SIGN" character, which is how the characters are showing in your screenshot of the button!
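The byte-level arithmetic above can be checked with a few lines of Java. This is a sketch for illustration only; MojibakeDemo, hex, and misreadAsLatin1 are made-up names.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes text as UTF-8, then decodes those bytes as Latin-1,
    // which is the mistake the default properties loader makes.
    static String misreadAsLatin1(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String aUmlaut = "\u00e4"; // LATIN SMALL LETTER A WITH DIAERESIS
        System.out.println(hex(aUmlaut.getBytes(StandardCharsets.ISO_8859_1))); // E4
        System.out.println(hex(aUmlaut.getBytes(StandardCharsets.UTF_8)));      // C3 A4
        // One character has become two: U+00C3 followed by U+00A4.
        System.out.println(misreadAsLatin1("n\u00e4sta").length());             // 6
    }
}
```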
(Incidentally, here's a neologism for the mangling you get as a result of using the wrong character encoding ... mojibake. Baffle your friends by using it in conversation.)

Use cyrillic .properties file in eclipse project

I'm developing a small project and I'd like to use internationalization for it. The problem is that when I try to use .properties file with cyrillic symbols inside, the text is displayed as rubbish. When I hard-code the strings it's displayed just fine.
Here is my code:
ResourceBundle labels = ResourceBundle.getBundle("Labels");
btnQuit = new JButton(labels.getString("quit"));
And in my .properties file:
quit = Изход
And I get rubbish. When I try
btnQuit = new JButton("Изход");
It is displayed correctly. As far as I am aware, UTF-8 is the encoding used for the files.
Any ideas?
AnyEdit is an Eclipse plugin that allows you to easily convert your properties files to and from Unicode notation (avoiding the use of command-line tools like native2ascii).
If you were using the Properties class alone (without a resource bundle), since Java 1.6 you have the option to load the file with a custom encoding, using a Reader (rather than an InputStream).
I'd guess you can also use new PropertyResourceBundle(reader), rather than ResourceBundle.getBundle(..), where reader is:
Reader reader = new BufferedReader(new InputStreamReader(
getClass().getResourceAsStream("messages.properties"), "utf-8"));
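For the plain Properties case mentioned above, a minimal sketch of load(Reader) might look like the following. In-memory bytes stand in for the real file, and ReaderPropertiesDemo and loadUtf8 are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class ReaderPropertiesDemo {

    // Loads properties through a Reader so the caller, not the Properties
    // class, decides how the bytes are decoded.
    static Properties loadUtf8(InputStream in) {
        Properties props = new Properties();
        try (Reader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            props.load(reader);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Stand-in for a UTF-8 encoded file containing the Cyrillic label
        // from the question, written with escapes for portability.
        String content = "quit = \u0418\u0437\u0445\u043e\u0434\n";
        InputStream in = new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
        System.out.println(loadUtf8(in).getProperty("quit"));
    }
}
```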
Properties are ISO-8859-1 encoded by default. You must use native2ascii to convert your UTF-8 properties to a valid ISO-8859-1 properties file containing unicode escape sequences for all non-ISO-8859-1 characters.
