Java properties : ï¿½ displayed instead of ä

I have a property file with the following key and value:
elsi.log.status.1 = Keine Änderungen
But the character Ä is not properly displayed on my webpage.
The output is ï¿½
But if I use the faces-config and display a message directly from the XHTML, the message is displayed exactly as it appears in the property file.
This is the method used to get values from the property file in Java. When I debug, the value is already wrong here (bundle.getString(key) returns Keine ï¿½nderungen):
public static String getString(String key) {
    try {
        Locale locale = CurrentEnvironment.getLocale();
        ResourceBundle bundle = ResourceBundle.getBundle(BUNDLE_NAME, locale);
        if (bundle != null) {
            return bundle.getString(key);
        }
    } catch (MissingResourceException e) {
        return '!' + key + '!';
    }
    return '!' + key + '!';
}
Direct output with xhtml works
<h:outputText value="#{messages.elsi_copyright}" />
I also noticed that replacing the characters in the property file with hex codes helped, but I want to know whether there is another way to do this.
Thanks for your help

The problem is that ResourceBundle.getBundle() uses ISO Latin-1 encoding for reading the bundles and hence can't interpret UTF-8 files (which would be the case when inserting non-Latin-1 characters like ä etc.).
Currently I can think of 2 solutions:
1. Replace every special character with an encoded form, e.g. by using the Unicode code point in the form \u00E4 for ä etc.
2. Since Java 6, ResourceBundle provides a means to read UTF-8 files, although the internal caching and fallback mechanism won't work in that case and you'd have to do that yourself.
Update: instead of an example for no. 2, please have a look here: How to use UTF-8 in resource properties with ResourceBundle
There are a lot of good resources that should help you deal with bundles containing umlauts etc. in a way that fits your needs.
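To illustrate both options, here is a minimal sketch. The class and method names are made up for this example, and the file content is built in memory instead of being read from disk. It compares load(InputStream), which always decodes as ISO-8859-1, with load(Reader) using an explicit UTF-8 charset:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class UmlautLoadingDemo {

    // load(InputStream) always decodes the bytes as ISO-8859-1.
    static String loadFromStream(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    // load(Reader) uses whatever charset the Reader was created with.
    static String loadFromUtf8Reader(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new InputStreamReader(
                    new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String key = "elsi.log.status.1";
        String expected = "Keine \u00C4nderungen";

        // A file saved as UTF-8 with the raw umlaut in it.
        byte[] utf8File = (key + " = " + expected + "\n").getBytes(StandardCharsets.UTF_8);
        // A file that spells the umlaut as a six-character escape sequence.
        byte[] escapedFile = (key + " = Keine \\u00C4nderungen\n")
                .getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(expected.equals(loadFromStream(utf8File, key)));
        System.out.println(expected.equals(loadFromStream(escapedFile, key)));
        System.out.println(expected.equals(loadFromUtf8Reader(utf8File, key)));
    }
}
```

On a standard JDK the first call returns a garbled value, while the escaped file and the Reader-based load both return Keine Änderungen.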

Property files are hard-encoded in ISO-8859-1.
If your page uses another encoding (say, UTF-8), you will encounter encoding problems.
Fortunately, Java provides an alternative: XML property files.
Use the XML format described in the Javadoc of the java.util.Properties class. Those files declare their own encoding (UTF-8, for example) in the XML header. Note that java.util.Properties reads this format directly via loadFromXML(), whereas ResourceBundle.getBundle() only picks up XML files if you supply a custom ResourceBundle.Control.
Its doctype is
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
Example :
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<entry key="com.company.key1">value1</entry>
<entry key="com.company.key2">value2</entry>
...
</properties>
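As a rough sketch of how such a file can be read, the following uses Properties.loadFromXML() on an in-memory document. The class name, the helper method and the sample entry are assumptions made for this illustration; a real application would open the file or classpath resource instead:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class XmlPropertiesDemo {

    // Parses an XML properties document; the encoding comes from the XML header.
    static Properties loadXml(String xml) {
        Properties props = new Properties();
        try {
            props.loadFromXML(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String xml =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
              + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
              + "<properties>\n"
              + "<entry key=\"elsi.log.status.1\">Keine \u00C4nderungen</entry>\n"
              + "</properties>\n";

        // The umlaut is stored raw in the XML and needs no escape sequence.
        System.out.println(loadXml(xml).getProperty("elsi.log.status.1"));
    }
}
```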

Related

Java - Escaped backslashes being taken literally when writing to file

I want to store a URL in a properties file. This is the URL:
jdbc\:sqlserver\://dummydata\\SHARED
When programming this in Java, I obviously need to escape the backslashes. So my code ends up looking like this
properties.setProperty("db", "jdbc\\:sqlserver\\://dummydata\\\\SHARED");
The issue with this is that the properties file is saving the String URL and including the backslashes used for escaping, which is an incorrect URL. I was hoping that Java would interpret the backslashes used for escaping so that only the correct URL is saved. Is there a way to achieve this?

You're correct that a property value with : needs to escape the colons in a .properties text file, but you're not writing that text file directly.
You are giving the value to a Properties object using setProperty(), and presumably writing that to a text file using store(), and the store() method will escape the values as needed for you.
You should give the value you want to Properties, and forget about the encoding rules of the text file. Properties will handle all needed encoding. Since the value you want to give is jdbc:sqlserver://dummydata\SHARED, you write a string literal "jdbc:sqlserver://dummydata\\SHARED"
Example
String db = "jdbc:sqlserver://dummydata\\SHARED";
System.out.println(db); // To see actual string value
Properties properties = new Properties();
properties.setProperty("db", db);
try (FileWriter out = new FileWriter("test.properties")) {
    properties.store(out, null);
}
Output
jdbc:sqlserver://dummydata\SHARED
Content of test.properties
#Tue Jun 11 11:54:24 EDT 2019
db=jdbc\:sqlserver\://dummydata\\SHARED
As you can see, the store() method has escaped the : and \ for you.
If you save the properties as an XML file instead, there's no need to escape anything, and Properties won't.
Example
try (FileOutputStream out = new FileOutputStream("test.xml")) {
    properties.storeToXML(out, null);
}
Content of test.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<entry key="db">jdbc:sqlserver://dummydata\SHARED</entry>
</properties>

Properties.store() escapes backslashes; there is no way around it. My first question is: why is this an issue? Are you reading the file in any way other than Properties.load()? If not, then you don't need to worry about it, as load() removes the escape characters.
properties.load(file);
System.out.println(properties.get("db"));
// output: jdbc\:sqlserver\://dummydata\\SHARED
As an aside, are you sure the URL is correct? Shouldn't you be storing it as properties.setProperty("db", "jdbc:sqlserver://dummydata\\SHARED")?

In the documentation for load, it says the following:
The method does not treat a backslash character, \, before a non-valid escape character as an error; the backslash is silently dropped. For example, in a Java string the sequence "\z" would cause a compile time error. In contrast, this method silently drops the backslash. Therefore, this method treats the two character sequence "\b" as equivalent to the single character 'b'.
A doubled backslash, on the other hand, is read back as a single backslash, so loading this string should work just fine:
C:\\path\\to\\file
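The behaviour described in this thread can be checked with a small round trip through store() and load(). This is only a sketch: the class and helper names are invented here, and the text goes through a StringWriter instead of a real file:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

public class BackslashRoundTripDemo {

    // Returns the text that store() would write for a single key/value pair.
    static String storeToText(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Parses properties text and returns the value for the given key.
    static String loadFromText(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The real value contains one backslash; the Java literal doubles it.
        String url = "jdbc:sqlserver://dummydata\\SHARED";
        String fileText = storeToText("db", url);

        System.out.println(fileText);                           // escaped form, as on disk
        System.out.println(loadFromText(fileText, "db"));       // original value again

        // Doubled backslashes in the file become single ones when loaded.
        System.out.println(loadFromText("path=C:\\\\path\\\\to\\\\file\n", "path"));
        // A backslash before a character that is not a valid escape is dropped.
        System.out.println(loadFromText("x=\\z\n", "x"));
    }
}
```

One caveat: single backslashes in a path are not always dropped harmlessly, because \t, \n, \r and \f are valid escapes and turn into control characters, so C:\path\to\file written with single backslashes would not load as the intended path.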

Java .properties file gibberish when trying to get value in Hebrew

I'm trying to read Hebrew values from a .properties file and I get gibberish. I've tried a couple of ways, including changing the file's encoding (Cp1255, ISO-8859-8, UTF-8), adding a -file.encoding to the arguments and nothing helped.
This issue was raised during our migration to Weblogic from IAS (OC4J container), I noticed the javascript messages (which are read from a .properties file) appear as ???? ???, which does not happen on the OC4J. However, this only applies to data read from .properties files, everything else is shown fine.
I've been googling for a couple of days now and I haven't been able to come up with a solution.
EDIT:
What I tried at home
ResourceBundle rb = ResourceBundle.getBundle("test");
System.out.println(rb.getString("test"));
This is what test.properties looks like:
test שלום
Output is: ùìåí

Since you have an instance of ResourceBundle and you can get the ISO-8859-1 encoded String with:
String strISO = rb.getString("test");
You can then convert this to UTF-8 and print it by using:
System.out.println(new String(strISO.getBytes("ISO-8859-1"), "UTF-8"));

Files for ResourceBundle should be in ISO 8859-1 encoding or in \uXXXX format, just as for Properties. See more at How to use UTF-8 in resource properties with ResourceBundle
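The re-decoding trick from the first answer can be sketched as follows. The class and method names are hypothetical, and the charset is passed as a parameter because it has to match the encoding the file was actually saved in. The output ùìåí in the question looks like what Cp1255 bytes produce when read as ISO-8859-1, so in that case the second charset would probably need to be windows-1255 rather than UTF-8:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RedecodeDemo {

    // Turns the misread characters back into the original bytes (ISO-8859-1 maps
    // every byte to exactly one character) and decodes them with the real charset.
    static String redecode(String misread, Charset actualFileCharset) {
        return new String(misread.getBytes(StandardCharsets.ISO_8859_1), actualFileCharset);
    }

    public static void main(String[] args) {
        // The Hebrew word from the question, written with escapes so that this
        // source file compiles regardless of its own encoding.
        String shalom = "\u05E9\u05DC\u05D5\u05DD";

        // Simulate what ResourceBundle does with a UTF-8 file: decode as ISO-8859-1.
        String misread = new String(shalom.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        System.out.println(shalom.equals(misread));
        System.out.println(shalom.equals(redecode(misread, StandardCharsets.UTF_8)));
    }
}
```

This only repairs strings that were misread as ISO-8859-1; it cannot recover text where characters were already replaced by ? somewhere along the way.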

Reading properties file via Resourcebundle encoding issue

I'm using a resourcebundle to read a properties file based on locale. (Lang_en_US.properties, ..)
The resourcebundle is read as iso-8859-1 (standard?).
ResourceBundle rb= ResourceBundle.getBundle("Lang", locale);
The resourcebundle is then used throughout the Spring/JSF web-application to generate the front-end text.
<h:outputText value="#{msg['message.example']}" />
But I believe this is irrelevant, as debugging shows that the text is already gibberish right after rb.getString is called.
// returns gibberish:
log.trace(rb.getString("l_SampleText"));

I believe you are correct in assuming the resource bundle is read as ISO-8859-1; the Javadoc of the Properties class documents this.
Are you sure your properties file is saved under the iso-8859-1 format?
I believe Notepad++ provides the functionality to at least check the encoding, if not convert it.

Since you are using characters that cannot be represented in ISO-8859-1, you have two options:
1. Use the native2ascii tool to escape the characters in the bundle so that all of them are read properly.
2. Use Spring's MessageSource bundle support, which can read files in UTF-8 encoding.
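For the first option, the kind of conversion native2ascii performs can be approximated in a few lines. This is a simplified stand-in written for illustration, not the tool itself, and the class and method names are made up:

```java
public class UnicodeEscaper {

    // Replaces every non-ASCII char with a backslash-u escape of four hex digits.
    // Characters outside the BMP are emitted as two escapes (one per surrogate).
    static String escapeNonAscii(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The escaped line is pure ASCII and safe to store in an ISO-8859-1 file.
        System.out.println(escapeNonAscii("elsi.log.status.1 = Keine \u00C4nderungen"));
    }
}
```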

Is it possible to display "<" or ">" in generated XML source using XStreamMarshaller

I have been trying to use XStreamMarshaller to generate XML output in my Java Spring project. The XML I am generating has CDATA values in the element text. I am manually creating this CDATA text in the command object like this:
f.setText("<![CDATA[cdata-text]]>");
The XStreamMarshaller generated the element(text-data below is an alias) as:
<text-data><![CDATA[cdata-text]]></text-data>
The above XML display is as expected (Please ignore the back slash in the above element name: forum formatting). But when I do a View Source on the generated XML output, I see this for the element: <text-data>&lt;![CDATA[cdata-text]]&gt;</text-data>.
Issue:
As you can see, the less-than and greater-than characters have been replaced by &lt; and &gt; in the View Source. I need my client to read the source and identify the CDATA section from the XML output, which it will not do in the above scenario.
Is there a way I can get the XStreamMarshaller to escape special characters in the text I provided?
I have set the encoding of the Marshaller to ISO-8859-1 but that does not work either. If the above cannot be done by XStreamMarshaller can you please suggest alternate marshallers/unmarshallers that can do this for me?
// Displaying my XML and View Source as suggested by Paŭlo Ebermann below:
XML View (as displayed in IE):
An invalid character was found in text content. Error processing resource 'http://localhost:8080/file-service-framework/fil...
Los t
View Source:
<service id="file-text"><text-data>&lt;![CDATA[
Los túneles a través de las montañas hacen más fácil viajar por carretera.
]]&gt;</text-data></service>
Thank you very much.

Generating CDATA sections is the task of your XML-generating library, not of its client. So you should simply have to write
f.setText("cdata-text");
and then the library can decide whether to use <![CDATA[...]]> or the &lt;-escaping for its contents. It should make no difference for the receiver.
Edit:
Looking at your output, it looks right (apart from the CDATA); there you must work on your input, as said.
If IE throws an error here, most probably you haven't declared the right encoding.
I don't really know much about the Spring framework, but the encoding used by the Marshaller should be the same as the encoding sent in either the HTTP header (Content-Type: ... ;charset=...) or the <?xml version="1.0" encoding="..." ?> XML prologue (these two should not differ either).
I would recommend UTF-8 as encoding everywhere, as this can represent all characters, not only the Latin-1 ones.
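To show why the two forms are equivalent for the receiver, here is a small sketch that uses the JDK's own StAX writer rather than XStreamMarshaller (a deliberate substitution so the example needs no extra libraries). The class and method names are invented for this illustration:

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class CdataVsEscapingDemo {

    // Writes <text-data>...</text-data> either as a CDATA section or as
    // ordinary character data, which the writer escapes on its own.
    static String writeElement(String text, boolean asCdata) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("text-data");
            if (asCdata) {
                xml.writeCData(text);
            } else {
                xml.writeCharacters(text);
            }
            xml.writeEndElement();
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Both lines describe the same text content to an XML parser.
        System.out.println(writeElement("a < b", false));
        System.out.println(writeElement("a < b", true));
    }
}
```

In both cases the caller passes plain text and the writer chooses the representation, which is the division of labour the answer recommends.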

Java: How to write "Arabic" in properties file?

I want to write "Arabic" in the message resource bundle (properties) file but when I try to save it I get this error:
"Save couldn't be completed
Some characters cannot be mapped using "ISO-8859-1" character encoding. Either change encoding or remove the character ..."
Can anyone guide please?
I want to write:
global.username = اسم المستخدم
How should I write the Arabic for "username" in the properties file so that internationalization works?
BR
SC

http://sourceforge.net/projects/eclipse-rbe/
You can use the above plugin for eclipse IDE to make the Unicode conversion for you.

As described in the class reference for "Properties"
The load(Reader) / store(Writer, String) methods load and store properties from and to
a character based stream in a simple line-oriented format specified below.
The load(InputStream) / store(OutputStream, String) methods work the same way as the
load(Reader)/store(Writer, String) pair, except the input/output stream is encoded in
ISO 8859-1 character encoding. Characters that cannot be directly represented in this
encoding can be written using Unicode escapes ; only a single 'u' character is allowed
in an escape sequence. The native2ascii tool can be used to convert property files to
and from other character encodings.

Properties-based resource bundles must be encoded in ISO-8859-1 to use the default loading mechanism, but I have successfully used this code to allow the properties files to be encoded in UTF-8:
private static class ResourceControl extends ResourceBundle.Control {
    @Override
    public ResourceBundle newBundle(String baseName, Locale locale,
            String format, ClassLoader loader, boolean reload)
            throws IllegalAccessException, InstantiationException,
            IOException {
        String bundlename = toBundleName(baseName, locale);
        String resName = toResourceName(bundlename, "properties");
        InputStream stream = loader.getResourceAsStream(resName);
        // Decode the properties file as UTF-8 instead of the default ISO-8859-1
        return new PropertyResourceBundle(new InputStreamReader(stream,
                "UTF-8"));
    }
}
Then of course you have to change the encoding of the file itself to UTF-8 in your IDE, and can use it like this:
ResourceBundle bundle = ResourceBundle.getBundle(
        "package.Bundle", new ResourceControl());
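The key step in the Control above is the PropertyResourceBundle(Reader) constructor. As a self-contained sketch of just that step (the class and method names are assumptions, and the "file" is an in-memory byte array instead of a classpath resource):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class Utf8BundleDemo {

    // Builds a bundle from the raw bytes of a UTF-8 encoded properties file.
    static ResourceBundle bundleFromUtf8(byte[] fileBytes) {
        try {
            return new PropertyResourceBundle(new InputStreamReader(
                    new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The Arabic text from the question, written with escapes so that this
        // source file compiles regardless of its own encoding.
        String username = "\u0627\u0633\u0645 \u0627\u0644\u0645\u0633\u062A\u062E\u062F\u0645";
        byte[] file = ("global.username = " + username + "\n")
                .getBytes(StandardCharsets.UTF_8);

        ResourceBundle bundle = bundleFromUtf8(file);
        System.out.println(username.equals(bundle.getString("global.username")));
    }
}
```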

new String(ret.getBytes("ISO-8859-1"), "UTF-8"); worked for me.
The property file was saved in ISO-8859-1 encoding.

If you are using Eclipse, you can choose "Window --> Preferences" and then filter on "content types". Then you should be able to set the default encoding.

This is mainly an editor configuration issue. If you're working on Windows, you can edit the text in any editor that supports UTF-8. Notepad or the Eclipse built-in editor should be more than enough, provided you save the file as UTF-8. On Linux, I've used gedit and emacs successfully. In Notepad, you can do this by choosing 'Save As' and selecting the 'UTF-8' encoding. Other editors should have a similar feature. Some editors might require a font change in order to display the letters correctly, but it seems that you don't have this issue.
Having said that, there are other steps to consider when performing i18n for Arabic. You can find some useful links below. Make sure to run native2ascii on the properties file before using it, otherwise it might not work. I spent a lot of time before I figured this one out.
Tomcat webapps
Using native2ascii with properties files

Besides the native2ascii tool mentioned in other answers, there is an open-source Java library that provides the same conversion for use in code.
The MgntUtils library has a utility that converts Strings in any language (including special characters and emojis) to Unicode sequences and vice versa:
String result = "Hello World";
result = StringUnicodeEncoderDecoder.encodeStringToUnicodeSequence(result);
System.out.println(result);
result = StringUnicodeEncoderDecoder.decodeUnicodeSequenceToString(result);
System.out.println(result);
The output of this code is:
\u0048\u0065\u006c\u006c\u006f\u0020\u0057\u006f\u0072\u006c\u0064
Hello World
The library can be found at Maven Central or on GitHub. It comes as a Maven artifact with sources and Javadoc.
Here is the Javadoc for the class StringUnicodeEncoderDecoder
