I was working with an app that loads a .properties file with java.util.Properties like this:
Properties _properties = new Properties();
_properties.load(new FileInputStream("app.properties"));
The properties file (initially) was this:
app=myApp
dbLogin=myDbLogin
version=0.9.8.10
server=1
freq=10000
stateGap=360000
The strange thing was that _properties.getProperty("app") always returned null, while all of the other properties loaded without any issues. I worked around the problem by adding a comment to the top of the properties file; after that, everything worked fine.
My question is: Why does Java do this? I can't seem to find any documentation about this, and it seems counter-intuitive.
Thanks to @KonstantinV.Salikhov and @pms for their help in hunting this down. I am posting the answer we found to save people from hunting through the comments.
The problem was that my file was in the wrong encoding, as mentioned here: http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html
The load(Reader) / store(Writer, String) methods load and store properties from and to a character based stream in a simple line-oriented format specified below. The load(InputStream) / store(OutputStream, String) methods work the same way as the load(Reader)/store(Writer, String) pair, except the input/output stream is encoded in ISO 8859-1 character encoding.
(Emphasis mine).
I changed the encoding of the properties file to ISO-8859-1 and everything worked.
Java does not strip the byte order mark (BOM) when it loads the file. The BOM ends up as part of the first key, which you can see if you list the loaded properties. It is possible to save the file as UTF-8 without a BOM. In vim, for instance:
:set nobomb
See vim wiki
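To see the effect, here is a small sketch (class and method names are my own, not from the question) that loads the same bytes twice: once through load(InputStream), where the BOM bytes end up glued to the first key, and once through a UTF-8 Reader that drops a leading U+FEFF before parsing. That would also explain why adding a comment as the first line helped: the BOM then attaches to that line instead of to app.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class BomAwareProperties {

    /** Plain load(InputStream): bytes are decoded as ISO-8859-1, BOM included. */
    static Properties loadNaive(byte[] fileBytes) {
        try {
            Properties props = new Properties();
            props.load(new ByteArrayInputStream(fileBytes));
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decodes as UTF-8 and skips a leading byte order mark, if there is one. */
    static Properties loadUtf8SkippingBom(byte[] fileBytes) {
        try (Reader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8))) {
            reader.mark(1);
            if (reader.read() != '\uFEFF') {
                reader.reset(); // first char was real content (or EOF): rewind
            }
            Properties props = new Properties();
            props.load(reader);
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for app.properties saved as "UTF-8 with BOM".
        byte[] fileBytes = "\uFEFFapp=myApp\nserver=1\n".getBytes(StandardCharsets.UTF_8);

        System.out.println(loadNaive(fileBytes).getProperty("app"));           // null
        System.out.println(loadNaive(fileBytes).getProperty("server"));        // 1
        System.out.println(loadUtf8SkippingBom(fileBytes).getProperty("app")); // myApp
    }
}
```

In real code you would pass a FileInputStream instead of the in-memory bytes; the byte array is only there to keep the example self-contained.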
Hi, this post is related to my earlier post, but I have made a lot of progress since then.
I am using the FileBasedConfigurationBuilder class of the Apache Commons Configuration API to update a property file in Java. Below is my code:
FileBasedConfigurationBuilder<PropertiesConfiguration> builder = new FileBasedConfigurationBuilder<PropertiesConfiguration>(
PropertiesConfiguration.class)
.configure(new Parameters().properties().setFileName("test.properties")
.setThrowExceptionOnMissing(true));
PropertiesConfiguration config = builder.getConfiguration();
config.setProperty("Id", "3");
builder.save();
System.out.println("config.properties updated Successfully!!");
Now one of my keys has a value like
C\://ABC.net\\:1010.
After modification it becomes
C\\://ABC.net\\:1010. That is, the single backslash is converted to a double backslash. Previously I was using Commons Configuration 1.10, and with that version the forward slashes were changed as well. Now I am using
commons-configuration2-2.0.jar. With this version the only problem is the backslash.
Can anyone suggest how to avoid this? I need the single backslash to stay a single backslash after modification. Please note that I don't want to change the property file.
I followed the post below to get this far:
PropertiesConfiguration - Using "/" in Property value
In a properties file, the : character is escaped with a single backslash, which is what seems to be happening in your question.
See this question How do you escape colon (:) in Properties file?
which points to the Properties documentation
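As a quick check of that escaping rule with plain java.util.Properties (not Commons Configuration, whose writer may behave differently), this sketch loads the value from the question and prints what the parser actually sees. The key name path and the class name are made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ColonEscapeDemo {

    /** Parses a single "key=value" line the way java.util.Properties would. */
    static String loadValue(String fileLine, String key) {
        try {
            Properties props = new Properties();
            props.load(new StringReader(fileLine));
            return props.getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // File text: path=C\://ABC.net\\:1010
        // (backslashes are doubled below only because this is a Java string literal)
        String fileLine = "path=C\\://ABC.net\\\\:1010";

        // "\:" is read as ":", and "\\" is read as a single backslash.
        System.out.println(loadValue(fileLine, "path")); // C://ABC.net\:1010
    }
}
```

So the backslash before the first colon is only an escape character in the file, not part of the value the application receives.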
This is my first post, and I'm new to Java. I'm working on a file parser. I've tried to work out whether the input is CSV or some other file format, but it does not look like a standard format. I'm working on an Apache Camel solution (my first and last idea :( ), but maybe some of you recognize this kind of file format? Additionally, I've got an .imp file for my output.
Here is my example input:
NrDok:FS-2222/17/W
Data:12.02.2017
SposobPlatn:GOT
NazwaWystawcy:MAAKAI Gawron
AdresWystawcy:33-123 bABA
KodWystawcy:33-112
MiastoWystawcy:bABA
UlicaWystawcy:czysfa 8
NIPWystawcy:123-19-85-123
NazwaOdbiorcy:abc abc-HANDLOWO-USŁUGOWE
AdresOdbiorcy:33-123 fghd
KodOdbiorcy:33-123
MiastoOdbiorcy:Tdsfs
UlicaOdbiorcy:dfdfdA 39
NIPOdbiorcy:82334349
TelefonOdbiorcy:654-522-124
NrOdbiorcyWSieciSklepow:efdsS-sffgsA
IloscLinii:1
Linia:Nazwa{ĆWIARTKA KG}Kod{C1}Vat{5}Jm{kg.}Asortyment{dfgv}Sww{}PKWIU{10.12.10}Ilosc{3.40}Cena{n3.21}Wartosc{n11.83}IleWOpak{1}CenaSp{b0.00}
DoZaplaty:252.32
And here is my example output file:
FH 2015.07.31 2015.07.31 F04443 Gotowka
FO 812-123-45-11 P.a.b.Uc"fdad" abcd deffF UL.fdfgdfdA 12/33 33-123 afvdf
FS 779-19-06-082 badfdf S.A. ul. Wisniowa 89 60-003 Poznan
FP 00218746 CHRZAN TARTY EXTRA POLONAISE 180G SZT 32.00 2.21 8 10.39.17.0 32.00 5900138000055
Is there an easy way to convert the first file to the second file's format? Maybe you know the type of this file? In the meantime, I'm continuing my work with Apache Camel.
Thanks in advance for your time and help!
I suggest you try https://tika.apache.org/1.1/detection.html#Mime_Magic_Detection
It's a very good library for file type recognition.
There is a simple example here: https://www.tutorialspoint.com/tika/tika_document_type_detection.htm
Your file can be read as a standard Java .properties file. This type of file allows both = and : as separators between key and value. However, it contains characters outside ISO-8859-1, such as the Polish Ć, which may prevent Java from parsing it correctly.
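A minimal sketch of that idea, assuming the input file is UTF-8 (the question does not say): passing a Reader to Properties.load sidesteps the ISO-8859-1 limitation, and ':' works as the separator without extra configuration. The sample lines are copied from the question; the class name is illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class InvoiceHeaderReader {

    /** Reads "Key:value" lines; UTF-8 is an assumption about the file. */
    static Properties read(byte[] utf8Bytes) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8)) {
            Properties props = new Properties();
            props.load(reader);
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+0141 (Polish L with stroke) is written as an escape to keep this source ASCII-only.
        String sample = "NrDok:FS-2222/17/W\n"
                + "Data:12.02.2017\n"
                + "NazwaOdbiorcy:abc abc-HANDLOWO-US\u0141UGOWE\n";
        Properties props = read(sample.getBytes(StandardCharsets.UTF_8));

        System.out.println(props.getProperty("NrDok")); // FS-2222/17/W
        System.out.println(props.getProperty("NazwaOdbiorcy"));
    }
}
```

One caveat: Properties keeps a single value per key, so if the file contains several Linia: lines, only the last one survives. In that case you would need to read the file line by line instead.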
This line
Nazwa{ĆWIARTKA KG}Kod{C1}Vat{5}Jm{kg.}Asortyment{dfgv}Sww{}PKWIU{10.12.10}Ilosc{3.40}Cena{n3.21}Wartosc{n11.83}IleWOpak{1}CenaSp{b0.00}
seems to be a custom serialization of an object, in the form
key1{value1}key2{value2}...
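If that guess about the format is right, the line can be split with a small regular expression. This is only a sketch under the assumption that values never contain braces themselves; the class name is made up, and the sample is a shortened version of the line from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LineItemParser {

    // One "name{value}" pair; the value may be empty, as in Sww{}.
    private static final Pattern PAIR = Pattern.compile("([^{}]+)\\{([^{}]*)\\}");

    /** Splits key1{value1}key2{value2}... into an insertion-ordered map. */
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(line);
        while (matcher.find()) {
            fields.put(matcher.group(1), matcher.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        // U+0106 (C with acute) is written as an escape to keep this source ASCII-only.
        String line = "Nazwa{\u0106WIARTKA KG}Kod{C1}Vat{5}Jm{kg.}Sww{}Ilosc{3.40}Cena{n3.21}";
        Map<String, String> fields = parse(line);

        System.out.println(fields.get("Kod"));   // C1
        System.out.println(fields.get("Ilosc")); // 3.40
        System.out.println(fields.keySet());     // [Nazwa, Kod, Vat, Jm, Sww, Ilosc, Cena]
    }
}
```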
Your output file contains a lot of data that is not present in the input, which makes me think that the output is built partly by querying external systems. You will have to investigate that yourself. Nobody can guess the transformation from the provided input alone.
Since I do not know of a better solution, I am currently writing small Java classes to process .properties file to merge them, remove duplicate properties, override properties, etc. (I need to process many files and a huge number of properties).
org.apache.commons.configuration.PropertiesConfiguration works great for reading a properties file (using org.apache.commons.configuration.AbstractFileConfiguration.load(InputStream, String)). However, if I rewrite the file using org.apache.commons.configuration.AbstractFileConfiguration.save(File), I have two problems:
the original layout and comments are lost. I am going to try the PropertiesConfigurationLayout, which is supposed to help here (see How to overwrite one property in .properties without overwriting the whole file?) and post the results
the properties are slightly modified. Accents é and è are rewritten as unicode characters (\u00E9), which I do not want. Afaik .properties files are generally ISO-8859-1 (and I think mine are), so escaping shouldn't be necessary.
Specifying the encoding when calling org.apache.commons.configuration.AbstractFileConfiguration.load(InputStream, String) does not make a difference, because when it is not specified, the same encoding is used by default anyway (private static final String DEFAULT_ENCODING = "ISO-8859-1";). What could I do about that?
After some tests, I think you can do what you want using CombinedConfiguration plus an OverrideCombiner. The properties are merged automatically, and the trick for the layout is to take it from one of the loaded files:
CombinedConfiguration props = new CombinedConfiguration();
final PropertiesConfiguration defaultsProps = new PropertiesConfiguration(new File("/tmp/default.properties"));
final PropertiesConfiguration customProps = new PropertiesConfiguration(new File("/tmp/custom.properties"));
props.setNodeCombiner(new OverrideCombiner());
props.addConfiguration(customProps); //first should be loaded the override values
props.addConfiguration(defaultsProps); // last your 'default' values
PropertiesConfiguration finalFile = new PropertiesConfiguration();
finalFile.append(props);
PropertiesConfigurationLayout layout = new PropertiesConfigurationLayout(finalFile, defaultsProps.getLayout()); //here we copy the layout from the 'base file'
layout.save(new FileWriter(new File("/tmp/app.properties")));
As for the encoding issue, I don't know whether a solution is possible.
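On the encoding part, one possible workaround, assuming plain java.util.Properties is acceptable in place of Commons Configuration for the write step: Properties.store(Writer, ...) writes characters as they are, while store(OutputStream, ...) is the variant that turns them into Unicode escapes. The sketch below merges two sets of properties and writes the result through an ISO-8859-1 writer. It does not preserve comments, layout, or key order, so it only addresses the escaping problem. All names are illustrative.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class Latin1PropertiesWriter {

    /** Returns a new Properties in which entries from overrides win over defaults. */
    static Properties merge(Properties defaults, Properties overrides) {
        Properties merged = new Properties();
        merged.putAll(defaults);
        merged.putAll(overrides);
        return merged;
    }

    /** Serializes to ISO-8859-1 bytes without turning accents into escapes. */
    static byte[] toLatin1Bytes(Properties props) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.ISO_8859_1)) {
            props.store(writer, null); // the Writer overload keeps non-ASCII chars as-is
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("greeting", "hello");
        defaults.setProperty("drink", "coffee");

        Properties overrides = new Properties();
        overrides.setProperty("drink", "caf\u00e9"); // e with acute accent (U+00E9)

        byte[] bytes = toLatin1Bytes(merge(defaults, overrides));
        // Prints a date comment line followed by the two entries, accent intact.
        System.out.print(new String(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

Characters that do not exist in ISO-8859-1 would be replaced by the encoder rather than escaped, so this only fits files that really are Latin-1.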
I'm trying to read Hebrew values from a .properties file and I get gibberish. I've tried a couple of things, including changing the file's encoding (Cp1255, ISO-8859-8, UTF-8) and adding -file.encoding to the arguments, but nothing helped.
This issue came up during our migration from IAS (OC4J container) to Weblogic. I noticed that the JavaScript messages (which are read from a .properties file) appear as ???? ???, which does not happen on OC4J. However, this only applies to data read from .properties files; everything else is shown fine.
I've been googling for a couple of days now and I haven't been able to come up with a solution.
EDIT:
What I tried at home
ResourceBundle rb = ResourceBundle.getBundle("test");
System.out.println(rb.getString("test"));
This is what test.properties looks like:
test שלום
Output is: ùìåí
Since you have an instance of ResourceBundle, you can get the ISO-8859-1 decoded String with:
String strISO = rb.getString("test");
You can then convert this to UTF-8 and print it by using:
System.out.println(new String(strISO.getBytes("ISO-8859-1"), "UTF-8"));
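A self-contained sketch of that re-decoding step (the helper name redecode is mine). It only works when the bytes in the file really are in the charset you pass in. Note that the output shown in the question, ùìåí, is what the windows-1255 (or ISO-8859-8) bytes for שלום look like when read as ISO-8859-1, so for that particular file the second charset would likely need to be windows-1255 rather than UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class Redecode {

    /**
     * Undoes a wrong ISO-8859-1 decode: recovers the original bytes, then
     * decodes them with the charset the file was actually saved in.
     */
    static String redecode(String garbled, Charset actual) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), actual);
    }

    public static void main(String[] args) {
        // Hebrew "shalom" as code points, to keep this source ASCII-only.
        String original = "\u05e9\u05dc\u05d5\u05dd";

        // Simulate what a Latin-1 loader does to a UTF-8 file.
        String garbled = new String(original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        System.out.println(redecode(garbled, StandardCharsets.UTF_8).equals(original)); // true

        // The garbled text from the question, tried against windows-1255 when the JDK ships it.
        if (Charset.isSupported("windows-1255")) {
            String fromQuestion = "\u00f9\u00ec\u00e5\u00ed";
            System.out.println(redecode(fromQuestion, Charset.forName("windows-1255")));
        }
    }
}
```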
Files for ResourceBundle should be in ISO 8859-1 encoding or in \uXXXX format, just as for Properties. See more at How to use UTF-8 in resource properties with ResourceBundle
I want to write "Arabic" in the message resource bundle (properties) file but when I try to save it I get this error:
"Save couldn't be completed
Some characters cannot be mapped using "ISO-8859-1" character encoding. Either change encoding or remove the character ..."
Can anyone guide me, please?
I want to write:
global.username = اسم المستخدم
How should I write the Arabic for "username" in the properties file so that internationalization works?
BR
SC
http://sourceforge.net/projects/eclipse-rbe/
You can use the above plugin for eclipse IDE to make the Unicode conversion for you.
As described in the class reference for "Properties"
The load(Reader) / store(Writer, String) methods load and store properties from and to
a character based stream in a simple line-oriented format specified below.
The load(InputStream) / store(OutputStream, String) methods work the same way as the
load(Reader)/store(Writer, String) pair, except the input/output stream is encoded in
ISO 8859-1 character encoding. Characters that cannot be directly represented in this
encoding can be written using Unicode escapes; only a single 'u' character is allowed
in an escape sequence. The native2ascii tool can be used to convert property files to
and from other character encodings.
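For a sense of what that conversion produces, here is a small sketch of the escaping step in plain Java (a hand-rolled stand-in, not the native2ascii tool itself). It rewrites every character above printable ASCII as a Unicode escape, so the result can be saved in an ISO-8859-1 file. The Arabic text from the question is spelled as code points to keep the source ASCII-only; the class and method names are made up.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class UnicodeEscaper {

    /** Replaces each char above 0x7E with a backslash-u escape of its UTF-16 code unit. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7e) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // "username" in Arabic, as written in the question.
        String arabic = "\u0627\u0633\u0645 \u0627\u0644\u0645\u0633\u062a\u062e\u062f\u0645";
        String line = "global.username = " + escapeNonAscii(arabic);
        System.out.println(line); // pure ASCII, safe to store as ISO-8859-1

        // Round trip: Properties decodes the escapes back to the original text.
        Properties props = new Properties();
        props.load(new StringReader(line));
        System.out.println(arabic.equals(props.getProperty("global.username"))); // true
    }
}
```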
Properties-based resource bundles must be encoded in ISO-8859-1 to use the default loading mechanism, but I have successfully used this code to allow the properties files to be encoded in UTF-8:
private static class ResourceControl extends ResourceBundle.Control {
    @Override
    public ResourceBundle newBundle(String baseName, Locale locale,
            String format, ClassLoader loader, boolean reload)
            throws IllegalAccessException, InstantiationException,
            IOException {
        String bundleName = toBundleName(baseName, locale);
        String resName = toResourceName(bundleName, "properties");
        InputStream stream = loader.getResourceAsStream(resName);
        if (stream == null) {
            return null; // no file for this locale; let the lookup fall back
        }
        // Decode the file as UTF-8 instead of the default ISO-8859-1.
        try (InputStreamReader reader = new InputStreamReader(stream, "UTF-8")) {
            return new PropertyResourceBundle(reader);
        }
    }
}
Then of course you have to change the encoding of the file itself to UTF-8 in your IDE, and can use it like this:
ResourceBundle bundle = ResourceBundle.getBundle(
"package.Bundle", new ResourceControl());
new String(ret.getBytes("ISO-8859-1"), "UTF-8"); worked for me.
The property file was saved in ISO-8859-1 encoding.
If you are using Eclipse, you can choose "Window-->Preferences" and then filter on "content types". Then you should be able to set the default encoding. There's a screen shot showing this at the top of this post.
This is mainly an editor configuration issue. If you're working in Windows, you can edit the text in an editor that supports UTF-8. Notepad or the Eclipse built-in editor should be more than enough, provided you save the file as UTF-8. In Linux, I've used gedit and emacs successfully. In Notepad, you can do this by clicking the 'Save As' button and choosing the 'UTF-8' encoding. Other editors should have a similar feature. Some editors might require a font change in order to display the letters correctly, but it seems that you don't have this issue.
Having said that, there are other steps to consider when performing i18n for Arabic. You can find some useful links below. Make sure to run native2ascii on the properties file before using it, otherwise it might not work. I spent a lot of time before I figured this one out.
Tomcat webapps
Using native2ascii with properties files
Besides the native2ascii tool mentioned in other answers, there is an open-source Java library that provides the same conversion for use in code.
The MgntUtils library has a utility that converts Strings in any language (including special characters and emojis) to a Unicode sequence and vice versa:
String result = "Hello World";
result = StringUnicodeEncoderDecoder.encodeStringToUnicodeSequence(result);
System.out.println(result);
result = StringUnicodeEncoderDecoder.decodeUnicodeSequenceToString(result);
System.out.println(result);
The output of this code is:
\u0048\u0065\u006c\u006c\u006f\u0020\u0057\u006f\u0072\u006c\u0064
Hello World
The library can be found at Maven Central or on GitHub. It comes as a Maven artifact, with sources and javadoc.
Here is javadoc for the class StringUnicodeEncoderDecoder