I'm designing a new Java SE application that I want to internationalize.
I've never built such an application, so I'm looking for best practices for internationalization. The application will write the translated data to files or a database. I've searched for best practices but didn't find anything about my main question (the first one below).
Should I put all the internationalization data in a layer of its own, or next to the objects it describes?
Could I use the properties files directly as a kind of enum, to switch on their keys?
Or can I reverse-engineer the captured data to find the default internationalized value and work with that?
I did encounter several strategies. I would start with a properties file.
One factor is that the data must be professionally maintained:
keep it in version control.
keep a version number for us humans, "1.0.23"
keep the texts ordered and nice, to help translation.
keep a second properties file with a glossary for consistent translation.
Furthermore, I have seen properties files or Java ListResourceBundles generated from DocBook XML, Excel, and translation memories. And yes, from a database.
Maintenance of the data must be done carefully, as several different parties will use the texts at different times.
Programming tools, consistency checks, data preparation and communication are tasks not to neglect.
Properties files are not entirely ideal, but IDEs generally have some support for them.
Set everything up for UTF-8, but note that properties files are traditionally read as ISO-8859-1 (that is what Properties.load(InputStream) assumes; since Java 9, ResourceBundle reads properties files as UTF-8 by default). You can use \uXXXX escapes or do an encoding conversion in your build process. ListResourceBundle Java sources, generated in that case, would be an alternative.
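To make the encoding point concrete, here is a minimal sketch of the two ways of getting a non-ASCII value out of a properties file. The key "greeting" and the class name are made up for illustration, and the "files" are in-memory byte arrays so the example is self-contained; with real files you would open a stream instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class PropertiesEncodingDemo {

    /** Loads properties from bytes that are known to be UTF-8 encoded. */
    static Properties loadUtf8(byte[] utf8Bytes) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8)) {
            props.load(reader); // load(Reader) decodes with the reader's charset
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Loads properties the classic way: load(InputStream) assumes ISO-8859-1. */
    static Properties loadLatin1(byte[] bytes) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Same text stored two ways: raw UTF-8, and ASCII with \\uXXXX escapes.
        byte[] utf8File =
            "greeting=Gr\u00fc\u00df Gott\n".getBytes(StandardCharsets.UTF_8);
        byte[] escapedFile =
            "greeting=Gr\\u00fc\\u00df Gott\n".getBytes(StandardCharsets.ISO_8859_1);

        String fromUtf8 = loadUtf8(utf8File).getProperty("greeting");
        String fromEscapes = loadLatin1(escapedFile).getProperty("greeting");
        System.out.println(fromUtf8.equals(fromEscapes)); // both decode to the same text
    }
}
```

Either route works; the escaped form survives any tool that mangles encodings, while the UTF-8 form is much kinder to translators.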
For my internship I've been asked to do some research on software internationalization and the current practices and solutions.
I've done some research and have not found a viable solution. My project manager has asked that I ask on Stack Overflow:
What practices do you use at your job to internationalize your Java software?
EDIT
The following is a summary of my research in case any other person is interested in my findings:
As the software is written in Java, ResourceBundles are the obvious choice. ResourceBundles provide good key/value lookup, with fallback to default values if no specific translation exists for the current locale. ResourceBundles are also not limited to translating text; they cover internationalization of resources in general. For example, colors or images mean different things in different cultures.
While all that is nice, relying purely on Java's PropertyResourceBundle fails to provide metadata for the translator and fails to handle plural forms.
GNU gettext takes an alternative approach to internationalization. Messages are written in the source code in English and then extracted and stored in a file. The extraction program searches for function calls and extracts their parameters. For example, given tr("Hello, World!"), the command-line utility xgettext would search for occurrences of the function "tr" and extract all string literals.
Java implementations of gettext exist, such as:
https://code.google.com/p/gettext-commons/
https://github.com/jhorstmann/i18n
What gettext provides that ResourceBundles don't is plural handling and context for translations.
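For simple cases, the JDK can approximate plural handling without gettext by putting a MessageFormat choice pattern into the bundle value. This is only a sketch, not a substitute for gettext's plural rules: the pattern below would be the value of a hypothetical key such as "files.count", and a choice pattern needs care for languages with more than two plural categories.

```java
import java.text.MessageFormat;
import java.util.Locale;

class PluralSketch {

    // Value that would normally live in a properties file under a
    // hypothetical key such as "files.count".
    static final String PATTERN =
        "There {0,choice,0#are no files|1#is one file|1<are {0,number,integer} files}.";

    /** Formats the pattern for the given count using English number formatting. */
    static String describe(int count) {
        return new MessageFormat(PATTERN, Locale.ENGLISH)
            .format(new Object[] { count });
    }

    public static void main(String[] args) {
        System.out.println(describe(0)); // There are no files.
        System.out.println(describe(1)); // There is one file.
        System.out.println(describe(3)); // There are 3 files.
    }
}
```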
Have a read of this trail as it should answer most of your questions.
For web applications we use the standard facilities offered by Java EE. That essentially means passing a message bundle into a JSF page and then using markup that looks like this #{msg.hello} in the page. "msg" is the name of the message bundle and "hello" is the key that will be used to look up the translated string.
The translations are all held in properties files, which have a standardized format and naming convention. The process works in much the same way for client applications, although I don't feel it's quite as smooth.
As I understand it professional translators have software that will load properties files and assist them in producing the translations. Adding comments to your properties files is useful so the translators have some context when translating.
In addition to other answers I would suggest using some technique/software that can analyze/check that all localization resources in your project are in sync.
That usually should be done during build time, so you can find/catch errors earlier.
One such tool that I personally use and would recommend is i18n-maven-plugin.
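If you just want to see what such a check boils down to, here is a rough sketch of the idea in plain Java. It is not how any particular plugin works; the class and method names are invented, and in a real build you would load each locale's properties file instead of filling Properties by hand.

```java
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

class BundleSyncCheck {

    /** Returns the keys present in the base bundle but absent from the translation, sorted. */
    static Set<String> missingKeys(Properties base, Properties translation) {
        Set<String> missing = new TreeSet<>(base.stringPropertyNames());
        missing.removeAll(translation.stringPropertyNames());
        return missing;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("greeting", "Hello");
        base.setProperty("farewell", "Goodbye");

        Properties french = new Properties();
        french.setProperty("greeting", "Bonjour");

        // A build step could fail when this set is not empty.
        System.out.println(missingKeys(base, french)); // [farewell]
    }
}
```

Running the same comparison in the other direction catches stale keys that were removed from the base bundle but linger in a translation.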
Hope this helps.
In Android applications, resources are specified in xml documents, which automatically are built into the R class, readily accessible within the source code as strongly typed.
Is there any way I could use a similar approach for a regular Java desktop application?
What I'd like to accomplish, is both the removal of strings from the code (as a separation of "layers", more or less) and to make it easy to add support for localization, by simply telling the program to choose the xml file corresponding to the desired language.
I've googled around a bit, but the things I'm looking for seem to be drowning in results about parsing or outputting xml, rather than tools utilizing xml to generate code.
Eclipse's message bundle implementation (used by plugins for example) integrates with the Externalize Strings feature and generates both a static class and a resource properties file for your strings:
http://www.eclipse.org/eclipse/platform-core/documents/3.1/message_bundles.html
For this integration to work Eclipse needs to see org.eclipse.osgi.util.NLS on the class path. From memory, the dependencies of the libraries it was available in were a little tricky for the project I used this approach in, so I just got the source and have it as a stand-alone class in my core module (see the comments for more on that).
It provides the type safety you're looking for and the IDE features save a lot of time. I've found no downsides to the approach so far.
Edit: this is actually what ghostbust555 mentioned in the comments, but not clear in that article that this isn't limited to Eclipse plugins and you refer to your resources via static members of a messages class.
I haven't seen any mention of others using this approach with their own applications, but to me it makes complete sense given the IDE integration and type safety.
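If you don't want the Eclipse dependency at all, a similar kind of type safety can be had with a plain enum in front of a ResourceBundle. This is only a sketch of that idea, not the NLS mechanism: the enum, its keys and the inline bundle text are invented for the example, and a real application would call ResourceBundle.getBundle(...) with a locale instead of building the bundle from a string.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class TypedMessages {

    /** Hypothetical message keys; each constant maps to one key in the bundle. */
    enum Msg {
        GREETING("greeting"),
        FAREWELL("farewell");

        private final String key;

        Msg(String key) {
            this.key = key;
        }

        /** Looks up the translated text for this message in the given bundle. */
        String text(ResourceBundle bundle) {
            return bundle.getString(key);
        }
    }

    /** Builds a bundle from properties-formatted text (stand-in for a real file). */
    static ResourceBundle bundleFrom(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle french = bundleFrom("greeting=Bonjour\nfarewell=Au revoir\n");
        System.out.println(Msg.GREETING.text(french)); // Bonjour
        System.out.println(Msg.FAREWELL.text(french)); // Au revoir
    }
}
```

A misspelled key now fails at compile time rather than at lookup time, and the enum constants can be used in a switch.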
I'm not sure if this is what you mean but check out internationalization- http://netbeans.org/kb/docs/java/gui-automatic-i18n.html
Are you looking for something that parses XML files and generates Java instances of similar "struct-like" objects, like JAXP, and JAXB?
I came across ResGen which, given resource bundle XML files generates Java files that can be used to access the resources in a type-safe way.
http://eigenbase.sourceforge.net/resgen/
My Java application executes JavaScript. However, something like the jQuery library is really too long to fit into a String variable. I am able to read jquery.js from a file, but I'm not sure how to package it inside the .jar file.
Loading the .js files is the same as loading any other resource from a jar file. Generally, this is what I do:
For files stored in the root of the jar file:
SomeClass.class.getClassLoader().getResourceAsStream( "myFile.js" );
For files stored alongside a .class file in the jar:
SomeClass.class.getResourceAsStream( "myFile.js" );
Both techniques give you an InputStream. Turning it into a String takes a little more code; see Read/convert an InputStream to a String.
This technique is for when your resource files are in the same jar as your java class files.
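Putting the two steps together might look roughly like this. It is a sketch under a few assumptions: the helper names are made up, the script is assumed to be UTF-8, and InputStream.readAllBytes() needs Java 9 or later. The main method feeds in an in-memory stream so that it runs without an actual jar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ResourceText {

    /** Reads the whole stream as UTF-8 text and closes it. */
    static String readUtf8(InputStream in) {
        try (InputStream stream = in) {
            return new String(stream.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads a resource stored next to the given class (or at an absolute "/path"). */
    static String loadScript(Class<?> anchor, String name) {
        InputStream in = anchor.getResourceAsStream(name);
        if (in == null) {
            // getResourceAsStream returns null rather than throwing when nothing is found
            throw new IllegalArgumentException("Resource not found: " + name);
        }
        return readUtf8(in);
    }

    public static void main(String[] args) {
        InputStream fake = new ByteArrayInputStream(
            "var answer = 42;".getBytes(StandardCharsets.UTF_8));
        System.out.println(readUtf8(fake)); // var answer = 42;
    }
}
```

The null check is worth keeping: a wrong path is the most common failure here, and without the check it only shows up later as a NullPointerException.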
There are all sorts of places you can keep your JavaScript sources:
In the CLASSPATH. You fetch them with getResourceAsStream()
In the database. Yes, the database. You fetch them like you'd fetch any other CLOB.
Personally I've used both approaches for different purposes. You can keep your JavaScript files around in your build tree in a way that exactly parallels the way you keep .properties files. Personally I just keep them in with the .java files and then have a build rule to make sure they end up in the .war, but they can really live anywhere your build engine can find them.
The database is a nice place to keep scripts because it makes it much easier for your web application to support a "script portal" that allows dynamic updates. That's an extremely powerful facility to have, especially if you craft the web application so that Javascript modules control some of the more important business logic, because you can deploy updates more-or-less "live" without anything like a deployment operation.
One thing that helps a lot is to create some utility code to "wrap" whatever access path you're using to Javascript (that is, either the Sun "javax.script" stuff, or else the Rhino bindings; at this point in time, personally I'd go with straight Rhino because it really doesn't make much difference one way or the other anyway, and the Sun stuff is stuck with a fairly old and buggy Rhino version that in the current climate will probably not see an update for a while). With a utility wrapper, one of the most important things to do is make it possible for your JavaScript code (wherever it comes from) to import other JavaScript files from your server infrastructure. That way you can develop JavaScript tool libraries (or, of course, adapt open-source libraries) and have your business logic scripts import and use them.
OK, so I don't want to start a holy war here, but we're in the process of trying to consolidate the way we handle our application configuration files and we're struggling to make a decision on the best approach to take. At the moment, every application we distribute is using its own ad-hoc configuration files, whether it's property files (ini style), XML or JSON (internal use only at the moment!).
Most of our code is Java at the moment, so we've been looking at Apache Commons Config, but we've found it to be quite verbose. We've also looked at XMLBeans, but it seems like a lot of faffing around. I also feel as though I'm being pushed towards XML as a format, and my clients and colleagues are apprehensive about trying something else. I can understand it from the client's perspective, everybody's heard of XML, but at the end of the day, shouldn't we be using the right tool for the job?
What formats and libraries are people using in production systems these days, is anyone else trying to avoid the angle bracket tax?
Edit: really needs to be a cross platform solution: Linux, Windows, Solaris etc. and the choice of library used to interface with configuration files is just as important as the choice of format.
YAML, for the simple reason that it makes for very readable configuration files compared to XML.
XML:
<user id="babooey" on="cpu1">
    <firstname>Bob</firstname>
    <lastname>Abooey</lastname>
    <department>adv</department>
    <cell>555-1212</cell>
    <address password="xxxx">ahunter@example1.com</address>
    <address password="xxxx">babooey@example2.com</address>
</user>
YAML:
babooey:
    computer : cpu1
    firstname: Bob
    lastname: Abooey
    cell: 555-1212
    addresses:
        - address: babooey@example1.com
          password: xxxx
        - address: babooey@example2.com
          password: xxxx
The examples were taken from this page: http://www.kuro5hin.org/story/2004/10/29/14225/062
First: This is a really big debate issue, not a quick Q+A.
My favourite right now is to simply include Lua, because
I can permit things like width=height*(1+1/3)
I can make custom functions available
I can forbid anything else. (impossible in, for instance, Python (including pickles.))
I'll probably want a scripting language somewhere else in the project anyway.
Another option, if there's a lot of data is to use sqlite3, because they're right to claim
Small.
Fast.
Reliable.
Choose any three.
To which I would like to add:
backups are a snap. (just copy the db file.)
easier to switch to another db, ODBC, whatever. (than it is from fugly-file)
But again, this is a bigger issue. A "big" answer to this probably involves some kind of feature matrix or list of situations like:
Amount of data, or short runtime
For large amounts of data, you might want efficient storage, like a db.
For short runs (often), you might want something that you don't need to do a lot of parsing for, consider something that can be mmap:ed in directly.
What does the configuration relate to?
Host:
I like YAML in /etc. Is there an equivalent on Windows?
User:
Do you permit users to edit config with text editor?
Should it be centrally manageable? Registry / gconf / remote db?
May the user have several different profiles?
Project:
File(s) in project directory? (Version control usually follows this model...)
Complexity
Are there only a few flat values? Consider YAML.
Is the data nested, or dependent in some way? (This is where it gets interesting.)
Might it be a desirable feature to permit some form of scripting?
Templates can be viewed as a kind of configuration file.
XML XML XML XML. We're talking config files here. There is no "angle bracket tax" if you're not serializing objects in a performance-intense situation.
Config files must be human readable and human understandable, in addition to machine readable. XML is a good compromise between the two.
If your shop has people that are afraid of that new-fangled XML technology, I feel bad for you.
Without starting a new holy war, the sentiment of the 'angle bracket tax' post is one area where I majorly disagree with Jeff. There's nothing wrong with XML: it's reasonably human readable (as much as YAML or JSON or INI files are), but remember its intent is to be read by machines. Most language/framework combos come with an XML parser of some sort for free, which makes XML a pretty good choice.
Also, if you're using a good IDE like Visual Studio, and if the XML comes with a schema, you can give the schema to VS and magically you get IntelliSense (you can get one for NHibernate, for example).
Ultimately you need to think about how often you're going to be touching these files once in production: probably not that often.
This still says it all for me about XML and why it's still a valid choice for config files (from Tim Bray):
"If you want to provide general-purpose data that the receiver might want to do unforeseen weird and crazy things with, or if you want to be really paranoid and picky about i18n, or if what you’re sending is more like a document than a struct, or if the order of the data matters, or if the data is potentially long-lived (as in, more than seconds) XML is the way to go.
It also seems to me that the combination of XML and XPath hits a sweet spot for data formats that need to be extensible; that is to say, it’s pretty easy to write XML-processing code that won’t fail in the presence of changes to the message format that don’t touch the piece you care about."
@Guy
But application config isn't always just key/value pairs. Look at something like the tomcat configuration for what ports it listens on. Here's an example:
<Connector port="80" maxHttpHeaderSize="8192"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
connectionTimeout="20000" disableUploadTimeout="true" />
<Connector port="8009"
enableLookups="false" redirectPort="8443" protocol="AJP/1.3" />
You can have any number of connectors. Define more in the file and more connectors exist. Don't define any more and no more exist. There's no good way (imho) to do that with plain old key/value pairs.
If your app's config is simple, then something simple like an INI file that's read into a dictionary is probably fine. But for something more complex like server configuration, an INI file would be a huge pain to maintain, and something more structural like XML or YAML would be better. It all depends on the problem set.
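As a rough illustration of why repeated, structured entries are comfortable in XML, here is a sketch that reads every Connector element with the DOM parser that ships with the JDK. The wrapping <Service> element, the class name and the method name are assumptions made for the example; this is not how Tomcat itself reads its configuration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ConnectorConfig {

    /** Returns the port of every Connector element, in document order. */
    static List<Integer> connectorPorts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("Connector");
            List<Integer> ports = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element connector = (Element) nodes.item(i);
                ports.add(Integer.parseInt(connector.getAttribute("port")));
            }
            return ports;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<Service>"
            + "<Connector port=\"80\" maxThreads=\"150\" redirectPort=\"8443\" />"
            + "<Connector port=\"8009\" redirectPort=\"8443\" protocol=\"AJP/1.3\" />"
            + "</Service>";
        System.out.println(connectorPorts(xml)); // [80, 8009]
    }
}
```

Adding a third connector to the file needs no code change, which is the part that gets awkward with flat key/value pairs.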
We are using ini-style config files. We use the Nini library to manage them. Nini makes them very easy to use. Nini was originally for .NET but it has been ported to other platforms using Mono.
XML, JSON, INI.
They all have their strengths and weaknesses.
In an application context, I feel that the abstraction layer is the important thing.
If you can choose a way to structure the data that is a good middle ground between human readability and how you want to access/abstract the data in code, you're golden.
We mostly use XML where I work, and I can't really believe that a configuration file loaded into a cache as objects when first read or after it has been written to, and then abstracted away from the rest of the program, really is that much of a hit on either CPU or disk space.
And it is pretty readable too, as long as you structure the file right.
And all languages on all platforms support XML through some pretty common libraries.
@Herms
What I really meant was to stick to the recommended way software should store configuration values for any given platform.
What you often get then is also the recommended way these values should or can be modified, like a configuration menu in a program or a configuration panel in a "system prefs" application (for system services software, for example). Not letting end users modify them directly via RegEdit or Notepad...
Why?
The end users (=customers) are used to their platforms
System for backups can better save "safe setups" etc
@ninesided
About "choice of library": try to statically link any library you select, to lower the risk of getting into a version-conflict war on end users' machines.
If your configuration file is write-once, read-only-at-bootup, and your data is a bunch of name value pairs, your best choice is the one your developer can get working first.
If your data is a bit more complicated, with nesting etc, you are probably better off with YAML, XML, or SQLite.
If you need nested data and/or the ability to query the configuration data after bootup, use XML or SQLite. Both have pretty good query languages (XPath and SQL) for structured/nested data.
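To give a feel for querying XML configuration after it has been loaded, here is a small sketch using the XPath API bundled with the JDK. The document layout (a <config> root with <connector> children) and the helper name are invented for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathConfigQuery {

    /** Evaluates an XPath expression against the XML text and returns the result as a string. */
    static String query(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad expression or document: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical configuration layout.
        String xml =
            "<config>"
            + "<connector port=\"80\" protocol=\"HTTP/1.1\" />"
            + "<connector port=\"8009\" protocol=\"AJP/1.3\" />"
            + "</config>";
        // Select the port of the connector that speaks AJP.
        System.out.println(query(xml, "/config/connector[@protocol='AJP/1.3']/@port")); // 8009
    }
}
```

This version re-parses the text on every call, which is fine for a sketch; for repeated queries you would parse once into a DOM Document and evaluate against that.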
If your configuration data is highly normalized (e.g. 5th normal form) you are better off with SQLite because SQL is better for dealing with highly normalized data.
If you are planning to write to the configuration data set during program operation, then you are better off going with SQLite. For example, if you are downloading configuration data from another computer, or if you are basing future program execution decisions on data collected in previous program execution. SQLite implements a very robust data storage engine that is extremely difficult to corrupt when you have power outages or programs that are hung in an inconsistent state due to errors. Corruptible data leads to high field support costs, and SQLite will do much better than any home-grown solution or even popular libraries around XML or YAML.
Check out my page for more information on SQLite.
As far as I know, the Windows registry is no longer the preferred way of storing configuration if you are using .NET: most applications now make use of System.Configuration [1, 2]. Since this is also XML based, it seems to me that everything is moving in the direction of using XML for configuration.
If you want to stay cross-platform I would say that using some sort of a text file would be the best route to go. As for the formatting of said file, you might want to take into account if a human is going to be manipulating it or not. XML seems to be a bit more friendly to manual manipulation than INI files due to the visible structure of the file.
As for the angle bracket tax - I don't worry about it too often as the XML libraries take care of abstracting it. The only time it might be a consideration is if you have very little storage space to work with and every byte counts.
[1] System.Configuration Namespace - http://msdn.microsoft.com/en-us/library/system.configuration.aspx
[2] Using Application Configuration Files in .NET - http://www.developer.com/net/net/article.php/3396111
We are using properties files, simply because Java supports them natively. A couple of months ago I saw that SpringSource Application Platform uses JSON to configure their server and it looks very interesting. I compared various configuration notations and came to the conclusion that XML seems to be the best fit at the moment. It has nice tools support and is rather platform independent.
Re: epatel's comment
I think the original question was asking about application configuration that an admin would be doing, not just storing user preferences. The suggestions you gave seem more for user prefs than application config, and aren't usually something that the user would ever deal with directly (the app should provide the configuration options in the UI, and then update the files). I really hope you'd never make the user have to view/edit the Registry. :)
As for the actual question, I'd say XML is probably OK, as plenty of people will be used to using that for configuration. As long as you organize the configuration values in an easy to use manner then the "angle bracket tax" shouldn't be too bad.
Maybe a bit of a tangent here, but my opinion is that the config file should be read into a key/value dictionary or hash table when the app first starts up and always accessed via this object from then on, for speed. Typically the key/value table starts off as string to string, but helper functions in the object do things such as DateTime GetConfigDate(string key) etc...
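Since most of the code in the question is Java, here is roughly what that idea could look like there. It is only a sketch: the class name, the getter names and the ISO date format (yyyy-MM-dd) are my own assumptions, and the map would normally be filled from whatever file format you settle on.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

final class AppConfig {

    private final Map<String, String> values;

    /** Takes a snapshot of the raw string-to-string settings read at startup. */
    AppConfig(Map<String, String> values) {
        this.values = new HashMap<>(values);
    }

    /** Returns the raw value, failing loudly when the key is absent. */
    String getString(String key) {
        String value = values.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing config key: " + key);
        }
        return value;
    }

    /** Typed helper: parses the value as an integer. */
    int getInt(String key) {
        return Integer.parseInt(getString(key).trim());
    }

    /** Typed helper: parses the value as an ISO-8601 date such as 2024-01-31. */
    LocalDate getDate(String key) {
        return LocalDate.parse(getString(key).trim());
    }

    public static void main(String[] args) {
        Map<String, String> raw = new HashMap<>();
        raw.put("port", "8080");
        raw.put("launch.date", "2024-01-31");

        AppConfig config = new AppConfig(raw);
        System.out.println(config.getInt("port"));         // 8080
        System.out.println(config.getDate("launch.date")); // 2024-01-31
    }
}
```

The rest of the program only ever sees AppConfig, so the file format behind it can change without touching callers.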
I think the only important thing is to choose a format that you prefer and can navigate quickly. XML and JSON are both fine formats for configs and are widely supported--technical implementation isn't at the crux of the issue, methinks. It's 100% about what makes the task of config files easier for you.
I have started using JSON, because I work quite a bit with it as a data transport format, and the serializers make it easy to load into any development framework. I find JSON easier to read than XML, which makes handling multiple services, each using a config file that is modified quite frequently, that much easier for me!
What platform are you working on? I'd recommend trying to use the preferred/common method for it.
MacOSX - plists
Win32 - Registry (or is there a newer way? It's been a long time since I developed on it)
Linux/Unix - ~/.apprc (name-value perhaps)