java locale getDisplayLanguage is missing a few locales - java

I'm using locale.getDisplayLanguage(otherLocale) to display the names of several languages in one locale.
If I use the following:
Locale loc = new Locale("he","IL");
Locale loc2 = new Locale("fr","FR");
System.out.println(loc2.getDisplayLanguage(loc));
It doesn't print the French language name in Hebrew.
Other locales that don't work are Arabic, Finnish and some more.
Do you have any idea? Is Java missing some locale translations?
Thanks,
Tal.

"is java missing some locales
translations?"
It would seem so. In fact, I think I spotted a posting on the Sun forums site that said the same.
However, I couldn't see a current Bug in the Sun Java Bug database. Why don't you create one?
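One rough way to see which languages lack a translation in a given target locale: the Javadoc for getDisplayLanguage(Locale) says the method falls back to the English name, and finally to the ISO code, when no localized name exists. The sketch below relies on that; the class and method names are made up for illustration, and the heuristic will misreport any language whose localized name happens to equal its English name.

```java
import java.util.Locale;

class DisplayNameCheck {
    // Heuristic: the name looks untranslated when it equals the English
    // name or the bare language code (the documented fallbacks).
    static boolean looksUntranslated(Locale language, Locale target) {
        String name = language.getDisplayLanguage(target);
        return name.equals(language.getDisplayLanguage(Locale.ENGLISH))
                || name.equals(language.getLanguage());
    }

    public static void main(String[] args) {
        Locale hebrew = Locale.forLanguageTag("he-IL");
        String[] tags = {"fr", "ar", "fi"};
        for (String tag : tags) {
            Locale lang = Locale.forLanguageTag(tag);
            System.out.println(tag + " -> " + lang.getDisplayLanguage(hebrew)
                    + (looksUntranslated(lang, hebrew) ? " (no Hebrew name found)" : ""));
        }
    }
}
```

What it prints depends on the locale data shipped with the JDK in use, so no expected output is given here.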

Related

How to use unsupported Locale in Java 11 and numbers in String.format()

How can I use an unsupported Locale (e.g. ar-US) in Java 11 when I output a number via String.format()?
In Java 8 this worked just fine (try jdoodle, select JDK 1.8.0_66):
Locale locale = Locale.forLanguageTag("ar-US");
System.out.println(String.format(locale, "Output: %d", 120));
// Output: 120
Since Java 11 the output is in Eastern Arabic numerals (try jdoodle, use default JDK 11.0.4):
Locale locale = Locale.forLanguageTag("ar-US");
System.out.println(String.format(locale, "Output: %d", 120));
// Output: ١٢٠
It seems this problem comes from the switch in the Locale Data Providers from JRE to CLDR (source: Localization Changes in Java 9 by @mcarth). Here is a list of supported locales: JDK 11 Supported Locales
UPDATE
I updated the question's example to ar-US, as my previous example didn't make sense. The idea is to have a format which makes sense in the given country. In the example it would be the United States (US).
The behavior conforms to the CLDR being treated as the preferred Locale. To confirm this, the same snippet in Java-8 could be executed with
-Djava.locale.providers=CLDR
If you step back to look at the JEP 252: Use CLDR Locale Data by Default, the details follow :
The default lookup order will be CLDR, COMPAT, SPI, where COMPAT
designates the JRE's locale data in JDK 9. If a particular provider
cannot offer the requested locale data, the search will proceed to the
next provider in order.
So, in short if you really don't want the default behaviour to be that of Java-11, you can change the order of lookup with the VM argument
-Djava.locale.providers=COMPAT,CLDR,SPI
What might help further is understanding more about picking the right language using CLDR!
I'm sure I'm missing some nuance, but the problem is with your tag, so fix that. Specifically:
ar-EN makes no sense. That's short for:
language = arabic
country = ?? nobody knows.
EN is not a country. en is certainly a language code (for english), but the second part in a language tag is for country, and EN is not a country. (for context, there is en-GB for british english and en-US for american english).
Thus, this is as good as ar (as in, language = arabic, not tied to any particular country). Even if you did tie it to some country, that is mostly immaterial here; that would affect things like 'what is the first day of the week' ,'which currency symbol is to be presumed' and 'should temperatures be stated in Kelvin or Fahrenheit' perhaps. It has no bearing on how to show digits, because that's all based on language.
And the language is Arabic, thus ١٢٠ is what you get when you use ar as a language tag when printing the number 120. The problem is that you expect this to return "120", which is a bizarre wish (1), combined with the fact that Java, unfortunately, shipped with a bug for a long, long time that made it act in this bizarre fashion, thinking that rendering the number 120 in Arabic is best done with "120", which is wrong.
So, with that context, in order of preference:
Best solution
Find out why your system ends up with ar-EN and nevertheless expects '120', and fix this. Also fix ar-EN in general; EN is not a country.
More generally, 'unsupported locale' isn't really a thing. The ar part is supported, and it's the only relevant part of the tag for rendering digits.
Alternatives
The most likely best answer if the above is not possible is to explicitly work around it. Detect the tag yourself, and write code that will just respond with the result of formatting this number using Locale.ENGLISH instead, guaranteeing that you get Output: 120. The rest seems considerably worse: You could try to write a localization provider which is a ton of work, or you can try to tell java to use the JRE version of the provider, but that one is obsoleted and will not be updated, so you're kicking the can down the road and setting yourself up for a maintenance burden later.
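The "detect the tag yourself" workaround could be sketched roughly as follows. The helper name formatCount is hypothetical, and checking only for the language ar is an assumption about which tags need the override; a real application may need a different rule.

```java
import java.util.Locale;

class DigitWorkaround {
    // Hypothetical helper: keep the caller's locale in general, but format
    // numbers with Locale.ENGLISH when the language is Arabic.
    static String formatCount(Locale locale, int value) {
        Locale numberLocale = "ar".equals(locale.getLanguage()) ? Locale.ENGLISH : locale;
        return String.format(numberLocale, "Output: %d", value);
    }

    public static void main(String[] args) {
        // Prints "Output: 120" regardless of the locale data provider in use.
        System.out.println(formatCount(Locale.forLanguageTag("ar-US"), 120));
    }
}
```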
1.) Given that the JRE variant actually printed 120, and you're also indicating you want this, I get that nagging feeling I'm missing some political or historical info and the expectation that ar-EN results in rendering the number 120 as "120" is not so crazy. I'd love to hear that story if you care to provide it!

How to convert 3 letter language code to corresponding text?

Do we have any java libraries to convert 3 letter language code to its corresponding language with localization support?
Like,
ENG -> English
PS: I guess it's a bad question, but Google was not much help, hence I am turning to you all. Probably my search term was not accurate.
Use Locale's getDisplayLanguage() method:
Locale eng = Locale.forLanguageTag("ENG"); // Make a locale from language code
System.out.println(eng.getDisplayLanguage()); // Obtain language display name
Demo.
I do not know about a Java library but this might help.
https://www.loc.gov/standards/iso639-2/php/code_list.php
It has the data you are looking for. You might have to scrape it off the page and put it into your Java code.
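An alternative that avoids scraping is to build the reverse mapping from the JDK's own data: walk the 2-letter codes from Locale.getISOLanguages() and index each Locale by its getISO3Language() value. This is only a sketch; the class and method names are invented, and it assumes the 3-letter codes you receive are the ones the JDK reports (for example deu rather than ger for German).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.MissingResourceException;

class Iso3Languages {
    private static final Map<String, Locale> BY_ISO3 = new HashMap<>();

    static {
        // Index every 2-letter ISO 639 language by its 3-letter code.
        for (String code : Locale.getISOLanguages()) {
            Locale locale = Locale.forLanguageTag(code);
            try {
                BY_ISO3.put(locale.getISO3Language(), locale);
            } catch (MissingResourceException e) {
                // No 3-letter code available for this language; skip it.
            }
        }
    }

    // Returns the language name localized for 'in', or null if the code is unknown.
    static String displayName(String iso3, Locale in) {
        Locale locale = BY_ISO3.get(iso3.toLowerCase(Locale.ROOT));
        return locale == null ? null : locale.getDisplayLanguage(in);
    }

    public static void main(String[] args) {
        System.out.println(displayName("ENG", Locale.ENGLISH)); // English
    }
}
```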

Java Locale.getDefault() cannot return values other than en

I have run into a problem in Java: I want to get the system locale's country on a Windows computer, so I wrote code like this:
Locale x = Locale.getDefault();
String output = x.getCountry();
If I set my system language to, say, English (Singapore), I get en-SG, and if I set it to English (Canada), I get en-CA. But if I change to a language that is not English, I get en-GB for every option. Why is that?
Besides that, is there any other way to get the current country information using Java?
Locale.getDefault() should abstract from the operating system. That is the purpose.
I do not think that this is a problem with Java. I just tried changing my language and it works.
My Setup was:
Windows 10 Home edition 64 Bit
Java 1.8.0_45
de_DE to en_GB
I had to download the language pack for en_GB.
I had to log off and login in to activate the other language.
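As for other ways to get the current country: since Java 7 the default locale is split into a DISPLAY category and a FORMAT category, and the JVM also exposes the user.country system property. The sketch below simply prints all of them side by side so they can be compared after changing the Windows settings. The class name is made up, and which Windows setting feeds which value is something to verify on your own machine rather than take from this example.

```java
import java.util.Locale;

class DefaultLocaleInfo {
    // Collects the different "default" locales the JVM knows about.
    static String describe() {
        return "default: " + Locale.getDefault().toLanguageTag() + "\n"
                + "display: " + Locale.getDefault(Locale.Category.DISPLAY).toLanguageTag() + "\n"
                + "format:  " + Locale.getDefault(Locale.Category.FORMAT).toLanguageTag() + "\n"
                + "user.country: " + System.getProperty("user.country");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```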

how to convert english number to chinese in java

I have to convert English numbers to Chinese numbers, but the Chinese number system is different from the English one. Is there any way to convert an English number obtained at run time into Chinese?
Thank You.
Vikram
Instead of rolling your own, it is advisable to use the ICU4J NumberFormat, as in @mcdowell's answer.
The only difference is that the numbering system ID "hansfin" should be replaced with "hans" if you wish to convert 61305 into "六万一千三百零五".
Locale chineseNumbers = new Locale("C@numbers=hans");
com.ibm.icu.text.NumberFormat formatter =
com.ibm.icu.text.NumberFormat.getInstance(chineseNumbers);
System.out.println(formatter.format(61305));
Here are the results for the different numbering system IDs.
hans 六万一千三百零五
hant 六萬一千三百零五
hansfin 陆万壹仟叁佰零伍
hantfin 陸萬壹仟參佰零伍
The hans is the abbreviation of "Han Simplified" (i.e. Simplified Chinese), while the hant is "Han Traditional" (i.e. Traditional Chinese) and the fin is "Finance".
ICU4J has support for this:
Locale chineseNumbers = new Locale("en_US@numbers=hansfin");
com.ibm.icu.text.NumberFormat formatter =
com.ibm.icu.text.NumberFormat.getInstance(chineseNumbers);
System.out.println(formatter.format(100));
Tested with version 4.8.
In that case, I suggest you build a hash table for it.
It's not that difficult to start with.
We know that Chinese numerals are pretty much defined by:
See: http://en.wikipedia.org/wiki/Chinese_numerals
With that, I think you are more than capable of building a table in your preferred programming language, Java.
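A minimal table-driven sketch of that idea, using plain arrays as the lookup tables. It is an illustration rather than a complete converter: it assumes simplified everyday numerals, handles only 0 to 99,999,999, and writes 10 as 一十, which may differ from what ICU4J produces. The class and method names are invented.

```java
class ChineseNumerals {
    // Lookup tables: digit characters and the units inside a four-digit group.
    private static final char[] DIGITS = "零一二三四五六七八九".toCharArray();
    private static final String[] UNITS = {"千", "百", "十", ""};

    // Converts 1..9999, writing a single 零 for an interior run of zeros.
    private static String group(int n) {
        StringBuilder sb = new StringBuilder();
        boolean zeroPending = false;
        for (int i = 0, div = 1000; i < 4; i++, div /= 10) {
            int d = n / div % 10;
            if (d == 0) {
                zeroPending = sb.length() > 0; // ignore leading zeros
            } else {
                if (zeroPending) sb.append('零');
                zeroPending = false;
                sb.append(DIGITS[d]).append(UNITS[i]);
            }
        }
        return sb.toString();
    }

    static String toChinese(int n) {
        if (n < 0 || n >= 100_000_000) {
            throw new IllegalArgumentException("supported range is 0..99,999,999");
        }
        if (n == 0) return "零";
        int high = n / 10_000; // the 万 (ten-thousands) group
        int low = n % 10_000;
        StringBuilder sb = new StringBuilder();
        if (high > 0) sb.append(group(high)).append('万');
        if (low > 0) {
            if (high > 0 && low < 1000) sb.append('零'); // gap after 万
            sb.append(group(low));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toChinese(61305)); // 六万一千三百零五
    }
}
```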

Dot Net and Java Culture Codes

I've been tasked with the awesome job of generating a look-up table for our application culture information. The columns I need to generate data for are:
Dot Net Code
Version
Culture Name
Country Name
Language Name
Java Country Code
Java Language Code
Iso Country Code
Iso Language Code
I have found the globalization name space, but I'm sure someone out there has asked the same question, or there is a table already available.
Thanks for any help
Java uses the 2-letter ISO country and language codes. I recommend getting rid of the "Java Country Code" and "Java Language Code" fields in your lookup table, since they would be redundant.
I assume that wherever you get your ISO country and language codes, you'll find their corresponding names in English. However, the Java Locale API will also give you the localized names for the country and language, if you need them. (I.e., what is America called in Japan?)
For example, you can do this:
Locale l = Locale.ITALY;
System.out.println(l.getDisplayCountry() + ": " + l.getDisplayLanguage());
System.out.println(l.getDisplayCountry(l) + ": " + l.getDisplayLanguage(l));
Which, running in the US English locale prints:
Italy: Italian
Italia: italiano
Note that you can obtain 3-letter ISO codes from the Locale class, but when constructing them, be sure to only use 2-letter codes.
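To fill the Java and ISO columns of such a lookup table, one option is to iterate over Locale.getAvailableLocales() and emit one row per locale. The sketch below is only a starting point: the column order and the tab-separated layout are arbitrary choices, the class name is invented, and toLanguageTag() yields BCP 47 tags that often, but not always, match .NET culture names.

```java
import java.util.Locale;
import java.util.MissingResourceException;

class CultureTable {
    // One tab-separated row: tag, 2-letter codes, 3-letter codes, English names.
    static String row(Locale l) {
        return String.join("\t",
                l.toLanguageTag(),
                l.getLanguage(), l.getCountry(),
                l.getISO3Language(), l.getISO3Country(),
                l.getDisplayLanguage(Locale.ENGLISH), l.getDisplayCountry(Locale.ENGLISH));
    }

    public static void main(String[] args) {
        for (Locale l : Locale.getAvailableLocales()) {
            if (l.getCountry().isEmpty()) continue; // language-only locales
            try {
                System.out.println(row(l));
            } catch (MissingResourceException e) {
                // Some locales have no 3-letter code; leave them out.
            }
        }
    }
}
```

For Locale.ITALY the row reads it-IT, it, IT, ita, ITA, Italian, Italy.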
That's strange, the last time I visited this page, someone had beaten me to posting the links to the Java references for Localization.
However, since their post is gone, here's what I was writing before they beat me to it.
Java uses two ISO standards for localization with java.util.Locale.
2-letter ISO-639 for language.
2-letter ISO-3166 for country.
Java uses Locales to store this information. Most of the information you need about it can be found on Sun's Internationalization page. Java uses a syntax similar to the "en-us" syntax, but it separates the parts with an underscore rather than a hyphen.
I'm guessing that you mean Localization or Internationalization or i18n.
Try this tutorial:
http://java.sun.com/docs/books/tutorial/i18n/index.html
Good Luck,
Randy Stegbauer
