Internationalization of distances in Java

Is it possible to internationalize distances in Java without any extra library?
I mean, it is possible to handle dates, times, currencies, and numbers that way.
I would have expected to find a NumberFormat.getDistanceInstance or something similar.
Is there something like that already built in, or should I make my own internationalization system for distances (mostly miles vs. kilometers)?

I would love to hear about such a formatter, but unfortunately I never have. The problem is that there is no such data in CLDR yet, so it is not easy to do.
That said, people have been thinking about this for quite a while; see ICU's Measure class. Unfortunately, for now it seems the closest you can get is to determine the measurement system; see LocaleData and LocaleData.MeasurementSystem.
After that you are on your own. You would need to leave the rest to translators (they need to translate the units as well as the formatting pattern).

No, there's nothing in the JDK to internationalize distances, weights, and most other measurement units, except for calendars (I know a calendar is not really a unit, but the lunar calendar is quite different from the Gregorian calendar). Even operating systems don't have that kind of information.
The only i18n you can do with times, currencies, and numbers is the formatting. There's no feature to change the measurement unit.
So you'll need to build your own for distances :S.
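A minimal sketch of what "build your own" could look like using only the JDK. Everything here is an assumption made for illustration: the class name DistanceFormatter, the hand-maintained list of mile-using countries (the JDK ships no such data), and the hard-coded "mi" and "km" labels, which in a real application would come from translated resource bundles. Only the number itself is formatted by the JDK, through NumberFormat.

```java
import java.text.NumberFormat;
import java.util.Locale;
import java.util.Set;

final class DistanceFormatter {
    // Assumption: a hand-maintained list of countries that display distances in miles.
    private static final Set<String> MILE_COUNTRIES = Set.of("US", "GB", "LR", "MM");
    private static final double METERS_PER_MILE = 1609.344;

    /** Formats a distance given in meters as miles or kilometers, depending on the locale's country. */
    static String format(double meters, Locale locale) {
        boolean miles = MILE_COUNTRIES.contains(locale.getCountry());
        double value = miles ? meters / METERS_PER_MILE : meters / 1000.0;
        // The JDK does localize the number itself (decimal separator, grouping).
        NumberFormat numberFormat = NumberFormat.getNumberInstance(locale);
        numberFormat.setMaximumFractionDigits(1);
        // Assumption: untranslated unit labels; a real system would look these up per locale.
        return numberFormat.format(value) + (miles ? " mi" : " km");
    }

    public static void main(String[] args) {
        System.out.println(format(10_000, Locale.US));      // 6.2 mi
        System.out.println(format(2_500, Locale.GERMANY));  // 2,5 km
    }
}
```

Storing distances in one unit (meters here) and converting only at display time keeps the rest of the application free of unit logic.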

Related

How do you grab a certain string of text through a link in java?

Is it possible to grab a certain piece of text from a website in Java? For example, given https://weather.com/weather/today/l/41.93,-88.25?par=google&temp=f , how would I be able to figure out the temperature that it displays?

The practical answer to your question is: You don't wanna do that.
Let me try to answer it, at which point you'll realize why you don't want to:
How do I programmatically parse a website?
It's complicated. Just about every browser has an option to right click and 'view source'. Presumably the number(s) you want are in here; you can parse this text to find them. It's NOT easy though. You'll probably be tempted to use something like a regular expression or a simple 'find me this exact string of text' trick to find what you need. It may work. But generally that means the day that this site changes the style or just does some basic updates, your code ceases to work.
You'll need to put it in your agenda to check, every day if you have to, whether your code still works. That's 5 minutes out of your day, every day, for the rest of the life of this project. That sounds incredibly expensive, which is why you don't want this.
If you must, there are ways to tighten up your parsing code. If you use libraries like jsoup, that helps a bit. If you toss the entire site through a 'browser emulator', you can deal with javascript making ajax requests and the like (these days websites are like little programs, and to truly observe programmatically what the site shows to human eyes, you need to run that program to get the job done. If you're very lucky, you can inspect the 'source code' of the little program and that's all you need, but you're not always that lucky).
But, as I said, that just helps a bit. The day will come when the Weather Channel changes their site and breaks your code. They won't announce it. It is not considered immoral or technically dubious to do so. Maybe you can relax your agenda item from a daily check to a weekly one, but it'll be a permanent maintenance burden. You DO NOT WANT THIS.
Okay, forget that. How does this really work?
Sites that are designed to let you read this stuff have an API. They'll document it someplace. This is a 'website' made specifically for code. It has no formatting, and a well defined specification. Send it this specific simple string, and this specific simple answer comes out, and the site has tooling to let you know when they change it (for example, an 'API version') - all luxuries the site meant for human consumption will not have.
You're in luck. The weather channel has an API.
What you really want is to read all that, figure out how that API works, and use that.
The API will not break when the weather channel decides that today is a good day to slightly change the shade of the background image.
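To make the API route concrete, here is a rough sketch using the JDK's java.net.http client (Java 11 or later). The endpoint https://api.example.com/v1/current, its lat, lon, and units parameters, and the bearer-token header are all hypothetical placeholders, not the Weather Channel's real API; the actual URL, parameters, and authentication scheme have to come from the provider's documentation.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Locale;

final class WeatherApiSketch {
    // Hypothetical endpoint: replace with the one from the provider's API documentation.
    static final String BASE_URL = "https://api.example.com/v1/current";

    /** Builds a GET request for the hypothetical endpoint; no network access happens here. */
    static HttpRequest buildRequest(double lat, double lon, String apiKey) {
        String url = String.format(Locale.ROOT, "%s?lat=%.2f&lon=%.2f&units=f", BASE_URL, lat, lon);
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", "application/json")
                .header("Authorization", "Bearer " + apiKey) // assumption: token-based auth
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
    }

    /** Sends the request and returns the raw response body, failing on any non-200 status. */
    static String fetch(HttpClient client, HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }
}
```

The body would typically be JSON, which is best read with a JSON library rather than with string searches, for the same reasons given above about parsing HTML.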

Java typed i18n (java)

I'd like to know if it's possible (and with which tooling) to do typesafe i18n in Java. Maybe it's not clear so here are some details, assuming we use something based on MessageFormat
1) Translate using typesafe parameters
I'd like to avoid having an interface like String translate(Object key,Object... values) where the values are untyped. It should be impossible to call with a bad parameter type.
Note I'm fine specifying the typing of all the keys. The solution I'm looking for should be scalable and should not increase the backend startup time significantly.
2) It should be known at compile time which keys are still used
I don't want my translation keys base to be like many websites' CSS, growing and growing forever and everybody being frightened to remove keys because we don't know easily if they are still useful or not.
In JS/React land there is babel-plugin-react-intl, which lets you extract, at compile time, the translation keys that are still found in the code. Then we can diff these keys with our translation backend/SaaS and delete the unused keys automatically. Is there anything close to that experience in Java land?
I'm looking for:
any trick you have that could make i18n more manageable in Java regarding these 2 problems I have
current tooling that might help me solve the problem
hints on how to implement something custom if tooling does not exist
Also, is Enum suitable to store a huge fixed list of translation keys?

Translation keys are an open-ended domain. For a closed domain an enum would do.
Having something like enums or constant lists is likely to cause a proliferation of different enums and constants classes.
And then there is the very important perspective of the translation business: you would want at least one glossary (for occurrences that need no translation), structurally equal phrases grouped together, and perhaps comments on ambiguous terms and usages (button/menu). This can reduce the time cost and improve the quality. There are also things like online help.
Up till now XML, such as simple DocBook or translation memory formats (TMX/XLIFF/...), was sufficient for that, and we built the tooling, including different forms of evaluation, ourselves.
I hope a more professional answer will be given, but my answer might shed some light on the desired functionality:
translation centric: as that needs the most work.
version control: some text lists involved.
checking tools: what you mentioned, integrity, missing, almost equal.
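As a hint for the "implement something custom" part of the question, one possible shape is to give each key a small generic type that records its parameter types, so the compiler rejects a call with the wrong argument types. This is only a sketch under assumptions: TypedMessages, Key1, Key2, and the two sample keys are made-up names, and the patterns are passed in as a plain map rather than loaded from a real translation backend.

```java
import java.text.MessageFormat;
import java.util.Locale;
import java.util.Map;

final class TypedMessages {
    /** Key for a message with one parameter of type A (hypothetical helper type). */
    static final class Key1<A> {
        final String id;
        Key1(String id) { this.id = id; }
    }

    /** Key for a message with two parameters of types A and B (hypothetical helper type). */
    static final class Key2<A, B> {
        final String id;
        Key2(String id) { this.id = id; }
    }

    // Each key is an ordinary constant, so "find usages" or unused-field checks can spot dead keys.
    static final Key1<String> USER_EXISTS = new Key1<>("user.exists");
    static final Key2<String, Integer> INBOX_COUNT = new Key2<>("inbox.count");

    private final Map<String, String> patterns;
    private final Locale locale;

    TypedMessages(Map<String, String> patterns, Locale locale) {
        this.patterns = patterns;
        this.locale = locale;
    }

    // The key's type parameters fix the argument types at compile time.
    <A> String translate(Key1<A> key, A a) {
        return format(key.id, a);
    }

    <A, B> String translate(Key2<A, B> key, A a, B b) {
        return format(key.id, a, b);
    }

    private String format(String id, Object... args) {
        return new MessageFormat(patterns.get(id), locale).format(args);
    }
}
```

With these definitions, translate(USER_EXISTS, "Khilon") compiles while translate(USER_EXISTS, 42) does not. Whether this scales to thousands of keys is an open question; generating the key constants from the translation files would be one way to keep them in sync.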

Does anyone know how to extract a TensorFlow DNNRegressor model and evaluate it manually?

I am trying to use a DNNRegressor model in a Java real-time context; unfortunately, this requires a garbage-free implementation. It doesn't look like TensorFlow Lite offers a GC-free implementation. The path of least resistance would be to extract the weights and re-implement the NN manually. Has anyone tried extracting the weights from a regression model and implementing the regression manually, and if so, could you describe any pitfalls?
Thanks!

I am not quite sure if your conclusion
The path of least resistance would be to extract the weights and re-implement the NN manually.
is actually true. It sounds to me like you want to use the trained model in an Android mobile application. I personally do not know much about that, but I am sure there are efficient ways to do exactly that.
However, assuming you actually need to extract the weights there are multiple ways to do this.
One straightforward way to do this is to implement the exact network you want yourself with TensorFlow's low-level API instead of using the canned DNNRegressor class (which is deprecated, by the way). That might sound unnecessarily complex, but it is actually quite easy and has the upside of putting you in full control.
A general way to get all trainable variables is to use TensorFlow's trainable_variables method.
Or maybe this might help you.
In terms of pitfalls, I don't really believe there are any. At the end of the day you are just storing a bunch of floats. You should probably make sure to use an appropriate file format, such as HDF5, and sufficient float precision.
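On the Java side, re-implementing the network amounts to a few nested loops over the exported floats. The sketch below assumes a plain fully connected network with ReLU on the hidden layers and a linear single-unit output, and kernels stored as [layer][input][output]; those are assumptions to check against the actual trained model. DenseRegressor is a made-up name. All buffers are allocated once in the constructor, so predict itself allocates nothing.

```java
final class DenseRegressor {
    private final float[][][] kernels;    // [layer][input][output]
    private final float[][] biases;       // [layer][output]
    private final float[][] activations;  // preallocated per-layer output buffers

    DenseRegressor(float[][][] kernels, float[][] biases) {
        this.kernels = kernels;
        this.biases = biases;
        this.activations = new float[kernels.length][];
        for (int layer = 0; layer < kernels.length; layer++) {
            activations[layer] = new float[biases[layer].length];
        }
    }

    /** Forward pass without allocation; not thread-safe because the buffers are shared. */
    float predict(float[] input) {
        float[] in = input;
        for (int layer = 0; layer < kernels.length; layer++) {
            float[] out = activations[layer];
            boolean isOutputLayer = layer == kernels.length - 1;
            for (int j = 0; j < out.length; j++) {
                float sum = biases[layer][j];
                for (int i = 0; i < in.length; i++) {
                    sum += in[i] * kernels[layer][i][j];
                }
                // Assumption: ReLU on hidden layers, identity on the output layer.
                out[j] = (isOutputLayer || sum > 0f) ? sum : 0f;
            }
            in = out;
        }
        return in[0];
    }
}
```

One practical pitfall is layout and preprocessing: any feature scaling done before training has to be reproduced exactly, and the kernel index order must match the export. Comparing a few predictions against the original model is a cheap check.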

Java localization best practices

I have a Java application with a server and a Swing client. Now I need to localize the user interface, and possibly some of the data also needs to be locale specific. There are a few things in particular I would like to hear your opinions on.
How should I distribute the localized strings for the UI into properties files? In my application there are several views and each has several panels. Should I have one localization file per language for each panel or view or should I keep all translations for one language in the same file? I'm currently leaning towards one file per view and language, but I'm not sure how I should handle some domain specific terms which appear in many places. Having the same translation on several files does not sound too good.
The server throws some exceptions that contain a message that should be displayed to the user. I could get the selected locale from the session and handle the localization at the server, but I feel it would be more elegant to keep all localization files at the client. I have been thinking about sending only a localization key from the server with some kind of placeholders for error specific information, which would be sent with the exception. Then the client could construct the message based on the localization key and replace the placeholders with the error specific information. Does that sound like a good way to handle it, or are there other options? Typically my exception messages contain some additional information that changes for each case. It could be for example "A user with username Khilon already exists", in which case the string in the properties file would be something like "A user with username {0} already exists".
The localization of the data is the area that is the most unclear to me. As I'm not sure whether it will ever be required, I have so far not planned it very much. The database part sounds straightforward enough: you basically just need an additional table for the strings and a column that tells which locale each string is for. I'm not sure, though, whether it would be best to have a localization table for each data table (e.g. Product and Product_names), or whether I could use one table of localization strings for all the data tables. The really tricky part is how to handle the UI, as to some degree a user would need to enter text for an object in multiple languages. In practice this could mean, for example, that a worker in Finland would give the object a name in Finnish and English, and then a worker in another country could translate it to her own language. If any of you have done something similar, I'd be happy to hear how you did it.
I'm very grateful to everybody who can share their experiences with me.
P.S. If you happen to know any exceptionally good websites or books on the subject, I would be happy to hear of them. I have of course done some googling and read some articles about localization, but nothing mind blowing yet.

Actually, what you are talking about is Internationalization (i18n), not Localization (L10n).
From my experience, you are on the right path.
ad 1). One properties file per view and locale is the right approach (locale, not necessarily language, as you may want different translations for the same language depending on country, e.g. different strings for British and American English). Since applications tend to evolve, it could save a good deal of money when you want to modify just one view (translators will charge you even for strings they won't touch, because they have to find the strings that need to be updated or newly translated). It would also be easier to use with Translation Memory tools if you do it right (always add new strings at the end of the file).
ad 2). The best idea is to send only the resource key from the server or other process; another approach is to attach the resource key and possibly the data (e.g. a numeric value) using delimiters, so the message can be recreated and reformatted in the local language.
ad 3). I have seen several approaches to localizing databases, but the best (and this is not only my opinion, but also that of IEEE members) is to store resource keys and recreate the data on the client side using the appropriate locale. Of course this goes for pre-installed data; if you let users enter the data, other issues will arise... There is no silver bullet; you need to think about what works best in your context. I would lean towards including a foreign key column that identifies the language, but it really depends on the kind of data that will be stored.
Unfortunately i18n doesn't end here. Please remember to format dates and numbers correctly so that they are understandable to the people using your program. Also, if you happen to have a list of strings, the sort order should depend on the locale as well (this is called collation).
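The formatting and collation points can be illustrated with the JDK's own classes, java.text.NumberFormat and java.text.Collator. The class and method names below (LocaleAwareSketch, sortForLocale, formatNumber) are made up for this sketch.

```java
import java.text.Collator;
import java.text.NumberFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class LocaleAwareSketch {
    /** Returns a copy of the list sorted by the collation rules of the given locale. */
    static List<String> sortForLocale(List<String> items, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(items);
        sorted.sort(collator);
        return sorted;
    }

    /** Formats a number with the locale's decimal and grouping separators. */
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    public static void main(String[] args) {
        // "\u00c4pfel" is "Apfel" with an umlaut on the A, escaped to keep this file ASCII-only.
        List<String> words = List.of("Zebra", "\u00c4pfel", "Banane");
        // A plain String sort would put the umlaut word last; the German collator puts it first.
        System.out.println(sortForLocale(words, Locale.GERMAN));
        System.out.println(formatNumber(1234.5, Locale.GERMANY)); // 1.234,5
        System.out.println(formatNumber(1234.5, Locale.US));      // 1,234.5
    }
}
```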
Sun (now our beloved Oracle) has quite a good i18n trail, which you can find here: http://download.oracle.com/javase/tutorial/i18n/index.html .
If you want to read a good book on the subject of i18n and L10n, one that will save you years of learning these topics (although it will not necessarily teach you how to program it), there is a great book from Microsoft Press: "Developing International Software" - http://www.amazon.com/Developing-International-Software-Dr/dp/0735615837 . It is still relevant, although quite old.

1) I usually keep everything in one file and use names that signify where the properties are used. For example, I prefix with things like "view" and "menu"
view.add_request.title
view.add_request.contact_information.sectionheader
view.add_request.contact_information.first_name.label
view.add_request.contact_information.last_name.label
menu.admin.user_management.add_user.label
menu.admin.user_management.add_role.label
2) Yes, passing around the key makes things simpler and makes the server code easier to test. It also avoids having to pass locale information to the server to have it decide on a language for the client. It's a thick client, so let it handle the localization.
3) I haven't localized data before (usually just labels and static UI verbiage), but I would probably lean towards having a single table with all the localized strings and locales to start with (just to keep it simple). I'm not sure what you're asking about in reference to the UI, but I would suggest you make sure that whatever character set you're using supports all the languages you want to handle. Make sure you read Joel Spolsky's article entitled: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
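The "send only the key plus the error-specific values" approach from the question, which both answers endorse, could look roughly like this. LocalizableException and ClientMessages are hypothetical names, and the key error.user.exists is just an example. The sketch assumes the argument values are serializable if the exception crosses the wire via Java serialization.

```java
import java.text.MessageFormat;
import java.util.Locale;
import java.util.ResourceBundle;

/** Thrown by the server: carries a resource key and raw values, never a finished sentence. */
class LocalizableException extends Exception {
    private final String key;
    private final Object[] args;

    LocalizableException(String key, Object... args) {
        super(key);
        this.key = key;
        this.args = args.clone();
    }

    String key() { return key; }

    Object[] args() { return args.clone(); }
}

/** Client side: looks up the pattern for the user's locale and fills in the placeholders. */
final class ClientMessages {
    static String render(LocalizableException e, ResourceBundle bundle, Locale locale) {
        String pattern = bundle.getString(e.key());
        return new MessageFormat(pattern, locale).format(e.args());
    }
}
```

With a properties entry such as error.user.exists=A user with username {0} already exists, the server would throw new LocalizableException("error.user.exists", "Khilon") and the client would render the sentence in whatever language its bundle provides.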

What localization changes are needed for Arabic with a Java applet?

How big a task is it to implement support for Arabic localization? Our Java 1.5 applet was designed to be fully localizable (for European languages), but now we plan to add Arabic as a new language.
We are using custom GUI text I/O components that inherit from the Component class and use e.g. drawString. How well is Arabic supported within the Component class?
The keyboard input is done with KeyListener, getKeyChar, getKeyCode, etc.

It depends on the quality of the original internationalization work. If everything is implemented correctly, then it will be similar to adding support for a new European language - most of the work will be translation and testing.
However, if you've only tested the software with European languages, you might find a lot of problems with your original internationalization work. In particular you might need to consider:
bi-directional text
ligatures (joining the characters)
rendering (characters change shape depending on their position in the word)
number and date formats
specialized input methods
cultural differences (for icons etc)
file encodings
testing
If you have custom code that implements software features in a way that isn't fully localizable then you need to budget for fixing this too.
If you have manuals, help text and other collateral that also needs to be translated, then the software cost might not be such a large proportion of the total budget.
Also, if you have plans to perform localization for any Far Eastern languages (Japanese, Chinese, Korean, ...) you might consider sharing the cost across those projects, since many of the issues will be similar.
One final point - maintaining the localization for future releases might cost substantially more than providing it in the first place.
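For the bi-directional text item in the list above, the JDK has shipped java.text.Bidi since 1.4, which custom text components can use to find out whether a string needs bidirectional handling at all. The sketch below only classifies text; actual reordering and shaping for drawing is a separate concern, and the class name BidiCheck is made up.

```java
import java.text.Bidi;

final class BidiCheck {
    /** Classifies text as "ltr", "rtl", or "mixed", taking the base direction from the text itself. */
    static String describe(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        if (bidi.isMixed()) {
            return "mixed";
        }
        return bidi.isRightToLeft() ? "rtl" : "ltr";
    }

    /** Cheap pre-check: true if the text contains characters that may need bidi analysis. */
    static boolean needsBidi(String text) {
        char[] chars = text.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    public static void main(String[] args) {
        String arabic = "\u0645\u0631\u062d\u0628\u0627"; // an Arabic word, escaped to keep this file ASCII-only
        System.out.println(describe("abc"));          // ltr
        System.out.println(describe(arabic));         // rtl
        System.out.println(describe("abc " + arabic)); // mixed
    }
}
```

Custom components that draw one character at a time or assume left-to-right caret movement are the usual places where such a check reveals work to be done.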
