How can I put Tolkien's Elvish in Java - java

I saw that Tengwar, an Elvish script created by Tolkien, has some Unicode values. How could I get them into a JavaFX window or console output? I just get a question mark in the console and a box in the JavaFX window.
I'm just trying to say hello in Elvish. The character I tried to use is Hyarmen. The website says that I'm supposed to use
"\u1xy28"
but that didn't work, so I tried
"\uE020"
which at least didn't give me an error. Can anyone help?

First, "\u1xy28" is incorrect because a \u prefix must be followed by exactly four hexadecimal digits, and x and y are not hexadecimal digits. They are placeholders in the table you read, not part of a real code point.
As for the rest, the Unicode Consortium's chapter on Special Areas and Format Characters says:
23.5 Private-Use Characters
Private-use characters are assigned Unicode code points whose interpretation is not specified
by this standard and whose use may be determined by private agreement among cooperating
users. These characters are designated for private use and do not have defined,
interpretable semantics except by private agreement.
...
The primary Private Use Area consists of code points in the range U+E000 to U+F8FF, for
a total of 6,400 private-use characters.
That means that you can decide to use those code points for Tengwar characters, but you cannot hope that a standard font will know how to display them (because they are reserved for private use), and you will have to find a dedicated font.
Near the end of the Wikipedia page you linked, you can find a link to a comprehensive list of Tengwar fonts.
I must say that I have not tested any of them...
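To make the private-use point concrete, here is a minimal sketch. The class and method names are hypothetical, and U+E020 is simply the value tried in the question, not an official Tengwar assignment. It checks that a code point falls in the private-use category and builds a string from it. Whether that string renders as anything other than a box still depends on loading a font that maps that code point to a Tengwar glyph.

```java
final class PrivateUseCheck {
    // True when the code point has the Unicode general category Co (private use).
    static boolean isPrivateUse(int codePoint) {
        return Character.getType(codePoint) == Character.PRIVATE_USE;
    }

    // Builds a String from any code point, including ones above U+FFFF.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        int candidate = 0xE020; // assumption: the private-use value from the question
        System.out.println("private use: " + isPrivateUse(candidate));
        System.out.println("chars in string: " + fromCodePoint(candidate).length());
    }
}
```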

Related

How to use wingdings in java

How can I display Wingdings 2 symbols in Java? I tried googling but couldn't find much help.
http://www.alanwood.net/demos/wingdings-2.html
Please refer to the above link. I would like to display the white diamond symbol.
Wingdings is a font-based trick
The old Wingdings was a hack, a font that used alternate designs as the glyph. For example, where a NUMBER SIGN (pound sign, hash mark) was expected when using character position 35, a fountain pen image appears instead.
For this hack to succeed, (a) the user must have the desired Wingdings font installed, and (b) the text being displayed must use that font specifically.
Unicode
Nowadays, it likely makes sense to use emoji or other image characters defined among the 143,859 characters in Unicode. Each of those characters is assigned a number, called a code point, from a range of about 1.1 million values (U+0000 to U+10FFFF).
Perhaps this character would work for you: ◇ Unicode Character 'WHITE DIAMOND' (U+25C7) at decimal code point 9,671.
System.out.println( "◇ = WHITE DIAMOND" ) ;
Your user needs a font, any font, that provides a glyph for that particular character. Modern OSes are skilled at automatically finding and using a secondary font with such a glyph if a displayed text block’s primary font is lacking. Understand that no single font provides a glyph for each and every character in Unicode.
There are likely other diamond-related characters too that a search might expose.
As a Java programmer, you can simply paste your desired character into your source code. Be sure to use UTF-8 as the character encoding of your file.
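If you are unsure about the source file's encoding or the console's default charset, the following sketch may help. The class name is hypothetical. It writes the character as a Unicode escape, so the file encoding no longer matters, and prints through a PrintStream that is explicitly set to UTF-8. The terminal itself must still be configured for UTF-8 and have a font with the glyph.

```java
import java.io.PrintStream;

final class WhiteDiamond {
    // U+25C7 WHITE DIAMOND, written as an escape so the source file encoding does not matter.
    static final String WHITE_DIAMOND = "\u25C7";

    static String label() {
        return WHITE_DIAMOND + " = WHITE DIAMOND";
    }

    public static void main(String[] args) throws Exception {
        // Wrap System.out so the bytes written are UTF-8 regardless of the platform default.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(label());
    }
}
```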

Which subset of Unicode symbols should I use to mark special substrings in text?

Our application sends strings that are then localized on the client side. Sometimes the whole string is localized and sometimes only a substring, so we have to mark the substrings. It would be best if the marking used only Unicode characters, as that wouldn't require any protocol changes.
Example:
"Length: (mark)10(mark)"
where 10 is length in cm but it should be converted so it is displayed as inches or mm.
Are the Unicode special characters (0xFFF0-0xFFFF) the right choice for marking such special substrings in text?
No, code points in the Specials block have their own uses. Using them for other purposes may result in unexpected effects. Even if you code all the processing yourself, the incoming data might contain those code points. It is of course possible to detect them and filter them out, but it is better to use code points that cannot clash with any assigned code points.
Use code points in the range U+FDD0..U+FDEF. They are designated as “noncharacters” and intended for use inside an application. See the Unicode FAQ section Private-Use Characters, Noncharacters & Sentinels FAQ.
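A minimal sketch of that approach follows. The choice of U+FDD0 as the opening marker and U+FDD1 as the closing marker is an assumption, as are the class name, the method names, and the centimetre-to-inch conversion; any pair from the noncharacter range would serve.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MarkedSubstrings {
    // Assumption: U+FDD0 opens and U+FDD1 closes a marked span.
    static final char OPEN = '\uFDD0';
    static final char CLOSE = '\uFDD1';

    // Matches OPEN, then any run of non-marker characters, then CLOSE.
    private static final Pattern MARKED =
            Pattern.compile(OPEN + "([^" + OPEN + CLOSE + "]*)" + CLOSE);

    // Server side: wrap the value that the client should localize.
    static String mark(String value) {
        return OPEN + value + CLOSE;
    }

    // Client side: replace each marked centimetre value with inches (hypothetical conversion).
    static String localizeToInches(String message) {
        Matcher m = MARKED.matcher(message);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            double cm = Double.parseDouble(m.group(1));
            String inches = String.format(Locale.ROOT, "%.2f in", cm / 2.54);
            m.appendReplacement(sb, Matcher.quoteReplacement(inches));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(localizeToInches("Length: " + mark("10")));
    }
}
```

Because noncharacters should never appear in interchanged text, input from outside the application can be rejected or stripped if it contains them before the markers are added.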

Print Arabic (RTL) and English (LTR) in the correct directions at the same time

I want to output "Arabic" and "English" text at the same time in Java for example, outputting the following statement: مرحبا I am Adham.
I searched the internet and found that the BiDi algorithm is needed in this case. Are there any Java classes for BiDi?
I have tried this class BiDiReferenceJava and tested it, but when I call runSample() in the class BidiReferenceTest and pass an Arabic string as the parameter, I get an OutOfIndexException because the character count is doubled (exactly at this line of code in the class BidiReferenceTestCharmap)
byte[] result = new byte[count];
Where if the string length is 4 the count is 8!
ICU4J is more or less the standard comprehensive Unicode library for Java, and thus supports the bidirectional algorithm. I really wonder why you need this, though; BiDi is usually applied by the display layer, unless you're writing a word processor or something similar.
BidiReference.java is apparently a demonstration piece; it's designed to show how the algorithm works on ASCII characters instead of using actual Unicode characters.
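For a quick look at what the algorithm computes, the JDK's own java.text.Bidi class can be used instead of ICU4J. Note that this swaps in a different class from the one recommended above. The sketch below uses hypothetical class and method names and assumes a left-to-right paragraph direction. It only analyses embedding levels; displaying the text is still the job of the rendering layer.

```java
import java.text.Bidi;

final class BidiLevels {
    // The Arabic greeting from the question followed by English,
    // written with escapes to avoid source-encoding issues.
    static final String MIXED = "\u0645\u0631\u062D\u0628\u0627 I am Adham.";

    // Assumption: the paragraph's base direction is left-to-right.
    static Bidi analyze(String text) {
        return new Bidi(text, Bidi.DIRECTION_LEFT_TO_RIGHT);
    }

    public static void main(String[] args) {
        Bidi bidi = analyze(MIXED);
        System.out.println("mixed directions: " + bidi.isMixed());
        // Each run is a stretch of text with a single embedding level;
        // odd levels are right-to-left, even levels are left-to-right.
        for (int i = 0; i < bidi.getRunCount(); i++) {
            System.out.println("run " + bidi.getRunStart(i) + ".." + bidi.getRunLimit(i)
                    + " level " + bidi.getRunLevel(i));
        }
    }
}
```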

Java regex to distinguish special characters while allowing non english chars

I am trying to do the above. One option is to build a set of special characters and then handle them with some Java logic. But then I have to make sure I include all special characters.
Is there any better way of doing this ?
You need to decide what constitutes a special character. One method that may be of interest is Character.getType(char) which returns an int which will match one of the constant values of Character such as Character.LOWERCASE_LETTER or Character.CURRENCY_SYMBOL. This lets you determine the general category of a character, and then you need to decide which categories count as 'special' characters and which you will accept as part of text.
Note that Java uses UTF-16 to encode its char and String values, and consequently you may need to deal with supplementary characters (see the link in the description of the getType method). This is a nuisance, but the Character class does offer methods which help you detect this situation and work around it. See the Character.isSupplementaryCodePoint(int) and Character.codePointAt(char[], int) methods.
Also be aware that Java 6 is far less knowledgeable about Unicode than is Java 7. The newest version of Java has added far more to its Unicode database, but code running on Java 6 will not recognise some (actually quite a few) exotic codepoints as being part of a Unicode block or general category, so you need to bear this in mind when writing your code.
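A sketch of the category-based approach follows. The definition of "special" used here (anything that is not a letter, combining mark, decimal digit, or space) is an assumption made for illustration, and the class and method names are hypothetical. It iterates by code point so that supplementary characters are handled correctly.

```java
final class SpecialChars {
    // Assumption: "special" means anything that is not a letter, mark, decimal digit, or space.
    static boolean isSpecial(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:
            case Character.LOWERCASE_LETTER:
            case Character.TITLECASE_LETTER:
            case Character.MODIFIER_LETTER:
            case Character.OTHER_LETTER:
            case Character.NON_SPACING_MARK:
            case Character.COMBINING_SPACING_MARK:
            case Character.ENCLOSING_MARK:
            case Character.DECIMAL_DIGIT_NUMBER:
            case Character.SPACE_SEPARATOR:
                return false;
            default:
                return true;
        }
    }

    static boolean containsSpecial(String text) {
        // Step by code point, not by char, so surrogate pairs are read as one character.
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isSpecial(cp)) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsSpecial("h\u00E9llo w\u00F6rld"));
        System.out.println(containsSpecial("price: $5"));
    }
}
```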
It sounds like you would like to remove all control characters from a Unicode string. You can accomplish this by using a Unicode character category identifier in a regex. The category "Cc" contains those characters, see http://www.fileformat.info/info/unicode/category/Cc/list.htm.
myString = myString.replaceAll("\\p{Cc}+", "");
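The same category syntax can also express "letters in any language". The sketch below uses hypothetical names, and the set of allowed categories is an assumption. It strips control characters and then checks that only letters, combining marks, decimal digits, and spaces remain. Note that the backslash must be doubled inside a Java string literal.

```java
final class CategoryRegex {
    // Removes every character in the general category Cc (control characters).
    static String stripControls(String s) {
        return s.replaceAll("\\p{Cc}+", "");
    }

    // Assumption: letters (L), marks (M), decimal digits (Nd) and plain spaces are the allowed set.
    static boolean hasOnlyTextCharacters(String s) {
        return s.matches("[\\p{L}\\p{M}\\p{Nd} ]*");
    }

    public static void main(String[] args) {
        String cleaned = stripControls("caf\u00E9\t\u0000 42");
        System.out.println(cleaned.length());
        System.out.println(hasOnlyTextCharacters(cleaned));
    }
}
```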

How to display X-Bar statistics symbol in Java Swing label?

What's the best way to insert statistics symbols in a JLabel's text? For example, the x-bar? I tried assigning the text field the following with no success:
<html>x̄
HTML character codes will not work in plain Java strings. However, you can use Unicode escapes in Java strings.
For example:
JLabel label = new JLabel("x\u0304"); // x followed by the combining macron U+0304
Also, here is a cool website for taking Unicode text and turning it into Java String literals.
Well, that's completely malformed HTML, probably even for Swing (I think you would need the </html> at the end for it to work). But I would try to avoid that road if you can help it, as Swing's HTML support has many drawbacks and bugs.
You can probably just insert the appropriate character, either directly in the source code if your source file uses a Unicode encoding, or with the appropriate Unicode escape:
"x\u0304"
This should work, actually. But it depends on font support and some fonts are pretty bad in positioning combining characters. But short of drawing it yourself it should be your best option.
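A minimal Swing sketch of this approach is shown below. The class name and window title are hypothetical. It also asks the label's font whether it can display every character in the text, which covers the font-support caveat but says nothing about how well the font positions the combining mark.

```java
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.SwingUtilities;

final class XBarLabel {
    // Base letter x followed by U+0304 COMBINING MACRON.
    static final String X_BAR = "x\u0304";

    static JLabel createLabel() {
        JLabel label = new JLabel(X_BAR);
        // canDisplayUpTo returns -1 when the font has a glyph for every character.
        if (label.getFont().canDisplayUpTo(X_BAR) != -1) {
            System.err.println("Font " + label.getFont().getFontName()
                    + " cannot display the whole text");
        }
        return label;
    }

    public static void main(String[] args) {
        // Build the UI on the event dispatch thread, as Swing requires.
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("x-bar demo");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.add(createLabel());
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```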
You can obtain x̄ by appending \u0304 to the x character.
In your case, you must generate the following string:
"x\u0304"
The character \u0304 on its own is only an overline (macron). It is a combining diacritic in the Unicode table. You can place it after an ordinary character to obtain a composite character.
You can find more information on combining diacritics at https://en.wikipedia.org/wiki/Combining_character
You can also use
\u0305 to have a longer bar
\u0307 to have a dot above the previous character (speed in mechanics)
\u0308 to have two dots above the previous character (acceleration in mechanics)
\u030A to have a ring above the previous character
Example
"x\u0304" -> x̄
"x\u0305" -> x̅
"t\u0307" -> ṫ
"u\u0308" -> ü
"A\u030A" -> Å
In the example, only the x with the long bar is not displayed correctly in HTML, because the Chrome browser has a bug (I think).
If you copy and paste the characters into MS Word, you will see all of them displayed correctly.
To type a new overlined character in MS Word, type the character and immediately afterwards press Alt and type 0773, where 773 is the base-10 value of the combining overline character (U+0305).
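Some combining sequences have a precomposed equivalent in Unicode and some do not, which partly explains why fonts and applications render them differently. The sketch below uses hypothetical class and method names and the JDK's java.text.Normalizer to show the difference: A followed by a combining ring composes into the single character Å, while x followed by a combining macron has no precomposed form and stays as two characters.

```java
import java.text.Normalizer;

final class CombiningDemo {
    // Applies canonical composition (NFC): combining sequences collapse
    // into a single precomposed character when Unicode defines one.
    static String compose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String[] samples = { "x\u0304", "x\u0305", "t\u0307", "u\u0308", "A\u030A" };
        for (String sample : samples) {
            String composed = compose(sample);
            System.out.println(sample + " : " + sample.length()
                    + " chars before, " + composed.length() + " after NFC");
        }
    }
}
```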
