In my Android app, I want to create a text field with several formatting options (strong text, emphasized text, code, links), like the editor on Stack Overflow.
I thought about this option:
In the Stack Overflow answer field, strong text is wrapped in "** **" and emphasized text in "* *".
So I conclude that every formatting option has its own special characters.
In the answer editor the special characters (**, *) are visible, so they are presumably stored in the Stack Overflow database as typed.
So the only possibility is that the text is interpreted after it is read from the database.
Is my guess right?
Can one of you give me a bit of Android code for such a text formatter, so that I know how to start?
Just iterate over the content of the text field, and when you find a construct like ** replace the text that follows with a string in the desired style.
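The iteration described above can be sketched in plain Java. This is only a rough starting point, not how Stack Overflow itself does it: the class name MarkupParser, the Segment type and the two-rule regex are made up for illustration, and nesting or escaped asterisks are not handled. On Android, each BOLD or ITALIC segment could then be appended to a SpannableStringBuilder with a StyleSpan, but that part is left out here so the sketch runs on any JVM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits text into plain, bold and italic segments.
class MarkupParser {
    enum Style { PLAIN, BOLD, ITALIC }

    static final class Segment {
        final String text;
        final Style style;
        Segment(String text, Style style) { this.text = text; this.style = style; }
        @Override public String toString() { return style + ":" + text; }
    }

    // "**...**" is listed first so a double asterisk is not read as two single ones.
    private static final Pattern MARKUP =
            Pattern.compile("\\*\\*(.+?)\\*\\*|\\*(.+?)\\*");

    static List<Segment> parse(String input) {
        List<Segment> out = new ArrayList<>();
        Matcher m = MARKUP.matcher(input);
        int last = 0;
        while (m.find()) {
            // Text between the previous match and this one is unformatted.
            if (m.start() > last) {
                out.add(new Segment(input.substring(last, m.start()), Style.PLAIN));
            }
            if (m.group(1) != null) {
                out.add(new Segment(m.group(1), Style.BOLD));
            } else {
                out.add(new Segment(m.group(2), Style.ITALIC));
            }
            last = m.end();
        }
        // Trailing unformatted text, if any.
        if (last < input.length()) {
            out.add(new Segment(input.substring(last), Style.PLAIN));
        }
        return out;
    }

    public static void main(String[] args) {
        for (Segment s : parse("This is **strong** and *emphasized* text.")) {
            System.out.println(s);
        }
    }
}
```

The raw text with its asterisks stays what you store; the parsing happens only when the text is displayed.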
Related
I'm trying to get data from a dictionary (this one: http://vk.com/doc8069473_312422685?hash=78fd2d459ed8547b29&dl=86147ab2323652f43d). I use PDFBox to extract the text from this PDF file.
To do that, I created a class "Article" to store each word, its type (adj, noun, etc.), all its definitions and all its examples.
I use regular expressions to find the beginning and the end of each article.
Here is the pattern I use (PHNTC is added by me to replace phonetic notations):
Pattern pattern = Pattern.compile("(((\\w|\\–|\\-|&|,|’|/|â|é|è|ê|à|ô| )*)(\\s)+(PHNTC( )+)?(abbr|adj|adv|article|conj|interj|modal verb|noun|plural noun|prefix|prep|pron|phrase|suffix|(?<!((forming|making part of) a ))verb|expr)(, (abbr|adj|adv|article|conj|interj|modal verb|noun|plural noun|prefix|prep|pron|phrase|suffix|(?<!((forming|making part of) a ))verb|expr)\\s)?[^a-z]|((\\w|\\–|\\-|&|,|’|/|â|é|è|ê|à|ô| )*)(\\s)+(PHNTC( )+))");
As you can see, it is quite complicated, and even though it is sufficient for 99% of the articles (I have about 100 "wrong" articles among 29,000), I still have some problems. For example, if "noun" is written somewhere in a definition, my program might think it is the beginning of a new article! You can see in the code above my attempts to resolve some ambiguities with "verb".
I think the only way to solve those problems would be to put markup around bold and italic text. I would like to use something like this:
Pattern pattern = Pattern.compile("<b>.*</b>(\\s)+(PHNTC( )+)?<i>.*</i>(, <i>.*</i>)?");
And now, here is my problem: how can I add this markup using PDFBox?
I found a question (How to extract bold text from pdf using pdfbox?) about extracting bold text by overriding the method processTextPosition(TextPosition text) from PDFTextStripper.
I tried it, but:
1) I failed to find bold text
2) I don't want to extract only bold text, I still want to extract everything!
Any ideas?
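Leaving aside how PDFBox would produce the markup, here is a minimal sketch of how the simplified pattern could behave once the text is marked up. It assumes the extraction step emits literal <b>...</b> and <i>...</i> tags; the class name, the sample strings and the use of [^<]+ instead of .* are my own choices, made so that a group cannot run past its own closing tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo: matches an article header in text that already carries <b>/<i> markup.
class ArticleHeaderDemo {
    // [^<]+ keeps each captured group inside its own tag pair.
    static final Pattern HEADER = Pattern.compile(
            "<b>([^<]+)</b>\\s+(?:PHNTC\\s+)?<i>([^<]+)</i>(?:, <i>([^<]+)</i>)?");

    // Returns "headword|type" for the first header found, or null if there is none.
    static String firstHeader(String markedUp) {
        Matcher m = HEADER.matcher(markedUp);
        if (!m.find()) {
            return null;
        }
        String type = (m.group(3) == null)
                ? m.group(2)
                : m.group(2) + ", " + m.group(3);
        return m.group(1) + "|" + type;
    }

    public static void main(String[] args) {
        // Made-up sample lines, not taken from the dictionary.
        System.out.println(firstHeader("<b>abandon</b> PHNTC <i>verb</i> to leave someone"));
        System.out.println(firstHeader("<b>above</b> PHNTC <i>adv</i>, <i>prep</i> higher than"));
        System.out.println(firstHeader("a noun is a word that names something"));
    }
}
```

With the markup in place, the word "noun" inside a definition no longer looks like a header, because it is neither italic nor preceded by a bold headword.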
I have a problem inserting Unicode characters into a PDF file (I'm working in Eclipse).
I found a solution for this, but it is not very practical for me.
The solution looks like this:
document.add(new Paragraph("Unicode: \u0418", new Font(bfComic, 12)));
I want to retrieve data from a database and show them to the user and my characters are in Arabic script and sometimes in Farsi script.
What solution do you suggest?
thanks
You are experiencing different problems:
Encoding of the data:
Please download chapter 2 of my book and go to section 2.2.2 entitled "The Phrase object: a List of Chunks with leading". In this section, look for the title "Database encoding versus the default CharSet used by the JVM".
You will see that database values are retrieved like this:
String name1 = new String(rs.getBytes("given_name"), "UTF-8");
That’s because the database contains different names with special characters. These special characters risk being displayed as gibberish if you retrieve the field like this:
String name2 = rs.getString("given_name");
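To see why the charset matters, here is a small stand-alone sketch with no database involved. The byte array stands in for what rs.getBytes(...) would return if the column holds UTF-8 data; the class and method names are made up, and ISO-8859-1 is just one example of a wrong charset.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: the same bytes decoded with the right and the wrong charset.
class EncodingDemo {
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u0644\u0648"; // Arabic lam followed by waw
        // Stand-in for the raw bytes a UTF-8 database column would deliver.
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        System.out.println("UTF-8 round trip intact: " + original.equals(decodeUtf8(raw)));
        System.out.println("ISO-8859-1 result intact: " + original.equals(decodeLatin1(raw)));
        System.out.println("chars after wrong decoding: " + decodeLatin1(raw).length());
    }
}
```

Decoded with the wrong charset, the two Arabic letters turn into four unrelated characters, which is the gibberish you want to avoid.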
Encoding of the font:
You create your font like this:
Font font = new Font(bfComic, 12);
You don't show how you create bfComic, but I assume that this object is a BaseFont object using IDENTITY_H as encoding.
Writing from right to left / making ligatures
Although your code will work to show a single character, it won't work to show a sentence correctly.
Suppose that name1 is the Arabic version of the name "Lawrence of Arabia" and that we want to write this name to a PDF. This is done three times in the following screen shot:
The first line is wrong, because the characters are in the wrong order. They are written from left to right whereas they should be written from right to left. This is what will happen when you do:
document.add(new Paragraph(name1, font));
Even if the encoding is correct, you're rendering the text incorrectly.
The second line is also wrong. The characters are now in the correct order, but no ligatures are made: ل followed by و should be combined into a single glyph: لو
You can only achieve this by adding the content to a ColumnText or PdfPCell object, and by setting the run direction to PdfWriter.RUN_DIRECTION_RTL. For instance:
pdfCell.setRunDirection(PdfWriter.RUN_DIRECTION_RTL);
Now the text will be rendered correctly.
This is explained in chapter 11 of my book. You can find a full example here: Ligatures2
Doing linguistics and phonetics, I often need to use certain special phonetic symbols. Although I'm using a special keyboard layout that lets me type some of those characters, the key combinations can get both quite complex and highly repetitive. So I would like to create a little app containing some buttons, each of them capable of sending a specified (phonetic) symbol to the current cursor position, no matter which window on the screen is in focus.
Is anything of this sort possible to do in Java?
I've seen a solution that copies the values into the clipboard and then pastes them (Java paste to current cursor position), but that is not a very clean way to do it, is it? Is there a better way than just pasting the character(s) via Ctrl+V?
Many thanks for any help or advice in advance!
P.
You can use the AWT Robot to generate key press events. This will not provide the ability to insert arbitrary Unicode characters, but you can combine it with the technique you already described: transfer the Unicode characters to the clipboard and generate a Ctrl+V key event afterwards. You can try to save and restore the original clipboard content, but this will work only with types supported by Java.
The focus problem mentioned in the comments can be solved by setting the window to not receive the focus via Window.setFocusableWindowState with an argument of false.
An alternative is to provide the Unicode text via drag and drop. Most applications support dropping text into their input fields. The code for exporting the text is very similar, as both clipboard and drag and drop use the same interfaces in Java.
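A minimal sketch of the clipboard-plus-Robot approach, under a few assumptions: the target application pastes on Ctrl+V (on macOS the paste shortcut uses a different modifier), a display is available, and the original clipboard content is not restored. The class and method names are made up. The StringSelection used here is a Transferable, which is the same kind of object a drag source would export.

```java
import java.awt.AWTException;
import java.awt.Robot;
import java.awt.Toolkit;
import java.awt.datatransfer.Clipboard;
import java.awt.datatransfer.DataFlavor;
import java.awt.datatransfer.StringSelection;
import java.awt.datatransfer.Transferable;
import java.awt.event.KeyEvent;

// Hypothetical helper: puts a symbol on the clipboard and simulates Ctrl+V.
class SymbolPaster {
    // Wraps the text in a Transferable, usable for clipboard and drag-and-drop export.
    static Transferable asTransferable(String symbol) {
        return new StringSelection(symbol);
    }

    // Reads the text back out of a Transferable (handy for checking the payload).
    static String readBack(Transferable t) {
        try {
            return (String) t.getTransferData(DataFlavor.stringFlavor);
        } catch (Exception e) {
            throw new IllegalStateException("Transferable does not carry text", e);
        }
    }

    // Needs a real display; throws in a headless environment.
    static void pasteIntoFocusedWindow(String symbol) throws AWTException {
        StringSelection selection = new StringSelection(symbol);
        Clipboard clipboard = Toolkit.getDefaultToolkit().getSystemClipboard();
        clipboard.setContents(selection, selection);

        Robot robot = new Robot();
        robot.keyPress(KeyEvent.VK_CONTROL);
        robot.keyPress(KeyEvent.VK_V);
        robot.keyRelease(KeyEvent.VK_V);
        robot.keyRelease(KeyEvent.VK_CONTROL);
    }

    public static void main(String[] args) throws Exception {
        Thread.sleep(3000); // time to click into the target window
        pasteIntoFocusedWindow("\u0259"); // schwa
    }
}
```

In a real app the call would sit in a button's action listener, in a window that has setFocusableWindowState(false) so the target window keeps the focus.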
UPDATED
I have found a similar question:
here: same question by another user
That one has few details, but I still can't get it to work!
Any help would be gladly accepted!
I want to type Sinhalese words in (J2SE) Swing text fields, but they don't appear correctly in Java; the same text in Notepad shows the correct word. How can I fix this?
notepad picture:
http://imageupper.com/i/?S0200010080011O13734602521426968
java picture :
http://imageupper.com/i/?A0300010070011I13734604591427932
The same letters are in both images, so it's not a problem of encoding.
The problem is that you have to set a proper font on the text field. You can create the font if you don't have it; check "setting custom font"
(@AndrewThompson's answer) :)
Edit: That other user with the same question that you have mentioned is me. :)
As I found out, the char data type is not enough to render letters like "ශ්ර" because it needs 3 8-bit characters. And the Java language isn't going to change the size of the char data type just because we Sri Lankans want to render our characters. I had a similar question previously, and I am the user you mentioned in your question.
"ශ් ර" is shown with 2 8-bit characters for each character, and that's all we've got so far. You might want to check out SWT, because it shows these characters pretty well in my experience.
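As a side check on how such a conjunct is stored, the sketch below counts its UTF-16 code units, code points and UTF-8 bytes. It assumes the usual encoding of the conjunct as sha (U+0DC1), al-lakuna (U+0DCA), zero width joiner (U+200D) and rayanna (U+0DBB); the text typed in the question may lack the joiner. The class name is made up.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: how a Sinhala conjunct is represented in a Java String.
class SinhalaConjunct {
    // Assumed sequence: sha + al-lakuna + zero width joiner + rayanna.
    static final String SHRA = "\u0DC1\u0DCA\u200D\u0DBB";

    static int codeUnits(String s) {
        return s.length(); // number of 16-bit chars
    }

    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("UTF-16 code units: " + codeUnits(SHRA));
        System.out.println("code points:       " + codePoints(SHRA));
        System.out.println("UTF-8 bytes:       " + utf8Bytes(SHRA));
    }
}
```

Each of these code points fits in a single 16-bit char, so a String can hold the sequence without loss; whether the sequence is drawn as one combined glyph is up to the font and the text layout code of the toolkit, which would be consistent with SWT and Swing giving different results.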
This is similar to my own previous question, but that solution didn't work here.
As mentioned in the previous question, I'm working on a cross-platform (Windows/Ubuntu) application that has to transliterate English into one of several official Indian languages. The application has a custom input method; typing in English and pressing space transliterates the typed text into the specific local language. Urdu differs from the others in being written right to left, like Arabic and Hebrew.
I managed to find an open licensed Urdu font that has both English and Urdu glyphs, but when I type characters in English, nothing shows up.
I don't understand whether it's a font painting issue, or related to the input method. So far, if I disable the custom input method (InputMethod.dispatchEvent() ) for this language, I am able to see the English text (but of course no transliteration takes place).
My findings:
Change font to one of Windows' built in Arabic fonts - same result.
Instead of using ComponentOrientation to align text in the text field, I used setHorizontalAlignment for when the locale is Urdu. Same result.
Decompiled the JDK's default input method provider on Windows (sun.awt.windows.WInputMethod). Here I see the dispatchEvent() makes a native call to the OS for handling IME. I can't do that here.
Found a custom IM for Hebrew - my version of dispatchEvent() is essentially the same.
Stepped through code for JTextField in Eclipse - wasn't able to find anything in the AbstractDocument and subclasses. The AbstractDocument.insertUpdate() method checks for and updates bidirectional text input, but there wasn't anything else significant.
I'm unable to understand what happens after the dispatchEvent() call. The characters are being registered, i.e. the transliteration engine is able to detect the typed characters and process them, but they just don't show up on screen.
Workaround
If I let the text field's orientation be as it is for regular left to right languages, I can see the English text. However, this would not be acceptable to an Urdu speaking user.
Can someone point me in the right direction?
I set the locale to ur_IN.
Sadly, ur_IN is not among the supported locales; I only see en_IN and hi_IN. In the example cited, I used the following code to get the image below:
spinner.setLocale(new Locale("hi", "IN"));
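To check which of these locales a given JRE reports as available, something like the following can be used. The class and method names are made up, and the result for ur_IN depends on the Java version and its locale data, so no output is shown.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper: checks whether the running JRE lists a locale as available.
class LocaleCheck {
    static boolean isAvailable(Locale wanted) {
        return Arrays.asList(Locale.getAvailableLocales()).contains(wanted);
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"ur-IN", "hi-IN", "en-IN"}) {
            Locale locale = Locale.forLanguageTag(tag);
            System.out.println(locale + " available: " + isAvailable(locale));
        }
    }
}
```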