Why am I getting extra symbols when replacing Arial Narrow? - java

I'm using iText in my Java application to replace a certain string in a PDF document. I noticed extra symbols appear when the original font used in the PDF doc was Arial Narrow.
As you can see in the example below, the string ##[PORTFOLIO-123456789] was replaced with the string vlopezbo. It worked just fine when the font was Arial or Futura. However, in the case of Arial Narrow, extra square-like symbols were added on the right after the replacement. I'm trying to figure out why.
Please notice square symbols on the right

Related

getting square emoticons instead of actual emoticons

I am working on a Java chat application and I am adding emoticons by replacing the emoticon shortcut, like :) , with ◕‿◕ . It's not an image that I am replacing it with, but simple text. The problem I am facing is that sometimes I get just square boxes instead of the text I want. I am producing these characters in MS Word by converting the Unicode code to the actual character. I am also using various online resources to get these characters.
Can anyone tell me how to get rid of the boxes and get the actual text?
My encoding is in UTF-8 and my font is also set to monospaced.
Your Unicode character is probably not supported by your font. Either the font implements the character as a box, or the operating system / font renderer draws a box instead of the glyph.
I would say the font used in your application just cannot show some characters. Find a font that can and use it there.
Font has a boolean canDisplay(char c) method which you can use to check.
See also the documentation for Font.
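As a minimal sketch of that check: the helper below walks a list of candidate fonts and returns the first one that can render the whole string, using Font.canDisplayUpTo. The class name FontPicker and the idea of passing a candidate list are assumptions made for illustration; they are not part of the original answer.

```java
import java.awt.Font;
import java.util.List;

final class FontPicker {
    /**
     * Returns the first candidate that can render every character of text,
     * or null if none of the candidates can.
     */
    static Font firstFontFor(String text, List<Font> candidates) {
        for (Font candidate : candidates) {
            // canDisplayUpTo returns -1 when the whole string is displayable,
            // otherwise the index of the first character the font lacks.
            if (candidate.canDisplayUpTo(text) == -1) {
                return candidate;
            }
        }
        return null;
    }
}
```

In the chat application, the candidates could be the current monospaced font followed by a few fonts with broader symbol coverage. A null result means none of them has glyphs for the emoticon text, which is when the boxes appear.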

how to create persian content in pdf using eclipse

I have a problem with inserting UNICODE characters in a PDF file in eclipse.
I found a solution for this, but it is not practical for me.
The solution looks like this:
document.add(new Paragraph("Unicode: \u0418", new Font(bfComic, 12)));
I want to retrieve data from a database and show them to the user and my characters are in Arabic script and sometimes in Farsi script.
What solution do you suggest?
thanks
You are experiencing different problems:
Encoding of the data:
Please download chapter 2 of my book and go to section 2.2.2 entitled "The Phrase object: a List of Chunks with leading". In this section, look for the title "Database encoding versus the default CharSet used by the JVM".
You will see that database values are retrieved like this:
String name1 = new String(rs.getBytes("given_name"), "UTF-8");
That’s because the database contains different names with special characters. You risk that these special characters are displayed as gibberish if you would retrieve the field like this:
String name2 = rs.getString("given_name");
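To make the risk concrete, here is a small sketch that decodes the same bytes with two different charsets. No database is involved; the byte array stands in for what rs.getBytes would return, and the class name DecodeDemo is hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class DecodeDemo {
    /** Decodes raw bytes as UTF-8, matching the explicit-charset approach above. */
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** Decodes the same bytes as ISO-8859-1, simulating a mismatched default charset. */
    static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }
}
```

For UTF-8 bytes of Arabic text, decodeUtf8 returns the original string, while decodeLatin1 returns one Latin-1 character per byte, which is the gibberish described above.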
Encoding of the font:
You create your font like this:
Font font = new Font(bfComic, 12);
You don't show how you create bfComic, but I assume that this object is a BaseFont object using IDENTITY_H as encoding.
Writing from right to left / making ligatures
Although your code will work to show a single character, it won't work to show a sentence correctly.
Suppose that name1 is the Arabic version of the name "Lawrence of Arabia" and that we want to write this name to a PDF. This is done three times in the following screen shot:
The first line is wrong, because the characters are in the wrong order. They are written from left to right whereas they should be written from right to left. This is what will happen when you do:
document.add(new Paragraph(name1, font));
Even if the encoding is correct, you're rendering the text incorrectly.
The second line is also wrong. The characters are now in the correct order, but no ligatures are made: ل followed by و should be combined into a single glyph: لو
You can only achieve this by adding the content to a ColumnText or PdfPCell object, and by setting the run direction to PdfWriter.RUN_DIRECTION_RTL. For instance:
pdfCell.setRunDirection(PdfWriter.RUN_DIRECTION_RTL);
Now the text will be rendered correctly.
This is explained in chapter 11 of my book. You can find a full example here: Ligatures2

How to use Character.UnicodeBlock while I set some strings in a JEditorpane

I need to display some Bengali characters. I've tried to set the font to a Bengali Unicode font, but it does not work properly. My last hope for finishing my project is to use Character.UnicodeBlock, but I have no idea how it works. Is it really possible to get the actual display of any Unicode character in Java? How can I use Character.UnicodeBlock in a component?
I assume Java 7.
First one needs a Unicode font.
If this font goes into the Windows fonts, and there is no name clash with an already existing font, everything should work.
Otherwise one might store the font as resource file in the application:
InputStream fontIn = getClass().getResourceAsStream("/.../... .ttf");
// createFont reads from the stream opened above; it throws FontFormatException and IOException
Font font = Font.createFont(Font.TRUETYPE_FONT, fontIn);
GraphicsEnvironment.getLocalGraphicsEnvironment().registerFont(font);
After this jEditorPane.setFont(font) should work. Note that createFont returns a font of size 1, so derive a usable size first, for example font.deriveFont(14f). Mind that the text in the JEditorPane should not be HTML, because HTML content can set its own fonts.
This is tricky because fonts are looked up by name, and font substitution can silently pick a different font.
Another problem might be hard-coded strings in the Java source: the encoding of the Java source (as saved by the editor) must be the same as the one used by the javac compiler. For international projects it is best to use UTF-8 for both (javac -encoding UTF-8 ...). To test whether there is a problem with that, one can test with "\u099C" for জ.
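On the Character.UnicodeBlock part of the question: that class only classifies code points, it does not influence how a component renders them. A small sketch of what it can be used for, with the hypothetical helper name allInBlock:

```java
final class BlockCheck {
    /** Returns true if every code point of text belongs to the given Unicode block. */
    static boolean allInBlock(String text, Character.UnicodeBlock block) {
        return text.codePoints()
                   .allMatch(cp -> Character.UnicodeBlock.of(cp) == block);
    }
}
```

Calling BlockCheck.allInBlock("\u099C", Character.UnicodeBlock.BENGALI) confirms that the string really contains Bengali characters. If that check passes and the component still shows boxes, the remaining suspects are the font and the source encoding discussed above.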

Printing Unicode characters in java

I'm developing a Sinhala-English Unicode translator in Java. When I print a Unicode character in a JTextPane, it only shows a blank box. But when I copy that box into Notepad on Windows, it shows me the letter.
The problem is that Java does not show the Unicode characters, whereas Windows does.
How can I fix this problem?
It's likely that the font you are using in your JTextPane does not fully support the Unicode range that you are trying to display. Try setting the text area's font to something more Unicode-friendly (see the row labeled "Sinhala (80: 0D80–0DFF)").

how to manage formatting of text when read a save file?

I have a Java applet application in which I use a rich text area. I write Urdu, the national language of Pakistan, and I managed to do so with Unicode. The problem is this: when I write Urdu in the text area and select a font and color for each line, that works, but when I save the file using UTF-8 encoding and then open it again, all the text is shown in the format I chose for the last line.
My requirement is to open the file as it was saved; each file should have the same formatting I applied before saving.
I'm still suffering from this problem even after the bounty. Can anyone help? Dated 07-06-2010.
When you format text with a font and color, that formatting is represented as RTF/HTML markup. You should get the RTF/HTML of the text area so that all your formatting can be saved in the file.
A plain text file holds only the characters, so you need to save the text together with its markup.
Check this link for RTF formatted text saving mechanism.
Java JTextPane RTF Save
Also check HTMLEditorKit for more info.
http://java.sun.com/j2se/1.4.2/docs/api/javax/swing/text/html/HTMLEditorKit.html
thanks.
UTF-8 is an encoding that maps Unicode code points to byte sequences. For convenience, a decision was made that the first 128 codes (0 to 127) are the same in ASCII and UTF-8. All other characters are encoded as multi-byte sequences.
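A quick way to see how UTF-8 treats ASCII and non-ASCII characters is to count the encoded bytes. This is only an illustrative sketch; the class name Utf8Length is made up.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Length {
    /** Returns the number of bytes the string occupies when encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

An ASCII letter such as "A" takes one byte, while the Arabic-script letter "\u0627" used in Urdu takes two. Either way the file contains only characters, with no trace of fonts or colors.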
Fonts have a character map (cmap) which assigns Unicode codes to their glyphs. There are very few fonts that cover large portions of the Unicode range (Arial Unicode and Gentium are the ones I know; there are some others), and to get full coverage in a rendering solution, you have to mix fonts.
To be able to display arbitrary Unicode text, you therefore have to create a set of fonts, with one as the default font and fallback fonts for the Unicode characters that are not contained in the default font. Back to Java and your text pane: if you select a font for a given part of the text in your text pane, this only means that glyphs from the selected font are used to render that text. The text itself is not associated with the font in any way.
So you have two options:
You don't just store the UTF-8 text, but also information about the selected font, or, more interesting:
You store the text simply as UTF-8 and apply fonts after loading the text into your text pane!
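For the first option, a rough sketch of how the font information could be collected from a Swing StyledDocument: it walks the character runs and records which font family applies to which offsets. The "start-end:family" line format and the class name StyleRuns are invented for this example; how the runs are written next to the UTF-8 text is left open.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.Element;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

final class StyleRuns {
    /**
     * Describes each run of identically styled text as "start-end:fontFamily".
     * Offsets are document offsets, with end exclusive.
     */
    static List<String> describeRuns(StyledDocument doc) {
        List<String> runs = new ArrayList<>();
        int pos = 0;
        while (pos < doc.getLength()) {
            // The character element covers one run of uniform attributes.
            Element run = doc.getCharacterElement(pos);
            // Clamp to the document length to skip the implied trailing newline.
            int end = Math.min(run.getEndOffset(), doc.getLength());
            String family = StyleConstants.getFontFamily(run.getAttributes());
            runs.add(pos + "-" + end + ":" + family);
            pos = end;
        }
        return runs;
    }
}
```

Color could be recorded the same way with StyleConstants.getForeground. After loading, each stored run can be re-applied with StyledDocument.setCharacterAttributes.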
