PDF/iText : replace font defs

I'm using iText (Java lib) to process an already created PDF file.
What I would like to achieve is to replace fonts that are metric-compatible with a PDF base font with that PDF base font. This would make the PDF more "compliant" and potentially also smaller.
Here's how it would go:
1) Loop through the fonts used in the PDF.
2) If a font is metric-compatible with a PDF base font, replace the font name with that base font (but keep the PDF resource name, e.g. /F13, so that we do not need to touch any text objects). Since iText embeds the AFM files for the PDF base fonts in its jar, I'm assuming that iText actually has enough knowledge to make this assessment. I would probably have to look at the serif/sans-serif and monospace (fixed-pitch) flags as well to know whether I should swap in Helvetica, Times or Courier.
3) Further, if metric-compatible: remove any font embeds for that font. (Since we've replaced it with a PDF base font, there's no need to embed anything. Size matters!)
An example:
An existing PDF file uses "Calibri", "Arial" and "Times". Here's how each of those should be handled.
Calibri. This font doesn't have a metric-compatible cousin among the PDF base fonts so processing for this font resource will be skipped.
Arial. This font has a metric-compatible cousin among the PDF base fonts, namely "Helvetica". The name of the font resource (attribute BaseFont I suppose) will be changed to "Helvetica" and any potential embeds will be removed.
Times. This font is already a PDF base font. Skip processing. (we may consider unembedding here if there's something to unembed, but I already know how to do that so not part of the question)
I basically get stuck on the step which is to determine metric-compatibility. Any help is greatly appreciated.
(Note: An answer based on iText 5.x is perfectly ok as I feel the recent iText 7 is still somewhat undocumented)
UPDATE
As pointed out, a number of additional checks would be needed in order to do a safe replacement:
Font encoding compatibility. Not really a problem for me, as the fonts in the documents I'll be processing use WinAnsiEncoding.
Available chars in font. Not really a problem for me, as I'll only be processing documents that use only ISO 8859-1 chars. Furthermore: if the PDF contains an embedded subset of a font, then I'll have easily accessible knowledge about exactly which chars are used in the document for that font.
I'm sure I can figure out how to check for both of these conditions. (I'm blissfully naive.)
I'm not trying to build a general tool. I know where the PDFs I'll be processing come from. In any case, I guess the PDF holds enough information to skip the font substitution if it can't be determined that the substitution will be "safe".
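For the metric-compatibility step itself, here is a minimal sketch of one possible check. It is plain Java and not an iText API: it assumes you have already pulled /FirstChar and the /Widths array out of the font dictionary, and that you have the base font's widths (from its AFM file) keyed by WinAnsi character code. The class name, method name, tolerance parameter and the tiny Helvetica sample table are all hypothetical, and treating a zero width as "unused in this subset" is an assumption that may not hold for every producer.

```java
import java.util.HashMap;
import java.util.Map;

class MetricCompatibility {

    // Illustrative sample only: a few Helvetica widths (1/1000 em) keyed by
    // character code. A real check would load the full table from the AFM file.
    static final Map<Integer, Integer> HELVETICA_SAMPLE = new HashMap<Integer, Integer>();
    static {
        HELVETICA_SAMPLE.put(32, 278);  // space
        HELVETICA_SAMPLE.put(65, 667);  // A
        HELVETICA_SAMPLE.put(66, 667);  // B
        HELVETICA_SAMPLE.put(67, 722);  // C
        HELVETICA_SAMPLE.put(97, 556);  // a
        HELVETICA_SAMPLE.put(105, 222); // i
    }

    /**
     * Returns true only if every non-zero width declared by the PDF font
     * matches the base font's width for the same code within the tolerance.
     * Anything that cannot be verified counts as "not compatible".
     */
    static boolean isMetricCompatible(int firstChar, int[] pdfWidths,
                                      Map<Integer, Integer> baseWidths, int tolerance) {
        for (int i = 0; i < pdfWidths.length; i++) {
            int width = pdfWidths[i];
            if (width == 0) {
                continue; // assumption: zero means the code is unused in this subset
            }
            Integer base = baseWidths.get(firstChar + i);
            if (base == null || Math.abs(width - base) > tolerance) {
                return false; // unknown glyph or different advance width
            }
        }
        return true;
    }
}
```

With the sample table, a font whose /Widths starts at code 65 with 667, 667, 722 would pass, while 667, 600, 722 would be rejected. Failing closed on anything unknown fits the goal of skipping the substitution whenever it can't be shown to be safe.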

Related

How to use Character.UnicodeBlock while I set some strings in a JEditorpane

I need to display some Bengali characters. I've tried to set the font to a Bengali Unicode font, but it does not work properly. My last hope for finishing my project is to use Character.UnicodeBlock, but I have no idea how it works. Is it really possible to display any Unicode character correctly in Java? How can I use Character.UnicodeBlock in a component?

I assume Java 7.
First one needs a Unicode font.
If this font goes into the Windows fonts, and there is no name clash with an already existing font, everything should work.
Otherwise one might store the font as resource file in the application:
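// Requires java.awt.Font, java.awt.GraphicsEnvironment and java.io.InputStream.
InputStream fontIn = getClass().getResourceAsStream("/.../... .ttf");
Font font = Font.createFont(Font.TRUETYPE_FONT, fontIn); // throws FontFormatException, IOException
GraphicsEnvironment.getLocalGraphicsEnvironment().registerFont(font);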
After this, jEditorPane.setFont(font) should work. Note that createFont returns a 1-point font, so derive a usable size first, e.g. font.deriveFont(14f). Mind that the text in the JEditorPane should not be HTML, where the content may set its own fonts.
Looking fonts up by name is tricky because of font substitution.
Another problem might be hard-coded strings in the Java source: the encoding of the Java source (as saved by the editor) must be the same as the one used by the javac compiler. For international projects it is best to use UTF-8 for both (javac -encoding UTF-8 ...). To test whether there is a problem with that, one can test with "\u099C" for জ.
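As for Character.UnicodeBlock itself: it does not display anything, it only tells you which Unicode block a character belongs to. A minimal sketch of how it could be used, for example to decide whether a string needs the Bengali font at all (the class and method names here are hypothetical):

```java
class UnicodeBlockCheck {

    /** Returns true if any code point of the text lies in the given Unicode block. */
    static boolean containsBlock(String text, Character.UnicodeBlock block) {
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            // UnicodeBlock.of returns null for code points outside any block
            if (block.equals(Character.UnicodeBlock.of(codePoint))) {
                return true;
            }
            i += Character.charCount(codePoint); // step over surrogate pairs correctly
        }
        return false;
    }
}
```

For instance, containsBlock("\u099C", Character.UnicodeBlock.BENGALI) is true, while the same call on "abc" is false. Rendering still depends entirely on the font that the component uses.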

Apache FOP - PDF creation russian text

I have a small Java application that creates (besides other stuff) a PDF file using Apache's FOP 1.0.
Everything works fine when using latin letters. But it doesn't when there are others - e.g. cyrillic.
I don't think it is the usual problem of missing fonts, since the bookmarks within the PDF file are all right (unfortunately I can't add pics to this post).
Any ideas, what I'm doing wrong?
Thanks for your help!
Andreas

In your fo:block you need to specify the font you want to use:
<fo:block font-family="MS Mincho" font-size="12pt" font-weight="normal" space-after="5mm" background-color="#8BAF3F" color="white">
Of course the font should be available as well.

Thanks for the hints.
I had set the font-family to 'Verdana', which may or may not have Cyrillic letters.
Additionally I set the font-family in 'simple-page-master', so all pages using this master should be using this font.
On basis of your hints I changed the font-family to 'Arial'.
I also set the font-family in one block explicitly, just for a simple test.
I even tried changing the system language to Russian.
Unfortunately nothing worked. Each change of the font-family (Arial, Courier, Times, MS Mincho, MAC C Times) is visible in the changed style, but the Cyrillic text is always shown as '#'.
And, most confusing, the bookmarks are alright...

Insert text to a generated pdf document

I have a library which generates pdf document with images.
I want to be able to add text after each image. What is the syntax for that? How to insert text into pdf documents?
I have to use the library I have, not another one.

First of all, mkl is correct: have a look at the specification for all of the details. PDF is an exacting format; mistakes are routinely punished severely once you open the PDF in a viewer.
Secondly, when you think about putting text on the page, don't forget that besides the text operators that draw the text, you also have to specify the font used to draw it. That includes making sure a font resource is included in the PDF file if your library doesn't handle that for you automatically.
If you want to cut corners (I shiver while writing this) and perhaps not read the specification as thoroughly, try this.
1) Create a PDF file that looks more or less like what you want.
2) Use a tool such as pdfToolbox from callas (http://www.callassoftware.com/callas/doku.php/en:download) or Browser from Enfocus (http://www.enfocus.com/en/products/browser). Both of these tools allow you to investigate the low-level structure of a PDF file, including looking at the actual page description code. This will show you how fonts are embedded (if you have to do it yourself that could be very handy) and how text is rendered on the page (and how you set the font, size, color etc... to use).
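To give an idea of what you will see in the page description code, here is a minimal sketch that builds the content-stream fragment for one line of text using the standard PDF text operators (BT, Tf, Td, Tj, ET). The class and method names are hypothetical, it does not write a PDF by itself, and it assumes a simple single-byte font encoding and that the font resource name you pass (e.g. F1) is declared in the page's /Resources /Font dictionary.

```java
class PdfTextSnippet {

    /** Escapes the characters that are special inside a PDF literal string. */
    static String escapeLiteral(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == '(' || c == ')' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Builds the operators that show one string at (x, y) in the given font and size. */
    static String showText(String fontResource, int sizePt, int x, int y, String text) {
        return "BT\n"                                         // begin text object
             + "/" + fontResource + " " + sizePt + " Tf\n"    // select font and size
             + x + " " + y + " Td\n"                          // move to the text position
             + "(" + escapeLiteral(text) + ") Tj\n"           // show the string
             + "ET\n";                                        // end text object
    }
}
```

How the resulting fragment gets appended to the page's content stream depends on the library you have to use.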

Jasper Reports - Custom Barcode Generation

Libraries/Tools used:
1) Jasper Reports
2) iReport
3) Java
I've already generated some standard barcodes for my reports, but this time, I'm trying to generate a custom barcode, for which I've a font file custom-barcode.ttf. As of now, iReport supports 2 barcode libraries - Barcode4j and Barbecue, which don't support the custom barcode that I need. Any ideas to get started, without much overhead of using some new library (I'm using Barcode4j already)?
BTW, I'm aware that a similar question (custom barcode font) exists on this site already.

Just tried @mdahlman's answer and it worked. I generated the value "CODE123" using a (free) barcode39 font and Jaspersoft Barbecue.
Setting the size is not very easy using the font, but the result is the same. I verified the barcode using Barcode Scanner on my Android phone (and it looks visually similar too). The reason this worked for me, and probably the reason @bchetty's test didn't, is that Barcode39 doesn't require a check digit. It is a 1-to-1 translation, except that a leading and a trailing asterisk (*) are added to the data. If you want to use a TTF to generate a barcode type that has a check digit, you'll need a function (an external jar like you mentioned) to encode it. Barcode39 doesn't need a function since it's just "*" + $V{data} + "*".
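If you would rather do the wrapping in Java than in the report expression, a minimal sketch could look like this. The class and method names are hypothetical, and the check assumes the font maps the plain Code 39 character set (digits, uppercase letters, space and - . $ / + %) one-to-one, which may vary between barcode fonts.

```java
class Code39Text {

    // The 43 data characters of plain Code 39; '*' is reserved as start/stop.
    private static final String ALLOWED = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%";

    /** Wraps the data in start/stop asterisks after checking every character. */
    static String encode(String data) {
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (ALLOWED.indexOf(c) < 0) {
                throw new IllegalArgumentException("Not a Code 39 character: " + c);
            }
        }
        return "*" + data + "*";
    }
}
```

Code39Text.encode("CODE123") returns "*CODE123*", which is the string to render in the barcode font.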

Given that you have custom-barcode.ttf, it really can be treated as text. So your steps are like this:
1) Create a font extension in iReport for custom-barcode.ttf.
2) Create a Text Field in the report with a relevant expression.
3) Set the font for the Text Field to "custom-barcode" (or whatever you call your font extension). Play with the font size to get the desired output.
Using a custom font for a barcode could be considered a bit of a hack. But what it lacks in flexibility it makes up for in simplicity.

Loading custom fonts at runtime for use with JTextPane

Thanks for your time. My question is regarding the display of different fonts within the one JTextPane. My client wishes to view a word in two different languages within the one field. They've explicitly specified that they wish the different languages (namely Amharic, Arabic, Coptic and Hebrew) to be shown with different fonts. These are obviously non-standard fonts and I can't rely on the user having the required fonts installed on their OS.
From my research I've found that I can load a font file at runtime and set the JTextPane's font accordingly, which is fine if I just wanted to use one font, not two. I've also read about adding fonts to the OS' font directory or the JRE's font directory, outlined here.
I was hoping, however, that there might be a way to use the fonts without altering the user's OS. Am I out of luck?
Thanks again for your time and I look forward to any replies with bright ideas!

"From my research I've found that I can load a font file at runtime and set the JTextPane's font accordingly, which is fine if I just wanted to use one font, not two."
A JTextPane can use multiple fonts.
Check out the section from the Swing tutorial on Text Component Features for an example of playing with the attributes of the text in the text pane.
Edit:
However, the only way I have worked out to use multiple fonts is to create a MutableAttributeSet, set its "FontFamily" attribute (a string) to the desired font name, and then assign the attribute set to the text using StyledDocument.setCharacterAttributes.
Reading the API for the createFont() method, it looks like you should be able to use:
GraphicsEnvironment.registerFont(Font)
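A minimal sketch of the attribute-set approach, assuming the fonts have already been loaded and registered so that they can be referred to by family name. The class and method names are hypothetical, and the family names you pass in are whatever your registered fonts report.

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.StyleConstants;
import javax.swing.text.StyledDocument;

class MultiFontDocument {

    /** Appends text to the document and gives that run its own font family. */
    static void appendWithFont(StyledDocument doc, String text, String family) {
        SimpleAttributeSet attrs = new SimpleAttributeSet();
        StyleConstants.setFontFamily(attrs, family); // sets the "FontFamily" attribute
        int start = doc.getLength();
        try {
            doc.insertString(start, text, null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e); // cannot happen when appending at the end
        }
        // false = merge with the existing attributes instead of replacing them
        doc.setCharacterAttributes(start, text.length(), attrs, false);
    }

    /** Reads back the font family that applies at the given offset. */
    static String fontFamilyAt(StyledDocument doc, int offset) {
        return StyleConstants.getFontFamily(doc.getCharacterElement(offset).getAttributes());
    }
}
```

With a JTextPane you would pass textPane.getStyledDocument() and call appendWithFont once per language run, so each run keeps its own font within the same field.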
