PDFBox: writing Hindi characters in a PDF file - java

I tried many things to write Hindi characters using Apache PDFBox, but it seems to be an existing issue in the library.
I tried many of the available font files. Can someone help me out with this?
I tried the following:
PDDocument doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
PDFont font = PDTrueTypeFont.loadTTF( doc, new FileInputStream(new File("D:\\Data\\fonts\\dn.ttf")));
font.setFontEncoding(new WinAnsiEncoding());
PDPageContentStream content = new PDPageContentStream( doc, page, true, false );
content.setFont(font, 10);
content.beginText();
content.moveTextPositionByAmount( 200, 100 );
content.drawString( "हिंदी" ); // Writing word "Hindi" in hindi language.
content.endText();
content.close();
doc.save( new FileOutputStream(new File("D:\\testOutput1.pdf")));
doc.close();

It's working for me in PDFBox.
The trick here is to use a non-Unicode string instead of a Unicode string.
Use the Kruti Dev font given in the link below.
Then convert your Unicode string to a non-Unicode (Kruti Dev encoded) string.
Finally, use that converted string in your code.
That means replace this line
content.drawString( "हिंदी" ); // Writing word "Hindi" in hindi language.
With this line
content.drawString( "fganh" ); // Writing word "Hindi" in hindi language.
Convert Unicode (Mangal) To Kruti Dev Font
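To make the conversion step concrete, here is a minimal sketch of what such a converter does. The five mappings are read off the single example above (हिंदी becoming "fganh") and are not a complete Kruti Dev table. The class name KrutiDevSketch and the method toLegacy are hypothetical. The sketch also shows why the letters change order: in a legacy font the short "i" sign is typed before the consonant it belongs to. Conjuncts, reph and other reordering rules are not handled.

```java
import java.util.HashMap;
import java.util.Map;

final class KrutiDevSketch {
    // Mappings taken only from the example in this answer; a real table is far larger.
    private static final Map<Character, String> GLYPHS = new HashMap<>();
    static {
        GLYPHS.put('\u0939', "g"); // letter HA
        GLYPHS.put('\u093F', "f"); // vowel sign I (drawn before the consonant)
        GLYPHS.put('\u0902', "a"); // sign ANUSVARA
        GLYPHS.put('\u0926', "n"); // letter DA
        GLYPHS.put('\u0940', "h"); // vowel sign II
    }

    // True for the dependent signs that attach to a preceding consonant.
    private static boolean isDependentSign(char c) {
        return c == '\u0902' || (c >= '\u093E' && c <= '\u094C');
    }

    static String toLegacy(String unicode) {
        StringBuilder out = new StringBuilder();
        int lastBaseStart = 0; // where the glyph of the most recent consonant begins
        for (char c : unicode.toCharArray()) {
            String glyph = GLYPHS.get(c);
            if (glyph == null) {
                throw new IllegalArgumentException("No mapping in this sketch for U+"
                        + Integer.toHexString(c).toUpperCase());
            }
            if (c == '\u093F') {
                // The short "i" sign goes in front of its consonant in the legacy encoding.
                out.insert(lastBaseStart, glyph);
            } else {
                if (!isDependentSign(c)) {
                    lastBaseStart = out.length();
                }
                out.append(glyph);
            }
        }
        return out.toString();
    }
}
```

The returned string is what would be passed to drawString with the Kruti Dev font set. Note that the resulting PDF contains Latin letters drawn with Hindi-looking glyphs, so searching or copying text from it will not give Hindi back.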

I think this cannot be done using PDFBox, as there are a lot of issues with it.
I tried many fonts and many of the PDFBox encoding types, but failed to write in Hindi.
In the end I tried it in Node.js (Express) with pdfmaker(), which converts HTML to PDF. I had issues on my Linux server at first, but after I installed an appropriate TTF font it worked!

Related

Java iText7 UTF8 Characters [duplicate]

I am trying to create a PDF with Greek characters using iText 7 for Java.
Only Latin characters and numbers are visible in the PDF.
I am loading fonts using this code:
PdfFont normalFont = PdfFontFactory.createFont(FontConstants.HELVETICA, "CP1253");
What should I do?
This is the solution:
PdfFont normalFont = PdfFontFactory.createFont("C:\\Windows\\Fonts\\arial.ttf", "Identity-H", true);
You can use any font that supports your language. Passing Identity-H as the font encoding (the second argument) also seems to be important.

java - generate unicode pdf with Apache PDFBox

I have to generate a PDF in my Spring MVC application. I recently tested the iTextPdf library, but I could not generate a Unicode PDF document; in fact, the non-Latin characters did not appear in the generated document. I decided to use Apache PDFBox instead, but I don't know whether it supports Unicode characters. If it does, is there a good tutorial for learning PDFBox? If not, which library should I use?
Thanks in advance.
The 1.8.* versions don't support PDF generation with Unicode, but the 2.0.* versions do. This is the example EmbeddedFonts.java:
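import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDType0Font;

public class EmbeddedFonts
{
    public static void main(String[] args) throws IOException
    {
        PDDocument document = new PDDocument();
        PDPage page = new PDPage(PDRectangle.A4);
        document.addPage(page);

        // Load the TrueType font as a Type 0 (composite) font so Unicode text can be shown
        String dir = "../pdfbox/src/main/resources/org/apache/pdfbox/resources/ttf/";
        PDType0Font font = PDType0Font.load(document, new File(dir + "LiberationSans-Regular.ttf"));

        PDPageContentStream stream = new PDPageContentStream(document, page);
        stream.beginText();
        stream.setFont(font, 12);
        stream.setLeading(12 * 1.2f); // float literal, so it compiles whether setLeading takes float or double
        stream.newLineAtOffset(50, 600);
        stream.showText("PDFBox Unicode with Embedded TrueType Font");
        stream.newLine();
        stream.showText("Supports full Unicode text ?");
        stream.newLine();
        stream.showText("English русский язык Tiếng Việt");
        stream.endText();
        stream.close();

        document.save("example.pdf");
        document.close();
    }
}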
Note that, unlike iText, PDFBox's support for PDF creation is very low level, i.e. we don't support paragraphs or tables out of the box. There is no tutorial, but there are a lot of examples. The API is oriented toward the PDF specification.
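Because there are no paragraphs out of the box, line breaking has to be done by the caller. The following is a minimal sketch of greedy word wrapping, not something shipped with PDFBox; the class LineWrapSketch and its widthOf parameter are hypothetical names. The width function is passed in so the sketch stays independent of any PDF library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class LineWrapSketch {
    /**
     * Greedy wrap: keeps adding words to the current line until the measured
     * width would exceed maxWidth, then starts a new line. A single word that
     * is wider than maxWidth is placed on a line of its own.
     */
    static List<String> wrap(String text, double maxWidth, ToDoubleFunction<String> widthOf) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for blank input
            }
            String candidate = current.length() == 0 ? word : current + " " + word;
            if (current.length() > 0 && widthOf.applyAsDouble(candidate) > maxWidth) {
                lines.add(current.toString());
                current.setLength(0);
                current.append(word);
            } else {
                current.setLength(0);
                current.append(candidate);
            }
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }
}
```

With PDFBox 2.0 the width function would typically be based on font.getStringWidth(s) / 1000 * fontSize, since getStringWidth reports widths in thousandths of a text-space unit. It throws IOException, so the lambda has to catch it. Each returned line can then be written with showText followed by newLine, as in the example above.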
The current version of Apache PDFBox can't deal with Unicode, see:
https://pdfbox.apache.org/ideas.html
iTextPdf v. 5.x generates PDF files with Unicode. There is an example here:
iText in Action: Chapter 11: Choosing the right font
part3.chapter11.UnicodeExample
http://itextpdf.com/examples/iia.php?id=199
To run it, you just need to adapt the value of EncodingExample.FONT and add some code to create the output file.

PDF Generation having multi-lingual text using Flying Saucer

I am trying to print Arabic and English text in a PDF using the Flying Saucer library. Here's my code:
String inputFile = "D:/test.xhtml";
String url = new File(inputFile).toURI().toURL().toString();
String outputFile = "D:/doc.pdf";
OutputStream os = new FileOutputStream(outputFile);
ITextRenderer renderer = new ITextRenderer();
ITextFontResolver resolver = renderer.getFontResolver();
resolver.addFont("D:/arialuni.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
renderer.setDocument(url);
renderer.layout();
renderer.createPDF(os);
os.close();
and my XHTML file has the following data enclosed in paragraph tags:
اب اب اب اب Hello
The output generated displays only English characters but not Arabic glyphs. Please help.
For some reason, if no specific font is used, the generated PDF uses some kind of default font (probably Helvetica) that contains a very limited character set, which evidently does not include the Greek code page.
Reference
Arial is a pretty standard font, installed by default on most operating systems, and it implements a wide variety of alphabets (including Greek).
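One way to read this answer for the Flying Saucer case is that registering a font with addFont is not enough on its own: the document also has to ask for that font, otherwise the default is used. The sketch below only builds the XHTML string. The class XhtmlFontSketch is hypothetical, and the family name "Arial Unicode MS" for arialuni.ttf is an assumption to check against the actual font file.

```java
final class XhtmlFontSketch {
    /**
     * Wraps an already well-formed XHTML body fragment in a document whose CSS
     * requests the given font family. The family name must not contain quotes.
     */
    static String wrapWithFont(String bodyXhtml, String fontFamily) {
        return "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><style type=\"text/css\">"
                + "body { font-family: \"" + fontFamily + "\"; }"
                + "</style></head>"
                + "<body>" + bodyXhtml + "</body></html>";
    }

    public static void main(String[] args) {
        // Assumed family name of the font registered through addFont("D:/arialuni.ttf", ...)
        System.out.println(wrapWithFont("<p>Hello</p>", "Arial Unicode MS"));
    }
}
```

The resulting string could be handed to the renderer with setDocumentFromString instead of setDocument(url), or the same CSS rule could simply be added to test.xhtml. This only addresses font selection; whether Arabic shaping and right-to-left ordering come out correctly is a separate question.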

How to handle writing Chinese characters to a PDF file

I am trying to read text from a properties file that is encoded in UTF-8 and write it into a PDF file using a Document object in Java.
Document document = new Document();
File file = new File(FILES_PATH + ".pdf");
FileOutputStream fos = new FileOutputStream(file);
PdfWriter.getInstance(document, fos);
.
.
.
pdfTable table;
document.add(table);
document.close();
When I just take the value from the property, the Chinese characters are ignored.
When I try to re-encode the string, I get strange characters or "?" instead of the Chinese characters.
I tried encoding it as UTF-8, ISO-8859-1, GBK and GB2312.
I need help making the PDF file display Chinese characters.
It will not work that way.
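As a quick illustration of why re-encoding cannot help, and of one thing worth ruling out first, here is a small sketch using only the JDK. The class Utf8PropertiesSketch and its method names are hypothetical. It relies on two documented behaviors: Properties.load(InputStream) decodes as ISO-8859-1, while Properties.load(Reader) uses whatever charset the reader was built with; and String.getBytes replaces characters the target charset cannot represent with "?".

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class Utf8PropertiesSketch {
    /** Loads properties from UTF-8 bytes through a Reader, so the text is decoded as UTF-8. */
    static Properties loadUtf8(byte[] utf8Bytes) {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Round-trips text through ISO-8859-1; characters it cannot represent come back as '?'. */
    static String reencodeAsLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }
}
```

For a real file, Files.newBufferedReader(path, StandardCharsets.UTF_8) gives a suitable reader. Once the Java String holds the right characters, no further encoding step is needed, and the remaining problem is the font.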
To display a Unicode character in a PDF that is not covered by the built-in PDF fonts, you need to specify a custom font for the text fragment and create a separate fragment for each piece of text that is covered by a given font. You also need to embed the fonts you use in the PDF document (so please check whether the license for those fonts permits distributing them).
So each String could be rendered using several fonts. iText has the class FontSelector, which does that task:
FontSelector selector = new FontSelector();
BaseFont bf1 = BaseFont.createFont(fontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
bf1.setSubset(true);
Font font1 = new Font(bf1, 12, Font.BOLD);
selector.addFont(font1);
// ... do that with all fonts you need
Phrase ph = selector.process(TEXT);
document.add(new Paragraph(ph));
You can find a more complex example in my article: Using dynamic fonts for international texts in iText
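The idea of one fragment per font can be sketched without any PDF library. This is not iText's implementation, only an illustration of the splitting step under simple assumptions: each "font" is represented by a predicate saying whether it has a glyph for a code point, and the first font that covers a character wins. The names FontRunSketch and Run are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

final class FontRunSketch {
    /** A stretch of text that can be drawn with a single font. */
    static final class Run {
        final int fontIndex; // index into the font list, or -1 if no font covers the text
        final String text;

        Run(int fontIndex, String text) {
            this.fontIndex = fontIndex;
            this.text = text;
        }
    }

    static List<Run> split(String text, List<IntPredicate> fonts) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentFont = -1;
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            int font = firstCovering(codePoint, fonts);
            // Close the current run whenever the chosen font changes.
            if (font != currentFont && current.length() > 0) {
                runs.add(new Run(currentFont, current.toString()));
                current.setLength(0);
            }
            currentFont = font;
            current.appendCodePoint(codePoint);
            i += Character.charCount(codePoint);
        }
        if (current.length() > 0) {
            runs.add(new Run(currentFont, current.toString()));
        }
        return runs;
    }

    private static int firstCovering(int codePoint, List<IntPredicate> fonts) {
        for (int i = 0; i < fonts.size(); i++) {
            if (fonts.get(i).test(codePoint)) {
                return i;
            }
        }
        return -1;
    }
}
```

The order of the font list matters: put the preferred font first and the wide-coverage fallback last, which mirrors the order in which fonts are added to the selector.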

OpenType font kerning with itext

I am using iText and ColdFusion (Java) to write text strings to a PDF document. I have both TrueType and OpenType fonts that I need to use. The TrueType fonts seem to be working correctly, but kerning is not applied for any font file ending in .otf. The code below writes "Line 1 of Text" in Airstream (OpenType), but the kerning between "T" and "e" is missing. When the same font is used in other programs, it has kerning. I also downloaded a newer version of iText, but the kerning still did not work. Does anyone know how to get kerning to work with OTF fonts in iText?
<cfscript>
pdfContentByte = createObject("java","com.lowagie.text.pdf.PdfContentByte");
BaseFont= createObject("java","com.lowagie.text.pdf.BaseFont");
bf = BaseFont.createFont("c:\windows\fonts\AirstreamITCStd.otf", "" , BaseFont.EMBEDDED);
document = createobject("java","com.lowagie.text.Document").init();
fileOutput = createObject("java","java.io.FileOutputStream").init("c:\inetpub\test.pdf");
writer = createobject("java","com.lowagie.text.pdf.PdfWriter").getInstance(document,fileOutput);
document.open();
cb = writer.getDirectContent();
cb.beginText();
cb.setFontAndSize(bf, 72);
cb.showTextAlignedKerned(PdfContentByte.ALIGN_LEFT,"Line 1 of Text",0,72,0);
cb.endText();
document.close();
bf.hasKernPairs(); //returns NO
bf.getClass().getName(); //returns "com.lowagie.text.pdf.TrueTypeFont"
</cfscript>
According to the spec: http://www.microsoft.com/typography/otspec/kern.htm
OpenType™ fonts containing CFF outlines are not supported by the 'kern' table and must use the 'GPOS' OpenType Layout table.
I checked the source: the iText implementation only reads the 'kern' table of TrueType fonts and does not read the GPOS table at all, so the internal kerning data is empty and hasKernPairs returns false.
So there are a few ways to solve this:
get rid of the OTF font you are using :)
patch TrueTypeFont so that it reads the GPOS table
wait for me: I'm working on processing the CFF content, although PDF is only optional for me :) but I don't rule out the possibility :)
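To check which situation a given font file is in, the table directory at the start of the file can be listed. The sketch below assumes a single-font .ttf or .otf file (not a .ttc collection) and follows the sfnt layout: a 12-byte header containing the table count, then one 16-byte record per table starting with a 4-character tag. The class SfntTableSketch is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;

final class SfntTableSketch {
    /** Returns the table tags (for example "kern", "GPOS", "CFF ") in file order. */
    static Set<String> tableTags(byte[] fontBytes) {
        ByteBuffer buf = ByteBuffer.wrap(fontBytes); // big-endian by default, as sfnt requires
        buf.getInt();                                // sfntVersion: 'OTTO' for CFF outlines
        int numTables = buf.getShort() & 0xFFFF;
        buf.position(buf.position() + 6);            // searchRange, entrySelector, rangeShift
        Set<String> tags = new LinkedHashSet<>();
        byte[] tag = new byte[4];
        for (int i = 0; i < numTables; i++) {
            buf.get(tag);
            tags.add(new String(tag, StandardCharsets.US_ASCII));
            buf.position(buf.position() + 12);       // checksum, offset, length
        }
        return tags;
    }
}
```

Reading the file with Files.readAllBytes and calling tableTags shows whether the font has a "kern" table, a "GPOS" table, or both. A font that lists "CFF " and "GPOS" but no "kern" matches the case described above, where hasKernPairs returns false.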
Have a look at this thread about how to use OpenType fonts in Java.
It states that OTF is not supported by Java (not even with iText); OTF support depends on the SDK version and the OS.
Alternatively, you could use FontForge to convert the OTF to TTF.
