I have to generate PDFs in my Spring MVC application. I recently tested the iTextPdf library, but I could not generate a Unicode PDF document; non-Latin characters did not appear in the generated document. I decided to use Apache PDFBox instead, but I don't know whether it supports Unicode characters. If it does, is there a good tutorial for learning PDFBox? If not, which library should I use?
Thanks in advance.
The 1.8.* versions don't support PDF generation with Unicode, but the 2.0.* versions do. This is the example EmbeddedFonts.java:
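import java.io.File;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.pdmodel.font.PDType0Font;

public class EmbeddedFonts
{
    public static void main(String[] args) throws IOException
    {
        PDDocument document = new PDDocument();
        PDPage page = new PDPage(PDRectangle.A4);
        document.addPage(page);

        // Load a TrueType font as a composite (Type 0) font so it can address Unicode text.
        String dir = "../pdfbox/src/main/resources/org/apache/pdfbox/resources/ttf/";
        PDType0Font font = PDType0Font.load(document, new File(dir + "LiberationSans-Regular.ttf"));

        PDPageContentStream stream = new PDPageContentStream(document, page);
        stream.beginText();
        stream.setFont(font, 12);
        stream.setLeading(12 * 1.2); // line spacing: 1.2 times the font size
        stream.newLineAtOffset(50, 600);
        stream.showText("PDFBox Unicode with Embedded TrueType Font");
        stream.newLine();
        stream.showText("Supports full Unicode text ?");
        stream.newLine();
        stream.showText("English русский язык Tiếng Việt");
        stream.endText();
        stream.close();

        document.save("example.pdf");
        document.close();
    }
}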
Note that, unlike iText, PDFBox support for PDF creation is very low level, i.e. we don't support paragraphs or tables out of the box. There is no tutorial, but there are a lot of examples. The API closely follows the PDF specification.
The current version of Apache PDFBox can't deal with Unicode, see:
https://pdfbox.apache.org/ideas.html
iTextPdf v. 5.x can generate PDF files with Unicode text. There is an example here:
iText in Action: Chapter 11: Choosing the right font
part3.chapter11.UnicodeExample
http://itextpdf.com/examples/iia.php?id=199
To run it, you just need to adapt the value of EncodingExample.FONT and to add some code to create the output file.
I am trying to create a PDF with Greek characters using iText 7 for Java.
Only Latin characters and numbers are visible in the PDF.
I am loading fonts using this code:
PdfFont normalFont = PdfFontFactory.createFont(FontConstants.HELVETICA, "CP1253");
What should I do?
This is the solution:
PdfFont normalFont = PdfFontFactory.createFont("C:\\Windows\\Fonts\\arial.ttf", "Identity-H", true);
You can use any font that supports your language. Identity-H also seems to be important as the font encoding: unlike a single-byte code page such as CP1253, it is not limited to 256 characters.
I tried many things to write Hindi characters using Apache PDFBox, but this seems to be an existing issue in the library.
I tried many of the available font files. Can someone help me out with this?
I tried the following:
PDDocument doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
PDFont font = PDTrueTypeFont.loadTTF( doc, new FileInputStream(new File("D:\\Data\\fonts\\dn.ttf")));
font.setFontEncoding(new WinAnsiEncoding());
PDPageContentStream content = new PDPageContentStream( doc, page, true, false );
content.setFont(font, 10);
content.beginText();
content.moveTextPositionByAmount( 200, 100 );
content.drawString( "हिंदी" ); // Writing word "Hindi" in hindi language.
content.endText();
content.close();
doc.save( new FileOutputStream(new File("D:\\testOutput1.pdf")));
doc.close();
It's working for me in PDFBox.
The trick here is to use a non-Unicode string instead of a Unicode string.
Use the Kruti Dev font from the link below.
Then convert your Unicode string to a non-Unicode string.
Finally, use the converted string in your code.
That means replacing this line
content.drawString( "हिंदी" ); // Writing word "Hindi" in hindi language.
with this line
content.drawString( "fganh" ); // Writing word "Hindi" in hindi language.
Convert Unicode (Mangal) To Kruti Dev Font
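Why the converted string gets through can be checked with the JDK alone. The sketch below is not PDFBox code: the class name WinAnsiCheck and the method fitsWinAnsi are made up for illustration, and it assumes that the JDK's windows-1252 charset is a close enough stand-in for the WinAnsiEncoding set on the font in the question.

```java
import java.nio.charset.Charset;

class WinAnsiCheck {
    // Assumption: windows-1252 approximates PDF's WinAnsiEncoding closely
    // enough to show which strings a single-byte encoding can represent.
    static boolean fitsWinAnsi(String text) {
        return Charset.forName("windows-1252").newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // The Unicode word "Hindi" in Devanagari, written with escapes so the
        // source file encoding does not matter.
        String unicodeHindi = "\u0939\u093F\u0902\u0926\u0940";
        // The Kruti Dev form from the answer: plain ASCII letters that the
        // legacy font draws as Devanagari shapes.
        String krutiDevHindi = "fganh";

        System.out.println(fitsWinAnsi(unicodeHindi));  // prints false
        System.out.println(fitsWinAnsi(krutiDevHindi)); // prints true
    }
}
```

The Devanagari code points have no slot in a 256-character encoding, while the Kruti Dev string is ordinary ASCII, which is why it survives the encoding step. The trade-off is that the text stored in the PDF is then "fganh", not real Hindi, so searching and copying will not give Hindi text.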
I think this cannot be done using PDFBox, as there are a lot of issues with it.
I tried many fonts and the encoding types of PDFBox but failed to write in Hindi.
In the end I tried it in Node.js Express with pdfmaker(), which converts HTML to PDF. I had issues on my Linux server at first, but after I installed an appropriate TTF font it worked!
I want to convert XHTML files into PDF/A format, or PDF files into PDF/A format. Can anyone please suggest which Java library I can use?
Thank you
I will make my example more specific
I have a simple html file xyz.html
<html><body>
hello
<br>
<font style = "Helvetica">hello</font>
<br>
</body></html>
Java code:
Document document = new Document(PageSize.A4);
FileOutputStream fout = new FileOutputStream(pdffile);
PdfWriter pdfWriter = PdfWriter.getInstance(document, fout);
pdfWriter.setPDFXConformance(PdfWriter.PDFA1B);
FileReader fr = new FileReader("xyz.html");
document.open();
HashMap<String, Object> Provider = new HashMap<String, Object>();
DefaultFontProvider def = new
Provider.put(HTMLWorker.FONT_PROVIDER, def);
HTMLWorker htmlWorker = new HTMLWorker(document);
htmlWorker.setProviders(Provider);
htmlWorker.parse(fr);
I get the error com.itextpdf.text.pdf.PdfXConformanceException: All the fonts must be embedded. This one isn't: Helvetica
Try Flying Saucer: http://code.google.com/p/flying-saucer/
Check out the iText library, which supports both Java and .NET:
http://itextpdf.com/
A few examples are at the links below:
http://itextpdf.com/book/examples.php
http://www.rgagnon.com/javadetails/java-html-to-pdf-using-itext.html
It is proprietary, but it is a really smart enterprise library and has good customer support.
Consider the Apache FOP project; it supports converting XML files to PDF files.
I work at Expected Behavior, and we've developed a SaaS application called DocRaptor that converts HTML to PDF using Prince XML as our rendering engine. DocRaptor uses HTTP POST requests to generate PDF files, and can be used with Java.
Here's a link to our Java example:
DocRaptor Java example
And a link to DocRaptor's home page:
DocRaptor
DocRaptor IS a subscription based service, but our free plan allows you to create up to 5 documents per month, and we don't embed watermarks or restrict the size of free documents.
I have written software that generates a PDF as part of its function, using the iTextPDF Java library. For a demo version of my software, I added a text watermark (like "demo software") using the following code:
PdfContentByte under = writer.getDirectContentUnder();
BaseFont baseFont = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.WINANSI, BaseFont.EMBEDDED);
under.beginText();
under.setColorFill(BaseColor.RED);
under.setFontAndSize(baseFont, 25);
under.showTextAligned(PdfContentByte.ALIGN_CENTER," demo software",250, 470,55);
under.endText();
After that, I converted the PDF to .docx format using a PDF-to-Word converter. The resulting .docx file does not contain the watermark, and its contents are easily editable, which defeats the purpose of providing a demo version.
How can I achieve permanent watermarking that a PDF-to-Word converter won't be able to remove?
One idea that comes to mind is to convert all the text of a page into an image first, and then build the PDF from those images. But I am unsure how to achieve this using iTextPdf.
You can encrypt your PDF so that it cannot be modified without an owner password. After you have generated your PDF, create a PdfStamper with your PDF as input
and encrypt the PDF like this:
final PdfReader reader = new PdfReader(your_input_stream);
final PdfStamper stamper = new PdfStamper(reader, your_output_stream);
stamper.setEncryption(PdfWriter.ENCRYPTION_AES_128 | PdfWriter.DO_NOT_ENCRYPT_METADATA,
"your_user_password", "your_owner_password", PdfWriter.ALLOW_PRINTING);
stamper.close();
As a side note, I would recommend not using a hardcoded owner password. Since you have no need for the owner password after the file has been generated, I would suggest making it a SHA hash of a random string of, say, 20 alphanumeric characters.
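A minimal sketch of that suggestion, using only the JDK. The class and method names (OwnerPasswords, randomAlphanumeric, sha256Hex, newOwnerPassword) are made up for illustration, and SHA-256 is an assumption, since the answer only says "a SHA hash".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

class OwnerPasswords {
    private static final String ALPHANUMERIC =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds a random string of the given length from letters and digits.
    static String randomAlphanumeric(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHANUMERIC.charAt(RANDOM.nextInt(ALPHANUMERIC.length())));
        }
        return sb.toString();
    }

    // Returns the SHA-256 digest of the input as 64 lowercase hex characters.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    // Throwaway owner password: hash of 20 random alphanumeric characters.
    static String newOwnerPassword() {
        return sha256Hex(randomAlphanumeric(20));
    }

    public static void main(String[] args) {
        System.out.println(newOwnerPassword());
    }
}
```

The result of newOwnerPassword() would take the place of "your_owner_password" in the setEncryption call above. The hash itself adds no randomness; the unpredictability comes from SecureRandom.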
I am using itext and ColdFusion (java) to write text strings to a PDF document. I have both trueType and openType fonts that I need to use. Truetype fonts seem to be working correctly, but the kerning is not being used for any font file ending in .otf. The code below writes "Line 1 of Text" in Airstream (OpenType) but the kerning between "T" and "e" is missing. When the same font is used in other programs, it has kerning. I downloaded a newer version of itext also, but the kerning still did not work. Does anyone know how to get kerning to work with otf fonts in itext?
<cfscript>
pdfContentByte = createObject("java","com.lowagie.text.pdf.PdfContentByte");
BaseFont= createObject("java","com.lowagie.text.pdf.BaseFont");
bf = BaseFont.createFont("c:\windows\fonts\AirstreamITCStd.otf", "" , BaseFont.EMBEDDED);
document = createobject("java","com.lowagie.text.Document").init();
fileOutput = createObject("java","java.io.FileOutputStream").init("c:\inetpub\test.pdf");
writer = createobject("java","com.lowagie.text.pdf.PdfWriter").getInstance(document,fileOutput);
document.open();
cb = writer.getDirectContent();
cb.beginText();
cb.setFontAndSize(bf, 72);
cb.showTextAlignedKerned(PdfContentByte.ALIGN_LEFT,"Line 1 of Text",0,72,0);
cb.endText();
document.close();
bf.hasKernPairs(); //returns NO
bf.getClass().getName(); //returns "com.lowagie.text.pdf.TrueTypeFont"
</cfscript>
According to the spec: http://www.microsoft.com/typography/otspec/kern.htm
OpenType™ fonts containing CFF outlines are not supported by the 'kern' table and must use the 'GPOS' OpenType Layout table.
I checked the source: the iText implementation only reads the 'kern' table for TrueType fonts and does not read the GPOS table at all, so the internal kerning data must be empty and hasKernPairs must return false.
So there are two ways to solve this:
get rid of the OTF font you are using :)
patch the TrueTypeFont class to read the GPOS table
Or wait for me: I'm working on the CFF content, but PDF is optional for me :) though I don't exclude the possibility :)
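To see which of the tables mentioned above a particular font file carries, the table directory at the start of the file can be read with the JDK alone. This is a sketch, not iText code: the names FontTables and tableTags are made up for illustration, and it assumes a single-font .otf or .ttf file rather than a .ttc collection.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;

class FontTables {
    // Returns the four-character tags listed in the font's table directory.
    static Set<String> tableTags(byte[] font) {
        ByteBuffer buf = ByteBuffer.wrap(font);   // big-endian by default, like the font format
        buf.getInt();                             // sfntVersion: 0x00010000 or 'OTTO'
        int numTables = buf.getShort() & 0xFFFF;  // unsigned 16-bit table count
        Set<String> tags = new LinkedHashSet<>();
        byte[] tag = new byte[4];
        for (int i = 0; i < numTables; i++) {
            // The 12-byte header is followed by 16-byte records:
            // tag, checksum, offset, length (4 bytes each).
            buf.position(12 + 16 * i);
            buf.get(tag);
            tags.add(new String(tag, StandardCharsets.US_ASCII));
        }
        return tags;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FontTables <font file>");
            return;
        }
        Set<String> tags = tableTags(Files.readAllBytes(Paths.get(args[0])));
        System.out.println("CFF outlines: " + tags.contains("CFF "));
        System.out.println("kern table:   " + tags.contains("kern"));
        System.out.println("GPOS table:   " + tags.contains("GPOS"));
    }
}
```

A font that reports CFF outlines and a GPOS table but no kern table would be consistent with hasKernPairs() returning false in the question.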
Have a look at this thread about how to use OpenType fonts in Java.
It states that OTF is not supported by Java (not even with iText); OTF support depends on the SDK version and the OS.
Alternatively you could use FontForge which converts otf to ttf.