Adding fonts to Apache Pdfbox? - java

Is there a way to add additional font styles into Apache Pdfbox?
We're currently trying to work around printing PDFs in our system (currently being done with PDF-Renderer.) I have been looking at various alternatives (pdfbox, jpedal, jPDFPrint)
Our hope is for a free GPL compatible library to use, and as such we're leaning towards pdfbox. I have been able to write some sample code to print out the pdf which 'works'. See below:
PDDocument doc;
try {
doc = PDDocument.load("test.pdf");
doc.print();
} catch (Exception e) {
// Come up with better thing to do on fail.
e.printStackTrace();
}
As I mentioned, this works but the problem I'm running into is that PdfBox doesn't seem to be recognizing the fonts used in the pdf, and as such changes the font being used. As a result the document looks very odd (spacing and character size are different and look bizarre). I routinely see the following log message, or things like it:
Apr 16, 2014 2:56:21 PM org.apache.pdfbox.pdmodel.font.PDSimpleFont drawString
WARNING: Changing font on < > from < NimbusMono > to the default font
Does anyone know of a way (or a reference) on how to approach adding a new fonttype into pdfbox? Or barring that, how to change the default font type?
From what I can tell, pdfbox supports 14 standard fonts. Unfortunately NimbusMono is not one of them. Any guidance would be appreciated.

The unreleased 2.0 version supports the rendering of embedded fonts. You can get it as a snapshot
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/
or through "svn checkout http://svn.apache.org/repos/asf/pdfbox/trunk/". The API is slightly different from the 1.8.x versions and might change, the best is to look at the code examples. A quick test to see whether your file will be rendered properly is to download the "pdfbox-app"
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.0-SNAPSHOT/
and then run the viewer:
java -jar pdfbox-app-2.0.0-20140416.173452-273.jar PDFReader your-file-name.pdf
There's also a print feature.
Good luck!
Update 2016: 2.0 release is out, download it here.
If you have used the 1.8 version, read the migration guide.

I came across this post while trying to solve the same problem. The PDFBox 2.0 API documentation isn't great at the moment.
What you're looking for is the FontFileFinder in Fontbox.
Make sure you're using the full pdfbox-app jar which includes Fontbox.
I've only tried this on Windows but looking at the classes it seems like it supports the other main operating systems.
Here's a simple example class I wrote that writes out a small bit of text in the bottom left corner of a PDF, using a non-standard font.
import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.util.List;
import org.apache.fontbox.util.autodetect.FontFileFinder;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType0Font;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
public class TestPDFWrite {
public static void main(String[] args) throws IOException {
FontFileFinder fontFinder = new FontFileFinder();
List<URI> fontURIs = fontFinder.find();
File fontFile = null;
for (URI uri : fontURIs) {
File font = new File(uri);
if (font.getName().equals("CHILLER.TTF")) {
fontFile = font;
}
}
PDDocument document = new PDDocument();
PDPage page = new PDPage();
document.addPage(page);
PDPageContentStream contentStream = new PDPageContentStream(document, page);
contentStream.beginText();
if (fontFile != null) {
contentStream.setFont(PDType0Font.load(document, fontFile), 12);
} else {
contentStream.setFont(PDType1Font.HELVETICA, 12);
}
contentStream.newLineAtOffset(10, 10);
contentStream.showText("Hello World");
contentStream.endText();
contentStream.close();
document.save("C:/Hello World.pdf");
document.close();
}
}

I ran into a similar problem with PDFBox. PDFs can be printed in a straightforward way using Java's javax.print package. The following code is slightly modified from the API docs for javax.print.
DocFlavor flavor = DocFlavor.INPUT_STREAM.PDF;
PrintRequestAttributeSet aset = new HashPrintRequestAttributeSet();
aset.add(MediaSizeName.ISO_C6); //letter size
PrintService[] pservices = PrintServiceLookup.lookupPrintServices(flavor, aset);
if (pservices.length > 0) {
DocPrintJob pj = pservices[0].createPrintJob();
try {
FileInputStream fis = new FileInputStream("test.pdf");
Doc doc = new SimpleDoc(fis, flavor, null);
pj.print(doc, aset);
} catch (FileNotFoundException | PrintException e) {
//do something
}
This code assumes that the printer can accept a PDF directly but it allows you to bypass PDFBox 1.8 branch's wonky font issues.

Related

I can't import com.itextpdf.text.Document class

I'm building an android app and I want to use iText for creating pdf file, but I can't use Document class. As I seen in tutorials, there should be import com.itextpdf.text.Document for using Document class. For this app, I'm using com.itextpdf:itext-pdfa:5.5.9 library. I want to create a simple pdf file with 2 paragraphs, something like this:
try{
File pdfFolder = new File(Environment.getExternalStoragePublicDirectory(
Environment.DIRECTORY_DOCUMENTS), "pdfdemo");
if (!pdfFolder.exists()) {
pdfFolder.mkdir();
}
Date date = new Date() ;
String timeStamp = new SimpleDateFormat("yyyyMMdd_HHmmss").format(date);
File myFile = new File(pdfFolder + timeStamp + ".pdf");
OutputStream output = new FileOutputStream(myFile);
Document document = new Document();
PdfAWriter.getInstance(document, output);
document.open();
document.add(new Paragraph(mSubjectEditText.getText().toString()));
document.add(new Paragraph(mBodyEditText.getText().toString()));
document.close();
}catch (Exception e) {}
'
Could anyone help me with this problem? What am I doing wrong?
You say:
I'm using com.itextpdf:itext-pdfa:5.5.9 library
That is wrong for two reasons:
itext-pdfa is an addon to iText that is meant for writing or manipulating PDF/A documents. It requires the core iText libary. Read about the different parts of iText on the official web site: https://developers.itextpdf.com/itext-java
You say you want to use iText on Android, but you are referring to iText for Java. iText for Java contains classes that are not allowed on Android (java.awt.*, javax.nio,...). You should use the Android port for iText, which is called iTextG: https://developers.itextpdf.com/itextg-android
It's as if you're using iText without having visited the official iText web site. How is that even possible?
Just open your app level gradle file and add following code into your dependencies
implementation 'com.itextpdf:itext-pdfa:5.5.9'
It works for me

Creating Word File With JTextPane Style Option

I want to save the contents of a JTextPane to a word file.
I don't have a problem saving but I can't currently keep some style options such as paragraph styles.
I use these libraries:
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
Lines of code;
System.out.println("Kaydete basıldı");
String text = textPane.getText();
lblNewLabel.setText(text);
XWPFDocument document = new XWPFDocument();
XWPFParagraph paragraph = document.createParagraph();
XWPFRun run = paragraph.createRun();
run.setText(text);
try {
FileOutputStream dosyaCikis = new FileOutputStream(
"sercan.docx");
document.write(dosyaCikis);
dosyaCikis.close();
} catch (Exception e2) {
e2.printStackTrace();
}
Apache POI or another way, it does not matter, I am waiting for your help.
This example shows how to set various style options:(Apache POI)
SimpleDocument
Example code form the link:
XWPFDocument doc = new XWPFDocument();
XWPFParagraph p1 = doc.createParagraph();
p1.setAlignment(ParagraphAlignment.CENTER);
p1.setBorderBottom(Borders.DOUBLE);
p1.setBorderTop(Borders.DOUBLE);
p1.setBorderRight(Borders.DOUBLE);
p1.setBorderLeft(Borders.DOUBLE);
p1.setBorderBetween(Borders.SINGLE);
p1.setVerticalAlignment(TextAlignment.TOP);
XWPFRun r1 = p1.createRun();
r1.setBold(true);
r1.setText("The quick brown fox");
r1.setBold(true);
r1.setFontFamily("Courier");
r1.setUnderline(UnderlinePatterns.DOT_DOT_DASH);
r1.setTextPosition(100);
Other examples(styles,images .etc) can be found here:
Example Package
AFAIK the options for writing Word files are limited from standard Java libraries.
You probably want to use a tool that explicitly supports Word formats - the best bet is probably LibreOffice, which is Free software. The LibreOffice API supports Java and other languages.
For a fuller explanation look here:
What's a good Java API for creating Word documents?
However that answer refers to OpenOffice, of which LibreOffice is a more actively developed fork due to management issues over the years.
You could try docx_editor_kit. From the web page:
it can open docx file and reflect the content in JEditorPane (or
JTextPane). Also user can create styled content and store the content
in docx format.
Somewhat related is my docx4all, but it hasn't been updated recently, and it may be overkill for your purposes.
Both of these use docx4j (as opposed to POI).

Viewing .doc file with java applet

I have a web application. I've generated MS Word document in xml format (Word 2003 XML Document) on server side. I need to show this document to a user on a client side using some kind of viewer. So, question is: what libraries I can use to solve this problem? I need an API to view word document on client side using java.
You cannot reliably display a Word document in a web page using Java (or any other simple technology for that matter). There are several commercial libraries out there to render Word, but you will not find these to be easy, cheap or reliable solutions.
What you should do is the following:
(1) Open the Word engine on the server using a .NET program
(2) Convert the document to Rich Text using the Word engine
(3) Display the rich text either using the RTF Swing widget, or convert to HTML:
String rtf = [your document rich text];
BufferedReader input = new BufferedReader(new StringReader(rtf));
RTFEditorKit rtfKit = new RTFEditorKit();
StyledDocument doc = (StyledDocument) rtfKit.createDefaultDocument();
rtfEdtrKt.read( input, doc, 0 );
input.close();
HTMLEditorKit htmlKit = new HTMLEditorKit();
StringWriter output = new StringWriter();
htmlKit.write( output, doc, 0, doc.getLength());
String html = output.toString();
The main risk in this approach is that the Word engine will either crash or have a memory leak. For this reason you have to have a mechanism for restarting it periodically and testing it to make sure it is functional and not hogging memory.
docx4all is a Swing-based applet which does Word 2007 XML (ie not Word 2003 XML), which we wrote several years ago.
Get it from svn.
That's a possible approach for editing. If all you want is a viewer, which not convert to HTML or PDF? You can use docx4j for that. (Disclosure: "my" project).
You might have a look at the Apache POI - Java API to Handle Microsoft Word Files which is able to read all kinds of word documents (OLE2 and OOXML formats, .doc and .docx extensions respectively).
Reading a doc file can be easy as:
import java.io.*;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
public class ReadDocFile {
public static void main(String[] args) {
File file = null;
WordExtractor extractor = null ;
try {
file = new File("c:\\New.doc");
FileInputStream fis=new FileInputStream(file.getAbsolutePath());
HWPFDocument document=new HWPFDocument(fis);
extractor = new WordExtractor(document);
String [] fileData = extractor.getParagraphText();
for(int i=0;i<fileData.length;i++){
if(fileData[i] != null)
System.out.println(fileData[i]);
}
}
catch(Exception exep){}
}
}
You can find more at: HWPF Quick-Guide (specifically HWPF unit tests)
Note that, according to the POI site:
HWPF is still in early development.
I'd suggest looking at the openoffice source code and implement that.
It's supposed to be written in java.

Apache POI - Docx output issue

I am evaluating apache poi as an option to write docx files. The specific thing I am looking for is to generate content in the docx file in different languages (hindi/marathi to be specific). I am facing the following issue:
When the docx file gets written the "Hindi/Marathi" text is visible as square boxes even though the font "Arial Unicode MS" supports it. The point is that when we check the boxes MS Word displays the font as "Cailbri", even though i have explicitly set the font to "Arial Unicode MS". If i select the boxes in MS Word and then change the font to "Arial Unicode MS" the hindi/marathi words are visible correctly. Any idea why this happens? Please note I am using a development version of POI as the previous stable version did not support setting of font families. Here is the source:
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
public class CreateDocumentFromScratch
{
public static void main(String[] args)
{
XWPFDocument document = new XWPFDocument();
XWPFParagraph paragraphTwo = document.createParagraph();
XWPFRun paragraphTwoRunOne = paragraphTwo.createRun();
paragraphTwoRunOne.setFontFamily("Arial Unicode MS");
paragraphTwoRunOne.setText("नसल्यास");
XWPFParagraph paragraphThree = document.createParagraph();
XWPFRun paragraphThreeRunOne = paragraphThree.createRun();
paragraphThreeRunOne.setFontFamily("Arial Unicode MS");
paragraphThreeRunOne.setText("This is nice");
FileOutputStream outStream = null;
try {
outStream = new FileOutputStream("c:/will/First.doc");
} catch (FileNotFoundException e) {
e.printStackTrace();
}
try {
document.write(outStream);
outStream.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Any help will be appreciated.
To resurrect a very old post; can the OP confirm the version of MS Office that being used? The problem appears to be with MS Office 2003 running on Windows XP. But then it could be on a higher OS version, too.
It would appear that MS Word applies the Mangal font for Hindi script [Encoding standard: Indic: Hindi ISCII 57002 (Devanagari)]. The following link explains this:
https://support.office.com/en-ca/article/Choose-text-encoding-when-you-open-and-save-files-60d59c21-88b5-4006-831c-d536d42fd861
Suggested workaround:
From Windows XP Control Panel, select Regional and Language Options. Select Languages. Check the box "Install files for complex script and right-to-left languages (including Thai).
Restart PC.
However, no such problem was observed when opening the file using LibreOffice versions 4.3.5.2 on Windows, and LibreOffice 4.2.7.2 on Linux (Ubuntu).
Used the following libraries:
poi-3.10-FINAL-20140208.jar, poi-ooxml-3.10-FINAL-20140208.jar,
poi-ooxml-schemas-3.10-FINAL-20140208.jar, xmlbeans-2.3.0.jar,
dom4j-1.6.1.jar, stax-api-1.0.1.jar

Silent Printing of PDF From Within Java

We are looking into silent printing of PDF documents from within Java. The printing will be invoked from the desktop and not through a browser so we cannot use JavaScript. PDF Renderer is an operational solution but their rendering quality is not acceptable. iText does not seem to be pluggable with the Java print service. There are some commercial Java libraries, jPDFPrint by Qoppa, JPedal, and ICEpdf which we have not tried out yet.
Does anybody have any experience with PDF silent printing from Java?
Apache PDFBox. It is currently in incubation, but the PDF printing functionality has been around before that. Internally, it uses the Java Print Services to create a print job, and it also supports silent printing.
Do note that it requires Fontbox as well, and the current (upcoming 0.8.0 release) has included graceful fallback for documents with Type 0 fonts. Type 1 fonts are printed correctly; however in 0.7.3, attempts to print documents with Type 0 fonts will result in an exception being thrown.
Maybe I'm misunderstanding, but why not just use the Print Service API directly? The following works for me (assumes you have the PDF document as a byte array):
DocFlavor flavor = DocFlavor.BYTE_ARRAY.PDF;
PrintService[] services = PrintServiceLookup.lookupPrintServices(flavor, null);
if (services.length > 0)
{
DocPrintJob printJob = services[0].createPrintJob();
Doc document = new SimpleDoc(pdfBytes, flavor, null)
printJob.print(document, null);
}
else
{
System.out.println("No PDF printer available.");
}
This works for me:
public void print() {
DocFlavor flavor = DocFlavor.INPUT_STREAM.AUTOSENSE;
PrintService[] services = PrintServiceLookup.lookupPrintServices(flavor, null);
FileInputStream psStream = null;
try {
psStream = new FileInputStream("c:\\test.pdf");
} catch (FileNotFoundException ffne) {
ffne.printStackTrace();
}
if (psStream == null) {
return;
}
if (services.length > 0)
{
PrintService myService = null;
for(PrintService service : services) {
System.out.println(service.getName());
if(service.getName().contains("my printer")) {
myService = service;
break;
}
}
DocPrintJob printJob = myService.createPrintJob();
Doc document = new SimpleDoc(psStream, flavor, null);
try {
printJob.print(document, null);
} catch (PrintException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
else
{
System.out.println("No PDF printer available.");
}
}
Have a look at www.pdflib.com. Its comercial but PDFlib Lite is available for free for open source projects. It has bindings for java.
There is an example using JPedal at http://www.jpedal.org/support_egSP.php
You will need the commercial version of IcePdf if you want full font support.
I have experience with making Acrobat (Reader or Full) do the printing, but it's anything but silent (it is unattended, though - just depends on how 'silent' the silent requirement is). If there's interest, I can shoot you the native code that makes the required DDE calls.
iText is intended for creating PDF files (per a post I saw from the author), and thus probably isn't what you want.
I've used Qoppa's jPDFPrint quite successfully for exactly this purpose, but it's not cheap. If you can afford it, it's the most robust solution I've found thus far. I've also been very impressed with the level of support; they even generated some custom sample code for me.
I tried PDFBox, but found that it doesn't support the "Shrink to printable area" page scaling that you get with Acrobat. Not everyone will care about this feature, but it's essential for me.

Categories