I am extending my RESTful service to support downloading documents. For that I am exploring the PDFBox library, which can work with PDF documents, and it works fine when I have to create and save documents.
My example code looks like this:
PDDocument doc = null;
PDPage page = null;
try{
doc = new PDDocument();
page = new PDPage();
doc.addPage(page);
PDFont font = PDType1Font.HELVETICA;
PDPageContentStream content = new PDPageContentStream(doc,page);
content.beginText();
content.drawString("Some Content Received At Runtime");
content.endText();
content.close();
doc.save("SomeName.pdf");
doc.close();
// Now load and return the stream
return PDDocument.load("SomeName.pdf").getDocument().createCOSStream().getFilteredStream();
} catch (Exception e) {
// Do nothing for now
}
As the example above shows, I can convert the document into an InputStream, but to do so I first have to save the document and then reload it. This is not desirable because it will clutter the server I am running on with junk files.
What I really want is to achieve this without saving the document. Is that possible? Should I be looking at another library such as iText? If you know of an example, please share it.
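One way to avoid the temporary file is to write into an in-memory buffer and wrap the resulting bytes in an InputStream. The following is only a minimal sketch using the standard library: the class InMemoryDocument and the callback interface DocumentWriter are hypothetical names introduced for illustration, not part of PDFBox.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

final class InMemoryDocument {
    private InMemoryDocument() {}

    /** Hypothetical callback: anything that can serialize itself to a stream. */
    interface DocumentWriter {
        void writeTo(OutputStream out) throws Exception;
    }

    /** Serializes into a byte buffer and returns a stream over those bytes. */
    static InputStream toInputStream(DocumentWriter writer) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writer.writeTo(buffer);
        // Nothing touches the file system; the bytes live only in memory.
        return new ByteArrayInputStream(buffer.toByteArray());
    }
}
```

With PDFBox the callback would presumably be out -> doc.save(out), since PDDocument has a save method that accepts an OutputStream, followed by doc.close(). The whole document is held in memory, so this suits small to medium files better than very large ones.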
Related
I'm trying to add a TIFF image (CCITT Group 3) to a PDF using Java and PDFBox 1.8.10. An image appears in the output file, but it is displayed incorrectly: it is only some black and white pixels.
String outputPath = "/tmp/PDFImage.pdf";
String imagePath = "/tmp/header.tif";
PDDocument doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
PDPageContentStream content = new PDPageContentStream(doc, page);
PDXObjectImage ximage = new PDCcitt(doc, new RandomAccessFile(new File(imagePath), "r"));
content.drawImage(ximage, 0, 500);
content.close();
doc.save(outputPath);
doc.close();
The PDFBox dependencies page says: To write TIFF images a JAI ImageIO Core library will be needed.
I imported the library and scanned for plugins, but I did not find an example of how exactly to use it. Can someone help?
What I am trying to do here is create text and place it onto a blank page. That page would then be overlaid onto another document, and the result would be saved as one document. In 1.8 I was able to create a blank PDPage in a PDF, write text to it as needed, overlay that PDF with another, and then save it or view it on screen using the code below:
overlayDoc = new PDDocument();
page = new PDPage();
overlayDoc.addPage(page);
overlayObj = new Overlay();
font = PDType1Font.COURIER_OBLIQUE;
try {
contentStream = new PDPageContentStream(overlayDoc, page);
contentStream.setFont(font, 10);
}
catch (Exception e){
System.out.println("content stream failed");
}
After I created the stream, when I needed to write something to the overlay document's contentStream, I would call this method, give it my x, y coords and tell it what text to write (again, this is in my 1.8 version):
protected void writeString(int x, int y, String text) {
if (text == null) return;
try {
contentStream.moveTo(x, y);
contentStream.beginText();
contentStream.drawString(text); // deprecated. Use showText(String text)
contentStream.endText();
}
catch (Exception e){
System.out.println(text + " failed. " + e.toString());
}
}
I would call this method whenever and wherever I needed to add text. After this, I would close my content stream and then merge the documents together like this:
import org.apache.pdfbox.Overlay;
Overlay overlayObj = new Overlay();
....
PDDocument finalDoc = overlayObj.overlay(overlayDoc, originalDoc);
finalDoc now contains a PDDocument which is my original PDF with text overlaid where needed. I could save it and view it as a BufferedImage on the desktop. I moved to 2.0 because I needed to stay current with the most recent library and because I was having issues putting an image onto the page (see here).
The issue in this question is that 2.0 no longer has anything like the org.apache.pdfbox.Overlay class. To confuse me even more, there are two Overlay classes in 1.8 (org.apache.pdfbox.Overlay and org.apache.pdfbox.util.Overlay), whereas 2.0 has only one. The class I need (org.apache.pdfbox.Overlay), or at least the methods it offers, is not present in 2.0 as far as I can tell. I can only find org.apache.pdfbox.multipdf.Overlay.
Here's some quick code that works; it adds "deprecated" over a document and saves the result elsewhere:
PDDocument overlayDoc = new PDDocument();
PDPage page = new PDPage();
overlayDoc.addPage(page);
Overlay overlayObj = new Overlay();
PDFont font = PDType1Font.COURIER_OBLIQUE;
PDPageContentStream contentStream = new PDPageContentStream(overlayDoc, page);
contentStream.setFont(font, 50);
contentStream.setNonStrokingColor(0);
contentStream.beginText();
contentStream.moveTextPositionByAmount(200, 200);
contentStream.drawString("deprecated"); // deprecated. Use showText(String text)
contentStream.endText();
contentStream.close();
PDDocument originalDoc = PDDocument.load(new File("...inputfile.pdf"));
overlayObj.setOverlayPosition(Overlay.Position.FOREGROUND);
overlayObj.setInputPDF(originalDoc);
overlayObj.setAllPagesOverlayPDF(overlayDoc);
Map<Integer, String> ovmap = new HashMap<Integer, String>(); // empty map is a dummy
overlayObj.setOutputFile("... result-with-overlay.pdf");
overlayObj.overlay(ovmap);
overlayDoc.close();
originalDoc.close();
What I did in addition to your version:
declare variables
close the content stream
set a color
set to foreground
set a text position (not a stroke path position)
add an empty map
And of course, I read the OverlayPDF source code, which shows more of what you can do with the class.
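The point about setting a text position rather than a stroke path position is easier to see at the level of raw PDF content-stream operators: a path move is written as "x y m", while a text move is written as "x y Td" between BT and ET. The sketch below builds those operator strings with plain Java. It is an illustration only; the class TextOperators and the font resource name F1 are assumptions, and PDFBox generates these operators for you.

```java
final class TextOperators {
    private TextOperators() {}

    /** Escapes the characters that are special inside a PDF literal string. */
    static String escapeLiteral(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == '(' || c == ')' || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Operators that select a font, move the text position and show text. */
    static String textAt(String fontResource, int size, int x, int y, String text) {
        return "BT\n"
            + "/" + fontResource + " " + size + " Tf\n"
            + x + " " + y + " Td\n"
            + "(" + escapeLiteral(text) + ") Tj\n"
            + "ET\n";
    }

    /** Path construction operator: starts a new subpath, does not affect text. */
    static String pathMoveTo(int x, int y) {
        return x + " " + y + " m\n";
    }
}
```

Calling pathMoveTo before BT only starts a graphics path, so text shown afterwards still begins at the default text origin. That matches the behaviour seen when moveTo is used in place of a text-positioning call.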
Bonus content:
Do the same without using the Overlay class, which allows further manipulation of the document before saving it.
PDFont font = PDType1Font.COURIER_OBLIQUE;
PDDocument originalDoc = PDDocument.load(new File("...inputfile.pdf"));
PDPage page1 = originalDoc.getPage(0);
PDPageContentStream contentStream = new PDPageContentStream(originalDoc, page1, true, true, true);
contentStream.setFont(font, 50);
contentStream.setNonStrokingColor(0);
contentStream.beginText();
contentStream.moveTextPositionByAmount(200, 200);
contentStream.drawString("deprecated"); // deprecated. Use showText(String text)
contentStream.endText();
contentStream.close();
originalDoc.save("....result2.pdf");
originalDoc.close();
I have multiple PDFs that get populated with multiple records (a.pdf,b.pdf,c[0-9].pdf,d[0-9].pdf,ez.pdf) using acroforms and pdfbox.
The resulting files (aflat.pdf,bflat.pdf,c[0-9]flat.pdf,d[0-9]flat.pdf,ezflat.pdf) should have their forms (dictionaries and whatever else Adobe uses) removed, but with the field values kept as raw text on the PDF (setReadOnly is not what I want!).
PdfStamper can only remove fields without saving their content, but I've found some references to PdfContentByte as a way to save the content. Alas, the documentation is too brief for me to understand how I should do this.
As a last resort I could use FieldPosition to write directly on the PDF. Has anyone ever encountered such a problem? How do I solve it?
UPDATE: Saving a single page of b.pdf yields a valid bfilled.pdf but a blank bflattened.pdf. Saving the whole document solved the issue.
populateB();
try (PDDocument doc = new PDDocument(); FileOutputStream stream = new FileOutputStream("bfilled.pdf")) {
//importing the page will corrupt the fields
/*wrong approach*/doc.importPage((PDPage)pdfDocuments.get(0).getDocumentCatalog().getAllPages().get(0));
/*wrong approach*/doc.save(stream);
//save the whole document instead
pdfDocuments.get(0).save(stream);//<---right approach
}
try (FileOutputStream stream = new FileOutputStream("bflattened.pdf")) {
PdfStamper stamper = new PdfStamper(new PdfReader("bfilled.pdf"), stream);
stamper.setFormFlattening(true);
stamper.close();
}
Use PdfStamper.setFormFlattening(true) to get rid of the fields and write them as content.
Always save the whole document, not a single imported page, when working with acroforms:
populateB();
try (PDDocument doc = new PDDocument(); FileOutputStream stream = new FileOutputStream("bfilled.pdf")) {
// wrong approach: importing a single page will corrupt the fields
// doc.importPage((PDPage) pdfDocuments.get(0).getDocumentCatalog().getAllPages().get(0));
// doc.save(stream);
// right approach: save the whole document instead
pdfDocuments.get(0).save(stream);
}
try (FileOutputStream stream = new FileOutputStream("bflattened.pdf")) {
PdfStamper stamper = new PdfStamper(new PdfReader("bfilled.pdf"), stream);
stamper.setFormFlattening(true);
stamper.close();
}
I am trying to create a PDF file with PDFBox and then create an image from it using the commercial library jPDFImages (Qoppa Software). Yes, I know that PDFBox can also create images from PDFs, but for various reasons I need to use the commercial library.
I created the PDF file and passed it to jPDFImages, but I get an error: "Unable to find PDF trailer". Qoppa Software describes this error.
The problem seems to be in the PDF trailer created by PDFBox, but I don't understand how to set it up correctly. (I have this problem only with PDFs created with PDFBox.)
Here is my code for creating the PDF:
public void createPDFFromImage( String file) throws Exception {
PDDocument doc = null;
try {
doc = new PDDocument();
BufferedImage bufferedImage = ImageIO.read(new File("/home/.../files/test.png"));
PDPage page = new PDPage();
doc.addPage( page );
PDJpeg ximage = new PDJpeg(doc,bufferedImage, (float) 0.95);
PDPageContentStream contentStream = new PDPageContentStream(doc, page);
contentStream.drawXObject(ximage, x, y, W, H);
contentStream.close();
doc.save(file);
} finally {
if( doc != null ) {
doc.close();
}
}
}
Here is the error from the commercial library:
java.lang.RuntimeException: com.qoppa.pdf.PDFException: Unable to find PDF trailer.
Caused by: com.qoppa.pdf.PDFException: Unable to find PDF trailer.
I think the problem is in how I create the PDF. Maybe I need to add some information to the PDF to make it valid?
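A quick way to narrow this down is to check whether the saved file actually ends with the markers a PDF reader looks for: a startxref keyword followed by %%EOF near the end of the file. The sketch below does this with the standard library only. The class name PdfTailCheck and the 1024-byte window are assumptions for illustration, and it is not known whether the Qoppa library performs the same test.

```java
import java.nio.charset.StandardCharsets;

final class PdfTailCheck {
    private PdfTailCheck() {}

    /**
     * Returns true if the last bytes of the file contain "startxref"
     * followed later by "%%EOF".
     */
    static boolean hasEndOfFileMarkers(byte[] pdfBytes) {
        // Only the tail of the file is inspected (window size is an assumption).
        int window = Math.min(pdfBytes.length, 1024);
        String tail = new String(pdfBytes, pdfBytes.length - window, window,
                StandardCharsets.ISO_8859_1);
        int startxref = tail.lastIndexOf("startxref");
        int eof = tail.lastIndexOf("%%EOF");
        return startxref >= 0 && eof > startxref;
    }
}
```

It could be called as PdfTailCheck.hasEndOfFileMarkers(Files.readAllBytes(Paths.get(file))) right after doc.save(file). If it returns false, the file was probably truncated or read before saving finished. If it returns true, the cause lies elsewhere.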
I'm writing a java app that creates a pdf from scratch using the pdfbox library.
I need to place a JPG image on one of the pages.
I'm using this code:
PDDocument document = new PDDocument();
PDPage page = new PDPage(PDPage.PAGE_SIZE_A4);
document.addPage(page);
PDPageContentStream contentStream = new PDPageContentStream(document, page);
/* ... */
/* code to add some text to the page */
/* ... */
InputStream in = new FileInputStream(new File("c:/myimg.jpg"));
PDJpeg img = new PDJpeg(document, in);
contentStream.drawImage(img, 100, 700);
contentStream.close();
document.save("c:/mydoc.pdf");
When I run the code, it terminates successfully, but if I open the generated pdf file using Acrobat Reader, the page is completely white and the image is not placed in it.
The text instead is correctly placed in the page.
Any hint on how to put my image in the pdf?
Definitely add the page to the document. You'll want to do that, but I've also noticed that PDFBox won't write out the image if you create the PDPageContentStream BEFORE the PDJpeg. I can't explain why this is so, but if you look closely at the source of ImageToPDF, that's what it does. Create the PDPageContentStream after the PDJpeg and it magically works.
...
PDJpeg img = new PDJpeg(document, in);
PDPageContentStream stream = new PDPageContentStream(document, page);
...
Looks like you're missing just a document.addPage(page) call.
See also the ImageToPDF example class in PDFBox for some sample code.
This is what the default constructor for PDPageContentStream looks like:
public PDPageContentStream(PDDocument document, PDPage sourcePage) throws IOException
{
this(document, sourcePage, AppendMode.OVERWRITE, true, false);
}
The problem is AppendMode.OVERWRITE. For me, using another constructor with the parameter PDPageContentStream.AppendMode.APPEND resolved the problem.
For me this worked:
PDPageContentStream contentStream =
new PDPageContentStream(document, page, PDPageContentStream.AppendMode.APPEND, true, false);