I am using Java PDFBox version 2.0. I want to know how to add a background image to a PDF. I cannot find a good example on pdfbox.apache.org.
Do this for each page, i.e. for page indices from 0 to doc.getNumberOfPages() - 1:
PDPage pdPage = doc.getPage(page);
InputStream oldContentStream = pdPage.getContents();
byte[] ba = IOUtils.toByteArray(oldContentStream);
oldContentStream.close();
// brings a warning because a content stream already exists
PDPageContentStream newContentStream = new PDPageContentStream(doc, pdPage, false, true);
// createFromFile is the easiest way with an image file
// if you already have the image in a BufferedImage,
// call LosslessFactory.createFromImage() instead
PDImageXObject pdImage = PDImageXObject.createFromFile(imagePath, doc);
newContentStream.saveGraphicsState();
newContentStream.drawImage(pdImage, 0, 0);
newContentStream.restoreGraphicsState();
newContentStream.close();
// append the saved existing content stream
PDPageContentStream newContentStream2 = new PDPageContentStream(doc, pdPage, true, true);
newContentStream2.appendRawCommands(ba); // deprecated... needs to be rediscussed among devs
newContentStream2.close();
There is another way to do it, which is more painful IMHO: get an iterator of PDStream objects from the page with getContentStreams(), build a List, insert the new stream at the beginning, and reassign this PDStream list to the page with setContents(). I can add this as an alternative solution if needed.
Call PDPageContentStream.drawImage:
val document = PDDocument()
val page = PDPage()
document.addPage(page)
val contentStream = PDPageContentStream(document, page)
val imageBytes = this::class.java.getResourceAsStream("/image.jpg").readAllBytes()
val image = PDImageXObject.createFromByteArray(document, imageBytes, "background")
contentStream.drawImage(image, 0f, 0f, page.mediaBox.width, page.mediaBox.height)
contentStream.close()
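// PDPage has no close() method; save and close the document instead
document.save("output.pdf") // placeholder output path
document.close()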
This worked best for me... (Please note the use of AppendMode.PREPEND)
InputStream is = getClass().getResourceAsStream("/yourImageFileNameWithExtension");
PDImageXObject pdImageXObject = PDImageXObject.createFromByteArray(document, is.readAllBytes(), "");
for (int i = 0; i < document.getNumberOfPages(); i++) {
PDPage page = document.getPage(i);
PDPageContentStream cos = new PDPageContentStream(document, page, AppendMode.PREPEND, true);
cos.drawImage(pdImageXObject, 0, 0, page.getMediaBox().getWidth(), page.getMediaBox().getHeight());
cos.close();
}
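The loop above stretches the image to the full media box, which distorts it when the image and the page have different aspect ratios. A minimal sketch of the arithmetic for scaling the image uniformly so that it covers the page follows. The class and method names (BackgroundFit, coverRect) are hypothetical and not part of PDFBox; the resulting x, y, width and height would be passed to drawImage in place of the media box values.

```java
class BackgroundFit {
    /**
     * Returns {x, y, width, height} in page units for an image scaled
     * uniformly so that it covers the whole page, centred on the page.
     */
    static float[] coverRect(float imgW, float imgH, float pageW, float pageH) {
        // Use the larger of the two ratios so that no page area is left uncovered.
        float scale = Math.max(pageW / imgW, pageH / imgH);
        float w = imgW * scale;
        float h = imgH * scale;
        // A negative offset means the overflow is cropped equally on both sides.
        return new float[] { (pageW - w) / 2f, (pageH - h) / 2f, w, h };
    }

    public static void main(String[] args) {
        // Hypothetical sizes: a 1000 x 500 image on a 600 x 800 page.
        float[] r = coverRect(1000f, 500f, 600f, 800f);
        System.out.println("x=" + r[0] + " y=" + r[1] + " w=" + r[2] + " h=" + r[3]);
    }
}
```

Replacing Math.max with Math.min would instead fit the whole image inside the page and leave margins.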
I need to convert a scanned PDF to a grayscale PDF. I found two solutions for that.
The first one is to render each page with renderImageWithDPI:
private void convertToGray() throws IOException {
File pdfFile = new File(PATH);
try (PDDocument originalPdf = PDDocument.load(pdfFile);
PDDocument doc = new PDDocument()) {
LOGGER.info("Current heap after loading file: {}", Runtime.getRuntime().totalMemory());
PDFRenderer pdfRenderer = new PDFRenderer(originalPdf);
for (int pageNum = 0; pageNum < originalPdf.getNumberOfPages(); pageNum++) {
// PDImageXObject pdImage = LosslessFactory.createFromImage(doc, bufferedImage);
BufferedImage grayImage = pdfRenderer.renderImageWithDPI(pageNum, 300F, ImageType.GRAY);
PDImageXObject pdImage = JPEGFactory.createFromImage(doc, grayImage);
float pageWidth = originalPdf.getPage(pageNum).getMediaBox().getWidth();
float pageHeight = originalPdf.getPage(pageNum).getMediaBox().getHeight();
PDPage page = new PDPage(new PDRectangle(pageWidth, pageHeight));
doc.addPage(page);
try (PDPageContentStream contentStream = new PDPageContentStream(doc, page)) {
contentStream.drawImage(pdImage, 0F, 0F, pageWidth, pageHeight);
}
}
doc.save(NEW_PATH);
}
}
But this increases the file size, because some PDFs were scanned at less than 300 DPI.
The second one is to replace each existing image with a grayscale equivalent:
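The size growth follows from simple arithmetic: PDF page dimensions are given in points (1/72 inch), so rendering at a given DPI produces roughly points / 72 × DPI pixels per side. A small sketch, with hypothetical class and method names, comparing the pixel count at 300 DPI with that of a scan which, purely as an assumption for illustration, was originally made at 150 DPI:

```java
class RenderSize {
    /** Pixels along one side when a length in PDF points is rendered at the given DPI. */
    static int pixels(float points, float dpi) {
        // One PDF point is 1/72 inch.
        return Math.round(points / 72f * dpi);
    }

    /** Total number of pixels for a page of the given size rendered at the given DPI. */
    static long pixelCount(float widthPt, float heightPt, float dpi) {
        return (long) pixels(widthPt, dpi) * pixels(heightPt, dpi);
    }

    public static void main(String[] args) {
        float w = 612f, h = 792f;            // US Letter in points (8.5 x 11 inches)
        long at150 = pixelCount(w, h, 150f); // assumed original scan resolution
        long at300 = pixelCount(w, h, 300f); // resolution passed to renderImageWithDPI
        System.out.println(at150 + " px at 150 DPI, " + at300 + " px at 300 DPI");
    }
}
```

Doubling the resolution quadruples the number of pixels to encode, so rendering at the scan's own resolution, where it is known, is one way to avoid the growth.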
private void convertByImageToGray() throws IOException {
File pdfFile = new File(PATH);
try (PDDocument document = PDDocument.load(pdfFile)) {
List<COSObject> objects = document.getDocument().getObjectsByType(COSName.IMAGE);
for (COSObject object : objects) {
LOGGER.info("Class: {}; {}", object.getClass(), object.toString());
}
for (int pageNum = 0; pageNum < document.getNumberOfPages(); pageNum++) {
PDPage page = document.getPage(pageNum);
replaceImage(document, page);
}
document.save(NEW_PATH);
}
}
private void replaceImage(PDDocument document, PDPage page) throws IOException {
PDResources resources = page.getResources();
Iterable<COSName> xObjectNames = resources.getXObjectNames();
if (xObjectNames != null) {
for (COSName xObjectName : xObjectNames) {
PDXObject object = resources.getXObject(xObjectName);
if (object instanceof PDImageXObject) {
PDImageXObject img1 = (PDImageXObject) object;
BufferedImage bufferedImage1 = img1.getImage();
BufferedImage grayBufferedImage = convertBufferedImageToGray(bufferedImage1);
// PDImageXObject grayImage = JPEGFactory.createFromImage(document, grayBufferedImage);
PDImageXObject grayImage = LosslessFactory.createFromImage(document, grayBufferedImage);
resources.put(xObjectName, grayImage);
}
}
}
}
private static BufferedImage convertBufferedImageToGray(BufferedImage sourceImg) {
ColorSpace cs = ColorSpace.getInstance(ColorSpace.CS_GRAY);
ColorConvertOp op = new ColorConvertOp(sourceImg.getColorModel().getColorSpace(), cs, null);
op.filter(sourceImg, sourceImg);
return sourceImg;
}
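One thing worth checking in convertBufferedImageToGray is that it filters the image into itself and returns the same object, so an RGB image keeps its three-channel color model and its pixels merely look gray. A minimal sketch, assuming a single-channel result is wanted, that filters into a new TYPE_BYTE_GRAY image instead. The class name GrayConverter is hypothetical, and whether this reduces the saved file size depends on how the image is encoded afterwards.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ColorConvertOp;

class GrayConverter {
    /** Returns a new single-channel 8-bit grayscale copy of the source image. */
    static BufferedImage toGray(BufferedImage source) {
        BufferedImage gray = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        // Without explicit color spaces, the op converts from the source image's
        // color space to the destination image's color space.
        ColorConvertOp op = new ColorConvertOp(null);
        op.filter(source, gray);
        return gray;
    }

    public static void main(String[] args) {
        BufferedImage rgb = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage gray = toGray(rgb);
        System.out.println("bands: " + gray.getRaster().getNumBands());
    }
}
```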
But some files still grow to about 3 times their original size (even when they were already grayscale; interestingly, in this case JPEGFactory produces larger files than LosslessFactory). All images in the grayscale PDF have the same size as the original ones, and I don't understand why.
Maybe there is a better way to make a grayscale PDF with a predictable size (other than Ghostscript)?
UPDATE: I've just realized that the issue lies in creating a PDF from an image: the image is not compressed as well.
For example, I have a dummy 1-page scan file that is less than 1 MB. But if I extract the image from it (by copying it directly from Acrobat Reader to Paint, or via the code above), its size is ~8-10 MB depending on the method. And if I create a new PDF from this image, it is barely compressed. Here is example code:
File pdfFile = new File(FULL_FILE);
try (PDDocument document = PDDocument.load(pdfFile)) {
PDPage page = new PDPage();
document.addPage(page);
PDImageXObject pdImage = PDImageXObject.createFromFile("example.png", document);
try (PDPageContentStream contents = new PDPageContentStream(document, page)) {
contents.drawImage(pdImage, 0F, 0F);
}
document.save(FULL_FILE_NEW);
}
Yes, LosslessFactory produces smaller files than JPEGFactory.
The link below lists different methods to achieve the same goal. Overall, the best-quality grayscale image was the one from Option 6, although it was by no means the fastest (I used Option 4 myself). Comparisons are also provided to help you choose.
This link contains possible ways to convert color images to black. It helped me a lot.
Let me know if it works for you and approve my answer if it helped.
I'm writing a simple scanning application using jfreesane and Apache PDFBox.
Here is the scanning code:
InetAddress address = InetAddress.getByName("192.168.0.17");
SaneSession session = SaneSession.withRemoteSane(address);
List<SaneDevice> devices = session.listDevices();
SaneDevice device = devices.get(0);
device.open();
device.getOption("resolution").setIntegerValue(300);
BufferedImage bimg = device.acquireImage();
File file = new File("test_scan.png");
ImageIO.write(bimg, "png", file);
device.close();
And creating the PDF:
PDDocument document = new PDDocument();
float width = bimg.getWidth();
float height = bimg.getHeight();
PDPage page = new PDPage(new PDRectangle(width, height));
document.addPage(page);
PDImageXObject pdimg = LosslessFactory.createFromImage(document, bimg);
PDPageContentStream stream = new PDPageContentStream(document, page, PDPageContentStream.AppendMode.APPEND, true);
stream.drawImage(pdimg, 0, 0);
stream.close();
document.save(filename);
document.close();
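A side note on the snippet above: the page is given the pixel dimensions of the scan, which are interpreted as points (1/72 inch), so a 300 DPI scan yields a page much larger than the paper. This does not affect the colors, but if the page should match the physical size, the conversion could be sketched as follows. The class and method names are hypothetical, 300 is the resolution set on the scanner above, and the A4 pixel dimensions are illustrative.

```java
class ScanPageSize {
    /** Converts a pixel length at the given scan resolution to PDF points (1/72 inch). */
    static float toPoints(int pixels, float dpi) {
        return pixels * 72f / dpi;
    }

    public static void main(String[] args) {
        // Example: an A4 sheet scanned at 300 DPI is about 2480 x 3508 pixels.
        float widthPt = toPoints(2480, 300f);
        float heightPt = toPoints(3508, 300f);
        System.out.println(widthPt + " x " + heightPt + " pt");
    }
}
```

The resulting values would go into new PDRectangle(widthPt, heightPt), with the image drawn via drawImage(pdimg, 0, 0, widthPt, heightPt).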
And here is the result:
As you can see, the PDF image is more "pale" (less saturated? Sorry, I'm not good at color theory and don't know the correct term).
What I have found out:
Displaying the BufferedImage in a JLabel via the JLabel(new ImageIcon(bimg)) constructor produces the same result as the PDF ("pale" colors), so I guess PDFBox is not the cause.
Changing the scanning resolution has no effect.
bimg.getTransparency() returns 1 (OPAQUE).
bimg.getType() returns 0 (TYPE_CUSTOM).
PNG file:
http://s000.tinyupload.com/index.php?file_id=95648202713651192395
PDF file
http://s000.tinyupload.com/index.php?file_id=90369236997064329368
There was an issue with color spaces in JFreeSane; it was fixed in version 0.97:
https://github.com/sjamesr/jfreesane/releases/tag/jfreesane-0.97
What I am trying to do here is create text and place it onto a blank page. That page would then be overlaid onto another document, and the result would be saved as one document. In 1.8 I was able to create a blank PDPage in a PDF, write text to it as needed, overlay that PDF onto another, and then save it or view it on screen using the code below:
overlayDoc = new PDDocument();
page = new PDPage();
overlayDoc.addPage(page);
overlayObj = new Overlay();
font = PDType1Font.COURIER_OBLIQUE;
try {
contentStream = new PDPageContentStream(overlayDoc, page);
contentStream.setFont(font, 10);
}
catch (Exception e){
System.out.println("content stream failed");
}
After I created the stream, when I needed to write something to the overlay document's contentStream, I would call this method, give it my x, y coords and tell it what text to write (again, this is in my 1.8 version):
protected void writeString(int x, int y, String text) {
if (text == null) return;
try {
contentStream.moveTo(x, y);
contentStream.beginText();
contentStream.drawString(text); // deprecated. Use showText(String text)
contentStream.endText();
}
catch (Exception e){
System.out.println(text + " failed. " + e.toString());
}
}
I would call this method whenever I needed to add text and to wherever I needed to do so. After this, I would close my content stream and then merge the documents together as such:
import org.apache.pdfbox.Overlay;
Overlay overlayObj = new Overlay();
....
PDDocument finalDoc = overlayObj.overlay(overlayDoc, originalDoc);
finalDoc now contains a PDDocument which is my original PDF with text overlaid where needed. I could save it and view it as a BufferedImage on the desktop. I moved to 2.0 because I needed to stay on top of the most recent library and because I was having issues putting an image onto the page (see here).
The issue I am having in this question is that 2.0 no longer has anything similar to the org.apache.pdfbox.Overlay class. What confuses me even more is that there are two Overlay classes in 1.8 (org.apache.pdfbox.Overlay and org.apache.pdfbox.util.Overlay), whereas in 2.0 there is only one. The class I need (org.apache.pdfbox.Overlay), or at least the methods it offers, is not present in 2.0 as far as I can tell. I can only find org.apache.pdfbox.multipdf.Overlay.
Here's some quick code that works; it adds "deprecated" over a document and saves the result elsewhere:
PDDocument overlayDoc = new PDDocument();
PDPage page = new PDPage();
overlayDoc.addPage(page);
Overlay overlayObj = new Overlay();
PDFont font = PDType1Font.COURIER_OBLIQUE;
PDPageContentStream contentStream = new PDPageContentStream(overlayDoc, page);
contentStream.setFont(font, 50);
contentStream.setNonStrokingColor(0);
contentStream.beginText();
contentStream.moveTextPositionByAmount(200, 200);
contentStream.drawString("deprecated"); // deprecated. Use showText(String text)
contentStream.endText();
contentStream.close();
PDDocument originalDoc = PDDocument.load(new File("...inputfile.pdf"));
overlayObj.setOverlayPosition(Overlay.Position.FOREGROUND);
overlayObj.setInputPDF(originalDoc);
overlayObj.setAllPagesOverlayPDF(overlayDoc);
Map<Integer, String> ovmap = new HashMap<Integer, String>(); // empty map is a dummy
overlayObj.setOutputFile("... result-with-overlay.pdf");
overlayObj.overlay(ovmap);
overlayDoc.close();
originalDoc.close();
What I changed compared to your version:
declare variables
close the content stream
set a color
set to foreground
set a text position (not a stroke path position)
add an empty map
And of course, I read the OverlayPDF source code, which shows more of what you can do with the class.
Bonus content:
Do the same without using the Overlay class, which allows further manipulation of the document before saving it.
PDFont font = PDType1Font.COURIER_OBLIQUE;
PDDocument originalDoc = PDDocument.load(new File("...inputfile.pdf"));
PDPage page1 = originalDoc.getPage(0);
PDPageContentStream contentStream = new PDPageContentStream(originalDoc, page1, true, true, true);
contentStream.setFont(font, 50);
contentStream.setNonStrokingColor(0);
contentStream.beginText();
contentStream.moveTextPositionByAmount(200, 200);
contentStream.drawString("deprecated"); // deprecated. Use showText(String text)
contentStream.endText();
contentStream.close();
originalDoc.save("....result2.pdf");
originalDoc.close();
I'm using PDFBox on Google App Engine, with a modified version that is compatible with App Engine (https://stackoverflow.com/a/12342272/2459131).
But I'm not able to add an image to a PDF, because javax.imageio.ImageIO is a restricted class.
I need to make this work:
BufferedImage awtImage = ImageIO.read(new File(image));
I found two links that may help, but I don't know how to use them:
https://github.com/pascalleclercq/appengine-awt/releases/tag/appengine-awt-1.0.0
http://mvnrepository.com/artifact/fr.opensagres.xdocreport.appengine-awt/appengine-awt/1.0.0
--> I imported this into my project, but com.google.code.appengine.awt.image.BufferedImage cannot be used with PDFBox's methods
Edit:
PDDocument document = new PDDocument();
PDPage tmp = (PDPage) PDDocument.load("WEB-INF/pdfs/TemplateFactureEmpty3.pdf").getDocumentCatalog().getAllPages().get(0);
document.addPage(tmp);
PDPageContentStream content = new PDPageContentStream(document, tmp, true, true);
// choice 1, problem: calls ImageIO, which is restricted
InputStream in = new FileInputStream(new File("WEB-INF/CokeLogo1.png"));
PDJpeg ximage = new PDJpeg(document, in);
// choice 2, problem: BufferedImage and ImageIO are restricted
BufferedImage awtImage = ImageIO.read(new File("WEB-INF/CokeLogo1.png") );
PDXObjectImage ximage = new PDPixelMap(document, awtImage);
float scale = 1f; // alter this value to set the image size
content.drawXObject(ximage, 56, 800, ximage.getWidth() * scale,ximage.getHeight() * scale);
Thanks,
Francois
I'm writing a Java app that creates a PDF from scratch using the PDFBox library.
I need to place a JPG image on one of the pages.
I'm using this code:
PDDocument document = new PDDocument();
PDPage page = new PDPage(PDPage.PAGE_SIZE_A4);
document.addPage(page);
PDPageContentStream contentStream = new PDPageContentStream(document, page);
/* ... */
/* code to add some text to the page */
/* ... */
InputStream in = new FileInputStream(new File("c:/myimg.jpg"));
PDJpeg img = new PDJpeg(document, in);
contentStream.drawImage(img, 100, 700);
contentStream.close();
document.save("c:/mydoc.pdf");
When I run the code, it terminates successfully, but if I open the generated PDF file in Acrobat Reader, the page is completely white and the image is not on it.
The text, however, is placed correctly on the page.
Any hint on how to put my image in the pdf?
Definitely add the page to the document; you'll want to do that. I've also noticed that PDFBox won't write out the image if you create the PDPageContentStream BEFORE the PDJpeg. Why this is so is unexplained, but if you look closely at the source of ImageToPDF, that's the order it uses. Create the PDPageContentStream after the PDJpeg and it magically works.
...
PDJpeg img = new PDJpeg(document, in);
PDPageContentStream stream = new PDPageContentStream(document, page);
...
Looks like you're missing just a document.addPage(page) call.
See also the ImageToPDF example class in PDFBox for some sample code.
This is what the two-argument constructor of PDPageContentStream looks like:
public PDPageContentStream(PDDocument document, PDPage sourcePage) throws IOException
{
this(document, sourcePage, AppendMode.OVERWRITE, true, false);
}
The problem is AppendMode.OVERWRITE. For me, using another constructor with the parameter PDPageContentStream.AppendMode.APPEND resolved the problem.
For me this worked:
PDPageContentStream contentStream =
new PDPageContentStream(document, page, PDPageContentStream.AppendMode.APPEND, true, false);