How to copy one page of a PdfDocument to another PdfDocument - java

I have a question about PdfDocument objects. How can I copy a specific page of one PdfDocument object to another PdfDocument object?
I've tried the moveTo method, but it didn't work. I've also tried the copyPagesTo method, but it throws a "Requested page is out of bounds" exception (when I try to copy, for example, only one page, from 1 to 1).
Any hints?
List<PdfDocument> pdfDocuments = new ArrayList<>();
PdfDocument pdfWithMultiplePages = here I have a PDF with 3 pages.
for (int i = 0; i < pdfWithMultiplePages.getNumberOfPages(); i++) {
    final ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    final PdfWriter pdfWriter = new PdfWriter(byteArrayOutputStream);
    PdfDocument pdfDocument = new PdfDocument(pdfWriter);
    pdfDocument.copyPagesTo(i+1, i+1, pdfWithMultiplePages);
    pdfWriter.close();
    byteArrayOutputStream.close();
    shippingLabels.add(shippingLabelSplitted);
}
I've also tried this:
List<PdfDocument> pdfDocuments = new ArrayList<>();
PdfDocument pdfWithMultiplePages = here I have a PDF with 3 pages.
for (int i = 0; i < pdfWithMultiplePages.getNumberOfPages(); i++) {
    final ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    final PdfWriter pdfWriter = new PdfWriter(byteArrayOutputStream);
    PdfDocument pdfDocument = new PdfDocument(pdfWriter);
    pdfDocument.addPage(1, pdfWithMultiplePages.getPage(i+1));
    pdfWriter.close();
    byteArrayOutputStream.close();
    shippingLabels.add(shippingLabelSplitted);
}
But it throws:
com.itextpdf.kernel.PdfException: Page com.itextpdf.kernel.pdf.PdfPage#6576eb4b cannot be added to document com.itextpdf.kernel.pdf.PdfDocument#286ef136, because it belongs to document com.itextpdf.kernel.pdf.PdfDocument#2c74aa66.

A page in a PDF has many relations to other objects in a PDF.
If you could add a page located in one document to another one, the page would reside in both documents. Thus, the page suddenly would have to have all those relations to objects in both documents. This obviously does not work, thus iText prevents this.
Instead, you have to create a copy of the page(s) in question, in which the relations point to objects in the target document.
For this task there are multiple overloads of PdfDocument.copyPagesTo, so these methods are indeed the ones to use.
Unfortunately, you have mixed up the source and the target of the operation:
PdfDocument pdfWithMultiplePages = here I have a PDF with 3 pages.
....
PdfDocument pdfDocument = new PdfDocument(pdfWriter);
pdfDocument.copyPagesTo(i+1, i+1,pdfWithMultiplePages);
This tries to copy page i+1 from pdfDocument to pdfWithMultiplePages. But you just created pdfDocument from scratch, so it does not have any pages yet. What you most probably want is:
pdfWithMultiplePages.copyPagesTo(i+1, i+1, pdfDocument);
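Put together, the corrected loop could look like the following minimal sketch. It assumes the iText 7 kernel module is on the classpath and that the source PDF is available as a byte array; the class and method names (PdfPageSplitter, splitIntoSinglePages) are made up for illustration. Each target document is closed before its bytes are read, because only closing finalizes the PDF in the output stream.

```java
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.PdfWriter;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of iText.
public class PdfPageSplitter {

    /** Returns one finished single-page PDF (as bytes) per page of the source PDF. */
    public static List<byte[]> splitIntoSinglePages(byte[] sourcePdf) throws IOException {
        List<byte[]> singlePagePdfs = new ArrayList<>();
        try (PdfDocument source = new PdfDocument(new PdfReader(new ByteArrayInputStream(sourcePdf)))) {
            // Page numbers are 1-based: iterate from 1 to getNumberOfPages() inclusive.
            for (int pageNumber = 1; pageNumber <= source.getNumberOfPages(); pageNumber++) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                // try-with-resources closes the target, which finalizes the PDF written to 'out'.
                try (PdfDocument target = new PdfDocument(new PdfWriter(out))) {
                    // copyPagesTo is called on the source; the target is the argument.
                    source.copyPagesTo(pageNumber, pageNumber, target);
                }
                singlePagePdfs.add(out.toByteArray());
            }
        }
        return singlePagePdfs;
    }
}
```

The method returns byte arrays rather than open PdfDocument objects, so no half-written document can leak out of the loop; a caller that needs a PdfDocument can open a PdfReader on the bytes.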

Related

iText 7 Html to Pdf conversion and linking external file to the generated pdf

I am encountering an issue while merging two PDFs generated with iText.
I am new to iText 7.
I am creating one PDF from HTML and another PDF with an Excel (.xls) file embedded in it.
I want to merge the two files.
Basically, I want to generate a PDF from HTML, attach an Excel document to it, and then write the combined result of these two PDFs to an output stream.
Below is the code I am using:
ByteArrayOutputStream htmlToPdfContent = new ByteArrayOutputStream();
PdfWriter writer = new PdfWriter(htmlToPdfContent);
PdfDocument pdf = new PdfDocument(writer);
pdf.setTagged();
PageSize pageSize = PageSize.A4.rotate();
pdf.setDefaultPageSize(pageSize);
ConverterProperties properties = new ConverterProperties();
HtmlConverter.convertToPdf(htmlContent, pdf, properties);
FileUtils.cleanDirectory(new File(outputDir));
ByteArrayOutputStream pdfResult = new ByteArrayOutputStream();
PdfWriter writerResult = new PdfWriter(pdfResult);
PdfDocument pdfDocResult = new PdfDocument(writerResult);
PdfReader reader = new PdfReader(new ByteArrayInputStream(htmlToPdfContent.toByteArray()));
PdfDocument pdfDoc = new PdfDocument(reader);
pdfDoc.copyPagesTo(1, pdfDoc.getNumberOfPages(), pdfDocResult);
ByteArrayOutputStream pdfAttach = new ByteArrayOutputStream();
PdfDocument pdfLaunch = new PdfDocument(new PdfWriter(pdfAttach));
Rectangle rect = new Rectangle(36, 700, 100, 100);
byte[] embeddedFileContentBytes = Files.readAllBytes(Paths.get(excelPath));
PdfFileSpec fs = PdfFileSpec.createEmbeddedFileSpec(pdfLaunch, embeddedFileContentBytes, null, "test.xlsx", null, null);
PdfAnnotation attachment = new PdfFileAttachmentAnnotation(rect, fs)
.setContents("Click me");
pdfLaunch.addNewPage().addAnnotation(attachment);
PdfDocument appliedChanges = new PdfDocument(new PdfReader(new ByteArrayInputStream(pdfAttach.toByteArray())));
appliedChanges.copyPagesTo(1, appliedChanges.getNumberOfPages(), pdfDocResult);
try(OutputStream outputStream = new FileOutputStream(dest)) {
pdfResult.writeTo(outputStream);
}
This throws the following exception:
13:56:05.724 [main] ERROR com.itextpdf.kernel.pdf.PdfReader - Error occurred while reading cross reference table. Cross reference table will be rebuilt.
com.itextpdf.io.IOException: Error at file pointer 19,272.
at com.itextpdf.io.source.PdfTokenizer.throwError(PdfTokenizer.java:678)
at com.itextpdf.kernel.pdf.PdfReader.readXrefSection(PdfReader.java:801)
at com.itextpdf.kernel.pdf.PdfReader.readXref(PdfReader.java:774)
at com.itextpdf.kernel.pdf.PdfReader.readPdf(PdfReader.java:538)
at com.itextpdf.kernel.pdf.PdfDocument.open(PdfDocument.java:1818)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:238)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:221)
at com.mediaocean.prisma.order.command.infrastructure.pdf.itext.PdfAttachmentLaunch.main(PdfAttachmentLaunch.java:76)
Caused by: com.itextpdf.io.IOException: xref subsection not found.
... 8 common frames omitted
Exception in thread "main" com.itextpdf.kernel.PdfException: Trailer not found.
at com.itextpdf.kernel.pdf.PdfReader.rebuildXref(PdfReader.java:1064)
at com.itextpdf.kernel.pdf.PdfReader.readPdf(PdfReader.java:543)
at com.itextpdf.kernel.pdf.PdfDocument.open(PdfDocument.java:1818)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:238)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:221)
at com.mediaocean.prisma.order.command.infrastructure.pdf.itext.PdfAttachmentLaunch.main(PdfAttachmentLaunch.java:88)
13:56:05.773 [main] ERROR com.itextpdf.kernel.pdf.PdfReader - Error occurred while reading cross reference table. Cross reference table will be rebuilt.
com.itextpdf.io.IOException: PDF startxref not found.
at com.itextpdf.io.source.PdfTokenizer.getStartxref(PdfTokenizer.java:262)
at com.itextpdf.kernel.pdf.PdfReader.readXref(PdfReader.java:753)
at com.itextpdf.kernel.pdf.PdfReader.readPdf(PdfReader.java:538)
at com.itextpdf.kernel.pdf.PdfDocument.open(PdfDocument.java:1818)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:238)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:221)
at com.mediaocean.prisma.order.command.infrastructure.pdf.itext.PdfAttachmentLaunch.main(PdfAttachmentLaunch.java:88)
Please advise. Thanks in advance !!
Concerning revision 2 of your question
You changed your code differently from what I proposed in my answer to the first revision of your question: you now convert into the formerly unused PdfDocument pdf instead of directly into the ByteArrayOutputStream htmlToPdfContent.
This is actually also a possible fix for the problem identified in that answer. Thus, you no longer get an exception here:
PdfReader reader = new PdfReader(new ByteArrayInputStream(htmlToPdfContent.toByteArray()));
PdfDocument pdfDoc = new PdfDocument(reader);
Instead you now get an exception further down the flow, here:
PdfDocument appliedChanges = new PdfDocument(new PdfReader(new ByteArrayInputStream(pdfAttach.toByteArray())));
The reason is simple: you have not yet closed the PdfDocument pdfLaunch, which writes to the ByteArrayOutputStream pdfAttach. Only closing finalizes the PDF in the output stream. Thus, add the close() call:
ByteArrayOutputStream pdfAttach = new ByteArrayOutputStream();
PdfDocument pdfLaunch = new PdfDocument(new PdfWriter(pdfAttach));
[...]
pdfLaunch.addNewPage().addAnnotation(attachment);
pdfLaunch.close(); //<==== added
PdfDocument appliedChanges = new PdfDocument(new PdfReader(new ByteArrayInputStream(pdfAttach.toByteArray())));
You then make the same mistake again shortly afterwards: you store the contents of the ByteArrayOutputStream pdfResult to outputStream without closing the PdfDocument pdfDocResult, which writes to pdfResult. Thus, add a close() call there as well:
appliedChanges.copyPagesTo(1, appliedChanges.getNumberOfPages(), pdfDocResult);
pdfDocResult.close(); //<==== added
try(OutputStream outputStream = new FileOutputStream(dest)) {
pdfResult.writeTo(outputStream);
}
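Errors such as "PDF startxref not found" and "Trailer not found" point to bytes that were read before the producing document was closed. The following is a rough, standard-library-only sketch of a sanity check that could be run on a byte array before handing it to a PdfReader. The class and method names are hypothetical, and the check is only a heuristic based on the markers a finished PDF file carries (a "%PDF-" header near the start, "startxref" and "%%EOF" near the end); it does not validate the PDF.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of iText.
public class PdfBytesCheck {

    /**
     * Heuristic only: reports whether the bytes contain a PDF header near the
     * start and the "startxref" and "%%EOF" markers near the end.
     */
    public static boolean looksFinalized(byte[] pdfBytes) {
        // ISO_8859_1 maps every byte to exactly one char, so binary content cannot break decoding.
        int headLength = Math.min(pdfBytes.length, 1024);
        String head = new String(pdfBytes, 0, headLength, StandardCharsets.ISO_8859_1);
        int tailStart = Math.max(0, pdfBytes.length - 1024);
        String tail = new String(pdfBytes, tailStart, pdfBytes.length - tailStart, StandardCharsets.ISO_8859_1);
        return head.contains("%PDF-") && tail.contains("startxref") && tail.contains("%%EOF");
    }
}
```

In the code above, calling looksFinalized(pdfAttach.toByteArray()) before pdfLaunch.close() would be expected to return false, which makes the missing close() easier to spot than the reader's stack trace does.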
Concerning revision 1 of your question
You use the ByteArrayOutputStream htmlToPdfContent as target of two distinct PDF generators, the PdfDocument pdf via the PdfWriter writer and the HtmlConverter.convertToPdf call:
ByteArrayOutputStream htmlToPdfContent = new ByteArrayOutputStream();
PdfWriter writer = new PdfWriter(htmlToPdfContent);
PdfDocument pdf = new PdfDocument(writer);
pdf.setTagged();
PageSize pageSize = PageSize.A4.rotate();
pdf.setDefaultPageSize(pageSize);
ConverterProperties properties = new ConverterProperties();
HtmlConverter.convertToPdf(content, htmlToPdfContent, properties);
This makes the content of htmlToPdfContent a hodgepodge of the outputs of both of them, in particular not a valid PDF.
As you don't add any content to pdf, you can safely remove it and reduce the above excerpt to
ByteArrayOutputStream htmlToPdfContent = new ByteArrayOutputStream();
ConverterProperties properties = new ConverterProperties();
HtmlConverter.convertToPdf(content, htmlToPdfContent, properties);

How do you add multiple images to a PDF with itext7 Java?

First google result takes me to Add multiple images into a single pdf file with iText using java which was posted 5 years ago. I am not sure which version they are using, because the Image object doesn't even have the getInstance method for me. Needless to say I am not getting much help from that link.
Anyways, I am trying to create a JavaFX application that loops over multiple JPG images to create a single PDF document. Below is my code, which successfully creates a PDF from 2 images, but I am having trouble making the second image display on the second page.
In the link I posted above the simple solution I saw was to do document.newPage() then do document.add(img), but my document object doesn't have that method? I am not sure what to do.
PdfWriter writer = new PdfWriter("D:/sample1.pdf");
// Creating a PdfDocument
PdfDocument pdfDoc = new PdfDocument(writer);
// Adding a new page
// I can add multiple pages here, but when I add multiple images they do not
// automatically flow over to the next page.
pdfDoc.addNewPage();
pdfDoc.addNewPage();
// Creating a Document
Document document = new Document(pdfDoc);
String imageFile = "C:/Users/***/Downloads/MAT204/1.3-1.4 HW/test.jpg";
ImageData data = ImageDataFactory.create(imageFile);
Image img = new Image(data);
img.setAutoScale(true);
img.setRotationAngle(-Math.toRadians(90));
// I can add multiple images, but they overlaps each other and only
// appears on the first page.
// Is there a way for me to change the current page to write on?
document.add(img);
document.add(img);
// Closing the document
document.close();
System.out.println("PDF Created");
Anyways, I just want to figure out how to manually add another image before I write a loop to automate the process.
After doing more research I found the answer here.
https://kb.itextpdf.com/home/it7kb/examples/multiple-images
protected void manipulatePdf(String dest) throws Exception {
Image image = new Image(ImageDataFactory.create(IMAGES[0]));
PdfDocument pdfDoc = new PdfDocument(new PdfWriter(dest));
Document doc = new Document(pdfDoc, new PageSize(image.getImageWidth(), image.getImageHeight()));
for (int i = 0; i < IMAGES.length; i++) {
image = new Image(ImageDataFactory.create(IMAGES[i]));
pdfDoc.addNewPage(new PageSize(image.getImageWidth(), image.getImageHeight()));
image.setFixedPosition(i + 1, 0, 0);
doc.add(image);
}
doc.close();
}

Add Empty/Blank Page to PdfDocument java

Is there any way to add a blank page to an existing PdfDocument? I've created a method like this:
public void addEmptyPage(PdfDocument pdfDocument){
pdfDocument.addNewPage();
pdfDocument.close();
}
However, when I use it with a PdfDocument, it throws:
com.itextpdf.kernel.PdfException: There is no associate PdfWriter for making indirects.
at com.itextpdf.kernel.pdf.PdfObject.makeIndirect(PdfObject.java:228) ~[kernel-7.1.1.jar:?]
at com.itextpdf.kernel.pdf.PdfObject.makeIndirect(PdfObject.java:248) ~[kernel-7.1.1.jar:?]
at com.itextpdf.kernel.pdf.PdfPage.<init>(PdfPage.java:104) ~[kernel-7.1.1.jar:?]
at com.itextpdf.kernel.pdf.PdfDocument.addNewPage(PdfDocument.java:416) ~[kernel-7.1.1.jar:?]
What is the correct way to insert a blank page into a PDF document?
com.itextpdf.kernel.PdfException: There is no associate PdfWriter for making indirects.
That exception indicates that you initialize your PdfDocument with only a PdfReader, no PdfWriter. You don't show your PdfDocument instantiation code but I assume you do something like this:
PdfReader reader = new PdfReader(SOURCE);
PdfDocument document = new PdfDocument(reader);
Such documents are for reading only. (Actually you can do some minor manipulations but nothing as big as adding pages.)
If you want to edit a PDF, initialize your PdfDocument with both a PdfReader and a PdfWriter, e.g.
PdfReader reader = new PdfReader(SOURCE);
PdfWriter writer = new PdfWriter(DESTINATION);
PdfDocument document = new PdfDocument(reader, writer);
If you want to store the edited file at the same location as the original file,
you must not use the same file name as SOURCE in the PdfReader and as DESTINATION in the PdfWriter.
Either first write to a temporary file, close all participating objects, and then replace the original file with the temporary file:
PdfReader reader = new PdfReader("document.pdf");
PdfWriter writer = new PdfWriter("document-temp.pdf");
PdfDocument document = new PdfDocument(reader, writer);
...
document.close();
Path filePath = Path.of("document.pdf");
Path tempPath = Path.of("document-temp.pdf");
Files.move(tempPath, filePath, StandardCopyOption.REPLACE_EXISTING);
Or read the original file into a byte[] and initialize the PdfReader from that array:
PdfReader reader = new PdfReader(new ByteArrayInputStream(Files.readAllBytes(Path.of("document.pdf"))));
PdfWriter writer = new PdfWriter("document.pdf");
PdfDocument document = new PdfDocument(reader, writer);
...
document.close();
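The reason behind the warning about reusing the file name can be observed with plain java.io, without iText. The sketch below assumes that a PdfWriter created from a file name opens its destination the way a plain FileOutputStream does (an assumption, not verified against the iText sources); the class and method names are made up for illustration. Opening an output stream on an existing file empties it immediately, before a single byte is written, so a reader working on the same file would lose its input.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of iText.
public class TruncationDemo {

    /** Returns the size of an existing, non-empty file right after an output stream is opened on it. */
    public static long sizeAfterOpeningForWrite() {
        try {
            Path file = Files.createTempFile("truncation-demo", ".bin");
            try {
                Files.write(file, "original content".getBytes(StandardCharsets.US_ASCII));
                try (OutputStream out = new FileOutputStream(file.toFile())) {
                    // Nothing has been written yet, but opening the stream already truncated the file.
                    return Files.size(file);
                }
            } finally {
                // Clean up the temporary file in every case.
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both approaches shown above avoid this situation: the temporary-file variant never opens the original for writing while it is being read, and the byte[] variant finishes reading before the writer is created.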

Why am I receiving an IOException: PDF header signature not found when creating a PDF?

I'm trying to create a PDF file with empty pages, but the code throws java.io.IOException: PDF header signature not found when I try to read the temp file. Why?
Relevant code:
Document testDoc = new Document();
File testFile = File.createTempFile("pdfTemp", ".tmp");
String test = testFile.getName();
PdfWriter testWriter = PdfWriter.getInstance(document, new FileOutputStream(test));
testDoc.open();
for (int x=1; x<=pdfReader.getNumberOfPages(); x++){
testWriter.setPageEmpty(false);
testDoc.newPage();
}
testDoc.close();
PdfReader testReader = new PdfReader(test);
This may be a coding issue. Your code uses
PdfWriter testWriter = PdfWriter.getInstance(document, new FileOutputStream(test));
but the document variable isn't declared anywhere in the code shown. I suspect you meant to use the following instead:
PdfWriter testWriter = PdfWriter.getInstance(testDoc, new FileOutputStream(test));

How to add an image in the last page of pdf using iText?

How do I add an image on the last page of an existing PDF document? Please help me.
The following example adds an image to the second page of an existing PDF using iText 5.
String src = "c:/in.pdf";
String dest = "c:/out.pdf";
String IMG = "C:/image.jpg";
try {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    com.itextpdf.text.Image image = com.itextpdf.text.Image.getInstance(IMG);
    image.setAbsolutePosition(36, 400);
    // Page 2 is targeted here; pass reader.getNumberOfPages() instead to draw on the last page.
    PdfContentByte over = stamper.getOverContent(2);
    over.addImage(image);
    stamper.close();
    reader.close();
} catch (Exception e) {
    e.printStackTrace();
}
You can read the text from the PDF using the same iText library. Try this:
PdfReader reader = new PdfReader(INPUTFILE);
int n = reader.getNumberOfPages();
PdfTextExtractor parser =new PdfTextExtractor(new PdfReader("C:/Text.pdf"));
parser.getTextFromPage(3); // Extracting the content from a particular page.
After you have added your data, you can load images either from a file or from a URL, like this:
Image image1 = Image.getInstance("watermark.png");
document.add(image1);
String imageUrl = "http://applause-voice.com/wp-content/uploads/2011/04/1hello.jpg";
Image image2 = Image.getInstance(new URL(imageUrl));
document.add(image2);
If you add this code at the end of your Java program, the image automatically appears at the end of your page.
The best solution for me was to create a new in-memory PDF document with the image I want to add, then copy this page to the original document.
// Create a separate doc for image
var pdfDocWithImageOutStream = new ByteArrayOutputStream();
var pdfDocWithImage = new PdfDocument(new PdfWriter(pdfDocWithImageOutStream).setSmartMode(true));
var docWithImage = new Document(pdfDocWithImage, destinationPdf.getDefaultPageSize());
// Add image to the doc
docWithImage.add(image);
// Close the doc to save data
docWithImage.close();
pdfDocWithImage.close();
// Open the same doc for reading
pdfDocWithImage = new PdfDocument(new PdfReader(new ByteArrayInputStream(pdfDocWithImageOutStream.toByteArray())));
docWithImage = new Document(pdfDocWithImage, destinationPdf.getDefaultPageSize());
// Copy page to original (destinationPdf)
pdfDocWithImage.copyPagesTo(1, pdfDocWithImage.getNumberOfPages(), destinationPdf);
docWithImage.close();
pdfDocWithImage.close();
