Itext7 - write multiple documents on single pdf - java

I'm writing multiple PdfDocument on a single pdf. The PdfDocument are generated populating PdfAcroForm of different templates.
My solution actually works, but I think it is not very efficent.
Is there a more efficent way to have the same result?
ByteArrayOutputStream globalBaos = new ByteArrayOutputStream();
PdfDocument pdfDocGlobal = new PdfDocument(new PdfWriter(globalBaos));
for (Documento document : docs) {
PdfReader reader = new PdfReader(document.getPath());
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfWriter writer = new PdfWriter(baos);
PdfDocument pdfDoc = new PdfDocument(reader, writer);
Document doc = new Document(pdfDoc);
[.... populate pdfDoc ]
doc.close();
pdfDoc.close();
baos.close();
//read again the closed document to copy it in the globalBaos
PdfDocument dr = new PdfDocument(new PdfReader(new ByteArrayInputStream(baos.toByteArray())));
dr.copyPagesTo(1, dr.getNumberOfPages(), pdfDocGlobal);
dr.close();
}
pdfDocGlobal.close();
FileOutputStream output = new FileOutputStream(FILE_OUTPUT);
output.write(globalBaos.toByteArray());
output.close();
globalBaos.close();

Related

iText 7 Html to Pdf conversion and linking external file to the generated pdf

I am encountering an issue while merging two PDFs generated out of IText.
I am new to iText7
I am creating one pdf from html and creating another pdf with excel(.xls) as embedded document to pdf.
I want to merge the 2 files.
Basically I want to generate a PDF from html then attach a excel document to it and then output combined html outPutStream from these two pdfs.
Below is the code I am using
ByteArrayOutputStream htmlToPdfContent = new ByteArrayOutputStream();
PdfWriter writer = new PdfWriter(htmlToPdfContent);
PdfDocument pdf = new PdfDocument(writer);
pdf.setTagged();
PageSize pageSize = PageSize.A4.rotate();
pdf.setDefaultPageSize(pageSize);
ConverterProperties properties = new ConverterProperties();
HtmlConverter.convertToPdf(htmlContent, pdf, properties);
FileUtils.cleanDirectory(new File(outputDir));
ByteArrayOutputStream pdfResult = new ByteArrayOutputStream();
PdfWriter writerResult = new PdfWriter(pdfResult);
PdfDocument pdfDocResult = new PdfDocument(writerResult);
PdfReader reader = new PdfReader(new ByteArrayInputStream(htmlToPdfContent.toByteArray()));
PdfDocument pdfDoc = new PdfDocument(reader);
pdfDoc.copyPagesTo(1, pdfDoc.getNumberOfPages(), pdfDocResult);
ByteArrayOutputStream pdfAttach = new ByteArrayOutputStream();
PdfDocument pdfLaunch = new PdfDocument(new PdfWriter(pdfAttach));
Rectangle rect = new Rectangle(36, 700, 100, 100);
byte[] embeddedFileContentBytes = Files.readAllBytes(Paths.get(excelPath));
PdfFileSpec fs = PdfFileSpec.createEmbeddedFileSpec(pdfLaunch, embeddedFileContentBytes, null, "test.xlsx", null, null);
PdfAnnotation attachment = new PdfFileAttachmentAnnotation(rect, fs)
.setContents("Click me");
pdfLaunch.addNewPage().addAnnotation(attachment);
PdfDocument appliedChanges = new PdfDocument(new PdfReader(new ByteArrayInputStream(pdfAttach.toByteArray())));
appliedChanges.copyPagesTo(1, appliedChanges.getNumberOfPages(), pdfDocResult);
try(OutputStream outputStream = new FileOutputStream(dest)) {
pdfResult.writeTo(outputStream);
}
This is throwing exception
13:56:05.724 [main] ERROR com.itextpdf.kernel.pdf.PdfReader - Error occurred while reading cross reference table. Cross reference table will be rebuilt.
com.itextpdf.io.IOException: Error at file pointer 19,272.
at com.itextpdf.io.source.PdfTokenizer.throwError(PdfTokenizer.java:678)
at com.itextpdf.kernel.pdf.PdfReader.readXrefSection(PdfReader.java:801)
at com.itextpdf.kernel.pdf.PdfReader.readXref(PdfReader.java:774)
at com.itextpdf.kernel.pdf.PdfReader.readPdf(PdfReader.java:538)
at com.itextpdf.kernel.pdf.PdfDocument.open(PdfDocument.java:1818)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:238)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:221)
at com.mediaocean.prisma.order.command.infrastructure.pdf.itext.PdfAttachmentLaunch.main(PdfAttachmentLaunch.java:76)
Caused by: com.itextpdf.io.IOException: xref subsection not found.
... 8 common frames omitted
Exception in thread "main" com.itextpdf.kernel.PdfException: Trailer not found.
at com.itextpdf.kernel.pdf.PdfReader.rebuildXref(PdfReader.java:1064)
at com.itextpdf.kernel.pdf.PdfReader.readPdf(PdfReader.java:543)
at com.itextpdf.kernel.pdf.PdfDocument.open(PdfDocument.java:1818)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:238)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:221)
at com.mediaocean.prisma.order.command.infrastructure.pdf.itext.PdfAttachmentLaunch.main(PdfAttachmentLaunch.java:88)
13:56:05.773 [main] ERROR com.itextpdf.kernel.pdf.PdfReader - Error occurred while reading cross reference table. Cross reference table will be rebuilt.
com.itextpdf.io.IOException: PDF startxref not found.
at com.itextpdf.io.source.PdfTokenizer.getStartxref(PdfTokenizer.java:262)
at com.itextpdf.kernel.pdf.PdfReader.readXref(PdfReader.java:753)
at com.itextpdf.kernel.pdf.PdfReader.readPdf(PdfReader.java:538)
at com.itextpdf.kernel.pdf.PdfDocument.open(PdfDocument.java:1818)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:238)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:221)
at com.mediaocean.prisma.order.command.infrastructure.pdf.itext.PdfAttachmentLaunch.main(PdfAttachmentLaunch.java:88)
Please advise. Thanks in advance !!
Concerning revision 2 of your question
You changed your code differently than proposed in my answer to the first revision of your question, you now convert into the formerly unused PdfDocument pdf instead of directly into the ByteArrayOutputStream htmlToPdfContent.
This actually also is a possible fix of the problem identified in that answer. Thus, you don't get an exception here anymore:
PdfReader reader = new PdfReader(new ByteArrayInputStream(htmlToPdfContent.toByteArray()));
PdfDocument pdfDoc = new PdfDocument(reader);
Instead you now get an exception further down the flow, here:
PdfDocument appliedChanges = new PdfDocument(new PdfReader(new ByteArrayInputStream(pdfAttach.toByteArray())));
And the reason is simple, you have not yet closed the PdfDocument pdfLaunch which writes to the ByteArrayOutputStream pdfAttach. But only closing finalizes the PDF in the output stream. Thus, add the close():
ByteArrayOutputStream pdfAttach = new ByteArrayOutputStream();
PdfDocument pdfLaunch = new PdfDocument(new PdfWriter(pdfAttach));
[...]
pdfLaunch.addNewPage().addAnnotation(attachment);
pdfLaunch.close(); //<==== added
PdfDocument appliedChanges = new PdfDocument(new PdfReader(new ByteArrayInputStream(pdfAttach.toByteArray())));
And you actually do the same mistake again, shortly after, you store the contents of the ByteArrayOutputStream pdfResult to outputStream without closing the PdfDocument pdfDocResult which writes to pdfResult. Thus, also add a close call there:
appliedChanges.copyPagesTo(1, appliedChanges.getNumberOfPages(), pdfDocResult);
pdfDocResult.close(); //<==== added
try(OutputStream outputStream = new FileOutputStream(dest)) {
pdfResult.writeTo(outputStream);
}
Concerning revision 1 of your question
You use the ByteArrayOutputStream htmlToPdfContent as target of two distinct PDF generators, the PdfDocument pdf via the PdfWriter writer and the HtmlConverter.convertToPdf call:
ByteArrayOutputStream htmlToPdfContent = new ByteArrayOutputStream();
PdfWriter writer = new PdfWriter(htmlToPdfContent);
PdfDocument pdf = new PdfDocument(writer);
pdf.setTagged();
PageSize pageSize = PageSize.A4.rotate();
pdf.setDefaultPageSize(pageSize);
ConverterProperties properties = new ConverterProperties();
HtmlConverter.convertToPdf(content, htmlToPdfContent, properties);
This makes the content of htmlToPdfContent a hodgepodge of the outputs of both of them, in particular not a valid PDF.
As you don't add any content to pdf, you can safely remove it and reduce the above excerpt to
ByteArrayOutputStream htmlToPdfContent = new ByteArrayOutputStream();
ConverterProperties properties = new ConverterProperties();
HtmlConverter.convertToPdf(content, htmlToPdfContent, properties);

Trying to merge pdfs using iText7 merger, but when I open final merged pdf it says failed to load pdf document

I'm creating two ByteArrayOutputStream using itext7 PdfWriter, and then merging them in one pdf using merger but when I try to open final merged pdf it says failed to load.
#GetMapping(value = "/customers",
produces = MediaType.APPLICATION_PDF_VALUE)
public ResponseEntity<InputStreamResource> customersReport() throws IOException {
ByteArrayOutputStream out = new ByteArrayOutputStream();
PdfDocument pdf = new PdfDocument(new PdfWriter(out));
Document document = new Document(pdf);
Paragraph p = new Paragraph("AAAAAAAAA");
document.add(p);
document.close();
ByteArrayOutputStream out1 = new ByteArrayOutputStream();
PdfDocument pdf1 = new PdfDocument(new PdfWriter(out1));
Document document1 = new Document(pdf1);
Paragraph p1 = new Paragraph("123456A");
document1.add(p1);
document1.close();
ByteArrayOutputStream outfinal = new ByteArrayOutputStream();
PdfDocument pdfDoc = new PdfDocument(new PdfWriter(outfinal));
PdfMerger merger = new PdfMerger(pdfDoc);
PdfDocument pdf2 = new PdfDocument(new PdfReader(new ByteArrayInputStream(out.toByteArray())));
merger.merge(pdf2,1,pdf2.getNumberOfPages());
PdfDocument pdf3 = new PdfDocument(new PdfReader(new ByteArrayInputStream(out1.toByteArray())));
merger.merge(pdf3,1,pdf3.getNumberOfPages());
ByteArrayInputStream bis = new ByteArrayInputStream(outfinal.toByteArray());
HttpHeaders headers = new HttpHeaders();
headers.add("Content-Disposition", "inline; filename=customers.pdf");
return ResponseEntity
.ok()
.headers(headers)
.contentType(MediaType.APPLICATION_PDF)
.body(new InputStreamResource(bis));
}
You have to close the merger
merger.close();
before using its output in
ByteArrayInputStream bis = new ByteArrayInputStream(outfinal.toByteArray());
because only during closing the pdf file is completed.

Add Empty/Blank Page to PdfDocument java

It is there any way to add a Blank Page to an existing PdfDocument ? I've created a method like this:
public void addEmptyPage(PdfDocument pdfDocument){
pdfDocument.addNewPage();
pdfDocument.close();
}
However , when I use it with a PdfDocument , it throws :
com.itextpdf.kernel.PdfException: There is no associate PdfWriter for making indirects.
at com.itextpdf.kernel.pdf.PdfObject.makeIndirect(PdfObject.java:228) ~[kernel-7.1.1.jar:?]
at com.itextpdf.kernel.pdf.PdfObject.makeIndirect(PdfObject.java:248) ~[kernel-7.1.1.jar:?]
at com.itextpdf.kernel.pdf.PdfPage.<init>(PdfPage.java:104) ~[kernel-7.1.1.jar:?]
at com.itextpdf.kernel.pdf.PdfDocument.addNewPage(PdfDocument.java:416) ~[kernel-7.1.1.jar:?]
Which is the correct way to insert a Blank page into a pdf document?
com.itextpdf.kernel.PdfException: There is no associate PdfWriter for making indirects.
That exception indicates that you initialize your PdfDocument with only a PdfReader, no PdfWriter. You don't show your PdfDocument instantiation code but I assume you do something like this:
PdfReader reader = new PdfReader(SOURCE);
PdfDocument document = new PdfDocument(reader);
Such documents are for reading only. (Actually you can do some minor manipulations but nothing as big as adding pages.)
If you want to edit a PDF, initialize your PdfDocument with both a PdfReader and a PdfWriter, e.g.
PdfReader reader = new PdfReader(SOURCE);
PdfWriter writer = new PdfWriter(DESTINATION);
PdfDocument document = new PdfDocument(reader, writer);
If you want to store the edited file at the same location as the original file,
you must not use the same file name as SOURCE in the PdfReader and as DESTINATION in the PdfWriter.
Either first write to a temporary file, close all participating objects, and then replace the original file with the temporary file:
PdfReader reader = new PdfReader("document.pdf");
PdfWriter writer = new PdfWriter("document-temp.pdf");
PdfDocument document = new PdfDocument(reader, writer);
...
document.close();
Path filePath = Path.of("document.pdf");
Path tempPath = Path.of("document-temp.pdf");
Files.move(tempPath, filePath, StandardCopyOption.REPLACE_EXISTING);
Or read the original file into a byte[] and initialize the PdfReader from that array:
PdfReader reader = new PdfReader(new ByteArrayInputStream(Files.readAllBytes(Path.of("document.pdf"))));
PdfWriter writer = new PdfWriter("document.pdf");
PdfDocument document = new PdfDocument(reader, writer);
...
document.close();

Java PDFBox, how to get File object from PDDocument

I am trying to retrieve a File or InputStream instance from PDDocument without saving a PDDocument to the file system.
PDDocument doc= new PDDocument();
...
doc.save("D:\\document.pdf");
File f= new File("D:\\document.pdf");
Is there any method in PDFBox which returns File or InputStream from an existing PDDocument?
I solved it:
PDDocument doc=new PDDocument();
PDStream ps=new PDStream(doc);
InputStream is=ps.createInputStream();
I solve it in this way ( It's creating a file but in temporary-file directory ):
final PDDocument document = new PDDocument();
final File file = File.createTempFile(filename, ".pdf");
document.save(file);
and if you need
document.close();
What if you first create the outputstream
PDDocument doc= new PDDocument();
File f= new File("D:\\document.pdf");
FileOutputStream fOut = new FileOutputStream(f);
doc.save(fOut);
Take a look at this
http://pdfbox.apache.org/apidocs/org/apache/pdfbox/pdmodel/PDDocument.html#save(java.io.OutputStream)
I am trying to retrieve a File or InputStream instance from PDDocument without saving a PDDocument to the file system.
[...]
Is there any method in PDFBox which returns File or InputStream from an existing PDDocument?
Obviously PDFBox cannot return a meaningful File object without saving a PDDocument to the file system.
It does not offer a method providing an InputStream directly either but it is easy to write code around it that does. e.g.:
InputStream docInputStream = null;
try ( ByteArrayOutputStream baos = new ByteArrayOutputStream();
PDDocument doc = new PDDocument() )
{
[...]
doc.save(baos);
docInputStream = new ByteArrayInputStream(baos.toByteArray());
}

binary pdf byte[] to com.lowagie.text.Document

I had an binary pdf(Byte []). I would like to convert it to a com.lowagie.text.Document.is there anyway to convert it without losing any information on it.Thanks
this is i tried, and to not sure how to move further.
Document document = new Document();
PdfWriter writer = PdfWriter.getInstance(document, out);
document.open();
PdfReader reader = new PdfReader(byteBuffer);

Categories