Delete pdf pages in java with iTextpdf - java

I have an existing function that displays PDF files, and I can't change it.
The function takes an InputStream as its input.
Until now, callers passed a whole PDF file to it and it displayed that file.
Now I have been asked to show only the first 30 pages of the PDF, so I am using iTextpdf like this:
PdfReader reader = new PdfReader (inputStream);
reader.selectPages("1-30");
Now I need to pass the result to the display method as an InputStream.
How should I do that?
Thanks

You can store the result using a PdfStamper like this:
PdfReader reader = new PdfReader (inputStream);
reader.selectPages("1-30");
ByteArrayOutputStream os = new ByteArrayOutputStream();
PdfStamper stamper = new PdfStamper(reader, os);
stamper.close();
byte[] changedPdf = os.toByteArray();
If you want the result back in the InputStream inputStream variable, simply add this line:
inputStream = new ByteArrayInputStream(changedPdf);
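The hand-off itself needs nothing beyond the JDK, so it can be tried without iText. Below is a minimal sketch under that assumption: show is a hypothetical stand-in for the display method that cannot be changed, asInputStream is a helper name made up for this illustration, and the placeholder bytes take the place of what the closed PdfStamper would have written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class StreamHandOff {

    // Hypothetical stand-in for the existing display method that cannot be changed:
    // it only reads the stream to the end and reports how many bytes it received.
    static int show(InputStream in) throws IOException {
        return in.readAllBytes().length;
    }

    // Wraps a snapshot of everything written to the buffer in a fresh, readable stream.
    static InputStream asInputStream(ByteArrayOutputStream os) {
        return new ByteArrayInputStream(os.toByteArray());
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        // Placeholder bytes; in the answer above a PdfStamper writes here and is closed first.
        os.write("%PDF-placeholder".getBytes(StandardCharsets.US_ASCII));
        System.out.println(show(asInputStream(os)));
    }
}
```

Because toByteArray() copies the buffer, the returned stream is independent of anything written to os afterwards.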

Get a reader for the existing PDF file:
PdfReader pdfReader = new PdfReader("source pdf file path");
Now select the pages to keep on that reader:
pdfReader.selectPages("1-5,15-20");
Then create a PdfStamper to write the changes to a file:
PdfStamper pdfStamper = new PdfStamper(pdfReader,
new FileOutputStream("destination pdf file path"));
Finally, close the PdfStamper:
pdfStamper.close();
It will close the PdfReader too.
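To make the range string concrete, here is a small sketch that expands a string like "1-5,15-20" into explicit page numbers. It is only an illustration and not iText's own parser: PageRanges and expand are hypothetical names, and only comma-separated single pages and ascending from-to ranges are handled.

```java
import java.util.ArrayList;
import java.util.List;

class PageRanges {

    // Expands a simple range string such as "1-5,15-20" into explicit page numbers.
    // Illustration only: handles single pages and ascending "from-to" ranges.
    static List<Integer> expand(String ranges) {
        List<Integer> pages = new ArrayList<>();
        for (String part : ranges.split(",")) {
            String[] bounds = part.trim().split("-");
            int from = Integer.parseInt(bounds[0].trim());
            int to = bounds.length > 1 ? Integer.parseInt(bounds[1].trim()) : from;
            for (int page = from; page <= to; page++) {
                pages.add(page);
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        // prints [1, 2, 3, 4, 5, 15, 16, 17, 18, 19, 20]
        System.out.println(expand("1-5,15-20"));
    }
}
```

So the selection above keeps eleven pages, renumbered 1 to 11 in the result.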

Related

iText 7 Html to Pdf conversion and linking external file to the generated pdf

I am running into a problem while merging two PDFs generated with iText.
I am new to iText 7.
I create one PDF from HTML and another PDF that has an Excel (.xls) file embedded as an attachment.
I want to merge the two files.
Basically, I want to generate a PDF from HTML, attach an Excel document to it, and then write the combined result of these two PDFs to an output stream.
Below is the code I am using:
ByteArrayOutputStream htmlToPdfContent = new ByteArrayOutputStream();
PdfWriter writer = new PdfWriter(htmlToPdfContent);
PdfDocument pdf = new PdfDocument(writer);
pdf.setTagged();
PageSize pageSize = PageSize.A4.rotate();
pdf.setDefaultPageSize(pageSize);
ConverterProperties properties = new ConverterProperties();
HtmlConverter.convertToPdf(htmlContent, pdf, properties);
FileUtils.cleanDirectory(new File(outputDir));
ByteArrayOutputStream pdfResult = new ByteArrayOutputStream();
PdfWriter writerResult = new PdfWriter(pdfResult);
PdfDocument pdfDocResult = new PdfDocument(writerResult);
PdfReader reader = new PdfReader(new ByteArrayInputStream(htmlToPdfContent.toByteArray()));
PdfDocument pdfDoc = new PdfDocument(reader);
pdfDoc.copyPagesTo(1, pdfDoc.getNumberOfPages(), pdfDocResult);
ByteArrayOutputStream pdfAttach = new ByteArrayOutputStream();
PdfDocument pdfLaunch = new PdfDocument(new PdfWriter(pdfAttach));
Rectangle rect = new Rectangle(36, 700, 100, 100);
byte[] embeddedFileContentBytes = Files.readAllBytes(Paths.get(excelPath));
PdfFileSpec fs = PdfFileSpec.createEmbeddedFileSpec(pdfLaunch, embeddedFileContentBytes, null, "test.xlsx", null, null);
PdfAnnotation attachment = new PdfFileAttachmentAnnotation(rect, fs)
.setContents("Click me");
pdfLaunch.addNewPage().addAnnotation(attachment);
PdfDocument appliedChanges = new PdfDocument(new PdfReader(new ByteArrayInputStream(pdfAttach.toByteArray())));
appliedChanges.copyPagesTo(1, appliedChanges.getNumberOfPages(), pdfDocResult);
try(OutputStream outputStream = new FileOutputStream(dest)) {
pdfResult.writeTo(outputStream);
}
This throws the following exception:
13:56:05.724 [main] ERROR com.itextpdf.kernel.pdf.PdfReader - Error occurred while reading cross reference table. Cross reference table will be rebuilt.
com.itextpdf.io.IOException: Error at file pointer 19,272.
at com.itextpdf.io.source.PdfTokenizer.throwError(PdfTokenizer.java:678)
at com.itextpdf.kernel.pdf.PdfReader.readXrefSection(PdfReader.java:801)
at com.itextpdf.kernel.pdf.PdfReader.readXref(PdfReader.java:774)
at com.itextpdf.kernel.pdf.PdfReader.readPdf(PdfReader.java:538)
at com.itextpdf.kernel.pdf.PdfDocument.open(PdfDocument.java:1818)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:238)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:221)
at com.mediaocean.prisma.order.command.infrastructure.pdf.itext.PdfAttachmentLaunch.main(PdfAttachmentLaunch.java:76)
Caused by: com.itextpdf.io.IOException: xref subsection not found.
... 8 common frames omitted
Exception in thread "main" com.itextpdf.kernel.PdfException: Trailer not found.
at com.itextpdf.kernel.pdf.PdfReader.rebuildXref(PdfReader.java:1064)
at com.itextpdf.kernel.pdf.PdfReader.readPdf(PdfReader.java:543)
at com.itextpdf.kernel.pdf.PdfDocument.open(PdfDocument.java:1818)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:238)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:221)
at com.mediaocean.prisma.order.command.infrastructure.pdf.itext.PdfAttachmentLaunch.main(PdfAttachmentLaunch.java:88)
13:56:05.773 [main] ERROR com.itextpdf.kernel.pdf.PdfReader - Error occurred while reading cross reference table. Cross reference table will be rebuilt.
com.itextpdf.io.IOException: PDF startxref not found.
at com.itextpdf.io.source.PdfTokenizer.getStartxref(PdfTokenizer.java:262)
at com.itextpdf.kernel.pdf.PdfReader.readXref(PdfReader.java:753)
at com.itextpdf.kernel.pdf.PdfReader.readPdf(PdfReader.java:538)
at com.itextpdf.kernel.pdf.PdfDocument.open(PdfDocument.java:1818)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:238)
at com.itextpdf.kernel.pdf.PdfDocument.<init>(PdfDocument.java:221)
at com.mediaocean.prisma.order.command.infrastructure.pdf.itext.PdfAttachmentLaunch.main(PdfAttachmentLaunch.java:88)
Please advise. Thanks in advance !!
Concerning revision 2 of your question
You changed your code differently than proposed in my answer to the first revision of your question: you now convert into the formerly unused PdfDocument pdf instead of directly into the ByteArrayOutputStream htmlToPdfContent.
This is also a possible fix for the problem identified in that answer. Thus, you no longer get an exception here:
PdfReader reader = new PdfReader(new ByteArrayInputStream(htmlToPdfContent.toByteArray()));
PdfDocument pdfDoc = new PdfDocument(reader);
Instead you now get an exception further down the flow, here:
PdfDocument appliedChanges = new PdfDocument(new PdfReader(new ByteArrayInputStream(pdfAttach.toByteArray())));
The reason is simple: you have not yet closed the PdfDocument pdfLaunch, which writes to the ByteArrayOutputStream pdfAttach. Only closing finalizes the PDF in the output stream. Thus, add the close() call:
ByteArrayOutputStream pdfAttach = new ByteArrayOutputStream();
PdfDocument pdfLaunch = new PdfDocument(new PdfWriter(pdfAttach));
[...]
pdfLaunch.addNewPage().addAnnotation(attachment);
pdfLaunch.close(); //<==== added
PdfDocument appliedChanges = new PdfDocument(new PdfReader(new ByteArrayInputStream(pdfAttach.toByteArray())));
You then make the same mistake again shortly afterwards: you store the contents of the ByteArrayOutputStream pdfResult to outputStream without closing the PdfDocument pdfDocResult, which writes to pdfResult. Thus, add a close call there, too:
appliedChanges.copyPagesTo(1, appliedChanges.getNumberOfPages(), pdfDocResult);
pdfDocResult.close(); //<==== added
try(OutputStream outputStream = new FileOutputStream(dest)) {
pdfResult.writeTo(outputStream);
}
Concerning revision 1 of your question
You use the ByteArrayOutputStream htmlToPdfContent as target of two distinct PDF generators, the PdfDocument pdf via the PdfWriter writer and the HtmlConverter.convertToPdf call:
ByteArrayOutputStream htmlToPdfContent = new ByteArrayOutputStream();
PdfWriter writer = new PdfWriter(htmlToPdfContent);
PdfDocument pdf = new PdfDocument(writer);
pdf.setTagged();
PageSize pageSize = PageSize.A4.rotate();
pdf.setDefaultPageSize(pageSize);
ConverterProperties properties = new ConverterProperties();
HtmlConverter.convertToPdf(content, htmlToPdfContent, properties);
This makes the content of htmlToPdfContent a hodgepodge of the outputs of both of them, in particular not a valid PDF.
As you don't add any content to pdf, you can safely remove it and reduce the above excerpt to
ByteArrayOutputStream htmlToPdfContent = new ByteArrayOutputStream();
ConverterProperties properties = new ConverterProperties();
HtmlConverter.convertToPdf(content, htmlToPdfContent, properties);

Save pdf to bytearray using itext in Java

I am using itext to read a large pdf file and save selected pages.
PdfReader reader = null;
reader = new PdfReader("customPath/largePdf.pdf");
int pages = reader.getNumberOfPages();
List<Integer> pagesList = new ArrayList<Integer>();
pagesList.add(1);
pagesList.add(2);
reader.selectPages(pagesList);
String path;
PdfStamper stamper = null;
path = String.format("customerPath/split.pdf");
stamper = new PdfStamper(reader, new FileOutputStream(path));
All good so far; I can open split.pdf.
Now, instead of saving to a file, I want to save it to a byte array (so that I can store it as a blob later).
I tried this:
PdfReader reader = null;
reader = new PdfReader("customPath/largePdf.pdf");
int pages = reader.getNumberOfPages();
List<Integer> pagesList = new ArrayList<Integer>();
pagesList.add(1);
pagesList.add(2);
reader.selectPages(pagesList);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfStamper stamper2 = new PdfStamper(reader, baos);
byte[] byteARy = baos.toByteArray();
Just to make sure it works, I tried writing this byte array to a file:
OutputStream out = new FileOutputStream("customPath/fromByteArray.pdf");
out.write(byteARy);
out.close();
fromByteArray.pdf does not open and its size is zero. Any idea what might be wrong?
You retrieve the byte array (using baos.toByteArray()) immediately after creating the PdfStamper.
PdfStamper stamper2 = new PdfStamper(reader, baos);
byte[] byteARy = baos.toByteArray();
At that time there is (next to) nothing in the output. You must instead wait until after closing your PdfStamper to retrieve the output.
PdfStamper stamper2 = new PdfStamper(reader, baos);
...
stamper2.close();
byte[] byteARy = baos.toByteArray();
Now the byte array should contain the complete, stamped PDF.
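The same effect can be reproduced with JDK classes alone. The sketch below is only an analogy and does not use iText: GZIPOutputStream stands in for PdfStamper as a writer that finalizes its output on close, and CloseBeforeRead, sizesAroundClose and roundTrip are hypothetical names chosen for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CloseBeforeRead {

    // Returns {bytes in the buffer before close, bytes in the buffer after close}.
    static int[] sizesAroundClose(byte[] payload) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(baos);
        gzip.write(payload);
        int before = baos.size(); // snapshot taken too early: the output is still incomplete
        gzip.close();             // closing writes the remaining data and the trailer
        int after = baos.size();
        return new int[] {before, after};
    }

    // Writes the payload and reads the buffer back only after the writer is closed.
    static byte[] roundTrip(byte[] payload) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(baos)) {
            gzip.write(payload);
        }
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(baos.toByteArray()))) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "page 1, page 2".getBytes(StandardCharsets.UTF_8);
        int[] sizes = sizesAroundClose(payload);
        System.out.println("before close: " + sizes[0] + ", after close: " + sizes[1]);
        System.out.println(new String(roundTrip(payload), StandardCharsets.UTF_8));
    }
}
```

A try-with-resources block, as in roundTrip, makes it hard to read the buffer too early because the writer is closed when the block ends.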

Add Empty/Blank Page to PdfDocument java

Is there any way to add a blank page to an existing PdfDocument? I've created a method like this:
public void addEmptyPage(PdfDocument pdfDocument){
pdfDocument.addNewPage();
pdfDocument.close();
}
However, when I use it with a PdfDocument, it throws:
com.itextpdf.kernel.PdfException: There is no associate PdfWriter for making indirects.
at com.itextpdf.kernel.pdf.PdfObject.makeIndirect(PdfObject.java:228) ~[kernel-7.1.1.jar:?]
at com.itextpdf.kernel.pdf.PdfObject.makeIndirect(PdfObject.java:248) ~[kernel-7.1.1.jar:?]
at com.itextpdf.kernel.pdf.PdfPage.<init>(PdfPage.java:104) ~[kernel-7.1.1.jar:?]
at com.itextpdf.kernel.pdf.PdfDocument.addNewPage(PdfDocument.java:416) ~[kernel-7.1.1.jar:?]
Which is the correct way to insert a Blank page into a pdf document?
com.itextpdf.kernel.PdfException: There is no associate PdfWriter for making indirects.
That exception indicates that you initialize your PdfDocument with only a PdfReader, no PdfWriter. You don't show your PdfDocument instantiation code but I assume you do something like this:
PdfReader reader = new PdfReader(SOURCE);
PdfDocument document = new PdfDocument(reader);
Such documents are for reading only. (Actually you can do some minor manipulations but nothing as big as adding pages.)
If you want to edit a PDF, initialize your PdfDocument with both a PdfReader and a PdfWriter, e.g.
PdfReader reader = new PdfReader(SOURCE);
PdfWriter writer = new PdfWriter(DESTINATION);
PdfDocument document = new PdfDocument(reader, writer);
If you want to store the edited file at the same location as the original file,
you must not use the same file name as SOURCE in the PdfReader and as DESTINATION in the PdfWriter.
Either first write to a temporary file, close all participating objects, and then replace the original file with the temporary file:
PdfReader reader = new PdfReader("document.pdf");
PdfWriter writer = new PdfWriter("document-temp.pdf");
PdfDocument document = new PdfDocument(reader, writer);
...
document.close();
Path filePath = Path.of("document.pdf");
Path tempPath = Path.of("document-temp.pdf");
Files.move(tempPath, filePath, StandardCopyOption.REPLACE_EXISTING);
Or read the original file into a byte[] and initialize the PdfReader from that array:
PdfReader reader = new PdfReader(new ByteArrayInputStream(Files.readAllBytes(Path.of("document.pdf"))));
PdfWriter writer = new PdfWriter("document.pdf");
PdfDocument document = new PdfDocument(reader, writer);
...
document.close();
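The temporary-file variant can be wrapped in a small helper. This is a sketch using only the JDK, not iText: ReplaceViaTempFile, Editor and rewrite are hypothetical names, and the Editor callback stands in for the reader/writer/document code shown above, which must have closed its objects by the time it returns.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class ReplaceViaTempFile {

    // Hypothetical callback: reads from source, writes the edited result to target.
    interface Editor {
        void edit(Path source, Path target) throws IOException;
    }

    // Lets the editor write to a temporary file next to the original and
    // replaces the original only after the editor has finished.
    static void rewrite(Path original, Editor editor) throws IOException {
        Path dir = original.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "edit-", ".tmp");
        try {
            editor.edit(original, temp);
            Files.move(temp, original, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(temp); // nothing left to delete after a successful move
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("document", ".txt");
        Files.write(file, "original".getBytes(StandardCharsets.UTF_8));
        rewrite(file, (src, dst) -> {
            String text = new String(Files.readAllBytes(src), StandardCharsets.UTF_8);
            Files.write(dst, text.toUpperCase().getBytes(StandardCharsets.UTF_8));
        });
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```

If the editor throws, the original file is left untouched and only the temporary file is removed.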

how to set attributes for existing pdf that contains only images using java itext?

I would like to set attributes to pdf before uploading it into a server.
Document document = new Document();
try
{
OutputStream file = new FileOutputStream({Localpath});
PdfWriter.getInstance(document, file);
document.open();
//Set attributes here
document.addTitle("TITLE");
document.close();
file.close();
} catch (Exception e)
{
e.printStackTrace();
}
But it's not working. The file gets corrupted.
In a comment to another answer the OP clarified:
I want to set attributes to an existing pdf(not to create new pdf)
His code, though, creates a new document from scratch (as is evident from the fact that a mere FileOutputStream is used to access the file: no reading, only writing).
To manipulate an existing PDF, one has to use a PdfReader / PdfStamper couple. Bruno Lowagie provided an example for that in his answer to the Stack Overflow question "iText setting Creation Date & Modified Date in sandbox.stamper.SuperImpose.java":
public void manipulatePdf(String src, String dest) throws IOException, DocumentException {
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
Map<String, String> info = reader.getInfo();
info.put("Title", "New title");
info.put("CreationDate", new PdfDate().toString());
stamper.setMoreInfo(info);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
XmpWriter xmp = new XmpWriter(baos, info);
xmp.close();
stamper.setXmpMetadata(baos.toByteArray());
stamper.close();
reader.close();
}
(ChangeMetadata.java)
As you see the code sets the metadata both in the ol'fashioned PDF info dictionary (stamper.setMoreInfo) and in the XMP metadata (stamper.setXmpMetadata).
Obviously src and dest should not be the same here.
Without a second file
In yet another comment the OP clarified that he had already tried a similar solution but that he wants to prevent the
Temporary existence of second file
This can easily be prevented by first reading the original PDF into a byte[] and then stamping with the original file as the target. E.g. if File singleFile references the original file which is also to be the target file, you can implement:
byte[] original = Files.readAllBytes(singleFile.toPath());
PdfReader reader = new PdfReader(original);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(singleFile));
Map<String, String> info = reader.getInfo();
info.put("Title", "New title");
info.put("CreationDate", new PdfDate().toString());
stamper.setMoreInfo(info);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
XmpWriter xmp = new XmpWriter(baos, info);
xmp.close();
stamper.setXmpMetadata(baos.toByteArray());
stamper.close();
reader.close();
(UpdateMetaData test testChangeTitleWithoutTempFile)
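Stripped of the iText calls, the pattern is read everything, edit in memory, overwrite the same path. A minimal JDK-only sketch of it follows; RewriteInPlace and rewrite are hypothetical names, and the edit function is a stand-in for the stamping step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.UnaryOperator;

class RewriteInPlace {

    // Loads the whole file into memory first, so nothing still reads from the
    // file when the same path is opened for writing.
    static void rewrite(Path file, UnaryOperator<byte[]> edit) throws IOException {
        byte[] original = Files.readAllBytes(file); // the file is fully read before any write
        byte[] edited = edit.apply(original);       // hypothetical stand-in for the stamping step
        Files.write(file, edited);                  // truncates and overwrites the same path
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("single", ".bin");
        Files.write(file, new byte[] {1, 2, 3});
        rewrite(file, bytes -> new byte[] {bytes[0], 42});
        System.out.println(Files.size(file));
        Files.delete(file);
    }
}
```

The trade-off is that the original is truncated as soon as writing starts, so a failure partway through leaves no intact copy on disk; writing to a second file first avoids that at the cost of the temporary file.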

Extract pdf page and insert into existing pdf

I have the iText code below. I want to copy one page from a source PDF file into another PDF file (I have an existing PdfStamper, here mainPdfStamper).
PdfReader srcReader = new PdfReader(new FileInputStream("source.pdf"));
File file = File.createTempFile("temporary", ".pdf");
PdfStamper pdfStamper = new PdfStamper(srcReader, new FileOutputStream(file));
PdfImportedPage importedPage = pdfStamper.getImportedPage(srcReader, 1);
// copying extracted page from src pdf to existing pdf
mainPdfStamper.getOverContent(1).addTemplate(importedPage, 10,10);
pdfStamper.close();
srcReader.close();
This is not working, and I don't know how to achieve it. In short, I want to copy one page from the source PDF into an existing PDF. Please help.
UPDATE
The code below worked, following the answer from Bruno.
PdfReader reader2 = new PdfReader(srcPdf.getAbsolutePath());
PdfImportedPage page = pdfStamper.getImportedPage(reader2, 1);
pdfStamper.insertPage(1, reader2.getPageSize(1));
pdfStamper.getUnderContent(1).addTemplate(page, 100, 100);
// Close the stamper and the readers
pdfStamper.close();
reader2.close();
Please read the documentation, for instance chapter 6 of iText in Action. If you go to section 6.3.4 ("Inserting pages into an existing document"), you'll find the InsertPages example.
If p is the page number at which you want to insert the page, main_file is the path to your main file, to_be_inserted is the path to the file that needs to be inserted, and dest is the path to the resulting file, then this code is all you need:
PdfReader reader = new PdfReader(main_file);
PdfReader reader2 = new PdfReader(to_be_inserted);
// Create a stamper
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
// Create an imported page to be inserted
PdfImportedPage page = stamper.getImportedPage(reader2, 1);
stamper.insertPage(p, reader2.getPageSize(1));
stamper.getUnderContent(p).addTemplate(page, 0, 0);
// Close the stamper and the readers
stamper.close();
reader.close();
reader2.close();
This is only one way to combine pages from two files. You can also use PdfCopy for this purpose. The advantage of using PdfCopy is that you'll preserve the interactive features of the inserted page. When using PdfStamper, you'll lose any interactive features (e.g. all links) that were present in the inserted page.
