I have a question. I added a digital paging seal to a multi-page PDF. Every page shows the same seal: I add the digital signature once on the first page, and the other pages only need to reference the appearance of that first seal. But when I open the generated document in Adobe Acrobat DC, there is an extra "123" signature. What causes it?
I wrote the following code based on this answer and it helped me a lot.
addAp(doc, doc.getPage(0), rect, signature, xz[0]);
for (int i = 0; i < doc.getNumberOfPages() - 1; i++) {
addAp(doc, doc.getPage(1), lerect, signature, xz[1]);
for (int i = 1; i < doc.getNumberOfPages(); i++) {
void addAp(PDDocument pdDocument, PDPage pdPage, PDRectangle rectangle, PDSignature signature, BufferedImage signatureImage) throws IOException {
PDAcroForm acroForm = pdDocument.getDocumentCatalog().getAcroForm();
List<PDField> acroFormFields = acroForm.getFields();
PDSignatureField signatureField = new PDSignatureField(acroForm);
PDAnnotationWidget widget = signatureField.getWidgets().get(0);
// from PDVisualSigBuilder.createHolderForm()
PDStream stream = new PDStream(pdDocument);
PDFormXObject form = new PDFormXObject(stream);
PDResources res = new PDResources();
PDRectangle bbox = new PDRectangle(rectangle.getWidth(), rectangle.getHeight());
// from PDVisualSigBuilder.createAppearanceDictionary()
PDAppearanceDictionary appearance = new PDAppearanceDictionary();
PDAppearanceStream appearanceStream = new PDAppearanceStream(form.getCOSObject());
ByteArrayOutputStream bao = new ByteArrayOutputStream();
ImageIO.write(signatureImage, "png", bao);
byte[] imageByte = bao.toByteArray();
PDImageXObject pdImage = PDImageXObject.createFromByteArray(pdDocument, imageByte, null);
try (PDPageContentStream cs = new PDPageContentStream(pdDocument, appearanceStream)) {
PDExtendedGraphicsState r0 = new PDExtendedGraphicsState();
cs.addComment("This is a comment");
cs.drawImage(pdImage, 0, 0, rectangle.getWidth(), rectangle.getHeight());
void addAnnots(PDPage pdPage) throws IOException {
COSDictionary pageTreeObject = pdPage.getCOSObject();
while (pageTreeObject != null) {
pageTreeObject = (COSDictionary) pageTreeObject.getDictionaryObject(COSName.PARENT);
The PDFBox version is 2.0.20.
After modification:
ArrayList<PDAnnotationWidget> listWidget = addAp1(doc, signature);
addAp2(doc, doc.getPage(0), rect, xz[0], listWidget.get(0));
for (int i = 0; i < doc.getNumberOfPages() - 1; i++) {
addAp2(doc, doc.getPage(1), lerect, xz[1], listWidget.get(1));
for (int i = 1; i < doc.getNumberOfPages(); i++) {
ArrayList<PDAnnotationWidget> addAp1(PDDocument pdDocument, PDSignature signature) throws IOException {
ArrayList<PDAnnotationWidget> widgetList = new ArrayList<>();
PDAcroForm acroForm = pdDocument.getDocumentCatalog().getAcroForm();
List<PDField> acroFormFields = acroForm.getFields();
PDAnnotationWidget widget1 = new PDAnnotationWidget();
PDAnnotationWidget widget2 = new PDAnnotationWidget();
PDSignatureField signatureField = new PDSignatureField(acroForm);
return widgetList;
void addAp2(PDDocument pdDocument, PDPage pdPage, PDRectangle rectangle, BufferedImage signatureImage, PDAnnotationWidget widget) throws IOException {
Adobe Acrobat DC:
You add 2 signature fields to the document.
You call addAp twice. Each time that method creates a PDSignatureField, and in the loop immediately after the addAp call the single widget of that field is added to the pages. Thus, both signature fields are reachable in the resulting PDF.
The two signature fields share the signature value.
addAp sets the value of both signature fields to the same signature value. When eventually the signature bytes are written into this value, both signature fields become signed.
Only the second signature field is in the PDF form definition.
addAp removes any field from the PDF form definition before adding the newly generated one. In the end, therefore, the PDF form definition only contains the signature field from the last addAp call.
Adobe Acrobat opens the file...
Adobe Acrobat automatically only validates the signature field in the PDF form definition. But as soon as it displays the widget of the other signature field, it also displays it on the signature panel. As it wasn't there from the start, though, it is displayed as not-yet-validated.
By not clearing the PDF form definition field list in the second addAp call, you should get two automatically validated signature fields in the signature panel.
Alternatively, by creating only a single form field with two widget annotations, you should get only a single signature field in the signature panel.
As a warning: you reference the same widget annotation from multiple pages. Strictly speaking, this is forbidden by the PDF specification. Thus, any validator may warn about this issue and, as the issue occurs in the context of a signature, express doubts about the validity of that signature.
My goal is to transfer textual content from a PDF to a new PDF while preserving the font formatting (e.g. bold, italic, underlined).
I am trying to use the TextPosition list from the existing PDF and write a new PDF from it.
For this, I take the font and font size of the current entry from the TextPosition list and set them on a contentStream, so that the upcoming text is written with them through contentStream.showText().
After 137 successful loop iterations, this error is thrown:
Exception in thread "main" java.lang.IllegalArgumentException: No glyph for U+00AD in font VVHOEY+FrutigerLT-BoldCn
at org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:357)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:333)
at org.apache.pdfbox.pdmodel.PDPageContentStream.showTextInternal(PDPageContentStream.java:514)
at org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:476)
at haupt.PageTest.printPdf(PageTest.java:294)
at haupt.MyTestPDF.main(MyTestPDF.java:54)
This is my code up to this step:
public void printPdf() throws IOException {
TextPosition tpInfo = null;
String pdfFileInText = null;
int charIDindex = 0;
int pageIndex = 0;
try (PDDocument pdfDocument = PDDocument.load(new File(srcFile))) {
if (!pdfDocument.isEncrypted()) {
MyPdfTextStripper myStripper = new MyPdfTextStripper();
var articlesByPage = myStripper.getCharactersByArticleByPage(pdfDocument);
String newFileString = (srcErledigt + "Test.pdf");
File input = new File(newFileString);
PDDocument document = new PDDocument();
// For Pages
for (Iterator<List<List<TextPosition>>> pageIterator = articlesByPage.iterator(); pageIterator.hasNext();) {
List<List<TextPosition>> pageList = pageIterator.next();
PDPage newPage = new PDPage();
PDPageContentStream contentStream = new PDPageContentStream(document, newPage);
// For Articles
for (Iterator<List<TextPosition>> articleIterator = pageList.iterator(); articleIterator.hasNext();) {
List<TextPosition> articleList = articleIterator.next();
// For Text
for (Iterator<TextPosition> tpIterator = articleList.iterator(); tpIterator.hasNext();) {
tpCharID = charIDindex;
tpInfo = tpIterator.next();
System.out.println(tpCharID + ". charID: " + tpInfo);
PDFont tpFont = tpInfo.getFont();
float tpFontSize = tpInfo.getFontSize();
pdfFileInText = tpInfo.toString();
contentStream.setFont(tpFont, tpFontSize);
contentStream.newLineAtOffset(50, 700);
} else {
System.out.println("pdf Encrypted");
public class MyPdfTextStripper extends PDFTextStripper {
public MyPdfTextStripper() throws IOException {
public List<List<TextPosition>> getCharactersByArticle() {
return super.getCharactersByArticle();
// Add Pages to CharactersByArticle List
public List<List<List<TextPosition>>> getCharactersByArticleByPage(PDDocument doc) throws IOException {
final int maxPageNr = doc.getNumberOfPages();
List<List<List<TextPosition>>> byPageList = new ArrayList<>(maxPageNr);
for (int pageNr = 1; pageNr <= maxPageNr; pageNr++) {
return byPageList;
Additional Info:
There are seven fonts in my document, all of which are set as subsets.
I need to write the Text given with the corresponding Font given.
All glyphs that should be written already exist in the original document, where I get my TextPositionList from.
All fonts are of subtype Type 1 or Type 0
There is no AcroForm defined
Thanks in advance
Edit 30.08.2022:
Fixed the issue by manually replacing this particular Unicode character (U+00AD) in the string with a placeholder before trying to write it.
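The replacement described above could look roughly like the following in plain Java. This is only a minimal sketch: the class name `GlyphSanitizer`, the method `replaceSoftHyphen`, and the choice of `-` as the placeholder are assumptions made for illustration, not the code actually used, and the placeholder has to be a character for which the embedded subset font really contains a glyph.

```java
/** Hypothetical helper; the class and method names are illustrative and not part of PDFBox. */
final class GlyphSanitizer {

    /** U+00AD SOFT HYPHEN, the code point reported in the exception above. */
    static final char SOFT_HYPHEN = '\u00AD';

    /**
     * Returns the text with every soft hyphen replaced by the given placeholder.
     * The placeholder must be a character the target font has a glyph for.
     */
    static String replaceSoftHyphen(String text, char placeholder) {
        return text.replace(SOFT_HYPHEN, placeholder);
    }

    public static void main(String[] args) {
        String raw = "Fach\u00ADwerk";
        // The sanitized string is what would then be handed to contentStream.showText(...).
        System.out.println(replaceSoftHyphen(raw, '-'));
    }
}
```

The sanitizing step stays separate from the PDFBox calls, so it can be applied to each string right before it is written.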
Now I ran into this open ToDo:
public byte[] encode(int unicode)
// todo: we can use a known character collection CMap for a CIDFont
// and an Encoding for Type 1-equivalent
throw new UnsupportedOperationException();
Does anyone have suggestions or workarounds for this?
Edit 01.09.2022
I tried to replace occurrences of that font with an alternative font from the source file, but this leads to another problem: a COSStream is "randomly" closed, so the new document cannot be saved after I have written my text with a contentStream.
Using standard fonts like PDType1Font.HELVETICA instead works, though.
I am using iText to parse text in a PDF document, and I am using PdfContentStreamProcessor with a RenderListener, for example:
PdfReader reader = new PdfReader(file.toURI().toURL());
int numberOfPages = reader.getNumberOfPages();
MyRenderListener listener = new MyRenderListener ();
PdfContentStreamProcessor processor = new PdfContentStreamProcessor(listener);
for (int pageNumber = 1; pageNumber <= numberOfPages; pageNumber++) {
PdfDictionary pageDic = reader.getPageN(pageNumber);
PdfDictionary resourcesDic = pageDic.getAsDict(PdfName.RESOURCES);
Rectangle pageSize = reader.getPageSize(pageNumber);
listener.startPage(pageNumber, pageSize);
processor.processContent(ContentByteUtils.getContentBytesForPage(reader, pageNumber), resourcesDic);
I have no problem getting the text with the renderText(TextRenderInfo) method, but how do I parse the graphic content apart from images? For example, in my case I would like to get:
Text content which is in a box
Horizontal lines
Per mkl's comment, by using ExtRenderListener I am able to get the geometries. I used "How to extract the color of a rectangle in a PDF, with iText" for reference.
I want to use iText to add a check box to a PDF file, and here is my code:
public static void testPdf() throws IOException {
String src = "/Users/heartisan/Downloads/xx.pdf";
String dest = "/Users/heartisan/Downloads/yy.pdf";
PdfDocument pdf = new PdfDocument(new PdfReader(src), new PdfWriter(dest));
PdfAcroForm form = PdfAcroForm.getAcroForm(pdf, true);
Document document = new Document(pdf);
for (int i = 0; i < 3; i++) {
PdfButtonFormField checkField = PdfFormField.createCheckBox(pdf, new Rectangle(369 + i * 69, 751, 15, 15),
"experience".concat(String.valueOf(i+1)), "Off", PdfFormField.TYPE_CHECK);
// checkField.getWidgets().get(0).setBorderStyle(PdfAnnotation.STYLE_SOLID);
form.addField(checkField, pdf.getPage(1));
Then here is the result:
As the code above shows, I set the border color and width, but it just does not work. When I do the same with Adobe Acrobat, it works:
Then I debugged the two files' fields, and I found:
As marked, both the color and the width values are gone. Both values were still there just before I called document.close(), and I don't know why.
Can anyone help me?
Does the result look as expected if you open your resulting PDF in Google Chrome or PDF Studio 2020? I get the same result in Acrobat, so I think it's not an iText issue. By the way, if you click on your check boxes in Acrobat, everything looks as expected.
I compare two PDF files and mark the differences on them with highlights.
When I use PDFBox to merge them for comparison, the highlights are missing in the result.
I am using this function:
It merges the two PDF files by placing all of their pages side by side.
void generateSideBySidePDF() {
File pdf1File = new File(FILE1_PATH);
File pdf2File = new File(FILE2_PATH);
File outPdfFile = new File(OUTFILE_PATH);
PDDocument pdf1 = null;
PDDocument pdf2 = null;
PDDocument outPdf = null;
try {
pdf1 = PDDocument.load(pdf1File);
pdf2 = PDDocument.load(pdf2File);
outPdf = new PDDocument();
for(int pageNum = 0; pageNum < pdf1.getNumberOfPages(); pageNum++) {
// Create output PDF frame
PDRectangle pdf1Frame = pdf1.getPage(pageNum).getCropBox();
PDRectangle pdf2Frame = pdf2.getPage(pageNum).getCropBox();
PDRectangle outPdfFrame = new PDRectangle(pdf1Frame.getWidth()+pdf2Frame.getWidth(), Math.max(pdf1Frame.getHeight(), pdf2Frame.getHeight()));
// Create output page with calculated frame and add it to the document
COSDictionary dict = new COSDictionary();
dict.setItem(COSName.TYPE, COSName.PAGE);
dict.setItem(COSName.MEDIA_BOX, outPdfFrame);
dict.setItem(COSName.CROP_BOX, outPdfFrame);
dict.setItem(COSName.ART_BOX, outPdfFrame);
PDPage outPdfPage = new PDPage(dict);
// Source PDF pages has to be imported as form XObjects to be able to insert them at a specific point in the output page
LayerUtility layerUtility = new LayerUtility(outPdf);
PDFormXObject formPdf1 = layerUtility.importPageAsForm(pdf1, pageNum);
PDFormXObject formPdf2 = layerUtility.importPageAsForm(pdf2, pageNum);
// Add form objects to output page
AffineTransform afLeft = new AffineTransform();
layerUtility.appendFormAsLayer(outPdfPage, formPdf1, afLeft, "left" + pageNum);
AffineTransform afRight = AffineTransform.getTranslateInstance(pdf1Frame.getWidth(), 0.0);
layerUtility.appendFormAsLayer(outPdfPage, formPdf2, afRight, "right" + pageNum);
} catch (IOException e) {
} finally {
try {
if (pdf1 != null) pdf1.close();
if (pdf2 != null) pdf2.close();
if (outPdf != null) outPdf.close();
} catch (IOException e) {
Insert this into your code after the "Source PDF pages has to be imported" segment to copy the annotations. The ones of the right PDF must have their rectangle moved.
// copy annotations
PDPage src1Page = pdf1.getPage(pageNum);
PDPage src2Page = pdf2.getPage(pageNum);
for (PDAnnotation ann : src1Page.getAnnotations())
for (PDAnnotation ann : src2Page.getAnnotations())
PDRectangle rect = ann.getRectangle();
ann.setRectangle(new PDRectangle(rect.getLowerLeftX() + pdf1Frame.getWidth(), rect.getLowerLeftY(), rect.getWidth(), rect.getHeight()));
Note that this code has a flaw: it works only for annotations with an appearance stream (most have one). For annotations without one it will have weird effects; in that case, one would have to adjust the coordinates depending on the annotation type. For highlights that would be the quadpoints, for lines the line coordinates, and so on.
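To illustrate the quadpoints adjustment mentioned above, here is a minimal sketch in plain Java. The class `QuadPointShifter` and its method `shiftX` are hypothetical names chosen for this example. It assumes the usual layout of a QuadPoints array (x1 y1 x2 y2 ..., eight numbers per quadrilateral) and that the annotation only has to be moved horizontally by the width of the left page.

```java
import java.util.Arrays;

/** Hypothetical helper; not part of PDFBox. */
final class QuadPointShifter {

    /**
     * Returns a copy of the quadpoints array with dx added to every x coordinate.
     * Quadpoints are stored as x1 y1 x2 y2 ..., so the x values sit at even indices.
     */
    static float[] shiftX(float[] quadPoints, float dx) {
        float[] shifted = quadPoints.clone();
        for (int i = 0; i < shifted.length; i += 2) {
            shifted[i] += dx;
        }
        return shifted;
    }

    public static void main(String[] args) {
        float[] quad = {10f, 20f, 110f, 20f, 10f, 5f, 110f, 5f};
        // 595 stands in for pdf1Frame.getWidth(); the value is only an example.
        System.out.println(Arrays.toString(shiftX(quad, 595f)));
    }
}
```

In PDFBox 2.x the array would presumably come from `PDAnnotationTextMarkup.getQuadPoints()` and be written back with `setQuadPoints(float[])`, in addition to moving the annotation rectangle.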
I created code that adds an image to an existing pdf document and then signs it, all using PDFBox (see code below).
The code nicely adds the image and the signature. However, in some documents, Acrobat Reader complains that "The signature byte range is invalid."
The problem seems to be the same as the problem described in this question. The answer to that question describes the problem in more detail: the problem is that my code leaves a mix of cross reference types in the document (streams and tables). Indeed, some documents won't even open because of the problems that this creates.
My question is: how do I prevent this? How do I add an image to an existing pdf document without creating multiple cross reference types?
public class TC3 implements SignatureInterface{
private char[] pin = "123456".toCharArray();
private BouncyCastleProvider provider = new BouncyCastleProvider();
private PrivateKey privKey;
private Certificate[] cert;
public TC3() throws Exception{
KeyStore keystore = KeyStore.getInstance("PKCS12", provider);
keystore.load(new FileInputStream(new File("resources/IIS_keystore.pfx")), pin.clone());
String alias = keystore.aliases().nextElement();
privKey = (PrivateKey) keystore.getKey(alias, pin);
cert = keystore.getCertificateChain(alias);
public void doSign() throws Exception{
byte inputBytes[] = IOUtils.toByteArray(new FileInputStream("resources/rooster.pdf"));
PDDocument pdDocument = PDDocument.load(new ByteArrayInputStream(inputBytes));
PDJpeg ximage = new PDJpeg(pdDocument, ImageIO.read(new File("resources/logo.jpg")));
PDPage page = (PDPage)pdDocument.getDocumentCatalog().getAllPages().get(0);
PDPageContentStream contentStream = new PDPageContentStream(pdDocument, page, true, true);
contentStream.drawXObject(ximage, 50, 50, 356, 40);
ByteArrayOutputStream os = new ByteArrayOutputStream();
inputBytes = os.toByteArray();
pdDocument = PDDocument.load(new ByteArrayInputStream(inputBytes));
PDSignature signature = new PDSignature();
signature.setName("signer name");
signature.setLocation("signer location");
signature.setReason("reason for signature");
pdDocument.addSignature(signature, this);
File outputDocument = new File("resources/signed.pdf");
ByteArrayInputStream fis = new ByteArrayInputStream(inputBytes);
FileOutputStream fos = new FileOutputStream(outputDocument);
byte[] buffer = new byte[8 * 1024];
int c;
while ((c = fis.read(buffer)) != -1)
fos.write(buffer, 0, c);
FileInputStream is = new FileInputStream(outputDocument);
pdDocument.saveIncremental(is, fos);
public byte[] sign(InputStream content) {
CMSProcessableInputStream input = new CMSProcessableInputStream(content);
CMSSignedDataGenerator gen = new CMSSignedDataGenerator();
List<Certificate> certList = Arrays.asList(cert);
CertStore certStore = null;
certStore = CertStore.getInstance("Collection", new CollectionCertStoreParameters(certList), provider);
gen.addSigner(privKey, (X509Certificate) certList.get(0), CMSSignedGenerator.DIGEST_SHA256);
CMSSignedData signedData = gen.generate(input, false, provider);
return signedData.getEncoded();
}catch (Exception e){}
return null;
public static void main(String[] args) throws Exception {
new TC3().doSign();
The issue
As already explained in this answer, the issue at work here is that
when non-incrementally storing the document with the added image, PDFBox 1.8.9 does so using a cross reference table no matter if the original file used a table or stream; if the original file used a stream, the cross reference stream dictionary entries are copied into the trailer dictionary;
0000033667 00000 n
0000033731 00000 n
/DecodeParms <<
/Columns 4
/Predictor 12
/Filter /FlateDecode
/ID [<5BD95916CAE5E84E9D964396022CBDCD> <6420B4547602C943AF37DD6C77496BE8>]
/Info 6 0 R
/Length 61
/Root 1 0 R
/Size 35
/Type /XRef
/W [1 2 1]
/Index [20 22]
(Most of these trailer entries here are useless or even misleading, see below.)
when incrementally saving the signature, COSWriter.doWriteXRefInc uses COSDocument.isXRefStream to determine whether the existing document (the one we stored as above) uses a cross reference stream. As mentioned above, it does not. Unfortunately, though, COSDocument.isXRefStream in PDFBox 1.8.9 is implemented as
public boolean isXRefStream()
if (trailer != null)
return COSName.XREF.equals(trailer.getItem(COSName.TYPE));
return false;
Thus, the misleading trailer entry Type shown above makes PDFBox think it has to use a cross reference stream.
The result is a document whose initial revision ends with a cross reference table and weird trailer entries and whose second revision ends with a cross reference stream. This is not valid.
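To check whether a given file shows this mix, one could inspect what each `startxref` offset points at. The following is only a rough diagnostic sketch in plain Java, not part of PDFBox: the class name `XrefKindScanner` is hypothetical, and it assumes that the offsets are correct, that the file starts at byte 0 with the PDF header, and that the string `startxref` does not occur inside stream data. Hybrid-reference files are not treated specially.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic helper; reports the cross reference kind of each revision. */
final class XrefKindScanner {

    /** Returns "table", "stream" or "invalid" for every startxref entry, in file order. */
    static List<String> xrefKinds(byte[] pdf) {
        // ISO-8859-1 maps every byte to exactly one char, so string indices equal byte offsets.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        List<String> kinds = new ArrayList<>();
        int pos = text.indexOf("startxref");
        while (pos >= 0) {
            int i = pos + "startxref".length();
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
            int start = i;
            while (i < text.length() && Character.isDigit(text.charAt(i))) i++;
            if (i > start) {
                long offset = Long.parseLong(text.substring(start, i));
                if (offset >= text.length()) {
                    kinds.add("invalid");
                } else {
                    // A table starts with the keyword "xref"; otherwise assume an xref stream object.
                    kinds.add(text.startsWith("xref", (int) offset) ? "table" : "stream");
                }
            }
            pos = text.indexOf("startxref", i);
        }
        return kinds;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 1) {
            System.out.println(xrefKinds(Files.readAllBytes(Paths.get(args[0]))));
        }
    }
}
```

A result containing both "table" and "stream" for the same file would point to the mix of cross reference types described above.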
A work-around
Fortunately, though, understanding how the issue arises suggests a work-around: remove the troublesome trailer entry, e.g. like this:
inputBytes = os.toByteArray();
pdDocument = PDDocument.load(new ByteArrayInputStream(inputBytes));
pdDocument.getDocument().getTrailer().removeItem(COSName.TYPE); // <<<<<<<<<< Remove misleading entry <<<<<<<<<<
With this work-around both revisions in the signed document use cross reference tables and the signature is valid.
Beware: if upcoming PDFBox versions start to save documents that were loaded from sources with cross reference streams using xref streams, too, the work-around must be removed again.
I would assume, though, that won't happen in the 1.x.x versions to come, and version 2.0.0 will introduce a fundamentally changed API, so the original code won't work out-of-the-box then anyhow.
Other ideas
I tried other ways, too, to circumvent this problem, trying to
store the first manipulation as incremental update, too, or
add the image during the same incremental update as the signature,
cf. SignLikeUnOriginalToo.java, but failed. PDFBox 1.8.9 incremental updates only seem to properly work for adding signatures.
Other ideas revisited
After looking into the creation of additional revisions using PDFBox some more, I tried the other ideas again and now succeeded!
The crucial part is to mark the added and changed objects as updated, including a path from the document catalog.
Applying the first idea (adding the image as an explicit intermediate revision) amounts to this change in doSign:
FileOutputStream fos = new FileOutputStream(intermediateDocument);
FileInputStream fis = new FileInputStream(intermediateDocument);
byte inputBytes[] = IOUtils.toByteArray(inputStream);
PDDocument pdDocument = PDDocument.load(new ByteArrayInputStream(inputBytes));
PDJpeg ximage = new PDJpeg(pdDocument, ImageIO.read(logoStream));
PDPage page = (PDPage) pdDocument.getDocumentCatalog().getAllPages().get(0);
PDPageContentStream contentStream = new PDPageContentStream(pdDocument, page, true, true);
contentStream.drawXObject(ximage, 50, 50, 356, 40);
pdDocument.saveIncremental(fis, fos);
pdDocument = PDDocument.load(intermediateDocument);
PDSignature signature = new PDSignature();
(as in SignLikeUnOriginalToo.java method doSignTwoRevisions)
Applying the second idea (adding the image as part of the signing revision) amounts to this change in doSign:
byte inputBytes[] = IOUtils.toByteArray(inputStream);
PDDocument pdDocument = PDDocument.load(new ByteArrayInputStream(inputBytes));
PDJpeg ximage = new PDJpeg(pdDocument, ImageIO.read(logoStream));
PDPage page = (PDPage) pdDocument.getDocumentCatalog().getAllPages().get(0);
PDPageContentStream contentStream = new PDPageContentStream(pdDocument, page, true, true);
contentStream.drawXObject(ximage, 50, 50, 356, 40);
PDSignature signature = new PDSignature();
(as in SignLikeUnOriginalToo.java method doSignOneStep)
Both variants are clearly preferable to the original approach.