Do you know of a way (e.g. a library) to generate signature lines in a Microsoft Word document from Java, as described for example on Microsoft's page, and later use them to sign the document multiple times? Each signature can be added at a different point in time.
The crucial functionality I need to have is that a signer's name is added when a document is signed. In MS Word at the beginning you can define a number of 'signature lines' and then each person must right click on one signature box and click sign. The signer's name is filled in without breaking previous signatures.
I know how to sign documents in Java (usually external XAdES signatures). I was also able to add multiple signatures at the same time using Apache POI as described here:
How to programatically sign an MS office XML document with Java? but this does not work when the document is changed (to update a signer's name). Maybe this functionality is also available in Apache POI?
In the attached screenshot you can see (unfortunately in Polish; PODPIS/PODPISY == SIGNATURE/SIGNATURES) what can be generated in Microsoft Word. The signers' names next to the 'X' are updated when each person signs the document.
Thanks for any help.
I am working on a project that needs multiple users of the application to sign a document with DocuSign.
I was working with the Java quickstart project and tried to retrieve a document after signing, using the download document example (the 7th one on the page) in the quickstart.
The document was sent via the bulk signature example and I signed it through two of my e-mail accounts. I hardcoded the document ID and the envelope ID in the code, as the quickstart by default only lets you view a document created through example 2. The document does get downloaded, but I cannot see the signatures on it.
Secondly, when I signed the document from the two accounts, I signed in the same place from both to see how that would be handled.
Lastly, I see the option for the signer to place the signature widget only in the bulk signature example. Can this be enabled for any signature request? (This may be somewhere in the documentation, but I could not find it.)
In general, there is very little clarity on how multiple signatures are handled in DocuSign: how the widget is placed, who can place it, how the signatures are finally visible when the document is completed, and, when signing happens in some order, whether succeeding signers see the signatures of preceding ones. I will greatly appreciate any help on this.
Regards,
ANur
Signatures are placed in documents using Tabs. In the Java Quickstart there are a few helper functions that add the tabs to the documents. You can find those functions in this file. When you're dealing with multiple signers you need to make sure that the tabs are associated with the correct recipients. If you download the document using example 7 before the document has been signed it won't have the signature fields on it but after it has been signed you should see the signatures there. I don't think DocuSign will prevent the two signers from placing their signatures in the same location but you can prevent that by specifying the locations of the SignHere tabs so that they do not overlap.
For more on placing tabs in documents you can check out this blog post.
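To illustrate the point about non-overlapping SignHere tabs, here is a minimal sketch in plain Java that computes a distinct position per recipient. It does not call the DocuSign SDK. The class TabLayoutSketch, the type TabPosition and the method stackVertically are hypothetical names made up for this example, and the sketch assumes that y coordinates grow downward on the page. The resulting values would still have to be copied into each recipient's SignHere tab.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TabLayoutSketch {

    /** Hypothetical value type for illustration; not a DocuSign SDK class. */
    static final class TabPosition {
        final String recipientId;
        final int x;
        final int y;

        TabPosition(String recipientId, int x, int y) {
            this.recipientId = recipientId;
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Assigns each recipient its own position by stacking the tabs
     * vertically, so that no two tabs share the same area.
     */
    static List<TabPosition> stackVertically(List<String> recipientIds,
                                             int x, int startY,
                                             int tabHeight, int gap) {
        List<TabPosition> positions = new ArrayList<>();
        int y = startY;
        for (String recipientId : recipientIds) {
            positions.add(new TabPosition(recipientId, x, y));
            y += tabHeight + gap; // advance by one tab height plus spacing
        }
        return positions;
    }

    public static void main(String[] args) {
        List<TabPosition> positions =
                stackVertically(Arrays.asList("1", "2"), 100, 600, 40, 20);
        for (TabPosition p : positions) {
            System.out.println("recipient " + p.recipientId
                    + " -> x=" + p.x + ", y=" + p.y);
        }
    }
}
```

Keeping the layout computation separate from the API calls makes it easy to check that every recipient gets a tab of their own before the envelope is sent.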
I am looking for a way to handle multiple signatures on a document, so I have a couple of questions about what I can and cannot do.
First, since multiple signatures from different people can be added to the document, the position of the signatures is important for aesthetics and for printing the document if needed. Having said this, I would like to know an approach to handle it. What I was thinking of was appending an additional page at the end of the document and assigning it some kind of identifier like "doc_signatures", so that when the second person opens the document for signature, the software detects that a "doc_signatures" page already exists, just adds the signature, and saves the document using the incremental save option in PDFBox. Is this a good approach? If it is, is there a way to identify the "doc_signatures" page so I don't append it again?
Also, can I add signature fields to that "doc_signatures" page, each with its own position, so that when I open the PDF, I detect that "doc_signatures" already exists and that it already has a signature in "Field 1" (with its own X,Y coordinates), and then place the second signature in "Field 2" and the third in "Field 3"? And can I set some kind of limit on the number of signatures in the document?
I would appreciate knowing whether this is an acceptable approach and, if it is not, whether there is any recommendation for accomplishing this. I would also appreciate any other approach or logic that can be implemented using PDFBox. Regards everyone.
As you ask this question in general (not PDFBox specific) terms, I'll start by answering similarly. PDFBox is versatile enough to implement the concepts in question.
First, since multiple signatures from different people can be added to the document, the position of the signatures is important for aesthetics and for printing the document if needed. Having said this, I would like to know an approach to handle it. What I was thinking of was appending an additional page at the end of the document and assigning it some kind of identifier like "doc_signatures", so that when the second person opens the document for signature, the software detects that a "doc_signatures" page already exists, just adds the signature, and saves the document using the incremental save option in PDFBox. Is this a good approach?
Whether this is a good approach or not depends on the nature of the documents to be signed and your influence on the pre-signing workflow of the document.
Paper documents often have dedicated positions for signatures of persons in a specific role. If you are buying something and acknowledge receipt as part of the sales contract, your signature clearly has to cover the receipt part too, while the signature of the vendor need not.
In digital PDF signatures you can alternatively make this clear by means of the Reason entry of the signature field value, but as you also want to print the documents, that might not suffice: In print there is no signature field value, only its appearance.
In such a situation the document to sign should already be prepared with empty signature fields, positioned appropriately in the document and named or otherwise flagged to signal the role of the person who is to sign. This, by the way, would also be the interoperable way: empty signature fields can easily be signed in, e.g., Adobe Reader.
If this is not possible, though, and if the software for signing the document has a GUI, this GUI might provide the capabilities for each signer to position his signature appropriately for his signing reason and role.
Otherwise your extra signature page approach would be the approach of choice.
If all signers have the same role, though, or if there at least is no special appropriate position for any of the signing roles, your extra page approach might not merely be a last resort. It even kind of looks like a document resulting from a notarial act.
If it is, is there a way to identify the "doc_signatures" page so I don't append it again?
For PDFs according to the current ISO 32000-1 norm, you could do this using a page-piece dictionary:
A page-piece dictionary may be used to hold private conforming product data. The data may be associated with a page or form XObject by means of the optional PieceInfo entry in the page object or form dictionary.
(section 14.5 of ISO 32000-1)
It looks like these piece dictionaries will be deprecated in the upcoming ISO 32000-2, though. Thus, a more future-proof approach would be for you to register a developer prefix and use your own key for that endeavor:
Developer prefixes shall be used to identify extensions to PDF that use First Class names (see below) and that are intended for public use.
(annex E of ISO 32000-1)
These custom keys do not appear to be deprecated in ISO 32000-2.
Also, can I add signature fields to that "doc_signatures" page, each with its own position, so that when I open the PDF, I detect that "doc_signatures" already exists and that it already has a signature in "Field 1" (with its own X,Y coordinates), and then place the second signature in "Field 2" and the third in "Field 3"? And can I set some kind of limit on the number of signatures in the document?
You can easily inspect the annotations on your extra page and especially determine their location and extent. Consequently you can arrange additional signatures to your liking on an individual basis. Alternatively you can prepare a fixed number of empty signature fields on that extra page when you create it, arranging the signatures to your liking in one go.
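The slot logic described above can be sketched independently of PDFBox. The following is only an illustration in plain Java: Box, slots and firstFreeSlot are hypothetical names, the rectangles are assumed to be in PDF user space (origin at the lower-left corner), and the occupied boxes stand in for the rectangles of signature annotations you would read from the extra page. A return value of -1 doubles as the limit on the number of signatures.

```java
import java.util.ArrayList;
import java.util.List;

public class SignatureSlotSketch {

    /** Simple rectangle in PDF user space (origin at the lower-left corner). */
    static final class Box {
        final float x, y, width, height;

        Box(float x, float y, float width, float height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        /** True if the two rectangles share some area. */
        boolean intersects(Box o) {
            return x < o.x + o.width && o.x < x + width
                    && y < o.y + o.height && o.y < y + height;
        }
    }

    /** Lays out equally sized slots from top to bottom along the left margin. */
    static List<Box> slots(int count, float pageHeight, float margin,
                           float width, float height, float gap) {
        List<Box> result = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            float top = pageHeight - margin - i * (height + gap);
            result.add(new Box(margin, top - height, width, height));
        }
        return result;
    }

    /** Index of the first slot not overlapping any occupied box, or -1 if none is left. */
    static int firstFreeSlot(List<Box> slots, List<Box> occupied) {
        for (int i = 0; i < slots.size(); i++) {
            boolean free = true;
            for (Box taken : occupied) {
                if (slots.get(i).intersects(taken)) {
                    free = false;
                    break;
                }
            }
            if (free) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Box> slots = slots(3, 842f, 50f, 200f, 60f, 20f);
        List<Box> occupied = new ArrayList<>();
        occupied.add(slots.get(0)); // pretend the first signer already signed
        System.out.println("next free slot: " + firstFreeSlot(slots, occupied));
    }
}
```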
All the above is only possible if the source document has not been signed before! If it has already been signed, adding a new page is usually considered a disallowed change to the document, effectively invalidating that first signature. For allowed and disallowed changes to signed documents, see this answer.
Let's say each document type has a specific number of signatures, for example a sales document with seller and buyer signatures. The approach would then be to add two signature fields to the document and place the signatures in those fields. Am I correct?
Exactly that is what I would propose: If you know the number and roles of the signers beforehand, prepare empty signature fields for them. In that case you do not even have to mark a signature page or something.
Sorry to bother you again: with PDFBox, will I be able to create signature fields and add signatures to those fields? Is there any example code for that?
Both are possible with PDFBox, but adding a signature to an existing empty signature field in particular may require some custom coding.
I programmatically fill in a form (AcroForm) in a PDF document and sign the document afterwards. I start with doc.pdf and create doc_filled.pdf using the setFields.java example of PDFBox. Then I sign doc_filled.pdf, creating doc_filled_signed.pdf, using some code based on the signature examples, and open the PDF in Acrobat Reader. The entered field data is visible, and the signature panel tells me
"There are errors in the formatting or information contained in this signature (The signature byte array is invalid)"
So far, I know that:
the signature code applied alone (i.e. directly creating some doc_signed.pdf) creates a valid signature
the problem exists for invisible signatures, for visible signatures, and for visible signatures added to existing signature fields.
the problem even occurs if I do not fill in the form but only open and save it, i.e.:
PDDocument doc = PDDocument.load(new File("doc.pdf"));
doc.save(new File("doc_filled.pdf"));
doc.close();
suffices to break the signing code applied afterwards.
On the other hand, if I take the same doc.pdf, enter the field's values manually in Adobe, the signing code produces valid signatures.
What am I doing wrong?
Update:
@mkl asked me to provide the files I am talking about (I currently do not have enough reputation to post all files as links, sorry for the inconvenience):
doc.pdf: https://www.dropbox.com/s/ev8x9q48w5l0hof/doc.pdf?dl=0
doc_filled.pdf: https://www.dropbox.com/s/fxn4gyneizs1zzb/doc_filled.pdf?dl=0
doc_filled_signed.pdf: https://www.dropbox.com/s/xm846sj8f9kiga9/doc_filled_signed.pdf?dl=0
doc_filled_and_signed.pdf: https://www.dropbox.com/s/5jftje6ke87jedr/doc_filled_and_signed.pdf?dl=0
The last one was created by filling in and signing the document in one go, using
doc.saveIncremental();
As I already wrote in the comment, some
setNeedToBeUpdate(true);
seems to be missing, though.
With reference to @mkl's second comment, I found this
SO question: Saved Text Field value is not displayed properly in PDF generated using PDFBOX, which also covers entered text not being shown. I gave it a first try, applying
setBoolean(COSName.getPDFName("NeedAppearances"), true);
to the field's and the form's dictionary, which then shows the field's content, but the signature does not get added in the end. I still have to look further into that.
Update:
The story continues here: PDFBox 1.8.10: Fill and Sign Document, Filling again fails
The cause of the OP's original problem, namely that a PDF loaded with PDFBox (for form fill-in) and then saved cannot be successfully signed using PDFBox signing code, has already been explained in detail in this answer. In short:
When saving documents regularly, PDFBox does so using a cross reference table.
If the document to save regularly had been loaded from a PDF with a cross reference stream, all entries of the cross reference stream dictionary are saved in the trailer dictionary.
When saving documents in the process of applying a signature, PDFBox creates an incremental update; as such incremental updates require that the update uses the same kind of cross reference as the original revision, PDFBox in this case tries to use the same technique.
To recognize the technique originally used, PDFBox looks at the Type entry of the dictionary in its document representation into which the trailer or cross reference stream dictionary was loaded: if there is a Type entry with value XRef (as specified for cross reference streams), a stream is assumed, otherwise a table.
Thus, in the case of the OP's original PDF doc.pdf which has a cross reference stream:
After loading and form fill-in the document is saved regularly, i.e. using a cross reference table, but all the former cross reference stream entries, among them the Type, are copied to the trailer. (doc_filled.pdf)
After loading this saved PDF with a cross reference table for signing, it is saved again using an incremental update. PDFBox assumes (due to the Type trailer entry) that the existing file has a cross reference stream and, therefore, uses a cross reference stream at the end of the incremental update, too. (doc_filled_signed.pdf)
Thus, in the end the filled-in, then signed PDF has two revisions, the inner one with a cross reference table, the outer one with a cross reference stream.
As this is not valid, Adobe Reader repairs it in its internal document representation upon loading the PDF. Repairing changes the document bytes, so in Adobe Reader's eyes the signature is broken.
Most other signature validators don't attempt such repairs but check the signature of the document as is. They validate the signature successfully.
The answer referenced above also offers some ways around this:
A: After loading the PDF for form fill-in, remove the Type entry from the trailer before saving regularly. If signing is applied to this file, PDFBox will assume a cross reference table (because the misleading Type entry is not there). Thus, the signature incremental update will be valid.
B: Use an incremental update for saving the form fill-in changes, too, either in a separate run or in the same run as signing. This also results in a valid incremental update.
Generally I would propose the latter option because the former option likely will break if the PDFBox saving routines ever are made compatible with each other.
Unfortunately, though, the latter option requires marking the added and changed objects as updated, including a path from the document catalog. If this is not possible or at least too cumbersome, the first option might be preferable.
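The detection rule and option A can be modelled without PDFBox to make the failure mode concrete. The following is a simplified sketch, not PDFBox code: plain maps stand in for the trailer dictionary, and looksLikeXrefStream and regularSave are hypothetical names that only mirror the behaviour described above.

```java
import java.util.HashMap;
import java.util.Map;

public class XrefDetectionSketch {

    /** Mirrors the rule described above: Type = XRef means "stream", anything else "table". */
    static boolean looksLikeXrefStream(Map<String, String> trailer) {
        return "XRef".equals(trailer.get("Type"));
    }

    /**
     * Models a regular save: a cross reference table is written, but all
     * entries of the loaded dictionary are copied into the new trailer.
     */
    static Map<String, String> regularSave(Map<String, String> loaded) {
        return new HashMap<>(loaded);
    }

    public static void main(String[] args) {
        Map<String, String> original = new HashMap<>();
        original.put("Type", "XRef"); // doc.pdf uses a cross reference stream
        original.put("Size", "42");   // arbitrary illustrative entry

        // After a regular save the file has a table, yet Type survives.
        Map<String, String> saved = regularSave(original);
        System.out.println(looksLikeXrefStream(saved)); // true: misdetected as a stream

        // Option A: drop the misleading entry before saving regularly.
        saved.remove("Type");
        System.out.println(looksLikeXrefStream(saved)); // false: table is assumed
    }
}
```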
In the case at hand the OP tried the latter option (doc_filled_and_signed.pdf):
At the moment the text box's content is only visible when the text box is selected (same behaviour in Acrobat Reader and Preview). I flag the PDField, all of its parents, the AcroForm, the Catalog, as well as the page where it is displayed.
He marked the changed field as updated but not the associated appearance stream which automatically is generated by PDFBox when setting the form field value.
Thus, in the result PDF file the field has the new value but the old, empty appearance stream. Only when clicking into the field, Adobe Reader creates a new appearance based on the value for editing.
Thus, the OP also has to mark the new normal appearance stream (the form field dictionary contains an entry AP referencing a dictionary in which N references the normal appearance stream). Alternatively (if finding the changed or added entries becomes too cumbersome) he might try the other option.
I'm wondering if it is possible, using iText (which I used for signing) or other tools in Java, to add biometric data to a PDF.
I'll explain better: while signing on a signature tablet, I collect signature information like pen pressure, signing speed and so on. I'd like to store this information (held in Java variables) together with the signature in the PDF, hidden and encrypted like the signature info itself.
Is there some kind of hidden data field in a PDF, or something else that can contain this kind of information? I think it is inappropriate to store it in metadata fields such as author etc.
There are different ways to add info to a PDF document.
You could add the data in a document-level attachment. That way, people can inspect the data by opening the attachment panel.
Storing it as metadata is fine too, but you're right about it being inappropriate to store that info in something like the author key.
As you may know, the /Info dictionary will be deprecated in PDF 2.0 in favor of using an XMP metadata stream. In this metadata stream, you can add custom XML data (see section 2.2.1 of the XMP specification - Part 3).
If you don't want to mix your biometric data with the document metadata, you can even define an XMP stream for any dictionary you want, probably including the signature dictionary. See section 14.3.2 of ISO-32000-1.
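As a rough illustration of custom XML data in an XMP stream, the sketch below builds a small XMP fragment with properties in a custom namespace, using only the JDK. The namespace URI, the bio prefix and the property names are hypothetical placeholders. The sketch omits the xpacket wrapper, assumes that keys are valid XML names and that values contain no markup characters, and does not encrypt anything: values that must stay confidential would have to be encrypted and encoded before being passed in. Attaching the resulting stream to a PDF dictionary is left to the PDF library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BiometricXmpSketch {

    // Hypothetical namespace URI, chosen for illustration only.
    static final String BIO_NS = "http://example.com/ns/biometrics/1.0/";

    /** Serializes simple name/value pairs as XMP properties in the custom namespace. */
    static String toXmp(Map<String, String> properties) {
        StringBuilder xml = new StringBuilder();
        xml.append("<x:xmpmeta xmlns:x=\"adobe:ns:meta/\">\n");
        xml.append(" <rdf:RDF xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n");
        xml.append("  <rdf:Description rdf:about=\"\" xmlns:bio=\"")
           .append(BIO_NS).append("\">\n");
        for (Map.Entry<String, String> e : properties.entrySet()) {
            // Assumes the key is a valid XML name and the value needs no escaping.
            xml.append("   <bio:").append(e.getKey()).append('>')
               .append(e.getValue())
               .append("</bio:").append(e.getKey()).append(">\n");
        }
        xml.append("  </rdf:Description>\n");
        xml.append(" </rdf:RDF>\n");
        xml.append("</x:xmpmeta>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        Map<String, String> samples = new LinkedHashMap<>();
        samples.put("maxPressure", "0.87");  // illustrative values
        samples.put("durationMs", "1530");
        System.out.print(toXmp(samples));
    }
}
```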
PS 1: I don't know who downvoted your question. I upvoted it, so you're back at 0.
PS 2: If you want to create future proof signatures, read http://itextpdf.com/book/digitalsignatures
PS 3: Signatures created with the 4-year-old version of iText usually aren't future-proof.
I have generated two PDFs using this example (FirstPDF), after removing the "new Date()" statement.
They look identical, but their MD5 hashes differ.
I examined them and both record a creation date, even though document.addCreationDate() is not called in the source code.
The question is very simple: is it possible in any way, with any API, to generate two PDFs that are exactly equal byte for byte?
This is how it SHOULD be. Apart from the date in the metadata, there's also a unique ID that is added every time a PDF is generated from scratch.
If you need two identical files giving you the same MD5 hash, why not copy one that's been created already?
If you need to create two identical files with two separate API calls, you can use any PDF-creating API that's worth its money, because each of these APIs contains a call to set the creation and modification date of the output PDF to any value you need. Just don't let this setting happen automatically! Use the same setting both times.
Attention! PDF also supports setting a document UUID. Some of these APIs also set an arbitrary UUID for each new document (which would break your MD5 hash) unless you actively prevent this from happening.
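The effect of the date and the ID on the hash can be demonstrated with the JDK alone. In this minimal sketch, fakePdf is a hypothetical helper that builds a simplified stand-in for a PDF (it is not a valid PDF file); it only shows that identical creation date and ID values give identical bytes and therefore identical MD5 hashes, while a difference of one second changes the hash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class PdfHashSketch {

    /** Returns the MD5 hash of the data as a lowercase hex string. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Simplified stand-in for generated PDF bytes; not a valid PDF. */
    static byte[] fakePdf(String creationDate, String fileId) {
        String body = "%PDF-1.4\n"
                + "/CreationDate (" + creationDate + ")\n"
                + "/ID [<" + fileId + "> <" + fileId + ">]\n"
                + "%%EOF";
        return body.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] first = fakePdf("D:20200101000000Z", "abc123");
        byte[] second = fakePdf("D:20200101000000Z", "abc123");
        byte[] later = fakePdf("D:20200101000001Z", "abc123");

        // Same date and same ID: the hashes match.
        System.out.println(md5Hex(first).equals(md5Hex(second)));
        // One second later: the hashes differ.
        System.out.println(md5Hex(first).equals(md5Hex(later)));
    }
}
```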
As described here, the files are not equal because they have different identifiers (two files created at different moments should have different IDs, as defined in the PDF specification).
The file identifier is usually a hash created based on the date, a path name, the size of the file, part of the content of the PDF file (e.g. the entries in the information dictionary).
File identifiers are involved (and mandatory) in document encryption. As a result, encrypted PDF files with different file identifiers will have streams that are completely different.
By design, you should never be able to create two identical PDFs using the same code.