Converting pdfDocument to byte[] stream - PDFBox Java

I am updating the values of an editable PDF using PDFBox. Saving the result to a file works fine, but instead of saving it I want to return the document as a byte[].
public static void main(String[] args) throws IOException
{
String formTemplate = "myFormPdf.pdf";
try (PDDocument pdfDocument = PDDocument.load(new File(formTemplate)))
{
PDAcroForm acroForm = pdfDocument.getDocumentCatalog().getAcroForm();
if (acroForm != null)
{
PDTextField field = (PDTextField) acroForm.getField( "sampleField" );
field.setValue("Text Entry");
}
pdfDocument.save("updatedPdf.pdf"); // instead of this I need STREAM
}
}
I tried SerializationUtils.serialize, but it fails to serialize the document:
Failed to serialize object of type: class org.apache.pdfbox.pdmodel.PDDocument

Use the overloaded save method that accepts an OutputStream, and pass it a ByteArrayOutputStream.
public static void main(String[] args) throws IOException
{
String formTemplate = "myFormPdf.pdf";
try (PDDocument pdfDocument = PDDocument.load(new File(formTemplate)))
{
PDAcroForm acroForm = pdfDocument.getDocumentCatalog().getAcroForm();
if (acroForm != null)
{
PDTextField field = (PDTextField) acroForm.getField( "sampleField" );
field.setValue("Text Entry");
}
ByteArrayOutputStream baos = new ByteArrayOutputStream();
pdfDocument.save(baos);
byte[] pdfBytes = baos.toByteArray(); // PDF Bytes
}
}
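The same pattern can be wrapped in a small helper. This is only a sketch using JDK classes: the StreamToBytes class, the StreamWriter interface and the toBytes method are hypothetical names, not part of PDFBox, and the demo writes a few placeholder bytes instead of a real PDF.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class StreamToBytes {

    /** Anything that can write itself to an OutputStream (hypothetical helper type). */
    @FunctionalInterface
    interface StreamWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Lets the writer fill an in-memory stream, then returns the collected bytes. */
    static byte[] toBytes(StreamWriter writer) throws IOException {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
            writer.writeTo(baos);
            return baos.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder content; a real caller would write the document here
        byte[] bytes = toBytes(out -> out.write("%PDF-1.7".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(bytes.length);
    }
}
```

With PDFBox on the classpath, the call would presumably look like toBytes(pdfDocument::save), since save(OutputStream) matches the shape of StreamWriter.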

Related

Unable to open pdf after creation

I'm writing a program that takes a template PDF with a bunch of blank form fields, makes a copy of it, fills in the fields, then flattens them.
One of these templates has so many fields that a single method filling all of them fails to compile with a "code too large" error, because of the size limit on methods.
To work around this, I closed the destination file and then tried to open it in another method where I could continue filling in the fields, but this results in the error "(The requested operation cannot be performed on a file with a user-mapped section open)".
I ended the first method by closing the PDF, so I'm not sure what the issue is. The program executes the first method and fills the fields, but throws the error when it gets to the second method. Sample code below.
public void E2fill(String srcE2, String destE2) throws IOException
{
try
{
PdfDocument pdf2 = new PdfDocument(new PdfReader(srcE2), new PdfWriter(destE2));
PdfAcroForm form2 = PdfAcroForm.getAcroForm(pdf2, true);
Map<String, PdfFormField> fields2 = form2.getFormFields();
PdfFormField field2;
fields2.get("fieldname1").setValue(stringname1);
//lots more field fills
pdf2.close();
}
catch(Exception x)
{
System.out.println(x.getMessage());
}
}
public void E2fill2(String destE2, String dest2E2) throws IOException
{
try
{
PdfDocument pdf2 = new PdfDocument(new PdfReader(destE2), new PdfWriter(dest2E2));
PdfAcroForm form2 = PdfAcroForm.getAcroForm(pdf2, true);
Map<String, PdfFormField> fields2 = form2.getFormFields();
PdfFormField field2;
fields2.get("fieldname546").setValue(stringname546);
//more field fills
form2.flattenFields();
pdf2.close();
}
catch(Exception x)
{
System.out.println(x.getMessage());
}
}
I suggest you try this:
public void fill(String srcE2, String destE2) throws IOException {
// Open the document once and share the field map between the helper methods
PdfDocument pdf2 = new PdfDocument(new PdfReader(srcE2), new PdfWriter(destE2));
PdfAcroForm form2 = PdfAcroForm.getAcroForm(pdf2, true);
Map<String, PdfFormField> fields2 = form2.getFormFields();
E2fill(fields2);
E2fill2(fields2);
// Flatten and close only after every field has been filled
form2.flattenFields();
pdf2.close();
}
public void E2fill(Map<String, PdfFormField> fields2)
{
fields2.get("fieldname1").setValue(stringname1);
//lots more field fills
}
public void E2fill2(Map<String, PdfFormField> fields2) {
fields2.get("fieldname546").setValue(stringname546);
//more field fills
}
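A different way around the method size limit, not what the answer above proposes, is to keep the field values as data and fill them in a loop. The sketch below uses only JDK classes: DataDrivenFill and fillAll are hypothetical names, the name=value properties format is an assumption, and the setter callback stands in for whatever sets a form field.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.function.BiConsumer;

class DataDrivenFill {

    /** Reads name=value pairs and hands each one to the setter; returns how many were applied. */
    static int fillAll(Reader source, BiConsumer<String, String> setter) throws IOException {
        Properties values = new Properties();
        values.load(source);
        int count = 0;
        for (String name : values.stringPropertyNames()) {
            setter.accept(name, values.getProperty(name));
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // In a real program the pairs would come from a file instead of a string
        String data = "fieldname1=Alice\nfieldname546=Bob\n";
        Map<String, String> filled = new LinkedHashMap<>();
        int count = fillAll(new StringReader(data), filled::put);
        System.out.println(count + " fields filled");
    }
}
```

With the iText field map from the answer, the setter would presumably be something like (name, value) -> fields2.get(name).setValue(value), so the method body stays small no matter how many fields the template has.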

How to add fields with the same names to pdf

I'm using iText 7.1.8 and I need to add fields with the same name to a PDF. I use code like the following:
public class Main {
public static void main(String[] args) throws IOException {
final PdfDocument emptyPdfDocument = createEmptyPdfDocument(pdf);
addTextField("Text_1", "Hello", emptyPdfDocument.getFirstPage(), PdfAcroForm.getAcroForm(emptyPdfDocument, true), emptyPdfDocument);
addTextField("Text_1", "Hello", emptyPdfDocument.addNewPage(), PdfAcroForm.getAcroForm(emptyPdfDocument, true), emptyPdfDocument);
savePdf(emptyPdfDocument);
}
private static void addTextField(String name, String value, PdfPage page, PdfAcroForm form, PdfDocument pdf) {
PdfFormField field = form.getField(name);
final Rectangle rect = new Rectangle(100, page.getCropBox().getHeight() - 100, 300, 20);
if (field != null) {
PdfWidgetAnnotation annotation = new PdfWidgetAnnotation(rect);
annotation.makeIndirect(pdf);
annotation.setVisibility(VISIBLE);
field.addKid(annotation);
page.addAnnotation(annotation);
return;
}
field = PdfFormField.createText(pdf, rect, name);
field.setValue(value);
field.setVisibility(VISIBLE);
page.addAnnotation(field.getWidgets().get(0));
form.addField(field, page);
}
private static PdfDocument createEmptyPdfDocument(final String pdfPath) throws IOException {
PdfWriter pdfWriter = new PdfWriter(new FileOutputStream(pdfPath));
final PdfDocument pdfDocument = new PdfDocument(pdfWriter);
pdfDocument.addNewPage();
return pdfDocument;
}
public static void savePdf(PdfDocument pdf) {
pdf.close();
}
}
But when addTextField is called the second time, the kids of the field are empty.
I don't understand what I'm doing wrong.

iText: sign a PDF with a Base64 signature string from the client

I am trying to sign a PDF document with a signature that comes from the client as a Base64 string. The flow is:
The service makes a request to calculate the hash of the document.
I take the content of the PDF document and calculate the hash from it according to the algorithm.
The service takes the received hash, signs it, and sends the resulting signature along with the bytes of the document to be signed.
I get a Base64 string and the PDF bytes to be signed.
Is this possible? Here is a code example:
public byte[] insertSignature(byte[] document, String signature) {
try (InputStream inputStream = new ByteArrayInputStream(document);
ByteArrayOutputStream os = new ByteArrayOutputStream();
ByteArrayOutputStream result = new ByteArrayOutputStream()) {
byte[] decodeSignature = Base64.decodeBase64(signature);
CAdESSignature cades = new CAdESSignature(decodeSignature, null, null);
var certificate = cades.getCAdESSignerInfo(0).getSignerCertificate();
var subject = new Subject(certificate.getSubjectX500Principal().getEncoded());
List<String> names = getSignaturesFields(document);
String sigFieldName = String.format("Signature %s", names.size() + 1);
PdfName filter = PdfName.Adobe_PPKLite;
PdfName subFilter = PdfName.ETSI_CAdES_DETACHED;
int estimatedSize = 8192;
PdfReader reader = new PdfReader(inputStream);
StampingProperties stampingProperties = new StampingProperties();
if (names.size() > 1) {
stampingProperties.useAppendMode();
}
PdfSigner signer = new PdfSigner(reader, os, stampingProperties);
signer.setCertificationLevel(PdfSigner.CERTIFIED_NO_CHANGES_ALLOWED);
PdfSignatureAppearance appearance = signer.getSignatureAppearance();
appearance
.setContact(subject.email().orElse(""))
.setSignatureCreator(subject.organizationName().orElse(""))
.setLocation(subject.country())
.setReuseAppearance(false)
.setPageNumber(1);
signer.setFieldName(sigFieldName);
ContainerForPrepareSignedDocument external = new ContainerForPrepareSignedDocument(filter, subFilter);
signer.signExternalContainer(external, estimatedSize);
byte[] preSignedBytes = os.toByteArray();
ContainerReadyToSignedDocument extSigContainer = new ContainerReadyToSignedDocument(decodeSignature);
PdfDocument docToSign = new PdfDocument(new PdfReader(new ByteArrayInputStream(preSignedBytes)));
PdfSigner.signDeferred(docToSign, sigFieldName, result, extSigContainer);
docToSign.close();
return result.toByteArray();
}
catch (IOException e) {
throw new InternalException("IO exception by insert signature to document:", e);
}
catch (GeneralSecurityException e) {
throw new InternalException("General security by insert signature to document:", e);
}
catch (CAdESException e) {
throw new InternalException("CAdESException by insert signature to document:", e);
}
}
private List<String> getSignaturesFields(byte[] document)
throws IOException {
try (InputStream inputStream = new ByteArrayInputStream(document);
PdfReader reader = new PdfReader(inputStream);
PdfDocument pdfDocument = new PdfDocument(reader)) {
SignatureUtil signUtil = new SignatureUtil(pdfDocument);
return signUtil.getSignatureNames();
}
}
static class ContainerForPrepareSignedDocument implements IExternalSignatureContainer {
private final PdfName filter;
private final PdfName subFilter;
public ContainerForPrepareSignedDocument(PdfName filter,
PdfName subFilter) {
this.filter = filter;
this.subFilter = subFilter;
}
public byte[] sign(InputStream docBytes) {
return new byte[0];
}
public void modifySigningDictionary(PdfDictionary signDic) {
signDic.put(PdfName.Filter, filter);
signDic.put(PdfName.SubFilter, subFilter);
}
}
static class ContainerReadyToSignedDocument implements IExternalSignatureContainer {
private byte[] cmsSignatureContents;
public ContainerReadyToSignedDocument(byte[] cmsSignatureContents) {
this.cmsSignatureContents = cmsSignatureContents;
}
public byte[] sign(InputStream docBytes) {
return cmsSignatureContents;
}
public void modifySigningDictionary(PdfDictionary signDic) {
}
}
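The hash and Base64 steps of the flow described in the question can be sketched with JDK classes alone. SHA-256 is an assumption here, since the post does not name the algorithm, and java.util.Base64 stands in for the Base64 helper used in the posted code. HashAndBase64 and its methods are hypothetical names. Note that for a PDF signature the digest generally has to cover the signed byte range of the prepared document; this sketch simply hashes whatever bytes it is given.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Base64;

class HashAndBase64 {

    /** Hashes the given bytes; SHA-256 is an assumption, the post does not name the algorithm. */
    static byte[] sha256(byte[] data) throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-256").digest(data);
    }

    /** Encodes raw bytes as a Base64 string for transport. */
    static String toBase64(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    /** Decodes a Base64 string back into raw bytes. */
    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] hash = sha256("example bytes".getBytes(StandardCharsets.UTF_8));
        String transported = toBase64(hash);
        byte[] received = fromBase64(transported);
        System.out.println(hash.length);
        System.out.println(Arrays.equals(hash, received));
    }
}
```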

Extract text from pdf file by pdfbox

I am facing an issue reading a PDF.
public class GetLinesFromPDF extends PDFTextStripper {
static List<String> lines = new ArrayList<String>();
Map<String, String> auMap = new HashMap();
boolean objFlag = false;
public GetLinesFromPDF() throws IOException {
}
/**
* @throws IOException If there is an error parsing the document.
*/
public static void main(String[] args) throws IOException {
PDDocument document = null;
String fileName = "E:\\sample.pdf";
try {
int i;
document = PDDocument.load(new File(fileName));
PDFTextStripper stripper = new GetLinesFromPDF();
stripper.setSortByPosition(true);
stripper.setStartPage(0);
stripper.setEndPage(document.getNumberOfPages());
Writer dummy = new OutputStreamWriter(new ByteArrayOutputStream());
stripper.writeText(document, dummy);
// print lines
for (String line : lines) {
//System.out.println("line = " + line);
if (line.matches("(.*)Objection(.*)")) {
System.out.println(line);
withObjection(lines);
//System.out.println("iiiiiiiiiiii");
break;
}
//System.out.println("uuuuuuuuuuuuuu");
}
} finally {
if (document != null) {
document.close();
}
}
}
/**
* Override the default functionality of PDFTextStripper.writeString()
*/
@Override
protected void writeString(String string, List<TextPosition> textPositions) throws IOException {
lines.add(string); // collect the line so the loop in main has something to iterate over
System.out.println("textPositions = " + string);
// System.out.println("tex "+textPositions.get(0).getFont()+ getArticleEnd());
// you may process the line here itself, as and when it is obtained
}
}
I need output like this:
My PDF has some titles, which need to be skipped.
The PDF file content is:
How can I extract the text in the separate formats specified?
Thanks in advance.
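Once the lines have been captured in a list, skipping title lines and keeping the ones of interest is plain list processing. The sketch below assumes the title occupies a known number of leading lines and that a keyword such as "Objection" marks the wanted lines. Both are assumptions, and LineFilter and select are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LineFilter {

    /** Skips the first titleLines entries, then keeps the lines that contain the keyword. */
    static List<String> select(List<String> lines, int titleLines, String keyword) {
        List<String> selected = new ArrayList<>();
        int start = Math.max(0, Math.min(titleLines, lines.size()));
        for (int i = start; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.contains(keyword)) {
                selected.add(line);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Made-up sample lines, not taken from the PDF in the question
        List<String> lines = Arrays.asList("Report title", "Item 1", "Objection raised on item 2");
        for (String line : select(lines, 1, "Objection")) {
            System.out.println(line);
        }
    }
}
```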

Page count showing zero for Apache POI .docx file

I use the Apache POI library to count the pages of documents, but it reports a page count of zero for a Google Doc downloaded as a .docx file.
Edit: my code is as follows:
public Integer getPagesCount(byte[] docBytes, String type)
throws IOException {
ByteArrayInputStream in = new ByteArrayInputStream(docBytes);
String lowerFilePath = type.toLowerCase();
if (lowerFilePath.equals("docx")) {
@SuppressWarnings("resource")
XWPFDocument docx = new XWPFDocument(in);
return docx.getProperties().getExtendedProperties()
.getUnderlyingProperties().getPages();
} else if (lowerFilePath.equals("doc")) {
@SuppressWarnings("resource")
HWPFDocument wordDoc = new HWPFDocument(in);
return wordDoc.getSummaryInformation().getPageCount();
} else if (lowerFilePath.equals("ppt")) {
HSLFSlideShow document = new HSLFSlideShow(in);
return document.getSlides().size();
} else if (lowerFilePath.equals("pptx")) {
@SuppressWarnings("resource")
XMLSlideShow xslideShow = new XMLSlideShow(in);
return xslideShow.getSlides().size();
} else if (lowerFilePath.equals("pdf")) {
PDDocument doc = PDDocument.load(in);
return doc.getNumberOfPages();
}
return 0;
}
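One possible explanation, offered as an assumption rather than a confirmed diagnosis: the page count read from the extended properties of a .docx is metadata stored in docProps/app.xml by the application that wrote the file, not something computed by laying out the document. If the exporter does not write that value, there is nothing to read. The sketch below checks for the stored value using only JDK classes. DocxStoredPageCount and its methods are hypothetical names, and the demo builds tiny stand-in archives rather than real Word files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.OptionalInt;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class DocxStoredPageCount {

    /** Returns the Pages value stored in docProps/app.xml, or empty if it is absent. */
    static OptionalInt storedPages(byte[] docxBytes) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(docxBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"docProps/app.xml".equals(entry.getName())) {
                    continue;
                }
                byte[] xmlBytes = zip.readAllBytes(); // reads only the current entry
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                // Refuse DOCTYPE declarations to avoid external entity processing
                factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
                Document xml = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xmlBytes));
                NodeList pages = xml.getElementsByTagName("Pages");
                if (pages.getLength() == 0) {
                    return OptionalInt.empty();
                }
                return OptionalInt.of(Integer.parseInt(pages.item(0).getTextContent().trim()));
            }
        }
        return OptionalInt.empty();
    }

    /** Builds a minimal zip holding only docProps/app.xml, as a stand-in for a .docx in the demo. */
    static byte[] zipWithAppXml(String appXml) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("docProps/app.xml"));
            zip.write(appXml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] withPages = zipWithAppXml("<Properties><Pages>3</Pages></Properties>");
        byte[] withoutPages = zipWithAppXml("<Properties><Application>Example</Application></Properties>");
        System.out.println(storedPages(withPages).isPresent());
        System.out.println(storedPages(withoutPages).isPresent());
    }
}
```

Running storedPages on the bytes of the downloaded file would show whether a Pages value is stored in it at all.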
