When I try to fill in the form of this PDF (http://vaielab.com/Test/2.pdf) with this code:
PDDocument pdfDocument = PDDocument.load(new File("2.pdf"));
pdfDocument.setAllSecurityToBeRemoved(true);
PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
PDAcroForm acroForm = docCatalog.getAcroForm();
if (acroForm != null) {
PDField field = (PDField) acroForm.getField("rad2");
try {
field.setValue("0");
} catch (Exception e) {
System.out.println(e);
}
}
pdfDocument.save("output.pdf");
pdfDocument.close();
I get this error: value '0' is not a valid option for the field rad2, valid values are: [Yes] and Off
But value "0" should be a valid option, and if I do a dump_data_fields with pdftk, I get this:
FieldType: Button
FieldName: rad2
FieldFlags: 49152
FieldJustification: Left
FieldStateOption: 0
FieldStateOption: 1
FieldStateOption: Off
FieldStateOption: Yes
I also tried the value "1" but got exactly the same error.
I'm using PDFBox 2.0.20.
This is caused by the /Opt values in Root/AcroForm/Fields/[7]/Opt: that array contains only two "Yes" entries. The PDButton.setValue() code in PDFBox handles the field differently when /Opt is set. The best solution would be for the PDF not to set /Opt at all; failing that, remove the entries by calling setExportValues(null) on the field (cast it to PDButton first, because PDField itself does not declare that method). After that, the valid values are "0", "1" and "Off".
I am using PDFBox to process a document that was generated by a NestJS service using pdf-lib (via form.createTextField(field.id)). I then send it to Java so that I can put a signature box on top of it and fill in the form fields. The fields are filled and everything works in the PDF.js viewer:
I can see the fields and their values. But when I open the PDF file in Google Chrome, I don't see the values at all, and when I open it in Adobe Reader, I don't see a value until I click on the field.
Here is my Java code:
public void prepareForSigning(DigestAlgorithm digestAlgorithm,
SignatureType signatureType,
UserData userData, List<FieldInput> formFields) throws IOException, NoSuchAlgorithmException {
this.digestAlgorithm = digestAlgorithm;
id = Utils.generateDocumentId();
pdDocument = PDDocument.load(contentIn);
int accessPermissions = getDocumentPermissions();
if (accessPermissions == 1) {
throw new AisClientException("Cannot sign document [" + name + "]. Document contains a certification " +
"that does not allow any changes.");
}
// add fields
// get the document catalog
try {
PDAcroForm acroForm = pdDocument.getDocumentCatalog().getAcroForm();
acroForm.setSignaturesExist(true);
acroForm.setAppendOnly(true);
acroForm.getCOSObject().setDirect(true);
acroForm.getCOSObject().setNeedToBeUpdated(true);
// acroForm.setNeedAppearances(true);
COSObject pdfFields = acroForm.getCOSObject().getCOSObject(COSName.FIELDS);
if (pdfFields != null) {
pdfFields.setNeedToBeUpdated(true);
}
for (int i = 0; i < formFields.size(); i++) {
PDField field = acroForm.getField(formFields.get(i).id);
if (field != null) {
// will also set a checkbox if the value is Yes
// checking for formFields.get(i).value == "true" returns
// compare strings with equals(); == compares object references
if ("Btn".equals(field.getFieldType()) && formFields.get(i).value.equals("true")) {
field.setValue("Yes");
} else {
field.setValue(formFields.get(i).value);
}
field.setReadOnly(true);
field.getCOSObject().setNeedToBeUpdated(true);
field.getWidgets().get(0).getAppearance().getCOSObject().setNeedToBeUpdated(true);
Log.info("set field: " + field.getFullyQualifiedName() + " to " + formFields.get(i).value);
}
}
pdDocument.getDocumentCatalog().getCOSObject().setNeedToBeUpdated(true);
} catch (Exception e) {
Log.warn(e);
}
PDSignature pdSignature = new PDSignature();
Calendar signDate = Calendar.getInstance();
if (signatureType == SignatureType.TIMESTAMP) {
// Now, according to ETSI TS 102 778-4, annex A.2, the type of a Dictionary that
// holds document timestamp should be DocTimeStamp
// However, adding this (as of Feb/17/2021), it trips the ETSI Conformance
// Checked online tool, making it say
// "There is no signature dictionary in the document". So, for now (Feb/17/2021)
// this has been removed. This makes the
// ETSI Conformance Checker happy.
// pdSignature.setType(COSName.DOC_TIME_STAMP);
pdSignature.setFilter(PDSignature.FILTER_ADOBE_PPKLITE);
pdSignature.setSubFilter(COSName.getPDFName("ETSI.RFC3161"));
} else {
pdSignature.setFilter(PDSignature.FILTER_ADOBE_PPKLITE);
pdSignature.setSubFilter(PDSignature.SUBFILTER_ETSI_CADES_DETACHED);
// Add 3 Minutes to move signing time within the OnDemand Certificate Validity
// This is only relevant in case the signature does not include a timestamp
// See section 5.8.5.1 of the Reference Guide
signDate.add(Calendar.MINUTE, 3);
}
pdSignature.setSignDate(signDate);
pdSignature.setName(userData.getSignatureName());
pdSignature.setReason(userData.getSignatureReason());
pdSignature.setLocation(userData.getSignatureLocation());
pdSignature.setContactInfo(userData.getSignatureContactInfo());
SignatureOptions options = new SignatureOptions();
options.setPreferredSignatureSize(signatureType.getEstimatedSignatureSizeInBytes());
// create a visible signature at the specified coordinates
if (signatureDefinition != null) {
Rectangle2D humanRect = new Rectangle2D.Float(signatureDefinition.getX(),
signatureDefinition.getY(),
signatureDefinition.getWidth(),
signatureDefinition.getHeight());
PDRectangle rect = createSignatureRectangle(pdDocument, humanRect);
options.setVisualSignature(
createVisualSignatureTemplate(pdDocument, signatureDefinition.getPage(),
signatureDefinition.getImage(), rect, pdSignature));
options.setPage(signatureDefinition.getPage());
}
pdDocument.addSignature(pdSignature, options);
// Set this signature's access permissions level to 0, to ensure we just sign
// the PDF, not certify it
// for more details:
// https://wwwimages2.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf
// see section 12.7.4.5
setPermissionsForSignatureOnly();
pbSigningSupport = pdDocument.saveIncrementalForExternalSigning(inMemoryStream);
MessageDigest digest = MessageDigest.getInstance(digestAlgorithm.getDigestAlgorithm());
byte[] contentToSign = IOUtils.toByteArray(pbSigningSupport.getContent());
byte[] hashToSign = digest.digest(contentToSign);
options.close();
base64HashToSign = Base64.getEncoder().encodeToString(hashToSign);
}
In Adobe Reader, the field with value 5 is visible only because I had already clicked on it, so it has focus.
When I call acroForm.setNeedAppearances(true), I can see the values, but then the signature field is missing. Am I missing something in my code?
I expect the values to appear in Google Chrome and Adobe Reader without my having to click on the fields.
I am parsing MS Word documents with Aspose.Words for Android using the code below. Every paragraph in the document contains separately styled inline (character-style) text. I have the text and the style of each of these, but is there any way to get its start position within the paragraph string, like String.indexOf()? The paragraph could be converted to a string, but then the style information is no longer available.
Document doc = new Document(file); // Get word document.
NodeCollection paras = doc.getChildNodes(NodeType.PARAGRAPH, true); // get all paragraphs.
for (Paragraph prg : (Iterable<Paragraph>) paras) {
for (Run run : (Iterable<Run>) prg.getChildNodes(NodeType.RUN, true)){
boolean defaultPrgFont = run.getFont().getStyle().getName().equals("Default Paragraph Font");
// Get different styled texts only.
if (!defaultPrgFont){
// Text in different styled according to paragraph.
String runText = run.getText();
// Style of the different styled text.
String runStyle = run.getFont().getStyle().getName();
// Start position of the different styled text in its paragraph.
int runStartPosition; // ?
}
}
}
You can sum the lengths of the text in the runs that precede the styled run, something like this:
Document doc = new Document("C:\\Temp\\in.docx"); // Get word document.
NodeCollection paras = doc.getChildNodes(NodeType.PARAGRAPH, true); // get all paragraphs.
for (Paragraph prg : (Iterable<Paragraph>) paras) {
int runStartPosition = 0;
for (Run run : (Iterable<Run>) prg.getChildNodes(NodeType.RUN, true)){
boolean defaultPrgFont = run.getFont().getStyle().getName().equals("Default Paragraph Font");
// Get different styled texts only.
if (!defaultPrgFont){
// Text in different styled according to paragraph.
String runText = run.getText();
// Style of the different styled text.
String runStyle = run.getFont().getStyle().getName();
System.out.println(runStartPosition);
}
// Position is increased for all runs in the paragraph.
// Note that some runs might represent field codes and are not normally displayed.
runStartPosition += run.getText().length();
}
}
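The running-offset idea does not depend on Aspose itself. As a minimal sketch in plain Java, assuming the run texts of one paragraph have already been collected into a list (the names `RunOffsets` and `startOffsets` are made up for illustration and are not part of Aspose.Words), the start position of each run is the total length of all runs before it:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Aspose.Words: computes where each run
// starts inside the concatenation of all run texts of one paragraph.
final class RunOffsets {

    // Returns, for every run text, its start index in the concatenated text.
    static int[] startOffsets(List<String> runTexts) {
        int[] offsets = new int[runTexts.size()];
        int position = 0;
        for (int i = 0; i < runTexts.size(); i++) {
            offsets[i] = position;                 // run i starts here
            position += runTexts.get(i).length();  // advance past run i
        }
        return offsets;
    }

    public static void main(String[] args) {
        List<String> runs = Arrays.asList("Hello ", "bold", " world");
        int[] offsets = startOffsets(runs);
        for (int i = 0; i < runs.size(); i++) {
            System.out.println(offsets[i] + ": \"" + runs.get(i) + "\"");
        }
    }
}
```

These offsets line up with String.indexOf()-style positions only if the paragraph string is built from the same run texts; any field codes contained in runs count toward the offsets as well.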
Here you can download a PDF with a single AcroForm field; its size is exactly 427 KB.
If I remove this one field, the file is only 3 KB. Why does this happen?
I tried analysing it with PDF Debugger, and nothing looked odd to me.
There's an embedded "Arial" font in the acroform default resources, see Root/AcroForm/DR/Font/Arial/FontDescriptor/FontFile2.
Either you or whoever created the PDF added it for no reason; the font is not used or referenced. To verify this for the AcroForm default resources, check whether the /DA entry (default appearance) of any field contains the font name.
When you removed the field, you somehow also removed the font from the AcroForm default resources. (You didn't say how you removed it.)
Here's some code to remove the unused fonts (null checks are mostly missing):
PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
PDResources defaultResources = acroForm.getDefaultResources();
COSDictionary fontDict = (COSDictionary) defaultResources.getCOSObject().getDictionaryObject(COSName.FONT);
List<String> defaultAppearances = new ArrayList<>();
List<COSName> fontDeletionList = new ArrayList<>();
for (PDField field : acroForm.getFieldTree())
{
if (field instanceof PDVariableText)
{
PDVariableText vtField = (PDVariableText) field;
defaultAppearances.add(vtField.getDefaultAppearance());
}
}
for (COSName fontName : defaultResources.getFontNames())
{
if (COSName.HELV.equals(fontName) || COSName.ZA_DB.equals(fontName))
{
// Adobe default, always keep
continue;
}
boolean found = false;
for (String da : defaultAppearances)
{
if (da != null && da.contains("/" + fontName.getName()))
{
found = true;
break;
}
}
System.out.println(fontName + ": " + found);
if (!found)
{
fontDeletionList.add(fontName);
}
}
System.out.println("deletion list: " + fontDeletionList);
for (COSName fontName : fontDeletionList)
{
fontDict.removeItem(fontName);
}
The resulting file is now 5 KB.
I haven't checked the annotations. Some of them also have a /DA string, but it is unclear whether the AcroForm default resources fonts are used when reconstructing a missing appearance stream.
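One caveat with the substring test `da.contains("/" + fontName.getName())`: a resource named Arial would also be reported as found in a DA string that actually selects /ArialMT. A stricter check, sketched here in plain Java, reads the operand of the Tf operator and compares it exactly. The names `DaFont`, `fontName` and `isReferenced` are hypothetical, and the sketch assumes the tokens of the DA string are separated by whitespace, which is common but not guaranteed by the PDF syntax:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFBox: exact font-name lookup in /DA strings.
final class DaFont {

    // Returns the font resource name (without the leading slash) that is the
    // first operand of the Tf operator, or null if no such operator is found.
    static String fontName(String da) {
        if (da == null) {
            return null;
        }
        String[] tokens = da.trim().split("\\s+");
        for (int i = 2; i < tokens.length; i++) {
            // Expected shape: /Name size Tf
            if ("Tf".equals(tokens[i]) && tokens[i - 2].startsWith("/")) {
                return tokens[i - 2].substring(1);
            }
        }
        return null;
    }

    // True if any of the DA strings selects exactly the given font name.
    static boolean isReferenced(String name, List<String> defaultAppearances) {
        for (String da : defaultAppearances) {
            if (name.equals(fontName(da))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> das = Arrays.asList("/ArialMT 10 Tf 0 g", "/Helv 0 Tf 0 g");
        System.out.println(isReferenced("Arial", das));   // substring match would say true
        System.out.println(isReferenced("ArialMT", das));
    }
}
```

In the deletion loop, the strings passed in would be the collected getDefaultAppearance() values, and the name would be fontName.getName().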
Update:
Here's some additional code to replace Arial with Helv:
for (PDField field : acroForm.getFieldTree())
{
if (field instanceof PDVariableText)
{
PDVariableText vtField = (PDVariableText) field;
String defaultAppearance = vtField.getDefaultAppearance();
if (defaultAppearance.startsWith("/Arial"))
{
vtField.setDefaultAppearance("/Helv " + defaultAppearance.substring(7));
vtField.getWidgets().get(0).setAppearance(null); // this removes the font usage
vtField.setValue(vtField.getValueAsString());
}
defaultAppearances.add(vtField.getDefaultAppearance());
}
}
Note that this may not be a good idea, because the standard 14 fonts cover only a limited set of characters. Try
vtField.setValue("Ayşe");
and you'll get an exception.
More general code to replace font can be found in this answer.
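The `substring(7)` call in the replacement code relies on the DA string starting with exactly "/Arial " (seven characters), so it would mangle a string such as "/ArialMT 0 Tf 0 g". As a rough sketch of a less position-dependent rewrite, the font operand of the Tf operator can be swapped token by token. `DaRewrite` and `withFont` are made-up names, and the code again assumes whitespace-separated tokens:

```java
// Hypothetical helper, not part of PDFBox: swaps the font name in a /DA string.
final class DaRewrite {

    // Replaces the font operand of the first Tf operator with newFontName.
    // Returns the input unchanged if no "/Name size Tf" sequence is found.
    static String withFont(String da, String newFontName) {
        String[] tokens = da.trim().split("\\s+");
        for (int i = 2; i < tokens.length; i++) {
            if ("Tf".equals(tokens[i]) && tokens[i - 2].startsWith("/")) {
                tokens[i - 2] = "/" + newFontName;
                // Tokens are re-joined with single spaces.
                return String.join(" ", tokens);
            }
        }
        return da;
    }

    public static void main(String[] args) {
        System.out.println(withFont("/Arial 10 Tf 0 g", "Helv"));
        System.out.println(withFont("/ArialMT 0 Tf 0 g", "Helv"));
    }
}
```

The result could then be passed to vtField.setDefaultAppearance(...), in place of the string built with substring(7).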
I'm trying to fill out a bunch of PDF Forms using PDFBox 2.0.8. For some documents I get the following error when setting the PDTextField's value:
java.io.IOException: Could not find font: /ArialMT
Apparently the font is not embedded correctly, as is often the case with proprietary Microsoft fonts.
How can I tell PDFBox to substitute the font, e.g. with "normal" Arial or some other font? Setting the field's DA string to "/Helv 0 tf 0 g" resulted in a NullPointerException.
Based on the comments from Tilman Hausherr, I built a first fix that works independently of the operating system (Linux in my case). The code is Kotlin:
acroForm.defaultResources.put(COSName.getPDFName("ArialMT"),
PDType0Font.load (pdDocument, this.javaClass.classLoader.getResourceAsStream("fonts/ARIALMT.ttf"), false))
This will only work for this particular font, though. What's still missing, and was actually the main point of my question, is a way to tell PDFBox to fall back to a given font or DA string if the required font cannot be provided.
After Tilman came to the rescue again, I can now present the complete solution. Again, this is Kotlin, not Java:
PDDocument.load(file).use { pdDocument ->
val acroForm = pdDocument.documentCatalog.acroForm
acroForm.defaultResources.put(COSName.getPDFName("ArialMT"),
PDType0Font.load (pdDocument, this.javaClass.classLoader.getResourceAsStream("fonts/ARIALMT.ttf"), false))
val pdField: PDField? = acroForm.getField(fieldname)
val value = ...
when (pdField) {
is PDCheckBox -> {
if (value is Boolean) {
when (value) {
true -> pdField.check()
false -> pdField.unCheck()
}
} else {
log.error("RENDER_FORM: Need Boolean for ${pdField.fullyQualifiedName} but got $value")
}
}
is PDTextField -> {
try {
pdField.value = value?.toString() ?: ""
} catch (ioException: IOException) {
pdField.cosObject.setString(COSName.DA, "/Helv 0 Tf 0 g")
pdField.value = value?.toString() ?: ""
log.error("RENDER_FORM: Writing text field failed: ${ioException.message}")
}
}
null -> {
log.error("RENDER_FORMULAR: Formfield $fieldname does not exist in $name")
}
else -> log.error("RENDER_FORMULAR: Formfield $pdField ($fieldname) is of unhandled type ${pdField.fieldType}")
}
val stream = ByteArrayOutputStream()
pdDocument.save(stream)
pdDocument.close()
return stream.toByteArray()
}
Add "ArialMT" to the default resources:
try (PDDocument doc = PDDocument.load(new File("F2_Datenblatt_022015.pdf")))
{
PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
PDField field = acroForm.getField("Vorname_Name");
// fails with IOException as described in question
//field.setValue("Tilman Hausherr");
// Method 1, just add type1 Helvetica (allows only WinAnsiEncoding glyphs)
//acroForm.getDefaultResources().put(COSName.getPDFName("ArialMT"), PDType1Font.HELVETICA);
// Method 2, add the full Arial font (allows for more different glyphs)
// important: use the method that switches off subsetting
acroForm.getDefaultResources().put(
COSName.getPDFName("ArialMT"),
PDType0Font.load(doc, new FileInputStream("c:/windows/fonts/arial.ttf"), false));
field.setValue("Tilman Hausherr");
doc.save("F2_Datenblatt_022015-mod.pdf");
}
Update:
Turns out the code in the question would have worked too with the file - almost. It's "Tf" and not "tf", so the string would have been "/Helv 0 Tf 0 g". We'll research how to avoid an NPE and get a meaningful exception.
I am using Apache-POI 3.14. I have a need to lock-down a cell to a "Text" format. The data in my cell might be all digits, but it is still considered a string. When I write the cell, I do it like this:
cell.setCellValue("001");
cell.setCellType(Cell.CELL_TYPE_STRING);
When I open the output workbook in Excel, the cell contains the correct value ("001") and it displays with a small green triangle in the corner. Hovering over the exclamation point displays the hover text The number in this cell is formatted as text or preceded by an apostrophe. When I look at the cell formatting (Right-click -> Format cells), the "Category" is displayed as "General". I expected this to be "Text".
The problem arises when a user modifies the value in the cell by entering only digits. Because the "Category" is "General", the value is entered and displayed as a number, removing leading zeroes and right-justified.
How can I achieve the same result as Excel's "Format cells" dialog?
You can try to set the cell-format to text via
DataFormat fmt = wb.createDataFormat();
CellStyle cellStyle = wb.createCellStyle();
cellStyle.setDataFormat(
fmt.getFormat("@")); // "@" is Excel's built-in Text format
cell.setCellStyle(cellStyle);
Note: CellStyles should be reused for all applicable cells; do not create a new one for every cell.
You could also try the "Ignore errors" feature of the .xlsx format; however, support for it is not complete yet. See Issue 46136 and Issue 58641 for some ongoing discussion.
See also this MSDN page for some additional information.
For HSSF,
DataFormat fmt = workbook.createDataFormat();
CellStyle textStyle = workbook.createCellStyle();
textStyle.setDataFormat(fmt.getFormat("@")); // "@" is Excel's built-in Text format
sheet.setDefaultColumnStyle(0, textStyle);
This sets the style of the whole column to Text, so the category is shown as Text.
However, this doesn't work with the XSSF format (I am using Apache POI 3.15 and it didn't work for me).
In that case, in addition to the code above, you have to set the style on each cell you want treated as text:
cell.setCellStyle(textStyle);
Regarding the error indicator, you could use
sheet.addIgnoredErrors(new CellRangeAddress(0, 9999, 0, 9999), IgnoredErrorType.NUMBER_STORED_AS_TEXT);
This ignores the NUMBER_STORED_AS_TEXT error for rows 0 to 9999 and columns 0 to 9999, so you won't see it.
It looks like the OP was asking for an Apache POI solution. After some searching I found this answer:
HSSFCellStyle style = book.createCellStyle();
style.setDataFormat(HSSFDataFormat.getBuiltinFormat("text"));
I'm using Apache POI 3.15 and had the same problem, so I validate the data myself; I need numbers >= 0 and strings:
try {
if (Integer.parseInt(field + "") >= 0) {
int valor = Integer.parseInt(field + "");
cell.setCellValue(valor); //Int
}
} catch (NumberFormatException nfe) {
// no int
try {
if (Double.parseDouble(field + "") >= 0) {
double valor = Double.parseDouble(field + ""); //double
cell.setCellValue(valor);
}
} catch (NumberFormatException nfe2) {
cell.setCellValue(field + ""); //String
}
}
For Apache POI 4.0.1 :
XSSFSheet sheet = workbook.createSheet("MySheetName");
sheet.addIgnoredErrors(new CellRangeAddress(0, 9999, 0, 9999), IgnoredErrorType.NUMBER_STORED_AS_TEXT);
Be careful to declare your sheet as org.apache.poi.xssf.usermodel.XSSFSheet and not as org.apache.poi.ss.usermodel.Sheet; otherwise the method addIgnoredErrors will be unknown.