iText PDF Concatenation fails - InvalidPDFException

I am trying to concatenate two PDFs using iText 4.2.0. In a few cases, it throws InvalidPdfException at the line below:
reader = new PdfReader("c:\tmp\test.pdf");
com.itextpdf.text.exceptions.InvalidPdfException: No message found for trailer.not.found
at com.itextpdf.text.pdf.PdfReader.rebuildXref(Unknown Source)
at com.itextpdf.text.pdf.PdfReader.readPdf(Unknown Source)
at com.itextpdf.text.pdf.PdfReader.<init>(Unknown Source)
at com.itextpdf.text.pdf.PdfReader.<init>(Unknown Source)
The PDF is valid: I opened it in a text editor and confirmed that it contains %PDF as well as %EOF, as recommended here.
UPDATE
The iText version is 2.1.7. The jar was wrongly named as 4.2.0.
The path mentioned ("c:\tmp\test.pdf") is only a sample. We actually pass "c:/tmp/test.pdf".

There is no iText 4.2.0. Please throw it away. It is a rogue version that was not released by the official developers of iText. It's a "gork", meaning God Only Really Knows what's inside. Solution: Throw away iText 4.2.0 and replace it with a more recent, official version: https://github.com/itext/itextpdf/releases
The error you get says that the actual error message for the key trailer.not.found could not be found. This means that you are using an iText jar that wasn't built correctly. The .lng files are missing from the jar, hence the actual error message can't be found. Solution: Throw away iText 4.2.0 and replace it with a more recent, official version: https://github.com/itext/itextpdf/releases
The key trailer.not.found corresponds to the message "Trailer not found". It means that you are trying to create a PdfReader with a file that may look like a PDF, but isn't one. For instance: it starts with %PDF-, but there is no trailer. That is, iText searches the file (which should end in %%EOF; please check if this is the case) and the keyword startxref can't be found. In other words: the trailer is missing. Solution: check if the PDF is valid. Note that old versions of iText weren't able to read PDFs that use features introduced after PDF 1.5. Maybe your "unofficial" iText version is that old...
Finally: \ is an escape character. This is wrong: "c:\tmp\test.pdf", because it reads as "c:[tab]mp[tab]est.pdf", where [tab] is the tab character \t. You should use either "c:/tmp/test.pdf" or "c:\\tmp\\test.pdf".
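The escape-character point can be checked with a few lines of plain Java. This is only a minimal sketch: the class name PathEscapeDemo and its method names are made up for illustration, and nothing here touches iText.

```java
public class PathEscapeDemo {

    // Forward slashes need no escaping in a Java string literal.
    static String forwardSlashPath() {
        return "c:/tmp/test.pdf";
    }

    // Each backslash is doubled, so the string really contains c:\tmp\test.pdf.
    static String escapedBackslashPath() {
        return "c:\\tmp\\test.pdf";
    }

    // Single backslashes: the compiler turns each \t into a tab character.
    static String brokenPath() {
        return "c:\tmp\test.pdf";
    }

    public static void main(String[] args) {
        System.out.println("escaped length: " + escapedBackslashPath().length());
        System.out.println("broken length:  " + brokenPath().length());
        System.out.println("first tab at:   " + brokenPath().indexOf('\t'));
    }
}
```

The broken literal is two characters shorter than the escaped one, because each \t pair collapses into a single tab.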

Related

Replacement for COSName.DOCMDP in PDFBox 2.0.4

I'm testing the example codes from this page:
https://svn.apache.org/viewvc/pdfbox/trunk/examples/src/main/java/org/apache/pdfbox/examples/signature/
But inside the file CreateSignatureBase.java, specifically in the functions getMDPPermission and setMDPPermission, it uses a property that doesn't exist anymore: COSName.DOCMDP. I perused the PDFBox page and its migration guide, and neither mentions this property or how to replace it. I also looked into the PDFBox source code (specifically the file COSName.java) and it doesn't have that property, even though this file does have it:
https://svn.apache.org/viewvc/pdfbox/branches/2.0/pdfbox/src/main/java/org/apache/pdfbox/cos/COSName.java?view=markup
I checked both pdfbox-2.0.4.jar and pdfbox-app-2.0.4.jar by adding them to the NetBeans project where I'm testing the Java files from the PDFBox examples. Neither of them has the property COSName.DOCMDP in the COSName class.
Both jars and the PDFBox source code were downloaded from here:
https://pdfbox.apache.org/download.cgi#20x
How can I replace the property COSName.DOCMDP in the CreateSignatureBase class? Am I getting the right jars?

It will appear in version 2.1.0:
https://issues.apache.org/jira/browse/PDFBOX-3017
https://issues.apache.org/jira/browse/PDFBOX-3699
https://svn.apache.org/viewvc/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/cos/COSName.java?annotate=1786065
If you need it for testing purposes, you may download its SNAPSHOT version from https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/
Alternatively, you can look at this example as it exists in the current stable version: just download the 2.0.4 jar and browse the examples.

jaxp_feature_not_supported exception on docx4j

I am trying to use docx4j in my project. (I'm quite new to it.)
I just tried to run the sample code from this link:
http://www.smartjava.org/content/create-complex-word-docx-documents-programatically-docx4j
The input is a .docx file and the output is also a .docx file.
Here is what the console showed when it tried to read my template file:
2015-09-10 09:58:43,847 [main] ERROR org.docx4j.XmlUtils - jaxp_feature_not_supported: Feature "http://apache.org/xml/features/disallow-doctype-decl" is not supported.
javax.xml.parsers.ParserConfigurationException: jaxp_feature_not_supported: Feature "http://apache.org/xml/features/disallow-doctype-decl" is not supported.
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl.setFeature(DocumentBuilderFactoryImpl.java:207)
at org.docx4j.XmlUtils.<clinit>(XmlUtils.java:240)
at org.docx4j.openpackaging.contenttype.ContentTypeManager.parseContentTypesFile(ContentTypeManager.java:686)
at org.docx4j.openpackaging.io3.Load3.get(Load3.java:132)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:454)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:371)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:337)
at org.docx4j.openpackaging.packages.OpcPackage.load(OpcPackage.java:302)
at org.docx4j.openpackaging.packages.WordprocessingMLPackage.load(WordprocessingMLPackage.java:170)
at Experiment.getTemplate(Experiment.java:26)
at Experiment.main(Experiment.java:112)
The environment is listed here:
Java: 1.5
Library Management: Maven
Lib: Version
docx4j: 3.2.1
jaxb-api: 2.1 (needed because Java 1.5 does not include it)
jaxb-impl: 2.1 (needed because Java 1.5 does not include it)
I want to know how to deal with this error.
I tried to solve it on my own but got nowhere.
Thanks for helping me.
EDIT:
I just found my answer for this when I read the changelog carefully.
V.3.2.0
...
Minimum Java version is Java 6 (since guava and ambassador are compiled for that)
I guess I need to go back to version 3.1.0. :| (And it runs smoothly.)

The exception is caught and logged: https://github.com/plutext/docx4j/blob/docx4j-3.2.1/src/main/java/org/docx4j/XmlUtils.java#L241
It shouldn't cause a problem. What happened after that?

java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String

I am trying to encrypt plain text using org.apache.commons.codec.binary.Base64. When I call the method org.apache.commons.codec.binary.Base64.encodeBase64String(aByteArray), it gives the following exception:
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.commons.codec.binary.Base64.encodeBase64String([B)Ljava/lang/String;
I am using the jar org-apache-commons-codec.jar. Please help me, as I can't understand what is wrong here.

First of all, encoding is not encryption. Encoding only changes the representation of your string, and it is easily changed back.
Since you are getting this exception, you at least have this jar on your classpath, but the Base64 class in it lacks the method. Open the jar with a suitable zip tool like 7-Zip and look at its Manifest.mf file. According to the Base64 Javadoc, encodeBase64String was added in version 1.4, so your jar must be version 1.4 or later. Download the latest version and replace your older one.
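As a complement to inspecting the manifest, reflection can show which jar a class was actually loaded from and whether it has the method in question. The following is a minimal sketch using only the JDK; the class name MethodProbe and its helper methods are hypothetical names chosen for this example.

```java
import java.security.CodeSource;

public class MethodProbe {

    /** Returns true if the named class can be loaded and has a public method with this signature. */
    static boolean hasPublicMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class.forName(className).getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    /** Returns the location (usually a jar URL) the named class was loaded from. */
    static String loadedFrom(String className) {
        try {
            CodeSource src = Class.forName(className).getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader report no code source.
            return src == null ? "(bootstrap class path)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on class path)";
        }
    }

    public static void main(String[] args) {
        String base64 = "org.apache.commons.codec.binary.Base64";
        System.out.println("loaded from: " + loadedFrom(base64));
        System.out.println("has encodeBase64String(byte[]): "
                + hasPublicMethod(base64, "encodeBase64String", byte[].class));
    }
}
```

If the reported location is not the jar you expected, an older copy of commons-codec earlier on the classpath is a likely cause of the NoSuchMethodError.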

Why would a Saxon Report run correctly on a Mac but not on Windows?

I am using Saxon 4.4.2 to convert DocBook to various formats (e.g. HTML, PDF, ePub). I am doing development on a MacBook Pro using Eclipse. Everything is written in Java. On my Mac, everything works fine. When I use Eclipse to generate a deployable plug-in, copy the plug-in and drop it into my Eclipse installation on Windows 7, and run the conversion from DocBook to HTML, Saxon reports "Failed to compile stylesheet. 1 error detected."
The error comes from
com.icl.saxon.TransformerFactoryImpl, method newTemplates line 120.
called by
com.icl.saxon.TransformerFactoryImpl, method newTransformer, line 72.
My calling line of code is:
Transformer transformer = tfactory.newTransformer(xsl);
The setting of xsl is done via this line:
StreamSource xsl = new StreamSource(DocBookTransformer.class.getResourceAsStream("/lib/docbook-xsl-1.76.1/xhtml/docbook.xsl"));
Why would Saxon process the stylesheet without error on a Mac, but fail to parse it on Windows, when it is the same Saxon Jars and the same stylesheet file being processed on both machines?

Saxon 4.4.2? Where on earth did you get hold of that? Perhaps a CD in the back of a book published around 1998? It predates the first release on SourceForge in 2001, and was probably designed to run on Java 1.1.8.
So your first step should be to see if the problem still occurs on a more modern release. The current release is 9.5.
The other thing is to find out what the error is that Saxon says it reported. It will have been sent to the JAXP ErrorListener, and unless you changed anything, the default ErrorListener will have written the message to System.err.
The things that are most likely to work on one platform and fail on another are the URIs in xsl:include and xsl:import, so try checking those.
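If System.err is not visible (for example inside an Eclipse plug-in), the messages can be captured by registering your own JAXP ErrorListener on the factory. Below is a minimal sketch using only the standard javax.xml.transform API; the class name StylesheetDiagnostics and the method compileAndCollect are made up for this example, and it has not been tried against a Saxon release as old as the one in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

public class StylesheetDiagnostics {

    /** Compiles the stylesheet text and returns every message the processor reported. */
    static List<String> compileAndCollect(String stylesheet) {
        final List<String> messages = new ArrayList<>();
        // Uses whichever JAXP implementation is configured on this JVM.
        TransformerFactory factory = TransformerFactory.newInstance();
        factory.setErrorListener(new ErrorListener() {
            @Override
            public void warning(TransformerException e) {
                messages.add("warning: " + e.getMessageAndLocation());
            }
            @Override
            public void error(TransformerException e) {
                messages.add("error: " + e.getMessageAndLocation());
            }
            @Override
            public void fatalError(TransformerException e) {
                messages.add("fatal: " + e.getMessageAndLocation());
            }
        });
        try {
            factory.newTemplates(new StreamSource(new StringReader(stylesheet)));
        } catch (TransformerConfigurationException e) {
            // Record the summary exception too, in case the listener was bypassed.
            messages.add("failed: " + e.getMessageAndLocation());
        }
        return messages;
    }

    public static void main(String[] args) {
        // A deliberately malformed stylesheet, just to produce some output.
        for (String message : compileAndCollect("<not-closed")) {
            System.out.println(message);
        }
    }
}
```

In the plug-in itself, the same listener could be set on tfactory before calling newTransformer(xsl), and the collected messages written to the Eclipse log.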

How to downgrade arbitrary PDF file to version PDF-1.2?

I have some user-generated PDF files. Typically the files are generated with Word, but they could be just about any kind of valid PDF file. I'd like to convert each file to PDF 1.2 if it has a higher version number. Features available only in higher versions (like multimedia) should be removed, and the result should still be reasonably readable.
How can I do this programmatically, without interactive tools such as Adobe Acrobat? Preferably with Java and the iText library, but I would be interested in other solutions as well.
One way would be to generate a bunch of images from the original PDF and then package them as a PDF 1.2 file, but is there a more elegant way?

Try the command line below. It uses Ghostscript to re-distill the PDF. Use Ghostscript 8.71 or newer, such as 9.00. (The wrongly up-voted answer advising to "set the PDF version in iText using setPdfVersion()" will NOT work: it only relabels the PDF, which is misleading.)
gswin32c.exe ^
-o output-v1.2.pdf ^
-sDEVICE=pdfwrite ^
-dPDFSETTINGS=/ebook ^
-dCompatibilityLevel=1.2 ^
input-v1.6.pdf

The easiest is to reprint it through Ghostscript.

You can set the PDF version in iText using setPdfVersion(); however, I think downgrading won't work out of the box. You could use PdfCopy to write your PDFs to a new one with version 1.2 and strip out all non-1.2 objects, or convert them to version 1.2 objects (which I think you will have to do yourself, but I'm not sure).
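Whichever conversion route is chosen, it helps to first find out which files declare a version above 1.2. The sketch below reads the version from the %PDF-x.y header using only the JDK. The class name PdfHeaderVersion and its methods are hypothetical, and the check is only a first approximation: from PDF 1.4 onwards a /Version entry in the document catalog can override the header, and this code does not look at that.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PdfHeaderVersion {

    private static final Pattern HEADER = Pattern.compile("%PDF-(\\d)\\.(\\d)");

    /** Returns {major, minor} from the header, or null if none is found in the first 1024 bytes. */
    static int[] headerVersion(byte[] fileBytes) {
        int len = Math.min(fileBytes.length, 1024);
        // ISO-8859-1 maps every byte to one char, so binary data cannot break decoding.
        String head = new String(fileBytes, 0, len, StandardCharsets.ISO_8859_1);
        Matcher m = HEADER.matcher(head);
        if (!m.find()) {
            return null;
        }
        return new int[] { Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)) };
    }

    /** Returns true if the header declares a version higher than major.minor. */
    static boolean isNewerThan(byte[] fileBytes, int major, int minor) {
        int[] v = headerVersion(fileBytes);
        if (v == null) {
            throw new IllegalArgumentException("no %PDF- header found");
        }
        return v[0] > major || (v[0] == major && v[1] > minor);
    }

    public static void main(String[] args) throws IOException {
        // Reads the whole file for simplicity; only the first bytes are inspected.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("needs downgrade to 1.2: " + isNewerThan(bytes, 1, 2));
    }
}
```

Files for which isNewerThan(bytes, 1, 2) returns true are the candidates to pass through the actual conversion step.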
