I created a PDF using PDFBox. The entire PDF generates perfectly, and even the images loaded while I was using
PDImageXObject ptabelle = PDImageXObject.createFromFile("src/main/resources/pdf/ptabelle.png", pdDocument);
But the project will need to go live sometime so I have to replace the static path with a class loader. After doing all that the PDF generates, the text is displayed, but not the image.
The interesting thing is that inside the PDF the "box" where the image should be is there, but not the image.
Here is the code for the stream generation.
ClassLoader classLoader = getClass().getClassLoader();
PDStream pdStream = new PDStream(pdDocument, classLoader.getResourceAsStream("pdf/ptabelle.png"));
PDResources pdResources = new PDResources();
PDImageXObject ptabelle = new PDImageXObject(pdStream, pdResources);
PDPageContentStream pdPageContentStream = new PDPageContentStream(pdDocument, page4);
And here is the call in the code, the length + width variables are defined in the code.
pdPageContentStream.drawImage(ptabelle, TEXT_BEGIN, currentYCoord, 172, 107);
Instead of new PDImageXObject(pdStream, pdResources) which is for PDFBox internal use, please use the appropriate LosslessFactory method. So your code would look like this:
BufferedImage bim = ImageIO.read(classLoader.getResourceAsStream("pdf/ptabelle.png"));
PDImageXObject img = LosslessFactory.createFromImage(pdDocument, bim);
See also the javadoc of PDImageXObject.createFromFileByExtension, which explains what factory methods can be called instead.
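A classpath lookup fails silently in two places: getResourceAsStream returns null when the resource is missing, and ImageIO.read returns null when no reader understands the data. The following is a minimal sketch of a guard around both cases, using only the JDK. The class ClasspathImages and its methods are hypothetical helper names, not part of PDFBox.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import javax.imageio.ImageIO;

/** Hypothetical helper (not part of PDFBox): loads images from the classpath and fails loudly. */
final class ClasspathImages {
    private ClasspathImages() {}

    /** Decodes an image from a stream; throws instead of returning null. */
    static BufferedImage decode(InputStream in, String description) throws IOException {
        if (in == null) {
            // getResourceAsStream returns null when the resource is not on the classpath
            throw new IOException("Resource not found: " + description);
        }
        try (InputStream stream = in) {
            BufferedImage image = ImageIO.read(stream);
            if (image == null) {
                // ImageIO.read returns null when no registered reader can decode the data
                throw new IOException("Unsupported image format: " + description);
            }
            return image;
        }
    }

    /** Loads an image by classpath name, for example "pdf/ptabelle.png". */
    static BufferedImage load(ClassLoader loader, String resourceName) throws IOException {
        return decode(loader.getResourceAsStream(resourceName), resourceName);
    }
}
```

The returned BufferedImage can then be handed to LosslessFactory.createFromImage(pdDocument, bim) exactly as in the answer above.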
I'm trying to embed the fonts using the following code,
which is based on Stackoverflow and PDFBOX-2661:
The font to embed as alternative to Helvetica is DejaVuSans.
// given: PDDocument document, PDAcroForm acroForm
InputStream font_file = ClassLoader.getSystemResourceAsStream("DejaVuSans.ttf");
font = PDType0Font.load(document, font_file);
if (font_file != null) {
font_file.close();
}
System.err.println("Embedded font 'DejaVuSans.ttf' loaded.");
PDResources resources = acroForm.getDefaultResources();
if (resources == null) {
resources = new PDResources();
}
resources.put(COSName.getPDFName("Helv"), font);
resources.put(COSName.getPDFName("Helvetica"), font);
// Also use "DejaVuSans.ttf" for "HeBo", "HelveticaBold" and "Helvetica-Bold" in a similar way, but this is left out to keep this short.
acroForm.setDefaultResources(resources);
// let pdfbox handle refreshing the values, now that all the fonts should be there.
acroForm.refreshAppearances();
However, acroForm.refreshAppearances() then produces a lot of "Using fallback font LiberationSans for CID-keyed TrueType font DejaVuSans" messages. Debugging it a bit, I found that down in createDescendantFont (in findFontOrSubstitute of org.apache.pdfbox.pdmodel.font.PDCIDFontType2) it tries to load the font "DejaVuSans" from the filesystem again instead of using the provided resource. Because the font is shipped inside the JAR file rather than installed among the system's fonts, it is not found, so the fallback font is used.
How can I make it recognise and load the font correctly?
What I already tried:
I tried extending the font loading mechanism, but as everything is private and/or final, I had to stop after I already copied about 10 files unchanged from the original code just to be able to access them; that must be possible in a different way.
Direct writes to the ContentStream seem to use a different way (contentStream.setFont(pdfFont, fontSize)), so that is not affected.
The current AcroForm form field refreshing mechanism in PDFBox is not really usable in combination with fonts yet to be subsetted.
The cause is that whenever a font is used for refreshing an appearance, it is retrieved from some resources dictionary. In those resource dictionaries, though, there is not your original PDType0Font but only a preliminary version of the PDF objects backing your PDType0Font. But these PDF objects don't know that they back a font that eventually shall be subsetted, so retrieval of that font generates a new, different PDType0Font object which claims to be non-embedded. So it also is not informed about glyphs to eventually embed.
This is also the reason why the PDType0Font.load method you use is documented (in its JavaDoc comments) with the hint "If you are loading a font for AcroForm, then use the 3-parameter constructor instead":
/**
* Loads a TTF to be embedded and subset into a document as a Type 0 font. If you are loading a
* font for AcroForm, then use the 3-parameter constructor instead.
*
* @param doc The PDF document that will hold the embedded font.
* @param input An input stream of a TrueType font. It will be closed before returning.
* @return A Type0 font with a CIDFontType2 descendant.
* @throws IOException If there is an error reading the font stream.
*/
public static PDType0Font load(PDDocument doc, InputStream input) throws IOException
And the 3-parameter constructor in its documentation tells you not to use subsetting for fonts for AcroForm usage:
/**
* Loads a TTF to be embedded into a document as a Type 0 font.
*
* @param doc The PDF document that will hold the embedded font.
* @param input An input stream of a TrueType font. It will be closed before returning.
* @param embedSubset True if the font will be subset before embedding. Set this to false when
* creating a font for AcroForm.
* @return A Type0 font with a CIDFontType2 descendant.
* @throws IOException If there is an error reading the font stream.
*/
public static PDType0Font load(PDDocument doc, InputStream input, boolean embedSubset)
throws IOException
But even using that 3-parameter constructor with embedSubset set to false does not give a good result. At first glance the rendered fields look OK, but as soon as you click into them, something weird happens.
@Tilman, there probably still is something to fix here.
The underlying problem with the subset embedded font can also occur in other contexts, e.g.:
try ( PDDocument pdDocument = new PDDocument();
InputStream font_file = [...] ) {
PDType0Font font = PDType0Font.load(pdDocument, font_file);
PDResources pdResources = new PDResources();
COSName name = pdResources.add(font);
PDPage pdPage = new PDPage();
pdPage.setResources(pdResources);
pdDocument.addPage(pdPage);
try ( PDPageContentStream canvas = new PDPageContentStream(pdDocument, pdPage) ) {
canvas.setFont(pdResources.getFont(name), 12);
canvas.beginText();
canvas.newLineAtOffset(30, 700);
canvas.showText("Some test text.");
canvas.endText();
}
pdDocument.save("sampleOfType0Issue.pdf");
}
(RefreshAppearances test testIllustrateType0Issue)
I have a pdf file which shows font properties in Okular (or whatever PDF viewer) like that:
Name: Helvetica
Type: Type1
File: /usr/share/fonts/truetype/liberation2/LiberationSans-regular.ttf
Embedded: No
I want to embed Helvetica with PDFBox 2.x without modifying the file content (the text) itself, so that the font is always available with the file.
Is it possible at all?
I tried something like:
PDDocument document = PDDocument.load(myFile);
InputStream stream = new FileInputStream(new File("/home/user/fonts_temp/Helvetica.ttf"));
PDFont fontToEmbed = PDType0Font.load(document, stream, true);
PDResources resources = document.getPage(pageNumber).getResources();
resources.add(fontToEmbed);
//or use the font from pdfbox:
resources.add(PDType1Font.HELVETICA);
document.save(somewhere);
document.close();
I also tried to call
COSName fontCosName = resources.add(PDType1Font.HELVETICA);
resources.put(fontCosName, font);
What am I doing wrong?
Edit:
@TilmanHausherr thank you for the clue! But I'm still missing something. Currently my code looks like:
PDFont helvetica = PDType0Font.load(document, new FileInputStream(new File("/path/Helvetica.ttf")), false);
...
PDResources resources = page.getResources();
for (COSName fontCosName : resources.getFontNames()){
if(resources.getFont(fontCosName).getName().equals("Helvetica")) {
resources.put(fontCosName, helvetica);
}
}
End result shows
Helvetica CID TrueType Fully Embedded
But the font is not displayed in the PDF file at all now. The places where the font is used are literally empty, leaving a blank page. Something is still missing.
Font itself was downloaded from here
1. You'd need to know the name that is currently used in the resources, so check these with resources.getFontNames().
2. To replace a standard 14 font, use this font object:
PDTrueTypeFont.load(document, file, oldFont.getEncoding() /* or WinAnsiEncoding.INSTANCE which is usually right */ );
This ensures that the same encoding is used as for the standard 14 font. (It is different for the Zapf Dingbats and Symbol fonts.)
I am creating a report printer in Java using the LibreOffice SDK and Apache Batik. Using Batik, I draw SVGs which I then insert into a LibreOffice Writer document. The only way I found to insert an image is to load it from disk by its path. So far so good, but this means I have to explicitly save the SVG to disk in order to read it into LibreOffice again.
I tried to use a data URL as the image path, but it did not work. Is there any way to read an image from a stream, or anything else I can use without storing the file to disk?
I found a solution. It became clear once I realized that all the images I had added were just image links, so I had to embed the images instead.
To use this, you need:
Access to the XComponentContext
A TextGraphicObject in your document (see the links above)
The image as byte[] or use another stream
The code:
Object graphicProviderObject = xComponentContext.getServiceManager().createInstanceWithContext(
"com.sun.star.graphic.GraphicProvider",
xComponentContext);
XGraphicProvider xGraphicProvider = UnoRuntime.queryInterface(
XGraphicProvider.class, graphicProviderObject);
PropertyValue[] v = new PropertyValue[1];
v[0] = new PropertyValue();
v[0].Name = "InputStream";
v[0].Value = new ByteArrayToXInputStreamAdapter(imageAsByteArray);
XGraphic graphic = xGraphicProvider.queryGraphic(v);
if (graphic == null) {
LOGGER.error("Error loading the image");
return;
}
XPropertySet xProps = (XPropertySet) UnoRuntime.queryInterface(
XPropertySet.class, textGraphicObject);
// Set the image
xProps.setPropertyValue("Graphic", graphic);
This worked effortlessly even for my svg images.
Source: https://blog.oio.de/2010/05/14/embed-an-image-into-an-openoffice-org-writer-document/
I printed an image into a PDF using the following code:
InputStream in = new FileInputStream(new File("C:/"+imageName));
PDJpeg img = new PDJpeg(doc, in);
contentStream.drawXObject(img, 20, pageYaxis-120, 80, 80);
With imageName="a.jpg" this works fine, but with imageName="b.png" it does not. Why does it work for JPG images but not for PNG images? How can I print both formats, i.e. make the code format-independent?
In Apache PDFBox 1.8, use PDPixelMap for PNG images:
BufferedImage awtImage = ImageIO.read(new File(image));
ximage = new PDPixelMap(doc, awtImage);
In the source code of PDFBox, see the ImageToPDF.java example. This will work with all files that can be read with ImageIO. However it is still useful to keep using PDJpeg for JPG images, because there the JPEG files are directly put into the PDF files without being converted into a lossless format.
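To choose between the two paths without relying on the file extension, one option is to look at the first bytes of the file. The sketch below uses only the JDK. ImageKind is a hypothetical helper name, and the checks cover just the JPEG start-of-image marker (FF D8 FF) and the 8-byte PNG signature.

```java
/** Hypothetical helper: classifies image data by its leading bytes rather than by file name. */
final class ImageKind {
    // Fixed 8-byte signature that every PNG file starts with
    private static final int[] PNG_SIGNATURE = {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};

    private ImageKind() {}

    /** True if the data starts with the JPEG start-of-image marker FF D8 FF. */
    static boolean isJpeg(byte[] data) {
        return data != null && data.length >= 3
                && (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xFF) == 0xD8
                && (data[2] & 0xFF) == 0xFF;
    }

    /** True if the data starts with the PNG signature. */
    static boolean isPng(byte[] data) {
        if (data == null || data.length < PNG_SIGNATURE.length) {
            return false;
        }
        for (int i = 0; i < PNG_SIGNATURE.length; i++) {
            if ((data[i] & 0xFF) != PNG_SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With PDFBox 1.8, a caller could then keep PDJpeg for data where isJpeg returns true and fall back to ImageIO.read plus PDPixelMap for everything else, as described above.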
Bitmap alphaImage = BitmapFactory.decodeStream(in);
PDImageXObject alphaXimage = LosslessFactory.createFromImage(document, alphaImage);
There is a pdf file, and I want to import the 2nd page as an image and save it to a jpeg file.
Is it possible and how to do it?
This is the code how I import a page:
Document document = new Document();
File file = File.createTempFile("import", ".pdf"); // the prefix must be at least three characters long
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(file));
document.open();
final int backPage = 2;
PdfReader reader = new PdfReader(pdf.getAbsolutePath());
PdfImportedPage importedPage = writer.getImportedPage(reader, backPage);
com.lowagie.text.Image image = com.lowagie.text.Image.getInstance(importedPage);
Now I get an image instance, but I don't know how to write it to a jpeg file.
Image.getInstance(importedPage) does not (as one might assume) render the denoted page as some bitmap but merely creates a wrapper object to make the imported page easier to add to another PDF.
iText is not a PDF rendering tool, especially not the old com.lowagie variant. You may want to look at different products, e.g. JPedal.
Apparently (according to 1T3XT BVBA), what you get from a PDF page is only an iText Image wrapper, not a raster image.
You can store it anywhere if you later use it to put the page into another PDF; otherwise, you'll have to use a tool like JPedal:
http://www.idrsolutions.com/convert-pdf-to-images/
===================================
EDIT: maybe PDFBox can do it for you too!:
http://pdfbox.apache.org/commandlineutilities/PDFToImage.html
http://gal-levinsky.blogspot.it/2011/11/convert-pdf-to-image-via-pdfbox.html
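Once a rendering tool has produced a java.awt.image.BufferedImage for the page, writing the JPEG file itself needs only the JDK. The sketch below assumes such an image already exists; the rendering call is not shown because it depends on the tool. JpegExport is a hypothetical helper name. The image is first copied onto an opaque white RGB canvas, since the JDK's JPEG writer may refuse images that carry an alpha channel.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.OutputStream;
import javax.imageio.ImageIO;

/** Hypothetical helper: writes an already rendered page image as JPEG. */
final class JpegExport {
    private JpegExport() {}

    /** Copies the image onto an opaque white RGB canvas of the same size. */
    static BufferedImage toOpaqueRgb(BufferedImage source) {
        BufferedImage rgb = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, rgb.getWidth(), rgb.getHeight());
            g.drawImage(source, 0, 0, null);
        } finally {
            g.dispose();
        }
        return rgb;
    }

    /** Encodes the image as JPEG; throws if no JPEG writer accepts it. */
    static void writeJpeg(BufferedImage page, OutputStream out) throws IOException {
        // ImageIO.write returns false when no suitable writer is found
        if (!ImageIO.write(toOpaqueRgb(page), "jpg", out)) {
            throw new IOException("No JPEG writer available");
        }
    }
}
```

A caller would pass a FileOutputStream for the target .jpg file as the second argument.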