Using sun PDF Renderer to display PDFs with embedded fonts - java

I'm having trouble using Sun's PDF Renderer package to view PDFs with embedded fonts. I have the following code which creates a BufferedImage out of every page of a PDF for viewing in my application, and it works fine when there are no embedded fonts. However, when the PDF has embedded fonts, it shows no text. Any ideas? Also, it opens fine in Adobe's PDF viewer.
File f = new File("C:\\test.pdf");
FileChannel fc = new RandomAccessFile(f, "r").getChannel();
PDFFile pdfFile = new PDFFile(fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size()));
for (int x = 0; x < pdfFile.getNumPages(); x++) {
    try {
        // Look the page up once instead of on every call
        PDFPage page = pdfFile.getPage(x + 1);
        int width = (int) page.getWidth();
        int height = (int) page.getHeight();
        BufferedImage bi = (BufferedImage) page.getImage(
                width, height,
                new Rectangle(width, height),
                null, true, true);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

I figured this out by switching PDF renderers from PDFRenderer to PDFBox, which works much better. More info is available here.

You could also look at Icesoft, iText, JPedal, and Multivalent, which offer open-source PDF tools.

Related

Using Apache PDFBox 1.3.1 - Trying to scale down all images in PDF file

I am forced to use version 1.3.1 of Apache PDFBox; I wish I could use 2.x, but I can't. I am trying to scale down the images in some PDF files but am having trouble. As I understand it, with 2.x I could use the {PDPageContentStream_obj}.transform(...) method, but that is not available to me, so I need to use .concatenate2CTM(...) instead. I am trying that but not getting the anticipated results: the graphics in the resulting .pdf files are displayed at their original scale. I need to do this to get the file size under 5 MB so that the file can be sent through a gateway that would time out on anything bigger. Changing the gateway timeout is not an option, unfortunately. My code currently looks like this:
private File compressFile2(File file) throws Exception {
    try {
        PDDocument doc = PDDocument.load(file);
        List<PDPage> pdPages = doc.getDocumentCatalog().getAllPages();
        Iterator<PDPage> iterPdPages = pdPages.iterator();
        while (iterPdPages.hasNext()) {
            PDPage page = iterPdPages.next();
            PDPageContentStream cs = new PDPageContentStream(doc, page, true, true);
            cs.saveGraphicsState();
            cs.concatenate2CTM(.5, 0, 0, .5, 0, 0);
            cs.saveGraphicsState();
        }
        doc.save(file.getAbsolutePath() + "_compressed");
    } catch (Exception e) {
        e.printStackTrace();
        throw e;
    }
    return new File(file.getAbsolutePath() + "_compressed");
}
Any ideas are appreciated. Thank you.
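Not a confirmed diagnosis, but two things in the posted code stand out. The content stream is opened in append mode, so the scaling matrix is written after the existing page content and probably never affects it, and the stream is never closed. More fundamentally, a transformation matrix only changes how content is drawn; the image bytes stored in the file stay the same size. To shrink the file, the pixel data itself has to be resampled. A minimal sketch of that resampling step using only the JDK follows. The class name ImageDownscaler is hypothetical, and writing the smaller image back into the PDF with PDFBox 1.3.1 is deliberately left out.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

class ImageDownscaler {
    // Returns a resampled copy of src whose width and height are multiplied by factor.
    static BufferedImage downscale(BufferedImage src, double factor) {
        int w = Math.max(1, (int) Math.round(src.getWidth() * factor));
        int h = Math.max(1, (int) Math.round(src.getHeight() * factor));
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            // Bilinear interpolation gives smoother results than nearest neighbour
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return dst;
    }
}
```

With a factor of 0.5 the image holds a quarter of the original pixels, which is where the size reduction would come from once the image is re-encoded.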

While parsing pdf with iText7 chars move on fixed interval (with Freeset font)

I'm trying to parse a PDF that I created with iText. The document has two paragraphs:
"Имя" - ("name" in Russian) - font: Helvetica, size: 20.
"Фамилия" - ("surname" in Russian) - font: Freeset (I downloaded it here), size: 10.
When parsing finishes, I get "Имя" properly decoded but "Ôàìèëèÿ" instead of "Фамилия". These are the Unicode characters of "Фамилия" shifted down by 848 code points (decimal). For instance, instead of "Ф" (U+0424) I get "Ô" (U+00D4), and the difference between them is 848 (0x350).
I use this example to get text from the PDF, but instead of filtering by font, I filter by equality to one of the strings in the set ("Имя", "Фамилия").
I know that we are advised to store non-English characters as sequences of Unicode escapes, but I'm creating the PDF on the fly from incoming data, so I can't manually retype it as separate Unicode escapes (if you know how to do that on the fly, please describe your approach).
Any ideas on why the characters shift and how to avoid it are welcome. Thank you in advance.
Here is the file I worked with.
Edit
I tried opening the file in Acrobat Pro and everything is fine there. Acrobat also shows that all three fonts I put in the PDF are still in the document.
Here is the code I used to create the PDF I'm processing:
private static void create() throws IOException {
    PdfDocument pdf = new PdfDocument(new PdfReader(srcPdf), new PdfWriter(targetPdf));
    PdfCanvas pdfCanvas = new PdfCanvas(pdf.getFirstPage());
    PdfFont freeset = getPdfFont(freesetPath);
    PdfFont helvetica = getPdfFont(helveticaPath);
    PdfFont circe = getPdfFont(circePath);
    pdfCanvas.beginText()
            .setFontAndSize(helvetica, 15)
            .setColor(Color.RED, true)
            .moveText(50, 300)
            .showText("Имя")
            .setFontAndSize(freeset, 10)
            .setColor(Color.GREEN, true)
            .moveText(0, -30)
            .showText("Фамилия")
            .setFontAndSize(circe, 20)
            .setColor(Color.BLUE, true)
            .moveText(0, -30)
            .showText("Должность")
            .endText();
    pdf.close();
}

private static PdfFont getPdfFont(String path) throws IOException {
    InputStream fontInputStream = new FileInputStream(path);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    byte[] buffer = new byte[2048];
    int a;
    // Read the whole font file into memory
    while ((a = fontInputStream.read(buffer, 0, buffer.length)) != -1) {
        baos.write(buffer, 0, a);
    }
    baos.flush();
    return PdfFontFactory.createFont(baos.toByteArray(),
            PdfEncodings.IDENTITY_H, true);
}
iText 7 appears to have an issue with embedding the font in question. I don't know whether it's a bug in the font or in iText, though.
The "FreeSet" font is indeed embedded in the OP's sample document with an incorrect ToUnicode map:
...
6 beginbfrange
<009e> <009e> <00d4>
<00aa> <00aa> <00e0>
<00b2> <00b2> <00e8>
<00b5> <00b5> <00eb>
<00b6> <00b6> <00ec>
<00c9> <00c9> <00ff>
endbfrange
...
which maps the glyphs used for "Фамилия" to 00d4, 00e0, 00e8, 00eb, 00ec, and 00ff.
This in turn explains why both iText and Adobe Reader extract unexpected text.
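To make the bfrange excerpt easier to read, here is a small sketch that expands such entries into a glyph-code to Unicode table. It is an illustration only, not how iText or Adobe Reader parse CMaps: the class name BfRangeReader is hypothetical, and it handles just the simple form with 4-hex-digit codes and a single destination value, which is all the excerpt contains.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BfRangeReader {
    // Matches one "<lo> <hi> <dst>" triple of 4-digit hex codes
    private static final Pattern TRIPLE = Pattern.compile(
            "<([0-9A-Fa-f]{4})>\\s*<([0-9A-Fa-f]{4})>\\s*<([0-9A-Fa-f]{4})>");

    // Expands every triple into individual glyph-code -> character entries.
    static Map<Integer, Character> parse(String bfrange) {
        Map<Integer, Character> map = new LinkedHashMap<>();
        Matcher m = TRIPLE.matcher(bfrange);
        while (m.find()) {
            int lo = Integer.parseInt(m.group(1), 16);
            int hi = Integer.parseInt(m.group(2), 16);
            int dst = Integer.parseInt(m.group(3), 16);
            // Codes lo..hi map to consecutive code points starting at dst
            for (int code = lo; code <= hi; code++) {
                map.put(code, (char) (dst + (code - lo)));
            }
        }
        return map;
    }

    public static void main(String[] args) {
        String excerpt = "<009e> <009e> <00d4> <00aa> <00aa> <00e0> <00b2> <00b2> <00e8> "
                + "<00b5> <00b5> <00eb> <00b6> <00b6> <00ec> <00c9> <00c9> <00ff>";
        for (Map.Entry<Integer, Character> e : parse(excerpt).entrySet()) {
            System.out.printf("glyph 0x%04X -> U+%04X%n", e.getKey(), (int) e.getValue());
        }
    }
}
```

Every destination lands in the Latin-1 range (U+00D4 to U+00FF) rather than in the Cyrillic block, which matches the garbled extraction.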
The issue can be reproduced like this:
PdfFont arial = PdfFontFactory.createFont(BYTES_OF_ARIAL_FONT, PdfEncodings.IDENTITY_H, true);
PdfFont freeSet = PdfFontFactory.createFont(BYTES_OF_FREESET_FONT, PdfEncodings.IDENTITY_H, true);
try (   OutputStream result = new FileOutputStream("cyrillicTextFreeSet.pdf");
        PdfWriter writer = new PdfWriter(result);
        PdfDocument pdfDocument = new PdfDocument(writer);
        Document doc = new Document(pdfDocument)   ) {
    doc.add(new Paragraph("Фамилия").setFont(arial));
    doc.add(new Paragraph("Фамилия").setFont(freeSet));
}
(CreateCyrillicText test testCreateTextWithFreeSet)
The result looks ok:
When extracting / copying&pasting, though:
The embedded Arial subset has a proper ToUnicode map, the text in Arial is extracted as "Фамилия".
The embedded FreeSet subset has an incorrect ToUnicode map, the text in FreeSet is extracted as "Ôàìèëèÿ".
(Tested with the current iText 7.1.1-SNAPSHOT)
Apparently iText 7 does understand the FreeSet font program well enough to select the needed subset and reference the correct glyphs from the content but it has problems building an appropriate ToUnicode map. This is not a general problem, though, as the parallel test with Arial shows.
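As a side note on the 848 (0x350) offset from the question: that is exactly the pattern produced when Cyrillic text is encoded as Windows-1251 and the resulting bytes are then read as Latin-1. Whether the font file or iText is responsible for this is not something the sketch below can tell; it only demonstrates the coincidence and shows that the garbled text can be mapped back. The class name CyrillicShiftDemo is hypothetical, and the strings are written with \u escapes so the example does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CyrillicShiftDemo {
    private static final Charset CP1251 = Charset.forName("windows-1251");

    // Encodes text as Windows-1251, then (wrongly) decodes the bytes as ISO-8859-1.
    static String misdecode(String text) {
        return new String(text.getBytes(CP1251), StandardCharsets.ISO_8859_1);
    }

    // Reverses the mix-up: Latin-1 chars back to bytes, then decode as Windows-1251.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), CP1251);
    }

    public static void main(String[] args) {
        String surname = "\u0424\u0430\u043c\u0438\u043b\u0438\u044f"; // "Фамилия"
        String garbled = misdecode(surname);
        // Print code points rather than glyphs to avoid console encoding issues
        for (int i = 0; i < surname.length(); i++) {
            System.out.printf("U+%04X -> U+%04X (difference %d)%n",
                    (int) surname.charAt(i), (int) garbled.charAt(i),
                    surname.charAt(i) - garbled.charAt(i));
        }
        System.out.println("round trip ok: " + repair(garbled).equals(surname));
    }
}
```

Repairing extracted text this way is only a workaround; the underlying fix would still be a correct ToUnicode map in the generated PDF.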

How to insert graphs into a pdf?

I would like to draw two stacked bar graphs with labels into a PDF file in Java. The input data for the graphs would come from MongoDB. How can I do that?
Using JFreeChart and PDFBox, I once did something similar to what you are asking for in a report. Making a pie chart went as follows:
import java.io.File;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartUtilities;
import org.jfree.chart.JFreeChart;
import org.jfree.data.general.DefaultPieDataset;

public class PieChartExample {
    public static void main(String[] args) {
        // Create a simple pie chart
        DefaultPieDataset pieDataset = new DefaultPieDataset();
        pieDataset.setValue("Chrome", 42);
        pieDataset.setValue("Explorer", 24);
        pieDataset.setValue("Firefox", 24);
        pieDataset.setValue("Safari", 12);
        pieDataset.setValue("Opera", 8);
        JFreeChart chart = ChartFactory.createPieChart3D(
                "Browser Popularity", // Title
                pieDataset,           // Dataset
                true,                 // Show legend
                true,                 // Use tooltips
                false                 // Configure chart to generate URLs?
        );
        try {
            // Render the chart to a 500x300 JPEG on disk
            ChartUtilities.saveChartAsJPEG(new File("C:\\Users\\myname\\Desktop\\chart.jpg"), chart, 500, 300);
        } catch (Exception e) {
            System.out.println("Problem occurred creating chart: " + e.getMessage());
        }
    }
}
The above example came from a PDF that I think is available on their website; it has examples for other chart types if you need them. Once the chart was saved, I could import it into the PDF along these lines:
try {
    PDDocument document = new PDDocument();
    PDPage page = new PDPage(PDPage.PAGE_SIZE_A4);
    document.addPage(page);
    InputStream in = new FileInputStream(new File("c:/users/myname/desktop/chart.jpg"));
    PDJpeg img = new PDJpeg(document, in);
    PDPageContentStream contentStream = new PDPageContentStream(document, page);
    contentStream.drawImage(img, 10, 300);
    contentStream.close();
    document.save("pathway/to/save.pdf");
} catch (IOException e) {
    System.out.println(e);
} catch (COSVisitorException cos) {
    System.out.println(cos);
}
iText is also a good library for PDF manipulation, but it requires a commercial license beyond a certain point, whereas PDFBox is open source.
Good Luck!
You can use gnujavaplot. It's an api enabling you to call gnuplot via Java.
You can use any charting library to generate the chart (some example libraries are listed here), and then add it to your PDF using iText.
You can take a look at JasperReports. It's a Java framework for generating reports in PDF and other file formats.
It has integrated support for various types of charts using the JFreeChart library.
However, I should warn you that the learning curve for JasperReports is quite steep. Perhaps you could consider using a combination of JFreeChart with iText instead, as suggested in this post.

PDF trailer error on pdfs created by PDFBox

I am trying to create a PDF file with PDFBox and then create an image from it using the commercial library jPDFImages (Qoppa Software). Yes, I know that PDFBox can also create images from PDFs, but for various reasons I need to use the commercial library.
I created the PDF file and passed it to jPDFImages, but I get an error: "Unable to find PDF trailer". Qoppa Software describes this error.
The problem seems to be in the PDF trailer created by PDFBox, but I don't understand how to set it up correctly. (I have this problem only with PDFs created with PDFBox.)
Here is my code for PDF creation:
public void createPDFFromImage(String file) throws Exception {
    PDDocument doc = null;
    try {
        doc = new PDDocument();
        BufferedImage bufferedImage = ImageIO.read(new File("/home/.../files/test.png"));
        PDPage page = new PDPage();
        doc.addPage(page);
        PDJpeg ximage = new PDJpeg(doc, bufferedImage, (float) 0.95);
        PDPageContentStream contentStream = new PDPageContentStream(doc, page);
        // x, y, W and H are not shown in the question
        contentStream.drawXObject(ximage, x, y, W, H);
        contentStream.close();
        doc.save(file);
    } finally {
        if (doc != null) {
            doc.close();
        }
    }
}
Here is the error from the commercial library:
java.lang.RuntimeException: com.qoppa.pdf.PDFException: Unable to find PDF trailer.
Caused by: com.qoppa.pdf.PDFException: Unable to find PDF trailer.
I think the problem lies in how I create the PDF. Maybe I need to add some information to the PDF to make it valid?
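One way to narrow this down, offered as a diagnostic sketch rather than a fix: a complete PDF normally ends with a startxref line, a byte offset, and a %%EOF marker, and files that use a classic cross-reference table also contain the keyword trailer shortly before that. Checking the tail of the generated file for these markers can show whether the file is truncated or simply uses a structure the other library does not expect. The class name PdfTailCheck and the window sizes are assumptions chosen for illustration, and the check says nothing about how jPDFImages itself looks for the trailer.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class PdfTailCheck {
    // Decodes the last `window` bytes as Latin-1 so that every byte maps to one char.
    static String tail(byte[] pdf, int window) {
        int from = Math.max(0, pdf.length - window);
        return new String(pdf, from, pdf.length - from, StandardCharsets.ISO_8859_1);
    }

    static boolean hasEofMarker(byte[] pdf) {
        return tail(pdf, 1024).contains("%%EOF");
    }

    static boolean hasStartXref(byte[] pdf) {
        return tail(pdf, 1024).contains("startxref");
    }

    // Absent in files that use cross-reference streams instead of a classic table
    static boolean hasTrailerKeyword(byte[] pdf) {
        return tail(pdf, 2048).contains("trailer");
    }

    public static void main(String[] args) throws Exception {
        // Pass the path of the generated PDF as the first argument
        byte[] pdf = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("size in bytes:   " + pdf.length);
        System.out.println("%%EOF marker:    " + hasEofMarker(pdf));
        System.out.println("startxref:       " + hasStartXref(pdf));
        System.out.println("trailer keyword: " + hasTrailerKeyword(pdf));
    }
}
```

If all three markers are missing or the file is empty, the file was probably not written completely, for example because it was read before doc.save(file) finished or a different path was passed on.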

Exporting SWT Image into PDF using Java iText API

I have an SWT image that I want to export to a PDF file using the iText API.
I have tried saving the image to disk and then using its path to export it to the PDF, but generating the PDF this way takes a lot of time.
I have also tried converting the SWT image into an AWT image and then exporting that to the PDF; this approach takes even more time.
Another approach I have been trying is to convert the raw image data into a JPEG in a ByteArrayOutputStream using an ImageLoader object, as shown below:
ImageLoader tempLoader = new ImageLoader();
tempLoader.data = new ImageData[] {
    image.getImageData()
};
ByteArrayOutputStream bos = new ByteArrayOutputStream();
tempLoader.save(bos, SWT.IMAGE_JPEG);
Now I am using this ByteArrayOutputStream as input to
OutputStream outStream = new FileOutputStream(selectedPathAndName);
Document document = new Document();
document.setMargins(0,0,0,0);
document.setPageSize(new Rectangle(0,0,width,height));
PdfWriter.getInstance(document, outStream);
document.open();
com.itextpdf.text.Image pdfImage = com.itextpdf.text.Image.getInstance(bos.toByteArray());
document.add(pdfImage);
document.close();
This generates PDF files with the width and height I have set, but the page seems to be empty.
Any suggestions or alternative approaches are most welcome.
Thank you,
It looks like your page sizes are zero; try setting them to something like A4 in the constructor.
Document document = new Document(PageSize.A4, 50, 50, 50, 50);
