Display PDF documents in Java using iText (and not Acrobat Reader)? - java

I can get the file to open and display in an external reader, but is it possible to open it within Java (e.g. in a JPanel)? I am using iText. Would this be possible, or would I need something like ICEpdf?

Related

How to print a PDF file using native Java (or open and print)

I created invoice billing software using itextpdf 5. Is there any way to print the generated PDF using Java?
The easiest way is java.awt.Desktop.print(File), which hands the file to the application registered on the platform for printing that file type.
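A minimal sketch of the Desktop.print(File) approach, assuming a desktop environment with a PDF handler installed. The class and method names (PdfPrintSketch, printWithDesktop) are made up for illustration; only the java.awt.Desktop calls come from the JDK.

```java
import java.awt.Desktop;
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not part of iText or the JDK.
final class PdfPrintSketch {

    /**
     * Hands the file to the platform's registered print handler.
     * Returns false instead of throwing when printing is unavailable.
     */
    static boolean printWithDesktop(File pdf) {
        if (pdf == null || !pdf.isFile()) {
            return false; // nothing to print
        }
        if (!Desktop.isDesktopSupported()) {
            return false; // e.g. a headless server
        }
        Desktop desktop = Desktop.getDesktop();
        if (!desktop.isSupported(Desktop.Action.PRINT)) {
            return false; // platform offers no print action
        }
        try {
            desktop.print(pdf); // delegates to the associated application
            return true;
        } catch (IOException | RuntimeException e) {
            return false; // no handler for this file type, or launch failed
        }
    }
}
```

A call such as PdfPrintSketch.printWithDesktop(new File("invoice.pdf")) would then print the generated invoice if the platform allows it. The file name here is only an example.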

How can I get a preview image of an Excel document?

I've got an Excel document and I need to take an image of the first sheet and use it as the icon of a JLabel. How can I do it?
I don't think Apache POI provides anything here, as it is not concerned with displaying the data, only with retrieving and updating the data in the Excel workbook.
So basically only Excel and other office applications like LibreOffice know how to actually display the data.
A few alternative options that come to my mind:
Use some other software to display the contents as a web-page and use Selenium to take a screenshot, see e.g. Take a screenshot with Selenium WebDriver
Programmatically open the file in Excel on a Windows box, then use some screenshot utility with automation support

Java: how to close a PDF using PDF JavaScript?

I'm using a servlet to send a pdf stream to a browser.
There is a request that the pdf must open the print dialog when I show it to the user.
I was successful doing this using iText api. Something like this:
stamper.setPageAction(PdfWriter.PAGE_OPEN, new PdfAction(PdfAction.PRINTDIALOG), 1);
Now I need to close the PDF file after printing. I tried using PdfActions, but I can't get it to work. What I tried is:
writer.setAdditionalAction(PdfWriter.DID_PRINT, PdfAction.javaScript("app.execMenuItem('Close');", writer));
or
writer.setAdditionalAction(PdfWriter.DID_PRINT, PdfAction.javaScript("app.close();", writer));
I don't necessarily need to use PdfActions, but I don't see how to close the file after the user has sent the PDF to the printer.
Do you have any ideas?
There's a reason why app.execMenuItem('Close'); and app.close() don't work. These methods are designed to close the standalone version of Adobe Reader/Acrobat. I guess you're viewing the PDF in a browser, in which case you use Adobe Reader as a plug-in, Chrome's PDF viewer, pdf.js in Firefox, or any other PDF viewer.
Problem #1: you need to close the browser window from a PDF document. PDFs don't have the power to control your browser. Suppose they did: wouldn't that be a serious security issue?
Problem #2: you embed the PDF inside an HTML page (e.g. using an <object> tag) and establish communication between the JavaScript in the PDF and the JavaScript in the HTML. I've described how to do this in my book, but it won't work with all browsers on all OSs. For instance, Chrome's PDF viewer and Firefox's pdf.js will completely ignore your commands.
You are asking for a solution using our iText library, but you're asking for something that can't be done with any software.

Extract text from PDF (google app engine)

Is there any free Java library for extracting text from PDF that is compatible with Google App Engine?
I've read about PDFJet, but it can't read PDF, can it?
Is there perhaps another way to extract text from PDF? I tried http://www.pdfdownload.org/, but unfortunately it doesn't handle non-English characters correctly.
iText now has a text parsing module (I'm one of the parser authors). See the com.itextpdf.text.pdf.parser.PdfContentReaderTool class for an example of how to use it.
PdfBox does not run on GAE. It uses Java classes that are not allowed there.
(GAE only permits these http://code.google.com/appengine/docs/java/jrewhitelist.html)
I have partially modified a very old version of PdfBox (0.7.3) to be GAE compliant. Now I'm able to extract text from PDF (whole page or rectangular area). I only modified the minimum needed for PDF text extraction, not the whole of PdfBox. :)
The idea was to remove references to java.awt.Rectangle and related classes, using my own "rectangle" class instead.
More info: http://fhtino.blogspot.com/2010/04/pdfbox-text-extration-gae.html
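The replacement "rectangle" class itself is not shown in this answer. As a rough sketch of what an AWT-free stand-in could look like, assuming region-based extraction only needs a bounds test, something like the following would do. SimpleRect and its members are hypothetical names, not the class from the patched PdfBox.

```java
// Hypothetical AWT-free rectangle; it uses no java.awt classes,
// so it avoids the imports that the GAE whitelist rejects.
final class SimpleRect {
    final float x;
    final float y;
    final float width;
    final float height;

    SimpleRect(float x, float y, float width, float height) {
        if (width < 0 || height < 0) {
            throw new IllegalArgumentException("width and height must be non-negative");
        }
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    /** True when the point lies inside the rectangle (right and bottom edges excluded). */
    boolean contains(float px, float py) {
        return px >= x && px < x + width
            && py >= y && py < y + height;
    }
}
```

A text extractor could then keep only the characters whose position satisfies contains(...) for the requested region.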
I modified the latest (1.8.0-Snapshot) version to run on Google App Engine. I had to disable one unit test, but it runs fine for simple text extraction.
Following a simple try-fail-fix approach, I had to modify 5 files in total. Pretty doable.
You'll also have to explicitly use a RandomAccessBuffer, as Fabrizio explained.
For the extra lazy, here's the compiled jar, the dependencies for text extraction, and the patch. Note that it might not work for every use case (e.g. rectangle-based extraction). I used it to extract the text of a whole page.
https://docs.google.com/folder/d/0B53n_gP2oU6iVjhOOVBNZHk0a0E/edit
I know there is http://pdfbox.apache.org/index.html:
"Apache PDFBox is an open source Java PDF library for working with PDF documents. This project allows creation of new PDF documents, manipulation of existing documents and the ability to extract content from documents."
I've never tested it, though.
Last month I finished extracting text from PDF files in my project. I used the Xpdf tool to get the text and the text coordinates, but I used it from Xcode (Objective-C). The tool is open source, written in C++, and can be called from many languages. However, I don't know whether Xpdf would work with Java. Anyway, you can try it.

Generate PCL from PDF in Java

What is the best way to generate a PCL output file from an existing PDF file in Java?
It depends on how much you want to invest, and how robust the solution needs to be. For quick and dirty, you can print from Adobe Acrobat to a file, using a PCL driver (look, mom, no Java ...).
The Java Print Service API can process PDF. Use StreamPrintService and write the stream to a file, using PCL for the output format.
If you need more control over the content, for example to modify it or add to it, you can use a PDF parser (this one, for instance) and print the resulting HTML from a browser that your application starts, by adding some JavaScript, for example.
The StreamPrintService in JDK 6 only supports PostScript (PS) output. I am still searching for a StreamPrintService that supports PCL.
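Whether a given JDK offers a PCL-capable StreamPrintService can be checked at runtime. The sketch below lists the output formats that the installed stream print service factories report for a document flavor. StreamPrintFormats and its methods are illustrative names; the javax.print calls are standard, and "application/vnd.hp-PCL" is the MIME type the JDK's own PCL DocFlavor constants use.

```java
import java.util.ArrayList;
import java.util.List;
import javax.print.DocFlavor;
import javax.print.StreamPrintServiceFactory;

// Hypothetical helper for inspecting the available stream print services.
final class StreamPrintFormats {

    // MIME type used by DocFlavor's PCL constants.
    static final String PCL_MIME = "application/vnd.hp-PCL";

    /** Output MIME types of every factory that accepts the given input flavor. */
    static List<String> outputFormatsFor(DocFlavor flavor) {
        // Passing null as the output type matches all output formats.
        StreamPrintServiceFactory[] factories =
                StreamPrintServiceFactory.lookupStreamPrintServiceFactories(flavor, null);
        List<String> formats = new ArrayList<>();
        for (StreamPrintServiceFactory factory : factories) {
            formats.add(factory.getOutputFormat());
        }
        return formats;
    }

    /** True when at least one factory can turn the flavor into PCL. */
    static boolean canProducePcl(DocFlavor flavor) {
        for (String format : outputFormatsFor(flavor)) {
            if (PCL_MIME.equalsIgnoreCase(format)) {
                return true;
            }
        }
        return false;
    }
}
```

Calling StreamPrintFormats.outputFormatsFor(DocFlavor.INPUT_STREAM.PDF) shows what the runtime can do with a PDF input stream. An empty list means no stream print service accepts that flavor, and canProducePcl(...) returning false matches the limitation described above.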
We capture PCL generated from Acrobat printing a PDF to a PCL driver and redirect as input to our Windows console PCLXForm program. With a custom script, we can "stream edit" the PCL. We can extract the address block text for address correction, insert the corrected text, add the Intelligent Mail Barcode, 2-D barcodes, sort the documents, batch them by page count, change tray assignments, merge with other documents, etc. The product required is PCLTool SDK - Option V at www.pagetech.com
