Creating complex PDFs using Java

I have a Java/Java EE based application in which I need to create PDF certificates for various services provided to the users. I am looking for a way to create the PDFs (no need for digital certificates for now).
What is the easiest and most convenient way of doing that? I have tried:
XSL to PDF conversion
HTML to PDF conversion using iText
the crude Java way (using PdfWriter, PdfPCell, etc.)
Which of these is the best, or is there another way that is easier and more convenient?

When you talk about Certificates, I think of standard sheets that look identical for every receiver of the certificate, except for:
the name of the receiver
the course that was followed by the receiver
a date
If this is the case, I would use any tool that allows you to create a fancy certificate (Acrobat, Open Office, Adobe InDesign,...) and create a static form (sometimes referred to as an AcroForm) containing three fields: name, course, date.
I would then use iText to fill in the fields like this:
PdfReader reader = new PdfReader(pathToCertificateTemplate);
PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(pathToCertificate));
AcroFields form = stamper.getAcroFields();
form.setField("name", name);
form.setField("course", course);
form.setField("date", date);
stamper.setFormFlattening(true);
stamper.close();
reader.close();
Creating such a certificate from code is "the hard way", and creating it from XML is "a pain" (because XML isn't well suited for defining a layout). Creating a certificate from HTML + CSS is possible with iText's XML Worker. However, all of these solutions share one disadvantage: it's hard work to position every item correctly, to make sure everything fits on the same page, and so on.
It's much easier to maintain a template with fixed fields. This way, you only have to code once. If for some reason you want to move the fields to another place, you only have to change the template, you don't have to worry about messing around in code, XML, HTML or CSS.
See http://www.manning.com/lowagie2/samplechapter6.pdf for some more info (section 6.3.5).

Try using Jasper Reports mate. Check it out at http://community.jaspersoft.com/

I recommend the first method, XSL to PDF conversion, which is the most powerful. I have produced a lot of PDF reports (each with thousands of pages) without trouble using Apache FOP. I think it's good enough and fairly easy, but it requires some knowledge of XSL-FO.

Even though this is an old question, I think it should be answered.
To create very complex PDFs such as certificates, reports, or payment slips, you can definitely use the DynamicReports library. It depends on JasperReports (which is also a very popular and long-established library). DynamicReports lets you design your documents in Java code, so you can easily manipulate them or make changes as required.
There are lots of examples on their site, and it is very easy to learn from them.
Here is the link:
http://www.dynamicreports.org/

Bruno Lowagie pointed out a great way to generate a template that is basically the same for all data and only needs to be populated. However, Bruno Lowagie recommends iText as the library to populate the fields. For me, as for Ankit, its license was an issue, which is why I had to choose another library. Below is a step-by-step guide to creating a template and populating it with data using Apache PDFBox.
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>2.0.16</version>
</dependency>
Create a template with LibreOffice Writer. For placeholders use text boxes (View >> Toolbars >> Form Controls). This will create a PDF with AcroForms as suggested by Bruno Lowagie.
Set a name for each text box. Set read-only to true.
Save the document as PDF.
Read the PDF template with PDFBox and set the values for the text boxes.
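// Needs: java.io.FileOutputStream, java.io.IOException, java.io.InputStream, java.io.OutputStream,
// org.apache.pdfbox.pdmodel.PDDocument,
// org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm and PDField
// "Certificate.pdf" is only an example output path; write to whatever stream you need.
try (InputStream is = getClass().getClassLoader().getResourceAsStream("Template.pdf");
     PDDocument pDDocument = PDDocument.load(is);
     OutputStream outStream = new FileOutputStream("Certificate.pdf")) {
    PDAcroForm pDAcroForm = pDDocument.getDocumentCatalog().getAcroForm();
    // Look up the text box by the name that was set in LibreOffice
    PDField fieldName = pDAcroForm.getField("name");
    fieldName.setValue("FirstName Surname"); // <-- Replacement
    pDDocument.save(outStream);
    // The document and both streams are closed automatically by try-with-resources
} catch (IOException e) {
    e.printStackTrace();
}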

Use the iText PDF library for creating the PDFs. It will be easy for you to generate PDFs with that API. Here is the link:
http://itextpdf.com/
iText® is a library that allows you to create and manipulate PDF documents. It enables developers looking to enhance web and other applications with dynamic PDF document generation and/or manipulation.
Developers can use iText to:
Serve PDF to a browser
Generate dynamic documents from XML files or databases
Use PDF's many interactive features
Add bookmarks, page numbers, watermarks, etc.
Split, concatenate, and manipulate PDF pages
Automate filling out of PDF forms
Add digital signatures to a PDF file

You mentioned the PDFs can be complex. If this is to do with variability or layout, one option that provides reasonably sophisticated template-based layouts and controls is Docmosis. You provide Docmosis with doc or odt files as templates, so they are very easy to change, and then call Docmosis to mail-merge them to create the PDF or other formats. Please note that I work for the company that created Docmosis.
Hope that helps.

Related

PDFBox: convert PDF to text including chapter headline information

I am currently working on a project to extract the content of PDF files and search for certain keywords in them.
For extracting the content I am using PDFBox and this works fine.
The problem I now have encountered is that I want to be able to search for certain keywords only within chapter headlines.
At the moment my code for extracting looks like this:
PDDocument doc = PDDocument.load(pdfFile);
String text = new PDFTextStripper().getText(doc);
doc.close();
This only extracts the raw text of the file, with no information about headlines. I was not able to figure out how to use PDFBox to include such information, so I am not sure if this is even possible.
Does anybody have experience with this tool and can tell me whether this is possible with PDFBox and, if so, how to achieve it?
Kind regards

Auto-generate the Content in PDF format - Java

I'm developing a Java web App which could calculate one's IQ. I want the App to have an option Get Your Certificate at the end. I want a PDF file (A Certificate of appreciation) to be auto generated with the pre-entered name of the User and his IQ Score.
How can one achieve this? I've already seen this type of feature on some websites that provide certifications.
Java PDF APIs
Here is an answer to a similar question referencing a few well-known APIs.
Here is a more recent article detailing the licenses for those APIs.
Yet another listing of resources.
Flow of control
User clicks a link that generates a request that will be handled by the servlet.
Extract whatever you need from the URL within the servlet.
Use your chosen API to build the content for the PDF using a writer.
Push the PDF to the client.
Take a look at some iText samples. You can fill out a form, then click "flatten" and you have a PDF containing the data you used. As you're talking about a certificate, the easiest solution would be to create a PDF template using AcroForm technology. For instance: state.pdf is the interactive PDF that was used in the example I just mentioned.
The code used to fill out and flatten this form can be found here. For more examples, please read Chapter 6 of my book "iText in Action" (that chapter is available for free; you need section 6.3.5). I've also written a complete chapter about integrating code like this in a web application. You can find the examples that come with this chapter here.
Basically, you need to do something like this:
PdfReader reader = new PdfReader(src);
PdfStamper stamper = new PdfStamper(reader,
new FileOutputStream(dest));
AcroFields fields = stamper.getAcroFields();
fields.setField("name", "CALIFORNIA");
fields.setField("abbr", "CA");
fields.setField("capital", "Sacramento");
fields.setField("city", "Los Angeles");
fields.setField("population", "36,961,664");
fields.setField("surface", "163,707");
fields.setField("timezone1", "PT (UTC-8)");
fields.setField("timezone2", "-");
fields.setField("dst", "YES");
stamper.setFormFlattening(true);
stamper.close();
reader.close();
Caveat regarding the data that is entered: The simple example uses a very basic font that doesn't know how to display special characters. If you need characters such as ñ, é, à, etc., you'll need to introduce a font with more glyphs.
Caveat regarding the jsp-tag you used: I have written this helloworld.jsp that results in this PDF, which proves that it is possible to generate PDF from JSP. Nevertheless, it is a bad idea to do so. When you learned how to write JSP, your teacher probably told you that JSP shouldn't be used to create binary files. (If he didn't tell you this, he either forgot or he wasn't a good teacher.) As there are so many pitfalls when using JSP to create binary files, and as a JSP file is eventually compiled to a Servlet anyway, you should forget about creating a JSP to create a PDF and prefer writing a Servlet. It will save you plenty of time and your code will be easier to maintain (the slightest change to your JSP file can break the code).

Creating an invisible PDF object with iText

I have a program that outputs to PDF, however, I want it to be able to read from it.
I have come up with my own data type which my program is able to read, but I need it somehow included in PDF file (no multiple files, I want one file per single output).
I also need this data to be invisible and undetectable for the user.
I heard something about PDF dictionaries, but I'm not sure how to do it (or if there's another way). I do not want to use XMP/XML file, my data is more complex than key-value.
It would be nice if somebody could write me a couple of example lines of code that would enable me to:
add a new dictionary to a PDF using iText
populate it with data using iText
locate it in a file using iText
read from it using iText
You want to do something similar to what Adobe Illustrator is doing. If you create a PDF from Adobe Illustrator, you can encapsulate the original AI file. This gives you the impression the PDF can be edited. In reality, Adobe Illustrator takes the AI file and uses that to edit, and re-creates the PDF from the updated AI.
Where is this information stored? See ISO-32000-1 section 14.5:
"Conforming products may use this dictionary as a place to store private data in connection with that document, page, or form. Such private data can convey information meaningful to the conforming product that produces it (such as information on object grouping for a graphics editor or the layer information used by Adobe Photoshop®) but may be ignored by general-purpose conforming readers."
I'm not sure what is being asked here. If you're asking for advice along the lines of what I answered above: you can, for instance, add a PieceInfo entry to the Root dictionary (aka Catalog). This is all documented, isn't it? Read the ISO specification, and read part 4 of "iText in Action".
If your question is "write some code for me that does what I need to do", then I believe that's more or less in violation of the goal of this site.
Well, you could hex-encode your data as a String and then draw it off-screen like this:
cb.showTextAligned(PdfContentByte.ALIGN_LEFT, "HIDDENDATA_" + hexencodeddata, 2000f, 2000f, 0f);
To read it back, process all strings in the PDF, searching for the ones that start with HIDDENDATA_.
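The encode and search steps could be sketched roughly as follows, using only the JDK. This does not touch iText at all: it assumes you already have a way to draw the marked string (as above) and a way to get the strings of the PDF back as a list. The class and method names (HiddenPayload, encode, decodeAll) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: wrap a payload in a hex-encoded, marker-prefixed string and recover it later. */
final class HiddenPayload {
    // Marker taken from the answer above; any prefix unlikely to occur in real text would do.
    static final String MARKER = "HIDDENDATA_";

    /** Hex-encodes the payload bytes (lowercase, two digits per byte) and prepends the marker. */
    static String encode(byte[] payload) {
        StringBuilder sb = new StringBuilder(MARKER);
        for (byte b : payload) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Decodes every string that starts with the marker; all other strings are skipped. */
    static List<byte[]> decodeAll(List<String> extractedStrings) {
        List<byte[]> result = new ArrayList<>();
        for (String s : extractedStrings) {
            if (!s.startsWith(MARKER)) {
                continue;
            }
            String hex = s.substring(MARKER.length());
            byte[] out = new byte[hex.length() / 2];
            for (int i = 0; i < out.length; i++) {
                // Throws NumberFormatException if the text after the marker is not valid hex
                out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
            }
            result.add(out);
        }
        return result;
    }

    public static void main(String[] args) {
        String marked = encode("field1=Hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(marked);
        // Pretend these strings came back from a PDF text extraction step
        List<byte[]> found = decodeAll(Arrays.asList("Visible page text", marked));
        System.out.println(new String(found.get(0), StandardCharsets.UTF_8));
    }
}
```

Keep in mind that text drawn off-screen is still ordinary page content, so any text extraction tool will see it too; it is hidden from the eye, not from other software.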
Another way is to use Annotations
public void addAnnotation(PdfWriter writer,
Document document, Rectangle rect, String text) {
PdfAnnotation annotation = new PdfAnnotation(writer,
new Rectangle(
rect.getRight() + 10, rect.getBottom(),
rect.getRight() + 30, rect.getTop()));
annotation.setTitle("Text annotation");
annotation.put(PdfName.SUBTYPE, PdfName.TEXT);
annotation.put(PdfName.OPEN, PdfBoolean.PDFFALSE);
annotation.put(PdfName.NAME, new PdfName(text));
writer.addAnnotation(annotation);
}
And then use something like this to read it:
http://downloads.snowtide.com/javadoc/PDFTextStream/2.3.2/com/snowtide/pdf/PDFTextStream.html

PDF Handling in Java

I have created a program that should one day become a PDF editor.
Its purpose will be saving the GUI's textual content to a PDF and loading it back from it. The GUI resembles a text editor, but it only has certain fields (JTextAreas, actually).
It can look like this (this is only one page, it can have many more; the upper and lower margins are also cut out of the picture). It should actually resemble A4 in pixel size.
I have looked around a bit for PDF libraries and found that iText could suit my PDF-creating needs. However, if I understood correctly, it retrieves the text of a whole page as one string, which won't work for me, because I need to detect different fields/paragraphs/or something to be able to load them back into the program.
Now, I'm a bit lazy, and I don't want to spend hours going through numerous PDF libraries just to find out that they won't work for me.
Instead, I'm asking someone with a bit more Java PDF handling experience to recommend one according to my needs.
Or maybe recommend how to add invisible parts to the PDF which will help my program determine where exactly it is situated inside a PDF file...
Just to be clear (I phrased my question wrongly before), the only thing I need to put in my PDF is text, and that's all I need to be able to get out later. My program should be able to read the PDFs that it created itself...
Also, because of the designated use of files created with this program, they need to be in the PDF format.
Short Answer: Use an intermediate format like JSON or XML.
Long Answer: You're using PDFs in a manner that they weren't designed for. PDFs were not designed to store data; they were designed to present and format data in a portable form. Furthermore, a PDF is a very "heavy" way to store data. I suggest storing your data in another manner, perhaps in a format like JSON or XML.
The advantage is that you are then not tied to a specific output format like PDF. This can come in handy later on if you decide that you want to export your data into another format (like a Word document, or an image), because you now have a common representation.
I found this link and another link that provide examples showing how to store and read back metadata in your PDF. This might be what you're looking for, but again, I don't recommend it.
If you really insist on using PDF to store data, I suggest that you store the actual data in either XML or RDF and then attach that to the PDF file when you generate it. Then you can read the XML back for the data.
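As a rough sketch of the XML round trip (JDK only, no PDF library involved): the code below writes a map of field names to text as XML and reads it back. The element names fields/field, the class name FieldStore, and the sample field names are assumptions for illustration. Attaching the resulting bytes to the PDF, and extracting them again, is left to whichever PDF library you choose.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Sketch: keep the editor's field contents in XML, independent of the PDF rendering. */
final class FieldStore {
    /** Serializes field name -> text pairs as <fields><field name="...">text</field></fields>. */
    static byte[] toXml(Map<String, String> fields) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("fields");
            doc.appendChild(root);
            for (Map.Entry<String, String> entry : fields.entrySet()) {
                Element field = doc.createElement("field");
                field.setAttribute("name", entry.getKey());
                field.setTextContent(entry.getValue()); // special characters are escaped on output
                root.appendChild(field);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not serialize fields", ex);
        }
    }

    /** Parses XML produced by toXml back into a map, preserving field order. */
    static Map<String, String> fromXml(byte[] xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            NodeList nodes = doc.getElementsByTagName("field");
            Map<String, String> fields = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element field = (Element) nodes.item(i);
                fields.put(field.getAttribute("name"), field.getTextContent());
            }
            return fields;
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse fields", ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "Quarterly report");
        fields.put("body", "Figures for Q1 & Q2");
        byte[] xml = toXml(fields);
        System.out.println(new String(xml, StandardCharsets.UTF_8));
        System.out.println(fromXml(xml));
    }
}
```

With this split, the PDF is only ever written for display, and the program loads its fields from the XML rather than trying to parse page text.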
Assuming that your application will only consume PDF files generated by the same application, there is one part of the PDF specification called Marked Content, that was introduced precisely for this purpose. Using Marked Content you can specify the structure of the text in your document (chapter, paragraph, etc).
Read Chapter 14 - Document Interchange of the PDF Reference Document for more details.

How to generate a printable output for a phonebook

I'm developing a desktop software to manage people and telephones, and also to generate (export) a list of telephones (also with a summary of the cities) that can be printed (like pdf). The part of telephones management is ready and was made with java and swt/jface. Exporting the list in a print friendly format is what has become an issue.
I tried exporting the list in HTML with CSS, but the result is not the same in different browsers.
I was thinking about generating it in LaTeX, but creating a style is getting too complicated (I need an A7 page size, smaller fonts, ...).
What file format can be used to export this list? Is there an easy way to generate printable stuff?
Edit: forgot to mention that the file will be sent to a company to be printed.
Thanks!
Generate a PDF; it will look the same no matter what browser they use. You can use iText to create the PDF; it is fairly straightforward for a simple PDF.
You could just draw an image; it will stay the same on different systems and it's easy to print. By drawing it, you can style it the way you imagine, without learning any document format. It should be easy to draw a simple table.
Plain text is a very friendly format for me. Although this could be done with HTML and CSS, if you keep the style complexity to a minimum. Try reading:
http://www.smashingmagazine.com/2010/06/07/the-principles-of-cross-browser-css-coding/
And be careful when choosing your properties!
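For the plain-text route, a minimal sketch could look like the following. It renders the list in fixed-width columns followed by a per-city summary, using only the JDK. The Contact class, the column widths (20 and 15 characters), and the sample data are assumptions for illustration, not the asker's actual data model.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: export a phonebook as fixed-width plain text with a per-city summary. */
final class PhonebookExport {
    /** One phonebook row; the three fields are hypothetical. */
    static final class Contact {
        final String name;
        final String phone;
        final String city;

        Contact(String name, String phone, String city) {
            this.name = name;
            this.phone = phone;
            this.city = city;
        }
    }

    /** Renders a header, one line per contact, then the number of entries per city. */
    static String render(List<Contact> contacts) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format(Locale.ROOT, "%-20s %-15s %s\n", "Name", "Phone", "City"));
        Map<String, Integer> perCity = new TreeMap<>(); // sorted by city name
        for (Contact c : contacts) {
            sb.append(String.format(Locale.ROOT, "%-20s %-15s %s\n", c.name, c.phone, c.city));
            perCity.merge(c.city, 1, Integer::sum);
        }
        sb.append("\nEntries per city\n");
        for (Map.Entry<String, Integer> entry : perCity.entrySet()) {
            sb.append(String.format(Locale.ROOT, "%-20s %d\n", entry.getKey(), entry.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Contact> contacts = Arrays.asList(
                new Contact("Ann", "555-0100", "Oslo"),
                new Contact("Bob", "555-0101", "Bergen"),
                new Contact("Cy", "555-0102", "Oslo"));
        System.out.print(render(contacts));
    }
}
```

The columns only line up in a monospaced font, and names longer than the column width will push the rest of the line to the right, so this suits a quick export more than a polished print job.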
