Hi all,
Is there any kind of abstraction API over Apache POI/FOP that lets one use the same API to write both Word and PDF documents?
I'm not aware of a unified API for the two libraries you mentioned.
However, you still have a couple of options that use a single API:
Use Apache POI to generate the documents in Word format, and then use a Word-to-PDF conversion library to create a PDF from the Word document. Another commenter has suggested iText.
Use OpenOffice via its Java API to create documents and export them in Microsoft Word or PDF format.
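As a rough illustration of the second option, here is a minimal sketch that drives the office suite from the command line instead of through its Java (UNO) API. It assumes a LibreOffice install whose `soffice` binary is on the PATH and accepts `--headless --convert-to`; the class name `OfficeExporter` and its methods are made up for this example and are not part of any library.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

/** Sketch: one entry point that exports a source document to Word or PDF. */
class OfficeExporter {

    /** Target formats, mapped to the extension passed to --convert-to. */
    enum Format {
        WORD("docx"), PDF("pdf");

        final String extension;

        Format(String extension) {
            this.extension = extension;
        }
    }

    /** Builds the command line; "soffice" on the PATH is an assumption. */
    static List<String> command(Path source, Path outDir, Format format) {
        return List.of("soffice", "--headless",
                "--convert-to", format.extension,
                "--outdir", outDir.toString(),
                source.toString());
    }

    /** Runs the conversion and returns the process exit code. */
    static int export(Path source, Path outDir, Format format)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(source, outDir, format))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        Path source = Path.of(args[0]); // e.g. an .odt document
        Path outDir = Path.of(args[1]);
        for (Format format : Format.values()) {
            System.out.println(format + " exit code: " + export(source, outDir, format));
        }
    }
}
```

The point of the sketch is only that the same source document and the same call can yield either format; the document itself would still have to be authored by other means.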
Docmosis will do what you require, assuming you mean a Java (or command line) API. It reads doc and odt files as templates, populates/manipulates them via the Java API, and produces the output formats OpenOffice supports. Have a look at the online demo on the website, which lets you see the various output formats a document can be rendered in.
When I was working on a previous project, I found that Apache POI can be used for Microsoft documents.
There is also iText.jar, which can be used for PDF generation and alteration. Please check whether this helps you.
Related
I have a requirement to convert an Excel template to Word. Then, using Aspose.Words for Java, I can merge all the Word templates (including the converted Excel template) into a PDF file.
Aspose, iText, POI, Jasper, BIRT, etc. don't support this. Is there any API in Java that allows this kind of conversion?
You cannot convert Excel spreadsheets to Word documents directly via the Aspose.Cells APIs. FYI, Aspose.Cells is a spreadsheet management library that handles MS Excel file formats only. We have another component, Aspose.Words, that manages and merges MS Word documents. For your specific requirements, I think you will have to use two Aspose APIs in two steps: Aspose.Cells and Aspose.Pdf. First use the Aspose.Cells APIs to convert the spreadsheet formats (XLS/XLSX, etc.) to PDF. Then use the Aspose.Pdf APIs to convert the PDF to a Word document.
I am working as Support developer/ Evangelist at Aspose.
You can try Apache POI, the Java API for Microsoft Documents.
Have a look here:
http://viralpatel.net/blogs/java-read-write-excel-file-apache-poi/
Is it possible to convert from MS office file formats using Apache PDFBox (the documentation isn't clear about this, and the javadoc seems to indicate no such capability exists), or would I need to do some tedious conversions with Apache POI?
The reason I'm asking is the answer to this StackOverflow question:
https://stackoverflow.com/questions/10861227/convert-ms-office-to-pdf-in-java
I imagine I'll need to use Apache POI, but I wanted to clarify.
In order to do this conversion, you will need MS Office, or perhaps Google Drive. PDFBox does not convert from anything to PDF or vice versa -- it simply reads and writes PDF files. Apache POI will not do that type of conversion either -- it simply reads and writes MS Office files. Specifically, it does not render them. You could implement a rendering engine for each type of Office file yourself, but that would be a gargantuan task to say the least.
Take a look at https://angelozerr.wordpress.com/2012/12/06/how-to-convert-docxodt-to-pdfhtml-with-java/.
One of the possible options it mentions is XWPFConverterPDFViaIText:
org.apache.poi.xwpf.converter.pdf provides the DOCX 2 Pdf converter based on Apache POI XWPF and iText.
You can test this converter with the REST Converter service
http://xdocreport-converter.opensagres.cloudbees.net/
I need to convert a PDF file to a doc file. I found various examples that generate a PDF file, but none that convert PDF to doc.
What you're asking is actually very difficult.
I recommend you start here and look for a good parsing library. Then you would have to write it out in .doc format. Inevitably, a lot of the formatting and extra information would be lost. It would be a lot easier to output to docx format, but I assume that's not what you're looking for.
I see a few possible solutions:
Davisor Publishor 6.2 can probably be used, but it is commercial and seems to generate only txt from PDF... just have a look.
Parse the PDF with iText, and then generate the doc with Apache POI: another way to try (a free one ;)
Look for command line tools, like Convert PDF To DOC, and execute them from Java.
Otherwise, take a look at Con's answer. It links to a list of Java PDF processing libraries; maybe one of them can do the conversion directly, or can be used to parse the PDF (better than iText), after which you just use Apache POI to generate the doc. Hope it helps ;)
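The command line route can be sketched with the JDK's `ProcessBuilder`. This is only an illustration: `pdf2doc` below is a placeholder name, not a real program, and the argument order (input file, then output file) is an assumption that has to be adapted to whichever tool is actually installed.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Sketch: run an external PDF-to-DOC converter from Java. */
class ExternalPdfToDoc {

    /**
     * Builds the command line. The "tool input output" layout is assumed;
     * real converters may expect different flags.
     */
    static List<String> command(String tool, Path pdf, Path doc) {
        return List.of(tool, pdf.toString(), doc.toString());
    }

    /** Runs the command; returns true only if it exits with code 0 in time. */
    static boolean run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO() // show the tool's own output on our console
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly(); // the tool hung, so give up
            return false;
        }
        return process.exitValue() == 0;
    }

    public static void main(String[] args) throws Exception {
        // "pdf2doc" is hypothetical; substitute the real executable here.
        List<String> cmd = command("pdf2doc", Path.of(args[0]), Path.of(args[1]));
        System.out.println(run(cmd, 60) ? "converted" : "conversion failed");
    }
}
```

A timeout is worth having because converters can hang on malformed PDFs, and the exit code is usually the only success signal such tools give.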
How can I create a PDF with complex design views in Java? I have tried it using Jasper Reports. Are there any ideas for creating PDFs for income tax forms?
A commonly used Java API to create PDF files is iText. Give it a look. API documentation can be found here, code examples can be found here, a tutorial can be found here.
A good but less widely known Java API is the OOo API, with which you can create any OOo document to your taste and finally export it to PDF.
Have you taken a look at the Apache PDFBox project? I believe you can create PDFs using this library, although it is more commonly used with Lucene to convert PDFs to text to allow indexing.
You could also try Docmosis or JODConverter to do the conversion as long as you can install OpenOffice somewhere. They work on many platforms and can be Java controlled and will save you the hassle of learning the OOo UNO API.
Design your complex PDF Form with the appropriate tools, something like Acrobat Professional. Then from your Java code, you generate an FDF file (Form Data Format) and let the PDF Reader do the merging or you do it from the server-side and stream back the result.
Possible solutions to process FDF are Adobe Java FDF Toolkit or Apache PDFBox.
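For a sense of what the FDF step involves, here is a minimal sketch that writes an FDF file by hand using only the JDK, without either of the toolkits named above. The field names (`taxpayer_name`, `tax_year`) and the class `FdfWriter` are invented for the example; real field names must match those defined in the PDF form. The sketch also assumes values that fit in Latin-1 and contain no line breaks, and it omits the optional `/F` entry that points at the target PDF.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: serialize form field values as a minimal FDF document. */
class FdfWriter {

    /** Escapes backslashes and parentheses, which are special in PDF literal strings. */
    static String escape(String text) {
        return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)");
    }

    /** Builds the FDF text: one /T (name) /V (value) dictionary per field. */
    static String toFdf(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append("%FDF-1.2\n");
        sb.append("1 0 obj\n");
        sb.append("<< /FDF << /Fields [\n");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            sb.append("<< /T (").append(escape(field.getKey()))
              .append(") /V (").append(escape(field.getValue()))
              .append(") >>\n");
        }
        sb.append("] >> >>\n");
        sb.append("endobj\n");
        sb.append("trailer\n<< /Root 1 0 R >>\n%%EOF\n");
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("taxpayer_name", "A. Example (Jr.)"); // hypothetical field
        fields.put("tax_year", "2013");                  // hypothetical field
        Files.write(Path.of("form-data.fdf"),
                toFdf(fields).getBytes(StandardCharsets.ISO_8859_1));
    }
}
```

For anything beyond simple text fields, a library such as PDFBox is likely the safer choice, since it handles encoding and the merge itself.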
One approach that requires very little programming is to convert your Java object to XML using the Java Architecture for XML Binding (JAXB) and then use Apache FOP (XSL-FO) to create the PDF from the XML. The advantage of this approach is that it is almost 100% declarative, i.e. no programming is involved other than executing JAXB and Apache FOP. If you want a tool to create the XSL-FO template, look at J4L FO Designer.
You can try ITextPDF.jar. Add this jar to your application and go through the examples to learn more about the tags and design procedure used for creating a PDF document. Check this link for a simple example: http://itextpdf.com/examples/iia.php?id=12
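The middle step of that pipeline, turning XML into XSL-FO with a stylesheet, can be sketched with the XSLT support built into the JDK. The XML below stands in for what JAXB would produce; the element names (`taxpayer`, `name`, `income`), the stylesheet, and the class `XmlToFo` are all invented for illustration. Rendering the resulting FO to PDF would then be handed to Apache FOP, which is not shown here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch: apply an XSLT stylesheet that maps data XML to XSL-FO. */
class XmlToFo {

    // A tiny illustrative stylesheet: one A4 page with two text blocks.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:template match='/taxpayer'>"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name='page'"
        + " page-height='29.7cm' page-width='21cm' margin='2cm'>"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='page'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<fo:block>Name: <xsl:value-of select='name'/></fo:block>"
        + "<fo:block>Income: <xsl:value-of select='income'/></fo:block>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Transforms the data XML into an XSL-FO document held in a string. */
    static String toFo(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for JAXB output; the structure is hypothetical.
        String xml = "<taxpayer><name>A. Example</name><income>42000</income></taxpayer>";
        System.out.println(toFo(xml));
    }
}
```

Keeping the layout in the stylesheet is what makes the approach declarative: changing the form's appearance means editing the XSL-FO template, not the Java code.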
I want to create a Word or RTF file with a table of contents (with links to each section) from Java. From my understanding, iText & Apache POI do not support generating a table of contents. Some clients of the app still use older versions of Word, so I need a library that supports the older Word doc format. Does anyone know how I can do this?
Thanks,
Glen
Look into the Java API for OpenOffice. It will do what you want.
Late, but possibly helpful to others. Docmosis is a good option here. It works over the OpenOffice API but you don't have to spend the time learning that API. It provides template population and manipulation, a range of data source options and all the input and output options of OpenOffice. Specifically, it also updates the TOC and indexes at the final stages of document rendering.