Convert HTML to Microsoft Word Document in Java [duplicate] - java

This question already has answers here:
Convert html to doc in java
(5 answers)
Writing HTML content into MS WORD using JAVA?
(2 answers)
Closed 9 years ago.
I know it is a repetitive question, but most of the answers are not straightforward for this question. Some say, convert HTML to XHTML and then convert it to Word doc. Some say, right-click on the page and select 'Save as doc'. But my question is, is there any particular API for this, which simply converts the HTML to Doc?
To elaborate, is there any API like iText (which we use for PDF) for Word doc generation?
Thanks.

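The Apache POI project provides Java APIs for Microsoft Office file formats, including Word documents (HWPF for .doc, XWPF for .docx). It lets you build the document programmatically; it is not a one-call HTML converter, so you would still have to map the HTML to Word elements yourself.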
However, you should keep in mind that Word formatting is awfully complex and messy. It might be easier to go with a cleaner or more abstracted specification, like PDF.
Best of luck.

Related

write to docx with Apache POI and enriched text (HTML) [duplicate]

This question already has an answer here:
How to set define different styles for the same paragraph
(1 answer)
Closed 2 years ago.
I'm writing a .docx file with Apache POI, and I have rich text such as "<p>This is a Paragraph with <strong>enriched text</strong>.</p>". I need the Word document to keep the styling, such as the "strong" emphasis.
If you only have to create bold/italic/underlined text, then checking for tags like strong yourself and using the built-in Apache POI methods setBold/setItalic on the XWPFRun class can be an option. If more complex HTML structures such as line breaks and lists are also included, a better option is to parse the HTML with the jsoup library and build the document with Apache POI. I have had good experience with this combination.
Simple example:
How do I use Jsoup to parse html for text?
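For the simpler case mentioned above (only bold/italic tags), the "check the tags yourself" approach could be sketched roughly as follows. This is a minimal illustration in plain Java, not jsoup and not Apache POI code: the class InlineHtmlRuns and its Segment type are hypothetical names, HTML entities and nested block structure are not handled, and unknown tags are simply dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits simple inline HTML into text segments
// that remember whether they were inside <strong>/<b> or <em>/<i>.
class InlineHtmlRuns {

    /** One piece of text plus the styles that were active around it. */
    static final class Segment {
        final String text;
        final boolean bold;
        final boolean italic;

        Segment(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    // Matches an opening or closing tag such as <strong>, </em> or <p class="x">.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>");

    static List<Segment> parse(String html) {
        List<Segment> segments = new ArrayList<>();
        int boldDepth = 0;    // how many bold tags are currently open
        int italicDepth = 0;  // how many italic tags are currently open
        int pos = 0;
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            // Text between the previous tag and this one keeps the current styles.
            addText(segments, html.substring(pos, m.start()), boldDepth > 0, italicDepth > 0);
            int delta = m.group(1).isEmpty() ? 1 : -1; // opening tag: +1, closing tag: -1
            String name = m.group(2).toLowerCase(Locale.ROOT);
            if (name.equals("strong") || name.equals("b")) {
                boldDepth = Math.max(0, boldDepth + delta);
            } else if (name.equals("em") || name.equals("i")) {
                italicDepth = Math.max(0, italicDepth + delta);
            } // all other tags (p, span, ...) are ignored in this sketch
            pos = m.end();
        }
        addText(segments, html.substring(pos), boldDepth > 0, italicDepth > 0);
        return segments;
    }

    private static void addText(List<Segment> out, String text, boolean bold, boolean italic) {
        if (!text.isEmpty()) {
            out.add(new Segment(text, bold, italic));
        }
    }

    public static void main(String[] args) {
        String html = "<p>This is a Paragraph with <strong>enriched text</strong>.</p>";
        for (Segment s : parse(html)) {
            System.out.println("[" + s.text + "] bold=" + s.bold + " italic=" + s.italic);
        }
    }
}
```

With Apache POI, each segment would typically become one run: create it with XWPFParagraph.createRun(), then call setText, setBold and setItalic on the returned XWPFRun. For anything beyond flat inline tags, a real HTML parser such as jsoup is the safer choice.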

Concatenate 2 different pdf with or without PdfCopy class of Itext [duplicate]

This question already has answers here:
Is it possible to merge several pdfs using iText7
(6 answers)
Closed 3 years ago.
I'm trying to concatenate two different PDF files in a Java project.
I've read in other posts that the best way is to use the iText class "PdfCopy", but strangely it is not available in my project, even though I have iText 7.0.6 in the pom and a commercial license.
Is there an explanation, or another way to solve this task without using iText or other external libraries?
Thank you
I've read in other posts that the best way is to use the iText class "PdfCopy"
That recommendation refers to iText up to version 5.x. iText 7.x is a major redesign of the whole iText API. In particular, the functionality of the Pdf*Copy* classes has been moved. Try PdfDocument.copyPagesTo instead.
I.e. if you have loaded your source documents into PdfDocument instances doc1 and doc2 and have another, writable PdfDocument instance dest you want to copy those source documents into, simply do:
doc1.copyPagesTo(1, doc1.getNumberOfPages(), dest);
doc2.copyPagesTo(1, doc2.getNumberOfPages(), dest);
Please refer to this as well:
Is it possible to merge several pdfs using iText7
Another library that can be used is PDFBox; see the sample below:
https://www.tutorialkart.com/pdfbox/pdfbox-merge-multiple-pdfs/

Exporting contents in html using java [duplicate]

This question already has answers here:
Write HTML file using Java
(11 answers)
Closed 7 years ago.
I need to create an HTML page to export some information.
Currently, using Java, I export the information to Excel. Now I need to export it to an HTML page using Java. I am developing an application that tests a REST API and generates its output as HTML.
Are there any APIs I can use? Thanks
There are a lot of ways to do that. You may use any template engine you like or write the HTML directly.
FreeMarker (http://freemarker.org): an easy-to-use template engine
Velocity (https://velocity.apache.org/): another template engine
XML + XSLT: Generate XML with DocumentBuilder or serialize your data with XStream, and then apply XSLT. Safer (you can't forget to close a tag), but only attractive if you really like XSL (most people hate it).
If your app is a web application, you may use JSP or JSPX, which come with the standard Java web stack and are good enough (but they are a pain to use offline, so they are only suitable for web apps).
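As an illustration of the "write HTML directly" option from the list above, here is a minimal sketch using only the JDK. The class name HtmlReport, the report title, and the sample endpoint names are made up for the example; the important part is escaping every value before it goes into the markup.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that renders test results as a small HTML table.
class HtmlReport {

    /** Escapes the characters that have a special meaning in HTML text and attributes. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds a complete HTML page with one table row per map entry. */
    static String render(String title, Map<String, String> rows) {
        StringBuilder html = new StringBuilder();
        html.append("<!DOCTYPE html>\n<html>\n<head><meta charset=\"UTF-8\"><title>")
            .append(escape(title))
            .append("</title></head>\n<body>\n<table>\n");
        for (Map.Entry<String, String> row : rows.entrySet()) {
            html.append("<tr><td>").append(escape(row.getKey()))
                .append("</td><td>").append(escape(row.getValue()))
                .append("</td></tr>\n");
        }
        html.append("</table>\n</body>\n</html>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        // Sample data only; a real run would fill this from the REST API tests.
        Map<String, String> results = new LinkedHashMap<>();
        results.put("GET /users", "200 OK");
        results.put("GET /users?name=<x>", "400 Bad Request");
        System.out.print(render("API test results", results));
    }
}
```

The returned string can be written to a file with java.nio.file.Files.write. Once the page grows beyond a simple table, one of the template engines listed above is likely to be easier to maintain than string concatenation.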

JAVA parse a CSV file [duplicate]

This question already has answers here:
CSV API for Java [closed]
(10 answers)
Closed 9 years ago.
I recently went through various examples and tutorials dealing with parsing XML. I have been introduced to jsoup, DOM, StAX, and maybe one other.
I now want to branch off into opening and reading a CSV (comma-delimited) file and searching it for particular data.
A brief internet search shows that, as with the XML exercise, there are plenty of options.
What technique do you recommend (within the Java world) for opening, reading, and searching a CSV file?
I guess I would also like to include writing a CSV file.
Thanks for the advice.
I can recommend the opencsv library, which is a simple and easy-to-use CSV parser. You can find examples of how to read and write CSV files in the FAQ on the opencsv site.
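To show what such a parser has to deal with, here is a rough sketch in plain Java. It is not opencsv and does not use its API; SimpleCsv and its methods are hypothetical names. It handles quoted fields and doubled quotes, but not quoted fields that span several lines, which is one reason a library is usually the better choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical minimal CSV helper, for illustration only.
class SimpleCsv {

    /** Splits one CSV record into fields; supports "quoted, fields" and "" as an escaped quote. */
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote inside a quoted field
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field (may be empty)
        return fields;
    }

    /** Returns the parsed rows whose field at the given column index equals the wanted value. */
    static List<List<String>> search(List<String> lines, int column, String wanted) {
        List<List<String>> matches = new ArrayList<>();
        for (String line : lines) {
            List<String> row = parseLine(line);
            if (column < row.size() && row.get(column).equals(wanted)) {
                matches.add(row);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "1,\"Smith, John\",London",
                "2,Jane Doe,Paris");
        System.out.println(search(lines, 2, "Paris"));
    }
}
```

In a real program the lines would come from a file, for example via java.nio.file.Files.readAllLines.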

Create PDF with Java [duplicate]

This question already has answers here:
PDF Generation Library for Java [closed]
(6 answers)
Closed 2 years ago.
I'm working on an invoice program for a local accounting company.
What is a good way to create a PDF file with Java? Any good library?
I'm totally new to PDF export (in any language).
I prefer outputting my data as XML (using Castor, XStream or JAXB), then transforming it with an XSLT stylesheet into XSL-FO and rendering that with Apache FOP into PDF. This has worked so far for 10-page reports and 400-page manuals. I found it more flexible and easier to style than generating PDFs in code using iText.
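The XML-to-XSL-FO step of that pipeline can be sketched with the JDK's built-in XSLT support. This is only an illustration under assumptions: the invoice XML layout, the stylesheet, and the class name InvoiceToFo are invented for the example, and the final rendering of the FO output to PDF with Apache FOP is left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example: turns a tiny invoice XML into XSL-FO using javax.xml.transform.
class InvoiceToFo {

    // Made-up stylesheet: one A4 page, a bold heading and one block per item.
    static final String XSLT =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
        + " xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
        + "<xsl:template match=\"/invoice\">"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name=\"A4\" page-height=\"29.7cm\""
        + " page-width=\"21cm\" margin=\"2cm\">"
        + "<fo:region-body/>"
        + "</fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference=\"A4\">"
        + "<fo:flow flow-name=\"xsl-region-body\">"
        + "<fo:block font-weight=\"bold\">Invoice <xsl:value-of select=\"@number\"/></fo:block>"
        + "<xsl:for-each select=\"item\">"
        + "<fo:block><xsl:value-of select=\"@name\"/>: <xsl:value-of select=\"@price\"/></fo:block>"
        + "</xsl:for-each>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML and returns the XSL-FO document as a string. */
    static String toFo(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<invoice number=\"42\"><item name=\"Widget\" price=\"9.99\"/></invoice>";
        System.out.println(toFo(xml));
    }
}
```

Keeping the layout in the stylesheet is what makes this approach easy to restyle: the Java code only produces data, and the look of the PDF changes by editing the XSLT.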
The following are a few libraries for creating PDFs with Java:
iText
Apache PDFBox
BFO
I have used iText for generating PDFs in the past, with a little bit of pain.
Or you can try using FOP: FOP is an XSL formatter written in Java. It is used in conjunction with an XSLT transformation engine to format XML documents into PDF.
Another alternative would be the JasperReports Library. It uses iText itself and is more than the PDF library you asked for, but if it fits your needs I'd go for it.
Simply put, it allows you to design reports that are filled at runtime. If you use a custom data source, you might be able to integrate JasperReports easily into the existing system. It would save you all the layout trouble, e.g. when invoices span several pages and each page should have a footer, and so on.
