I want to convert dynamic HTML to PDF. The following code shows the conversion of static HTML to PDF:
// step 1
Document document = new Document();
// step 2
PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream("d:/sample/pdfaskkea.pdf"));
// step 3
document.open();
// step 4
XMLWorkerHelper.getInstance().parseXHtml(writer, document, new FileInputStream("webcontent/jsp/index.jsp"), null);
// XMLWorkerHelper.getInstance().parseXHtml(writer, document, new FileInputStream("C:\\pdf_table1.html"), null);
//step 5
document.close();
System.out.println( "PDF Created!" );
It is not clear from your question what you mean by "dynamic HTML".
If you mean HTML that is created dynamically, for example with JSP, PD4ML offers a JSP custom tag library: you only need to surround your code with its opening and closing tags to output PDF instead of HTML.
If by dynamic HTML you mean JavaScript-rich HTML pages, I would recommend taking a look at PhantomJS, which can also convert HTML that is built on the fly with JavaScript. PhantomJS is a native standalone application based on WebKit.
You can use the iText PDF library to convert HTML into rich PDF files. To generate the dynamic HTML content, you can use a template library such as Thymeleaf.
I have a detailed article about generating PDF files with Thymeleaf in a Spring Boot application, if you are interested.
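As a rough illustration of the idea above, here is a minimal sketch that builds the XHTML in memory from runtime data instead of reading a static file. It uses plain string building rather than Thymeleaf, and the class and method names (DynamicHtmlSketch, render, asStream) are made up for this example. The resulting stream could be passed in place of the FileInputStream in the question's parseXHtml call.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of iText or Thymeleaf: builds a small XHTML
// page from runtime data so it can be handed to an HTML-to-PDF converter.
class DynamicHtmlSketch {

    // Renders the title and rows as a well-formed XHTML page. Assumption: the
    // values are plain text with no <, > or & characters that need escaping.
    static String render(String title, List<String> rows) {
        StringBuilder html = new StringBuilder();
        html.append("<html><head><title>").append(title).append("</title></head><body>");
        html.append("<h1>").append(title).append("</h1><ul>");
        for (String row : rows) {
            html.append("<li>").append(row).append("</li>");
        }
        html.append("</ul></body></html>");
        return html.toString();
    }

    // Wraps the generated markup in an InputStream, the same input type the
    // question passes to parseXHtml as a FileInputStream.
    static InputStream asStream(String html) {
        return new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String html = render("Orders", Arrays.asList("Order 1", "Order 2"));
        System.out.println(html);
    }
}
```

A template engine does the same job with less string handling and with escaping built in, so this sketch is only meant to show where the dynamic part plugs into the conversion.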
I'm developing a Java application that has to process a folder with PDF/A files, adding a page with some information to each of them using Apache's PDFBox library. The problem is that the output PDF file after adding the information is not PDF/A anymore. This is a validation test from the website: https://www.pdf-online.com/osa/validate.aspx:
And this is the relevant part of the code that I use to generate the PDF file:
String pdfFileName = this.baseFolder+this.extendedPDFFileName;
File file = new File(pdfFileName);
PDDocument pdfFile = PDDocument.load(file);
PDPage pag = new PDPage();
// As a test, simply adding a page makes the PDF invalid as PDF/A
pdfFile.addPage(pag);
pdfFile.save(file);
pdfFile.close();
What can I do to keep the file valid as PDF/A? Thanks in advance.
As Tilman Hausherr suggested, the problem has been solved by adding a PDResources object to the new page, like this:
pag.setResources(new PDResources());
Now I'm having trouble with the embedded fonts, but that is another question :)
Many thanks!
Your code creates a normal PDF; you should create a valid PDF/A from the start.
Here's a link: https://pdfbox.apache.org/1.8/cookbook/pdfacreation.html
I have a String that contains some HTML tags and comes from a database. I want to write it to a PDF file with the same styling that the HTML tags in the String describe. I tried to use XMLWorkerHelper like this:
String html = "What is the equation of the line passing through the point (2,-3) and making an angle of -45<sup>2</sup> with the positive X-axis?";
XMLWorkerHelper.getInstance().parseXHtml(writer, document, new StringReader(html));
But it only reads the data inside the HTML tag (in this case only the 2) and simply ignores the rest of the string. I want the entire String rendered with its HTML formatting.
With HTMLWorker it works perfectly, but that class is deprecated, so please let me know how to achieve this.
I am using the iText 5 library.
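One workaround that is often suggested for this symptom, although nothing in this thread confirms it, is to wrap the fragment in a single enclosing element before parsing, on the assumption that XMLWorker expects XHTML and drops text that sits outside any tag. The sketch below uses only the JDK: the class and method names (FragmentWrapperSketch, wrap, isWellFormed) are invented for illustration, and the JDK XML parser stands in as a well-formedness check. It does not call iText.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of iText: wraps a loose HTML fragment in a
// root element so that it becomes a well-formed XML document.
class FragmentWrapperSketch {

    // Encloses the fragment in a <div>; the choice of <div> is an assumption,
    // any single block element would serve the same purpose.
    static String wrap(String fragment) {
        return "<div>" + fragment + "</div>";
    }

    // Returns true if the text parses as a well-formed XML document. The JDK
    // parser may also log the parse error to stderr when it returns false.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String fragment = "an angle of -45<sup>2</sup> with the positive X-axis?";
        System.out.println(isWellFormed(fragment));       // bare text before the first tag
        System.out.println(isWellFormed(wrap(fragment))); // single root element
    }
}
```

The wrapped string would then be passed to parseXHtml through a StringReader, exactly as in the question.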
I am trying to convert HTML to PDF using Aspose. Setting a page size such as A1, A2, A3 or A4 works perfectly, but I don't want to set a page size for the PDF generation. So far I have tried the code below:
HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath);
htmloptions.getPageInfo().setWidth(PageSize.getA2().getWidth());
htmloptions.getPageInfo().setHeight(PageSize.getA2().getHeight());
// Load HTML file
Document doc = new Document(basePath + "400010_DOC002_L_10_2508016.html", htmloptions);
// Save the document as PDF
doc.save("D:/Web+URL_output.pdf");
Can anyone suggest how to convert HTML to PDF without setting a page size? Otherwise, please let me know what other tools are available for this conversion.
@Shankar, you may use the code sample below to convert an HTML file to a PDF file without setting a page size. By default, the rendered PDF file will use the A4 page size.
Simply omit the code that sets the page size; everything else remains the same.
HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath);
// Load HTML file
Document doc = new Document(basePath + "400010_DOC002_L_10_2508016.html", htmloptions);
// Save the document as PDF
doc.save("D:/Web+URL_output.pdf");
Please let us know if you need any further assistance. I work with Aspose as Developer Evangelist.
I am using the following code to generate a PDF file from an HTML report:
String url = new File("Test.html").toURI().toURL().toString();
OutputStream os = new FileOutputStream("Test.pdf");
ITextRenderer renderer = new ITextRenderer();
renderer.setDocument(url);
renderer.layout();
renderer.createPDF(os);
os.close();
I was able to use it to convert sample HTML files to PDF. But in my real use case, the HTML content contains various special symbols, such as &, < and >, that cannot be parsed as XML.
I tried using CDATA while generating the HTML itself, but later found that the text around the CDATA sections is not visible in the HTML.
Does anyone have a solution for this?
Have you tried printing to PDF from the browser? Search for PrimoPDF, a program that will let you do it.
I don't know if this will help you, but you can use StringEscapeUtils from Apache Commons. It has methods to escape and unescape HTML, which you could use to pre-process your HTML before PDF generation.
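To make the escaping idea concrete, here is a minimal JDK-only sketch of what such a pre-processing step does. It is a hand-written stand-in, not the Apache Commons implementation, and the class and method names (XmlTextEscaper, escape) are invented for this example. The important point is to escape only the text values that go into the report, not the markup around them.

```java
// Hypothetical helper: replaces the characters that break XML parsing with
// their entity references. Apply it to text values only, never to the tags.
class XmlTextEscaper {

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The tags are written as-is; only the value between them is escaped.
        String cell = "<td>" + escape("a < b && b > c") + "</td>";
        System.out.println(cell); // <td>a &lt; b &amp;&amp; b &gt; c</td>
    }
}
```

Escaping values as they are inserted avoids the need for CDATA, because the resulting document stays well-formed and the renderer shows the original characters.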
I want to set up the response for PDF output. How do I achieve this?
I can set it up for Excel output and successfully get the desired Excel sheet from the browser, but for PDF I could not get the desired output.
For Excel, the following code works fine:
String excelContent = "an html table..";
getServletResponse().setContentType("application/vnd.ms-excel");
getServletResponse().setHeader("Content-Disposition","inline; filename=" + pageTitle + ".xls");
PrintWriter ps = getServletResponse().getWriter();
ps.println(excelContent);
For PDF, I tried setting the content type to PDF, but it does not work properly (a PDF file opens in the browser, but no content is displayed):
String excelContent = "an html table..";
getServletResponse().setContentType("application/pdf");
getServletResponse().setHeader("Content-Disposition","inline; filename=" + pageTitle + ".pdf");
PrintWriter ps = getServletResponse().getWriter();
ps.println(excelContent);
Can HTML tables not be displayed as such in a PDF?
It's very simple with the Flying Saucer renderer. It takes HTML input and returns PDF. What you have to do is to declare content type as PDF, generate HTML to a string, and call a method from Flying Saucer.
Here is an example: xhtml to pdf servlet with flyingsaucer
You are writing HTML content to the response stream. You need to write PDF-formatted data to the response instead.
You can use libraries like iText or Apache FOP to create the PDF data.
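A quick way to see why the browser shows nothing: a PDF file begins with the marker %PDF-, and the HTML string written in the question does not, so the viewer receives something it cannot interpret. The sketch below checks a response body for that marker. The class and method names (PdfHeaderCheck, looksLikePdf) are invented for illustration, and the check only looks at the first bytes, so it is a sanity test rather than a validator.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: tests whether a response body starts with the PDF
// file marker. It does not verify that the rest of the data is a valid PDF.
class PdfHeaderCheck {

    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    static boolean looksLikePdf(byte[] body) {
        if (body.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (body[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] html = "<table><tr><td>1</td></tr></table>".getBytes(StandardCharsets.UTF_8);
        byte[] pdfStart = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(looksLikePdf(html));     // false
        System.out.println(looksLikePdf(pdfStart)); // true
    }
}
```

Because PDF is a binary format, the bytes produced by the PDF library should be written through the response's getOutputStream() rather than the PrintWriter used for the Excel case.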