How to get PDF in byte[] without forming a file? - java

Below is my code snippet:
try (OutputStream out = new FileOutputStream(PDF_NAME)) {
Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(new StreamSource(xsltFile));
Result res = new SAXResult(fop.getDefaultHandler());
transformer.transform(new StreamSource(IOUtils.toInputStream(xml, "UTF-8")), res);
}
byte[] inputFile = Files.readAllBytes(Paths.get(PDF_NAME));
String encodedFile = Base64.getEncoder().encodeToString(inputFile);
InventoryListSnapshot pojo = new InventoryListSnapshot(invList.getInventoryLayoutId(), invList.getProjectId(), invList.getAuthorUsername(), encodedFile);
repository.save(pojo);
It uses XSL-FO to render the PDF into a file. I need to store this PDF, Base64-encoded, as a BLOB in the DB, so I don't need the file itself.
How can I save the PDF to the DB without creating a file?

You would change this:
OutputStream out = new FileOutputStream(PDF_NAME)
to
OutputStream out = new ByteArrayOutputStream()

Thanks, it works.
The new version is:
byte[] pdf;
try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(new StreamSource(xsltFile));
Result res = new SAXResult(fop.getDefaultHandler());
transformer.transform(new StreamSource(IOUtils.toInputStream(xml, "UTF-8")), res);
// Declaring the variable as ByteArrayOutputStream avoids the cast.
pdf = out.toByteArray();
}
String encodedFile = Base64.getEncoder().encodeToString(pdf);
InventoryListSnapshot pojo = new InventoryListSnapshot(invList.getInventoryLayoutId(), invList.getProjectId(), invList.getAuthorUsername(), encodedFile);
repository.save(pojo);
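
For readers without FOP on the classpath, the same in-memory pattern can be tried with the JDK alone. The sketch below is an illustration, not the code from the thread: the class name `InMemoryRender`, the methods `render` and `encode`, and the demo stylesheet are hypothetical, and a plain `StreamResult` stands in for the `SAXResult` that wraps FOP's handler.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.util.Base64;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class InMemoryRender {

    // Hypothetical demo stylesheet: emits the text of /invoice/id.
    static final String DEMO_XSLT =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\"><xsl:value-of select=\"/invoice/id\"/></xsl:template>"
            + "</xsl:stylesheet>";

    // Runs the stylesheet over the XML and collects the output in memory.
    // In the FOP version, the StreamResult would be replaced by
    // new SAXResult(fop.getDefaultHandler()), with fop created on the same stream.
    static byte[] render(String xml, String xslt) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toByteArray();
    }

    // Base64-encodes the bytes so they can be stored as text.
    static String encode(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = render("<invoice><id>42</id></invoice>", DEMO_XSLT);
        System.out.println(encode(bytes));
    }
}
```

No file is created at any point; the only buffer is the `ByteArrayOutputStream`, whose content is copied out once the transformation has finished.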

Related

Java - Apache FOP - PDF generated for the second time is corrupt

I am using Java with Apache FOP to generate a PDF. Here is the code inside the execute method of a Struts action that returns an ActionForward.
{
Random randomGenerator = new Random();
String generatedfile = "generatedfile"+"_"+sessionId+randomGenerator.nextInt(100)+".xml";
File file = new File(getServlet().getServletContext().getResource("/ttf/").getPath(),generatedfile);
FileOutputStream fStream = new FileOutputStream(file);
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] b = reportXML.getBytes("UTF-8");
OutputStream outstream = fStream;
outstream.write(b);
outstream = new java.io.BufferedOutputStream(outstream);
String filePath = getServlet().getServletContext().getResource("/ttf/").getPath()+generatedfile;
xmlfile = new File(filePath);
Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent,out);
Source src = new StreamSource(xmlfile);
TransformerFactory factory = TransformerFactory.newInstance();
File xsltfile = new File(printStylesheetURI.getPath());
Transformer transformer = factory.newTransformer( new StreamSource(xsltfile));
Result res = new SAXResult(fop.getDefaultHandler());
transformer.transform(src, res);
byte[] pdfBytes = out.toByteArray();
response.setHeader("Pragma", "public");
response.setHeader("Expires", "0");
response.setHeader("Content-Disposition", "attachment; filename=\"" + generatedfile + ".pdf\"");
response.setContentType(MimeConstants.MIME_PDF);
response.setContentLength(pdfBytes.length);
outstream.close();
out.close();
response.getOutputStream().write(pdfBytes);
}
The PDF is generated fine on the first request, with a size of 22 KB. On the second request, however, the PDF is corrupt and only 15 bytes in size. On the third request the file is again generated fine at 22 KB, and on the fourth it fails again...
Am I doing anything wrong here? I tried debugging locally, where it works fine for all requests. The issue only appears after deploying to the server environments.
Thanks in advance.
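
Whether the temporary XML file is behind the 15-byte output is not established in this thread. Still, the snippet builds the file name from only 100 possible random suffixes per session and never deletes the file. The following minimal sketch shows one way to rule that part out, using only the JDK: a uniquely named temp file that is fully written and closed before anything reads it. The class name `TempXmlFile` and the method `writeXml` are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TempXmlFile {

    // Writes the XML to a uniquely named temp file. Files.write closes the
    // file before returning, so a later reader sees the complete content.
    static Path writeXml(String reportXml) throws IOException {
        Path file = Files.createTempFile("generatedfile_", ".xml");
        Files.write(file, reportXml.getBytes(StandardCharsets.UTF_8));
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path file = writeXml("<report/>");
        try {
            // The path could be handed to new StreamSource(file.toFile()) here.
            System.out.println(file + " holds " + Files.size(file) + " bytes");
        } finally {
            // Remove the temp file once the transformation is done.
            Files.deleteIfExists(file);
        }
    }
}
```

Since `reportXML` is already a String in the question, feeding it to the transformer through a `StringReader` would avoid the temp file altogether.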

Creating PDF File from dynamic invoice and XSLT

I have been trying for a long time to generate PDF files for customer invoices. Invoices are saved as XML files, and each customer can have their own XSLT file to get their own invoice view (if not, a default XSLT is used).
My problem is transforming XML/(X)HTML files to PDF files. I have read about almost all the libraries for doing this and tried almost all of them.
1) Apache FOP
http://www.javaworld.com/article/2071749/java-app-dev/convert-html-content-to-pdf-format.html
I transformed the invoice XML to XHTML using the default XSLT and JTidy. Then I tried to convert the generated XHTML to PDF with the XSL-FO stylesheet given by Antenna House. I managed to generate a PDF file with just the header, so no success. The code for this is below.
TransformerFactory factory = TransformerFactory.newInstance();
Source xslt = new StreamSource(getClass().getResourceAsStream("/xslts/general.xslt"));
// xslt.setSystemId("/xslts/general.xslt");
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
Transformer transformer = factory.newTransformer(xslt);
DOMSource domSource = new DOMSource(document);
transformer.transform(domSource, result);
String strResult = writer.toString();
Tidy tidy = new Tidy();
// tidy.setDropEmptyParas(true);
// tidy.setJoinStyles(true);
tidy.setInputEncoding("UTF-8");
tidy.setOutputEncoding("UTF-8");
tidy.setXHTML(true);
tidy.setMakeClean(true);
tidy.setForceOutput(true);
ByteArrayInputStream boas = new ByteArrayInputStream(strResult.getBytes("UTF-8"));
ByteArrayOutputStream bos = new ByteArrayOutputStream();
FileOutputStream baoOut = new FileOutputStream(new File("C:\\Users\\xxx\\out.pdf"));
Document tiedDoc = tidy.parseDOM(boas, bos);
DOMSource tiedDocDomSource = new DOMSource(tiedDoc);
StringWriter writer2 = new StringWriter();
StreamResult result2 = new StreamResult(writer2);
Transformer xsl2foTrans = factory.newTransformer(new StreamSource(getClass().getResourceAsStream("/xslt/xhtml2fo.xsl")));
xsl2foTrans.transform(tiedDocDomSource, result2);
// from here on
final FopFactory fopFactory = FopFactory.newInstance(new File(".").toURI());
File userConfig = new File("C:\\Users\\xxx\\Desktop\\pdfWork\\fop.xconf");
FOUserAgent foUserAgent = fopFactory.newFOUserAgent();
// configure foUserAgent as desired
// Setup output stream. Note: Using BufferedOutputStream
// for performance reasons (helpful with FileOutputStreams).
OutputStream out = baoOut;
out = new BufferedOutputStream(out);
// Construct fop with desired output format
Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);
// Setup JAXP using identity transformer
transformer = factory.newTransformer(); // identity transformer
// Setup input stream
Source src = new StreamSource(new StringReader(writer2.toString()));
// Resulting SAX events (the generated FO) must be piped through to FOP
Result res = new SAXResult(fop.getDefaultHandler());
// Start XSLT transformation and FOP processing
transformer.transform(src, res);
out.close();
2) IText
As far as I know, iText needs valid XHTML to generate PDF files, so I transform the invoice XML to HTML using default.xslt and then to XHTML using JTidy with the setXHTML option set to true. I managed to generate a PDF file from the resulting XHTML, but the PDF is not rendered well: the CSS in the style tag is not recognized. No success. The code for this is below.
StreamSource xslt = new StreamSource(getClass().getResourceAsStream("/xslt/general.xslt"));
// StreamSource xslt = new StreamSource(new FileInputStream(new File("C:\\Users\\XXX\\Desktop\\pdfWork\\firm.xslt")));
TransformerFactory factory = TransformerFactory.newInstance();
// Source xslt = new StreamSource(getClass().getResourceAsStream("/xslts/general.xslt"));
// xslt.setSystemId("/xslts/general.xslt");
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
Transformer transformer = factory.newTransformer(xslt);
DOMSource domSource = new DOMSource(document);
transformer.transform(domSource, result);
String strResult = writer.toString();
Tidy tidy = new Tidy();
// tidy.setDropEmptyParas(true);
tidy.setJoinStyles(true);
tidy.setInputEncoding("UTF-8");
tidy.setOutputEncoding("UTF-8");
tidy.setXHTML(true);
tidy.setMakeClean(true);
tidy.setForceOutput(true);
ByteArrayInputStream boas = new ByteArrayInputStream(strResult.getBytes("UTF-8"));
ByteArrayOutputStream bos = new ByteArrayOutputStream();
FileOutputStream baoOut = new FileOutputStream(new File("C:\\Users\\XXX\\Desktop\\pdfWork\\out.pdf"));
tidy.parseDOM(boas, bos);
com.itextpdf.text.Document documentText = new com.itextpdf.text.Document(PageSize.LETTER); // PageSize.A4, 10.0F, 10.0F, 10.0F, 0.0F
PdfWriter pdfWriter = PdfWriter.getInstance(documentText, new FileOutputStream(new File("C:\\Users\\Onur\\Desktop\\pdfWork\\out.pdf")));
documentText.open();
// documentText.open();
// HTMLWorker htmlWorker = new HTMLWorker(documentText);
// htmlWorker.parse(new StringReader(IOUtils.toString(new ByteArrayInputStream(bos.toByteArray()), "UTF-8")));
// documentText.close();
XMLWorkerHelper worker = XMLWorkerHelper.getInstance();
worker.parseXHtml(pdfWriter, documentText, new StringReader(IOUtils.toString(new ByteArrayInputStream(bos.toByteArray()))));
documentText.close();
3) Flying Saucer
I followed almost the same steps as with iText. I managed to generate a PDF in which the CSS from the style tag is recognized. The only problem was that tables and some other elements overflowed and did not fit the page. I solved this with a page rule, as suggested in
How can i make my html page to be fit in the pdf using Flying Saucer
Document document = // invoice as document
StreamSource xslt = new StreamSource(getClass().getResourceAsStream("/xslt/general.xslt"));
TransformerFactory factory = TransformerFactory.newInstance();
// Source xslt = new StreamSource(getClass().getResourceAsStream("/xslts/general.xslt"));
// xslt.setSystemId("/xslts/general.xslt");
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
Transformer transformer = factory.newTransformer(xslt);
DOMSource domSource = new DOMSource(document);
transformer.transform(domSource, result);
String strResult = writer.toString();
Tidy tidy = new Tidy();
tidy.setInputEncoding("UTF-8");
tidy.setOutputEncoding("UTF-8");
tidy.setXHTML(true);
tidy.setMakeClean(true);
tidy.setForceOutput(true);
ByteArrayInputStream boas = new ByteArrayInputStream(strResult.getBytes("UTF-8"));
ByteArrayOutputStream bos = new ByteArrayOutputStream();
FileOutputStream baoOut = new FileOutputStream(new File("C:\\Users\\XXX\\Desktop\\pdfWork\\out2.pdf"));
tidy.parseDOM(boas, bos);
System.out.println(bos.toString("UTF-8"));
ITextRenderer renderer = new ITextRenderer();
renderer.getFontResolver().addFont("/unicode/ARIALUNI.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
// renderer.setDocument();
renderer.setDocumentFromString(IOUtils.toString(new ByteArrayInputStream(bos.toByteArray())));
// renderer.setPDFVersion('');
renderer.layout();
renderer.createPDF(baoOut);
renderer.finishPDF();
baoOut.flush();
baoOut.close();
As I said, I managed to generate a PDF file from XHTML using Flying Saucer, but to do so I had to add a page rule and some inline styling to general.xslt. The problem is that each customer can have their own XSLT for the invoice view, so I don't want to touch or change the XSLT. general.xslt can be downloaded from the link below. How can I achieve this? Is it possible? Or what am I doing wrong? Thanks in advance!
http://www.efatura.gov.tr/dosyalar/kilavuzlar/UBL-TR1.2_Paketi.zip
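
One possible direction, offered as an untested idea rather than a confirmed solution, is to leave each customer's XSLT alone and add the page rule to the generated XHTML just before it goes to the renderer. The sketch below uses only JDK DOM classes. The class name `PageRuleInjector`, the method `injectCss`, and the sample CSS are hypothetical. It assumes the XHTML is well-formed and has no DOCTYPE that would make the parser fetch an external DTD.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class PageRuleInjector {

    // Appends a <style> element with the given CSS to the document's <head>,
    // creating the <head> if the XHTML has none, and returns the new markup.
    static String injectCss(String xhtml, String css) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        Node head = doc.getElementsByTagName("head").item(0);
        if (head == null) {
            Element root = doc.getDocumentElement();
            head = root.insertBefore(doc.createElement("head"), root.getFirstChild());
        }
        Element style = doc.createElement("style");
        style.setAttribute("type", "text/css");
        style.setTextContent(css);
        head.appendChild(style);

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        // Force XML output: a root element named "html" can otherwise switch
        // the serializer to HTML output, which is not well-formed XHTML.
        serializer.setOutputProperty(OutputKeys.METHOD, "xml");
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html><head><title>Invoice</title></head><body><p>x</p></body></html>";
        System.out.println(injectCss(xhtml, "@page { size: A4 landscape; }"));
    }
}
```

The returned string could then be passed to `renderer.setDocumentFromString(...)` as in the Flying Saucer code above. Whether a page rule alone is enough for every customer stylesheet is an open question.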

How to convert Word DOCX to HTML using java

I am using the following code. It converts only DOC documents to HTML; I need to convert DOCX documents to HTML.
try
{
HWPFDocumentCore wordDocument = WordToHtmlUtils.loadDoc(new FileInputStream("C:\\DOC.doc"));
WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
wordToHtmlConverter.processDocument(wordDocument);
org.w3c.dom.Document htmlDocument = wordToHtmlConverter.getDocument();
ByteArrayOutputStream out = new ByteArrayOutputStream();
DOMSource domSource = new DOMSource(htmlDocument);
StreamResult streamResult = new StreamResult(out);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer serializer = tf.newTransformer();
serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
serializer.setOutputProperty(OutputKeys.INDENT, "yes");
serializer.setOutputProperty(OutputKeys.METHOD, "html");
serializer.transform(domSource, streamResult);
out.close();
String result = out.toString("UTF-8"); // decode with the same encoding the serializer wrote
System.out.println(result);
ConvertDocxBigToXHTML html = new ConvertDocxBigToXHTML();
html.creatHTML(result);
}
catch(Exception e)
{
e.printStackTrace();
}
Can someone tell me what changes I need to make to this code?
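
The converter in the code above belongs to POI's HWPF module, which reads the binary .doc format. A .docx file is instead a ZIP package whose main text lives in the part `word/document.xml`. The sketch below only illustrates that structure with the JDK. It is a crude paragraph extractor, not a replacement for a real converter (for example one built on POI's XWPF classes), and it ignores styles, tables, and images. The class name `DocxParagraphs` and all its methods are hypothetical, and the sample archive built in `sampleZip` is not a complete DOCX.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DocxParagraphs {

    // Namespace of WordprocessingML elements such as w:p and w:t.
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Finds word/document.xml in the ZIP package and turns each paragraph into <p>.
    static String toHtml(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (e.getName().equals("word/document.xml")) {
                    return paragraphsToHtml(readAll(zip));
                }
            }
        }
        throw new IOException("word/document.xml not found");
    }

    static String paragraphsToHtml(byte[] documentXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(new ByteArrayInputStream(documentXml));
        StringBuilder html = new StringBuilder("<html><body>");
        NodeList paragraphs = doc.getElementsByTagNameNS(W, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            // Concatenate the text of all w:t runs inside this paragraph.
            NodeList texts = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W, "t");
            StringBuilder text = new StringBuilder();
            for (int j = 0; j < texts.getLength(); j++) {
                text.append(texts.item(j).getTextContent());
            }
            html.append("<p>").append(escape(text.toString())).append("</p>");
        }
        return html.append("</body></html>").toString();
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        for (int n = in.read(chunk); n != -1; n = in.read(chunk)) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    // Builds a tiny ZIP that contains only word/document.xml, for demonstration.
    static byte[] sampleZip() throws IOException {
        String xml = "<w:document xmlns:w=\"" + W + "\"><w:body>"
                + "<w:p><w:r><w:t>Hello &amp; </w:t></w:r><w:r><w:t>welcome</w:t></w:r></w:p>"
                + "<w:p><w:r><w:t>Second</w:t></w:r></w:p>"
                + "</w:body></w:document>";
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toHtml(new ByteArrayInputStream(sampleZip())));
    }
}
```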

Processing html tags inside xml with fop

I've got an object that's being converted into a SAXSource and then into a PDF with FOP. Some of the data, however, is in HTML format inside the XML tags. I'd like to have these HTML tags parsed as actual elements by the stylesheet, which would mean reading them in their unescaped form, but I can't figure out how to do this. I have disable-output-escaping set in the stylesheet, but I think the data in the XML has already been parsed as "escaped" before it gets to the stylesheet and is processed.
Here's the code for converting.
FOUserAgent foUserAgent = getUserAgent();
PDFRenderer pdfrenderer = new PDFRenderer();
pdfrenderer.setUserAgent(foUserAgent);
foUserAgent.setRendererOverride(pdfrenderer);
URIResolver resolver = myWebContext.getResolver();
foUserAgent.setURIResolver(resolver);
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] b = null;
try {
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer();
//transformer.setOutputProperty("disable-output-escaping", "yes");
// kicks back error --- invalid property
Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);
Source xsl = resolver.resolve(xslFilename, null);
transformer = factory.newTransformer(xsl);
res = new SAXResult(fop.getDefaultHandler());
transformer.transform(xmlSrc, res);
b = out.toByteArray();
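
disable-output-escaping only affects how a result is serialized to text. Here the result is handed to FOP as SAX events, so it is likely to have no effect. One alternative, sketched below under stated assumptions, is to turn the escaped markup into real child elements before the transformation, so the stylesheet can match them with ordinary templates. The class name `EmbeddedMarkup` and its methods are hypothetical. The sketch assumes the embedded HTML is well-formed XML; markup such as an unclosed `<br>` would need cleaning first.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EmbeddedMarkup {

    // Replaces the escaped text content of every element named tagName
    // with the element nodes obtained by parsing that text.
    static void expand(Document doc, String tagName) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        NodeList targets = doc.getElementsByTagName(tagName);
        for (int i = 0; i < targets.getLength(); i++) {
            Element target = (Element) targets.item(i);
            // getTextContent returns the unescaped string, e.g. "<b>bold</b> text".
            String markup = target.getTextContent();
            Document fragment = builder.parse(
                    new InputSource(new StringReader("<wrapper>" + markup + "</wrapper>")));
            while (target.hasChildNodes()) {
                target.removeChild(target.getFirstChild());
            }
            Node child = fragment.getDocumentElement().getFirstChild();
            while (child != null) {
                target.appendChild(doc.importNode(child, true));
                child = child.getNextSibling();
            }
        }
    }

    // Convenience wrapper for the demo: parse, expand, serialize.
    static String expandToString(String xml, String tagName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        expand(doc, tagName);
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.METHOD, "xml");
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<item><description>&lt;b&gt;bold&lt;/b&gt; text</description></item>";
        System.out.println(expandToString(xml, "description"));
    }
}
```

The expanded document can be passed to the transformer as a `DOMSource`. The stylesheet would still need templates that map the HTML elements to XSL-FO, for example `b` to an `fo:inline` with a bold font weight.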

How to read formatted text as HTML from MS Word (.doc) using POI?

I want to read formatted text as HTML, like (<html><b>boldvalue</b><img src="link"></html>). I also want to get the image through the image tag's link. I'm using POI; does POI have an option to get the data in HTML format like this?
Try this:
HWPFDocumentCore wordDocument = WordToHtmlUtils.loadDoc(new FileInputStream("D:\\temp\\seo\\1.doc"));
WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
DocumentBuilderFactory.newInstance().newDocumentBuilder()
.newDocument());
wordToHtmlConverter.processDocument(wordDocument);
Document htmlDocument = wordToHtmlConverter.getDocument();
ByteArrayOutputStream out = new ByteArrayOutputStream();
DOMSource domSource = new DOMSource(htmlDocument);
StreamResult streamResult = new StreamResult(out);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer serializer = tf.newTransformer();
serializer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
serializer.setOutputProperty(OutputKeys.INDENT, "yes");
serializer.setOutputProperty(OutputKeys.METHOD, "html");
serializer.transform(domSource, streamResult);
out.close();
String result = out.toString("UTF-8"); // decode with the same encoding the serializer wrote
System.out.println(result);
