Apache FOP losing elements - Java

I'm trying to print a PDF using Apache FOP + XSL as follows (in is a JDOM Document):
private void transformPDF(Document in, StreamSource template, OutputStream out) throws ConvertirXMLException {
    try {
        Driver driver = new Driver();
        driver.setOutputStream(out);
        driver.setRenderer(Driver.RENDER_PDF); // the PDF renderer (constant value 1 in FOP 0.20.x)
        Transformer transformer = TransformerFactory.newInstance().newTransformer(template);
        transformer.transform(new JDOMSource(in), new SAXResult(driver.getContentHandler()));
    } catch (Exception e) {
        throw new ConvertirXMLException(e.toString());
    }
}
This follows FOP's documentation, but it isn't working correctly. In debug mode I can see that in has the correct content, but when FOP transforms the data to PDF, some elements are lost.
I have tested the in data and my template in an XSL editor that lets you debug and run transformations, and there it works fine (no data is lost), so I'm a little lost... Any ideas?
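As a side note, the Driver class shown above belongs to the old FOP 0.20.x API, which was removed in later releases. A minimal sketch of the same pipeline against the newer embedding API (FopFactory/Fop, FOP 0.93 and later; the exact newInstance signature varies between FOP 1.x and 2.x) would look roughly like this, assuming the same in, template and out variables:

FopFactory fopFactory = FopFactory.newInstance(new File(".").toURI());
Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, out);
Transformer transformer = TransformerFactory.newInstance().newTransformer(template);
// FOP consumes the transformation result as SAX events, as in the original code
transformer.transform(new JDOMSource(in), new SAXResult(fop.getDefaultHandler()));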

Related

javax.xml.transform.TransformerFactory Unicode issue - Java

We are not able to transform Unicode characters properly. We give input in XML format, but when we transform it we do not get back the original string.
This is the code I'm using:
StringCarrier OStringCarrier = new StringCarrier();
String SXmlFileData = "<export_candidate_response><criteria><output><lastname>Bhagavath</lastname><firstname>ガネーシュ</firstname></output></criteria></export_candidate_response>";
String SResult = "";
try
{
    TransformerFactory tFactory = TransformerFactory.newInstance();
    Transformer transformer = tFactory.newTransformer(new StreamSource(SXslFileName));
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    OutputStream xmlResult = new ByteArrayOutputStream();
    StreamResult outResult = new StreamResult(xmlResult);
    transformer.transform(new StreamSource(
            new ByteArrayInputStream(SXmlFileData.getBytes("UTF-8"))), outResult);
    SResult = outResult.getOutputStream().toString();
}
catch (TransformerConfigurationException OException)
{
    OException.printStackTrace();
    return OStringCarrier;
}
catch (TransformerException OException)
{
    OException.printStackTrace();
    return OStringCarrier;
}
catch (Exception OException)
{
    OException.printStackTrace();
    return OStringCarrier;
}
This is the output I'm getting: ガãƒ?ーシュ in place of ガネーシュ.
That tells you that somewhere in this process, data in UTF-8 is being read by a piece of software that thinks it is reading Latin-1. What it doesn't tell you is where in the process this is happening. So you need to divide and conquer: you need to find the last point at which the data is correct.
Start by establishing whether the problem is before the transformation or after it. That's very easy if you're using an XSLT 2.0 processor: you can use string-to-codepoints($in) to see exactly what string of characters the XSLT processor has been given. It's a bit trickier with a 1.0 processor, but you can use substring($in, $n, 1) to extract the nth character, and that should give you a clue.
My suspicion is that it's the input. Firstly, putting non-ASCII characters in a Java string literal is always a bit dangerous, because the round trip through a source repository can easily corrupt the code if you're not very careful about everything being configured correctly. Secondly, if the string is correct, it would be much safer to read it using a StringReader rather than converting it to a byte stream. Try:
transformer.transform(new StreamSource(
        new StringReader(SXmlFileData)), outResult);
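If you want to check the Java side of that chain first, a minimal sketch (assuming the same string literal as in the question) is to dump the code points of the string before it ever reaches the transformer; a corrupted source file shows up immediately:

for (int cp : "ガネーシュ".codePoints().toArray()) {
    // each katakana character should come out as a single code point in U+30A0..U+30FF
    System.out.printf("U+%04X%n", cp);
}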

Neither an FOEventHandler, nor a Renderer could be found for this output format

I am converting from docx to PDF format. I have successfully done the variable replacement and have a WordprocessingMLPackage template.
I have tried both approaches, the old deprecated way of converting to PDF and the newer method. Both fail with this exception:
Don't know how to handle "application/pdf" as an output format.
Neither an FOEventHandler, nor a Renderer could be found for this
output format. Error: UnsupportedOperationException
I have tried everything I can. This works on my local machine but not at my workplace. I think I have all the necessary jars. Can you please advise what course of action I should take?
Code:
Method 1:
Docx4J.toPDF(template, new FileOutputStream("newPdf.pdf"));
Method 2:
public static void createPDF(WordprocessingMLPackage template, String outputPath) {
    try {
        // 2) Prepare PDF settings
        PdfSettings pdfSettings = new PdfSettings();
        // 3) Convert WordprocessingMLPackage to PDF
        OutputStream out = new FileOutputStream(new File(outputPath));
        PdfConversion converter = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(template);
        converter.output(out, pdfSettings);
    } catch (Throwable e) {
        e.printStackTrace();
    }
}
Both are giving the same error. Any help is appreciated!
My issue is resolved. The problem was that the required fop-1.1.jar was on my Eclipse classpath but not on the local server classpath. I added it there and it worked like a charm.
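If anyone hits the same message, a quick way to confirm it is a classpath problem rather than a docx4j problem is to probe for the FOP entry class at runtime; this illustrative check should be run in the same environment (server or local) where the conversion fails:

try {
    // FopFactory lives in the FOP jar that docx4j's viaXSLFO conversion needs at runtime
    Class.forName("org.apache.fop.apps.FopFactory");
    System.out.println("FOP is visible on the classpath");
} catch (ClassNotFoundException e) {
    System.out.println("FOP jar is missing from the runtime classpath: " + e);
}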

IKVM C# Tika Implementation - NoClassDefFoundError - sun.java2d.Disposer

I have a small library that utilizes IKVM to run Tika (1.2) for the purposes of extracting text and metadata for use within Lucene. I grab document and image paths from a CMS we are using, and pass them through here:
public TextExtractionResult Extract(string filePath)
{
    var parser = new AutoDetectParser();
    var metadata = new Metadata();
    var parseContext = new ParseContext();
    Class parserClass = parser.GetType();
    parseContext.set(parserClass, parser);
    try
    {
        // Attempt to fix ImageParser "NoClassDefFoundError"
        java.lang.System.setProperty("java.awt.headless", "true");
        var file = new File(filePath);
        var url = file.toURI().toURL();
        using (InputStream inputStream = TikaInputStream.get(url, metadata))
        {
            parser.parse(inputStream, getTransformerHandler(), metadata, parseContext);
            inputStream.close();
        }
        return AssembleExtractionResult(_outputWriter.toString(), metadata);
    }
    catch (Exception ex)
    {
        throw new ApplicationException("Extraction of text from the file '{0}' failed.".ToFormat(filePath), ex);
    }
}
Only when the files are .png does it bomb, with the NoClassDefFoundError for sun.java2d.Disposer from the title. It seems to be coming from Tika's ImageParser.
For those who are interested, you can see getTransformerHandler() here:
private TransformerHandler getTransformerHandler()
{
    var factory = TransformerFactory.newInstance() as SAXTransformerFactory;
    TransformerHandler handler = factory.newTransformerHandler();
    handler.getTransformer().setOutputProperty(OutputKeys.METHOD, "text");
    handler.getTransformer().setOutputProperty(OutputKeys.INDENT, "yes");
    handler.getTransformer().setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    _outputWriter = new StringWriter();
    handler.setResult(new StreamResult(_outputWriter));
    return handler;
}
I have looked around and keep being pointed in the direction of running headless, so I already tried that, with no luck. Because this is a C# implementation via IKVM, is something missing? It works on all other documents as far as I can tell (.jpeg, .docx, .pdf, etc.).
Thanks to those who know more about Tika + IKVM implementations than I do.
Apache Tika 1.2 was released back on 17 July 2012, and there have been a lot of fixes and improvements since then.
You should upgrade to the most recent version of Apache Tika (1.12 as of writing), and that should solve your issue.

How to merge two jrxml jasper reports into a one single pdf output file

I have two JRXML files with two different data sources: the first report uses a JRXmlDataSource, and the second uses a JRResultSetDataSource.
try
{
    conn = objConnector.getConnection();
    conn.setAutoCommit(false);
    PreparedStatement ps = conn.prepareCall("{ call Sp_DEMO(?) }");
    ps.setString(1, condition);
    ResultSet rs = ps.executeQuery();
    JasperReport jreport1 = JasperCompileManager.compileReport("d:\\JRXML\\ECGImage.jrxml");
    JasperPrint jprint1 = JasperFillManager.fillReport(jreport1, new HashMap(), new JRResultSetDataSource(rs));
    jprintlist.add(jprint1);
    JasperReport jasperReport = JasperCompileManager.compileReport("d:\\JRXML\\RadiologyReport.jrxml");
    JRXmlDataSource xmlDataSource = new JRXmlDataSource("d:\\abc.xml", "/X-RayReport/Type");
    JasperPrint jasperPrint = JasperFillManager.fillReport(jasperReport, new HashMap(), xmlDataSource);
    jprintlist.add(jasperPrint);
    File file = new File("d:\\demo.pdf");
    if (file.exists())
    {
        file.delete();
    }
    JRExporter exporter = new JRPdfExporter();
    exporter.setParameter(JRPdfExporterParameter.JASPER_PRINT_LIST, jprintlist);
    OutputStream output = new FileOutputStream(new File("d:\\demo.pdf"));
    exporter.setParameter(JRPdfExporterParameter.OUTPUT_STREAM, output);
    exporter.exportReport();
}
catch (Exception e)
{
    e.printStackTrace();
}
I want to create a single PDF file as output from both JRXML files.
You can merge the two JasperPrint objects like this:
List pages = jasperPrint.getPages();
for (int j = 0; j < pages.size(); j++) {
    JRPrintPage object = (JRPrintPage) pages.get(j);
    jprint1.addPage(object);
}
And jprint1 will be your single output.
Well, that is what JasperReports subreports are meant for. You create another .jrxml that acts as the master report and include the two existing ones in it as subreports, so you get a single output.
To learn how to create subreports (if you don't know how), please refer to these tutorials: JasperReports - Create SubReports, SubReports.
Since you have two different data sources, I think you might as well need to read this too: Pass parameter to subreports.
You may need this to pass the different data sources as parameters to the subreports, rather than through your JasperPrint object instances.
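As a rough, hedged illustration of that idea (the parameter name XML_DATA_SOURCE is made up here, and the master report is assumed to declare it and hand it to the subreport via its dataSourceExpression), the fill call could pass the second data source alongside the first:

Map<String, Object> params = new HashMap<>();
// hypothetical parameter consumed by the subreport's dataSourceExpression
params.put("XML_DATA_SOURCE", new JRXmlDataSource("d:\\abc.xml", "/X-RayReport/Type"));
// masterReport is the compiled main report, rs the ResultSet from the question
JasperPrint print = JasperFillManager.fillReport(masterReport, params, new JRResultSetDataSource(rs));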
I have no factual defence or argument for the way I normally do it; it is just a matter of personal preference, and of the ease it gives me during the merging and post-processing stages when adding some general extras to the merged report, i.e. page numbers, headers, footers, etc.
The correct JasperReports way is to stay away from joining reports in Java and instead make a main report and add subreports to it. However, if you really want to do it in Java, you can use markers and post-process everything again afterwards; check here.
Still, I just don't do that.
Most of the time I prefer to first generate and export the different sections of my final report to PDF separately and afterwards merge them all with PDFBox. By not following the JasperReports way I usually avoid a deep hierarchy of subreports, since each section of the final report normally contains a couple of subreports of its own. I simply find it better to focus on each section separately and do the final merging later.
I normally do something like this:
public final class Report
{
    List<File> mergingFiles;
    ReportData data;

    public Report(ReportData data)
    {
        // I prefer to pack all the data (jrxml file paths, the data that
        // fills the report, ...) into a separate object -> ReportData
        this.data = data;
        mergingFiles = new ArrayList<>();
    }

    private void generateCoverPage() throws JRException
    {
        /* Setting up the data which needs to be passed to the cover page,
           reading the cover page jrxml file,
           compiling the report */
        // and exporting - I normally use JRPdfExporter for that
        exporter.setExporterOutput(new SimpleOutputStreamExporterOutput(data.getCoverPageExportPath()));
        exporter.exportReport();
        mergingFiles.add(new File(data.getCoverPageExportPath()));
    }

    private void generateSecondPart() throws JRException
    {
        /* similar to generateCoverPage() to create another part of the report */
    }

    public void generateReport() throws JRException
    {
        generateCoverPage();
        generateSecondPart();
        mergePDFFiles(mergingFiles, data.getPrintFileName());
        /* Do additional general post-processing (page numbers, header, footer, etc.)
           here and then clean up the temp files */
    }

    private void mergePDFFiles(List<File> files, String mergedFileName)
    {
        // all classes are imported from "org.apache.pdfbox"
        try
        {
            PDFMergerUtility pdfmerger = new PDFMergerUtility();
            pdfmerger.setDestinationFileName(mergedFileName);
            for (File file : files)
            {
                pdfmerger.addSource(file);
            }
            pdfmerger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
        }
        catch (IOException e)
        {
            System.out.println("Failed to merge files. Error: " + e.getMessage());
        }
    }
}
Comments starting with // are just my side notes; the comments surrounded by /* */ mark the parts you have to take care of yourself if you want to use my approach.

No return value after moving XSLT transformation to other project

I have the following piece of code to transform an XML document into an HTML page. It's nothing special, except that I use the Saxon factory for the transformation because I need a function from XSLT 2.0.
try {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    StreamResult output = new StreamResult(bos);
    StreamSource input = new StreamSource(new FileReader(TransformationTest.class.getResource(sourceFileName).getFile()));
    StreamSource stylesheet = new StreamSource(TransformationTest.class.getResourceAsStream(STYLESHEET));
    Transformer tr = TransformerFactory.newInstance("net.sf.saxon.TransformerFactoryImpl", null).newTransformer(stylesheet);
    tr.transform(input, output);
    return bos.toString();
} catch (TransformerException e) {
    throw new RuntimeException("XSL-Transform: Error while transforming the document.", e);
} catch (FileNotFoundException e) {
    throw new RuntimeException("XSL-Transform: Error while reading the source file.", e);
}
The code works fine in a separate project with Saxon added as a dependency. I tried it with different inputs, and every time I got the right output.
Now I have added it to a bigger project that has many other libraries on the classpath, and suddenly the code stops working. The output is just blank: no exceptions, no error messages, just a blank string is returned. I'm not an expert on what happens inside the transformer, so I need a bit of help here. What exactly could be the reason for this behaviour, and how can I fix it?
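A small diagnostic sketch for narrowing this down (assuming the same sourceFileName and STYLESHEET constants as above): in a large classpath the usual suspects are the classloader silently failing to find the resources, or an unexpected transformer implementation answering the lookup, and both can be checked before the transform runs:

// getResource/getResourceAsStream return null instead of throwing when the resource is missing
System.out.println("source found: " + (TransformationTest.class.getResource(sourceFileName) != null));
System.out.println("stylesheet found: " + (TransformationTest.class.getResourceAsStream(STYLESHEET) != null));

// which transformer implementation is really answering the request?
TransformerFactory tf = TransformerFactory.newInstance("net.sf.saxon.TransformerFactoryImpl", null);
System.out.println("factory in use: " + tf.getClass().getName());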
