How apply CDATA to transformer parameter with jdom - java

For some reason I have tried to surround the parameters sExtraParameter, sExtraParameter2, sExtraParameter3 with <![CDATA[ ]]> string in order to get "pretty-printed" latin characters. But every time I check the xml output, it stills show bad parsed characters.
So, if is there another way to apply the CDATA to this parameters?
public static Element xslTransformJDOM(File xmlFile, String xslStyleSheet, String sExtraParameter, String sExtraParameterValue, String sExtraParameter2, String sExtraParameterValue2, String sExtraParameter3,String sExtraParameterValue3 ) throws JDOMException, TransformerConfigurationException, FileNotFoundException, IOException{
try{
Transformer transformer = TransformerFactory.newInstance().newTransformer(new StreamSource(xslStyleSheet));
transformer.setParameter(sExtraParameter, sExtraParameterValue);
transformer.setParameter(sExtraParameter2, sExtraParameterValue2);
transformer.setParameter(sExtraParameter3, sExtraParameterValue3);
JDOMResult out = new JDOMResult();
transformer.transform(new StreamSource(xmlFile), out);
Element result = out.getDocument().detachRootElement();
setSize(new XMLOutputter().outputString(result).length());
return result;
}
catch (TransformerException e){
throw new JDOMException("XSLT Transformation failed", e);
}
}
edit:
I am following up a project from my boss, for these reason I have not the entire code to show you here.

Maybe I have missed the question, but the API (http://docs.oracle.com/javaee/1.4/api/javax/xml/transform/Transformer.html#setParameter(java.lang.String, java.lang.Object)) for setParameter does not expect
value - The value object. This can be any valid Java object. It is up to the processor to provide the proper object coersion or to simply pass the object on for use in an extension.
This could then vary by implementation, assuming you are using JDOM.
There may be a CDATA xml element that would then be processed correctly. Maybe: http://www.jdom.org/docs/apidocs/org/jdom2/CDATA.html
You could still think about setting the serializer settings to some sort of whitespace preservation. http://www.jdom.org/docs/apidocs.1.1/org/jdom/output/Format.TextMode.html

Related

How do you write a JUnit Test for TransformerException from javax.xml.transform.Transformer.transform

I have been trying to write a unit test in an attempt to reach full coverage on my class under test.
I am trying to test that it properly catches the TransformerException thrown from this method in class DOMUtil:
public final class DOMUtil
{
// This class is entirely static, so there's no need to ever create an instance.
private DOMUtil()
{
// Do nothing.
}
...
...
/**
* Returns a String containing XML corresponding to a Document. The String
* consists of lines, indented to match the Document structure.
* #param doc - Document to be converted.
* #return String containing XML or null if an error occurs.
*/
public static String documentToString(final Document doc)
{
try
{
// Note that there is no control over many aspects of the conversion,
// e.g., insignificant whitespace, types of quotes.
final Transformer tf = TransformerFactory.newInstance().newTransformer();
tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
tf.setOutputProperty(OutputKeys.INDENT, "yes");
final Writer out = new StringWriter();
tf.transform(new DOMSource(doc), new StreamResult(out));
return out.toString();
}
catch (final TransformerException e)
{
LOG.error("Error converting Document to String: " + e.getMessage());
return null;
}
}
}
My DOMUtilTest class:
#RunWith(PowerMockRunner.class)
...
...
#PrepareForTest({Document.class, TransformerFactory.class, Transformer.class, DOMUtil.class, DocumentBuilder.class, DocumentBuilderFactory.class})
public class DOMUtilTest
{
/**
* Test documentToString with TransformerException
*/
#Test(expected=TransformerException.class)
public void testDocumentToStringTransformerException()
{
try
{
// Mocking Stuff
TransformerFactory fac = PowerMockito.mock(TransformerFactory.class);
Transformer transformer = PowerMockito.mock(Transformer.class);
// probably only need two of these??
PowerMockito.whenNew(TransformerFactory.class).withNoArguments().thenReturn(fac);
PowerMockito.whenNew(Transformer.class).withNoArguments().thenReturn(transformer);
PowerMockito.when(fac.newTransformer(ArgumentMatchers.any(Source.class))).thenReturn(transformer);
PowerMockito.when(fac.newTransformer()).thenReturn(transformer);
// spy in results
PowerMockito.spy(transformer);
PowerMockito.when(transformer, "transform", ArgumentMatchers.any(Source.class), ArgumentMatchers.any(Result.class)).thenThrow(new TransformerException("Mocked TransformerException"));
final String result = DOMUtil.documentToString(doc);
LOG.info("result length: " + result.length());
}
catch(Exception e)
{
LOG.error("Exception in testDocumentToStringTransformerException: " + e.getMessage());
fail("Exception in testDocumentToStringTransformerException: " + e.getMessage());
}
}
}
I feel like I have tried every possible solution. I have a lot of working tests with similar conditions on other classes/methods. I have tried
annotations style mocking/injecting
spying
ArgumentMatchers.any(DOMSource.class), ArgumentMatchers.any(StreamResult.class) (which gives error: The method any(Class) from the type ArgumentMatchers refers to the missing type DOMSource)
and every other possible way I could think of. Right now the result is still showing: result length: 25706 (the real length of the doc object) without getting an exception before copying the document.
Here is a further explanation of TransformationException from Java 7 API - Exceptions and Error Reporting.
TransformerException is a general exception that occurs during the
course of a transformation. A transformer exception may wrap another
exception, and if any of the TransformerException.printStackTrace()
methods are called on it, it will produce a list of stack dumps,
starting from the most recent. The transformer exception also provides
a SourceLocator object which indicates where in the source tree or
transformation instructions the error occurred.
TransformerException.getMessageAndLocation() may be called to get an
error message with location info, and
TransformerException.getLocationAsString() may be called to get just
the location string.
My assumption is the problem is with mocking of the line:
final Transformer tf = TransformerFactory.newInstance().newTransformer();
Has anybody encountered a situation like this, with JUnit and PowerMockito?
If anybody could point me in the right direction or show me how to solve this any help would be greatly appreciated.
Thanks!
In theory this does not seem like the prettiest way of mocking the TransformerException, but upon a lot of research and attempts, I came up with the solution of mocking the Result outputTarget argument of tf.transform:
#Test
public void testDocumentToStringTransformerException() throws Exception
{
// Mocking Stuff
PowerMockito.whenNew(StringWriter.class).withNoArguments().thenReturn(null);
// Execute Method Under Test
final String result = DOMUtil.documentToString(doc);
// Verify Result
assertNull(result);
}
Which will ultimately give:
XML-22122: (Fatal Error) Invalid StreamResult - OutputStream, Writer, and SystemId are null.
encapsulated into a TransformerException. This allows me to test the TransformerException branches of code coverage.
So the good news is it is possible with mocking, instead of trying to figure out how to manipulate a Document object to throw the exception. Also don't have to mock as many objects as I initially thought.
(References to TransformerException here: Java 7 API - Exceptions and Error Reporting)

javax.xml.transform.TransformerFactory Unicode issue- Java

We are not able transform Unicode Characters properly. We are giving input in XML format, when we try to transform we are not able to get back the original string.
This is the code i'm using,
StringCarrier OStringCarrier = new StringCarrier();
String SXmlFileData= "<export_candidate_response><criteria><output><lastname>Bhagavath</lastname><firstname>ガネーシュ</firstname></output></export_candidate_response>";
String SResult = "";
try
{
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer(new StreamSource(SXslFileName));
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF8");
OutputStream xmlResult = (OutputStream)new ByteArrayOutputStream();
StreamResult outResult = new StreamResult(xmlResult);
transformer.transform(new StreamSource(
new ByteArrayInputStream(SXmlFileData.getBytes("UTF8"))),outResult);
SResult = outResult.getOutputStream().toString();
}
catch (TransformerConfigurationException OException)
{
//Exception has been thrown
OException.printStackTrace();
return OStringCarrier;
}
catch (TransformerException OException)
{
//Exception has been thrown
OException.printStackTrace();
return OStringCarrier;
}
catch (Exception OException)
{
//Exception has been thrown
OException.printStackTrace();
return OStringCarrier;
}
This is the output i'm getting ガãƒ?ーシュ in place of ガネーシュ
This is the output i'm getting ガãƒ?ーシュ in place of ガネーシュ
That tells you that somewhere in this process, data in UTF-8 is being read by a piece of software that thinks it is reading Latin-1. What it doesn't tell you is where in the process this is happening. So you need to divide-and-conquer - you need to find the last point at which the data is correct.
Start by establishing whether the problem is before the transformation or after it. That's very easy if you're using an XSLT 2.0 processor: you can use ` to see what string of characters the XSLT processor has been given. It's a bit trickier with a 1.0 processor, but you can use substring($in, $n, 1) to extract the nth character, and that should give you a clue.
My suspicion is that it's the input. Firstly, putting non-ASCII characters in a Java string literal is always a bit dangerous, because the round trip to a source repository can easily corrupt the code if you're not very careful about everything being configured correctly. Secondly, if the string is correct, it would be much safer to read it using a StringReader, rather than converting it to a byte stream. Try:
transformer.transform(new StreamSource(
new StringReader(SXmlFileData)),outResult);

JDOM Transformer - don't contract empty elements

I'm using JDOM 2.0.6 to transform an XSLT into an HTML, but I'm coming across the following problem - sometimes the data should be empty, that is, I'll have in my XSLT the following:
<div class="someclass"><xsl:value-of select="somevalue"/></div>
and when somevalue is empty, the output I get is:
<div class="someclass"/>
which may be perfectly valid XML, but is not valid HTML, and causes problems when displaying the resulting page.
Similar problems occur for <span> or <script> tags.
So my question is - how can I tell JDOM not to contract empty elements, and leave them as <div></div>?
Edit
I suspect the problem is not in the actual XSLTTransformer, but later when using JDOM to write to html. Here is the code I use:
XMLOutputter htmlout = new XMLOutputter(Format.getPrettyFormat());
htmlout.getFormat().setEncoding("UTF-8");
Document htmlDoc = transformer.transform(model);
htmlDoc.setDocType(new DocType("html"));
try (OutputStreamWriter osw = new OutputStreamWriter(new FileOutputStream(outHtml), "UTF-8")) {
htmlout.output(htmlDoc, osw);
}
Currently the proposed solution of adding a zero-width space works for me, but I'm interested to know if there is a way to tell JDOM to treat the document as an HTML (be it in the transform stage or the output stage, but I'm guessing the problem lies in the output stage).
You can use a zero-width-space between the elements. This doesn't affect the HTML output, but keeps the open-close-tags separated because they have a non-empty content.
<div class="someclass">​<xsl:value-of select="somevalue"/></div>
Downside is: the tag is not really empty anymore. That would matter if your output would be XML. But for HTML - which is probably the last stage of processing - it should not matter.
In your case, the XML transform is happening directly to a file/stream, and it is no longer in the control of JDOM.
In JDOM, you can select whether the output from the JDOM document has expanded, or not-expanded output for empty elements. Typically, people have output from JDOM like:
XMLOutputter xout = new XMLOutputter(Format.getPrettyFormat());
xout.output(document, System.out);
You can modify the output format, though, and expand the empty elements
Format expanded = Format.getPrettyFormat().setExpandEmptyElements(true);
XMLOutputter xout = new XMLOutputter(expanded);
xout.output(document, System.out);
If you 'recover' (assuming it is valid XHTML?) the XSLT transformed xml as a new JDOM document you can output the result with expanded empty elements.
If you want to transform to a HTML file then consider to use Jaxp Transformer with a JDOMSource and a StreamResult, then the Transformer will serialize the transformation result as HTML if the output method is html (either as set in your code or as done with a no-namespace root element named html.
In addition to the "expandEmptyElements" option, you could create your own writer and pass it to the XMLOutputter:
XMLOutputter outputter = new XMLOutputter(Format.getPrettyFormat().setExpandEmptyElements(true));
StringWriter writer = new HTML5Writer();
outputter.output(document, writer);
System.out.println(writer.toString());
This writer can then modify all HTML5 void elements. Elements like "script" for example won't be touched:
private static class HTML5Writer extends StringWriter {
private static String[] VOIDELEMENTS = new String[] { "area", "base", "br", "col", "command", "embed", "hr",
"img", "input", "keygen", "link", "meta", "param", "source", "track", "wbr" };
private boolean inVoidTag;
private StringBuffer voidTagBuffer;
public void write(String str) {
if (voidTagBuffer != null) {
if (str.equals("></")) {
voidTagBuffer.append(" />");
super.write(voidTagBuffer.toString());
voidTagBuffer = null;
} else {
voidTagBuffer.append(str);
}
} else if (inVoidTag) {
if (str.equals(">")) {
inVoidTag = false;
}
} else {
for (int i = 0; i < VOIDELEMENTS.length; i++) {
if (str.equals(VOIDELEMENTS[i])) {
inVoidTag = true;
voidTagBuffer = new StringBuffer(str);
return;
}
}
super.write(str);
}
}
}
I know, this is dirty, but I had the same problem and didn't find any other way.

Transformation of multiple input files

Right now i am using this java (which receives one xml file parameter) method to perform XSLT transformation:
static public byte[] simpleTransform(byte[] sourcebytes, int ref_id) {
try {
StreamSource xmlSource = new StreamSource(new ByteArrayInputStream(sourcebytes));
StringWriter writer = new StringWriter();
transformations_list.get(ref_id).transformer.transform(xmlSource, new StreamResult(writer));
return writer.toString().getBytes("UTF-8");
} catch (Exception e) {
e.printStackTrace();
return new byte[0];
}
}
And in my xslt file i am using the document('f2.xml') to refer to other transform related files.
I want to use my Java like this (get multiple xml files):
static public byte[] simpleTransform(byte[] f1, byte[] f2, byte[] f3, int ref_id)
An in my XSLT i don't want to call document('f2.xml') but refer to the object by using f2 received in my Java method.
Is there a way to do it? how do i refer to
f2.xml
in my XSLT using this way?
I'm not entirely sure what is in f1, f2 etc. Is it the URL of a document? or the XML document content itself?
There are two possible approaches you could consider.
One is to write a URIResolver. When you call document('f2.xml') Saxon will call your URIResolver to get the relevant document as a Source object. Your URIResolver could return a StreamSource initialized with a ByteArrayInputStream referring to the relevant btye[] value.
A second approach is to supply the documents as parameters to the stylesheet. You could declare a global parameter <xsl:param name="f2" as="document-node()"/> and then use Transfomer.setParameter() to supply the actual document; within the stylesheet, replace document('f2.xml') by $f2. Saxon will accept a Source object as the value supplied to setParameter, so you could again create a StreamSource initialized with a ByteArrayInputStream referring to the relevant btye[] value; alternatively (and perhaps better) you could pre-build the tree by calling a Saxon DocumentBuilder.

jaxb marshaller characterEscapeHandler

I have the following problem. I've set the following properties to the marshaller:
marshaller.setProperty( Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE );
marshaller.setProperty( "com.sun.xml.bind.characterEscapeHandler", new CharacterEscapeHandler() {
public void escape(char[] ch, int start, int length, boolean isAttVal, Writer out) throws IOException {
String s = new String(ch, start, length);
System.out.println("Inside CharacterEscapeHandler...");
out.write(StringEscapeUtils.escapeXml(StringEscapeUtils.unescapeXml(s)));
}
});
When i try to marshall an object to SOAPBody with the following code:
SOAPMessage message = MessageFactory.newInstance().createMessage();
marshaller.marshal(request, message.getSOAPBody());
the CharacterEscapeHandler.escape is not invoked, and the characters are not escaped, but this code:
StringWriter writer = new StringWriter();
marshaller.marshal(request, writer);
invokes CharacterEscapeHandler.escape(), and all the characters are escaped... Is this normal behaviour for JAXB. And how can I escape characters before placing them inside SOAP's body?
Update:
Our system have to communicate with another system, which expects the text to be escaped.
Example for message sent by the other system:
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
<env:Body xmlns:ac="http://www.ACORD.org/Standards/AcordMsgSvc/1">
<ac:CallRs xmlns:ac="http://www.ACORD.org/Standards/AcordMsgSvc/1">
<ac:Sender>
<ac:PartyId>urn:partyId</ac:PartyId>
<ac:PartyRoleCd/>
<ac:PartyName>PARTYNAME</ac:PartyName>
</ac:Sender>
<ac:Receiver>
<ac:PartyRoleCd>broker</ac:PartyRoleCd>
<ac:PartyName>�марант Българи� ООД</ac:PartyName>
</ac:Receiver>
<ac:Application>
<ac:ApplicationCd>applicationCd</ac:ApplicationCd>
<ac:SchemaVersion>schemaversion/</ac:SchemaVersion>
</ac:Application>
<ac:TimeStamp>2011-05-11T18:41:19</ac:TimeStamp>
<ac:MsgItem>
<ac:MsgId>30d63016-fa7d-4410-a19a-510e43674e70</ac:MsgId>
<ac:MsgTypeCd>Error</ac:MsgTypeCd>
<ac:MsgStatusCd>completed</ac:MsgStatusCd>
</ac:MsgItem>
<ac:RqItem>
<ac:MsgId>d8c2d9c4-3f1c-459f-abe1-0e9accbd176b</ac:MsgId>
<ac:MsgTypeCd>RegisterPolicyRq</ac:MsgTypeCd>
<ac:MsgStatusCd>completed</ac:MsgStatusCd>
</ac:RqItem>
<ac:WorkFolder>
<ac:MsgFile>
<ac:FileId>cid:28b8c9d1-9655-4727-bbb2-3107482e7f2e</ac:FileId>
<ac:FileFormatCd>text/xml</ac:FileFormatCd>
</ac:MsgFile>
</ac:WorkFolder>
</ac:CallRs>
</env:Body>
</env:Envelope>
So I need to escape all the text between the opening/closing tags.. like this inside ac:PartyName
When you marshal to a DOM Document, JAXB is not in charge of the actual serialization and escaping, it just builds the DOM tree in memory. The serialization is then handled by the DOM implementation.
Needing additional escaping when writing xml is usually a sign of a design problem or not using xml correctly. If you can give some more context why you need this escaping, maybe I could suggest an alternative solution.

Categories