Restrict element creation in XSLT if value is empty - java

I want to create a new element in the target XML if and only if the corresponding element value in the source XML is not empty. I can do this with the code below, but I have around 5,000 fields that would each need a similar condition. Is there a better way to handle this?
<xsl:if test="edi:po-num"> <!-- I want to avoid repeating this for each element -->
<xsl:element name="element">
<xsl:attribute name="name">order_reference_number</xsl:attribute>
<xsl:value-of select="edi:po-num"/>
</xsl:element>
</xsl:if>
Java code to transform:
Transformer trans = StylesheetCache.newTransformer(xslFilePath);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
trans.transform(source, new StreamResult(outputStream));

Your options in XSLT 1.0 are limited - XSLT 1.0 code tends to be verbose. But if it's really repetitive, then you could consider writing a meta-stylesheet - an XSLT stylesheet that generates your stylesheet from some higher-level description of what it needs to do.
Note also, your code will be a lot less verbose if you use literal result elements and attribute value templates rather than xsl:element and xsl:attribute.
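As a rough illustration of the second point, here is a minimal sketch that uses a literal result element and an attribute value template, and goes one step further by moving the emptiness check into the select expression so that one shared template serves every field. The mode name `field`, the `name` parameter, the `edi:order` and `edi:ship-to` elements and the namespace URI `urn:example:edi` are all assumptions made up for this example; the question does not show the real input structure.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class NonEmptyFieldDemo {
    // "urn:example:edi" is a placeholder namespace, not the asker's real one.
    static final String XSL =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:edi='urn:example:edi' exclude-result-prefixes='edi'>"
        + "<xsl:output omit-xml-declaration='yes'/>"
        + "<xsl:template match='/edi:order'>"
        + "<record>"
        // The predicate selects nothing for blank elements, so no xsl:if is needed.
        + "<xsl:apply-templates select='edi:po-num[normalize-space()]' mode='field'>"
        + "<xsl:with-param name='name' select=\"'order_reference_number'\"/>"
        + "</xsl:apply-templates>"
        + "<xsl:apply-templates select='edi:ship-to[normalize-space()]' mode='field'>"
        + "<xsl:with-param name='name' select=\"'ship_to'\"/>"
        + "</xsl:apply-templates>"
        + "</record>"
        + "</xsl:template>"
        // Shared template: literal result element plus attribute value template.
        + "<xsl:template match='*' mode='field'>"
        + "<xsl:param name='name'/>"
        + "<element name='{$name}'><xsl:value-of select='.'/></element>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical input: one filled field and one blank field.
    static final String SAMPLE =
          "<edi:order xmlns:edi='urn:example:edi'>"
        + "<edi:po-num>PO-42</edi:po-num><edi:ship-to> </edi:ship-to>"
        + "</edi:order>";

    // Runs the stylesheet against the input and returns the serialized result.
    static String transform(String xsl, String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XSL, SAMPLE));
    }
}
```

Note that `normalize-space()` treats whitespace-only values as empty, whereas the original `test="edi:po-num"` only checks that the element exists. With 5,000 fields the list of `xsl:apply-templates` calls is still long, which is where generating the stylesheet from a mapping table becomes attractive.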

Related

Xalan and the document() function

Since I have spent almost a full day debugging this, I hope to get some valuable insight on SO into the following problem:
I am running an XSL Transformation on an input document, my stylesheet loads an external XML-Document which contains lookup values I need to do some comparisons.
I am loading the external document like this:
<xsl:variable name="dictionary"
select="document('myDict.xml', document(''))/path/to/LookupElement" />
LookupElement is an element which contains the complete XML-Fragment I need to access.
Throughout the stylesheet various comparison expressions are accessing $dictionary.
Now, what happens is, that the transformation with this document() function call in place takes about 12 (!) minutes using Xalan (2.7.?, latest version, downloaded from the Apache website, not the one contained in the JRE).
The same stylesheet without the document() call (and without my comparisons accessing data in $dictionary) completes in seconds.
The same stylesheet using Saxon-B 9.1.0.8 completes in seconds as well.
Information: The external document has 25MB(!) and there is no possibility for me to reduce its size.
I am running the transformations using the xslt-Task of ant under JRE 6.
I am not sure if this has anything to do with the above-mentioned problem: Throughout my stylesheet I have expressions that test for the existence of certain attributes in the external XML-Document. These expressions always evaluate to true, regardless of whether the attributes exist or not:
<xsl:variable name="myAttExists" select="boolean($dictionary/path/to/@myAttribute)"/>
I am at the end of my wits. I know that Xalan correctly reads the document, all references go to $dictionary, so I am not calling document() multiple times.
Anybody any idea?
Edit:
I have removed the reference to the XML-Schema from the external XML-Document to prevent Schema-Lookups of Xalan or the underlying (Xerces) Parser.
Edit:
I have verified that myAttExists is always true, even when specifying an attribute name that definitely does not exist anywhere in the external XML-Document.
I have even changed the above expression to:
<xsl:variable name="myAttExists" select="count($dictionary/path/to/@unknownAttribute) != 0"/>
which still yields true.
Edit:
I have removed the call to the document() function and all references to $dictionary for testing purposes. This reduces transformation runtime with Xalan to 16 seconds.
Edit:
Interesting detail: The Xalan version shipped with Oxygen 12.1 completes within seconds loading the external XML-Document. However, it also evaluates the existence of attributes incorrectly...
Edit:
I have the following variable declaration which always yields true:
<xsl:variable name="expectedDefaultValueExists">
<xsl:choose>
<xsl:when test="@index">
<xsl:value-of select="boolean($dictionary/epl:Object[@index = $index]/@defaultValue)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="boolean($dictionary/epl:Object[@index = $index]/epl:SubObject[@subIndex = $subIndex]/@defaultValue)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
Is this possible in XSLT/XPath 1.0? $index and $subIndex are calculated from the @index and @subIndex attributes of the context node. I want to load the defaultValue attribute from the external XML-Document which has an equal index and/or subIndex.
Is it possible to use variables in predicates in XPath 1.0? This works in XPath 2.0.
Regarding the incorrect variable assignment, I no longer believe this is a parser (Xalan) issue, since PHP's XSLTProcessor does the same. It must be an issue in the variable declaration...
This only answers the last part of the question, but it's getting too unwieldy for comments...
I have the following variable declaration which always yields true when used as the test of an xsl:if or xsl:when:
<xsl:variable name="expectedDefaultValueExists">
<xsl:choose>
<xsl:when test="@index">
<xsl:value-of select="boolean($dictionary/epl:Object[@index = $index]/@defaultValue)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="boolean($dictionary/epl:Object[@index = $index]/epl:SubObject[@subIndex = $subIndex]/@defaultValue)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
In XSLT 1.0 a variable with a body rather than a select always becomes a "result tree fragment", in this case with a single text node child that will contain the string "true" or "false" as appropriate. Any non-empty RTF is considered true when converted to boolean.
In XSLT 2.0 it's a similar story - 2.0 doesn't distinguish between node sets and result tree fragments, but still the variable will be a "temporary tree" with a single text node child whose value is the string "true" or "false", and both these trees are true when converted to boolean. If you want to get an actual boolean value out of the variable then you need to change two things - add as="xs:boolean" to the variable declaration and use xsl:sequence instead of xsl:value-of:
<xsl:variable name="expectedDefaultValueExists" as="xs:boolean">
<xsl:choose>
<xsl:when test="@index">
<xsl:sequence select="boolean($dictionary/epl:Object[@index = $index]/@defaultValue)"/>
</xsl:when>
<xsl:otherwise>
<xsl:sequence select="boolean($dictionary/epl:Object[@index = $index]/epl:SubObject[@subIndex = $subIndex]/@defaultValue)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:variable>
The xsl:value-of instruction converts the result of its select into a string and constructs a text node containing that string. The xsl:sequence instruction simply returns the value from the select directly as whatever type it happens to be.
But there are simpler ways to achieve the same thing. In XPath 2.0 you can do if/then/else constructs directly in the XPath:
<xsl:variable name="expectedDefaultValueExists"
select="if (@index)
then $dictionary/epl:Object[@index = $index]/@defaultValue
else $dictionary/epl:Object[@index = $index]/epl:SubObject[@subIndex = $subIndex]/@defaultValue" />
In 1.0 you need to be slightly more creative:
<xsl:variable name="expectedDefaultValueExists"
select="(@index and $dictionary/epl:Object[@index = $index]/@defaultValue)
or (not(@index) and $dictionary/epl:Object[@index = $index]/epl:SubObject[@subIndex = $subIndex]/@defaultValue)" />
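To see the XSLT 1.0 behaviour in isolation, here is a small sketch that can be run against the JDK's built-in XSLT 1.0 processor. It declares the same test twice, once as a variable with a body (a result tree fragment) and once with a select attribute (a real boolean). The `doc` element and the `missing` attribute are made-up names for the example, and the attribute deliberately does not exist in the input.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class RtfBooleanDemo {
    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        // Variable with a body: a result tree fragment holding the text "false".
        + "<xsl:variable name='rtf'>"
        + "<xsl:value-of select='boolean(/doc/@missing)'/>"
        + "</xsl:variable>"
        // Variable with select: keeps the boolean type of the expression.
        + "<xsl:variable name='sel' select='boolean(/doc/@missing)'/>"
        + "<xsl:text>rtf=</xsl:text>"
        + "<xsl:choose><xsl:when test='$rtf'>true</xsl:when>"
        + "<xsl:otherwise>false</xsl:otherwise></xsl:choose>"
        + "<xsl:text>;sel=</xsl:text>"
        + "<xsl:choose><xsl:when test='$sel'>true</xsl:when>"
        + "<xsl:otherwise>false</xsl:otherwise></xsl:choose>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical input with no "missing" attribute.
    static final String INPUT = "<doc present='1'/>";

    // Runs the stylesheet and returns the text output.
    static String run() {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(INPUT)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With the default JDK processor this should report the body-style variable as true and the select-style variable as false, which is the difference described above.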

XSLT in Java: CDATA section split

I want to replace some items in a huge XML file, and I thought I'd do it with XSLT. I have absolutely no experience with it, so if you think there would be better ways to do this, please tell me.
Anyway, as a first step I just wanted to copy the whole XML over. This is my xsl file:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="no" cdata-section-elements="script"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
The relevant Java code is:
Source xmlInput = new StreamSource(oldProjectStream);
Source xsl = new StreamSource("test.xsl");
Transformer transformer = TransformerFactory.newInstance().newTransformer(xsl);
StreamResult xmlOutput = new StreamResult("output/project.xml");
transformer.transform(xmlInput, xmlOutput);
Most of the output is fine, also the order of the elements is not changed (this could turn out quite important).
The XML contains some Lua code in CDATA sections. At some (seemingly random) points, however, the CDATA section is closed and reopened again. It seems to have something to do with brackets in the code, but it happens only rarely - there are about 5 places in a 1.4 MB XML file that look like this:
<script><![CDATA[
...
html_encoding["Otilde" ] = string.char(213)
html_encoding["Ouml" ]]]><![CDATA[ = string.char(214)
html_encoding["Oslash" ] = string.char(216)
...
]]></script>
In the original file, the middle line looks just like the other ones. There are thousands of lines where I've put the dots. What's going on here?
The (proprietary) application that should handle the XML isn't able to load it.
It's useful to tell us which XSLT processor you are using.
The serializer has to close and reopen a CDATA section if it encounters "]]>" in the data, because that sequence cannot legally appear in a CDATA section. It shouldn't need to do so under any other circumstances, though the spec probably doesn't disallow it.
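A minimal sketch of the legitimate case, assuming the JDK's built-in processor: the input text contains the sequence "]]>" (written as `]]&gt;` in the source XML), so the serializer has to break the CDATA section somewhere inside that sequence. The Lua-like line is invented for the example. Rather than depending on exactly where the split happens, the sketch re-parses the output and checks that the text content is unchanged, which is what a conforming consumer should see.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;

class CdataSplitDemo {
    // Identity transform that writes <script> content as CDATA sections.
    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output omit-xml-declaration='yes' indent='no' cdata-section-elements='script'/>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Made-up Lua-like text whose parsed value contains "]]>".
    static final String INPUT = "<script>if t[i[1]]&gt;0 then x = 1 end</script>";

    // Applies the identity transform and returns the serialized document.
    static String serialize(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses a document and returns the text content of its root element.
    static String textOf(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String out = serialize(INPUT);
        System.out.println(out);          // raw output, CDATA markers visible
        System.out.println(textOf(out));  // text as a parser sees it
    }
}
```

If the consuming application cannot read adjacent CDATA sections, dropping `cdata-section-elements` so that the text is escaped normally is one possible workaround, provided that application accepts escaped text.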

Unusual output for XSL transformations

I have an xml document and a style sheet to convert the document into another useful xml.
For reference, the XML document looks something like this:
<root>
<element1>value1</element1>
<element2>value2</element2>
<element3>value3</element3>
<element4>..some more levels of data</element4>
</root>
The style sheet looks somewhat like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:include href="errorResponse.xsl"/>
<xsl:template match="root/element4">
<xsl:element name="myRoot">
<xsl:element name="myElement">
<xsl:apply-templates select="./someElement/someOtherElement"/>
</xsl:element>
</xsl:element>
</xsl:template>
The output XML string I am getting looks like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
value1
value2
value3
<myRoot><myElement> some data </myElement></myRoot>
The code snippet which I am using for transformation is this:
InputStream styleSheet = new FileUtil().getFileStream("xsltFileName");
StreamSource xslStream = new StreamSource(styleSheet);
DOMSource in = new DOMSource(inputXMLDoc);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
TransformerFactory transFact = TransformerFactory.newInstance();
transFact.setURIResolver(new XsltURIResolver());
Transformer trans = transFact.newTransformer(xslStream);
trans.transform(in, new StreamResult(baos));
System.out.println(baos.toString()); // displays the above output
However, the output is not in the desired format: I don't want value1, value2 and value3. They also cause problems later, when the newly generated XML is processed.
I have seen a lot of questions about transformations, and this has been bugging me for a long time. I would appreciate it a lot if someone could point out where I am going wrong.
Also point out if I am following any incorrect conventions during the entire process.
Thanks and regards.
You are getting that output because of the Default Template Rule, which outputs the text nodes. If you don't want those nodes you need to exclude them explicitly by matching them and replacing them with nothing (i.e. an empty template).
Try adding this template to your stylesheet:
<xsl:template match="/">
<xsl:apply-templates select="root/element4"/>
</xsl:template>
It matches the root and discards everything except for root/element4.
What happens here is that the XSLT built-in templates are applied to any node not matched explicitly by a template. The net effect of the built-in templates is to copy any text node (to which they are applied) to the output.
One of the simplest and shortest ways to suppress this unwanted output is to add the following template:
<xsl:template match="text()"/>
which causes any text-node for which this template is selected for execution, not to be copied to the output.
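The effect is easy to check with a small sketch that runs the same input through two variants of a stylesheet, one without and one with the empty text() template. The input and the `item` element are simplified stand-ins for the asker's document, not the real data.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class BuiltInTemplateDemo {
    // Simplified stand-in for the document in the question.
    static final String INPUT =
          "<root><element1>value1</element1><element2>value2</element2>"
        + "<element4><item>some data</item></element4></root>";

    // Builds the stylesheet, optionally adding the empty text() template.
    static String stylesheet(boolean suppressText) {
        return "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output omit-xml-declaration='yes'/>"
            + "<xsl:template match='root/element4'>"
            + "<myRoot><myElement><xsl:value-of select='item'/></myElement></myRoot>"
            + "</xsl:template>"
            // Overrides the built-in rule that copies text nodes to the output.
            + (suppressText ? "<xsl:template match='text()'/>" : "")
            + "</xsl:stylesheet>";
    }

    // Transforms INPUT with the chosen stylesheet variant.
    static String run(boolean suppressText) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet(suppressText))));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(INPUT)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("built-in rules only: " + run(false));
        System.out.println("text() suppressed:   " + run(true));
    }
}
```

The xsl:value-of inside the element4 template is unaffected by the empty text() template, because xsl:value-of reads the string value directly and does not apply templates.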

How to match and process unknown XML elements in XSLT 1.0?

I have a simple XSLT 1.0 stylesheet that turns XML documents into XHTML. I really want to be able to "include" the content of one XML file in another when needed. AFAIK it is simply not possible in XSLT 1.0, so I decided to move my processing to a simple Java app that would pre-process the XML, executing the "includes" recursively, and pass it to the default JDK XSLT processor. I have an XML schema that my documents must conform to.
The most used element is called "text", and can have an "id" and/or a "class" attribute, which gets used for XHTML styling with CSS. This element gets turned into "p", "div", or "span" depending on the context.
What I would like to add is the ability to define "unknown" elements in my input files, and have them transformed into a "text" element for further processing. If the "unknown" element's name starts with a capital letter, then it becomes a "text" with "id" set to the original name. Otherwise it becomes a "text" with "class" set to the original name. Everything else in the unknown element should be kept as-is, and then it should be processed by XSLT as if it had originally been in the input file. In other words, I would like to transform all unknown elements to form a valid XML document, and then process it with my stylesheet.
Can this be done in XSLT, possibly in a pre-processing "stylesheet", or should I do that as pre-processing in Java? Performance here is not important. I would prefer an XSLT solution, but not if it's much more complicated than doing it in Java.
Well, since no one answered, I just tried it. While it is easier to do in Java, it has one major drawback: since the code needs to know the valid elements in order to recognize the unknown ones, you end up hardcoding them and have to recompile if the XSLT template changes.
So, I tried in XSLT and it also works. Let's say you have:
<xsl:template match="text">
*processing*
<xsl:call-template name="id_and_class"/>
*processing*
</xsl:template>
where the template named id_and_class copies your id and classes attribute in the generated element, and you want unknown elements to be mapped to "text" elements, then you can do this:
<xsl:template match="text">
<xsl:call-template name="text_processing"/>
</xsl:template>
<xsl:template name="text_processing">
*processing*
<xsl:call-template name="text_id_and_class"/>
*processing*
</xsl:template>
...
<xsl:template name="text_id_and_class">
<xsl:choose>
<!-- If name() is not "text", then we have an unknown element. -->
<xsl:when test="name()!='text'">
<!-- Processing of ID and class omitted ... -->
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="id_and_class"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
...
<!-- MUST BE LAST : Process unknown elements like a "text" element. -->
<xsl:template match="*">
<xsl:call-template name="text_processing"/>
</xsl:template>
If you process the content of one specific element with a named template, then you can check in that template if the name matches, and use that for your special processing. Then you just have to put a <xsl:template match="*"> at the end of your stylesheet and call the named template from there.

How to put String text without converting content to xml file in Java?

I need to put String content to xml in Java. I use this kind of code to insert information in xml:
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File ("file.xml"));
DOMSource source = new DOMSource (doc);
Node cards = doc.getElementsByTagName ("cards").item (0);
Element card = doc.createElement ("card");
cards.appendChild(card);
Element question = doc.createElement("question");
question.appendChild(doc.createTextNode("This <b>is</b> a test."));
card.appendChild (question);
StreamResult result = new StreamResult (new File (file));
Transformer tf = TransformerFactory.newInstance().newTransformer();
tf.setOutputProperty(OutputKeys.INDENT, "yes");
tf.transform(source, result);
But the string is converted in the XML like this:
<cards>
<card>
<question>This &lt;b&gt;is&lt;/b&gt; a test.</question>
</card>
</cards>
It should be like this:
<cards>
<card>
<question>This <b>is</b> a test.</question>
</card>
</cards>
I tried to use a CDATA section instead, like this:
// I changed this code
question.appendChild(doc.createTextNode("This <b>is</b> a test."));
// to this
question.appendChild(doc.createCDATASection("This <b>is</b> a test."));
This code produces an XML file that looks like this:
<cards>
<card>
<question><![CDATA[This <b>is</b> a test.]]></question>
</card>
</cards>
I hope that somebody can help me to put String content in the xml file exactly with same content.
Thanks in advance!
This would be expected behaviour.
Consider if the brackets were kept as you put them, the end result would essentially be:
<cards>
<card>
<question>
This
<b>
is
</b>
a test.
</question>
</card>
</cards>
Basically, it would result in the <b> being an additional node in the XML tree. Encoding the brackets as &lt; and &gt; ensures that, when the document is read by any XML parser, the brackets are treated as text and not mistaken for an additional node.
If you really wanted them to display as you say you do, you will need to create elements named b. This will not only be awkward, it will also not display quite as you've written above - it would display as additional nested nodes as I've shown above. So you would need to amend the xml writer to output inline for those tags.
Nasty.
Check this solution: how to unescape XML in java
Maybe you could solve it in this way (code only for <question> tag part):
Element question = doc.createElement("question");
question.appendChild(doc.createTextNode("This "));
Element b = doc.createElement("b");
b.appendChild(doc.createTextNode("is"));
question.appendChild(b);
question.appendChild(doc.createTextNode(" a test."));
card.appendChild(question);
What you are effectively trying to do is to insert XML into the middle of a DOM without parsing it. You can't do this since the DOM APIs don't support it.
You have three choices:
You could serialize the DOM and then insert the String at the appropriate point. The end result may or may not be well-formed XML ... depending on what is in the String that you inserted.
You could create and insert DOM nodes representing the text and the <b>...</b> element. This requires you to know the XML structure of the stuff that you are inserting. @bluish's answer gives an example.
You could wrap the String in some container element, parse it using an XML parser to give a second DOM, find the nodes of interest, and add them to the original DOM. This requires that the String is well-formed XML when wrapped in the container element.
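The third choice can be sketched roughly as follows, assuming the String is well-formed once wrapped. The container element name `wrapper` and the method name `addCard` are invented for this example; `Document.importNode` does the copying from the second DOM into the first.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FragmentImportDemo {
    // Adds <card><question>markup</question></card> to the <cards> document,
    // keeping any elements inside the markup as real DOM nodes.
    static String addCard(String cardsXml, String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(cardsXml)));

            // Wrap the String in a container element so it parses as one document.
            Document fragment = builder.parse(
                new InputSource(new StringReader("<wrapper>" + markup + "</wrapper>")));

            Element question = doc.createElement("question");
            // importNode copies each child into the target document; the
            // fragment itself is left untouched, so sibling iteration is safe.
            for (Node n = fragment.getDocumentElement().getFirstChild();
                 n != null; n = n.getNextSibling()) {
                question.appendChild(doc.importNode(n, true));
            }

            Element card = doc.createElement("card");
            card.appendChild(question);
            doc.getElementsByTagName("cards").item(0).appendChild(card);

            Transformer tf = TransformerFactory.newInstance().newTransformer();
            tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tf.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addCard("<cards/>", "This <b>is</b> a test."));
    }
}
```

If the String is not well-formed (for example an unclosed `<b>`), the second parse fails, so the exception should be handled in whatever way suits the application.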
Or, since you're already using a Transformation, why not go all the way?
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="cards">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
<card>
<question>This <b>is</b> a test</question>
</card>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
