This question is a follow-up to my earlier question:
Creating a valid XSD that is open using <all> and <any> elements
Given that I have a Java String containing an XML document of the following form:
<TRADE>
<TIME>12:12</TIME>
<MJELLO>12345</MJELLO>
<OPTIONAL>12:12</OPTIONAL>
<DATE>25-10-2011</DATE>
<HELLO>hello should be ignored</HELLO>
</TRADE>
How can I use XSLT or something similar (in Java, alongside JAXB) to remove all elements that are not in a given set of element names?
In the above example I am only interested in (TIME, OPTIONAL, DATE), so I would like to transform it into:
<TRADE>
<TIME>12:12</TIME>
<OPTIONAL>12:12</OPTIONAL>
<DATE>25-10-2011</DATE>
</TRADE>
The order of the elements is not fixed.
This transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="pNames" select="'|TIME|OPTIONAL|DATE|'"/>
<xsl:template match="node()|@*" name="identity">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*/*">
<xsl:if test="contains($pNames, concat('|', name(), '|'))">
<xsl:call-template name="identity"/>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
when applied on the provided XML document:
<TRADE>
<TIME>12:12</TIME>
<MJELLO>12345</MJELLO>
<OPTIONAL>12:12</OPTIONAL>
<DATE>25-10-2011</DATE>
<HELLO>hello should be ignored</HELLO>
</TRADE>
produces the wanted, correct result:
<TRADE>
<TIME>12:12</TIME>
<OPTIONAL>12:12</OPTIONAL>
<DATE>25-10-2011</DATE>
</TRADE>
Explanation:
The identity rule (template) copies every node "as-is".
The identity rule is overridden by a template that matches any element with a parent element (*/*), that is, any element other than the top element of the document. Inside this template, the element is copied only if its name appears in the external parameter $pNames, a pipe-delimited string of wanted names.
See the documentation of your XSLT processor on how to pass a parameter to a transformation -- this is implementation-dependent and differs from processor to processor.
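For the Java case raised in the question, the standard JAXP API (javax.xml.transform) passes a parameter with Transformer.setParameter. The following is a minimal sketch, not a definitive implementation: the class name ElementFilter and the method filter are hypothetical, and the stylesheet is embedded as a string only to keep the example self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper; the class and method names are illustrative only.
final class ElementFilter {

    // The stylesheet from the answer above, embedded as a string.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes' indent='yes'/>"
      + "<xsl:strip-space elements='*'/>"
      + "<xsl:param name='pNames' select=\"'|TIME|OPTIONAL|DATE|'\"/>"
      + "<xsl:template match='node()|@*' name='identity'>"
      + "<xsl:copy><xsl:apply-templates select='node()|@*'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='*/*'>"
      + "<xsl:if test=\"contains($pNames, concat('|', name(), '|'))\">"
      + "<xsl:call-template name='identity'/>"
      + "</xsl:if>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Keeps only the child elements whose names occur in the
    // pipe-delimited list, for example "|TIME|DATE|".
    static String filter(String xml, String pipeDelimitedNames) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            // Overrides the default value of the global xsl:param "pNames".
            transformer.setParameter("pNames", pipeDelimitedNames);
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<TRADE><TIME>12:12</TIME><MJELLO>12345</MJELLO>"
                   + "<OPTIONAL>12:12</OPTIONAL><DATE>25-10-2011</DATE>"
                   + "<HELLO>hello should be ignored</HELLO></TRADE>";
        System.out.println(filter(xml, "|TIME|OPTIONAL|DATE|"));
    }
}
```

Because the wanted names are a parameter, the same compiled stylesheet can be reused with a different set of names per call.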
I haven't tried it yet, but maybe the javax.xml.transform package can help:
http://download.oracle.com/javase/6/docs/api/javax/xml/transform/package-summary.html
JAXB & XSLT
JAXB integrates very cleanly with XSLT. For an example, see:
How to get jaxb to Ignore certain data during unmarshalling
Your Other Question
Based on your previous question (see link below), the transform is really unnecessary as JAXB will just ignore attributes and elements that are not mapped to fields/properties in your domain object.
Creating a valid XSD that is open using <all> and <any> elements
I am trying to extract the last 4 numbers of the "red" sibling with xpath.
The source xml looks like:
...
<node2>
<key><![CDATA[RED]]></key>
<value><![CDATA[98472978241908]]></value>
... more key value pairs here...
</node2>
...
And when I use the following xpath:
/nodelevelX/nodelevelY/node2/key[text()='RED']/following-sibling::value
I get the full number in the output. Then I tried to extract the digits with this xpath expression:
/nodelevelX/nodelevelY/node2/key[text()='RED']/following-sibling::value/text()[substring(., string-length(.)-4)]
I still get the full number. The substring function does not seem to work.
my xsl header is:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
I think there is a small error, but I cannot see where. I followed many discussions on SO and elsewhere (w3schools) and tried to follow the advice, without success.
UPDATE: The context:
I use the following identity template, which copies all the nodes from my source XML to the destination XML,
and then I apply specific rules for some nodes inside an xsl:template:
<!-- This copies the whole source XML to the destination -->
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*" />
</xsl:copy>
</xsl:template>
<!-- specific rules for some nodes -->
<xsl:template match="/nodeDetails">
<mynewnode>
<!-- here I take the whole value and it's working -->
<someVal><xsl:value-of select="/nodeDetails/nodeX/key[text()='ANOTHER_KEY']/following-sibling::value" /></someVal>
<!-- FIXME substring does not work now -->
<redVal><xsl:value-of select="/nodeDetails/nodeX/key[text()='RED']/following-sibling::value/text()[substring(.,string-length(.)-4)]" /></redVal>
</mynewnode>
</xsl:template>
And for the transformation I use the following code from a junit class in Java (JDK 6):
@Test
public void transformXml() throws TransformerException {
TransformerFactory factory = TransformerFactory.newInstance();
Source xslt = new StreamSource(getClass().getResourceAsStream("contract.xsl"));
Transformer transformer = factory.newTransformer(xslt);
Source input = new StreamSource(getClass().getResourceAsStream("source.xml"));
Writer output = new StringWriter();
transformer.transform(input, new StreamResult(output));
System.out.println("output=" + output.toString());
}
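Your current XPath evaluates to a node-set, but what you need is a string. A predicate such as [substring(...)] only filters nodes: any non-empty string counts as true, so the text node is selected whole. Also note that string-length(.) - 4 would give the last five characters, not four. Please try something like this: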
<xsl:variable name="value"
select="/nodelevelX/nodelevelY/node2/key[. = 'RED']
/following-sibling::value[1]" />
<xsl:value-of select="substring($value, string-length($value) - 3)" />
Though to be sure about an answer, I'd need to see the portion of your XSLT where you are trying to output this value.
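The same substring call can be checked outside the stylesheet with the JDK's XPath 1.0 API (javax.xml.xpath). This is only a sketch under assumptions: the class name LastFourDigits is hypothetical, and the sample document uses plain text rather than CDATA sections.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper; the class and method names are illustrative only.
final class LastFourDigits {

    // Path to the <value> that directly follows the RED key.
    private static final String VALUE_PATH =
        "/nodelevelX/nodelevelY/node2/key[. = 'RED']/following-sibling::value[1]";

    static String lastFour(String xml) {
        // substring(s, string-length(s) - 3) starts at the fourth character
        // from the end, so it returns the last four characters.
        String expr = "substring(" + VALUE_PATH
                    + ", string-length(" + VALUE_PATH + ") - 3)";
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<nodelevelX><nodelevelY><node2>"
                   + "<key>GREEN</key><value>0123456789</value>"
                   + "<key>RED</key><value>98472978241908</value>"
                   + "</node2></nodelevelY></nodelevelX>";
        System.out.println(lastFour(xml)); // prints 1908
    }
}
```

The value has 14 characters, so the start position is 14 - 3 = 11 and the result covers positions 11 to 14.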
Use this XPath 2.0 expression:
/nodelevelX/nodelevelY/node2/key[text()='RED']
/following-sibling::*[1][self::value]
/substring(., string-length() -3)
XSLT 2.0 - based verification:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select=
"/nodelevelX/nodelevelY/node2/key[text()='RED']
/following-sibling::*[1][self::value]
/substring(., string-length() -3)"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the following XML document:
<nodelevelX>
<nodelevelY>
<node2>
<key>GREEN</key>
<value>0123456789</value>
<key>RED</key>
<value>98472978241908</value>
<key>BLACK</key>
<value>987654321</value>
</node2>
</nodelevelY>
</nodelevelX>
the XPath expression is evaluated and the result of this evaluation is copied to the output:
1908
I want to do the following:
At the moment we receive some XML files in which some tags are filled in incorrectly.
To help our partner, we want to catch these wrong values by using a "pass-through" folder where all the XML files are placed before they are imported into our application.
This folder would be read every X minutes, and every file would need some checks, such as the length of the value within a tag, the value of the tag, etc.
Because this is only a temporary solution, we don't want to implement it in our application.
I was thinking of 2 possible set-ups:
Using Java and calling an XSLT file to transform every file and put it in another folder
Using only Java to check the XML file and do the transformation.
Both cases would be called by a .bat that runs every X minutes.
Now my questions:
Which do you think would be the best solution, i.e. the quickest, the most secure, etc. (or maybe something other than what I suggested)?
Could you also provide some examples of how to do something like this?
I'm not asking strictly for the code. If you can give me something similar, I can build it on my own.
At the time of writing, I'm already looking for solutions on other websites, but because it is urgent, it's also helpful to ask the community.
Thank you for your answer,
Kind regards,
Maarten
EDIT : Both answers helped me a lot. Thank you guys.
http://www.ibm.com/developerworks/xml/library/x-javaxmlvalidapi/index.html
or
http://www.java-tips.org/java-se-tips/javax.xml.validation/how-to-create-xml-validator-from-xml-s.html
or
http://docs.oracle.com/javase/1.5.0/docs/api/javax/xml/validation/package-summary.html
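The javax.xml.validation package linked above can express checks such as "the value in this tag is at most N characters long" as an XML Schema restriction. The sketch below is only an illustration: the order and code element names, the five-character limit, and the class name OrderCheck are all assumptions, not taken from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical checker; element names and the length limit are assumptions.
final class OrderCheck {

    // Schema: an <order> element with one <code> child of at most 5 characters.
    private static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:element name='code'><xs:simpleType>"
      + "<xs:restriction base='xs:string'><xs:maxLength value='5'/></xs:restriction>"
      + "</xs:simpleType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    private static Schema compileSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema could not be compiled", e);
        }
    }

    // Returns true if the document satisfies the schema, false otherwise.
    static boolean isValid(String xml) {
        Validator validator = compileSchema().newValidator();
        try {
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException("Could not read the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<order><code>ABC12</code></order>"));
        System.out.println(isValid("<order><code>ABC123</code></order>"));
    }
}
```

A validator like this only reports problems. Correcting the values would still need a transformation step such as XSLT.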
If you want to run your XSLT, using a .bat script, on every XML file in a given folder (your first option in the OP) I can think of 3 ways:
A. Basically do a "for" loop to process each individual file via the command line. (Eww.)
B. Use collection() to point to an input folder and use xsl:result-document to create the output files in a new folder.
Here's an example XSLT 2.0 (tested with Saxon 9):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="pInputDir" select="'input'"/>
<xsl:param name="pOutputDir" select="'output'"/>
<xsl:variable name="vCollection" select="collection(concat($pInputDir,'/?*.xml'))"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<xsl:for-each select="$vCollection">
<xsl:variable name="vOutFile" select="tokenize(document-uri(document(.)),'/')[last()]"/>
<xsl:result-document href="{concat($pOutputDir,'/',$vOutFile)}">
<xsl:apply-templates/>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Notes:
This stylesheet is just doing an identity transform. It's passing the XML through unchanged. You would need to override the identity template by adding new templates to do your checks/changes.
Also notice that there are 2 parameters for the input and output folder names.
You may run into memory issues using collection() because it loads all of the XML files in the folder into memory. If this is an issue, see below...
C. Have your XSLT process a list of all the files in the directory. Use a combination of document() and the Saxon extension function saxon:discard-document() to load and discard the documents.
Here's an example I used a while back for testing.
XML file listing (input to the XSLT):
<files>
<file>file:///C:/input_xml/file1.xml</file>
<file>file:///C:/input_xml/file2.xml</file>
<file>file:///C:/input_xml/file3.xml</file>
<file>file:///C:/input_xml/file4.xml</file>
<file>file:///C:/input_xml/file5.xml</file>
<file>file:///C:/input_xml/file6.xml</file>
<file>file:///C:/input_xml/file7.xml</file>
<file>file:///C:/input_xml/file8.xml</file>
<file>file:///C:/input_xml/file9.xml</file>
<file>file:///C:/input_xml/file10.xml</file>
</files>
XSLT 2.0 (tested with Saxon 9):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="pOutputDir" select="'output'"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="files">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="file">
<xsl:variable name="vOutFile" select="tokenize(document-uri(document(.)),'/')[last()]"/>
<xsl:result-document href="{concat($pOutputDir,'/',$vOutFile)}">
<xsl:apply-templates select="document(.)/saxon:discard-document(.)" xmlns:saxon="http://saxon.sf.net/"/>
</xsl:result-document>
</xsl:template>
</xsl:stylesheet>
Notes:
Again, this stylesheet is just doing an identity transform. It's passing the XML through unchanged. You would need to override the identity template by adding new templates to do your checks/changes.
Also notice that there is only a parameter for the output folder name.
I'm trying to validate an XML file, but I get the following error:
Can not find declaration of element 'xsl:stylesheet'.
This is the XML:
<?xml version='1.0' encoding='utf-8'?>
<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform' xmlns:msxsl='urn:schemas-microsoft-com:xslt' exclude-result-prefixes='msxsl' xmlns:ns='http://www.ibm.com/wsla'>
<xsl:strip-space elements='*'/>
<xsl:output method='xml' indent='yes'/>
<xsl:template match='@* | node()'>
<xsl:copy>
<xsl:apply-templates select='@* | node()'/>
</xsl:copy>
</xsl:template>
<xsl:template match="/ns:SLA/ns:ServiceDefinition/ns:WSDLSOAPOperation/ns:SLAParameter/@name[.='TotalMemoryConsumption']">
<xsl:attribute name='{name()}'>
<xsl:text>MemConsumption</xsl:text>
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
Where is the mistake?
EDIT: I want to parse this XML in Java with SAX, but I get the following error:
Element type "xsl:template" must be followed by either attribute specifications, ">" or "/>".
How to get rid of it?
Assuming you are actually trying to validate your XSL as an XML document, it looks like that website requires you to point to a schema or DTD in order to validate the XML against it. You can get a non-normative schema here: http://www.w3.org/TR/xslt20/#schema-for-xslt. Here are instructions on how to reference a schema from an XML file: http://www.ibm.com/developerworks/xml/library/x-tipsch.html
You could also check "Well-Formedness only," and check the document for well-formedness, if not actually validity.
Generally, any XSL engine will report any errors in your XSL document, so you don't need to validate it separately.
Your XSL is OK, don't worry. It's just that there is no normative DTD/XSD for XSLT 1.0, so no one bothers checking XSLT 1.0 stylesheets for validity. Well-formedness is enough.
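Since the question mentions parsing in Java, a well-formedness check needs nothing more than a non-validating parse. Here is a minimal sketch using the JDK's DOM parser; the class name WellFormedCheck is hypothetical, and this does not diagnose the specific SAX error quoted in the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; the class and method names are illustrative only.
final class WellFormedCheck {

    // Returns true if the text parses as XML, without validating against a DTD or schema.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // stylesheets rely on the xsl: namespace
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and ignores warnings,
            // which avoids the parser's default error printing.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("Parser could not be set up or read", e);
        }
    }

    public static void main(String[] args) {
        String xsl = "<xsl:stylesheet version='1.0' "
                   + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
                   + "<xsl:template match='/'/></xsl:stylesheet>";
        System.out.println(isWellFormed(xsl));
        System.out.println(isWellFormed("<a><b></a>"));
    }
}
```

A stylesheet that passes this check can still contain XSLT errors. Those are reported when the XSLT processor compiles it.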
I am having an issue with special characters in my XML.
Basically, I am splitting up an XML document into multiple XML documents using the Xalan processor.
When splitting the documents up, I use the value of the name tag as the name of the generated file. The problem is that the name contains characters that aren't recognized by the XML processor, like ™ (TM) and ® (R). I want to remove those characters ONLY when naming the files.
<xsl:template match="products">
<redirect:write select="concat('..\\xml\\product\\en\\',translate(string(name),'&lt;/> ',''),'.xml')">
The above is the XSL code I have written to split the XML into multiple XMLs. As you can see, I am using the translate function to replace '/', '<' and '>' with '' in the name. I was hoping I could do the same with ™ (TM) and ® (R), but it doesn't seem to work.
Please advise me on how I can do that.
Thanks for your help in advance.
I don't have Xalan, but with 8 other XSLT processors this transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="text()">
<xsl:value-of select="translate(., '&lt;/>™®', '')"/>
===================
<xsl:value-of select="translate(., '&lt;/>&#x2122;&#xAE;', '')"/>
</xsl:template>
</xsl:stylesheet>
when applied on this XML document:
<t>XXX™ My Trademark®</t>
produces the wanted result:
XXX My Trademark
===================
XXX My Trademark
I suggest that you try to use one of the two expressions above -- at least the second may work successfully.
Following Dimitre's answer: if you are not sure which special characters could appear in the name, maybe you should instead keep only the characters you consider legal in a document's name.
As example:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="text()">
<xsl:value-of select="translate(.,
translate(.,
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ',
''),
'')"/>
</xsl:template>
</xsl:stylesheet>
With input:
<t>XXX™ My > Trademark®</t>
Result:
XXX My Trademark
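The nested translate() works in two steps: the inner call deletes every allowed character and so leaves only the unwanted ones, and the outer call deletes exactly those from the original string. The same expression can be tried from Java with the JDK's XPath API. This is a sketch only, and the class name FileNameSanitizer is hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper; the class and method names are illustrative only.
final class FileNameSanitizer {

    // Characters considered legal, as in the stylesheet above.
    private static final String ALLOWED =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ";

    // Returns the text of the <t> element with every non-allowed character removed.
    static String keepAllowed(String xml) {
        // Inner translate(): strip the allowed characters, leaving the unwanted ones.
        // Outer translate(): strip those unwanted characters from the original text.
        String expr = "translate(/t, translate(/t, '" + ALLOWED + "', ''), '')";
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // \u2122 is the trademark sign, \u00AE the registered sign.
        String xml = "<t>XXX\u2122 My &gt; Trademark\u00AE</t>";
        System.out.println(keepAllowed(xml));
    }
}
```

Note that the spaces on either side of the removed > both survive, so the result contains two consecutive spaces at that point.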
I'm working with a DSL based on an XML schema that supports functional language features such as loops, variable state with context, and calls to external Java classes. I'd like to write a tool which takes the XML document and converts it to, at the very least, something that looks like Java, where the <set> tags get converted to variable assignments, loops get converted to for loops, and so on.
I've been looking into ANTLR as well as standard XML parsers, and I'm wondering whether there's a recommended way to go about this. Can such an XML document be converted to something that's convertible to Java, if not directly?
I'm willing to write the parsing through SAX that writes an intermediate language based on each tag, if that's the recommended way, but the part that's giving me pause is the fact that it's context-based in the same way a language like Scheme is, with child elements of any tag being fully evaluated before the parent.
You can do it with XSLT: write templates that generate the code snippets you need.
(remember to set the output format to plain text)
EDIT: Sample XSLT script
Input - a.xml:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="b.xsl"?>
<set name='myVar'>
<concat>
<s>newText_</s>
<ref>otherVar</ref>
</concat>
</set>
Script - b.xsl:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="*"/>
<xsl:output method="text" />
<xsl:template match="set">
<xsl:value-of select="@name"/>=<xsl:apply-templates/>
</xsl:template>
<xsl:template match="concat">
<xsl:for-each select="*">
<xsl:if test="position() > 1">+</xsl:if>
<xsl:apply-templates select="."/>
</xsl:for-each>
</xsl:template>
<xsl:template match="ref">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="s">
<xsl:text>"</xsl:text>
<xsl:apply-templates/>
<xsl:text>"</xsl:text>
</xsl:template>
</xsl:stylesheet>
Note that a.xml contains a processing instruction that lets XSLT-capable browsers render it with the stylesheet b.xsl. Firefox is such a browser. Open a.xml in Firefox and you will see
myVar="newText_"+otherVar
Note that XSLT is a quite capable programming language, so there is a lot you can do.
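For a tool rather than a browser, the same stylesheet can be run from Java through the standard JAXP API. The sketch below embeds b.xsl as a string so it is self-contained; the class name XmlToCode and the method generate are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper; the class and method names are illustrative only.
final class XmlToCode {

    // b.xsl from the answer above, embedded as a string.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:strip-space elements='*'/>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='set'>"
      + "<xsl:value-of select='@name'/>=<xsl:apply-templates/>"
      + "</xsl:template>"
      + "<xsl:template match='concat'><xsl:for-each select='*'>"
      + "<xsl:if test='position() &gt; 1'>+</xsl:if>"
      + "<xsl:apply-templates select='.'/>"
      + "</xsl:for-each></xsl:template>"
      + "<xsl:template match='ref'><xsl:apply-templates/></xsl:template>"
      + "<xsl:template match='s'>"
      + "<xsl:text>\"</xsl:text><xsl:apply-templates/><xsl:text>\"</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Converts a DSL document into Java-like source text.
    static String generate(String dslXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(dslXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String dsl = "<set name='myVar'><concat>"
                   + "<s>newText_</s><ref>otherVar</ref>"
                   + "</concat></set>";
        System.out.println(generate(dsl)); // prints myVar="newText_"+otherVar
    }
}
```

Because apply-templates descends into child elements before the enclosing template finishes its output, nested expressions are emitted inside-out in the right textual order, which fits the evaluation model described in the question.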