XPATH: select subset of xml file - java

In my case, I have:
<booklist>
<book id="1">
</book>
<book id="2">
</book>
<book id="3">
</book>
......
</booklist>
How can I return just:
<booklist>
<book id="1">
</book>
</booklist>
If I use /booklist/book[@id=1], I only get
<book id="1">
</book>
But I also need the document element.
Thanks

Rather than selecting the element that you do want, try excluding the elements that you don't want.
If you are just using XPath, this will select all of the elements except for the book elements whose @id is not equal to 1 (i.e. <booklist><book id="1" /></booklist>).
//*[not(self::book[@id!='1'])]
If you want an XSLT solution, this stylesheet has an empty template that matches all of the <book> elements that do not have @id="1", which prevents them from being copied into the output.
Everything else (the document element <booklist> and <book id="1">) will match the identity template, which copies it forward.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!--Empty template to prevent book elements
that do not have @id="1" from being
copied into the output -->
<xsl:template match="book[@id!='1']" />
<!--identity template to copy all nodes and attributes to output -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
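Since the question is tagged Java, here is a minimal sketch of how a stylesheet like the one above could be applied with the JDK's built-in javax.xml.transform API. The class name BookListFilter and the method keepFirstBook are made up for illustration, and the stylesheet is embedded as a string only to keep the example self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper; the names are not from the original post.
final class BookListFilter {

    // Identity template plus an empty template that suppresses
    // every book whose id attribute is not 1.
    private static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match=\"book[@id!='1']\"/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Returns the transformed document as a string.
    static String keepFirstBook(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<booklist><book id=\"1\"></book><book id=\"2\"></book>"
                   + "<book id=\"3\"></book></booklist>";
        System.out.println(keepFirstBook(xml));
    }
}
```

In a real program the stylesheet would more likely be loaded from a file, and the compiled form (a Templates object) reused across calls.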

How can I return just:
<booklist>
<book id="1">
</book>
</booklist>
XPath is a query language. Evaluating an XPath expression cannot change the structure of the XML document.
This is why the answer is: No, with XPath this is not possible!
Whenever you want to transform an XML document (which is exactly the case here), probably the best solution is to use XSLT, a language designed especially for processing and transforming tree-structured data.
Here is a very simple XSLT solution:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="book[not(@id=1)]"/>
</xsl:stylesheet>
When this transformation is applied to the provided XML file, the wanted, correct result is produced:
<booklist>
<book id="1"/>
</booklist>

When you try to select a sub-element, only this will be returned.

Related

Most performative way to go through an XML transformation - Java

I'm new to Java, and I would like the community's opinion.
I have a huge XML file, approximately 140 MB, that contains a lot of information.
Much of this information is no longer valid, so I need to filter it and keep only the valid parts. To do this I need to cross-check information between nodes to decide whether a deletion is needed. In some cases, the entire parent (main) node needs to be deleted.
I'm already doing this with a DOM parser, using loops: inside the loops I store values in variables, cross-check the information, and delete either the current node or the entire parent node.
Basically, the structure is like this:
<source>
<main>
<id>98567</id>
<block_information>
<name>Block A</name>
<start_date>20120210</start_date>
<end_date>20150210</end_date>
</block_information>
<block_information>
<name>Block A.01</name>
<start_date>20150210</start_date>
<end_date>20251005</end_date>
</block_information>
<city_information>
<name>Manchester</name>
<start_date>20150210</start_date>
<end_date>20150212</end_date>
</city_information>
<city_information>
<name>New Manchester</name>
<start_date>20150212</start_date>
<end_date>20251005</end_date>
</city_information>
<phone>
<type>C</type>
<number>987466321</number>
<name></name>
</phone>
<phone>
<type>P</type>
<number>36547821</number>
<name></name>
</phone>
</main>
<main>
<id>19587</id>
<block_information>
<name>Che</name>
<start_date>20090210</start_date>
<end_date>20100210</end_date>
</block_information>
<block_information>
<name></name>
<start_date>20100210</start_date>
<end_date>20351005</end_date>
</block_information>
<city_information>
<name></name>
<start_date>20150210</start_date>
<end_date>20150212</end_date>
</city_information>
<city_information>
<name>No Name</name>
<start_date>20150212</start_date>
<end_date>20191005</end_date>
</city_information>
<phone>
<type>C</type>
<number>987466321</number>
<name>Mom</name>
</phone>
<phone>
<type>P</type>
<number>36547821</number>
<name></name>
</phone>
</main>
</source>
The output is like this:
<result>
<main>
<id>98567</id>
<block_name>Block A.01</block_name>
<city_name>New Manchester</city_name>
<cellphone></cellphone>
<phone>36547821</phone>
<contact_phone></contact_phone>
<contact_phone_name></contact_phone_name>
</main>
</result>
For the information to appear in the result, there must be exactly one valid <block_information> and one valid <city_information> (<start_date> earlier than the current date and <end_date> later than the current date), and the <name> is required for both.
If there are none, or more than one valid, the <main> is deleted.
For the phone number, <type> is 'C' for contact, 'P' for personal phone, and 'M' for mobile. If the <type> is 'C' but there is no value in <name>, the phone does not go to the result. 'P' goes to <phone> and 'M' goes to <cellphone>.
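To make the date rule above concrete, here is a small Java sketch of the validity check for a single <block_information> or <city_information> entry. The class and method names are invented for illustration, and passing the current date as a parameter is just a choice to keep the check testable.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; the names are not from the original post.
final class ValidityRule {

    // The sample dates use the compact yyyyMMdd form, e.g. 20150210.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.BASIC_ISO_DATE;

    // An entry is valid when it has a non-blank name and
    // start_date < today < end_date.
    static boolean isValid(String name, String startDate, String endDate, LocalDate today) {
        if (name == null || name.trim().isEmpty()) {
            return false;
        }
        LocalDate start = LocalDate.parse(startDate, FORMAT);
        LocalDate end = LocalDate.parse(endDate, FORMAT);
        return start.isBefore(today) && end.isAfter(today);
    }

    public static void main(String[] args) {
        System.out.println(isValid("Block A.01", "20150210", "20251005", LocalDate.now()));
    }
}
```

A <main> would then be kept only when exactly one block entry and exactly one city entry pass this check.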
I would like your thoughts on the best-performing way to do this, in a way that is also easy for anyone to adjust later if needed.
Thanks in advance for the input!
As asked by @kjhughes, I put some values in the sample XML and added some of the filters I need to apply. Thanks!
PS: the XML structure used as an example is far simpler than the actual one, which has many more complex types.
I would go with the following approach:
Find a library that lets you stream the XML (file or InputStream) and produce a Stream<Main>.
Process the Stream<Main> and filter each Main node according to your validation logic.
Depending on whether you are I/O-bound or CPU-bound, try a .parallel() stream to process it (read: test whether .parallel() helps you at all).
This should suffice for any sane performance requirements in the context of XML parsing (I guess?). Search for Java XML streaming and go from there (or maybe this Stack Overflow question can give some pointers).
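As a rough illustration of the streaming idea, here is a sketch using StAX (javax.xml.stream), which ships with the JDK. It only collects the <id> of each <main> without building a tree; it does not produce a Stream<Main> and does not apply the validation rules. The class name MainIdStreamer is made up, and the element layout (source/main/id) is taken from the sample in the question.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper; the names are not from the original post.
final class MainIdStreamer {

    // Reads the document forward-only and returns the text of every
    // <id> that is a direct child of a <main> under the root element.
    static List<String> readMainIds(Reader input) {
        List<String> ids = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(input);
            int depth = 0;          // current element nesting level
            boolean inMain = false; // true while inside a <main> element
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    if (depth == 2 && "main".equals(reader.getLocalName())) {
                        inMain = true;
                    } else if (inMain && depth == 3 && "id".equals(reader.getLocalName())) {
                        ids.add(reader.getElementText()); // consumes up to </id>
                        depth--;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    if (depth == 2) {
                        inMain = false;
                    }
                    depth--;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not read XML stream", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        String xml = "<source><main><id>98567</id></main><main><id>19587</id></main></source>";
        System.out.println(readMainIds(new StringReader(xml)));
    }
}
```

For the real filtering, each <main> could be materialized as a small object or DOM fragment at this point and validated on its own, so memory use stays proportional to one <main> rather than the whole file.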
XSLT is a transformation language that has existed since 1999 and now has three versions: 1.0, 2.0, and 3.0. The latest version was published as a W3C Recommendation in 2017 and is supported on the Java platform by Saxon 9.8 and later, available in the open-source HE edition on SourceForge and Maven. XSLT 1.0 is supported in the Oracle/Sun Java JRE through a built-in version of Apache Xalan.
So instead of using DOM you have the option of using XSLT. Here is an example using XSLT 3 (online at https://xsltfiddle.liberty-development.net/bFN1yab/0):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:mf="http://example.com/mf"
exclude-result-prefixes="#all"
version="3.0">
<xsl:output indent="yes"/>
<xsl:function name="mf:date" as="xs:date">
<xsl:param name="input-date" as="xs:string"/>
<xsl:sequence
select="xs:date(replace($input-date, '([0-9]{4})([0-9]{2})([0-9]{2})', '$1-$2-$3'))"/>
</xsl:function>
<xsl:function name="mf:select-valid-info" as="element()*">
<xsl:param name="infos" as="element()*"/>
<xsl:sequence
select="$infos[name/normalize-space()
and mf:date(start_date) lt current-date()
and mf:date(end_date) gt current-date()]"/>
</xsl:function>
<xsl:function name="mf:valid-main" as="xs:boolean">
<xsl:param name="main" as="element(main)"/>
<xsl:sequence
select="let $valid-blocks := mf:select-valid-info($main/block_information),
$valid-cities := mf:select-valid-info($main/city_information)
return count($valid-blocks) eq 1 and count($valid-cities) eq 1"/>
</xsl:function>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="main[not(mf:valid-main(.))]"/>
<xsl:template match="main[mf:valid-main(.)]">
<xsl:copy>
<xsl:apply-templates
select="id,
mf:select-valid-info(block_information)/name,
mf:select-valid-info(city_information)/name,
phone"/>
</xsl:copy>
</xsl:template>
<xsl:template match="block_information/name | city_information/name">
<xsl:element name="{substring-before(local-name(..), '_')}_name">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
<xsl:template match="main/phone[type = 'C']">
<contact_phone>
<xsl:value-of select="number[current()/normalize-space(name)]"/>
</contact_phone>
<contact_name>
<xsl:value-of select="name"/>
</contact_name>
</xsl:template>
<xsl:template match="main/phone[type = 'P']">
<phone>
<xsl:value-of select="number"/>
</phone>
</xsl:template>
<xsl:template match="main/phone[type = 'M']">
<cellphone>
<xsl:value-of select="number"/>
</cellphone>
</xsl:template>
</xsl:stylesheet>
I hope I have grasped the conditions for the main elements. I have not quite been able to understand the rules for the various phone data, but the code is meant as an example anyway.
Of course, performance depends very much on the implementation, but I think XSLT is a more structured and maintainable approach than DOM coding.
If you can afford it, you can also look into Saxon 9.8 or 9.9 EE, which supports streaming XSLT 3. With some rewrites of the above code, you could have an XSLT-based approach that streams forwards only through the huge document, materializing main elements as element nodes to transform. This keeps the memory footprint low because, unlike DOM or normal XSLT processing, it doesn't first parse the whole XML document into a complete in-memory tree:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:mf="http://example.com/mf"
exclude-result-prefixes="#all"
version="3.0">
<xsl:mode streamable="yes" on-no-match="shallow-copy"/>
<xsl:template match="source">
<xsl:copy>
<xsl:apply-templates select="main!copy-of()" mode="main"/>
</xsl:copy>
</xsl:template>
<xsl:output indent="yes"/>
<xsl:function name="mf:date" as="xs:date">
<xsl:param name="input-date" as="xs:string"/>
<xsl:sequence
select="xs:date(replace($input-date, '([0-9]{4})([0-9]{2})([0-9]{2})', '$1-$2-$3'))"/>
</xsl:function>
<xsl:function name="mf:select-valid-info" as="element()*">
<xsl:param name="infos" as="element()*"/>
<xsl:sequence
select="$infos[name/normalize-space()
and mf:date(start_date) lt current-date()
and mf:date(end_date) gt current-date()]"/>
</xsl:function>
<xsl:function name="mf:valid-main" as="xs:boolean">
<xsl:param name="main" as="element(main)"/>
<xsl:sequence
select="let $valid-blocks := mf:select-valid-info($main/block_information),
$valid-cities := mf:select-valid-info($main/city_information)
return count($valid-blocks) eq 1 and count($valid-cities) eq 1"/>
</xsl:function>
<xsl:mode name="main" on-no-match="shallow-copy"/>
<xsl:template match="main[not(mf:valid-main(.))]" mode="main"/>
<xsl:template match="main[mf:valid-main(.)]" mode="main">
<xsl:copy>
<xsl:apply-templates
select="id,
mf:select-valid-info(block_information)/name,
mf:select-valid-info(city_information)/name,
phone" mode="#current"/>
</xsl:copy>
</xsl:template>
<xsl:template match="block_information/name | city_information/name" mode="main">
<xsl:element name="{substring-before(local-name(..), '_')}_name">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
<xsl:template match="main/phone[type = 'C']" mode="main">
<contact_phone>
<xsl:value-of select="number[current()/normalize-space(name)]"/>
</contact_phone>
<contact_name>
<xsl:value-of select="name"/>
</contact_name>
</xsl:template>
<xsl:template match="main/phone[type = 'P']" mode="main">
<phone>
<xsl:value-of select="number"/>
</phone>
</xsl:template>
<xsl:template match="main/phone[type = 'M']" mode="main">
<cellphone>
<xsl:value-of select="number"/>
</cellphone>
</xsl:template>
</xsl:stylesheet>

Showing results from two different XML files in Java

So I am trying to pull results from two different XML files using XSLT in order to show a restaurant review. I have restaurant details in allRestaurants.xml and all of the reviews for these restaurants in allReviews.xml. I currently store a <restaurant_id> tag against each restaurant, and each review is associated with a specific restaurant, so it carries the same tag. I need to build a page that takes the restaurant with ID 1 and shows the reviews for that restaurant beneath it. The reviews are stored with exactly the same <restaurant_id>1</restaurant_id>, as shown below. Please help.
allRestaurants.xml
<restaurants>
<restaurant>
<restaurant_id>1</restaurant_id>
<name>The Jackaroo</name>
<street_address>107-109 Darlinghurst Road</street_address>
<postcode>2011</postcode>
<city>Sydney</city>
<state>NSW</state>
<country>Australia</country>
<email>info@jackaroo.com.au</email>
<telephone>93322244</telephone>
<stars>3</stars>
</restaurant>
<restaurant>
<restaurant_id>2</restaurant_id>
<name>Four Seasons restaurant Sydney</name>
<street_address>199 George Street</street_address>
<postcode>2000</postcode>
<city>Sydney</city>
<state>NSW</state>
<country>Australia</country>
<email>info@sydneyfourseasons.com.au</email>
<telephone>92503100</telephone>
<stars>5</stars>
</restaurant>
</restaurants>
allReviews.xml
<reviews>
<review id="1">
<restaurant_id>1</restaurant_id>
<author_id>1</author_id>
<headline>Clean Bare-Bones Hostel</headline>
<details>
Example text here
</details>
<rating>3</rating>
<date>1388782853</date>
</review>
<review id="2">
<restaurant_id>1</restaurant_id>
<author_id>3</author_id>
<headline>Wouldn't Recommend</headline>
<details>
Example text here
</details>
<rating>2</rating>
<date>1368748800</date>
</review>
<review id="3">
<restaurant_id>2</restaurant_id>
<author_id>2</author_id>
<headline>Overall I Enjoyed</headline>
<details>
Example text here
</details>
<rating>4</rating>
<date>1378788850</date>
</review>
</reviews>
I thought maybe merging them into one XML file like so would do the trick, but even then, I'm not sure where to start:
oneHotel.xml
<?xml-stylesheet type="text/xsl" href="oneHotel.xsl"?>
<list>
<entry name="allHotels.xml" />
<entry name="reviews.xml" />
</list>
This is as far as I got in the XSLT doc, and am drawing a massive blank. I don't even know where to start:
oneHotel.xsl
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:copy-of name="restaurant" select="document('allRestaurants.xml')
/restaurants/restaurant[restaurant_id=1]"/>
<xsl:copy-of name="reviews" select="document('allReviews.xml')
/reviews/review[restaurant_id=1]"/>
<xsl:template match="/">
<xsl:choose>
<xsl:when test="document('allRestaurants.xml')
/restaurants/restaurant[restaurant_id=1]"/>
<h2><xsl:value-of select="name"/></h2>
</xsl:choose>
<h2><xsl:value-of select="$restaurant/name"/></h2>
</xsl:template>
</xsl:stylesheet>
Try this as your starting point:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="UTF-8"/>
<xsl:param name="path-to-reviews" select="'allReviews.xml'"/>
<xsl:key name="review-by-restaurant-id" match="review" use="restaurant_id" />
<xsl:template match="/restaurants">
<html>
<body>
<h1>Restaurant Reviews</h1>
<xsl:apply-templates select="restaurant"/>
</body>
</html>
</xsl:template>
<xsl:template match="restaurant">
<h2>
<xsl:value-of select="name"/>
</h2>
<xsl:variable name="id" select="restaurant_id" />
<!-- switch context to lookup document in order to use key -->
<xsl:for-each select="document($path-to-reviews)">
<xsl:for-each select="key('review-by-restaurant-id', $id)">
<h3>
<xsl:value-of select="headline"/>
</h3>
<p>
<xsl:value-of select="details"/>
</p>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
This assumes that you are instructing your XSLT processor to process the allRestaurants.xml document and passing the path to the allReviews.xml document as a parameter.
You didn't tell us what you want your final result to look like, so I just made up a very basic page.
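If the join ends up being done in Java rather than in the stylesheet, the lookup that the key performs can be sketched with the JDK's XPath API. This is only an illustration of matching reviews to a restaurant by restaurant_id; the class name ReviewLookup is invented, and the review document is passed as a string to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; the names are not from the original post.
final class ReviewLookup {

    // Returns the headlines of all reviews whose restaurant_id equals the given id.
    static List<String> headlinesFor(String reviewsXml, String restaurantId) {
        try {
            Document reviews = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(reviewsXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind $id through a variable resolver instead of concatenating strings.
            xpath.setXPathVariableResolver(
                name -> "id".equals(name.getLocalPart()) ? restaurantId : null);
            NodeList hits = (NodeList) xpath.evaluate(
                "/reviews/review[restaurant_id = $id]/headline",
                reviews, XPathConstants.NODESET);
            List<String> headlines = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                headlines.add(hits.item(i).getTextContent());
            }
            return headlines;
        } catch (ParserConfigurationException | SAXException | IOException
                 | XPathExpressionException e) {
            throw new IllegalStateException("Review lookup failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<reviews><review id=\"1\"><restaurant_id>1</restaurant_id>"
                   + "<headline>Clean Bare-Bones Hostel</headline></review></reviews>";
        System.out.println(headlinesFor(xml, "1"));
    }
}
```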

XPATH expression to output parent node and matched child node excluding unmatched child node

Below is the XML
<?xml version="1.0" encoding="UTF-8"?>
<library>
<object>book</object>
<bookname>
<value>testbook</value>
<author>
<value>ABCD</value>
<category>
<value>story</value>
<price>
<dollars>200</dollars>
</price>
</category>
</author>
<author>
<value>EFGH</value>
<category>
<value>fiction</value>
<price>
<dollars>300</dollars>
</price>
</category>
</author>
</bookname>
</library>
I need the xpath expression to get the below output
<?xml version="1.0" encoding="UTF-8"?>
<library>
<object>book</object>
<bookname>
<value>testbook</value>
<author>
<value>ABCD</value>
<category>
<value>story</value>
<price>
<dollars>200</dollars>
</price>
</category>
</author>
</bookname>
</library>
But when I apply the XPath expression below, I get the entire input XML as the transformed output. Instead, I need only the parent nodes plus the child node matching author/value='ABCD' (as shown above).
<xsl:copy-of select="/library/object[text()='book']/../bookname/value[text()='testbook']/../author/value[text()='ABCD']/../../.."/>
Please help me with the correct xpath expression to get the desired output.
I'm using a Java program to evaluate the XPath expression to get my desired XML output, so I need an XPath expression. Below is my Java code:
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("books.xml");
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("/library/object[text()='book']/../bookname/value[text()='testbook']/../author/value[text()='ABCD']/../../..");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
Please help me with a correct solution, either in Java or XSLT.
You cannot do this in pure XPath.
This stylesheet will do what you want in XSLT 2.0:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<!-- Identity template -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="author[not(value eq 'ABCD')]"/>
</xsl:stylesheet>
This stylesheet will do what you want in XSLT 1.0:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<!-- Identity template -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="author[not(value = 'ABCD')]"/>
</xsl:stylesheet>
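For a Java-only route, one option is to use XPath to select the nodes you do not want and then remove them from the DOM, since XPath itself can only select. The following is a sketch under that assumption; the class name AuthorPruner and its method are made up for illustration, and the XML is passed in as a string rather than read from books.xml.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; the names are not from the original post.
final class AuthorPruner {

    // Parses the XML and removes every <author> whose <value> differs from keepValue.
    static Document prune(String xml, String keepValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setXPathVariableResolver(
                name -> "keep".equals(name.getLocalPart()) ? keepValue : null);
            // Select the unwanted nodes; the result is a snapshot, so removing
            // nodes while iterating over it is safe.
            NodeList unwanted = (NodeList) xpath.evaluate(
                "//author[not(value = $keep)]", doc, XPathConstants.NODESET);
            for (int i = 0; i < unwanted.getLength(); i++) {
                Node node = unwanted.item(i);
                node.getParentNode().removeChild(node);
            }
            return doc;
        } catch (ParserConfigurationException | SAXException | IOException
                 | XPathExpressionException e) {
            throw new IllegalStateException("Pruning failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = prune(
            "<library><bookname><author><value>ABCD</value></author>"
          + "<author><value>EFGH</value></author></bookname></library>", "ABCD");
        System.out.println(doc.getElementsByTagName("author").getLength());
    }
}
```

The pruned Document can then be serialized with a Transformer if the result is needed as text.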

Create xmlns attribute in the XML using XSLT Transformation

I am trying to add the xmlns attribute to the resulting XML with a value passed by parameter during XSLT transformation using JDK Transformer (Oracle XML v2 Parser or JAXP) but it always defaults to http://www.w3.org/2000/xmlns/
My source XML
<test/>
My XSLT
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://example.com">
<xsl:param name="myNameSpace" select="'http://neilghosh.com'"/>
<xsl:template match="/">
<process>
<xsl:attribute name="xmlns:neil">
<xsl:value-of select="$myNameSpace"/>
</xsl:attribute>
</process>
</xsl:template>
</xsl:stylesheet>
My Result
<?xml version="1.0"?>
<process xmlns="http://www.w3.org/2000/xmlns/" xmlns:neil="neilghosh.com">
</process>
My Desired Result
<?xml version="1.0"?>
<process xmlns="http://example.com" xmlns:neil="neilghosh.com">
</process>
Firstly, in the XSLT data model, you don't want to create an attribute node, you want to create a namespace node.
Namespace nodes are usually created automatically: if you create an element or attribute in a particular namespace, the requisite namespace node (and hence, when serialized, the namespace declaration) are added automatically by the processor.
If you want to create a namespace node that isn't necessary (because it's not used in the name of any element or attribute) then in XSLT 2.0 you can use xsl:namespace. If you're stuck with XSLT 1.0 then there's a workaround, that involves creating an element in the relevant namespace and then copying its namespace node:
<xsl:variable name="ns">
<xsl:element name="neil:dummy" namespace="{$param}"/>
</xsl:variable>
<process>
<xsl:copy-of select="$ns/*/namespace::neil"/>
</process>
Michael Kay provided you with the correct answer, but based on your comments, you aren't sure how to use it in your transformation.
Here is a complete transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://exslt.org/common" exclude-result-prefixes="ext">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:param name="pNamespace" select="'neilghosh.com'"/>
<xsl:variable name="vDummy">
<xsl:element name="neil:x" namespace="{$pNamespace}"/>
</xsl:variable>
<xsl:template match="/*">
<xsl:element name="process" namespace="http://example.com">
<xsl:copy-of select="namespace::*"/>
<xsl:copy-of select="ext:node-set($vDummy)/*/namespace::*[.=$pNamespace]"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<test/>
the wanted, correct result is produced:
<process xmlns="http://example.com" xmlns:neil="neilghosh.com" />
Namespace declarations in XML are not attributes even though they look like attributes. In XSLT 2.0 you can use <xsl:namespace name="neil" select="$myNameSpace" /> to add a namespace declaration to the result tree dynamically but that feature is not available in XSLT 1.0.
Don't try to create "xmlns" attributes yourself. Create the namespaces in the XSLT and they will be done automatically.
This XSLT works (tested with Saxon 9.4):
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:neil="neilghosh.com"
xpath-default-namespace="http://example.com"
xmlns="http://example.com" version="2.0">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:param name="myDynamicNamespace" select="'http://neilghosh.com'"/>
<xsl:template match="/">
<xsl:element name="process">
<xsl:namespace name="neil" select="$myDynamicNamespace"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
And gives the following output:
<?xml version="1.0" encoding="UTF-8"?>
<process xmlns="http://example.com" xmlns:neil="http://neilghosh.com"/>
I finally found a workaround that works with my XSLT processor (Oracle XML V2 Parser).
I had to transform to a DOM Document and then persist that DOM to the file system instead of outputting directly to a StreamResult.
I used a DOMResult in the transform method.
The following XSLT fragment worked, but there was an extra xmlns:xmlns="http://www.w3.org/2000/xmlns/", which was probably absorbed by the Document and did not appear in the final output when I persisted it to the file system.
<process>
<xsl:attribute name="xmlns">
<xsl:value-of select="'http://example.com'"/>
</xsl:attribute>
</process>
I know this is not the best way to do it, but given the parser constraint this is the only choice I have now.
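If the result ends up in a DOM anyway, the namespace declaration can also be added on the Java side. In the DOM API, namespace declarations are represented as attributes in the reserved http://www.w3.org/2000/xmlns/ namespace, which may be related to the value seen in the question. A minimal sketch, with an invented class name and the two URIs from this thread used as example values:

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper; the names are not from the original post.
final class NamespaceDeclarer {

    // Builds <process> in the http://example.com namespace and declares
    // the prefix "neil" for the namespace URI supplied at run time.
    static Element buildProcess(String dynamicNamespace) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element process = doc.createElementNS("http://example.com", "process");
            // XMLNS_ATTRIBUTE_NS_URI is "http://www.w3.org/2000/xmlns/".
            process.setAttributeNS(
                XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:neil", dynamicNamespace);
            doc.appendChild(process);
            return process;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create document", e);
        }
    }

    public static void main(String[] args) {
        Element process = buildProcess("http://neilghosh.com");
        System.out.println(process.lookupNamespaceURI("neil"));
    }
}
```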

Populate XML template-file from XPath Expressions?

What would be the best way to populate (or generate) an XML template-file from a mapping of XPath expressions?
The requirements are that we will need to start with a template (since this might contain information not otherwise captured in the XPath expressions).
For example, a starting template might be:
<s11:Envelope xmlns:s11='http://schemas.xmlsoap.org/soap/envelope/'>
<s11:Body>
<ns1:create xmlns:ns1='http://predic8.com/wsdl/material/ArticleService/1/'>
<article xmlns:ns1='http://predic8.com/material/1/'>
<name>?XXX?</name>
<description>?XXX?</description>
<price xmlns:ns1='http://predic8.com/common/1/'>
<amount>?999.99?</amount>
<currency xmlns:ns1='http://predic8.com/common/1/'>???</currency>
</price>
<id xmlns:ns1='http://predic8.com/material/1/'>???</id>
</article>
</ns1:create>
</s11:Body>
</s11:Envelope>
Then we are supplied with something like:
expression: /create/article[1]/id => 1
expression: /create/article[1]/description => bar
expression: /create/article[1]/name[1] => foo
expression: /create/article[1]/price[1]/amount => 00.00
expression: /create/article[1]/price[1]/currency => USD
expression: /create/article[2]/id => 2
expression: /create/article[2]/description => some name
expression: /create/article[2]/name[1] => some description
expression: /create/article[2]/price[1]/amount => 00.01
expression: /create/article[2]/price[1]/currency => USD
We should then generate:
<ns1:create xmlns:ns1='http://predic8.com/wsdl/material/ArticleService/1/'>
<article xmlns:ns1='http://predic8.com/material/1/'>
<name xmlns:ns1='http://predic8.com/material/1/'>foo</name>
<description>bar</description>
<price xmlns:ns1='http://predic8.com/common/1/'>
<amount>00.00</amount>
<currency xmlns:ns1='http://predic8.com/common/1/'>USD</currency>
</price>
<id xmlns:ns1='http://predic8.com/material/1/'>1</id>
</article>
<article xmlns:ns1='http://predic8.com/material/2/'>
<name>some name</name>
<description>some description</description>
<price xmlns:ns1='http://predic8.com/common/2/'>
<amount>00.01</amount>
<currency xmlns:ns1='http://predic8.com/common/2/'>USD</currency>
</price>
<id xmlns:ns1='http://predic8.com/material/2/'>2</id>
</article>
</ns1:create>
I am implementing this in Java, although I would prefer an XSLT-based solution if one is possible.
PS: This question is the reverse of another question I recently asked.
This transformation creates from the "expressions" an XML document that has the structure of the wanted result -- it remains to transform this result into the final result:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:my="my:my">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="vPop" as="element()*">
<item path="/create/article[1]/id">1</item>
<item path="/create/article[1]/description">bar</item>
<item path="/create/article[1]/name[1]">foo</item>
<item path="/create/article[1]/price[1]/amount">00.00</item>
<item path="/create/article[1]/price[1]/currency">USD</item>
<item path="/create/article[1]/price[2]/amount">11.11</item>
<item path="/create/article[1]/price[2]/currency">AUD</item>
<item path="/create/article[2]/id">2</item>
<item path="/create/article[2]/description">some name</item>
<item path="/create/article[2]/name[1]">some description</item>
<item path="/create/article[2]/price[1]/amount">00.01</item>
<item path="/create/article[2]/price[1]/currency">USD</item>
</xsl:variable>
<xsl:template match="/">
<xsl:sequence select="my:subTree($vPop/@path/concat(.,'/',string(..)))"/>
</xsl:template>
<xsl:function name="my:subTree" as="node()*">
<xsl:param name="pPaths" as="xs:string*"/>
<xsl:for-each-group select="$pPaths"
group-adjacent=
"substring-before(substring-after(concat(., '/'), '/'), '/')">
<xsl:if test="current-grouping-key()">
<xsl:choose>
<xsl:when test=
"substring-after(current-group()[1], current-grouping-key())">
<xsl:element name=
"{substring-before(concat(current-grouping-key(), '['), '[')}">
<xsl:sequence select=
"my:subTree(for $s in current-group()
return
concat('/',substring-after(substring($s, 2),'/'))
)
"/>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="current-grouping-key()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:if>
</xsl:for-each-group>
</xsl:function>
</xsl:stylesheet>
When this transformation is applied on any XML document (not used), the result is:
<create>
<article>
<id>1</id>
<description>bar</description>
<name>foo</name>
<price>
<amount>00.00</amount>
<currency>USD</currency>
</price>
<price>
<amount>11.11</amount>
<currency>AUD</currency>
</price>
</article>
<article>
<id>2</id>
<description>some name</description>
<name>some description</name>
<price>
<amount>00.01</amount>
<currency>USD</currency>
</price>
</article>
</create>
Note:
You need to transform the "expressions" you are given into the format used in this transformation -- this is easy and straightforward.
In the final transformation you need to copy every node "as-is" (using the identity rule), with the exception that the top node should be generated in the "http://predic8.com/wsdl/material/ArticleService/1/" namespace. Note that the other namespaces present in the "template" are not used and can be safely omitted.
This solution requires you to re-organise your XPATH input information slightly, and to allow a 2-step transformation. The first transformation will write the stylesheet, which will be executed in the second transformation - Thus the client is required to do two invocations of the XSLT engine. Let us know if this is a problem.
Step One
Please re-organise your XPATH information into an XML document like so. It should not be difficult to do, and even an XSLT script could be written to do the job.
<paths>
<rule>
<match>article[1]/id[1]</match>
<namespaces>
<namespace prefix="ns1">http://predic8.com/wsdl/material/ArticleService/1/</namespace>
<!-- The namespace node declares a namespace that is used in the match expression.
There can be many of these. It is not required to define the s11: namespace,
nor the ns1 namespace. -->
</namespaces>
<replacement>1</replacement>
</rule>
<rule>
<match>article[1]/description[1]</match>
<namespaces/>
<replacement>bar</replacement>
</rule>
... etc ...
</paths>
Solution constraints
In the above rules document we are constrained so that:
The match is implicitly prefixed 'expression: /create/'. Don't put that explicitly.
All matches must begin like article[n] where n is some ordinal number.
We can't have zero rules.
Any prefixes that you use in the match, other than s11="http://schemas.xmlsoap.org/soap/envelope/" and ns1="http://predic8.com/wsdl/material/ArticleService/1/", are defined in the namespaces node. (Note: I don't think it is valid for namespaces to end in '/', but I am not sure about that.)
The above is the input document to the step one transformation. Apply this document to this style-sheet ...
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:step2="http://www.w3.org/1999/XSL/Transform-step2"
xmlns:s11="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:ns1="http://predic8.com/wsdl/material/ArticleService/1/"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes='xsl'>
<xsl:output method="xml" indent="yes" encoding="UTF-8" />
<xsl:namespace-alias stylesheet-prefix="step2" result-prefix="xsl"/>
<xsl:template match="/">
<step2:stylesheet version="2.0">
<step2:output method="xml" indent="yes" encoding="UTF-8" />
<step2:variable name="replicated-template" as="element()*">
<step2:apply-templates select="/" mode="replication" />
</step2:variable>
<step2:template match="@*|node()" mode="replication">
<step2:copy>
<step2:apply-templates select="@*|node()" mode="replication" />
</step2:copy>
</step2:template>
<step2:template match="/s11:Envelope/s11:Body/ns1:create/article" mode="replication">
<step2:variable name="replicant" select="." />
<step2:for-each select="for $i in 1 to
{max(for $m in /paths/rule/match return
xs:integer(substring-before(substring-after($m,'article['),']')))}
return $i">
<step2:for-each select="$replicant">
<step2:copy>
<step2:apply-templates select="@*|node()" mode="replication" />
</step2:copy>
</step2:for-each>
</step2:for-each>
</step2:template>
<step2:template match="@*|node()">
<step2:copy>
<step2:apply-templates select="@*|node()"/>
</step2:copy>
</step2:template>
<step2:template match="/">
<step2:apply-templates select="$replicated-template" />
</step2:template>
<xsl:apply-templates select="paths/rule" />
</step2:stylesheet>
</xsl:template>
<xsl:template match="rule">
<step2:template match="s11:Envelope/s11:Body/ns1:create/{match}">
<xsl:for-each select="namespaces/namespace">
<xsl:namespace name="{@prefix}" select="." />
</xsl:for-each>
<step2:copy>
<step2:apply-templates select="@*"/>
<step2:value-of select="'{replacement}'"/>
<step2:apply-templates select="*"/>
</step2:copy>
</step2:template>
</xsl:template>
</xsl:stylesheet>
Step Two
Apply your soap envelope file, as an input document, to the style-sheet which was output from step one. The result is the original soap document, altered as required. This is a sample of a step two style-sheet, with just the first rule (/create/article[1]/id => 1) being considered for the sake of simplicity of illustration.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:s11="http://schemas.xmlsoap.org/soap/envelope/"
version="2.0">
<xsl:output method="xml" indent="yes" encoding="UTF-8"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template xmlns:ns1="http://predic8.com/wsdl/material/ArticleService/1/"
match="/s11:Envelope/s11:Body/ns1:create[1]/article[1]/id[1]">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:value-of select="'1'"/>
<xsl:apply-templates select="*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
More solution constraints
The template document must contain at least one /s11:Envelope/s11:Body/ns1:create/article. Only the article node is replicated (deeply) as required by the rules. Other than that, it can be any structure.
The template document cannot contain nested levels of the s11:Envelope/s11:Body/ns1:create node.
Explanation
You will notice that your XPATH expressions are not far removed from the match condition of a template. Therefore it is not too difficult to write a stylesheet which re-expresses your XPATH and replacement values as template rules. When writing a stylesheet-writing stylesheet, xsl:namespace-alias enables us to disambiguate "xsl:" as an instruction from "xsl:" as intended output. When XSLT 3.0 comes along, we are quite likely to be able to reduce this algorithm to one step, as it will allow dynamic XPATH evaluation, which is really the nub of your problem. But for the moment we must be content with a two-step process.
The second style-sheet is a two-phase transformation. The first stage replicates the template from the article level, as many times as needed by the rules. The second phase parses this replicated template, and applies the dynamic rules substituting text values as indicated by the XPATHs.
UPDATE
My original post was wrong. Thanks to Dimitre for pointing out the error. Please find updated solution above.
After-thought
If a two-step solution is too complicated, and you are running on a Wintel platform, you may consider purchasing the commercial version of Saxon. I believe that the commercial version has a dynamic XPATH evaluation function. I can't give you such a solution because I don't have the commercial version. I imagine a solution using an evaluate() function would be a lot simpler. XSLT is just a hobby for me. But if you are using XSLT for business purposes, the price is quite reasonable.
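Since the question mentions a Java implementation, here is a much simpler Java sketch of the core idea: evaluate each supplied XPath against the template DOM and overwrite the text of the node it selects. It assumes a template without namespaces and does not replicate article nodes, so paths that select nothing (such as article[2] in a one-article template) are silently skipped. The class name TemplatePopulator is invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; the names are not from the original post.
final class TemplatePopulator {

    // For each (XPath, value) pair, sets the text of the first matching node.
    static Document populate(String templateXml, Map<String, String> values) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(templateXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            for (Map.Entry<String, String> entry : values.entrySet()) {
                Node target = (Node) xpath.evaluate(entry.getKey(), doc, XPathConstants.NODE);
                if (target != null) {
                    target.setTextContent(entry.getValue());
                }
            }
            return doc;
        } catch (ParserConfigurationException | SAXException | IOException
                 | XPathExpressionException e) {
            throw new IllegalStateException("Could not populate template", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("/create/article[1]/id", "1");
        values.put("/create/article[1]/name[1]", "foo");
        Document doc = populate(
            "<create><article><name>?XXX?</name><id>???</id></article></create>", values);
        System.out.println(doc.getElementsByTagName("name").item(0).getTextContent());
    }
}
```

Handling the namespaced SOAP template would additionally need a NamespaceContext on the XPath object, and creating article[2] would need the replication step described in the answer above.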
