Showing results from two different XML files in Java - java

So I am trying to pull results from two different XML files using XSLT in order to show a Restaurant Review. I have Restaurant details in allRestaurants.xml and have all of the reviews for these restaurants in allReviews.xml. I have currently stored a tag against each restaurant, and the reviews are each associated with a specific restaurant as well, so carry the same tag. I need to build a page that takes the restaurant with ID 1 and beneath is show the reviews for that restaurant. The reviews are stored with the exactly the same 1 as per below. Please help.
allRestaurants.xml
<restaurants>
<restaurant>
<restaurant_id>1</restaurant_id>
<name>The Jackaroo</name>
<street_address>107-109 Darlinghurst Road</street_address>
<postcode>2011</postcode>
<city>Sydney</city>
<state>NSW</state>
<country>Australia</country>
<email>info#jackaroo.com.au</email>
<telephone>93322244</telephone>
<stars>3</stars>
</restaurant>
<restaurant>
<restaurant_id>2</restaurant_id>
<name>Four Seasons restaurant Sydney</name>
<street_address>199 George Street</street_address>
<postcode>2000</postcode>
<city>Sydney</city>
<state>NSW</state>
<country>Australia</country>
<email>info#sydneyfourseasons.com.au</email>
<telephone>92503100</telephone>
<stars>5</stars>
</restaurant>
</restaurants>
allReviews.xml
<reviews>
<review id="1">
<restaurant_id>1</restaurant_id>
<author_id>1</author_id>
<headline>Clean Bare-Bones Hostel</headline>
<details>
Example text here
</details>
<rating>3</rating>
<date>1388782853</date>
</review>
<review id="2">
<restaurant_id>1</restaurant_id>
<author_id>3</author_id>
<headline>Wouldn't Recommend</headline>
<details>
Example text here
</details>
<rating>2</rating>
<date>1368748800</date>
</review>
<review id="3">
<restaurant_id>2</restaurant_id>
<author_id>2</author_id>
<headline>Overall I Enjoyed</headline>
<details>
Example text here
</details>
<rating>4</rating>
<date>1378788850</date>
</review>
</reviews>
I thought maybe merging them into one XML file like so would do the trick, but even then, I'm not sure where to start:
oneHotel.xml
<?xml-stylesheet type="text/xsl" href="oneHotel.xsl"?>
<list>
<entry name="allHotels.xml" />
<entry name="reviews.xml" />
</list>
This is as far as I got in the XSLT doc, and am drawing a massive blank. I don't even know where to start:
oneHotel.xsl
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:copy-of name="restaurant" select="document('allRestaurants.xml')
/restaurants/restaurant[restaurant_id=1]"/>
<xsl:copy-of name="reviews" select="document('allReviews.xml')
/reviews/review[restaurant_id=1]"/>
<xsl:template match="/">
<xsl:choose>
<xsl:when test="document('allRestaurants.xml')
/restaurants/restaurant[restaurant_id=1]"/>
<h2><xsl:value-of select="name"/></h2>
</xsl:choose>
<h2><xsl:value-of select="$restaurant/name"/></h2>
</xsl:template>
</xsl:stylesheet>

Try this as your starting point:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="UTF-8"/>
<xsl:param name="path-to-reviews" select="'allReviews.xml'"/>
<xsl:key name="review-by-restaurant-id" match="review" use="restaurant_id" />
<xsl:template match="/restaurants">
<html>
<body>
<h1>Restaurant Reviews</h1>
<xsl:apply-templates select="restaurant"/>
</body>
</html>
</xsl:template>
<xsl:template match="restaurant">
<h2>
<xsl:value-of select="name"/>
</h2>
<xsl:variable name="id" select="restaurant_id" />
<!-- switch context to lookup document in order to use key -->
<xsl:for-each select="document($path-to-reviews)">
<xsl:for-each select="key('review-by-restaurant-id', $id)">
<h3>
<xsl:value-of select="headline"/>
</h3>
<p>
<xsl:value-of select="details"/>
</p>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
This assumes that you are instructing your XSLT processor to process the allRestaurants.xml document and passing the path to the allReviews.xml document as a parameter.
You didn't tell us what you want your final result to look like, so I just made up a very basic page.

Related

Most performative way to go through an XML transformation - Java

I'm new with java, and I want an opinion for the community.
I Have a huge XML, that contains a lot of information. Actually, this XML has approximately 140Mb of information.
In this XML I have a lot of information that is no more valid, so I need to do filter and use only the valid one, to check this I need to cross information between node, to check if deletion is needed or not. In some cases, the entire father(main) node needs to be deleted.
I'm already doing it with dom parse, using loops, inside the loops I save in variables and cross the information to check, and delete the actual node or the entire father node.
Basically, the structure is like this:
<source>
<main>
<id>98567</id>
<block_information>
<name>Block A</name>
<start_date>20120210</start_date>
<end_date>20150210</end_date>
</block_information>
<block_information>
<name>Block A.01</name>
<start_date>20150210</start_date>
<end_date>20251005</end_date>
</block_information>
<city_information>
<name>Manchester</name>
<start_date>20150210</start_date>
<end_date>20150212</end_date>
</city_information>
<city_information>
<name>New Manchester</name>
<start_date>20150212</start_date>
<end_date>20251005</end_date>
</city_information>
<phone>
<type>C</type>
<number>987466321</number>
<name></name>
</phone>
<phone>
<type>P</type>
<number>36547821</number>
<name></name>
</phone>
</main>
<main>
<id>19587</id>
<block_information>
<name>Che</name>
<start_date>20090210</start_date>
<end_date>20100210</end_date>
</block_information>
<block_information>
<name></name>
<start_date>20100210</start_date>
<end_date>20351005</end_date>
</block_information>
<city_information>
<name></name>
<start_date>20150210</start_date>
<end_date>20150212</end_date>
</city_information>
<city_information>
<name>No Name</name>
<start_date>20150212</start_date>
<end_date>20191005</end_date>
</city_information>
<phone>
<type>C</type>
<number>987466321</number>
<name>Mom</name>
</phone>
<phone>
<type>P</type>
<number>36547821</number>
<name></name>
</phone>
</main>
</source>
The output is like this:
<result>
<main>
<id>98567</id>
<block_name>Block A.01</block_name>
<city_name>New Manchester</city_name>
<cellphone></cellphone>
<phone>36547821</phone>
<contact_phone></contact_phone>
<contact_phone_name></contact_phone_name>
</main>
</result>
For the information go out in result, is mandatory that there is one <block_information> and <city_information> valid (<start_date> less than actual date and <end_date> bigger than actual date), and the <name...> is needed for both.
If there is none, or more than one valid, the <main> will be deleted.
For the phone number, <type> ['C' is for contact, 'P' for personal phone, 'M' for mobile]. So if the <type> is 'C' but there is no value in <name> the phone do not go to result. 'P' go to <phone> and 'M' go to <cellphone>.
I want your considerations on what is the best way to do that in the most performative way, and to anyone can do adjustment before in an easy way if it's needed.
thanks in advance for the inputs!
as asked by #kjhughes, I put some values on the sample XML, and some filters that I need to do. Thanks!
ps.: the XML structure used as an example is TOO simple compared to the actual one, there are a lot more complex types.
I would go with the following approach:
find a library that lets you stream the xml (file or inputsream) and produce a Stream<Main>
process the Stream<Main> and filter each Main node according to your validation logic
depending if you are I/O or CPU bottlenecked use a .parallel() stream to process the stream (read: test if .parallel() helps you in any way)
This will suffice for any sane performance requirements in the context of XML parsing (I guess?). Google for Java XML Stream and go from there (or maybe this stackoverflow question can give some pointers)
XSLT is a transformation language existing since 1999 which has now three versions, 1.0, 2.0, and 3.0, the latest version published as W3C recommendation in 2017 and supported on the Java platform by Saxon 9.8 and later, available in the open-source HE edition on Sourceforge and Maven. The use of XSLT 1 is supported in the Oracle/Sun Java JRE by incorporating Apache Xalan.
So instead of using DOM you have the option to use XSLT, here is an example using XSLT 3 (online at https://xsltfiddle.liberty-development.net/bFN1yab/0):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:mf="http://example.com/mf"
exclude-result-prefixes="#all"
version="3.0">
<xsl:output indent="yes"/>
<xsl:function name="mf:date" as="xs:date">
<xsl:param name="input-date" as="xs:string"/>
<xsl:sequence
select="xs:date(replace($input-date, '([0-9]{4})([0-9]{2})([0-9]{2})', '$1-$2-$3'))"/>
</xsl:function>
<xsl:function name="mf:select-valid-info" as="element()*">
<xsl:param name="infos" as="element()*"/>
<xsl:sequence
select="$infos[name/normalize-space()
and mf:date(start_date) lt current-date()
and mf:date(end_date) gt current-date()]"/>
</xsl:function>
<xsl:function name="mf:valid-main" as="xs:boolean">
<xsl:param name="main" as="element(main)"/>
<xsl:sequence
select="let $valid-blocks := mf:select-valid-info($main/block_information),
$valid-cities := mf:select-valid-info($main/city_information)
return count($valid-blocks) eq 1 and count($valid-cities) eq 1"/>
</xsl:function>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="main[not(mf:valid-main(.))]"/>
<xsl:template match="main[mf:valid-main(.)]">
<xsl:copy>
<xsl:apply-templates
select="id,
mf:select-valid-info(block_information)/name,
mf:select-valid-info(city_information)/name,
phone"/>
</xsl:copy>
</xsl:template>
<xsl:template match="block_information/name | city_information/name">
<xsl:element name="{substring-before(local-name(..), '_')}_name">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
<xsl:template match="main/phone[type = 'C']">
<contact_phone>
<xsl:value-of select="number[current()/normalize-space(name)]"/>
</contact_phone>
<contact_name>
<xsl:value-of select="name"/>
</contact_name>
</xsl:template>
<xsl:template match="main/phone[type = 'P']">
<phone>
<xsl:value-of select="number"/>
</phone>
</xsl:template>
<xsl:template match="main/phone[type = 'M']">
<cellphone>
<xsl:value-of select="number"/>
</cellphone>
</xsl:template>
</xsl:stylesheet>
I hope I have grasped the conditions for the main elements, I have not been able to quite understand the rules for the various phone data, but the code is meant as an example anyway.
Of course performance depends very much on the implementation but I think that XSLT is a more structured and maintainable way than doing DOM coding.
If you can afford it you can also look into Saxon 9.8 or 9.9 EE which supports streaming XSLT 3 where, with some rewrites of above code, you could have an XSLT based approach to stream forwards only through the huge document, materializing main elements as element nodes you transform while keeping the memory footprint low as that approach, in comparison to DOM or normal XSLT processing, doesn't parse the whole XML document first into a complete in-memory tree structure:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:mf="http://example.com/mf"
exclude-result-prefixes="#all"
version="3.0">
<xsl:mode streamable="yes" on-no-match="shallow-copy"/>
<xsl:template match="source">
<xsl:copy>
<xsl:apply-templates select="main!copy-of()" mode="main"/>
</xsl:copy>
</xsl:template>
<xsl:output indent="yes"/>
<xsl:function name="mf:date" as="xs:date">
<xsl:param name="input-date" as="xs:string"/>
<xsl:sequence
select="xs:date(replace($input-date, '([0-9]{4})([0-9]{2})([0-9]{2})', '$1-$2-$3'))"/>
</xsl:function>
<xsl:function name="mf:select-valid-info" as="element()*">
<xsl:param name="infos" as="element()*"/>
<xsl:sequence
select="$infos[name/normalize-space()
and mf:date(start_date) lt current-date()
and mf:date(end_date) gt current-date()]"/>
</xsl:function>
<xsl:function name="mf:valid-main" as="xs:boolean">
<xsl:param name="main" as="element(main)"/>
<xsl:sequence
select="let $valid-blocks := mf:select-valid-info($main/block_information),
$valid-cities := mf:select-valid-info($main/city_information)
return count($valid-blocks) eq 1 and count($valid-cities) eq 1"/>
</xsl:function>
<xsl:mode name="main" on-no-match="shallow-copy"/>
<xsl:template match="main[not(mf:valid-main(.))]" mode="main"/>
<xsl:template match="main[mf:valid-main(.)]" mode="main">
<xsl:copy>
<xsl:apply-templates
select="id,
mf:select-valid-info(block_information)/name,
mf:select-valid-info(city_information)/name,
phone" mode="#current"/>
</xsl:copy>
</xsl:template>
<xsl:template match="block_information/name | city_information/name" mode="main">
<xsl:element name="{substring-before(local-name(..), '_')}_name">
<xsl:value-of select="."/>
</xsl:element>
</xsl:template>
<xsl:template match="main/phone[type = 'C']" mode="main">
<contact_phone>
<xsl:value-of select="number[current()/normalize-space(name)]"/>
</contact_phone>
<contact_name>
<xsl:value-of select="name"/>
</contact_name>
</xsl:template>
<xsl:template match="main/phone[type = 'P']" mode="main">
<phone>
<xsl:value-of select="number"/>
</phone>
</xsl:template>
<xsl:template match="main/phone[type = 'M']" mode="main">
<cellphone>
<xsl:value-of select="number"/>
</cellphone>
</xsl:template>
</xsl:stylesheet>

Parsing nested XML File as node set

I have a problem. I have an xml file as below :
Main.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<msrsw>
<software>
<chapter>
<name> Hello world</sample>
<xref type="xml">C:\ABC\NestedXML.xml</xref>
</chapter>
</software>
</msrsw>
NestedXml.xml :
<?xml version="1.0" encoding="ISO-8859-1"?>
<msrsw>
<software>
<chapter>
<name> Nested XML </sample>
<age>14</age>
<country>Canada</country>
</chapter>
</software>
</msrsw>
I am using Apache FOP to produce PDF document (Using Java). FOP needs src and dest params. Src being Main.xml.
My XSLT is as below:
<xsl:param name="xmlFileName" />
<xsl:param name="XMLFile" select="document($xmlFileName)"/>
<xsl:template name="Chapters_1_2_Template">
<xsl:apply-templates select="$XMLFile/*" mode="chapter" />
</xsl:template>
<xsl:template match="node()" mode="chapter">
<xsl:for-each select="node()">
<xsl:if test="current()[name() = 'xref']">
<xsl:apply-templates select="current()[name() = 'xref']" mode="x" />
</xsl:if>
<xsl:if test="current()[name() = 'name']">
<xsl:apply-templates select="current()[name() = 'name']" mode="name" />
</xsl:if>
<xsl:if test="current()[name() = 'age']">
<xsl:apply-templates select="current()[name() = 'age']" mode="age" />
</xsl:if>
</xsl:for-each>
</xsl:template>
<xsl:template match="name" mode="name">
<xsl:value-of select="current()" />
</xsl:template>
<xsl:template match="age" mode="age">
<xsl:value-of select="current()" />
</xsl:template>
<xsl:template match="country" mode="country">
<xsl:value-of select="current()" />
</xsl:template>
<xsl:template match="xref" mode="x">
<xsl:param name="XMLFile" select="document(current())"/>
<xsl:call-template name="Chapters_1_2_Template" />
</xsl:template>
When I execute the above code, i get the following error in the line <xsl:apply-templates select="$XMLFile/*" mode="chapter" />:
Invalid token XMLFile.
When I remove $XMLFile - <xsl:apply-templates select="*" mode="chapter" /> , it works fine for me with Main.xml file's content.
When I just display the value of current() in xref template, it is parses the NestedXml.xml file and displayes string values in PDF.
Expected Output is:
Hello world
Nested XML
14
Canada
My requirement is to embed the node-set of NestedXML.xml within the<xref></xref> tags and recursively applyTemplates on that node-set.
But the document() function gives me the parsed String content of NestedXml.xml.
I do not want to use the document() function and just get the String value of NestedXml.xml. I need to parse all the tags of NestedXml.xml using the same XSLT.
Please suggest me where am I going wrong. Is this approach right? Is there any other way to do this?
Or it is not possible to do in this way using XSLT and that XSLT allows including only the parsed String values?
If it is that you want just some text nodes, the stylesheet needn't be that complex. Try this simpler stylesheet:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:output method="text" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:apply-templates select="msrsw/software/chapter/name"/>
<xsl:apply-templates select="document(msrsw/software/chapter/xref)/*"/>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="normalize-space()"/>
<xsl:text>
</xsl:text>
</xsl:template>
</xsl:stylesheet>

CDATA XML masking with XSLT should return same XML with few masked fields

I have a requirement of masking few fields in XML of CDATA inside XML with XSLT.
So the resultant XML should be same like the input XML but few fileds are masked with XSLT.
I followed this link which is masking as expected but producing XML is in different format.
I tried many other solutions from SO, they are almost outputing the new XML/HTML in other format which is different from the input XML.
Please check the following example for better understading.
Input XML with CDATA content.
<XML>
<LogLevel>info</LogLevel>
<Content><![CDATA[ <Msg>
<AccountNo>2701000098983</AccountNo>
<ApplName>Testing</ApplName>
</Msg>]]></Content>
<Date>20140909</Date>
</XML>
Output XML should be:
<XML>
<LogLevel>info</LogLevel>
<Content><![CDATA[ <Msg>
<AccountNo>XXXXXXXXXX983</AccountNo>
<ApplName>Testing</ApplName>
</Msg>]]></Content>
<Date>20140909</Date>
</XML>
Edit:
I used the following XSLT
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:choose>
<xsl:when test="contains(.,'<AccountNo>')">
<!-- This is the CDATA that I want to mask and write back out as CDATA -->
<xsl:variable name="tcontent">
<xsl:value-of
select="substring-after(substring-before(.,'</AccountNo>'),'<AccountNo>') " />
</xsl:variable>
<xsl:text disable-output-escaping="yes"><![CDATA[<AccountNo></xsl:text>
<xsl:call-template name="maskVariable">
<xsl:with-param name="tvar" select="$tcontent" />
</xsl:call-template>
<xsl:text disable-output-escaping="yes"></AccountNo>]]></xsl:text>
</xsl:when>
<xsl:otherwise>
<xsl:copy />
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template name="maskVariable">
<xsl:param name="tvar" />
<xsl:variable name="length" select="string-length($tvar)" />
<xsl:choose>
<xsl:when test="$length > 3">
<xsl:value-of
select="concat ('************', substring($tvar,$length - 1, 2))" />
</xsl:when>
<xsl:when test="$length > 1">
***
</xsl:when>
<xsl:otherwise />
</xsl:choose>
</xsl:template>
Output of using this XSLT is :
<LogLevel>info</LogLevel>
<Content><![CDATA[<AccountNo>************02</AccountNo>]]></Content>
<Date>20140909</Date>
Here in output, only masked output of is displaying.
How to make other part of the code to get displayed ?
Please give me some idea how to do it ?
Any help is highly appreciated.
Why don't you try:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Content">
<xsl:copy>
<xsl:value-of select="substring-before(.,'<AccountNo>')" />
<xsl:text><AccountNo></xsl:text>
<xsl:variable name="acct-num" select="substring-before(substring-after(.,'<AccountNo>'), '</AccountNo>')" />
<xsl:value-of select="concat('************', substring($acct-num, string-length($acct-num) - 2))" />
<xsl:text></AccountNo></xsl:text>
<xsl:value-of select="substring-after(.,'</AccountNo>')" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Applied to your input, the result will be:
<?xml version="1.0" encoding="UTF-8"?>
<XML>
<LogLevel>info</LogLevel>
<Content> <Msg>
<AccountNo>************983</AccountNo>
<ApplName>Testing</ApplName>
</Msg></Content>
<Date>20140909</Date>
</XML>
Alternatively, you could use:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" cdata-section-elements="Content"/>
<xsl:strip-space elements="*"/>
<!-- identity transform -->
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="Content">
<xsl:copy>
<xsl:variable name="content">
<xsl:value-of select="substring-before(.,'<AccountNo>')" />
<xsl:text><AccountNo></xsl:text>
<xsl:variable name="acct-num" select="substring-before(substring-after(.,'<AccountNo>'), '</AccountNo>')" />
<xsl:value-of select="concat('************', substring($acct-num, string-length($acct-num) - 2))" />
<xsl:text></AccountNo></xsl:text>
<xsl:value-of select="substring-after(.,'</AccountNo>')" />
</xsl:variable>
<xsl:value-of select="$content" disable-output-escaping="yes"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
to produce:
<?xml version="1.0" encoding="UTF-8"?>
<XML>
<LogLevel>info</LogLevel>
<Content><![CDATA[ <Msg>
<AccountNo>************983</AccountNo>
<ApplName>Testing</ApplName>
</Msg>]]></Content>
<Date>20140909</Date>
</XML>
although this might not work with every processor (tested to work with Xalan 2.7.1: http://xsltransform.net/jyH9rMk).
The stuff inside the CDATA section is XML disguised as text. XSLT is good at transforming XML, it's not so good at transforming text, especially text with a complex grammar. So my approach would be: extract the text from the outer XML document, parse it as XML, transform it using XSLT (real XSLT that works on nodes rather than on markup), serialize it back to text, then stuff the text back into the original (outer) XML document.
Raw XSLT 1.0 can't do this within a single stylesheet. You need the functions parse() and serialize() to turn lexical XML into a node tree, and back again. These are available as extensions in some processors (such as Saxon), they become available as standard functions in XPath 3.0, and they can be written as simple extension functions (e.g. in Javascript) code in most other processors.

Mapping two sources to one output

I want to performs XSLT on two sources.
Sourch_1.xml:
<Root>
<record>
<PolicyNumber>1</PolicyNumber>
<FirdsName>aa</FirdsName>
<LastName>aa</LastName>
</record>
<record>
<PolicyNumber>2</PolicyNumber>
<FirdsName>bb</FirdsName>
<LastName>bb</LastName>
</record>
Sourch_2.xml:
<Root>
<record>
<policy>
<PolicyNumber>1</PolicyNumber>
<city>aaCity</city>
<street>aaStreet</street>
</policy>
<policy>
<PolicyNumber>2</PolicyNumber>
<city>bbCity</city>
<street>bbStreet</street>
</policy>
</record>
I want the output to create tags which combine data from two sources based on PolicyNumber.
I want my output to be:
output.xml:
<Root>
<record_data>
<PolicyNumber>1</PolicyNumber>
<FirdsName>aa</FirdsName>
<LastName>aa</LastName>
<city>aaCity</city>
<street>aaStreet</street>
</record_data>
<record_data>
<PolicyNumber>2</PolicyNumber>
<FirdsName>bb</FirdsName>
<LastName>bb</LastName>
<city>bbCity</city>
<street>bbStreet</street>
</record_data>
How can i accomplish that using XSLT?
To process more than one XML input file, use the document() function. Make sure that additional XML files are in the folder where you put the XSLT stylesheet, too.
The line that deserves your attention is the following:
<xsl:for-each select="document('double2.xml')/root/record/policy[./PolicyNumber=current()/PolicyNumber]">
To begin with, document('double2.xml') opens the second XML file (I have dubbed it "double2.xml"). For the policy elements in this second XML file it checks whether their PolicyNumber is equal to the PolicyNumber of the record element from the first XML file that is being processed
I have slightly modified your input to make it well-formed (lowercased the root element and added its closing tag). Note that there is also a typo in it and you probably meant to write "FirstName" instead of "FirdsName".
Stylesheet
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/root">
<xsl:copy>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="record">
<record_data>
<xsl:copy-of select="*"/>
<xsl:for-each select="document('double2.xml')/root/record/policy[./PolicyNumber=current()/PolicyNumber]">
<xsl:copy-of select="city|street"/>
</xsl:for-each>
</record_data>
</xsl:template>
</xsl:stylesheet>
Output
<?xml version="1.0" encoding="UTF-8"?>
<root>
<record_data>
<PolicyNumber>1</PolicyNumber>
<FirdsName>aa</FirdsName>
<LastName>aa</LastName>
<city>aaCity</city>
<street>aaStreet</street>
</record_data>
<record_data>
<PolicyNumber>2</PolicyNumber>
<FirdsName>bb</FirdsName>
<LastName>bb</LastName>
<city>bbCity</city>
<street>bbStreet</street>
</record_data>
</root>
Something like below will work for you.
NOTE : I have not ran this code. where I check not(policy) you might do some change in xpath like not(./policy)
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:for-each select="Root/record">
<xsl:choose>
<xsl:when test="not(policy)">
<record_data>
<PolicyNumber><xsl:value-of select="PolicyNumber"/></PolicyNumber>
<FirdsName><xsl:value-of select="FirdsName"/></FirdsName>
<LastName><xsl:value-of select="LastName"/></LastName>
<city><xsl:value-of select="city"/></city>
<street><xsl:value-of select="street"/></street>
</record_data>
</xsl:when>
<xsl:otherwise>
</xsl:otherwise>
<record_data>
<PolicyNumber><xsl:value-of select="policy/PolicyNumber"/></PolicyNumber>
<FirdsName><xsl:value-of select="policy/FirdsName"/></FirdsName>
<LastName><xsl:value-of select="policy/LastName"/></LastName>
<city><xsl:value-of select="policy/city"/></city>
<street><xsl:value-of select="policy/street"/></street>
</record_data>
</xsl:choose>
</xsl:for-each>

Populate XML template-file from XPath Expressions?

What would be the best way to populate (or generate) an XML template-file from a mapping of XPath expressions?
The requirements are that we will need to start with a template (since this might contain information not otherwise captured in the XPath expressions).
For example, a starting template might be:
<s11:Envelope xmlns:s11='http://schemas.xmlsoap.org/soap/envelope/'>
<ns1:create xmlns:ns1='http://predic8.com/wsdl/material/ArticleService/1/'>
<article xmlns:ns1='http://predic8.com/material/1/'>
<name>?XXX?</name>
<description>?XXX?</description>
<price xmlns:ns1='http://predic8.com/common/1/'>
<amount>?999.99?</amount>
<currency xmlns:ns1='http://predic8.com/common/1/'>???</currency>
</price>
<id xmlns:ns1='http://predic8.com/material/1/'>???</id>
</article>
</ns1:create>
</s11:Body>
</s11:Envelope>
Then we are supplied, something like:
expression: /create/article[1]/id => 1
expression: /create/article[1]/description => bar
expression: /create/article[1]/name[1] => foo
expression: /create/article[1]/price[1]/amount => 00.00
expression: /create/article[1]/price[1]/currency => USD
expression: /create/article[2]/id => 2
expression: /create/article[2]/description => some name
expression: /create/article[2]/name[1] => some description
expression: /create/article[2]/price[1]/amount => 00.01
expression: /create/article[2]/price[1]/currency => USD
We should then generate:
<ns1:create xmlns:ns1='http://predic8.com/wsdl/material/ArticleService/1/'>
<article xmlns:ns1='http://predic8.com/material/1/'>
<name xmlns:ns1='http://predic8.com/material/1/'>foo</name>
<description>bar</description>
<price xmlns:ns1='http://predic8.com/common/1/'>
<amount>00.00</amount>
<currency xmlns:ns1='http://predic8.com/common/1/'>USD</currency>
</price>
<id xmlns:ns1='http://predic8.com/material/1/'>1</id>
</article>
<article xmlns:ns1='http://predic8.com/material/2/'>
<name>some name</name>
<description>some description</description>
<price xmlns:ns1='http://predic8.com/common/2/'>
<amount>00.01</amount>
<currency xmlns:ns1='http://predic8.com/common/2/'>USD</currency>
</price>
<id xmlns:ns1='http://predic8.com/material/2/'>2</id>
</article>
</ns1:create>
I am implemented in Java, although I would prefer an XSLT-based solution if one is possible.
PS: This question is the reverse of another question I recently asked.
This transformation creates from the "expressions" an XML document that has the structure of the wanted result -- it remains to transform this result into the final result:
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:my="my:my">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="vPop" as="element()*">
<item path="/create/article[1]/id">1</item>
<item path="/create/article[1]/description">bar</item>
<item path="/create/article[1]/name[1]">foo</item>
<item path="/create/article[1]/price[1]/amount">00.00</item>
<item path="/create/article[1]/price[1]/currency">USD</item>
<item path="/create/article[1]/price[2]/amount">11.11</item>
<item path="/create/article[1]/price[2]/currency">AUD</item>
<item path="/create/article[2]/id">2</item>
<item path="/create/article[2]/description">some name</item>
<item path="/create/article[2]/name[1]">some description</item>
<item path="/create/article[2]/price[1]/amount">00.01</item>
<item path="/create/article[2]/price[1]/currency">USD</item>
</xsl:variable>
<xsl:template match="/">
<xsl:sequence select="my:subTree($vPop/#path/concat(.,'/',string(..)))"/>
</xsl:template>
<xsl:function name="my:subTree" as="node()*">
<xsl:param name="pPaths" as="xs:string*"/>
<xsl:for-each-group select="$pPaths"
group-adjacent=
"substring-before(substring-after(concat(., '/'), '/'), '/')">
<xsl:if test="current-grouping-key()">
<xsl:choose>
<xsl:when test=
"substring-after(current-group()[1], current-grouping-key())">
<xsl:element name=
"{substring-before(concat(current-grouping-key(), '['), '[')}">
<xsl:sequence select=
"my:subTree(for $s in current-group()
return
concat('/',substring-after(substring($s, 2),'/'))
)
"/>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="current-grouping-key()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:if>
</xsl:for-each-group>
</xsl:function>
</xsl:stylesheet>
When this transformation is applied on any XML document (not used), the result is:
<create>
<article>
<id>1</id>
<description>bar</description>
<name>foo</name>
<price>
<amount>00.00</amount>
<currency>USD</currency>
</price>
<price>
<amount>11.11</amount>
<currency>AUD</currency>
</price>
</article>
<article>
<id>2</id>
<description>some name</description>
<name>some description</name>
<price>
<amount>00.01</amount>
<currency>USD</currency>
</price>
</article>
</create>
Note:
You need to transform the "expressions" you are given into the format used in this transformation -- this is easy and straightforward.
In the final transformation you need to copy every node "as-is" (using the identity rule), with the exception that the top node should be generated in the "http://predic8.com/wsdl/material/ArticleService/1/" namespace. Note that the other namespaces present in the "template" are not used and can be safely ommitted.
This solution requires you to re-organise your XPATH input information slightly, and to allow a 2-step transformation. The first transformation will write the stylesheet, which will be executed in the second transformation - Thus the client is required to do two invocations of the XSLT engine. Let us know if this is a problem.
Step One
Please re-organise your XPATH information into an XML document like so. It should not be difficult to do, and even an XSLT script could be written to do the job.
<paths>
<rule>
<match>article[1]/id[1]</match>
<namespaces>
<namespace prefix="ns1">http://predic8.com/wsdl/material/ArticleService/1/</namespace>
<!-- The namespace node declares a namespace that is used in the match expression.
There can be many of these. It is not required to define the s11: namespace,
nor the ns1 namespace. -->
</namespaces>
<replacement>1</replacement>
</rule>
<rule>
<match>article[1]/description[1]</match>
<namespaces/>
<replacement>bar</replacement>
</rule>
... etc ...
</paths>
Solution constraints
In the above rules document we are constrained so that:
The match is implicitly prefixed 'expression: /create/'. Don't put that explicitly.
All matches must begin like article[n] where n is some ordinal number.
We can't have zero rules.
Any prefixes that you use in the match, other than s11="http://schemas.xmlsoap.org/soap/envelope/" and ns1="http://predic8.com/wsdl/material/ArticleService/1/". (Note: I don't think it is valid for namespaces to end in '/' - but not sure about that), are defined in the namespaces node.
The above is the input document to the step one transformation. Apply this document to this style-sheet ...
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:step2="http://www.w3.org/1999/XSL/Transform-step2"
xmlns:s11="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:ns1="http://predic8.com/wsdl/material/ArticleService/1/"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes='xsl'>
<xsl:output method="xml" indent="yes" encoding="UTF-8" />
<xsl:namespace-alias stylesheet-prefix="step2" result-prefix="xsl"/>
<xsl:template match="/">
<step2:stylesheet version="2.0">
<step2:output method="xml" indent="yes" encoding="UTF-8" />
<step2:variable name="replicated-template" as="element()*">
<step2:apply-templates select="/" mode="replication" />
</step2:variable>
<step2:template match="#*|node()" mode="replication">
<step2:copy>
<step2:apply-templates select="#*|node()" mode="replication" />
</step2:copy>
</step2:template>
<step2:template match="/s11:Envelope/s11:Body/ns1:create/article" mode="replication">
<step2:variable name="replicant" select="." />
<step2:for-each select="for $i in 1 to
{max(for $m in /paths/rule/match return
xs:integer(substring-before(substring-after($m,'article['),']')))}
return $i">
<step2:for-each select="$replicant">
<step2:copy>
<step2:apply-templates select="#*|node()" mode="replication" />
</step2:copy>
</step2:for-each>
</step2:for-each>
</step2:template>
<step2:template match="#*|node()">
<step2:copy>
<step2:apply-templates select="#*|node()"/>
</step2:copy>
</step2:template>
<step2:template match="/">
<step2:apply-templates select="$replicated-template" />
</step2:template>
<xsl:apply-templates select="paths/rule" />
</step2:stylesheet>
</xsl:template>
<xsl:template match="rule">
<step2:template match="s11:Envelope/s11:Body/ns1:create/{match}">
<xsl:for-each select="namespaces/namespace">
<xsl:namespace name="{#prefix}" select="." />
</xsl:for-each>
<step2:copy>
<step2:apply-templates select="#*"/>
<step2:value-of select="'{replacement}'"/>
<step2:apply-templates select="*"/>
</step2:copy>
</step2:template>
</xsl:template>
</xsl:stylesheet>
Step Two
Apply your soap envelope file, as an input document, to the style-sheet which was output from step one. The result is the original soap document, altered as required. This is a sample of a step two style-sheet, with just the first rule (/create/article[1]/id => 1) being considered for the sake of simplicity of illustration.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:s11="http://schemas.xmlsoap.org/soap/envelope/"
version="2.0">
<xsl:output method="xml" indent="yes" encoding="UTF-8"/>
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template xmlns:ns1="http://predic8.com/wsdl/material/ArticleService/1/"
match="/s11:Envelope/s11:Body/ns1:create[1]/article[1]/id[1]">
<xsl:copy>
<xsl:apply-templates select="#*"/>
<xsl:value-of select="'1'"/>
<xsl:apply-templates select="*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
More solution constraints
The template document must contain at least one /s11:Envelope/s11:Body/ns1:create/article . Only the article node is replicated (deeply) as required by rules. Other than than it can be any structure.
The template document cannot contain nested levels of the s11:Envelope/s11:Body/ns1:create node.
Explanation
You will notice that your XPATH expressions are not far removed from a match condition of template. Therefore it is not too difficult to write a stylesheet which re-expresses your XPATH and replacement values as template rules. When writing a style-sheet writing style-sheet the xsl:namespace-alias enables us to disambiguate "xsl:" as an instruction and "xsl:" as intended output. When XSLT 3.0 comes along, we are quiet likely to be able to reduce this algorithm into one step, as it will allow dynamic XPATH evaluation, which is really the nub of your problem. But for the moment we must be content with a 2-step process.
The second style-sheet is a two-phase transformation. The first stage replicates the template from the article level, as many times as needed by the rules. The second phase parses this replicated template, and applies the dynamic rules substituting text values as indicated by the XPATHs.
UPDATE
My original post was wrong. Thanks to Dimitre for pointing out the error. Please find updated solution above.
After-thought
If a two-step solultion is too complicated, and you are running on a wintel platform, you may consider purchasing the commercial version of Saxon. I believe that the commercial version has a dynamic XPATH evaluation function. I can't give you such a solution because I don't have the commercial version. I imagine a solution using an evaluate() function would be a lot simpler. XSLT is just a hobby for me. But if you are using XSLT for business purposes, the price is quiet reasonable.

Categories