XSLT check if String is inside a set - java

I am trying to check if a String is contained within a set. I have an Excel sheet that I convert to an xml file; example:
[Image: Excel sheet on the left and the converted sheet (RowData.xml) on the right]
So I have an XML file where that set of numbers may or may not be present. For example, the source XML may look like this:
Source.xml:
<Data>
<Number>5556781234</Number>
<Number>5556781235</Number>
<Number>5556781236</Number>
</Data>
As you see it can stop anywhere. The source xml file may have all the numbers listed in RowData.xml or it may have only 1 or more. So my question is, how would I check for that in my xslt file?
I want to do this:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- This is the Excel sheet converted to an XML file -->
<xsl:param name="sheet-uri" select="'RowData.xml'"/>
<xsl:param name="sheet-doc" select="document($sheet-uri)"/>
<xsl:template match="Data">
<xsl:for-each select="Data/Number">
<xsl:variable name="continue" select="$sheet-doc//Sheet/Row[Number = current()]/Continue"/>
<xsl:if test="">
<!-- Check the Source.xml against the RowData.xml and
see if the set contains any "No"'s in it. -->
<!-- If it does then don't do the following -->
<Data2>
<Number><xsl:value-of select="Number"/></Number>
<Timestamp>125222</Timestamp>
</Data2>
</xsl:if>
</xsl:for-each>
</xsl:template>
So basically, before making the <Data2> element, check the numbers in Source.xml and see if any of those numbers have a value of No for the column Continue in RowData.xml. I don't know how to make the if statement above. I know there's a contains() function in xslt; however, I don't know how I can use it here.
Is this possible? Please let me know if anything was confusing. Thanks in advance!

check the numbers in Source.xml and see if any of those numbers have a value of No for the column Continue in RowData.xml.
You can take advantage of XPath's "existential equal" operator (the general comparison =) here:
test="doc('source.xml')/Data/Number =
$sheet-doc//Sheet/Row[Continue='No']/Number"
Essentially, if A and B are sets of values, then A = B returns true if some value in A is equal to some value in B.
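To make the "some value in A equals some value in B" rule concrete, here is a minimal Java sketch that computes the same answer by hand with the JDK's built-in XPath API. The class and method names are made up for illustration, and the RowData.xml layout (Sheet/Row with Number and Continue children) is assumed from the paths used in the question.

```java
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
final class ExistentialEquals {

    /** Evaluates an XPath against an XML string and returns the distinct string values. */
    static Set<String> values(String xml, String path) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(path, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            Set<String> result = new HashSet<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** True if some Number in the source also appears in a sheet Row whose Continue is "No". */
    static boolean anyBlocked(String sourceXml, String sheetXml) {
        Set<String> blocked = values(sheetXml, "//Sheet/Row[Continue='No']/Number");
        for (String number : values(sourceXml, "/Data/Number")) {
            if (blocked.contains(number)) {
                return true; // one match is enough, exactly like A = B on two sets
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String source = "<Data><Number>5556781234</Number><Number>5556781235</Number></Data>";
        // Assumed sheet layout; the real RowData.xml may have a different root element.
        String sheet = "<Sheet>"
                + "<Row><Number>5556781234</Number><Continue>Yes</Continue></Row>"
                + "<Row><Number>5556781235</Number><Continue>No</Continue></Row>"
                + "</Sheet>";
        System.out.println(anyBlocked(source, sheet));
    }
}
```

In the stylesheet itself the single test expression above does all of this; the loop is only there to show what the operator means.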

I would suggest you use the key mechanism - esp. if you're using XSLT 2.0.
Define a key as:
<xsl:key name="row" match="Row" use="Number" />
then do:
<xsl:template match="/Data">
<xsl:for-each select="Number[not(key('row', ., $sheet-doc))]">
<Data2>
<xsl:copy-of select="."/>
<Timestamp>125222</Timestamp>
</Data2>
</xsl:for-each>
</xsl:template>
This selects only Number elements that do not have a corresponding Row in the RowData.xml document.

Related

Restricting the letters used in a numbering generated by a xalan transformation

I am looking for a way to not have some letters in the numbering of chapters generated by xalan in an .xml to .fo transformation.
I am using org.apache.xalan.xsltc.trax.TransformerFactoryImpl to transform an .xml file into an .fo file to later make a PDF out of it. In the xml file I have some numbered chapters like so :
<prcitem2 numbering="9">
They are transformed in the .fo like so :
(This block is inside an fo:list-item-label, inside an fo:list-item, but I am on mobile and can't edit it properly. Sorry)
<fo:block>Й.</fo:block>
The xsl in charge of the transformation is :
<xsl:when test="ancestor-or-self::prcitem2">
<xsl:choose>
<xsl:when test="($language = 'ru')">
<xsl:number count="prcitem2" format="А."/>
</xsl:when>
</xsl:choose>
But my Russian comrades have informed me that some of their letters can't be used in numbering as it is not allowed by ATA and Russian standards (e.g. Й, З (that's not a 3) and some others).
Is there a way to forbid the use of these letters ?
As I mentioned in comments, I don't see a way to "fix" the built-in xsl:number algorithm and I suggest you replace it with your own.
In the following template, replace the alpha parameter's value with the Cyrillic characters you want to use. Everything else is self-adjusting.
Note that the input numbering is expected to start at zero, so call the template with the decimal parameter set to $your-number - 1.
<xsl:template name="dec-to-alpha">
<xsl:param name="decimal"/>
<xsl:param name="alpha" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="base" select="string-length($alpha)"/>
<xsl:variable name="bit" select="$decimal mod $base"/>
<xsl:variable name="char" select="substring($alpha, $bit + 1, 1)"/>
<xsl:variable name="next" select="floor($decimal div $base)"/>
<xsl:if test="$next">
<xsl:call-template name="dec-to-alpha">
<xsl:with-param name="decimal" select="$next - 1"/>
</xsl:call-template>
</xsl:if>
<xsl:value-of select="$char"/>
</xsl:template>
Demo: https://xsltfiddle.liberty-development.net/94rmq74
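If you would rather check the numbering outside the stylesheet, the same recursion can be sketched in Java. This is only an illustration: the class name is made up, and the sample alphabet is a hypothetical one (the Cyrillic capitals from U+0410 to U+041B with U+0417 and U+0419, the two letters named in the question, left out). Substitute the letters your standard actually allows.

```java
// Hypothetical helper mirroring the dec-to-alpha template; not part of Xalan.
final class DecToAlpha {

    /** Assumed sample alphabet: U+0410..U+041B without U+0417 and U+0419 (written as escapes). */
    static final String SAMPLE_ALPHABET =
            "\u0410\u0411\u0412\u0413\u0414\u0415\u0416\u0418\u041A\u041B";

    /** Converts a zero-based number into a label built only from the characters of alpha. */
    static String decToAlpha(int decimal, String alpha) {
        if (decimal < 0 || alpha.isEmpty()) {
            throw new IllegalArgumentException("need a non-negative number and a non-empty alphabet");
        }
        int base = alpha.length();
        char ch = alpha.charAt(decimal % base);   // same as $char in the template
        int next = decimal / base;                // same as floor($decimal div $base)
        // Recurse with next - 1, as the template does, so labels run A..Z, AA, AB, ...
        return (next > 0 ? decToAlpha(next - 1, alpha) : "") + ch;
    }

    public static void main(String[] args) {
        // The numbering attribute is one-based, so subtract 1 before converting.
        for (int numbering = 1; numbering <= 12; numbering++) {
            System.out.println(numbering + " -> " + decToAlpha(numbering - 1, SAMPLE_ALPHABET));
        }
    }
}
```

With the 26-letter Latin default, 0 gives A, 25 gives Z and 26 gives AA, which is the behaviour of the template above.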

XSLT gathering data

I have a simple issue that I can't really find a workaround to and I need your help.
The main problem is that while processing an input XML there are various places where I need to "gather" information. This means all I really have to do is call a special template with parameters like so:
<xsl:template name="append-section">
<xsl:param name="id" />
<xsl:param name="title" />
<!-- more code here -->
</xsl:template>
Let's say this template is called 12 times during the transformation. At the end of the conversion I want to write this data to a file.
I tried to append this data to a global variable and then write the result to the file, only to realise that variables in XSLT are not really variable. This solution did not work.
The second solution was to use xsl:result-document with one temp file, each time copying the previous content of the file into itself and then appending the new data, something like this:
<xsl:template name="append-section">
<xsl:param name="id" />
<xsl:param name="title" />
<xsl:result-document method="html" href="tmp/tmp.html">
<xsl:value-of select="document(tmp.html)" />
<xsl:element name="li">
<xsl:element name="a">
<xsl:attribute name="class">
<xsl:value-of select="'so-dropdown-page-menu-list-button'" />
</xsl:attribute>
<xsl:attribute name="href">
<xsl:value-of select="'#'" />
<xsl:value-of select="$id" />
</xsl:attribute>
<xsl:value-of select="$title" />
</xsl:element>
</xsl:element>
</xsl:result-document>
</xsl:template>
This code might not be perfect, but unfortunately I had to realise that the following exception was thrown:
Cannot write more than one result document to the same URI
This solution also seems to be invalid.
So my question is this: How can I implement this simple issue? Gather the data from various places and write them to a file at the end of the transformation.
I use Saxon.
You need to structure your code according to the structure of the output, not the structure of the input. Don't try to do things as you encounter information in the input; do them when you need to generate the relevant piece of the output.
There are cases when this can seem inefficient because it means visiting the same input more than once. Usually these inefficiencies will prove apparent rather than real. But the first thing is to get the transformation working; if it's not fast enough you can come back to us with another question.
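As a rough illustration of "structure the code around the output", the sketch below builds the whole menu list in one place by selecting every section from the input, instead of appending to it from twelve call sites. The input shape (section elements with id and title attributes) and all names are assumptions, since the question does not show the source document. It uses an XSLT 1.0 stylesheet run through the standard javax.xml.transform API so it works without Saxon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; the section/@id/@title input shape is assumed.
final class MenuFromInput {

    /** One template produces the complete list; nothing is accumulated along the way. */
    static final String XSLT =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
          + "<xsl:template match='/'>"
          + "<ul>"
          + "<xsl:for-each select='//section'>"
          + "<li><a class='so-dropdown-page-menu-list-button' href='#{@id}'>"
          + "<xsl:value-of select='@title'/>"
          + "</a></li>"
          + "</xsl:for-each>"
          + "</ul>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Runs the stylesheet over the given XML and returns the serialized menu. */
    static String buildMenu(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String input = "<doc>"
                + "<section id='intro' title='Introduction'>"
                + "<section id='scope' title='Scope'/>"
                + "</section>"
                + "<section id='usage' title='Usage'/>"
                + "</doc>";
        System.out.println(buildMenu(input));
    }
}
```

In an XSLT 2.0 stylesheet the same xsl:for-each could sit inside a single xsl:result-document, which is written exactly once and so avoids the "same URI" error.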

Check xml tags for certain values and transform - Best practice

I want to do the following :
At this moment we receive some xml-files where some xml-tags are filled wrongly.
To help our partner, we want to catch these false values by using a "Pass-through" folder where all the xml-files are placed before importing in our application.
This folder would be read every X minutes and for every file there will need to be done some checks, like : The length of the value within a tag, the value of the tag, etc.
Because this is only a temporary solution, we don't want to implement it in our application.
I was thinking of 2 possible set-ups :
Using java and calling an XSLT-file to transform every file and put it in another folder
Using only java to check the xml-file and do the transformation.
Both of the cases would be called by a .bat that runs every X minutes.
Now my questions :
What do you think that would be the best solution? a.k.a. the quickest, the most secure, etc. (maybe something other than suggested?)
Could you also provide me some examples of the way to do something like this?
I'm not like other persons who ask strictly for the codes. If you can give me something similar, I can make it on my own.
At the time of this writing, I'm already looking for solutions on other websites, but because it is urgent, it's also helpful to ask the community.
Thank you for your answer,
Kind regards,
Maarten
EDIT : Both answers helped me a lot. Thank you guys.
http://www.ibm.com/developerworks/xml/library/x-javaxmlvalidapi/index.html
or
http://www.java-tips.org/java-se-tips/javax.xml.validation/how-to-create-xml-validator-from-xml-s.html
or
http://docs.oracle.com/javase/1.5.0/docs/api/javax/xml/validation/package-summary.html
If you want to run your XSLT, using a .bat script, on every XML file in a given folder (your first option in the OP) I can think of 3 ways:
A. Basically do a "for" loop to process each individual file via the command line. (Eww.)
B. Use collection() to point to an input folder and use xsl:result-document to create the output files in a new folder.
Here's an example XSLT 2.0 (tested with Saxon 9):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="pInputDir" select="'input'"/>
<xsl:param name="pOutputDir" select="'output'"/>
<xsl:variable name="vCollection" select="collection(concat($pInputDir,'/?*.xml'))"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/">
<xsl:for-each select="$vCollection">
<xsl:variable name="vOutFile" select="tokenize(document-uri(.),'/')[last()]"/>
<xsl:result-document href="{concat($pOutputDir,'/',$vOutFile)}">
<xsl:apply-templates/>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Notes:
This stylesheet is just doing an identity transform. It's passing the XML through unchanged. You would need to override the identity template by adding new templates to do your checks/changes.
Also notice that there are 2 parameters for the input and output folder names.
You may run into memory issues using collection() because it loads all of the XML files in the folder into memory. If this is an issue, see below...
C. Have your XSLT process a list of all the files in the directory. Use a combination of document() and the Saxon extension function saxon:discard-document() to load and discard the documents.
Here's an example I used a while back for testing.
XML file listing (input to the XSLT):
<files>
<file>file:///C:/input_xml/file1.xml</file>
<file>file:///C:/input_xml/file2.xml</file>
<file>file:///C:/input_xml/file3.xml</file>
<file>file:///C:/input_xml/file4.xml</file>
<file>file:///C:/input_xml/file5.xml</file>
<file>file:///C:/input_xml/file6.xml</file>
<file>file:///C:/input_xml/file7.xml</file>
<file>file:///C:/input_xml/file8.xml</file>
<file>file:///C:/input_xml/file9.xml</file>
<file>file:///C:/input_xml/file10.xml</file>
</files>
XSLT 2.0 (tested with Saxon 9):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="pOutputDir" select="'output'"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="files">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="file">
<xsl:variable name="vOutFile" select="tokenize(document-uri(document(.)),'/')[last()]"/>
<xsl:result-document href="{concat($pOutputDir,'/',$vOutFile)}">
<xsl:apply-templates select="document(.)/saxon:discard-document(.)" xmlns:saxon="http://saxon.sf.net/"/>
</xsl:result-document>
</xsl:template>
</xsl:stylesheet>
Notes:
Again, this stylesheet is just doing an identity transform. It's passing the XML through unchanged. You would need to override the identity template by adding new templates to do your checks/changes.
Also notice that there is only a parameter for the output folder name.
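For the "Java calling an XSLT file" set-up from the question, option A (one transformation per file) can also be driven from Java instead of a .bat loop. The following is a minimal sketch using only the JDK's javax.xml.transform API; the class name and the three command-line arguments are assumptions. The built-in JDK processor handles XSLT 1.0 only, so an XSLT 2.0 stylesheet would need Saxon on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical driver class for a pass-through folder.
final class BatchTransform {

    /** Applies one stylesheet to every *.xml file in inputDir; returns the number of files written. */
    static int transformAll(Path stylesheet, Path inputDir, Path outputDir) {
        try {
            // Compile the stylesheet once and reuse it for every file.
            Templates templates = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(stylesheet.toFile()));
            Files.createDirectories(outputDir);
            int count = 0;
            try (DirectoryStream<Path> files = Files.newDirectoryStream(inputDir, "*.xml")) {
                for (Path file : files) {
                    Path target = outputDir.resolve(file.getFileName());
                    // Each file is parsed, transformed and released before the next one is read.
                    templates.newTransformer().transform(
                            new StreamSource(file.toFile()), new StreamResult(target.toFile()));
                    count++;
                }
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed arguments: stylesheet path, input folder, output folder.
        int written = transformAll(Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2]));
        System.out.println(written + " file(s) transformed");
    }
}
```

Because only one document is in memory at a time, this avoids the collection() memory concern, at the cost of keeping the file loop outside the stylesheet.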

Trying to print out node values in XSLT using variable elements names

So here's a problem that's been bugging me for the last few days. It should be fairly easy, but XSLT is just such a pain to debug. We're using Xalan 1.0 on java 1.6
Input XML
<?xml version="1.0" encoding="UTF-8"?>
<rfb2>
<rfb2_item>
<VALDATE>2011-10-23</VALDATE>
<FUND_ID>300</FUND_ID>
<SEC_ID>34567</SEC_ID>
</rfb2_item>
<rfb2_item>
<VALDATE>2011-1-09</VALDATE>
<FUND_ID>700</FUND_ID>
<SEC_ID>13587</SEC_ID>
</rfb2_item>
<rfb2_item>
<VALDATE>2011-3-09</VALDATE>
<FUND_ID>200</FUND_ID>
<SEC_ID>999334</SEC_ID>
</rfb2_item>
</rfb2>
We need to transform the XML into a comma-separated list of values for each rfb2_item, so the style sheet always iterates the rfb2_item nodes. We are using a parameter in the style sheet to control which elements of rfb2_item (valdate,fund_id,sec_id) that will be output, and in what order, for example
<xsl:param name="outputElements" select="'VALDATE,FUND_ID'"/>
..outputs...
2011-10-23,300
2011-1-09,700
2011-3-09,200
<xsl:param name="outputElements" select="'SEC_ID'"/>
..outputs...
34567
13587
999334
Special case where if $outputElements is '*', just output the elements in the order they appear in the input xml
<xsl:param name="outputElements" select="'*'"/>
..outputs...
2011-10-23,300,34567
2011-1-09,700,13587
2011-3-09,200,999334
So, my question is how do we write a template to create the desired output based on the $outputElements parameter? A working example would be great...
Yup, FailedDev is right. Someone would write it for you:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />
<xsl:param name="outputElements" select=" 'FUND_ID,SEC_ID,VALDATE' " />
<xsl:template match="rfb2_item">
<xsl:for-each select="*[contains($outputElements, local-name()) or $outputElements = '*']">
<xsl:sort select="string-length(substring-before($outputElements, local-name(.)))" data-type="number" />
<xsl:value-of select="text()" />
<xsl:if test="position() != last()">
<xsl:text>,</xsl:text>
</xsl:if>
</xsl:for-each>
<xsl:text>&#13;&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
Bit of explanation. The xsl:for-each is gonna select each element in the current rfb2_item for which the local name is contained in the outputElements parameter, or for which the outputElements parameter is * (which would always yield true if that's the case). It's then gonna sort those based on the length of the substring that goes before that local name in outputElements. Since this value becomes higher when the name occurs later in that parameter, this results in ordering based on your parameter.
Example: for element VALDATE, the substring-before function would yield FUND_ID,SEC_ID, (including the trailing comma), which in turn would yield 15 as string length. This is higher than the 8 that you'd get for SEC_ID, meaning the VALDATE value is ordered after SEC_ID.
After the xsl:sort, we're simply using xsl:value-of to output the element value. You might want to trim extraneous whitespace there. Finally, we're testing if the position is not equal to that of the last node in the current context (which is that of xsl:for-each after sorting) and if so, output a comma. This avoids outputting a comma after the last value.
The line break I've inserted using xsl:text assumes the Windows/DOS convention. Remove the carriage return (&#13;) if the file should only use new line characters for line breaks, instead of carriage return + new line.
Note that this does not escape commas in your CSV output! I'll leave that up to you. It could be interesting to look into using extension functions for delegating this task to Java if it proves too difficult in XSLT/XPath.
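To try the parameter-driven ordering from Java, something like the harness below could be used. It is a sketch under a few assumptions: the class and method names are invented, the stylesheet is a compacted copy of the one above that emits a bare line feed, and the sort uses data-type='number'. The last point matters because xsl:sort compares keys as text by default, and as text "15" sorts before "8".

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical harness around the stylesheet from the answer.
final class CsvColumns {

    static final String XSLT =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:param name='outputElements' select=\"'*'\"/>"
          + "<xsl:template match='rfb2_item'>"
          + "<xsl:for-each select=\"*[contains($outputElements, local-name()) or $outputElements = '*']\">"
          // Sort key: how many characters precede this element's name in the parameter.
          + "<xsl:sort select='string-length(substring-before($outputElements, local-name()))'"
          + " data-type='number'/>"
          + "<xsl:value-of select='.'/>"
          + "<xsl:if test='position() != last()'>,</xsl:if>"
          + "</xsl:for-each>"
          + "<xsl:text>&#10;</xsl:text>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Renders the rfb2 document as CSV lines, with columns chosen by outputElements. */
    static String render(String xml, String outputElements) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setParameter("outputElements", outputElements);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Input written without whitespace between elements, so only the item lines are emitted.
        String xml = "<rfb2>"
                + "<rfb2_item><VALDATE>2011-10-23</VALDATE><FUND_ID>300</FUND_ID><SEC_ID>34567</SEC_ID></rfb2_item>"
                + "<rfb2_item><VALDATE>2011-1-09</VALDATE><FUND_ID>700</FUND_ID><SEC_ID>13587</SEC_ID></rfb2_item>"
                + "</rfb2>";
        System.out.print(render(xml, "FUND_ID,SEC_ID,VALDATE"));
        System.out.print(render(xml, "*"));
    }
}
```

Keep in mind that contains() matches substrings, so an element named ID would also be selected by a parameter value of FUND_ID; exact matching would need delimiters around each name.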
Sometimes in this kind of situation it's worth looking at the possibility of generating or modifying XSLT code using XSLT. You can take the parameterization a lot further that way - for example controlling which fields are output, how they are sorted, whether they are grouped, selection criteria for which rows are selected, etc etc.

How to match and process unknown XML elements in XSLT 1.0?

I have a simple XSLT 1.0 stylesheet that turns XML documents into XHTML. I really want to be able to "include" the content of an XML file in another when needed. AFAIK it is simply not possible in XSLT 1.0, so I decided to move my processing to a simple Java app that would pre-process the XML, executing the "includes" recursively, and passing it to the default JDK XSLT processor. I have an XML schema that my documents must conform to.
The most used element is called "text", and can have an "id" and/or a "class" attribute, which gets used for XHTML styling with CSS. This element gets turned into "p", "div", or "span" depending on the context.
What I would like to add is the ability to define "unknown" elements in my input files, and have them transformed into a "text" element for further processing. If the "unknown" element's name starts with a capital letter, then it becomes a "text", with "id" set to the original name. Otherwise it becomes a "text" with "class" set to the original name. Everything else in the unknown element should be kept as-is, and then it should be processed by XSLT as if it was originally in the input file. In other words, I would like to transform all unknown elements to form a valid XML document, and then process it with my stylesheet.
Can this be done in XSLT, possibly in a pre-processing "stylesheet", or should I do that as pre-processing in Java? Performance here is not important. I would prefer an XSLT solution, but not if it's much more complicated than doing it in Java.
Well, since no one answered, I just tried it. While it is easier to do it in Java, it has one major drawback: since the code needs to know the valid elements so that it recognizes the unknown ones, you end up having to hardcode that in your code and have to recompile it if the XSLT template changes.
So, I tried in XSLT and it also works. Let's say you have:
<xsl:template match="text">
*processing*
<xsl:call-template name="id_and_class"/>
*processing*
</xsl:template>
where the template named id_and_class copies your id and class attributes into the generated element, and you want unknown elements to be mapped to "text" elements, then you can do this:
<xsl:template match="text">
<xsl:call-template name="text_processing"/>
</xsl:template>
<xsl:template name="text_processing">
*processing*
<xsl:call-template name="text_id_and_class"/>
*processing*
</xsl:template>
...
<xsl:template name="text_id_and_class">
<xsl:choose>
<!-- If name() is not "text", then we have an unknown element. -->
<xsl:when test="name()!='text'">
<!-- Processing of ID and class omitted ... -->
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="id_and_class"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
...
<!-- MUST BE LAST : Process unknown elements like a "text" element. -->
<xsl:template match="*">
<xsl:call-template name="text_processing"/>
</xsl:template>
If you process the content of one specific element with a named template, then you can check in that template if the name matches, and use that for your special processing. Then you just have to put a <xsl:template match="*"> at the end of your stylesheet and call the named template from there.
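The pre-processing variant asked about in the question (rename unknown elements to "text" first, then run the real stylesheet) could be sketched as follows. Everything here is illustrative: the class name is invented, the list of known elements is reduced to a hypothetical doc and text, and only ASCII capitals count as "capital letter". The stylesheet is XSLT 1.0, so the default JDK processor can run it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical pre-processing pass; "doc" and "text" stand in for the real schema's element names.
final class UnknownToText {

    static final String XSLT =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
          // Identity template: known elements, attributes and text are copied unchanged.
          + "<xsl:template match='@*|node()'>"
          + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
          + "</xsl:template>"
          // The predicate gives this template a higher default priority than the identity template.
          + "<xsl:template match='*[not(self::doc or self::text)]'>"
          + "<text>"
          + "<xsl:choose>"
          + "<xsl:when test=\"contains('ABCDEFGHIJKLMNOPQRSTUVWXYZ', substring(local-name(), 1, 1))\">"
          + "<xsl:attribute name='id'><xsl:value-of select='local-name()'/></xsl:attribute>"
          + "</xsl:when>"
          + "<xsl:otherwise>"
          + "<xsl:attribute name='class'><xsl:value-of select='local-name()'/></xsl:attribute>"
          + "</xsl:otherwise>"
          + "</xsl:choose>"
          + "<xsl:apply-templates select='@*|node()'/>"
          + "</text>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Returns the input with every unknown element renamed to text (id or class = old name). */
    static String normalize(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize(
                "<doc><Intro>Hello</Intro><note>n</note><text>t</text></doc>"));
    }
}
```

The list of known names still lives in one match pattern, but it sits in the stylesheet rather than in compiled Java code, which was the drawback mentioned above.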
