XSLT gathering data - java

I have a simple issue that I can't really find a workaround to and I need your help.
The main problem is that while processing an input XML there are various places where I need to "gather" information. This means all I really have to do is call a special template with parameters, like so:
<xsl:template name="append-section">
<xsl:param name="id" />
<xsl:param name="title" />
<!-- more code here -->
</xsl:template>
Let's say this template is called 12 times during the XSLT procedure. At the end of the conversion I want to write this data to a file.
I tried to append this data to a global variable and then write the result to the file, only to realise that variables are not really variables in XSLT. This solution did not work.
The second solution was to use xsl:result-document with one temp file. The idea was to always copy the previous content of the file back into itself while also appending the new data, something like this:
<xsl:template name="append-section">
<xsl:param name="id" />
<xsl:param name="title" />
<xsl:result-document method="html" href="tmp/tmp.html">
<xsl:value-of select="document(tmp.html)" />
<xsl:element name="li">
<xsl:element name="a">
<xsl:attribute name="class">
<xsl:value-of select="'so-dropdown-page-menu-list-button'" />
</xsl:attribute>
<xsl:attribute name="href">
<xsl:value-of select="'#'" />
<xsl:value-of select="$id" />
</xsl:attribute>
<xsl:value-of select="$title" />
</xsl:element>
</xsl:element>
</xsl:result-document>
</xsl:template>
This code might not be perfect, but unfortunately I found that the following exception was thrown:
Cannot write more than one result document to the same URI
This solution also seems to be invalid.
So my question is this: How can I solve this simple issue? I need to gather the data from various places and write it to a file at the end of the transformation.
I use Saxon.

You need to structure your code according to the structure of the output, not the structure of the input. Don't try to do things as you encounter information in the input; do them when you need to generate the relevant piece of the output.
There are cases when this can seem inefficient because it means visiting the same input more than once. Usually these inefficiencies will prove apparent rather than real. But the first thing is to get the transformation working; if it's not fast enough you can come back to us with another question.
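As a rough illustration of the advice above, here is a minimal sketch that builds the whole menu in one place by selecting every section, rather than appending an entry each time a section is visited. The input shape (section elements with id and title attributes), the class name MenuBuilder and the method buildMenu are assumptions made for the example, not taken from the question. It uses XSLT 1.0 so that it runs on the JDK's built-in processor; under Saxon the same for-each could sit inside a single xsl:result-document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class MenuBuilder {
    // Hypothetical stylesheet: the list is generated where the output needs it,
    // by pulling all sections from the input in a single for-each.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/'>"
      + "<ul>"
      + "<xsl:for-each select='//section'>"
      + "<li><a class='so-dropdown-page-menu-list-button' href='#{@id}'>"
      + "<xsl:value-of select='@title'/>"
      + "</a></li>"
      + "</xsl:for-each>"
      + "</ul>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet over the given XML text and returns the generated HTML.
    static String buildMenu(String inputXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(inputXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String input = "<doc><section id='intro' title='Introduction'>"
                     + "<section id='usage' title='Usage'/></section></doc>";
        System.out.println(buildMenu(input));
    }
}
```

The stylesheet visits the sections a second time if the main output also processes them, which is the "apparent" inefficiency mentioned above.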

Related

markup specific strings in Xml

I would like to mark up some strings in an XML document.
For example, I have:
<p> I like to go to Florida </p>
I need to tag the string "go" and have the output as:
<p> I like to <something>go</something> to Florida</p>
What is the best way to do this? I am using Java. I need to treat the XML file as XML not as text. I found some solutions that treat an xml file as a text file and use string.replace but I do not think those are good solutions.
Any suggestion is much appreciated.
Thank you,
Try an XSLT 2.0 transformation like this:
<xsl:template match="@*|*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:analyze-string select="." regex="go">
<xsl:matching-substring>
<something><xsl:value-of select="."/></something>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
You can of course extend the regular expression, e.g. regex="go|come|walk|run"; if you only want to match whole words, you might want to use tokenize() to split it into words and process each word separately.
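Since the question mentions Java, here is a sketch of the whole-word case done with the DOM and java.util.regex instead of xsl:analyze-string. This is a swapped-in technique, not the XSLT approach from the answer above. The class name WordTagger and its methods are invented for the example; the word boundary \b comes from Java's regex dialect.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class WordTagger {
    // Parses the XML, wraps every whole-word occurrence of `word` found in
    // text nodes in a <tagName> element, and serializes the result.
    static String tagWord(String xml, String word, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
            // Collect first, then modify, so the tree is not changed while walking it.
            List<Text> texts = new ArrayList<>();
            collectTextNodes(doc.getDocumentElement(), texts);
            for (Text text : texts) {
                wrapMatches(doc, text, pattern, tagName);
            }
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static void collectTextNodes(Node node, List<Text> out) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.TEXT_NODE) {
                out.add((Text) c);
            } else {
                collectTextNodes(c, out);
            }
        }
    }

    static void wrapMatches(Document doc, Text text, Pattern pattern, String tagName) {
        Text rest = text;
        Matcher m = pattern.matcher(rest.getData());
        while (m.find()) {
            Text match = rest.splitText(m.start());             // rest keeps the text before the match
            Text after = match.splitText(m.end() - m.start());  // match keeps only the matched word
            Element wrapper = doc.createElement(tagName);
            match.getParentNode().replaceChild(wrapper, match);
            wrapper.appendChild(match);
            rest = after;                                       // continue in the remaining text
            m = pattern.matcher(rest.getData());
        }
    }

    public static void main(String[] args) {
        System.out.println(tagWord("<p> I like to go to Florida </p>", "go", "something"));
    }
}
```

Because the pattern is anchored on word boundaries, words such as "gopher" or "ago" are left alone, which a plain regex="go" would not guarantee.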

Validate XML against XSD and enrich with validation results

I'm building a java tool to validate an xml document and build an html report containing the input data and validation results.
I think that a possible way is:
1. validate the XML against an XSD
2. enrich the XML with validation results
3. transform the enriched XML into the final HTML report (this point is not the subject of the question)
First and foremost, is this a valid approach? Or are there more suitable ways to get these things done in Java?
If this is a viable solution, how can I implement step 2?
For example, if I start from this input document:
<parent>
<child id="a correct id" type="a correct type"/>
<child id="an incorrect id" type="an incorrect type"/>
</parent>
How can I produce an enriched output document like this:
<parent>
<child id="a correct id" type="a correct type">
<results>
<result>id is correct</result>
<result>type is correct</result>
</results>
</child>
<child id="an incorrect id" type="an incorrect type">
<results>
<result>id is NOT correct</result>
<result>type is NOT correct</result>
</results>
</child>
</parent>
First, there are many ways of going about this. There are other tools like Schematron that provide languages for describing validation results, and the ability to transform the results of validation into pretty HTML. There are numerous Java packages that actually do schema validation, so most of what you're trying to accomplish should be "glue code". Make sure you don't attempt to implement schema validation yourself in your Java code.
So next, I'm not sure what your requirements are for wanting to transform the original XML file after validation. Usually you'd dump a validation result set as a separate file. Does the schema for the original XML permit your additions that you're putting in?
In general, if you wanted to transform the original input, you could go about this by writing an XSLT program that takes the validation results file, and the original source file, and then transforms the original file using those validation results. But I don't recommend that because I think your situation might call for a different design that doesn't transform the original file, unless you have more requirements you want to go into more depth about.
Another option would be straightforward DOM manipulation. After validation, you could load the DOM for the input document, manipulate it, then write it back to the same original file.
But seriously -- before you adopt any approach for step 2, make sure that your requirements really call for it.
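To give an idea of how small the "glue code" can be, here is a sketch that validates a document with the JDK's javax.xml.validation API and collects the messages into a separate result list instead of touching the input. The schema in SAMPLE_XSD, the class name ValidationReport and the message format are assumptions for illustration; the real rules would come from your own XSD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

class ValidationReport {
    // Hypothetical schema: every child needs an id and a type of "A" or "B".
    static final String SAMPLE_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='parent'><xs:complexType><xs:sequence>"
      + "<xs:element name='child' maxOccurs='unbounded'><xs:complexType>"
      + "<xs:attribute name='id' type='xs:string' use='required'/>"
      + "<xs:attribute name='type' use='required'><xs:simpleType>"
      + "<xs:restriction base='xs:string'>"
      + "<xs:enumeration value='A'/><xs:enumeration value='B'/>"
      + "</xs:restriction></xs:simpleType></xs:attribute>"
      + "</xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns one message per validation problem; an empty list means valid.
    // Malformed XML (a fatal error) is rethrown as IllegalStateException.
    static List<String> validate(String xsd, String xml) {
        List<String> problems = new ArrayList<>();
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // Collect errors instead of stopping at the first one.
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { problems.add(format(e)); }
                @Override public void error(SAXParseException e) { problems.add(format(e)); }
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    static String format(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }

    public static void main(String[] args) {
        String xml = "<parent><child id='1' type='A'/><child id='2' type='Z'/></parent>";
        validate(SAMPLE_XSD, xml).forEach(System.out::println);
    }
}
```

The returned list can be written out as a separate result file, or fed into a later transformation together with the original document.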
One approach worth exploring: Xerces-J provides access to the post-schema-validation infoset (PSVI), and can in fact serialize it as XML. For small documents, at least, you may find that XML representation of the PSVI suffices for your purposes.
The PSVI representation made available by Xerces-J (and by xsv) is not, it should be said, anything like an annotated copy of the input. But it can be transformed into a form like the one you show using normal XML processing.
I'm returning to this question after I got some deeper understanding and experience of XSD and XSLT and eventually built my project on that knowledge.
My original question had some misleading points.
My objective was to process an XML with the ONLY objective to produce a corresponding HTML report, containing the XML data in a readable form along with the "validation" results against a set of business rules.
My wrong assumption was that I should necessarily have to validate the XML against an XSD, but that brought me some significant challenges in development:
XSD 1.0 did not fit completely my business rules. So I had to switch to XSD 1.1
then I had to set up an XSD 1.1 compliant validator in java (Xerces2J)
and finally I started thinking on how to build a fine-grained validator based on those premises.
It was at that time that I realized that this process was a little overkill: what I really needed was just a TRANSFORMATION from XML to HTML: the fine-grained validation could have been done inside the transformation process, all together with an XSLT.
To answer my own question in the most generic form:
I would still use XSD for a very basic preliminary validation, and then use XSLT to check for more complex validation rules and enrich the XML.
This is the XSLT to transform the source xml of my own question into the result (still XML).
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="parent">
<!-- copy the parent element itself so the output keeps a single root -->
<xsl:copy>
<xsl:for-each select="child">
<xsl:call-template name="processChildren"/>
</xsl:for-each>
</xsl:copy>
</xsl:template>
<!-- this template processes the children nodes, applying a sample test clause -->
<xsl:template name="processChildren">
<xsl:copy>
<xsl:apply-templates select="@*|node()" />
<results>
<xsl:choose>
<xsl:when test="contains(@id,'incorrect')">
<result>id is NOT correct</result>
</xsl:when>
<xsl:otherwise>
<result>id is correct</result>
</xsl:otherwise>
</xsl:choose>
<xsl:choose>
<xsl:when test="contains(@type,'incorrect')">
<result>type is NOT correct</result>
</xsl:when>
<xsl:otherwise>
<result>type is correct</result>
</xsl:otherwise>
</xsl:choose>
</results>
</xsl:copy>
</xsl:template>
<!-- this template copies the contents of the node unaltered -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

Converting an XML serialization back to java code

I know this question might seem a little odd, but I want to make sure.
One of my superiors is kind of convinced that there is a more or less easy way to convert the XML serialization of an object back to the java code that created it. I am, however, pretty sure that the best I can get is the object.
So basically my question is: Is there any way to convert something like this
<java version="1.6.0_10" class="java.beans.XMLDecoder">
<object class="javax.swing.JPanel">
<void property="size">
<object class="java.awt.Dimension">
<int>42</int>
<int>23</int>
</object>
</void>
</object>
</java>
back to something like
JPanel jPanel = new JPanel();
jPanel.setSize(42,23);
Thanks in advance.
Provided that all serialized objects comply with the JavaBeans contract, you can re-create the process that the XML de-serializer follows to unmarshal the Java objects, in order to recreate the code that goes with it.
Back in the golden XML days, I worked on some projects that used similar processes to generate Java code from XML definitions.
Starting from your serialized model, you can use an XSLT transformation to recreate the code that led to the serialized objects. This process will create very linear code (as in non-modular), but you'll have what you're looking for.
An example to get you started: To process the XML you provided, you can use the following recursive transformation: copy/paste it & try it here: online XSL-T (the template is based on Xpath 1.0 to be able to use the online tool. Xpath 2.0 will improve the code in some areas, like string functions)
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xf="http://www.w3.org/2005/xpath-functions">
<xsl:template match="object">
<xsl:call-template name="objectClass" /> <xsl:value-of select="string(' ')" />
<xsl:call-template name="objectNodeName" />
= new <xsl:call-template name="objectClass" />(<xsl:call-template name="objectParams" />);
<xsl:for-each select="*[@property]">
<xsl:apply-templates />
<xsl:call-template name="setProperty" />
</xsl:for-each>
</xsl:template>
<xsl:template match="/" >
<xsl:apply-templates />
</xsl:template>
<xsl:template match="text()" />
<xsl:template name="objectNodeName">
<xsl:param name="node" select="." />
<xsl:value-of select="translate($node/@class,'.','_')" />_<xsl:value-of select="count($node/ancestor-or-self::*)" />
</xsl:template>
<xsl:template name="setProperty">
<xsl:call-template name="objectNodeName" > <xsl:with-param name="node" select="parent::node()"/></xsl:call-template>
.set<xsl:call-template name="capitalize"><xsl:with-param name="str" select="@property"/></xsl:call-template>(<xsl:call-template name="objectNodeName" > <xsl:with-param name="node" select="node()"/></xsl:call-template>);
</xsl:template>
<xsl:template name="objectClass">
<xsl:param name="fqn" select="@class" />
<xsl:value-of select="$fqn" />
</xsl:template>
<xsl:template name="objectParams">
<xsl:for-each select="*[not(child::object)]">
<xsl:if test="position() > 1">,</xsl:if><xsl:value-of select="." />
</xsl:for-each>
</xsl:template>
<xsl:variable name="smallcase" select="'abcdefghijklmnopqrstuvwxyz'" />
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'" />
<xsl:template name="capitalize">
<xsl:param name="str" select="." />
<xsl:value-of select="concat(translate(substring($str,1,1),$smallcase,$uppercase),substring($str,2))" />
</xsl:template>
</xsl:stylesheet>
Disclaimer: I tested the template on the sample provided and some variations of it, including some containing several more objects. I did not test deeper object nesting. It's an example and not a fully-functional XML Serialization to Java transformation, which is left as an exercise to the reader :-)
Yes, I think there are several ways to implement this. First of all you can use JAXB technology, read about it http://www.oracle.com/technetwork/articles/javase/index-140168.html#xmp1.
Second way: you always can read xml in runtime (DOM, SAX) and create objects dynamically using reflection.
I don't think this is possible because if object class can be everything, how would you know what method to call to set size x to 42? Maybe there is a setter for this, maybe just a constructor or the number was calculated somehow.
The only possibility I can imagine is through the use of reflection, that's more or less the same frameworks like XStream do. So you can create the same object, but not the same code that was originally used to create it.
It is not that difficult. I am not sure how this would work with Swing class like JPanel, but since it is a Java bean, it should not be a problem to use some kind of XML library like XStream, which is one of the easiest way of how to do such things. Or you could use more verbose JAXB or XML Beans.
Edit: I'm sorry I didn't notice that there already is an XMLDecoder mentioned, and there seems to be an article on how to 'read JavaBean from XML file using XMLDecoder'.
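For reference, reading such a document back into an object with java.beans.XMLDecoder takes only a few lines. The sketch below uses a small hypothetical document in the same format (a java.util.ArrayList rather than the JPanel from the question, so that it runs without a GUI toolkit); the class name DecodeDemo and the SAMPLE constant are made up for the example. Note that it yields the object, not the source code that originally built it.

```java
import java.beans.XMLDecoder;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Hypothetical XMLEncoder-style document describing a list with two elements.
    static final String SAMPLE =
        "<java version='1.6.0_10' class='java.beans.XMLDecoder'>"
      + "<object class='java.util.ArrayList'>"
      + "<void method='add'><string>first</string></void>"
      + "<void method='add'><int>42</int></void>"
      + "</object>"
      + "</java>";

    // Decodes the first object stored in the given XML text.
    static Object decode(String xml) {
        try (XMLDecoder decoder = new XMLDecoder(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))) {
            return decoder.readObject();
        }
    }

    public static void main(String[] args) {
        Object decoded = decode(SAMPLE);
        System.out.println(decoded.getClass().getName() + " " + decoded);
    }
}
```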
I suspect that the superior's belief was poorly articulated and/or poorly understood. In particular, "the XML serialization of an object" may have been meant to refer to the schema, not the XML for a particular object, so that "convert the XML serialization of an object back to the java code that created it" would mean going from schema to marshalling code - JAXB's XJC, or similar.
I have never seen any sort of generalized technology that can reliably reconstruct the actual code instructions used to create an object based on a general XML serialization. What you have is stuff like JAXB, XStream or xmlbeans that can recreate an object based on XML serialized information. This is pretty evident when you think about it since there can be any number of ways to code an object to a specific state. Just knowing the state (which is really what the XML serialization is - the object's state at a certain point in time) is not enough to deduce HOW the object got to that state.
Also, there are many types of information that are transient in nature and not serializable (thread handles, sockets, window handles, device contexts, etc.), so serialization is not applicable to all objects/classes to begin with.

Formatted HTML as output from method invocation from MX4J HTTP page

I have a huge set of data and want to display the data with some formatting.
This is what the method basically looks like:
@ManagedOperation(description = "return html")
@ManagedOperationParameters({@ManagedOperationParameter(name = "someVal", description = "text")})
public String returnAsHtml(String someVal)
{
return "some formatted xml";
}
Looks like XSLTProcessor can be configured to use a XSLT template. However I could not find any examples on the internet using XSLT for html transformation in the context of MX4J. Could any one provide a sample XSLT template?
In case anyone comes back to this question, two things come to mind:
1) MX4J has several default implementations of HttpCommandProcessorAdaptor. These operations are mapped from the path. For JMX operations (aka ManagedOperation in Spring parlance), MX4J uses URLs like /invoke?operation=returnAsHtml
This will be passed to the InvokeOperationCommandProcessor to create an XML document with the result being just the toString() of whatever you returned, in an attribute called 'return'. It also passes back the return type in an attribute called 'returnclass'. You can see all this if you just add &template=identity to the invoke URL.
I mention all this because one option is to implement your own 'invoke.xsl'. The one in MX4J just calls the renderobject template:
Lo and behold, you find this in mbean_attributes.xsl, with a comment showing you exactly what you need to do:
<xsl:template name="renderobject">
<xsl:param name="objectclass"/>
<xsl:param name="objectvalue"/>
<xsl:choose>
<xsl:when test="$objectclass='javax.management.ObjectName'">
<xsl:variable name="name_encoded">
<xsl:call-template name="uri-encode">
<xsl:with-param name="uri">
<xsl:value-of select="$objectvalue"/>
</xsl:with-param>
</xsl:call-template>
</xsl:variable>
<a href="/mbean?objectname={$name_encoded}">
<xsl:value-of select="$objectvalue"/>
</a>
</xsl:when>
<xsl:otherwise>
<!-- Use the following line when the result of an invocation
returns e.g. HTML or XML data
<xsl:value-of select="$objectvalue" disable-output-escaping="true" />
-->
<xsl:value-of select="$objectvalue"/>
</xsl:otherwise>
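</xsl:choose>
</xsl:template>
Enabling disable-output-escaping will do the trick. Note that XSLT 1.0 defines the legal values of this attribute as "yes" and "no", so a strict processor may require "yes" rather than "true".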
2) Another option is to write your own HttpCommandProcessorAdaptor, and set it on the HttpAdapter. This could either replace the 'invoke' processor, or you could have an entirely new one.
Hope that helps
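The effect of disable-output-escaping can be tried outside MX4J with the JDK's built-in XSLT processor. The sketch below is a stand-in stylesheet, not MX4J's invoke.xsl, and uses no MX4J API; the class name EscapingDemo and the way the value is passed in as a stylesheet parameter are assumptions for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class EscapingDemo {
    // Renders objectValue inside a <div>, with or without output escaping.
    static String render(String objectValue, boolean disableEscaping) {
        String stylesheet =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='html' indent='no'/>"
          + "<xsl:param name='objectvalue'/>"
          + "<xsl:template match='/'>"
          + "<div><xsl:value-of select='$objectvalue'"
          + (disableEscaping ? " disable-output-escaping='yes'" : "")
          + "/></div>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            transformer.setParameter("objectvalue", objectValue);
            StringWriter out = new StringWriter();
            // The input document is irrelevant here; only the parameter is rendered.
            transformer.transform(new StreamSource(new StringReader("<dummy/>")),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("<b>hi</b>", false)); // markup arrives escaped
        System.out.println(render("<b>hi</b>", true));  // markup is written as-is
    }
}
```

Writing the value unescaped means the returned string is trusted as markup, so this only makes sense when the operation's output is under your control.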
One way I figured out is to use JavaScript in the XSL template to extract and parse the string. Make sure you test for the browser (IE vs. non-IE) and use the proper parser.

How to preserve Empty XML Tags after XSLT - prevent collapsing them from <B></B> to <B/>

Say I have a very simple XML with an empty tag 'B':
<Root>
<A>foo</A>
<B></B>
<C>bar</C>
</Root>
I'm currently using XSLT to remove a few tags, like 'C' for example:
<?xml version="1.0" ?>
<xsl:stylesheet version="2.0" xmlns="http://www.w3.org/1999/XSL/Transform" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="no" encoding="utf-8" omit-xml-declaration="yes" />
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*" />
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
<xsl:template match="C" />
</xsl:stylesheet>
So far OK, but the problem is I end up having an output like this:
<Root>
<A>foo</A>
<B/>
</Root>
when I actually really want:
<Root>
<A>foo</A>
<B></B>
</Root>
Is there a way to prevent 'B' from collapsing?
Thanks.
OK, so here's what worked for me:
<xsl:output method="html"/>
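A quick way to see the difference between the two output methods is to run the same identity transformation twice with the JDK's built-in processor. This is only a sketch: the class name EmptyTagDemo and the method copyWith are invented for the example, and other processors may serialize differently.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class EmptyTagDemo {
    // Copies the input unchanged, serializing with the given xsl:output method.
    static String copyWith(String outputMethod, String xml) {
        String stylesheet =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='" + outputMethod + "' omit-xml-declaration='yes' indent='no'/>"
          + "<xsl:template match='@*|node()'>"
          + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Root><A>foo</A><B></B></Root>";
        System.out.println(copyWith("xml", xml));   // empty element is collapsed
        System.out.println(copyWith("html", xml));  // start and end tag are both written
    }
}
```

Keep in mind that the html method follows HTML serialization rules, so the result is not guaranteed to be well-formed XML if the document contains elements that HTML treats as void, such as br.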
Try this:
<script type="..." src="...">&#160;</script>
Your HTML output will be:
<script type="..." src="..."> </script>
The &#160; prevents the collapsing but translates to a blank space. It's worked for me in the past.
There is no standard way, as they are equivalent; You might be able to find an XSLT engine that has an option for this behaviour, but I'm not aware of any.
If you're passing this to a third party that cannot accept empty tags using this syntax, then you may have to post-process the output yourself (or convince the third party to fix their XML parsing)
It is up to the XSLT engine to decide how the XML tag is rendered, because a parser should see no difference between the two variations. However, when outputting HTML this is a common problem (for <textarea> and <script> tags for example.) The simplest (but ugly) solution is to add a single whitespace inside the tag (this does change the meaning of the tag slightly though.)
This has been a long time issue and I finally made it work with a simple solution.
Add <xsl:text/> if you have a space character. I added a space in my helper class.
<xsl:choose>
<xsl:when test="$textAreaValue=' '">
<xsl:text/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$textAreaValue"/>
</xsl:otherwise>
</xsl:choose>
They are NOT always equivalent. Many browsers can't deal with <script type="..." src="..." /> and want a separate closing tag. I ran into this problem while using xml/xsl with PHP. Output "html" didn't work, I'm still looking for a solution.
No. The 2 are syntactically identical, so you shouldn't have to worry
It should not be a problem whether it is <B></B> or <B/>. However, if you are using another tool which expects empty XML tags in the <B></B> form only, then you have a problem. A not very elegant way to handle this is to add a space between the starting and ending 'B' tags through XSLT code.
<xsl:text disable-output-escaping="yes">
<![CDATA[<div></div>]]>
</xsl:text>
This works fine with C#'s XslCompiledTransform class with .NET 2.0, but may very well fail almost anywhere else. Do not use it unless you are programmatically doing the transform yourself; it is not portable at all.
It's 7 years late, but for future readers I will buck the trend here and propose an actual solution to the original question. A solution that does not modify the original with spaces or the output directive.
The idea was to use an empty variable to trick the parser.
If you only want to do it just for one tag B, my first thought was to use something like this to attach a dummy variable.
<xsl:variable name="dummyempty" select="''"/>
<xsl:template match="B">
<xsl:copy>
<xsl:apply-templates select="@*" />
<xsl:value-of select="concat(., $dummyempty)"/>
</xsl:copy>
</xsl:template>
But I found that in fact, even the dummy variable is not necessary. This preserved empty tags, at least when tested with xsltproc in linux :
<xsl:template match="B">
<xsl:copy>
<xsl:apply-templates select="@*" />
<xsl:value-of select="."/>
</xsl:copy>
</xsl:template>
For a more generic solution to handle ALL empty tags, try this:
<xsl:variable name="dummyempty" select="''"/>
<xsl:template match="*[. = '']">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
<xsl:value-of select="$dummyempty"/>
</xsl:copy>
</xsl:template>
Again, depending on how smart your parser is, you may not even need the dummy variable.
