Post-Process-Step for XSL - java

I'm currently working on a project which uses XSL-Transformations to generate HTML from XML.
On the input fields there are some attributes I have to set.
Sample:
<input name="/my/xpath/to/node"
class="{/my/xpath/to/node/@isValid}"
value="{/my/xpath/to/node}" />
This is pretty stupid because I have to write the same XPath three times. My idea was to have some kind of post-processor for the XSL file so I can write:
<input xpath="/my/xpath/to/node" />
I'm using something like this to transform my XML:
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;
import org.dom4j.Document;
import org.dom4j.io.DocumentResult;
import org.dom4j.io.DocumentSource;
public class Foo {
public Document styleDocument(
Document document,
String stylesheet
) throws Exception {
// load the transformer using JAXP
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(
new StreamSource( stylesheet )
);
// now lets style the given document
DocumentSource source = new DocumentSource( document );
DocumentResult result = new DocumentResult();
transformer.transform( source, result );
// return the transformed document
Document transformedDoc = result.getDocument();
return transformedDoc;
}
}
My hope was that I can create a Transformer object out of a Document object. But it seems like it has to be a file path - at least I can't find a way to use a Document directly.
Does anyone know a way to achieve what I want?
Thanks

Why not skip the postprocessing, and use this in XSLT:
<xsl:variable name="myNode" select="/my/xpath/to/node" />
<input name="/my/xpath/to/node"
class="{$myNode/@isValid}"
value="{$myNode}" />
That gets you closer.
If you really want to stay DRY (as apparently you do), you could even use a variable myNodePath whose value you generate from $myNode via a template or user-defined function. Does the name really have to be an XPath expression (as opposed to, say, a generate-id() value)?
Update:
Example code:
<xsl:variable name="myNode" select="/my/xpath/to/node" />
<xsl:variable name="myNodeName">
<xsl:apply-templates mode="generate-xpath" select="$myNode" />
</xsl:variable>
<input name="{$myNodeName}"
class="{$myNode/@isValid}"
value="{$myNode}" />
The template for generate-xpath mode is available on the web... For example, you can use one of the templates for that purpose that comes with Schematron. Go to this page, download iso-schematron-xslt1.zip, and look at iso_schematron_skeleton_for_xslt1.xsl. (If you're able to use XSLT 2.0, then download that zip archive.)
In there you'll find a couple of implementations of schematron-select-full-path, which you can use for generate-xpath. One version is precise and is best for consumption by a program; another is more human-readable. Remember, for any given node in an XML document, there are multitudes of XPath expressions that could be used to select only that node. So you probably won't be getting the same XPath expression that you came in with at the beginning. If this is a deal-breaker, you may want to try another approach, such as generating your XSLT stylesheet (the one you've already been developing, call it A) with another XSLT stylesheet (call it B). When B generates A, B has the chance to output the XPath expression both as a quoted string and as an expression that will be evaluated. This is basically preprocessing in XSLT instead of postprocessing in Java. I'm not really sure whether it would work in your case. If I knew what the input XML looks like, it would be easier to figure that out, I think.

My hope was that I can create a Transformer object out of a Document object. But it seems like it has to be a file path - at least I can't find a way to use a Document directly.
You can create a Transformer object from a document object:
Document stylesheetDoc = loadStylesheetDoc(stylesheet);
// load the transformer using JAXP
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(
new DOMSource( stylesheetDoc )
);
Implementing loadStylesheetDoc is left as an exercise. You can build the stylesheet Document internally or load it using JAXP, and you could even make the changes you need to it with another XSLT transform that transforms the stylesheet.
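As a rough sketch of that exercise, assuming the stylesheet is available as a string and that a W3C DOM (org.w3c.dom.Document) is acceptable in place of the dom4j Document from the question: the class below parses the stylesheet with a namespace-aware parser, rewrites the hypothetical xpath shorthand from the question into the three real attributes, and builds the Transformer from a DOMSource. The names StylesheetPreprocessor, expandXPathAttributes and transform are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class StylesheetPreprocessor {

    // Parses stylesheet text into a W3C DOM. The parser must be namespace-aware,
    // otherwise the XSLT processor cannot recognise the xsl: elements.
    static Document loadStylesheetDoc(String xslt) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xslt)));
    }

    // Hypothetical shorthand from the question: <input xpath="P"/> becomes
    // <input name="P" class="{P/@isValid}" value="{P}"/>.
    static void expandXPathAttributes(Document stylesheet) {
        NodeList inputs = stylesheet.getElementsByTagName("input");
        for (int i = 0; i < inputs.getLength(); i++) {
            Element input = (Element) inputs.item(i);
            if (!input.hasAttribute("xpath")) {
                continue;
            }
            String path = input.getAttribute("xpath");
            input.removeAttribute("xpath");
            input.setAttributeNS(null, "name", path);
            input.setAttributeNS(null, "class", "{" + path + "/@isValid}");
            input.setAttributeNS(null, "value", "{" + path + "}");
        }
    }

    // Pre-processes the stylesheet, then applies it to the given XML text.
    static String transform(String xslt, String xml) {
        try {
            Document stylesheet = loadStylesheetDoc(xslt);
            expandXPathAttributes(stylesheet);
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new DOMSource(stylesheet));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }
}
```

Calling StylesheetPreprocessor.transform(xslt, xml) with a stylesheet whose template contains <input xpath="/form/user"/> should yield an input element carrying name, class and value attributes derived from that one path. A dom4j Document would first have to be converted to a W3C DOM or serialized to text.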

Related

How to do a XSL transform in Java using a not namespace aware parser?

I use tagsoup as the (SAX) XMLReader and set the namespace feature to false. This parser is used to feed the Transformer as a SAX Source. Complete code:
final TransformerFactory factory = TransformerFactory.newInstance();
final Transformer t = factory.newTransformer(new StreamSource(
getClass().getResourceAsStream("/identity.xsl")));
final XMLReader p = new Parser(); // the tagsoup parser
p.setFeature("http://xml.org/sax/features/namespaces", false);
// getHtml() returns HTML as InputStream
final Source source = new SAXSource(p, new InputSource(getHtml()));
t.transform(source, new StreamResult(System.out));
This results in something like:
< xmlns:html="http://www.w3.org/1999/xhtml">
<>
<>
<>
<>
< height="17" valign="top">
Problem is that the tag names are blank. The XMLReader (tagsoup parser) does report an empty namespaceURI and empty local name in the SAX methods ContentHandler#startElement and ContentHandler#endElement. For a not namespace aware parser this is allowed (see Javadoc).
If I add an XMLFilter which copies the value of the qName to the localName, everything goes fine. However, this is not what I want; I expect this to work "out of the box". What am I doing wrong? Any input would be appreciated!
I expect this to work "out of the box". What am I doing wrong?
What you are doing wrong is taking a technology (XSLT) that is defined to operate over namespace-well-formed XML and attempting to apply it to data that it is not intended to work with. If you want to use XSLT then you must enable namespaces, declare a prefix for the http://www.w3.org/1999/xhtml namespace in your stylesheet, and use that prefix consistently in your XPath expressions.
If your transformer understands XSLT 2.0 (e.g. Saxon 9) then instead of declaring a prefix and prefixing your element names in XPath expressions, you can put xpath-default-namespace="http://www.w3.org/1999/xhtml" on the xsl:stylesheet element to make it treat unprefixed element names as references to that namespace. But in XSLT 1.0 (the default built-in Java Transformer implementation) your only option is to use a prefix.
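A minimal sketch of the XSLT 1.0 prefix approach, using only the JDK's built-in transformer. The input here is a plain XHTML string rather than TagSoup output, and the prefix h, the class name XhtmlPrefixDemo and the method extractTitle are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XhtmlPrefixDemo {

    // The stylesheet binds the prefix "h" to the XHTML namespace and uses it
    // on every element name in the XPath expression.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:h='http://www.w3.org/1999/xhtml'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:value-of select='/h:html/h:head/h:title'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the stylesheet over namespace-well-formed XHTML and returns the title text.
    static String extractTitle(String xhtml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html xmlns='http://www.w3.org/1999/xhtml'>"
                + "<head><title>Hello</title></head><body/></html>";
        System.out.println(extractTitle(xhtml));
    }
}
```

An unprefixed path such as /html/head/title would select nothing for the same input, because in XSLT 1.0 unprefixed names in XPath always refer to elements in no namespace.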

How to get all XML branches

How can I get all XML branches using Java?
For example, if I have the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<addresses xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation='test.xsd'>
<address>
<name>Joe Tester</name>
<street>Baker street 5</street>
</address>
<person>
<name>Joe Tester</name>
<age>44</age>
</person>
</addresses>
I want to obtain the following branches:
addresses
addresses_address
addresses_address_name
addresses_address_street
addresses_person
addresses_person_name
addresses_person_age
Thanks.
You can get the XML root, its child nodes, and their sub-node names easily using any template engine, e.g. Velocity or FreeMarker. FreeMarker has powerful facilities for XML processing: you can drop XML documents into the data model, and templates can pull data from them in a variety of ways, such as with XPath expressions. This makes FreeMarker usable as an XML transformation tool, an alternative to the much better-known XSLT stylesheet approach promulgated by the World Wide Web Consortium (W3C).
FreeMarker's XPath support can use Jaxen, which has to be downloaded separately.
FreeMarker will use Xalan, unless you choose Jaxen by calling freemarker.ext.dom.NodeModel.useJaxenXPathSupport() from Java.
You need just one template, which generates the XML branches for whatever input XML you supply (the template below walks three levels deep). Put any XML into the data model at run time and FreeMarker will process the template and generate the branch names corresponding to that XML's structure. If your XML structure changes, there is no need to change your Java code. And if you want to change the output, the change goes in the template file, so no Java recompilation is needed.
Just change the template and get the changes on the fly.
FTL file (one template for multiple XML documents, creating the XML branch names):
<#list doc ['/*' ] as rootNode>
<#assign rootNodeValue="${rootNode?node_name}">
${rootNodeValue}
<#list doc ['/*/*' ] as childNodes>
<#if childNodes?is_node==true>
${rootNodeValue}-${childNodes?node_name}
<#list doc ['/*/${childNodes?node_name}/*' ] as subNodes>
${rootNodeValue}-${childNodes?node_name}-${subNodes?node_name}
</#list>
</#if>
</#list>
</#list>
XMLTest.java to process the template:
import java.io.IOException;
import java.io.InputStream;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import freemarker.ext.dom.NodeModel;
import freemarker.template.Configuration;
import freemarker.template.DefaultObjectWrapper;
import freemarker.template.ObjectWrapper;
import freemarker.template.Template;
import freemarker.template.TemplateException;
public class XMLTest {
public static void main(String[] args) throws SAXException, IOException,
ParserConfigurationException, TemplateException {
Configuration config = new Configuration();
config.setClassForTemplateLoading(XMLTest.class, "");
config.setObjectWrapper(new DefaultObjectWrapper());
config.setObjectWrapper(ObjectWrapper.BEANS_WRAPPER);
Map<String, Object> dataModel = new HashMap<String, Object>();
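// load the XML from the classpath; xml_path is a placeholder for the resource name of your XML file
InputStream stream = XMLTest.class.getClassLoader().getResourceAsStream(xml_path);
// if you already have the XML as a string, wrap it in a StringReader and pass that to the InputSource constructor instead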
InputSource source = new InputSource(stream);
NodeModel xmlNodeModel = NodeModel.parse(source);
dataModel.put("doc", xmlNodeModel);
Template template = config.getTemplate("test.ftl");
StringWriter out = new StringWriter();
template.process(dataModel, out);
System.out.println(out.getBuffer().toString());
}
}
Final output:
addresses
addresses-address
addresses-address-name
addresses-address-street
addresses-person
addresses-person-name
addresses-person-age
See the FreeMarker documentation on the XML node model.
Download FreeMarker from here
Download Jaxen from here
There are many ways that you can extract data from XML and use it in Java. The one you choose will depend on how you want to use the data.
Some scenarios are:
You might want to manipulate nodes, order, remove and add others and transform the XML.
You might just want to read (and possibly change) the text contained in elements and attributes.
You might have a very large file and you just want to find some particular data and ignore the rest of the file.
For scenario #3, the best option is some memory-efficient stream-based parser, such as SAX or XML reader with the StAX API.
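For illustration, a minimal StAX sketch for scenario #3, using the javax.xml.stream API that ships with the JDK. It scans the stream for the first element with a given name and stops there, without building a tree. The class name StaxFind and the method firstText are made up for this example, and it assumes the target element contains only text.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class StaxFind {

    // Returns the text of the first element with the given local name,
    // or null if the document contains no such element.
    static String firstText(String xml, String elementName) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && reader.getLocalName().equals(elementName)) {
                        // Reads up to the matching end tag; valid for text-only elements.
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not read XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<addresses><address><name>Joe Tester</name>"
                + "<street>Baker street 5</street></address></addresses>";
        System.out.println(firstText(xml, "street"));
    }
}
```

For a real large file you would pass an InputStream or Reader over the file instead of a string, so that only the part read so far is ever held in memory.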
You can also use that for scenario #2, if you do mostly reading (and not writing), but DOM-based APIs might be easier to work with. You can use the standard DOM org.w3c.dom API or a more Java-like API such as JDOM or DOM4J. If you wish to synchronize XML files with Java objects you also might want to use a full Java-XML mapping framework such as JAXB.
DOM APIs are also great for scenario #1, but in many cases it might be simpler to use XSLT (via the javax.xml.transform TrAX API in Java). If you use DOM you can also use XPath to select the nodes.
I will show you an example on how to extract the individual nodes of your file using the standard DOM API (org.w3c.dom) and also using XPath (javax.xml.xpath).
1. Setup
Initialize the parser:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Parse file into a Document Object Model:
Document source = builder.parse(new File("src/main/resources/addresses.xml"));
2. Selecting nodes with J2SE DOM
You get the root element using getDocumentElement():
Element addresses = source.getDocumentElement();
From there you can get the child nodes using getChildNodes() but that will return all child nodes, which includes text nodes (the whitespace between elements). addresses.getChildNodes().item(0) returns the whitespace after the <addresses> tag and before the <address> tag. To get the element you would have to go for the second item. An easier way to do that is use getElementsByTagName, which returns a node-set and get the first item:
Element addresses_address = (Element)addresses.getElementsByTagName("address").item(0);
Many of the DOM methods return org.w3c.dom.Node objects, which you have to cast. Sometimes they might not be Element objects so you have to check. Node sets are not automatically converted into arrays. They are org.w3c.dom.NodeList so you have to use .item(0) and not [0] (if you use other DOM APIs such as JDOM or DOM4J, it will seem more intuitive).
You could use addresses.getElementsByTagName to get all the elements you need, but you would have to deal with the context for the two <name> elements. So a better way is to call it in the appropriate context:
Element addresses_address = (Element)addresses.getElementsByTagName("address").item(0);
Element addresses_address_name = (Element)addresses_address.getElementsByTagName("name").item(0);
Element addresses_address_street = (Element)addresses_address.getElementsByTagName("street").item(0);
Element addresses_person = (Element)addresses.getElementsByTagName("person").item(0);
Element addresses_person_name = (Element)addresses_person.getElementsByTagName("name").item(0);
Element addresses_person_age = (Element)addresses_person.getElementsByTagName("age").item(0);
That will give you all the Element nodes (or branches as you called them) for your file. If you want the text nodes (as actual Node objects) you need to get it as the first child:
Node textNode = addresses_address_street.getFirstChild();
And if you want the String contents you can use:
String street = addresses_address_street.getTextContent();
3. Selecting nodes with XPath
Another way to select nodes is using XPath. You will need the DOM source and you also need to initialize the XPath processor:
XPath xPath = XPathFactory.newInstance().newXPath();
You can extract the root node like this:
Element addresses = (Element)xPath.evaluate("/addresses", source, XPathConstants.NODE);
And all the other nodes using a path-like syntax:
Element addresses_address = (Element)xPath.evaluate("/addresses/address", source, XPathConstants.NODE);
Element addresses_address_name = (Element)xPath.evaluate("/addresses/address/name", source, XPathConstants.NODE);
Element addresses_address_street = (Element)xPath.evaluate("/addresses/address/street", source, XPathConstants.NODE);
You can also use relative paths, choosing a different element as the root:
Element addresses_person = (Element)xPath.evaluate("person", addresses, XPathConstants.NODE);
Element addresses_person_name = (Element)xPath.evaluate("person/name", addresses, XPathConstants.NODE);
Element addresses_person_age = (Element)xPath.evaluate("age", addresses_person, XPathConstants.NODE);
You can get the text contents as before, since you have Element objects:
String addressName = addresses_address_name.getTextContent();
But you can also do it directly using the same methods above without the last argument (which defaults to string). Here I'm using different relative and absolute XPath expressions:
String addressName = xPath.evaluate("name", addresses_address);
String addressStreet = xPath.evaluate("address/street", addresses);
String personName = xPath.evaluate("name", addresses_person);
String personAge = xPath.evaluate("/addresses/person/age", source);
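If the goal is to list every branch name without knowing the structure in advance, the selections above can be replaced by a recursive walk over the DOM. The following is a sketch under that assumption; the class name BranchLister and the underscore separator (taken from the question's expected output) are illustrative choices, and repeated branches are reported once.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class BranchLister {

    // Returns the distinct element paths of the document in document order,
    // with path steps joined by underscores.
    static List<String> branches(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Set<String> paths = new LinkedHashSet<>();
            collect(doc.getDocumentElement(), "", paths);
            return new ArrayList<>(paths);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Adds the path of this element, then recurses into its child elements,
    // skipping text nodes such as the whitespace between tags.
    private static void collect(Element element, String prefix, Set<String> paths) {
        String path = prefix.isEmpty()
                ? element.getTagName()
                : prefix + "_" + element.getTagName();
        paths.add(path);
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect((Element) child, path, paths);
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<addresses><address><name>Joe Tester</name><street>Baker street 5</street></address>"
                + "<person><name>Joe Tester</name><age>44</age></person></addresses>";
        branches(xml).forEach(System.out::println);
    }
}
```

Because the walk is generic, it keeps working when elements are added or nested more deeply, which the hand-written selections do not.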

Saxon in Java: XSLT for CSV to XML

Mostly continued from this question: XSLT: CSV (or Flat File, or Plain Text) to XML
So, I have an XSLT from here: http://andrewjwelch.com/code/xslt/csv/csv-to-xml_v2.html
And it converts a CSV file to an XML document. It does this when used with the following command on the command line:
java -jar saxon9he.jar -xsl:csv-to-xml.xsl -it:main -o:output.xml
So now the question becomes: How do I do this in my Java code?
Right now I have code that looks like this:
TransformerFactory transformerFactory = TransformerFactory.newInstance();
StreamSource xsltSource = new StreamSource(new File("location/of/csv-to-xml.xsl"));
Transformer transformer = transformerFactory.newTransformer(xsltSource);
StringWriter stringWriter = new StringWriter();
transformer.transform(documentSource, new StreamResult(stringWriter));
String transformedDocument = stringWriter.toString().trim();
(The Transformer is an instance of net.sf.saxon.Controller.)
The trick on the command line is to specify "-it:main" to point right at the named template in the XSLT. This means you don't have to provide the source file with the "-s" flag.
The problem starts again on the Java side. Where/how would I specify this "-it:main"? Wouldn't doing so break other XSLTs that don't need that specified? Would I have to name every template in every XSLT file "main"? Given the method signature of Transformer.transform(), I have to specify the source file, so doesn't that defeat all the progress I've made in figuring this thing out?
Edit: I found the s9api hidden inside the saxon9he.jar, if anyone is looking for it.
You are using the JAXP API, which was designed for XSLT 1.0. If you want to make use of XSLT 2.0 features, like the ability to start a transformation at a named template, I would recommend using the s9api interface instead, which is much better designed for this purpose.
However, if you've got a lot of existing JAXP code and you don't want to rewrite it, you can usually achieve what you want by downcasting the JAXP objects to the underlying Saxon implementation classes. For example, you can cast the JAXP Transformer as net.sf.saxon.Controller, and that gives you access to controller.setInitialTemplate(); when it comes to calling the transform() method, just supply null as the Source parameter.
Incidentally, if you're writing code that requires a 2.0 processor then I wouldn't use TransformerFactory.newInstance(), which will give you any old XSLT processor that it finds on the classpath. Use new net.sf.saxon.TransformerFactoryImpl() instead, which (a) is more robust, and (b) much much faster.

HTML to XML Conversion using XSLT in java

Hi, can anyone help me with HTML to XML conversion using XSLT in Java? I converted XML to HTML using XSLT in Java. This is the code I used for that conversion:
import javax.xml.transform.*;
import java.net.*;
import java.io.*;
public class HowToXSLT {
public static void main(String[] args) {
try {
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer =
tFactory.newTransformer
(new javax.xml.transform.stream.StreamSource
("howto.xsl"));
transformer.transform
(new javax.xml.transform.stream.StreamSource
("howto.xml"),
new javax.xml.transform.stream.StreamResult
( new FileOutputStream("howto.html")));
}
catch (Exception e) {
e.printStackTrace( );
}
}
}
But I don't know how to do the reverse of this program, that is, convert HTML to XML. Are there any jar files available to do that? Please help me.
Generally, it isn't possible to "reverse" a transformation, because a transformation in the general case isn't a 1:1 mapping.
For example, if the transformation does this:
<xsl:value-of select= "/x * /x"/>
and we get as result: 16
(and we know that the source XML document had only one element),
it isn't possible to determine from the value 16 whether the source XML document was:
<x>4</x>
or whether it was:
<x>-4</x>
And the above was only a simple example! :)
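The same point can be checked from Java with the JDK's built-in XSLT processor. This is only a sketch of the example above; the class name NotInvertible and the method square are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class NotInvertible {

    // A stylesheet that outputs the square of the number in the root element x.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'><xsl:value-of select='/x * /x'/></xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML text and returns the text output.
    static String square(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Two different inputs give the same output, so the output alone
        // cannot tell us which input it came from.
        System.out.println(square("<x>4</x>"));
        System.out.println(square("<x>-4</x>"));
    }
}
```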
This will depend on what you wish to do exactly.
Apparently, howto.xsl contains the rules to be applied on the xml to get the html.
You will have to write another xsl file to do the reverse.
I believe it is not possible. XSLT input must be well-formed XML, and HTML does not conform to XML (unless you are talking about XHTML).
Maybe you need to first make your HTML XHTML-compliant, then use an XSL (the reverse of the original XSL) which has instructions to convert the XHTML file to XML.
It's not possible; you can use Microsoft.XMLDOM for converting from HTML to XML.

JAXB XSLT Property substitution

I apologize for the elementary question. I have an XML file, as well as an XSL to translate it into another format (KML). Within the KML I wish to inject a dynamic attribute which is not present in the original XML document. I want to emit a node like the following:
<NetworkLinkControl>
<message>This is a pop-up message. You will only see this once</message>
<cookie>sessionID={#sessionID}</cookie>
<minRefreshPeriod>5</minRefreshPeriod>
</NetworkLinkControl>
In particular I want the {#sessionID} text to be replaced with a dynamic value that I insert into the template somehow (i.e. is NOT part of the source XML document that the XSLT is transforming).
Here's the code I'm using to marshal the KML:
DomainObject myObject = ...;
JAXBContext context = JAXBContext.newInstance(new Class[]{DomainObject.class});
Marshaller xmlMarshaller = context.createMarshaller();
xmlMarshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
TransformerFactory transFact = TransformerFactory.newInstance();
// converts from jaxb XML representation into KML
Templates displayTemplate = transFact.newTemplates(new StreamSource(new File("conf/jaxbkml.xsl")));
Result outputResult = new StreamResult(System.out);
TransformerHandler handler =
((SAXTransformerFactory) transFact).newTransformerHandler(displayTemplate);
handler.setResult(outputResult);
Transformer transformer = handler.getTransformer();
// TODO: what do I actually fill in here to ensure that the session ID comes through
// in the XSLT document? I can't make heads or tails of the javadocs
transformer.setOutputProperty("{http://xyz.foo.com/yada/baz.html}sessionID", "asdf");
xmlMarshaller.marshal(myObject, handler);
I have gathered that there is a way to substitute in values dynamically in the XSLT via Attribute Value Templates and I assume that there is a way to hookup the transformer's properties to be used with these Attribute Value Templates, but I don't quite see how it's done. Could someone shed some light? Thanks.
Thanks to @jtahlborn for setting me on the right track. It is possible to do this, but I wasn't putting all the pieces together. First, define xsl:param.
<!-- give it a default value if none is set -->
<xsl:param name="sessionID" select="''"/>
Second, insert a reference to this xsl:param. If you need to embed it within the content of a node, as I did, use an xsl:value-of node.
<cookie>sessionID=<xsl:value-of
select="$sessionID"/></cookie>
Otherwise, if you need to embed it within an attribute value, use an attribute value template:
<img src="{$sessionID}/sample.gif"/>
Next, pass in a value for that xsl:param from within Java.
Result outputResult = new StreamResult(outputStream);
TransformerHandler handler =
((SAXTransformerFactory) transFact).newTransformerHandler(displayTemplate);
Transformer transformer = handler.getTransformer();
// Here is where the parameter is bound.
transformer.setParameter("sessionID", sessionID);
handler.setResult(outputResult);
xmlMarshaller.marshal(listWrapper, handler);
The attribute value templates are part of your XSL, not part of your XML, so what you are attempting won't work. You could use xpath to select the element which matches the pattern "sessionID={#sessionID}" and replace that with the text of your choice.
I believe you can set parameters for the stylesheet using the Transformer.setParameter() method, which can then be referenced in the stylesheet using the syntax "{$param}"; see examples here.
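To see the parameter binding in isolation, here is a small self-contained sketch. It leaves out JAXB and the TransformerHandler and feeds a dummy input document instead, since the stylesheet only uses the parameter; the class name SessionParamDemo and the method render are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class SessionParamDemo {

    // Declares a top-level xsl:param with an empty default and writes its
    // value into the content of the cookie element.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:param name='sessionID' select=\"''\"/>"
        + "<xsl:template match='/'>"
        + "<cookie>sessionID=<xsl:value-of select='$sessionID'/></cookie>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Binds the Java value to the stylesheet parameter and runs the transform.
    static String render(String sessionId) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            transformer.setParameter("sessionID", sessionId);
            StringWriter out = new StringWriter();
            // The input document is a dummy; the template does not read from it.
            transformer.transform(new StreamSource(new StringReader("<ignored/>")),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("asdf"));
    }
}
```

The parameter name passed to setParameter must match the name attribute of the xsl:param exactly, and setOutputProperty is not a substitute for it: output properties control serialization only.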
