JAXB and XSLT processor - java

I am using JAXB and maven-jaxb2-plugin and I am able right now to bind my schemas to Java code successfully.
I also have a .xsl file "annotate_schemas.xsl" that modifies a specific schema adding some additional information.
Finally, on the schema that I want transformed, I added the header:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="annotate_schemas.xsl"?>
...
The problem is that, while the .xsl is correct (if I open my schema file in a browser, the transformation is done flawlessly), JAXB ignores it and binds an untouched version of my schema.
My question is: does JAXB (and/or its plugin) have an XSLT processor? Is there a way to tell JAXB to bind the result of the XSLT transformation instead of the original?
Thank you very much

JAXB, like the vast majority of XML-consuming applications, takes no notice of an <?xml-stylesheet?> processing instruction. If you want to transform a document before passing it to JAXB, you need to transform it explicitly, for example by using the JAXP transformation API. (JAXP can look up the stylesheet named in the xml-stylesheet PI if that's how you want to control it: TransformerFactory.getAssociatedStylesheet().)
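A minimal sketch of that explicit step, assuming the JDK's built-in XSLT processor and a stylesheet referenced by a relative href next to the schema. The class name, the temporary file names, and the sample stylesheet (an identity transform that adds a hypothetical annotated attribute) are made up for illustration; the transformed output is what you would then hand to JAXB/xjc instead of the original schema.

```java
import java.io.File;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class AssociatedStylesheetDemo {

    /** Applies the stylesheet named in the document's xml-stylesheet PI. */
    static String transformWithAssociatedStylesheet(File xml) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // null media/title/charset means "accept any matching PI"
        Source stylesheet = factory.getAssociatedStylesheet(new StreamSource(xml), null, null, null);
        if (stylesheet == null) {
            throw new TransformerException("No xml-stylesheet processing instruction found");
        }
        Transformer transformer = factory.newTransformer(stylesheet);
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(xml), new StreamResult(out));
        return out.toString();
    }

    /** Writes a hypothetical schema and stylesheet to a temp directory and transforms the schema. */
    static String demo() throws Exception {
        Path dir = Files.createTempDirectory("xslt-demo");
        String xsl =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:template match=\"@*|node()\">"
          + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
          + "</xsl:template>"
          // Illustrative only: mark the root element so the effect is visible
          + "<xsl:template match=\"/*\">"
          + "<xsl:copy><xsl:attribute name=\"annotated\">true</xsl:attribute>"
          + "<xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        String xsd =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          + "<?xml-stylesheet type=\"text/xsl\" href=\"annotate_schemas.xsl\"?>"
          + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\"/>";
        Files.write(dir.resolve("annotate_schemas.xsl"), xsl.getBytes(StandardCharsets.UTF_8));
        Path schema = dir.resolve("schema.xsd");
        Files.write(schema, xsd.getBytes(StandardCharsets.UTF_8));
        return transformWithAssociatedStylesheet(schema.toFile());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

In a Maven build, a step like this would have to run before maven-jaxb2-plugin, writing the transformed schema to a directory the plugin is configured to read.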

You can try something like this:
TransformerFactory transFact = TransformerFactory.newInstance();
Templates displayTemplate = transFact.newTemplates(new StreamSource(new File("your_xsl_file")));
TransformerHandler handler =
((SAXTransformerFactory) transFact).newTransformerHandler(displayTemplate);

Related

Keep CDATA for XSLT transformation?

I'm doing an XSLT transformation with Saxon on an XML document.
The doc is not standard-valid XML, and I want to preserve all the <![CDATA[ ... ]]> sections found in it.
However using the .xsl file for transformation with
Transformer trans = TransformerFactory.newInstance().newTransformer(new StreamSource(new File("foo.xsl")));
trans.transform(new StreamSource(new File("foo.xml")), new StreamResult(new File("output.xml")));
results in stripping out these CDATA entries. How can I prevent this?
You can't: the data model used by XSLT does not record whether text originated from a CDATA section. You can, however, declare in your stylesheet that the text content of certain result elements is to be written as CDATA sections. This is done using the cdata-section-elements attribute of the xsl:output element in your stylesheet.
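As a rough illustration of cdata-section-elements, here is a small sketch using the JDK's built-in XSLT processor rather than Saxon. The element names script and p, the class name, and the inline identity stylesheet are assumptions made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CdataOutputDemo {

    // Identity stylesheet that asks the serializer to wrap the text
    // content of every <script> result element in a CDATA section.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" cdata-section-elements=\"script\"/>"
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // <script> text comes back as CDATA; <p> text is escaped as usual
        System.out.println(transform(
            "<doc><script><![CDATA[a < b]]></script><p>a &lt; b</p></doc>"));
    }
}
```

Note that the choice is made per element name in the output, not per CDATA section in the input: text in a listed element is written as CDATA even if the source used ordinary escaping there.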
Consider using Andrew Welch's LexEv tool (bundled I believe with KernowForSaxon), which preprocesses CDATA start and end tags into something different (processing instructions perhaps?) that's visible in the XSLT data model and thus available to the application.

Saxon in Java: XSLT for CSV to XML

Mostly continued from this question: XSLT: CSV (or Flat File, or Plain Text) to XML
So, I have an XSLT from here: http://andrewjwelch.com/code/xslt/csv/csv-to-xml_v2.html
And it converts a CSV file to an XML document. It does this when used with the following command on the command line:
java -jar saxon9he.jar -xsl:csv-to-xml.xsl -it:main -o:output.xml
So now the question becomes: how do I do this in my Java code?
Right now I have code that looks like this:
TransformerFactory transformerFactory = TransformerFactory.newInstance();
StreamSource xsltSource = new StreamSource(new File("location/of/csv-to-xml.xsl"));
Transformer transformer = transformerFactory.newTransformer(xsltSource);
StringWriter stringWriter = new StringWriter();
transformer.transform(documentSource, new StreamResult(stringWriter));
String transformedDocument = stringWriter.toString().trim();
(The Transformer is an instance of net.sf.saxon.Controller.)
The trick on the command line is to specify "-it:main" to point right at the named template in the XSLT. This means you don't have to provide the source file with the "-s" flag.
The problem starts again on the Java side. Where/how would I specify this "-it:main"? Wouldn't doing so break other XSLT's that don't need that specified? Would I have to name every template in every XSLT file "main?" Given the method signature of Transformer.transform(), I have to specify the source file, so doesn't that defeat all the progress I've made in figuring this thing out?
Edit: I found the s9api hidden inside the saxon9he.jar, if anyone is looking for it.
You are using the JAXP API, which was designed for XSLT 1.0. If you want to make use of XSLT 2.0 features, like the ability to start a transformation at a named template, I would recommend using the s9api interface instead, which is much better designed for this purpose.
However, if you've got a lot of existing JAXP code and you don't want to rewrite it, you can usually achieve what you want by downcasting the JAXP objects to the underlying Saxon implementation classes. For example, you can cast the JAXP Transformer as net.sf.saxon.Controller, and that gives you access to controller.setInitialTemplate(); when it comes to calling the transform() method, just supply null as the Source parameter.
Incidentally, if you're writing code that requires a 2.0 processor then I wouldn't use TransformerFactory.newInstance(), which will give you any old XSLT processor that it finds on the classpath. Use new net.sf.saxon.TransformerFactoryImpl() instead, which (a) is more robust, and (b) much much faster.

Dynamic XML creation in Java

I am trying to dynamically create an XML file in Java to display a timetable. I have created a DTD for my XML file and I have an XSL file I would like to use to transform the XML. I don't know exactly how to continue.
What I've tried so far: on click of some button, a Servlet is called which generates a String with the content of the XML file (inserting the dynamic parts of the XML into the String). I now have a String containing the content of the XML file. I would now like to transform the XML using an XSL file I have on my server and display the result in the page which called the Servlet (doing this via AJAX).
I'm not sure if I'm headed in the right direction; perhaps I shouldn't even create the XML in String form to begin with. So my question is, how do I continue from here? How do I transform the XML string using the XSL file and send it as a response to the AJAX call so I can insert the generated code into the page? Or, if this is not the way to do it, how do I create a dynamic XML file in a different way that produces the same result?
You can use JAXP for this. It's part of standard Java SE API.
StringReader xmlInput = new StringReader(xmlStringWhichYouHaveCreated);
InputStream xslInput = getServletContext().getResourceAsStream("file.xsl"); // Or wherever it is. As long as you've it as an InputStream, it's fine.
Source xmlSource = new StreamSource(xmlInput);
Source xslSource = new StreamSource(xslInput);
Result xmlResult = new StreamResult(response.getOutputStream()); // XML result will be written to HTTP response.
Transformer transformer = TransformerFactory.newInstance().newTransformer(xslSource);
transformer.transform(xmlSource, xmlResult);
Depending on how complicated and large your XML is going to be I would suggest two options. For small, simple structures Java's DOM implementation (Document) will suffice.
If your XML is more elaborate I would look into JAXB. The benefit there is that there are tools that automatically create Java classes from an XML schema (XSD). So you'd have to transform your DTD into an XSD first, but that shouldn't be a problem. You end up with plain data transfer objects (plain objects with getters/setters for the values of the corresponding XML elements) and parsing/encoding plus setting namespaces correctly is done for you. It's quite convenient but can also be a bit of an overkill for simple XML structures.
In both cases, you will end up with a Document instance that you can finally transform using JAXP.
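For the small, simple case, a sketch of building the timetable as a DOM Document and pushing it through JAXP might look like this. The element and attribute names (timetable, lesson, day) and the class name are hypothetical, and the identity Transformer used here would be replaced by one created from your XSL source when you want the stylesheet applied.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class TimetableDomDemo {

    /** Builds a tiny timetable document node by node instead of concatenating Strings. */
    static Document buildTimetable() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("timetable");
        doc.appendChild(root);
        Element lesson = doc.createElement("lesson");
        lesson.setAttribute("day", "Monday");
        lesson.setTextContent("Math"); // special characters are escaped on output
        root.appendChild(lesson);
        return doc;
    }

    /** Serializes the DOM with an identity transform (no stylesheet). */
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(buildTimetable()));
    }
}
```

Compared with assembling the XML as a String, the DOM route takes care of escaping and well-formedness, and the DOMSource can be passed straight to transform() without re-parsing.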
Apache XMLBeans is a nice solution for serializing to and from XML. Here's what you need to do:
Download XMLBeans from http://www.apache.org/dyn/closer.cgi/xmlbeans/binaries
Use the XMLBeans inst2xsd executable (in the bin dir) to convert your DTD to an XSD
Use the XMLBeans ANT task to convert the XSD into classes which you can use in your app
Here's an example ANT script to use XMLBeans to create the classes:
<project name="my_project" basedir="..">
<property name="my_project.project.path" value="${basedir}"/>
<property name="xbean.dir" value="C:/lib/xmlbeans-2.2.0/lib" />
<path id="classpath">
<fileset dir="${xbean.dir}" includes="**/*.jar" />
</path>
<taskdef name="xmlbean" classname="org.apache.xmlbeans.impl.tool.XMLBean" classpathref="classpath" />
<xmlbean schema="${my_project.project.path}/my.xsd" srcgendir="${my_project.project.path}/src-tms-template-filter-fields" classgendir="${my_project.project.path}/bin">
<classpath><path refid="classpath" /></classpath>
</xmlbean>
</project>
You'll now have nice Java classes which you can use for clean code to create the XML from the data stored in your DB. Use BalusC's answer for the XSLT.

Adding source validation to a StructuredTextViewer

I added a nice XML source viewer to my application. Now, I have an XSD schema that defines the XML document. Any idea where to start on adding some source validation that relies on this schema?
Thanks!
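To check that your XML is well-formed, just run it through a parser obtained from DocumentBuilderFactory. To additionally validate it against an .xsd schema referenced in the XML, make the factory namespace-aware and validating, and set the schema language to W3C XML Schema (setValidating(true) on its own only turns on DTD validation):
factory.setNamespaceAware(true);
factory.setValidating(true);
factory.setAttribute(JAXP_SCHEMA_LANGUAGE, XMLConstants.W3C_XML_SCHEMA_NS_URI);
If the xsd schema is not referenced within the XML that you are validating, you can supply it yourself like this:
factory.setAttribute(JAXP_SCHEMA_SOURCE, new File(schemaSource));
Here JAXP_SCHEMA_LANGUAGE and JAXP_SCHEMA_SOURCE are String constants you define yourself, with the values "http://java.sun.com/xml/jaxp/properties/schemaLanguage" and "http://java.sun.com/xml/jaxp/properties/schemaSource".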
For more information, read the article from Oracle here:
http://download.oracle.com/javaee/1.4/tutorial/doc/JAXPDOM8.html
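An alternative sketch, using the javax.xml.validation API instead of the DocumentBuilderFactory attributes, which may be easier to wire into a viewer because it works directly on the editor's text. The class name, the method name, and the one-element sample schema are assumptions for illustration; returning the first error message (or null) is just one possible design.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class XsdValidationDemo {

    // Hypothetical schema: a single <count> element holding an integer
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"count\" type=\"xs:int\"/>"
      + "</xs:schema>";

    /** Returns null if the XML is valid against the schema, else the first error message. */
    static String firstValidationError(String xsd, String xml) throws SAXException, IOException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        // A broken schema is a programming error here, so let that exception propagate
        Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();
        try {
            // With no ErrorHandler set, the first validation error is thrown
            validator.validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstValidationError(XSD, "<count>3</count>"));
        System.out.println(firstValidationError(XSD, "<count>three</count>"));
    }
}
```

To show every problem rather than only the first, you could set an ErrorHandler on the Validator that collects SAXParseException instances, which also carry line and column numbers for marking up the source view.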

XML to be validated against multiple xsd schemas

I'm writing the xsd and the code to validate, so I have great control here.
I would like to have an upload facility that adds stuff to my application based on an xml file. One part of the xml file should be validated against different schemas based on one of the values in the other part of it. Here's an example to illustrate:
<foo>
<name>Harold</name>
<bar>Alpha</bar>
<baz>Mercury</baz>
<!-- ... more general info that applies to all foos ... -->
<bar-config>
<!-- the content here is specific to the bar named "Alpha" -->
</bar-config>
<baz-config>
<!-- the content here is specific to the baz named "Mercury" -->
</baz-config>
</foo>
In this case, there is some controlled vocabulary for the content of <bar>, and I can handle that part just fine. Then, based on the bar value, the appropriate xml schema should be used to validate the content of bar-config. Similarly for baz and baz-config.
The code doing the parsing/validation is written in Java. Not sure how language-dependent the solution will be.
Ideally, the solution would permit the xml author to declare the appropriate schema locations and what-not so that s/he could get the xml validated on the fly in a sufficiently smart editor.
Also, the possible values for <bar> and <baz> are orthogonal, so I don't want to do this by extension for every possible bar/baz combo. What I mean is, if there are 24 possible bar values/schemas and 8 possible baz values/schemas, I want to be able to write 1 + 24 + 8 = 33 total schemas, instead of 1 * 24 * 8 = 192 total schemas.
Also, I'd prefer to NOT break out the bar-config and baz-config into separate xml files if possible. I realize that might make all the problems much easier, as each xml file would have a single schema, but I'm trying to see if there is a good single-xml-file solution.
I finally figured this out.
First of all, in the foo schema, the bar-config and baz-config elements have a type which includes an any element, like this:
<sequence>
<any minOccurs="0" maxOccurs="1"
processContents="lax" namespace="##any" />
</sequence>
In the xml, then, you must specify the proper namespace using the xmlns attribute on the child element of bar-config or baz-config, like this:
<bar-config>
<config xmlns="http://www.example.org/bar/Alpha">
... config xml here ...
</config>
</bar-config>
Then, your XML schema file for bar Alpha will have a target namespace of http://www.example.org/bar/Alpha and will define the root element config.
If your XML file has namespace declarations and schema locations for both of the schema files, this is sufficient for the editor to do all of the validating (at least good enough for Eclipse).
So far, we have satisfied the requirement that the xml author may write the xml in such a way that it is validated in the editor.
Now, we need the consumer to be able to validate. In my case, I'm using Java.
If by some chance, you know the schema files that you will need to use to validate ahead of time, then you simply create a single Schema object and validate as usual, like this:
Schema schema = factory().newSchema(new Source[] {
new StreamSource(stream("foo.xsd")),
new StreamSource(stream("Alpha.xsd")),
new StreamSource(stream("Mercury.xsd")),
});
In this case, however, we don't know which xsd files to use until we have parsed the main document. So, the general procedure is to:
Validate the xml using only the main (foo) schema
Determine the schema to use to validate the portion of the document
Find the node that is the root of the portion to validate using a separate schema
Import that node into a brand new document
Validate the brand new document using the other schema file
Caveat: it appears that the document must be built namespace-aware in order for this to work.
Here's some code (this was ripped from various places of my code, so there might be some errors introduced by the copy-and-paste):
// Contains the filename of the xml file
String filename;
// Load the xml data using a namespace-aware builder (the method
// 'stream' simply opens an input stream on a file)
Document document;
DocumentBuilderFactory docBuilderFactory =
DocumentBuilderFactory.newInstance();
docBuilderFactory.setNamespaceAware(true);
document = docBuilderFactory.newDocumentBuilder().parse(stream(filename));
// Create the schema factory
SchemaFactory sFactory = SchemaFactory.newInstance(
XMLConstants.W3C_XML_SCHEMA_NS_URI);
// Load the main schema
Schema schema = sFactory.newSchema(
new StreamSource(stream("foo.xsd")));
// Validate using main schema
schema.newValidator().validate(new DOMSource(document));
// Get the node that is the root for the portion you want to validate
// using another schema
Node node = getSpecialNode(document);
// Build a Document from that node
Document subDocument = docBuilderFactory.newDocumentBuilder().newDocument();
subDocument.appendChild(subDocument.importNode(node, true));
// Determine the schema to use using your own logic
Schema subSchema = parseAndDetermineSchema(document);
// Validate using other schema
subSchema.newValidator().validate(new DOMSource(subDocument));
Take a look at NVDL (Namespace-based Validation Dispatching Language) - http://www.nvdl.org/
It is designed to do what you want to do (validate parts of an XML document that have their own namespaces and schemas).
There is a tutorial here - http://www.dpawson.co.uk/nvdl/ - and a Java implementation here - http://jnvdl.sourceforge.net/
Hope that helps!
Kevin
You need to define a target namespace for each separately-validated portion of the instance document. Then you define a master schema that uses <xsd:import> to reference the schema documents for these components (<xsd:include> only works for schema documents that share the master schema's target namespace).
The limitation with this approach is that you can't let the individual components define the schemas that should be used to validate them. But it's a bad idea in general to let a document tell you how to validate it (i.e., validation should be something that your application controls).
You can also use a "resource resolver" to allow "xml authors" to specify their own schema file, at least to some extent, e.g. https://stackoverflow.com/a/41225329/32453. At the end of the day, you want a fully compliant xml file that can be validated with normal tools anyway :)
