JAVA: Build XML document using XPath expressions - java

I know this isn't really what XPath is for but if I have a HashMap of XPath expressions to values how would I go about building an XML document. I've found dom-4j's
DocumentHelper.makeElement(branch, xpath) except it is incapable of creating attributes or indexing. Surely a library exists that can do this?
Map xMap = new HashMap();
xMap.put("root/entity/#att", "fooattrib");
xMap.put("root/array[0]/ele/#att", "barattrib");
xMap.put("root/array[0]/ele", "barelement");
xMap.put("root/array[1]/ele", "zoobelement");
would result in:
<root>
<entity att="fooattrib"/>
<array><ele att="barattrib">barelement</ele></array>
<array><ele>zoobelement</ele></array>
</root>

I looked for something similar a few years ago - a sort of writeable XPath. In the end, having not found anything, I hacked up something which essentially built up the XML document by adding new nodes to parent expressions:
parent="/" element="root"
parent="/root" element="entity"
parent="/root/entity" attribute="att" value="fooattrib"
parent="/root" element="array"
parent="/root" element="ele" text="barelement"
(This was itself to be governed by an XML configuration file, hence the appearance of above.)
It would be tempting to try an automate some of this to just take the last path element, and make something of it, but I always felt that there were XPath expressions I could write which such a dumbheaded approach would get wrong.
Another approach I considered, though did not implement (the above was "good enough"), was to use the excellent Jaxen to generate elements that did not exist, on the fly if it didn't already exist.
From the Jaxen FAQ:
The only thing required is an implementation of the interface org.jaxen.Navigator. Not all of the interface is required, and a default implementation, in the form of org.jaxen.DefaultNavigator is also provided.
The DOMWriterNavigator would wrap and existing DOMNavigator, and then use the makeElement method if the element didn't exist. However, even with this approach,
you'd probably have to do some pre/post processing of the XPath query for things like attributes and text() functions.

The best I was able to come up with is to use a JAXB implementation, which will marshall/unmarshal objects to xml and then I used Dozer (http://dozer.sourceforge.net/documentation/mapbackedproperty.html) to map the xpaths which were keys in a map to the JAXB object method setters.
<mapping type="one-way" map-id="TC1">
<class-a>java.util.Map</class-a>
<class-b>org.example.Foo</class-b>
<field>
<a key="root/entity/#att">this</a>
<b>Foo.entity.att</b>
<a-hint>java.lang.String</a-hint>
</field>
It's more of a two step solution, but really worked for me.

I also wanted same kind of requirement where nature is so dynamic and dont want to use XSLT or any object mapping frameworks, so i've implemented this code in java and written blog on it please visit,
http://ganesh-kandisa.blogspot.com/2013/08/dynamic-xml-transformation-in-java.html
or fork code at git repository,
https://github.com/TheGanesh/DynamicXMLTransformer

Related

How can I get next available node in DOM with schema?

I need to query the names of "available" sub elements of an element node in DOM.
For example if schema says "There can be age, name, occupation elements under person element." then I wanna function like this,
import org.w3c.dom.Element;
Element person_element;
String[] names_of_available_sub_element =
get_available_sub_element_names(person_element);
which makes
names_of_available_sub_element == {"age", "name", "occupation"}.
How can I implement this function?
This isn't easy, but it can be done if you're prepared to put a lot of work in.
There are a number of approaches to getting information from an XSD schema. You could try and process the XSD source code, but I wouldn't recommend that, because there are so many things you have to take into account (wildcards, substitution groups, types derived by restriction and extension, and so on). A better approach is to use some kind of API that gives you access to the information in digested form. For that, some possible suggestions are:
(a) Xerces gives you a Java API providing programmatic access to the compiled schema.
(b) Saxon gives you two possibilities: (i) the SCM file, which is an XML representation of the compiled schema, and (ii) an XPath API giving programmatic access to the compiled schema using extension functions.
Do remember that knowing you're at a "person" element isn't (in the general case) enough to determine what the permitted children are. That's because there can be global and local elements using the name "person", but with different types. Whether this is a problem in your case depends on what you are trying to achieve, which you haven't really explained in much detail.

How to tell XStream to ignore child elements that have the 'class' attribute

I have some persisted XML that was generated by XStream, and looks like:
<CalculationDefinition>
<id>47</id>
<version>3</version>
<name>RHO error (pts)</name>
<expression class="com.us.provider.expression.AbsoluteValue">
....
</expression>
</CalculationDefinition>
I want to persist this content differently now, and want to tell XStream to simply ignore the expression element entirely. There's many links around that talk about how to do this with a MapperWrapper (eg XStream JIRA) but as far I can tell it doesn't work for an element that has a 'class' attribute.
This can be worked around by leaving an 'expression' field in the CalculationDefinition, but I'd rather not have to keep it there now that it's not used in code.
You could filter the incoming XML using XSL and removing the expression nodes, before passing it to XStream.

XML formatted Java string manipulation

I'm developing a plugin that has node(computer) objects with attributes like:
String name
String description
String labels
Launcher computerLauncher
...
I can convert the node(computer) object to an XML-formated String like:
String xml = jenkins.instance.toXML(node);
Which gives me a string:
<name>Computer1</name>
<description>This is a description</description>
<labels>label1 label2</labels>
<launcher>windows.object.launcher.12da1</launcher>
Then I can go the other way back:
Node node = jenkins.instance.fromXML(xml);
I have no methods for changing attributes in a Node so I want to convert it to XML, change som attributes and then make it a Node again.
I see two options
Manipulate the XML with some String methods to replace everything in between the <> tags.
Try to cast the XML string to something like a real Object and manipulate it that way.
Not sure what would be the best approach.
Why invent something new when there already is support for all that using Java's DOM (Document Object Model) API?
Use a DocumentBuilderFactory to get a DocumentBuilder and create a Document instance. With this you can create the 'Node' objects (please note that the example you posted is actually not valid XML, it's missing a root node) in your toXML method, serializing the Document to a String could be done by using a Transformer.
With the DOM API you can also modify the attributes of your existing elements.
Parsing the Document instance from an XML string is realized again with the help of the DocumentBuilder, using DocumentBuilder#parse.
If your DOM operations are not too complex this should be a nice, quick way to accomplish your goal.
It makes sense to me to use a DOM-like approach. But don't use DOM itself: there are much better alternatives like JDOM and XOM that have much friendlier APIs.
recenty I had to manipulate large XML files (my software had to create some XML files dynamically and get input data from some other XML files). To do this I've used JAXB, which is a very neat API that marshalls XML files into Java objects and Java objects into XML files automatically.
However to do this I had to create a XSD file to specify the XMLs that I would need to read and write from.
Therefore JAXB requires more work to set up than DOM, so if your needs are simple I suggest that you use DOM, however if your needs are more complex, then I would suggest JAXB.

using xsd:any for extensible schema

Until now, I've been handling extensions by defining a placeholder element that has "name" and "value" attributes as shown in the below example
<root>
<typed-content>
...
</typed-content>
<extension name="var1" value="val1"/>
<extension name="var2" value="val2"/>
....
</root>
I am now planning to switch to using xsd:any. I'd appreciate if you can help me choose th best approach
What is the value add of xsd:any over my previous approach if I specify processContents="strict"
Can a EAI/ESB tool/library execute XPATH expressions against the arbitrary elements I return
I see various binding tools treating this separately while generating the binding code. Is this this the same case if I include a namespace="http://mynamespace" and provide the schema for the "http://mynamespace" during code gen time?
Is this WS-I compliant?
Are there any gotchas that I am missing?
Thank you
Using <xsd:any processContents="strict"> gives people the ability to add extensions to their XML instance documents without changing the original schema. This is the critical benefit it gives you.
Yes. tools than manipulate the instances don't care what the schema looks like, it's the instance documents they look at. To them, it doesn't really matter if you use <xsd:any> or not.
Binding tools generally don't handle <xsd:any> very elegantly. This is understandable, since they have no information about what it could contain, so they'll usually give you an untyped placeholder. It's up the the application code to handle that at runtime. JAXB is particular (the RI, at least) makes a bit of a fist of it, but it's workable.
Yes. It's perfectly good XML Schema practice, and all valid XML Schema are supported by WS-I
<xsd:any> makes life a bit harder on the programmer, due to the untyped nature of the bindings, but if you need to support arbitrary extension points, this is the way to do it. However, if your extensions are well-defined, and do not change, then it may not be worth the irritation factor.
Regarding point 3
Binding tools generally don't handle
very elegantly. This is
understandable, since they have no
information about what it could
contain, so they'll usually give you
an untyped placeholder. It's up the
the application code to handle that at
runtime. JAXB is particular (the RI,
at least) makes a bit of a fist of it,
but it's workable.
This corresponds to the #XmlAnyElement annotation in JAXB. The behaviour is as follows:
#XmlAnyElement - Keep All as DOM Nodes
If you annotate a property with this annotation the corresponding portion of the XML document will be kept as DOM nodes.
#XMLAnyElement(lax=true) - Convert Known Elements to Domain Objects
By setting lax=true, if JAXB has a root type corresponding to that QName then it will convert that chunk to a domain object.
http://bdoughan.blogspot.com/2010/08/using-xmlanyelement-to-build-generic.html

Java object to XML schema

If you have a Java object and an XML schema (XSD), what is the best way to take that object and convert it into an xml file in line with the schema. The object and the schema do not know about each other (in that the java classes weren't created from the schema).
For example, in the class, there may be an integer field 'totalCountValue' which would correspond to an element called 'countTotal' in the xsd file. Is there a way of creating a mapping that will say "if the object contains an int totalCountValue, create an element called 'countTotal' and put it into the XML".
Similarly, there may be a field in the object that should be ignored, or a list in the object that should correspond to multiple XML elements.
I looked at XStream, but didn't see any (obvious) way of doing it. Are there other XML libraries that can simplify this task?
I believe this can be achieved via JAXB using it's annotations. I've usually found it much easier to generate Objects from JAXB ( as defined in your schema) using XJC than to map an existing Java object to match my schema. YMMV.
I'm doing Object do XML serialization with XStream. What don't you find "obvious" with this serializer? Once you get the hang of it its very simple.
In the example you provided you could have something like this:
...
XStream xstream = new XStream(new DomDriver());
xstream.alias("myclass", MyClass.class);
xstream.aliasField("countTotal", MyClass.class, "totalCountValue");
String xml = xstream.toXML(this);
...
for this sample class:
class MyClass {
private int totalCountValue;
public MyClass() {
}
}
If you find some serializer more simple or "cool" than this please share it with us. I'm also looking for a change...
Check the XStream mini tutorial here
I use a java library called JiBx to do this work. You need to write a mapping file (in XML) to describe how you want the XML Schema elements to map to the java objects. There are a couple of generator tools to help with automating the process. Plus it's really fast.
I tried out most of the libraries suggested in order to see which one is most suited to my needs. I also tried out a library that was not mentioned here, but suggested by a colleague, which was a StAX implementation called Woodstox.
Admittedly my testing was not complete for all of these libraries, but for the purpose mentioned in the question, I found Woodstox to be the best. It is fastest for marshalling (in my testing, beating XStream by around 30~40%). It is also fairly easy to use and control.
The drawback to this approach is that the XML created (since it is defined by me) needs to be run through a validator to ensure that it is correct with the schema.
You can use a library from Apache Commons called Betwixt. It can map a bean to XML and then back again if you need to round trip.
Take a look at JDOM.
I would say JAXB or Castor. I have found Castor to be easier to use and more reliable, but JAXB is the standard

Categories