Use xpath instead of XSD object generation for accessing XML details? - java

There is an XML file hosted on a server that I want to parse. Normally I generate an XSD from the XML and then generate Java POJOs from this XSD. Using Jackson I then parse the XML into a Java object representation. Is it not more straightforward to just use XPath? That way I do not need to generate an object hierarchy based on the XML, and I do not need to regenerate it if the XML changes. XPath seems much more concise and intuitive.
Why should I use XSD and object generation instead of XPath?

According to the XML Schema specification, XSD is used for defining the structure, content and semantics of XML documents. This means that you can use an XSD to validate your XML file.
Depending on your circumstances, you might be able to do without generating the whole object tree if all you need is to get some values from the XML file. In this case XPath is the way to go. However, you still might want to have an XSD file in order to validate the XML file before parsing it. This way your software fails fast when the structure of the XML file changes, which tells you that your XPath expressions need to change. For this to work, you shouldn't use an XSD generated from the current XML file (it would always match). Instead, keep a separate, fixed XSD that describes the structure your XPath expressions rely on.
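A minimal sketch of the "validate first, then XPath" idea, using only the APIs bundled with the JDK. The `<order>` document, the schema and the class and method names are made up for illustration; they are not from the question.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class ValidateThenXPath {

    // Hypothetical hand-maintained schema: it describes only the structure
    // that the XPath expression below depends on.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:element name='customer' type='xs:string'/>"
      + "<xs:element name='total' type='xs:decimal'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    static String customerOf(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));

        // Fail fast: throws SAXException if the document no longer matches the schema.
        schema.newValidator().validate(new StreamSource(new StringReader(xml)));

        // Only now extract the value; no object hierarchy is generated.
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("/order/customer", new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><customer>Ada</customer><total>9.50</total></order>";
        System.out.println(customerOf(xml)); // prints "Ada"
    }
}
```

If the server renames `<customer>`, validation throws before the XPath expression gets a chance to silently return an empty string.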

I think both approaches are valid, depending on the circumstances.
At the end of the day, you want to extract the values from that remote xml file and do something with them.
The first criterion to consider is the size of that file and the number of data elements.
If it's just a few, then XPath extraction should be straightforward. However, if that XML file represents a sizable and/or complex data structure, then you probably want to deserialize it into a Java data structure that you can then work with, and JAXB would be a good candidate.
JAXB is going to be easier/better if the remote server adheres to or publishes an XML Schema. If it doesn't, and the format changes often and significantly, you're going to suffer either way, but particularly so with JAXB. There are ways to smooth things over by pre-processing the XML with XSLT to force it into a more reliable form, but that is most likely going to be a partial solution.

Related

How can I dynamically unmarshal XML in Java?

I have a very large xml to unmarshal. I don't want to create POJO classes for this because that would mean creating around 20 classes. Is there a way I can unmarshal this dynamically i.e. without creating POJO classes?
Edit: Here is the link to the article to unmarshal (https://www.ncbi.nlm.nih.gov/pubmed/31297574/?report=xml&format=text)
I want to read this data and store it somewhere in my database.
I am trying to do this with jaxb.
The term "unmarshal" is usually used to mean a process of parsing XML and generating custom POJO objects. If you want to use generic Java objects instead, then you want one of the XML generic tree models. Most people use DOM, which is the oldest and worst of the models but is the default because it comes bundled with the Java platform; my own recommendation would be either JDOM2 or XOM.
If you don't want to create custom classes then you don't want to be using JAXB.
You haven't said in detail what you want to achieve, but for many XML operations, using XSLT or XQuery is going to be much easier than using Java (because processing XML is what they were designed for).
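As a rough illustration of the generic tree approach, here is a sketch that uses the DOM bundled with the JDK (not JDOM2 or XOM, which need extra dependencies). The `<Article>` sample and the names `GenericTreeSketch` and `childValues` are assumptions for the example, not the real PubMed format.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class GenericTreeSketch {

    // Collects the text of each child element of the root into a map,
    // without any custom POJO classes.
    static Map<String, String> childValues(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        Map<String, String> values = new LinkedHashMap<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes and comments; keep elements only.
            if (child instanceof Element) {
                values.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical, heavily simplified input.
        String xml = "<Article><Title>Example</Title><Year>2019</Year></Article>";
        System.out.println(childValues(xml));
    }
}
```

The resulting map could then be written to a database row directly. For deeply nested documents the same loop would have to recurse or be replaced by XPath queries.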
You can check out the DSM library. It's designed to process complex XML and JSON documents while reading the document. You define the mapping in YAML format, so you don't need to create classes to unmarshal into.
The DOM API loads the whole XML document into memory, so it is a poor fit for very large files. DSM uses stream parsing, so you won't run into memory problems. Using DSM is also easier than using DOM.

modifying xml document using xml parsers?

I have an XML document stored in a database table. I need to get the XML, modify a few elements and put the XML back in the database.
I am thinking of using JDOM or JAXB to modify the XML elements. Could you please suggest which one is better regarding performance?
Thanks!
JAXB and JDOM are completely different things. JAXB serializes Java objects into an XML format and vice versa. JDOM simply reads in the XML file and stores it in an in-memory tree, which can then be used to modify the XML itself. So you are better off going with JDOM.
JAXB is meant for the case where you have objects whose attribute values are stored in XML: you parse an XML document, it gives you Java objects, and you can then write these back.
That is quite a bit of work if you simply want to change some values. And it doesn't work with arbitrary XML files; JAXB has its own format linked to your objects' definitions.
JDOM also creates objects, but the objects used are XML objects like Element, Attribute, ...
If you just want to change some values, why not read the XML file as a plain text file and use string operations to make your changes?
Or, if the modification is more logically defined, use XSLT and a stylesheet transformer.
Googling for XSLT and Java will give you tons of examples.
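For the XSLT route, a minimal sketch with the transformer that ships with the JDK might look like this. The stylesheet is the usual identity transform plus one overriding template; the element name `status` and the new value `shipped` are invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltEditSketch {

    // Identity transform plus one override: every node is copied unchanged,
    // except the text inside <status>, which is replaced.
    // "status" and "shipped" are hypothetical names for this example.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='status/text()'>shipped</xsl:template>"
      + "</xsl:stylesheet>";

    static String apply(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The string read from the database column would go here.
        System.out.println(apply("<order><id>7</id><status>new</status></order>"));
    }
}
```

The returned string can be written straight back to the database column. Note that a transformer re-serializes the document, so incidental formatting of the original is not guaranteed to survive.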

How to generate XSD from elements of XML

I have an XML input
<field>
<name>id</name>
<dataType>string</dataType>
<maxlength>42</maxlength>
<required>false</required>
</field>
I am looking for a library or a tool which will take an XML instance document and output a corresponding XSD schema.
I am looking for a Java library with which I can generate an XSD for the above XML structure.
If all you want is an XSD so that the XML you gave conforms to it, you'd be much better off by crafting it yourself rather than using a tool.
No one knows better than you the particularities of the schema, such as which values are valid (for instance, is the <maxlength> element required? Are true and false the only valid values for <required>?).
If you really want to use a tool (I'd only advise using one if you haven't designed the XML and really can't get the real XSD; if you did design it, double-check the generated XSD), you could try Trang. It can infer an XSD schema from a number of example XML documents.
You'll have to take into account that the XSD a tool infers might be incomplete or inaccurate if the XML samples aren't representative enough.
java -jar trang.jar sampleXML.xml inferredXSD.xsd
You can find a usage example of Trang here.
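To make the "craft it yourself" point concrete, here is a sketch of a hand-written schema for the sample above, checked with the JDK's validator. The constraints are assumptions a tool could not infer from one instance: `<maxlength>` is treated as optional and positive, and `<required>` as an `xs:boolean`. The class and method names are also made up.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class HandWrittenFieldSchema {

    // Assumed rules (not stated in the question): maxlength is optional and
    // must be a positive integer; required must be a boolean.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='field'><xs:complexType><xs:sequence>"
      + "<xs:element name='name' type='xs:string'/>"
      + "<xs:element name='dataType' type='xs:string'/>"
      + "<xs:element name='maxlength' type='xs:positiveInteger' minOccurs='0'/>"
      + "<xs:element name='required' type='xs:boolean'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    static boolean isValid(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The document is not well-formed or violates the schema.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String sample = "<field><name>id</name><dataType>string</dataType>"
                      + "<maxlength>42</maxlength><required>false</required></field>";
        System.out.println(isValid(sample));
    }
}
```

A schema inferred from the single sample could easily make `<maxlength>` mandatory or type `<required>` as a plain string, which is exactly the kind of inaccuracy described above.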
You can try an online tool called XMLGrid: http://xmlgrid.net/xml2xsd.html
You could write an XSLT to do something like that. But the problem is, a single document alone is not enough information to generate a schema. Are any of those elements optional? Is there anything missing from that document, that might appear in other instances? How many of a particular element can there be? Do they have to be in that order? There are loads of things that can be expressed in a schema, that are not immediately obvious from one instance of a document that conforms to that schema.
For the people who really want to include it in their Java code to generate an XSD and understand the perils, check out Generate XSD from XML programatically in Java
Try XMLBeans. It ships with some tools, one of which is inst2xsd. You can find specifics here:
http://xmlbeans.apache.org/docs/2.0.0/guide/tools.html
Good luck

Parsing a xml file using Java

I need to parse an XML file using Java and create a bean out of that XML file after parsing.
I need this while using Spring JMS, in which a producer is producing an XML file. First I need to read the XML file and take action accordingly.
I read something about parsing and came up with these options:
xpath
DOM
Which will be the best option to parse the XML file?
Did you check JAXB?
There are three common ways of parsing an XML file: SAX, DOM and StAX.
DOM will parse the whole file and build up a tree in memory. That is great for small files, but obviously if the file is huge then you don't want the entire tree just sitting in memory! SAX is event based: it doesn't keep the document in memory but just fires off a series of events as it reads through the file. StAX is a middle ground between the two: the application moves the cursor forward as it needs, grabbing the data as it goes (so no event callbacks and no huge memory consumption).
Which one you use will really depend on your application. All three have been built into the JDK since Java 6.
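A small sketch of the StAX cursor style described above, using `javax.xml.stream` from the JDK. The `<items>` document and the names `StaxSketch` and `itemNames` are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

class StaxSketch {

    // Pulls the text of every <name> element without building a tree.
    static List<String> itemNames(String xml) throws Exception {
        XMLStreamReader reader =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> names = new ArrayList<>();
        try {
            // The application drives the cursor; nothing is read until next() is called.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "name".equals(reader.getLocalName())) {
                    // Reads the text and leaves the cursor on the matching end tag.
                    names.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items><item><name>a</name></item><item><name>b</name></item></items>";
        System.out.println(itemNames(xml)); // prints [a, b]
    }
}
```

The collected values could then be copied into a bean by hand, which is the part JAXB would otherwise do for you.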
It looks like you receive a serialized object via Java messaging. First have a look at how the object is being serialized. Usually this is done with a library (JAXB, Axis, ...) and you could use the very same library to create a deserializer.
You will need:
The XML schema (an XSD file)
The Java bean class (very helpful, it should exist)
Then, usually the library will create all helper classes and files and you don't have to care about parsing.
If you need to create an object, just extract the needed properties and go on...
I recommend using StAX, see this tutorial for more information.
There are several ways you can parse an XML document into memory and work with it. You mentioned DOM. DOM loads the whole document into memory and then allows you to move between different branches of the XML document.
On the other hand, you could use StAX. It also lets you walk through the document, but it streams the content instead of loading it all at once, which keeps memory use low. The trade-off is that it does not retain the information that has already been read.
Look at http://download.oracle.com/javaee/5/tutorial/doc/bnbem.html. It gives details about both parsing methods and example code. Hope that helps.

Modify XML node but keep the XML file format intact

How may I modify an XML file without changing anything else, such as attribute ordering, tag expansion and encoding? (My preference is the DOM API.)
You could try VTD-XML.
Since this library builds an index while keeping the file content as-is, its manipulation API allows you to "patch" your file while keeping the rest intact.
Using the VTD-XML API, you will be able to navigate your XML like a DOM tree (even using XPath) and do some modifications (insert elements, insert attributes, etc.)
One nice option would be using decentxml. I've successfully used it before for programmatically changing a few attributes of a hand-written XML config file without losing the formatting. It's not DOM, though.
This is not possible with DOM, since DOM does not know the attributes' order.
You could use a low-level API such as StAX (javax.xml.stream), but StAX is not exactly comfortable to use.
