How can I dynamically unmarshal XML in Java? - java

I have a very large xml to unmarshal. I don't want to create POJO classes for this because that would mean creating around 20 classes. Is there a way I can unmarshal this dynamically i.e. without creating POJO classes?
Edit: Here is the link to the article to unmarshal (https://www.ncbi.nlm.nih.gov/pubmed/31297574/?report=xml&format=text)
I want to read this data and store it somewhere in my database.
I am trying to do this with jaxb.

The term "unmarshal" is usually used to mean a process of parsing XML and generating custom POJO objects. If you want to use generic Java objects instead, then you want one of the XML generic tree models. Most people use DOM, which is the oldest and worst of the models but is the default because it comes bundled with the Java platform; my own recommendation would be either JDOM2 or XOM.
If you don't want to create custom classes then you don't want to be using JAXB.
You haven't said in detail what you want to achieve, but for many XML operations, using XSLT or XQuery is going to be much easier than using Java (because processing XML is what they were designed for).
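To make the generic tree model idea concrete, here is a minimal sketch using the DOM API bundled with the JDK (JDOM2 or XOM would look similar with their own classes). The element names `Article`, `Title` and `LastName` and the class name `GenericTreeDemo` are assumptions for illustration, not taken from the linked document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class GenericTreeDemo {

    /** Parses XML into a generic DOM tree; no custom POJO classes are involved. */
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Collects the trimmed text of every element with the given tag name. */
    static List<String> textOf(Document doc, String tagName) {
        List<String> values = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagName(tagName);
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent().trim());
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical sample document, made up for this sketch.
        String xml = "<Article><Title>Sample title</Title>"
                + "<Author><LastName>Doe</LastName></Author>"
                + "<Author><LastName>Roe</LastName></Author></Article>";
        Document doc = parse(xml);
        System.out.println(textOf(doc, "Title"));
        System.out.println(textOf(doc, "LastName"));
    }
}
```

The values come back as plain strings and lists, so they can be handed to whatever database layer you already have.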

You can check out the DSM library. It's designed to process complex XML and JSON documents while reading the document. You define the mapping in YAML format, so you don't need to create classes to unmarshal.
The DOM API loads the whole XML document into memory, which makes it a poor fit for very large XML. DSM uses stream parsing, so you won't face memory problems. Using DSM is also easier than using DOM.

Related

JAXB on-demand unmarshalling

Say I have a very large XML file to unmarshal, like 2 MB. I don't want to unmarshal the file and end up with (approx.) 2 MB of objects in memory.
Instead, I want to unmarshal portions of the file on demand (i.e. as I navigate through the object structure) and clean up the references to the objects I have already read, making them available to the garbage collector.
Is it possible to tweak the JAXB implementation so that it unmarshals the XML on demand instead of all at once?
Take a look at this solution: Partial Unmarshalling of an XML using JAXB to skip some xmlElement
A specialist class was created to provide a mechanism to read the xml in parts.
A comprehensive example can be found here.
It uses StAX to advance to the right position within the xml file and JAXB to unmarshal the domain object.
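As a rough sketch of the StAX half of that approach: the reader below advances to each record element and hands it to a callback, keeping nothing afterwards. The element name `item`, the attribute `id` and the `RecordHandler` interface are assumptions made up for this example. To keep it runnable on a plain JDK, the record is read by hand; with JAXB on the classpath, the marked line is where `Unmarshaller.unmarshal(XMLStreamReader, Class)` would take over.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class RecordStreamDemo {

    /** Receives one record at a time; nothing is retained between calls. */
    interface RecordHandler {
        void handle(String id, String name);
    }

    /** Streams through the XML and calls the handler for every <item> element. */
    static void forEachItem(String xml, RecordHandler handler) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "item".equals(reader.getLocalName())) {
                        // JAXB variant (assumption): unmarshaller.unmarshal(reader, Item.class)
                        String id = reader.getAttributeValue(null, "id");
                        // Reads the text content and leaves the cursor on </item>.
                        String name = reader.getElementText();
                        handler.handle(id, name);
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not stream XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<items><item id=\"1\">alpha</item><item id=\"2\">beta</item></items>";
        forEachItem(xml, (id, name) -> System.out.println(id + " -> " + name));
    }
}
```

Because each record is processed and then dropped, memory use stays roughly proportional to one record rather than to the whole file.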

Use xpath instead of XSD object generation for accessing XML details?

There is an XML file hosted on a server that I want to parse. Normally I generate an XSD from the XML and then generate the Java POJOs from this XSD. Using Jackson I then parse the XML to a Java object representation. Is it not more straightforward to just use XPath? This means I do not need to generate an object hierarchy based on the XML, and I also do not need to regenerate the object hierarchy if the XML changes. XPath seems much more concise and intuitive.
Why should I use XSD and object generation instead of XPath?
According to the XML Schema specification XSD is used for defining the structure, content and semantics of XML documents. This means that you can use XSD to validate your XML file.
Depending on your circumstances you might be able to do without generating the whole object tree if all you need is to get some values from the XML file. In this case XPath is the way to go. However, you still might want to have an XSD file in order to validate the XML file before parsing it. This way you make your software fail fast, when the structure of your XML file changes, which will suggest that you change your XPath expressions. But for this to work, you shouldn't use the XSD you generate from your XML file, instead you should have a separate pre-generated XSD file which complies with the XPath expressions.
I think both approaches are valid, depending on the circumstances.
At the end of the day, you want to extract the values from that remote xml file and do something with them.
The first criterion to consider is the size of that file and the number of data elements.
If it's just a few, then XPath extraction should be straightforward. However, if that XML file represents a sizable and/or complex data structure, then you probably want to deserialize it into a Java data structure that you can then work with, and JAXB would be a good candidate.
JAXB is going to be easier/better if the remote server adheres or publishes an XML Schema. If it doesn't, and changes often and significantly, you're going to suffer either way, but particularly so with JAXB. There are ways to smooth things over by pre-processing that xml with XSLT to force it into a more reliable form, but that is going to be a partial solution most likely.
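For the "just a few values" case, a minimal XPath sketch using only the JDK's `javax.xml.xpath` package could look like this. The document layout (`order`, `customer`, `name`) and the class name `XPathDemo` are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathDemo {

    /** Evaluates an XPath expression against the XML and returns the result as a string. */
    static String evaluate(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical document; in practice this would be fetched from the server.
        String xml = "<order id=\"42\"><customer><name>Ada</name></customer></order>";
        System.out.println(evaluate(xml, "/order/customer/name"));
        System.out.println(evaluate(xml, "/order/@id"));
    }
}
```

Note that each call here re-parses the document, which is fine for a handful of values; for many expressions you would parse once into a DOM and evaluate against that.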

modifying xml document using xml parsers?

I have an XML document stored in a database table. I need to get the XML, modify a few elements and put the XML back in the database.
I am thinking of using JDOM or JAXB to modify the XML elements. Could you please suggest which one is better in terms of performance?
Thanks!
JAXB and JDOM are completely different things. JAXB will serialize Java objects into an XML format and vice versa. JDOM simply reads in the XML file and stores it in an in-memory tree, which can then be used to modify the XML itself. So you are better off going with JDOM.
JAXB is meant for cases where you have objects whose attribute values are stored in XML: you parse an XML document, it gives you Java objects, and you can then write these back.
That is quite a bit of work if you simply want to change some values. And it doesn't work with arbitrary XML files; JAXB has its own format linked to your objects' definitions.
JDOM also creates objects, but the objects used are XML objects like Element, Attribute, ...
If you just want to change some values, why not read the XML file as a plain text file and use string operations to make your changes?
Or, if the modification is more logically defined, use XSLT with a stylesheet and a transformer.
Googling for XSLT and Java will give you tons of examples.
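A minimal sketch of the "read into a tree, change a value, write it back" approach follows. It uses the JDK's built-in DOM and an identity `Transformer` rather than JDOM so that it runs without extra libraries; JDOM follows the same pattern with its own classes. The element names and the class name `XmlEditDemo` are assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XmlEditDemo {

    /** Sets the text of every element named tagName and returns the edited XML. */
    static String setElementText(String xml, String tagName, String newValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList nodes = doc.getElementsByTagName(tagName);
            for (int i = 0; i < nodes.getLength(); i++) {
                nodes.item(i).setTextContent(newValue);
            }

            // A transformer created without a stylesheet copies the tree unchanged.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not edit XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical XML as it might come out of the database column.
        String stored = "<user><name>old</name><role>admin</role></user>";
        System.out.println(setElementText(stored, "name", "new"));
    }
}
```

The returned string is what you would write back to the database table.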

Parsing a xml file using Java

I need to parse an XML file using Java and create a bean from that XML file after parsing.
I need this while using Spring JMS, in which the producer is producing an XML file. First I need to read the XML file and take action accordingly.
I read something about parsing and came up with these options:
xpath
DOM
Which will be the best option to parse the XML file?
Did you check JAXB?
There are three ways of parsing an XML file: SAX, DOM and StAX.
DOM will parse the whole file and build up a tree in memory. That is great for small files, but obviously if the file is huge then you don't want the entire tree just sitting in memory! SAX is event based: it doesn't load anything into memory per se but just fires off a series of events as it reads through the file. StAX is a middle ground between the two: the application moves the cursor forward as it needs, grabbing the data as it goes (so no event firing or huge memory consumption).
Which one you use will really depend on your application; all three have had built-in implementations since Java 6.
It looks like you receive a serialized object via Java messaging. Have a look first at how the object is being serialized. Usually this is done with a library (JAXB, Axis, ...), and you could use the very same library to create a deserializer.
You will need:
The XML schema (an XSD file)
The Java bean class (very helpful, it should exist)
Then, usually, the library will create all helper classes and files and you don't have to care about parsing.
If you need to create an object, just extract the needed properties and go on...
I recommend using StAX, see this tutorial for more information.
There are several ways you can parse an XML document into memory and work with it. You mentioned DOM. DOM loads the whole document into memory and then allows you to move between different branches of the XML document.
On the other hand, you could use StAX. Unlike DOM, it streams the content of the XML document, which allows better use of memory. The trade-off is that it does not retain the information that has already been read.
Look at: http://download.oracle.com/javaee/5/tutorial/doc/bnbem.html It gives details about both parsing methods and example code. Hope that helps.

What is the difference between JAXP and JAXB?

What is the difference between JAXP and JAXB?
JAXP (Java API for XML Processing) is a rather outdated umbrella term covering the various low-level XML APIs in JavaSE, such as DOM, SAX and StAX.
JAXB (Java Architecture for XML Binding) is a specific API (the stuff under javax.xml.bind) that uses annotations to bind XML documents to a java object model.
JAXP is the Java API for XML Processing, which provides a platform for parsing XML files with DOM or SAX parsers.
JAXB, on the other hand, is the Java Architecture for XML Binding; it makes it easier to access XML documents from applications written in the Java programming language.
For example, given a Computer.xml file, accessing the data with JAXP means performing the following steps:
Create a SAX parser or DOM parser and then parse the data. With DOM, this may be memory intensive if the document is too big. A SAX parser starts at the beginning of the document, and when it encounters something significant (in SAX terms, an "event") such as the start of an XML tag or the text inside a tag, it makes that data available to the calling application.
Then create a content handler that defines the methods to be notified by the parser when it encounters an event. These methods, known as callback methods, take the appropriate action on the data they receive.
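The two JAXP steps above (create a SAX parser, then supply a content handler with callback methods) can be sketched roughly as follows. The layout of Computer.xml is not given here, so the `computer` and `name` elements, like the class names, are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxDemo {

    /** Content handler whose callbacks collect the text of every <name> element. */
    static final class NameHandler extends DefaultHandler {
        final List<String> names = new ArrayList<>();
        private StringBuilder current; // non-null only while inside a <name> element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("name".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per element, so append rather than assign.
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("name".equals(qName)) {
                names.add(current.toString().trim());
                current = null;
            }
        }
    }

    /** Step 1: create the parser. Step 2: let it notify the handler of events. */
    static List<String> readNames(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            NameHandler handler = new NameHandler();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<computers><computer><name>Alpha</name></computer>"
                + "<computer><name>Beta</name></computer></computers>";
        System.out.println(readNames(xml));
    }
}
```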
If the same operations are performed with JAXB, the following steps are needed to access Computer.xml:
Bind the schema for the XML document.
Unmarshal the document into Java content objects. The Java content objects represent the content and organization of the XML document, and are directly available to your program.
After unmarshalling, your program can access and display the data in the XML document simply by accessing the data in the Java content objects and then displaying it. There is no need to create and use a parser and no need to write a content handler with callback methods. What this means is that developers can access and process XML data without having to know XML or XML processing.
The key difference is the role the XML Schema plays. JAXP is the older API and parses without relying on an XML Schema, while JAXB handles the schema binding as the very first step.
