Modify XML node but keep the XML file format intact - java

How can I modify an XML file without changing anything else, such as attribute ordering, tag expansion, and encoding? (My preference is the DOM API.)

You could try VTD-XML.
Since this library builds an index while keeping the file content as-is, its manipulation API allows you to "patch" your file while keeping the rest intact.
Using the VTD-XML API, you can navigate your XML like a DOM tree (even using XPath) and make modifications (insert elements, insert attributes, etc.).

One nice option would be decentxml. I've successfully used it before to programmatically change a few attributes of a hand-written XML config file without losing the formatting. It's not DOM, though.

This is not possible with DOM, since DOM does not preserve the attributes' order.
You could use a low-level API such as StAX (javax.xml.stream), but StAX is not exactly comfortable to use.

Related

Writing an XML rule parser in java

I have a complex xml file which has multi-level elements. I have to parse the XML file and based on the elements present, I have to handle the incoming request. I can use JAXB to generate the classes and parse the xml. But to go through the multi-level elements and match against the rules makes the program too complex and heavy (leads to 4-5 levels of loops). Is there an efficient and lighter way of achieving the same?
Depending on whether you need to keep the data you read in memory, you can rely on these parsers:
DOM (Document Object Model) parsers store the XML data in an in-memory tree structure.
JDOM (Java DOM) is a DOM-like implementation in Java, with its own API.
SAX (Simple API for XML) parsers traverse an XML stream sequentially and push an event to user-supplied callbacks for each piece of data read.
StAX (Streaming API for XML) parsers read an XML stream in pull mode: the application asks for the next event when it is ready for it.
All of them ship with the standard runtime (JRE), except JDOM, which is a separate open-source library.
So, if you are looking for an efficient way to process XML and make decisions based on the data read, StAX may suit your needs, because as soon as you have the data you need, you can stop reading and discard the rest of the input XML.
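As a minimal sketch of the "stop reading as soon as you have what you need" idea, the following uses the JDK's StAX cursor API. The class name, method name, and sample XML are made up for illustration, and the method assumes the target element contains only text.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of any library.
final class StaxFirstMatch {

    /** Returns the text of the first element with the given local name, or null if there is none. */
    static String firstElementText(String xml, String localName) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && localName.equals(reader.getLocalName())) {
                    // getElementText() reads up to the matching end tag. Returning here
                    // means the remainder of the input is never parsed.
                    return reader.getElementText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<rules><rule><type>allow</type></rule><rule><type>deny</type></rule></rules>";
        System.out.println(firstElementText(xml, "type")); // prints "allow"
    }
}
```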
Update
To apply matching rules over the whole document, I recommend using XPath over DOM.
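A rough sketch of what XPath over DOM could look like with the JDK's built-in APIs. The rule file layout (rules/group/rule with a path attribute and an action child) and all names are invented for this example. The point is that one expression replaces several levels of nested loops.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; the XML layout is an assumption for illustration.
final class XPathRules {

    /** Parses the rule file once into a DOM tree that can be queried repeatedly. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the action of the rule matching the request path, or "" if no rule matches. */
    static String actionFor(Document rules, String requestPath) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the request path as an XPath variable instead of concatenating it
        // into the expression, so quotes in the input cannot alter the query.
        xpath.setXPathVariableResolver(
                name -> "path".equals(name.getLocalPart()) ? requestPath : null);
        return xpath.evaluate("/rules/group/rule[@path = $path]/action", rules);
    }

    public static void main(String[] args) throws Exception {
        Document rules = parse(
                "<rules><group name=\"web\">"
                + "<rule path=\"/login\"><action>allow</action></rule>"
                + "<rule path=\"/admin\"><action>deny</action></rule>"
                + "</group></rules>");
        System.out.println(actionFor(rules, "/admin")); // prints "deny"
    }
}
```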

Use xpath instead of XSD object generation for accessing XML details?

There is an XML file hosted on a server that I want to parse. Normally I generate an XSD from the XML and then generate the Java POJOs from this XSD. Using Jackson I then parse the XML into a Java object representation. Is it not more straightforward to just use XPath? That way I do not need to generate an object hierarchy based on the XML, and I do not need to regenerate the object hierarchy if the XML changes. XPath seems much more concise and intuitive.
Why should I use XSD and object generation instead of XPath?
According to the XML Schema specification XSD is used for defining the structure, content and semantics of XML documents. This means that you can use XSD to validate your XML file.
Depending on your circumstances you might be able to do without generating the whole object tree if all you need is to get some values from the XML file. In this case XPath is the way to go. However, you still might want to have an XSD file in order to validate the XML file before parsing it. This way you make your software fail fast, when the structure of your XML file changes, which will suggest that you change your XPath expressions. But for this to work, you shouldn't use the XSD you generate from your XML file, instead you should have a separate pre-generated XSD file which complies with the XPath expressions.
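The validate-first, then-extract flow described here might look roughly like this with the JDK's javax.xml.validation and javax.xml.xpath packages. The schema, element names, and class name are hypothetical; in practice the XSD would be a separately maintained file rather than an inline string.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class; the schema below is an assumption for illustration.
final class ValidateThenXPath {

    // Hand-maintained schema that matches what the XPath expression expects.
    static final String ORDER_XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"order\"><xs:complexType><xs:sequence>"
            + "<xs:element name=\"id\" type=\"xs:int\"/>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "</xs:schema>";

    /** Validates the document, then extracts /order/id. Throws SAXException if the XML is invalid. */
    static String readOrderId(String xml)
            throws SAXException, IOException, XPathExpressionException {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(ORDER_XSD)));
        // Fail fast: with no custom ErrorHandler, the first validation error is thrown.
        schema.newValidator().validate(new StreamSource(new StringReader(xml)));
        return XPathFactory.newInstance().newXPath()
                .evaluate("/order/id", new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readOrderId("<order><id>42</id></order>")); // prints "42"
    }
}
```

If the remote structure changes, for example the id element is renamed, validation fails before the XPath expression silently returns an empty string.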
I think both approaches are valid, depending on the circumstances.
At the end of the day, you want to extract the values from that remote xml file and do something with them.
The first criterion to consider is the size of that file and the number of data elements.
If it's just a few, then XPath extraction should be straightforward. However, if that XML file represents a sizable and/or complex data structure, then you probably want to deserialize it to a Java data structure that you can then work with, and JAXB would be a good candidate.
JAXB is going to be easier/better if the remote server adheres to or publishes an XML Schema. If it doesn't, and the XML changes often and significantly, you're going to suffer either way, but particularly so with JAXB. There are ways to smooth things over by pre-processing that XML with XSLT to force it into a more reliable form, but that is most likely going to be a partial solution.

modifying xml document using xml parsers?

I have an XML document stored in a database table. I need to get the XML, modify a few elements, and put the XML back in the database.
I am thinking of using JDOM or JAXB to modify the XML elements. Could you please suggest which one is better with regard to performance?
Thanks!
JAXB and JDOM are completely different things. JAXB serializes Java objects into an XML format and vice versa. JDOM simply reads in the XML file and stores it in a DOM-like tree, which can then be used to modify the XML itself. So you are better off with JDOM.
JAXB is meant for cases where the field values of your objects are stored in XML: you parse an XML document, get Java objects, and can then write them back.
That is quite a bit of work if you simply want to change some values. It also doesn't work with arbitrary XML files; JAXB has its own format linked to your objects' definitions.
JDOM also creates objects, but these are XML objects like Document, Element, Attribute, ...
If you just want to change some values, why not read the XML file as a plain text file and use string operations to make your changes?
Or, if the modification is defined more logically, use XSLT with a stylesheet processor.
Googling for XSLT and Java will give you tons of examples.
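For the XSLT route, a minimal sketch using the JDK's javax.xml.transform API could look like the following. The stylesheet is an identity transform with one overriding template; the config/timeout element names and the replacement value are invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical example class; element names are assumptions.
final class XsltEdit {

    // Copies everything unchanged, except the text inside /config/timeout.
    static final String STYLESHEET =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output omit-xml-declaration=\"yes\"/>"
            + "<xsl:template match=\"@*|node()\">"
            + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
            + "</xsl:template>"
            + "<xsl:template match=\"/config/timeout/text()\">60</xsl:template>"
            + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML string and returns the result as a string. */
    static String apply(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // The XML would come from the database column in the scenario above.
        System.out.println(apply("<config><timeout>30</timeout><name>demo</name></config>"));
    }
}
```

The more specific template wins over the identity template for the matched text node, so only that value changes while the rest of the document is copied through.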

Editing a BIG XML via DOM parser

Suppose there is a very big XML file and a DOM parser is used to parse it.
Now there is a requirement to add/delete elements, i.e. to edit the XML.
How can the XML be edited when the entire document cannot be loaded due to memory constraints?
What could be the strategy to solve this?
You may consider using a SAX parser instead, which doesn't keep the whole document in memory. It will be faster and will also use much less memory.
As two other answers mentioned already, a SAX parser will do the trick. Your other alternative to DOM is a StAX parser.
Traditionally, XML APIs are either:
DOM based - the entire document is read into memory as a tree structure for random access by the calling application
event based - the application registers to receive events as entities are encountered within the source document.
Both have advantages; the former (for example, DOM) allows for random
access to the document, the latter (e.g. SAX) requires a small memory
footprint and is typically much faster.
These two access metaphors can be thought of as polar opposites. A
tree based API allows unlimited, random access and manipulation, while
an event based API is a 'one shot' pass through the source document.
StAX was designed as a median between these two opposites. In the StAX
metaphor, the programmatic entry point is a cursor that represents a
point within the document. The application moves the cursor forward -
'pulling' the information from the parser as it needs. This is
different from an event based API - such as SAX - which 'pushes' data
to the application - requiring the application to maintain state
between events as necessary to keep track of location within the
document.
StAX is my preferred approach for handling large documents. If DOM is a requirement, check out DOM implementations like Xerces that support lazy construction of DOM nodes:
http://xerces.apache.org/xerces-j/faq-write.html#faq-4
Your assumption that memory constraints prevent loading the whole XML document may only apply to DOM. VTD-XML loads the entire XML into memory and does so efficiently (1.3x the size of the XML document), in terms of both memory and performance:
http://sdiwc.us/digitlib/journal_paper.php?paper=00000582.pdf
Another distinct benefit, which no other XML framework in existence has, is its incremental update capability:
http://www.devx.com/xml/Article/36379
As stivlo mentioned you can use a SAX parser for reading the XML.
For writing the XML, however, you can write to a FileOutputStream as plain text. I am sure your requirements will state after which tag or under which tag the new data should be inserted.
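The streaming strategy suggested in these answers can be sketched with StAX alone: read events one at a time, skip the elements to delete, inject new ones at the right tag, and write everything else straight through. The names below are hypothetical, and strings are used instead of files only to keep the example short; for a large document you would pass file streams to the factories instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical example class, not part of any library.
final class StaxStreamEdit {

    /**
     * Copies the input, dropping every dropName subtree and appending
     * a newName element with newText just before each closing parentName tag.
     */
    static String edit(String xml, String dropName, String parentName,
                       String newName, String newText) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();

        int skipDepth = 0; // > 0 while inside a subtree that is being deleted
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (skipDepth > 0) {
                if (event.isStartElement()) skipDepth++;
                else if (event.isEndElement()) skipDepth--;
                continue; // nothing inside the dropped subtree is written
            }
            if (event.isStartElement()
                    && dropName.equals(event.asStartElement().getName().getLocalPart())) {
                skipDepth = 1;
                continue;
            }
            if (event.isEndElement()
                    && parentName.equals(event.asEndElement().getName().getLocalPart())) {
                // Insert the new element as the last child of the parent.
                writer.add(events.createStartElement("", "", newName));
                writer.add(events.createCharacters(newText));
                writer.add(events.createEndElement("", "", newName));
            }
            writer.add(event);
        }
        writer.close();
        reader.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<items><item>keep</item><obsolete><x>1</x></obsolete></items>";
        System.out.println(edit(xml, "obsolete", "items", "item", "added"));
    }
}
```

Only the current event is held in memory at any time, so the memory footprint stays small regardless of document size. Note that the output is re-serialized, so details such as whitespace inside tags are not guaranteed to match the input byte for byte.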

Parsing a xml file using Java

I need to parse an XML file using Java and create a bean from that XML file after parsing.
I need this while using Spring JMS, in which the producer produces an XML file. First I need to read the XML file and take action accordingly.
I read a bit about parsing and came up with these options:
xpath
DOM
Which will be the best option to parse the XML file?
Did you check JAXB?
There are three ways of parsing an XML file: SAX, DOM and StAX.
DOM will parse the whole file and build up a tree in memory - great for small files but obviously if this is huge then you don't want the entire tree just sitting in memory! SAX is event based - it doesn't load anything into memory per-se but just fires off a series of events as it reads through the file. StAX is a median between the two, the application moves the cursor forward as it needs, grabbing the data as it goes (so no event firing or huge memory consumption.)
Which one you use will really depend on your application - all have had built-in libraries since Java 6.
It looks like you receive a serialized object via Java messaging. First have a look at how the object is being serialized. Usually this is done with a library (JAXB, Axis, ...) and you could use the very same library to create a deserializer.
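To make the event-based SAX style described above concrete, here is a small sketch that fills a bean from a message using only the JDK's SAX parser. The Order bean, its fields, and the XML layout are hypothetical and would need to match whatever the JMS producer actually sends.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; the bean and element names are assumptions.
final class SaxOrderParser {

    /** Simple bean populated from the XML message. */
    static final class Order {
        String id;
        String customer;
    }

    static Order parse(String xml) throws Exception {
        Order order = new Order();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                text.setLength(0); // start collecting text for the new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per element, so append rather than assign.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("id".equals(qName)) {
                    order.id = text.toString().trim();
                } else if ("customer".equals(qName)) {
                    order.customer = text.toString().trim();
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return order;
    }

    public static void main(String[] args) throws Exception {
        Order order = parse("<order><id>17</id><customer>Ada</customer></order>");
        System.out.println(order.id + " " + order.customer); // prints "17 Ada"
    }
}
```

The handler has to keep its own state (the text buffer) between callbacks, which is the bookkeeping that StAX and DOM spare you.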
You will need:
The xml schema (a xsd file)
The Java bean class (very helpful, it should exist)
Then, usually the library will create all helper classes and files and you don't have to care about parsing.
If you need to create an object, just extract the needed properties and go on...
I recommend using StaX, see this tutorial for more information.
There are several ways you can parse an XML document into memory and work with it. You mentioned DOM. DOM loads the whole document into memory and then allows you to move between different branches of the XML document.
On the other hand, you could use StAX. Like DOM, it lets you walk through the document, but it streams the content instead of loading it all at once, which keeps memory usage low. The trade-off is that it does not retain information that has already been read.
Look at : http://download.oracle.com/javaee/5/tutorial/doc/bnbem.html It gives details about both parsing methods and example code. Hope that helps.
