Custom StAX Parser for XML using javax wrappers - java

Custom StAX Parser for XML using javax wrappers
How do you do this; or at least good suggestions on the right documentation / examples / tutorials?
I've been using the javax.xml.stream package to process XML files but the application is begging for some "non-standard XML" (easy to understand what the means if you're not picky). I can write the parser, but I want this to be configurable: so that the app continues to use the same XML processing code except for changing the parser as needed.
The hard part at this point is finding concrete info on how this is done. Documentation speaks of, for example, configuring the parameters of SAXParserFactory and such, but I haven't found specific documentation or examples. I've even looked into some existing StAX source code. Need some good hints / guidance on how this is done in order to move forward.

According to the documentation, you can't. You can use one of three approved parsers. Anything else will result in an error.

Related

Best Practice for large XML file builder

I have to build an XML file for an input to a SOAP service in Java. The input xml can consist of at least 1000 tags. What is the best way to build the XML? I have the XSD files but it is a bit complicated to use JAXB. Is XMLStreamWriter a good option for that?
XMLStreamWriter is one of the better APIs to use for writing XML from a Java application, but it has a few quirks (e.g. its namespace handling is a bit bizarre) and you may find it worthwhile to wrap it in a convenience API that knows about the kind of document you are writing, e.g. what namespaces it uses.
One of the advantages of the XMLStreamWriter interface is that there are plenty of implementations to choose from. For example Saxon has an implementation that gives you full control over all the XSLT/XQuery serialization options plus Saxon extensions (for example, you can even control the output order of attributes!)
One of the problems I hit with all event-based APIs is that sooner or later you find yourself forgetting to write an end tag, and that can be quite tricky to debug. Using a wrapper API that forces you to include the element name in a call on endElement() can be useful for debugging; if debugging is switched on you can keep a stack of element names and check that endElement() is writing the right tag; with debugging switched off you just drop this check.
Serializing using JAXB is higher-level, of course, but the downside is that it gives you less control.

Java Collada Parser - XML Pull based implementation

I am looking at a set of parsers generated for Atom, XAL, Kml etc. seemingly using an automated technique with a XML pull based parser. The clue towards the automation is presence of "package.html" in all XML-to-Java mapped classes folders. I would like to produce a similar one for the rather large Collada 1.4 spec. My first attempt with Altova ran into small problems due the "enum" keyword. I am sure I can fix it in the next run with appropriate renaming. Khronos admit to not designing the 1.4 spec to being automated parser generation friendly.
The actual parsers i.e. XAL parser, Atom parser etc. implement the XMLEventParser interface. I would like to know if anybody has encountered/used this pattern. If so which tool can be used to map the XSD to a class set simply giving access to the data components of the nodes using getters and setters.
I'm not sure I understand your question, but it appears that you want to process XML formats like Atom and represent it in objects with getters/setters. This can easily be done with JAXB.
For an example see:
http://bdoughan.blogspot.com/2010/09/processing-atom-feeds-with-jaxb.html

What is best practice in converting XML to Java object?

I need to convert XML data to Java objects. What would be best practice to convert this XML data to object?
Idea is to fetch data via a web service (it doesn't use WSDL, just HTTP GET queries, so I cannot use any framework) and answers are in XML. What would be best practice to handle this situation?
JAXB is a standard API for doing this: http://java.sun.com/developer/technicalArticles/WebServices/jaxb/
Have a look at XStream. It might not be the quickest, but it is one of the most user friendly and straightforward converters in Java, especially if your model is not complex.
For a JMS project we were marshalling and unmarshalling (going from java to xml and xml to java) XML embedded in TextMessages (string property). We tried JAXB, Jibx, and XMLBeans. We found that XMLBeans worked best for us. Fast, easily configurable, good documentation, and easy Maven integration.
I have used and will continue to use JDOM -> www.jdom.org
Another option is a Sax Parser. It is procedural - i.e. a visitor pattern - but if the xml is fairly lightweight, (and even medium weight) I have found it to be very useful for this.
JAXB API which comes in Java(In built).
I have used JIBX in MQ module. It works very well. Ant config is simple. Used Xsd2Jibx converter to generate the binding files and Java beans from XML schema. Marshalling and un-marshalling allow to specify character-set parameter. It was useful in my project to handle custom character-set. But I found an issue in the binding compiler. If the Java bean has lengthier path name, it generates class file with lengthier file name which will cause issue in Windows XP(it has a maximum file length limit).
I haven't used other APIs. So I am not trying to compare with others. If you decided to use JIBX, I hope this will be helpful.
More details, please refer JIBX website
I've used XStream as well, it is easy to use and customizable. You can add your own custom converters and that was very handy for me...
So surprised more people have not mentioned Jibx. Amazing lib and i think a lot simpler to use than Jaxb. Performance is also fab!
For this you can also consider apache's bitwixt and simple framework for xml

Java SGML to XML conversion?

Does anyone know of a method, or library, to convert SGML into XML?
EDIT: For clarification, I have to do the conversion in Java, and I cannot use the SP parser or the related SX tool.
It seems that the general consensus is that there are no existing libraries for doing SGML work in Java. Certainly after several days of fruitlessly searching Google, and asking this question here, I have found no resources on this subject.
The answer is not always that simple, as it depends on the sgml DTD. I haven't actually found a general SGML parser in Java at all, but this article uses SP which includes a converter.
See http://jclark.com/sp/sx.htm for the SX converter from SGML to XML in the SP package.
There is the mlParser, but I'm having a hard time trying to locate it: http://www.balisage.net/Proceedings/vol1/html/Smith01/BalisageVol1-Smith01.html
There is no api for parsing SGML using Java at this time. There also isn't any api or library for converting SGML to XML and then parsing it using Java. With the status of SGML being supplanted by XML for all the projects I've worked on until now, I don't think there will every be any work done in this area, but that is only a guess.
Here is some open source code code from a University that does it, however I haven't tried it and you would have to search to find the other dependent classes. I believe the only viable solution in Java would require Regular Expressions.
Also, here is a link for public SGML/XML software.

When and why would you use Apache commons-digester?

Out of all the libraries for inputing and outputting xml with java, in which circumstances is commons-digester the tool of choice?
From the digester wiki
Why use Digester?
Digester is a layer on top of the SAX
xml parser API to make it easier to
process xml input. In particular,
digester makes it easy to create and
initialise a tree of objects based on
an xml input file.
The most common use for Digester is to
process xml-format configuration
files, building a tree of objects
based on that information.
Note that digester can create and
initialise true objects, ie things
that relate to the business goals of
the application and have real
behaviours. Many other tools have a
different goal: to build a model of
the data in the input XML document,
like a W3C DOM does but a little more
friendly.
and
And unlike tools that generate
classes, you can write your
application's classes first, then
later decide to use Digester to build
them from an xml input file. The
result is that your classes are real
classes with real behaviours, that
happen to be initialised from an xml
file, rather than simple "structs"
that just hold data.
As an example of what it's NOT used for:
If, however, you are looking for a direct representation of the input xml document, as
data rather than true objects, then digester is not for you; DOM, jDOM or other more
direct binding tools will be more appropriate.
So, digester will map XML directly into java objects. In some cases that's more useful than having to read through the tree and pull out options.
My first take would be "never"... but perhaps it has its place. I agree with eljenso that it has been surpassed by competition.
So for good efficient and simple object binding/mapping, JAXB is much better, or XStream. Much more convenient and even faster.
EDIT 2019: also, Jackson XML, similar to JAXB in approach but using Jackson annotations
If you want to create and intialize "true" objects from XML, use a decent bean container, like the one provided by Spring.
Also, reading in the XML and processing it yourself using XPath, or using Java/XML binding tools like Castor, are good and maybe more standard alternatives.
I have worked with the Digester when using Struts, but it seems that it has been surpassed by other tools and frameworks for the possible uses it has.

Categories