Parsing XML Schemas and deriving metadata in Java


I have been looking at ways to parse XML Schema files for metadata of types defined in those files and get other information, and build the type hierarchy to be shown to the user.
I found a number of candidates:
Apache WS Commons XMLSchema API
Apache Xerces XML Schema API
XSOM
XMLBeans
The XMLSchema API and the Xerces XML Schema API seem the two best suited.
While the XMLSchema API was easier to use, it is not as well documented, and Xerces seems to have much more support. However, I have been unable to find any resources to help me get started with the Xerces XML Schema API, apart from its FAQs, which have proved highly inadequate.
So my question is twofold: which is the better choice for parsing and querying schema files, and are there any resources for getting started quickly with these two?

Take a look at XStream. It is a good tool for serialization, but you can also use it for parsing. The XStream site has a two-minute tutorial.

Another option you might consider is Saxon's SCM format, which is an XML representation of the schema component model. Both SCM and XSOM are closely based on the schema component model defined in the W3C specs, and rely heavily on the user understanding that model; they don't repeat the documentation of the component model in the API definitions.
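If only a rough type hierarchy is needed, a first pass is possible without any of the libraries above. The following is a minimal sketch that uses just the JDK's DOM parser, not the Xerces XML Schema API, XMLSchema, XSOM, or Saxon. It assumes that only named types matter and that the base type can be read directly from the extension or restriction element. The class and method names are made up for illustration, and nothing here resolves imports, includes, or namespace prefixes, which the dedicated libraries handle.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: reads an XSD as a plain XML document and lists named types.
class SchemaTypeHierarchy {
    static final String XSD_NS = XMLConstants.W3C_XML_SCHEMA_NS_URI;

    /** Maps each named type to its declared base ("" when none is declared). */
    static Map<String, String> baseTypes(String xsd) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match on the XSD namespace
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xsd)));
        Map<String, String> result = new LinkedHashMap<>();
        for (String kind : new String[] {"complexType", "simpleType"}) {
            NodeList types = doc.getElementsByTagNameNS(XSD_NS, kind);
            for (int i = 0; i < types.getLength(); i++) {
                Element type = (Element) types.item(i);
                if (type.hasAttribute("name")) { // anonymous types are skipped
                    result.put(type.getAttribute("name"), findBase(type));
                }
            }
        }
        return result;
    }

    /** Returns the base attribute as written in the schema, prefix included. */
    static String findBase(Element type) {
        // complexType > (complexContent | simpleContent) > (extension | restriction)
        // simpleType > restriction
        Element content = child(type, "complexContent", "simpleContent");
        Element derivation = child(content != null ? content : type, "extension", "restriction");
        return derivation == null ? "" : derivation.getAttribute("base");
    }

    /** First direct child element in the XSD namespace with one of the given local names. */
    static Element child(Element parent, String... localNames) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && XSD_NS.equals(n.getNamespaceURI())
                    && Arrays.asList(localNames).contains(n.getLocalName())) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                + "<xs:complexType name='Base'><xs:sequence/></xs:complexType>"
                + "<xs:complexType name='Derived'><xs:complexContent>"
                + "<xs:extension base='Base'/></xs:complexContent></xs:complexType>"
                + "</xs:schema>";
        System.out.println(baseTypes(xsd));
    }
}
```

For anything beyond a quick listing (imported schemas, element declarations, facets), a real schema component model such as the ones mentioned above is the safer route.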

Related

Dom4J alternatives for XML processing in Java

We want to upgrade our project to use up-to-date dependencies. At the moment we use JAXB for reading and writing XML, which works very well.
In some cases we do not have an XSD or DTD from which to generate the Java classes (via xjc). In those cases we use dom4j to create XML documents, or dom4j with XPath to read them.
Version 1.6.1 is over ten years old and, as far as I understand, dom4j needs Jaxen as its XPath library. Jaxen 1.1.6 is also four years old. We also removed Xerces 2.40 (also 12 years old) from our project.
Which XML API is state of the art at the moment? It should support XPath expressions and be able to create and read XML documents.
I am also wondering about Xerces. When we use JAXB to read XML documents, we sometimes get a plain Object value instead of a string, date, or other typed value.
The reason is that somebody messed up the XSD and forgot to define a datatype for some elements, so xjc generates plain Object properties in the Java class. The strange thing is that I needed to cast the object to an "ElementNSImpl" object, which comes from the Xerces project.
I am a little confused. Our solution for removing Xerces was to define a proper datatype for each element. Unfortunately those are third-party XSDs, and we have to reapply the fix every time an XSD changes. But why do I have to cast the object to ElementNSImpl?
Thanks for your help.

Just because something is 'old' doesn't mean it's not useful. dom4j is still my favorite tool for ad-hoc XML processing. dom4j has been updated since 1.6.1, but note that it still depends on an underlying XML parser (such as Xerces).

dom4j version 1.6.1 has an XML Injection security vulnerability: https://nvd.nist.gov/vuln/detail/CVE-2018-1000632.
It appears to have been fixed in 2.1.1, released in July of 2018.
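On the question of an API that can both create documents and evaluate XPath expressions: the JDK's own javax.xml packages cover both, without dom4j, Jaxen, or a separately bundled Xerces. A minimal sketch, with made-up element names and a hypothetical class name:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; "order" and "item" are illustrative names only.
class JdkXmlDemo {
    /** Creates a small document with the W3C DOM API. */
    static Document buildOrder() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element order = doc.createElement("order");
        order.setAttribute("id", "42");
        doc.appendChild(order);
        Element item = doc.createElement("item");
        item.setTextContent("widget");
        order.appendChild(item);
        return doc;
    }

    /** Evaluates an XPath expression and returns its string value. */
    static String evaluate(Document doc, String expression) throws Exception {
        return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
    }

    /** Serializes the document with an identity transformation. */
    static String serialize(Document doc) throws Exception {
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = buildOrder();
        System.out.println(evaluate(doc, "/order/@id"));
        System.out.println(evaluate(doc, "/order/item"));
        System.out.println(serialize(doc));
    }
}
```

The DOM API is more verbose than dom4j, so this is a trade of convenience for fewer third-party dependencies rather than a like-for-like replacement.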

Support XSD versioning with JAXB

I am currently working on an application that imports and exports some entities. The file format used for this is XML, and JAXB is used for the XML binding.
The problem is that the present XSD, which defines the structure of the entities, has no provision for versioning. How do I get started with defining a versioned XSD, and subsequently versioned XML instance documents, given that JAXB is the underlying binding framework?
I have read that there are three possible ways of introducing versions in XSD.
1) Change the internal schema version attribute
2) Create an attribute like schemaVersion on the root element
3) Change the schema's target namespace.
Which one best suits the use case described below?
Use case: The changes made to the XSD in the next version may invalidate existing elements. Although the schema itself may not be backward compatible, the application needs to support all versions of the schema.

XML is designed to facilitate change and flexibility in document structures. Unfortunately, JAXB isn't. The very act of compiling knowledge of document structure into your Java source code makes change to the document structure a lot more difficult.
If structural change is part of your agenda, I think you should seriously consider not using JAXB: technologies like XQuery and XSLT are much better suited to this scenario.
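Whichever processing technology is chosen, an application that must accept every schema version first has to find out which version a given instance document uses. With options 2 and 3 from the question, that information sits on the root element, so it can be read cheaply before any binding or transformation happens. A minimal sketch using StAX from the JDK; the attribute name schemaVersion comes from the question, while the class name, method name, and example namespace are assumptions made for illustration:

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: looks only at the root element of an instance document.
class SchemaVersionSniffer {
    /**
     * Returns the root element's schemaVersion attribute if present,
     * otherwise its namespace URI, otherwise "unknown".
     */
    static String detectVersion(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // do not process DTDs
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    String attribute = reader.getAttributeValue(null, "schemaVersion");
                    if (attribute != null) {
                        return attribute;
                    }
                    String namespace = reader.getNamespaceURI();
                    return (namespace == null || namespace.isEmpty()) ? "unknown" : namespace;
                }
            }
            return "unknown";
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(detectVersion("<entities schemaVersion='2.0'/>"));
        System.out.println(detectVersion("<e:entities xmlns:e='urn:example:entities:v1'/>"));
    }
}
```

The detected value could then select a per-version set of generated classes, or an XSLT stylesheet that upgrades older documents to the current structure before they are bound.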

xml serialization generator for java without using reflection

Is there an XML serialization framework for Java that does not use reflection, but instead generates static serialization code (Java source) from XSD ?

I've never seen anything that does exactly what you are asking for: generating serialization code from XSD. However, if you're not stuck with an existing XSD schema, Modello may satisfy your requirements.
Modello is used by Maven for parsing pom.xml and settings.xml files. It reads a .mdo file (like this description of the Maven project model), and can generate a Java object model; an XML Schema (XSD) file; and serialisation/de-serialisation code. The serialisation/deserialisation code can use one of a number of XML parser APIs (e.g. JDOM, StAX, etc.). The XML parser API used by Maven itself is xpp3.
Modello can also generate code to convert one version of the model to another. It can generate HTML documentation about your XML format.
If you have an existing XSD, it might be too much work to use Modello. But if you're creating your own XML format, it could be worth starting with Modello and generating the XSD from it.
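To make "static serialization code" concrete, here is a hand-written sketch of the kind of reflection-free writer such a generator might produce. It is not Modello output; the Contact class, its fields, and the element names are invented for illustration. It uses the StAX XMLStreamWriter that ships with the JDK.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical model class, standing in for a generated object model.
class Contact {
    final String name;
    final String email;

    Contact(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

// Every field is written by an explicit call, so no reflection is involved.
class ContactXmlWriter {
    static String write(Contact contact) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newFactory().createXMLStreamWriter(out);
        writer.writeStartElement("contact");
        writeTextElement(writer, "name", contact.name);
        writeTextElement(writer, "email", contact.email);
        writer.writeEndElement();
        writer.flush();
        writer.close();
        return out.toString();
    }

    private static void writeTextElement(XMLStreamWriter writer, String tag, String text)
            throws XMLStreamException {
        writer.writeStartElement(tag);
        writer.writeCharacters(text); // escapes markup characters such as & and <
        writer.writeEndElement();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(write(new Contact("Ada", "ada@example.org")));
    }
}
```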

Programatic generation of xml from a xsd that uses other xsds

I have an XSD that in turn uses/imports a set of XSDs.
I would like to programmatically generate sample XML from the XSD. The XML must contain all elements and attributes, populated with example data based on the data type.
How can I do this using Eclipse API classes?
Also, are there any other tools that accomplish this task and can be invoked from a Java program or a batch file?
Any pointers to examples, documentation, or APIs are highly appreciated.
Thanks in advance.

If I am reading your question correctly, you are trying to programmatically generate (i.e. using Java) XML documents based on an XML Schema document (which may in turn import other supporting XSDs).
You may wish to have a look at Oracle/Sun's JAXB (Java Architecture for XML Binding), which you can find more information about here:
http://jaxb.java.net/
JAXB works with the J2SE SDK and/or IDEs such as NetBeans or Eclipse, and lets you unmarshal (read XML documents into mapped Java objects) or marshal (write Java objects as XML documents) as required. Standard mappings (known as binding declarations) are derived from the valid XML Schema provided to JAXB. You can also provide binding declarations through custom annotations directly within your XML Schema files or by using external JAXB declarations.
Another alternative (similar to JAXB) is Apache XMLBeans.
Hope this helps!

I need some jar for Java and xml

The classical way to handle XML in Java is really lengthy and scary.
So I wrote my own class that returns results without exposing the details, like this:
myXML mx=new myXML("filename");
:
mx.getAll("node name");
mx.getFirst("node name");
:
I had completed about 80% of it, but unfortunately I lost it in a PC crash.
Is there any JAR under the GPL or Apache license that provides the simplest possible way to read and write XML?

JDOM is a simple API for parsing, creating, manipulating, and serializing XML documents in Java. The operations you mention in your question are supported by JDOM, along with many more useful ones.
Check out the JDOM documentation and this book chapter for further reading:
http://www.jdom.org/downloads/docs.html
http://www.cafeconleche.org/books/xmljava/chapters/ch14.html
The following lines are from http://www.jdom.org/docs/oracle/jdom-part1.pdf:
So what’s the point of JDOM (Java Document Object Model), and why do developers need it? JDOM is an open source library for Java-optimized XML data manipulations. Although it’s similar to the World Wide Web Consortium’s (W3C) DOM, it’s an alternative document object model that was not built on DOM or modeled after DOM. The main difference is that while DOM was created to be language-neutral and initially used for JavaScript manipulation of HTML pages, JDOM was created to be Java-specific and thereby take advantage of Java’s features, including method overloading, collections, reflection, and familiar programming idioms. For Java programmers, JDOM tends to feel more natural and “right.”

Try Apache Digester. Using Digester will really simplify your XML parsing. You can refer to this link for an example.

For your use case you may be interested in the javax.xml.xpath APIs available in the JDK. For an example see one of my answers to another question (below):
Remove XML Node using java parser
You may also prefer Service Data Objects (SDO). It is a generic data structure for representing XML data. For more information see:
http://www.eclipse.org/eclipselink/sdo.php
http://bdoughan.blogspot.com/2010/09/processing-atom-feeds-with-sdo.html
When parsing XML I recommend using the standard technologies: StAX, SAX, DOM, and JAXB. An implementation of each is included in the JDK, and alternate open-source implementations are available that offer improved performance and extended features, such as MOXy JAXB's XPath-based mapping:
http://bdoughan.blogspot.com/2010/09/xpath-based-mapping-geocode-example.html
http://bdoughan.blogspot.com/2010/10/how-does-jaxb-compare-to-xstream.html
The advantage of the standard libraries is that they all work together:
StAX, SAX, and DOM are all valid inputs/outputs for JAXB
StAX, SAX, DOM, and JAXB are all compatible with javax.xml.transform libraries
StAX, SAX, DOM, and JAXB are all compatible with javax.xml.xpath libraries
StAX, SAX, DOM, and JAXB are all compatible with javax.xml.validation libraries
JAXB is the binding layer for two Web Service standards: JAX-WS and JAX-RS
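For the getAll/getFirst style of access described in the question, a thin wrapper over the JDK's DOM parser may already be enough. This is a minimal sketch: the class name SimpleXml is hypothetical, the method names mirror the ones in the question, and it takes the XML as a string rather than a file name to stay self-contained. It reads only and ignores namespaces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical convenience wrapper around a parsed DOM document.
class SimpleXml {
    private final Document document;

    private SimpleXml(Document document) {
        this.document = document;
    }

    /** Parses the given XML text; checked parser exceptions are wrapped for brevity. */
    static SimpleXml parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return new SimpleXml(doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Text content of every element with the given name, in document order. */
    List<String> getAll(String nodeName) {
        NodeList nodes = document.getElementsByTagName(nodeName);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    /** Text content of the first matching element, or null if there is none. */
    String getFirst(String nodeName) {
        List<String> all = getAll(nodeName);
        return all.isEmpty() ? null : all.get(0);
    }

    public static void main(String[] args) {
        SimpleXml xml = SimpleXml.parse("<books><book>A</book><book>B</book></books>");
        System.out.println(xml.getAll("book"));
        System.out.println(xml.getFirst("book"));
    }
}
```

Lookups that need more than an element name (attributes, predicates, paths) are where the javax.xml.xpath APIs mentioned above become the better fit.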
