Java : Working with multiple versions of XML Schemas


I have an application that receives an XML message and then parses it for further processing. To do this, after receiving the XML string, I call the JAXB unmarshaller to unmarshal it into Java POJOs, and the processing continues from there.
This works well as long as there is only one version of the XML.
Problem
The problem arises when there is more than one version of the same XSD and my application has to deal with messages in both versions. Clients of my application may send XML in an old version or in the latest version.
For JAXB I have to convert the XSD into Java POJOs using the XJC tool. When I convert the latest version of the XSD, the generated classes have the same names as those of the previous version, but their internal fields and class hierarchy are different. This causes problems even if I put the XJC output for each version in a different jar.
Expected Solution
A new version is expected every 6 months, and my system must be able to read XML in the newer versions along with the old ones.
I want to know how to manage this XML processing in Java with JAXB or some other framework.
Should I use a SAX parser? I have read that it is not as efficient as JAXB. After working with a SAX parser for the last few days, I also found that it can be error prone: it involves looking for each element, getting the values out of it, and putting them into a Java structure of our own, which is a lot of effort compared to JAXB.
Question
Is there a simple solution similar to JAXB?
Temporary Solution Used
I have used a temporary solution, which I am not happy with as a long-term answer. I created a separate jar for each XSD version using the XJC tool, with a different base package in each jar, e.g.
1. POJOs for version 1.2 are in a jar with base package com.cgs.v_12
2. POJOs for version 2.0 are in a different jar with base package com.cgs.v_20
I have added both as Maven dependencies to my system and use them to process the different versions.

For JAXB, or any other solution that maps XSD to POJOs, the mapping will be one-to-one, especially if the POJOs are generated.
Do you have to
(1) map the entire XML to POJOs, or
(2) map a subset of that XML to a static/fixed POJO model?
If (1), since the changes in subsequent versions cannot be anticipated, I believe the solution is to use a strategy pattern to select the correct JAXB artifacts based on the version.
If (2), you can explore using XPath, defining XPath mappings per version.
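As a rough illustration of both suggestions, here is a minimal sketch that picks a reader strategy by schema version and, inside each strategy, uses per-version XPath mappings to fill one fixed POJO. Everything specific in it is an assumption made for the example: the `order` document shape, the `schemaVersion` attribute on the root element, and the names `VersionedReader`, `Order` and `OrderReader` are hypothetical and not taken from the question. It uses only JDK classes so that it runs without extra dependencies; for option (1), each strategy could instead wrap a `JAXBContext` created for the matching generated package (for example com.cgs.v_12 or com.cgs.v_20).

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class VersionedReader {

    /** Fixed, version-independent model (hypothetical). */
    public static final class Order {
        public final String id;
        public final String customer;

        public Order(String id, String customer) {
            this.id = id;
            this.customer = customer;
        }
    }

    /** One strategy per schema version. */
    public interface OrderReader {
        Order read(Document doc) throws Exception;
    }

    /** Builds a strategy from the XPath expressions that apply to one version. */
    static OrderReader xpathReader(String idPath, String customerPath) {
        return doc -> {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return new Order(xpath.evaluate(idPath, doc), xpath.evaluate(customerPath, doc));
        };
    }

    /** Version -> strategy. Supporting a new schema version means adding one entry. */
    static final Map<String, OrderReader> READERS = Map.of(
            "1.2", xpathReader("/order/@id", "/order/customerName"),
            "2.0", xpathReader("/order/header/id", "/order/header/customer/name"));

    public static Order parse(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Assumption: the version is carried in a root attribute named schemaVersion.
        String version = doc.getDocumentElement().getAttribute("schemaVersion");
        OrderReader reader = READERS.get(version);
        if (reader == null) {
            throw new IllegalArgumentException("Unsupported schema version: '" + version + "'");
        }
        return reader.read(doc);
    }

    public static void main(String[] args) throws Exception {
        Order oldStyle = parse(
                "<order schemaVersion='1.2' id='42'><customerName>Ada</customerName></order>");
        Order newStyle = parse(
                "<order schemaVersion='2.0'><header><id>43</id>"
                        + "<customer><name>Grace</name></customer></header></order>");
        System.out.println(oldStyle.id + " " + oldStyle.customer);
        System.out.println(newStyle.id + " " + newStyle.customer);
    }
}
```

If the versions are distinguished by target namespace rather than by an attribute, the lookup key could be `getNamespaceURI()` of the root element, with the `DocumentBuilderFactory` set to namespace-aware.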

Related

Dom4J alternatives for XML processing in Java

We want to upgrade our project to use up-to-date dependencies. At the moment we use JAXB for reading and writing XML, and this works very well.
In some cases we do not have an XSD or DTD from which to generate the Java classes (via xjc). In those cases we use dom4j for creating XML documents, or dom4j with XPath for reading them.
dom4j 1.6.1 is over ten years old and, as far as I understand, dom4j needs Jaxen as its XPath library. Jaxen 1.1.6 is also 4 years old. We also removed xerces 2.40 (also 12 years old) from our project.
Which XML API is state of the art at the moment? It should support XPath expressions and be able to create and read XML documents.
I am also wondering about Xerces. When we use JAXB for reading XML documents, we sometimes get an Object value instead of a string, date or something else.
The reason is that somebody messed up the XSD and forgot to define a datatype for some elements, so XJC creates plain Object properties in the generated Java class. The strange thing is that I needed to cast the object to an "ElementNSImpl" object, which comes from the Xerces project.
I am a little confused. Our solution for removing Xerces was to define a proper datatype for each element. Unfortunately those XSDs are third-party XSDs, and we have to redo that fix each time the XSD changes. But why do I have to cast the object to ElementNSImpl?
Thanks for your help.

Just because something is 'old' doesn't mean it's not useful. dom4j is still my favorite tool for ad-hoc XML processing. dom4j has been updated since 1.6.1, but note that it is still dependent on an underlying XML parser (such as Xerces).

dom4j version 1.6.1 has an XML Injection security vulnerability: https://nvd.nist.gov/vuln/detail/CVE-2018-1000632.
It appears to have been fixed in 2.1.1, released in July of 2018.

Support XSD versioning with JAXB

I am currently working on an application that imports and exports some entities. The file format used is XML, and JAXB is used for XML binding.
The problem is that the present XSD, which defines the structure of the entities, has no provision for versioning. How do I get started with defining a versioned XSD, and subsequently versioned XML instance documents, given that JAXB is the underlying binding framework?
I have read that there are three possible ways of introducing versions in an XSD:
1) Change the internal schema version attribute
2) Create an attribute like schemaVersion on the root element
3) Change the schema's target namespace.
Which one best suits the use case described below?
Use case: The changes made to the XSD in the next version may invalidate existing elements. Although the schema itself may not be backward compatible, the application needs to support all versions of the schema.

XML is designed to facilitate change and flexibility in document structures. Unfortunately, JAXB isn't. The very act of compiling knowledge of document structure into your Java source code makes change to the document structure a lot more difficult.
If structural change is part of your agenda, I think you should seriously consider not using JAXB: technologies like XQuery and XSLT are much better suited to this scenario.
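To make the XSLT suggestion concrete, here is a minimal sketch that upgrades an old-format document to a newer layout with the JDK's built-in `javax.xml.transform` API, so that only the latest format has to be handled downstream. The `order` document shape, the `schemaVersion` attribute, the stylesheet and the class name `UpgradeToLatest` are all hypothetical and chosen only for illustration; a real stylesheet would be written against the actual schemas and would normally live in its own file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class UpgradeToLatest {

    // Hypothetical stylesheet: moves the id attribute and the customerName element
    // of the old layout into the nested header structure of the new layout.
    static final String OLD_TO_NEW =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:template match='/order'>"
            + "<order schemaVersion='2.0'><header>"
            + "<id><xsl:value-of select='@id'/></id>"
            + "<customer><name><xsl:value-of select='customerName'/></name></customer>"
            + "</header></order>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    /** Transforms an old-format document into the new format and returns it as a string. */
    public static String upgrade(String oldXml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(OLD_TO_NEW)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(oldXml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String oldXml =
                "<order schemaVersion='1.2' id='42'><customerName>Ada</customerName></order>";
        System.out.println(upgrade(oldXml));
    }
}
```

With this approach each new schema version adds one stylesheet from the previous version to the new one, and older documents can be upgraded by chaining the transformations.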

XSD schema changes, XSLT and backwards compatibility

This is more of a high-level question about using JAXB and XSLT, as I try to gain a better understanding of what I need to do and what I need to learn more about.
I have inherited an application that has Java class files generated from an XSD schema (using JAXB), does some stuff, then writes one of these objects to a serialized 'save file'.
I currently need to make changes to the XSD, which of course means some of my originally generated classes will be updated. However, I still need to be able to load the old serialized save files for backwards compatibility. Does this mean I need to maintain a copy of the current XSD and all generated class files in order to load the old serialized save files? Does anyone have a suggested way to do this, given that I must be able to load the old files?
For all future versions of the XSD, I intend to write saved files as XML and use XSLT to transform the file before unmarshalling the XML, which I think will work, as mentioned in the thread "How should I manage different incompatible formats of Xml based documents". That doesn't help me with the older serialized files though. Any ideas?
Thanks.

Probably the main drawback of JAXB, and of data binding in general, is that it makes schema evolution very cumbersome. XML is a technology where people expect to change and extend the schema/data model frequently, whereas in Java it is hard-coded and hard to change. Use of XML-oriented languages like XSLT and XQuery is a big advantage in such situations.
Saving persistent data in the form of serialized Java objects seems completely perverse to me. Before you move to your new schema format, convert it all back to XML. The whole point of XML is that the data is then in a format that is far more durable, and not dependent on the continued existence of the software that created it.

Set up maven build for xjc for multiple schemas in the same xml namespace?

I'm working on a project where we have a Jersey/JAXB-based serialization system to talk to a web service. The service in question returns data wrapped inside an Atom feed.
An older part of the system wrote a one-off XSD for Atom, specific to their service, that was hard-wired with only their particular elements. I now need to add support for a new service, which does a similar thing (using Atom as an "envelope") but uses significantly different elements and content schema.
I don't want to disturb the existing code, so ideally I'd like to do the same thing the previous project did: define my own schema for the parts of Atom that the new service is using.
I'm running into:
org.xml.sax.SAXParseException: 'feed' is already defined
I'm apparently hitting the limitation described in the XJC release notes: It is not legal to have more than one <jaxb:schemaBindings> per namespace.
Is there a way to set things up in our build so that if I have separate xjb files, I can run xjc independently over the two distinct schemas and generate code for each of them into separate packages? How do I work around this limitation?
We're using the maven jaxb plugin.

Just for the record, what we ended up doing was generating the code from the schema separately, and checking in the generated code. Since the ATOM schema's not changing, it was reasonably safe. Annoying to have to do it that way though.

Castor 1.2 for POJO to XML

I am using Castor 1.2 for marshalling.
Do you have any experience with using Castor for this purpose?
Do you have suggestions for improving performance?

Castor 1.2 was the last version to provide support for Java 1.4, so it's still widely used by shops that haven't made the transition to 1.5 or 1.6 (in my case, we're stuck with deploying to an older Weblogic version).
The best way to gain performance improvements is to use a mapping file, rather than having Castor use reflection to marshall/unmarshall your XML. The mapping file can contain explicit XML element to Java class mappings, and omit any translations you are not interested in. So, for example, if an XML record contains a customer's billing information along with a history of the last 100 orders, but all you care about is the billing information, you can explicitly map the appropriate XML elements to your billing information classes. Castor will ignore the remainder of the XML elements, speeding up the marshalling process.
A final tip is to download the source code for Castor 1.2, even if you don't plan on building the code yourself. The documentation for 1.2 hasn't been kept up to date, so some new features that appear to have been introduced in 1.3 and higher have actually been added to Castor 1.2 as well. A quick comparison of the 1.3 documentation and 1.2 code will let you see what improvements have been recently made to Castor 1.2.
