XSD schema changes, XSLT and backwards compatibility - java

this is more of a high level question about using jaxb and xslt, as I try to gain more of an understanding of what I need to do, and what I need to learn more about.
I have inherited an application that has Java class files generated from an xsd schema (using jaxb), does some stuff, then writes one of these objects to a serialized 'save file'.
I currently need to make changes to the xsd, which of course will mean some of my originally generated classes will be updated. However, I still need to be able to load the old serialized saved files for backwards compatibility - does this mean I need to maintain a copy of the current xsd, and all generated class files in order to load the old serialized save files? Does anyone have a suggested way I can do this, if I must be able to load the old files?
For all future version of the xsd, I intend to output saved files to xml, and use xslt to transform the file before unmarshalling the xml, which I think will work, as mentioned in this thread How should I manage different incompatible formts of Xml based documents. Doesn't help me with the older serialized files though - any ideas?
Thanks.

Probably the main drawback of JAXB, and of data binding in general, is that it makes schema evolution very cumbersome. XML is a technology where people expect to change and extend the schema/data model frequently, whereas in Java it is hard-coded and hard to change. Use of XML-oriented languages like XSLT and XQuery is a big advantage in such situations.
Saving persistent data in the form of serialized Java objects seems completely perverse to me. Before you move to your new schema format, convert it all back to XML. The whole point of XML is that the data is then in a format that is far more durable, and not dependent on the continued existence of the software that created it.

Related

What is the best way to validate multiple XML files against a XSD?

I am working on a project that requires the validation of many XML files against their XSD, the trouble I am having is that many of the XSD files depend on others XSDs, making the usual validation kind of troublesome, is there an elegant way to resolve this issue?
I would prefer if possible to work with those files in memory, the files are not in a concise directory structure that conforms with their importation paths.
Just to note I am working with the Java language.
Assuming here that you work with JAXP, so that you can setSchema() on either SAXParserFactory or `DocumentBuilderFactory.
One solution I was part of, was to read all XSD sources into an aggregated Schema object using SchemaFactory.newSchema(Source[] schemas). This aggregated Schema was then able to validate any XML document that referenced any "top" schema; all imported schemas had to be part of the aggregated schema. As I remember it, it was necessary to order the Source array by dependency, so that if Schema A imported Schema B, Schema B had to occur befor Schema A in the array.
Also, as I recall, <include> didn't work very well with this mechanism.
Another solution would be to set an LSResourceResolver on the ShemaFactory. You would have to implement your own LSResourceresolver that serves byte- or character streams based on the input to the resolver. I haven't personally used or researched this solution.
The first solution has of course the benefit that schema parsing and processing can be done once and reused for all validations that follows; something that will probably be difficult to achieve with the second option.
Another thing to keep in mind (depending on your context): It is a good design choice to control the whole "resolving" process (i.e. control how the parsers get access to external resources), from a performance as well as a security perspective.

XSD for testdata generation

I'm trying to generate flat file for test data using JAVA. Flat file has own mapping document which describes all fields of each line.
I was suggested to use XSD for mapping and I did some research on XSD. As I understood that XSD is for only validation of XML. In this case I have to randomly generate XML file based on XSD and convert it to txt or other format. Because as an output I need flat file, not XML.
It seems like using XSD i'm adding extra steps in creating the file as first create XML, validate with XSD and convert it expected format.
What would be your recommendation in my situation for following the mapping document?
Thanks in advance.
I have seen your kind of setup before. The reasons may be different than your particular scenario, but nonetheless it made sense. One thing to consider has to do with the skills and tools people have available and so whatever makes the job done quick and well, goes.
You seem to describe an offset based, "flat" data structure. In my case, people used COBOL copybooks which are very good at describing this. IBM Rational Developer had a built-in wizard which allows the creation of Java Data Bindings from a COBOL copybook. This is to say that within a minute one gets a Java class which can create a record for your flat file in no time (it comes with all the logic required to do the padding, etc.)
To get the data generated, there are tools capable of generating XML files which cover all constraints defined by an XSD (e.g. alternate content i.e. xsd:choice, enumerated values, etc.) Now, assuming you have a proper XSD describing your logical model of your flat file, one can get 10s, 100s, even 100K XML generated from an XSD spec. It takes a click, plus the time spent by the tool to create those files.
Next, to get the XML files in your generated Java class, and so avoid going through XSLT or whatever (many shops don't have the skills) it may be as simple as writing Java mapping code between a JAXB generated class and the one created above, or if matching is possible, simply annotate the generated class to support JAXB unmarshalling. This last step may take longer to code, but it would be trivial code any Java developer would know how to do it.
This could possibly give you a view into why someone may have recommended Java and XSD for this task. XSD is a modeling language with built in support for constraints, which may prove helpful in generating test data through combinatorial techniques.

Java library to assist in XSD creation?

Is anyone aware of a library that makes the process of creating XSDs in Java a little easier than using something like a DocumentBuilder? Something like a helper or utility class. I came across the org.eclipse.xsd maven jar but I'm having ClassNotFoundException issues when working with it in Eclipse and I'm not entirely sure it's meant to be used as a standalone kind of thing. This is a bit difficult to Google for as well since there are lot of search results around automatic generation/translation from Java to XSD and vice versa.
Essentially what I need to do is to programmatically create an XSD from a certain source of data -- not Java classes.
Apache XMLSchema is a lightweight Java object model that can be used to manipulate and generate XML schema representations. You can use it to read XML Schema (xsd) files into memory and analyze or modify them, or to create entirely new schemas from scratch.
The fact that with this API one can create an XSD from scratch, it sounds as a starting point to achieve the ask; as to the fitness, it depends on what that "certain source of data" is.

creating a .xml file from .xsd with java

I'm a quite new to java world and I have a requirement of generating an .xml file from an .xsd file
I did some research and found that 'jaxb' could do it. And I found some example too, but the problem is, almost all the examples uses 'xjc' tool to do this. But I want a way to do this through my java code.
Os this possible?
if yes, I'm thinking something like this, from my java code
load the .xsd file
generate the .xml
save the .xml file
Can someone direct me to a good resource and or tell me if my thinking is wrong
I've had good experiences using XMLBeans, however I've always had the XSD available at compile time. It integrates nicely with Maven (plus potentially other build systems). The compilation produces a series of Java classes that can be used to construct an XML document that conforms to the XSD or process an XML file you've received.
You can potentially do some runtime processing of an XSD using the org.apache.xmlbeans.XmlBeans.compileXsd class, but I've never experimented with it. Just seen a reference from an FAQ.
I think the main problem is that to do it in a clean way you should have classes reflecting your xsd. Xsd defines a data model, so the important part is to recreate it with classes. If you want to do it dynamically it could be rather difficult. If you want to do it at compile time- jaxb is the way to go. There is very interesting article talking about problems related with parsing xml (it goes from a different perspective than you describe), but I think there is a wealth of knowledge to be learned from here:
http://elegantcode.com/2010/08/07/dont-parse-that-xml/

Felxibility of data import/export feature using XML-binding

I am going to develop a database import/export feature in a Java EE application.
I plan to use XML-binding solution to convert between Java object and XML file which is import/export file.
Import feature: unmarshal the XML file to Java object in memory representation, then use JDBC to update the database.
Export feature: inverse the import process. retrieve the database to Java object and marshal the object to XML.
I think it can work fine, but it's not flexible enough. Because the XSD of XML is pre-defined, it's impossible to change XML schema and Java object definition at runtime. Say it's dynamic binding. Even I want the feature supports other file formats (You could forget it, if the format is too far at this stage).
What is your advice about the feature? Thanks!
I don't know whether this is correct answer, but if in any case it is helpful:
I have used Spring, Hibernate, JAXB where you can annotate your database entity class and its element with jaxb annotation and you are not required to write any xml schema files. In spring you can use jaxb Marshaller.
I think it should be possible in pure jaxb also, so u can look into jaxb annotation.
I think it can work fine, but it's not
flexible enough. Because the XSD of
XML is pre-defined, it's impossible to
change XML schema and Java object
definition at runtime
If you think it isn't flexible enough, go with a data interchange format which relieves you from all these fixed schema definitions (I know even JSON has a schema specification but you get the point). Is using JSON acceptable?
I would go as far to argue that if "importing to database" and "exporting from database" is the only requirement, you need not even create Java objects for this. Simply pass in a JSON string which contains the schema which would then be processed by a JSON processor which interfaces directly with your DAO layer. Similarly with the data read from the database. The downside is that "date" support in JSON is spotty at best.
Come to think of it, it need not be JSON. You can take a look at other data serialization formats like Apache Avro. But then again, if XML is your requirement which can't be changed, you can get around the "flexibility limitation" by not using a schema at all.
After all, XML is like violence. If it doesn't solve your problem, you're not using enough of it. :-)

Categories