Java - Generating XML for a legacy system - java

I'm working on an existing system that's generating XML for a legacy system using a simple template language. This is obviously not ideal because it's difficult to see the structure of the generated XML, it suffers from escaping problems and it's easy to generate invalid XML.
For any sane XML format I'd just use XStream or another Java XML serialization library, but this legacy system has a lot of strange rules like "this node should be excluded if the value is less than ten" and "the formatting of the date in node x depends on the value of node y". There are other strange rules as well, but this should be enough to get the idea.
As I've said, the template approach is far from ideal, but it's pragmatic and works (with some effort). Is there a better way to approach generating XML for legacy systems with this amount of formatting rules? XSL has crossed my mind, but implementing any amount of logic in XSL is frankly not very tempting.

Basically you need some custom logic during serialization. I am guessing that the in-memory object structure is not directly mirrored in the XML structure? Alternatives:
Use StAX and distribute read and write methods within the objects.
Use JAXB and insert custom serialization.
Don't even think of expressing your custom logic in anything other than java, i.e. some "super" framework.
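To make the StAX suggestion concrete, here is a minimal sketch using only the JDK's javax.xml.stream API. The element names, the "US" region value and the two rules are hypothetical stand-ins modelled on the examples in the question, not the real legacy format. The point is only that the odd rules live in plain Java while the writer takes care of escaping.

```java
import java.io.StringWriter;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical writer: element names and rules are invented for illustration.
class LegacyOrderWriter {

    static String toXml(String customer, int quantity, String region, LocalDate shipDate) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("order");
            writeElement(xml, "customer", customer); // text content is escaped by the writer

            // Assumed rule 1: leave out <quantity> when the value is less than ten.
            if (quantity >= 10) {
                writeElement(xml, "quantity", Integer.toString(quantity));
            }

            writeElement(xml, "region", region);

            // Assumed rule 2: the date format in <shipDate> depends on the value of <region>.
            DateTimeFormatter format = "US".equals(region)
                    ? DateTimeFormatter.ofPattern("MM/dd/yyyy")
                    : DateTimeFormatter.ISO_LOCAL_DATE;
            writeElement(xml, "shipDate", format.format(shipDate));

            xml.writeEndElement();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    private static void writeElement(XMLStreamWriter xml, String name, String value)
            throws XMLStreamException {
        xml.writeStartElement(name);
        xml.writeCharacters(value);
        xml.writeEndElement();
    }

    public static void main(String[] args) {
        System.out.println(toXml("Smith & Sons", 5, "US", LocalDate.of(2024, 3, 5)));
    }
}
```

Because each rule is an ordinary if statement, it can be unit tested without parsing a template.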

I am not sure, if this is what you are looking for, but maybe try XML Binding like JAXB...
In other words: you could generate a class library from your xsd-Schema and then build your object graph in java code, then serialize it in one call to xml.

You could use simple xml and some converters I think:
http://simple.sourceforge.net/download/stream/doc/tutorial/tutorial.php

Related

Is it practical to combine XML Schema and an XML-to-JSON conversion?

I have to specify a JSON data structure; that data structure will be part of an interface description, the data will be processed by JavaScript. JSON is set for the data transmission. In other projects, where we used XML instead of JSON, I have used rich XML schemas for this. Unfortunately, I cannot do that now.
I did some researching and found JSON Schema.
However, this is still draft status, which makes me feel a bit uneasy to use it in this context.
I also came across this question discussing how to map XML to JSON. There seems to be a standard (?) conversion in the XML class in the org.json namespace. It appears that the conversion is rather straight-forward for XML documents without mixed content.
So the idea is to use XML Schema to describe the data structure, use our existing XML processing (editing, transformation, validation, ...) tools as long as possible on the server side and convert the XML DOM to JSON just before delivering the data to the JSON consumer.
Data transmission is one-way only and we would not have mixed-content XML.
Maybe someone has tried this before? Would that be a practical approach in the sense that the semantics of the XML Schema are still clear enough for the client-side programmers when (conceptually) applied to the JSON document? Are there any particular pitfalls to be aware of?
If I understood your idea right, you want to use XML Schema as the primary model for your data exchange - for XML as well as JSON formats.
This idea has two parts:
Use single source to model all the data exchange.
Use XML Schema as this single source.
Single source model
The first idea brings you to MDD (Model-Driven Development) or MDA (Model-Driven Architecture) which had a hype around 2002-2005. It was UML-heavy, vendor-driven hype, but quite a few reasonable things (like AndroMDA) survived.
Generally, MDA is a good idea. It works splendidly as long as you do "standard" things. But it can be a nightmare if you want to "customize".
In your case, I would definitely say that single-source model makes sense. This is about data exchange. In the core this can be reduced to very simple models which are still powerful enough to express everything you need.
JSON is an example of this. JSON is even simpler than XML but still powerful enough. It clearly shows that as long as you have basic primitive types, objects, arrays and nesting you can express almost anything.
This "single source model" does not necessarily have to be UML; it can be anything powerful enough to cover all the underlying requirements.
The main problem with a "single source model" is customizing. You know, 90% works very well OOTB, but then in 10% you don't get the result you want and have to customize, and then the effort gets you. Most of the generation tools have some kind of "plugins". So if you fit in the 90%, you're lucky; otherwise you may need to get to know the hairy internals of the generation tools.
To sum up, a single-source model is a good idea as long as it serves all the needs AND the effort to adapt/apply it for the required scenarios is not greater than making it from scratch.
XML Schema as the model
The next question is whether XML Schema is good as the single source model.
You have probably heard or used JAXB which has a schema compiler (XJC). This compiler can take your XML Schema and then generate Java classes with JAXB annotations. These classes can then be used to unmarshal XML into Java objects or marshal these object to XML.
And to JSON:
JAXB Mapping to JSON
Looks like you can also produce a JSON Schema from these classes (haven't tried it myself though):
How to generate JSON schema from a JAXB annotated class?
So XML Schema-first approach works. You can call it schema-driven development (I, hereby, claim the copyright on this term).
I personally did a lot of things schema-first and wrote a number of tools/plugins for XJC. For instance:
Hyperjaxb makes schema-derived classes persistable with JPA.
Jsonix is basically a JAXB port for pure JavaScript.
My experience is that you can do a lot of things schema-first, but I also have to say that XML Schema is good but not the best or simplest model. The specification is complex, and if you take a look at the schema-derived classes you can spot a few constructs which don't fit well into Java beans and properties. For instance, @XmlElementRef is a complex and often weird-looking construct, which is still necessary to cover quite a number of cases you can easily express in XML Schema. In all the tools I wrote I always had to fight with cases, corner cases, and corner cases of corner cases of such constructs.
XML Schema, if you keep it simple and neat, may be beautiful. Maps perfect to beans and properties, easy to understand and work with, a lot of tool support. So XML Schema is not the worst choice to model or specify data exchange.
But it can also get as complex as hell. I saw a lot of overengineered schemas, which are then extremely hard to work with - for very little gain. Sometimes schema designers just don't know XML Schema well enough, sometimes they know it too well. Last time I helped to work out "XML Schema design best practices", we ended up with a 60+ page document of dos and don'ts. So it's easy to get XML Schemas wrong.
But still, as I said above, if it's kept simple and clean it may be beautiful.
What are the alternatives?
Well, you may actually use your Java code as your model source. Annotated POJOs are expressive and versatile enough, but still quite simple to work with. You are not schema-first, you're Java code-first then, but you can still do all the same tricks. You can generate an XML Schema based on your annotated classes. You can do persistence (and much more) with MOXy. You can do JSON just as well.
To sum up and answer your question:
Yes, it is practical, and is known to work fairly well.
Along with the schema-first approach also consider Java-first approach.
You have tools to get XML-Objects-JSON-Persistence.
There are pitfalls (see above).
Hope this helps.
Since no one has answered this question so far and we have started to follow this approach, I will quickly summarize: for us the approach generally works quite well. We have designed a very rich XML Schema that serves as part of the contract between the server and the web client. The JSON follows the XML one-to-one, so the XML Schema reads naturally for the JSON document, too.
The only minor problem we noticed is that the canonical XML-to-JSON transformation that we use (which is not Schema-aware) creates a single object when there is just one child element somewhere in the tree, even when the XML Schema has an upperBound of 'many' for that element. This means that the programmers have to handle some polymorphism between object-values and collections here on the JSON side.
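One way to contain that object-versus-collection ambiguity is a small normalizing helper on the consuming side. The sketch below is in Java for illustration, assuming the JSON has already been parsed into plain Map and List values; the class and method names are hypothetical and not part of any converter's API. The same idea carries over to a JavaScript client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: treats "absent", "single value" and "list" uniformly.
class JsonChildren {

    // Always returns a list, whatever shape the schema-unaware converter produced.
    static List<Object> asList(Object value) {
        if (value == null) {
            return Collections.emptyList();          // element was absent
        }
        if (value instanceof List<?>) {
            return new ArrayList<Object>((List<?>) value); // several child elements
        }
        return Collections.singletonList(value);     // exactly one child element
    }

    public static void main(String[] args) {
        Object single = Collections.singletonMap("name", "a");
        Object many = Arrays.asList(single, Collections.singletonMap("name", "b"));
        System.out.println(asList(single).size()); // one child collapsed to an object
        System.out.println(asList(many).size());   // already a list
    }
}
```

Calling such a helper wherever the schema allows repetition keeps the polymorphism in one place instead of in every consumer.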

Java to XSD or XSD to Java

I know that, using JAXB, you can generate Java files from an XSD and that you can also generate the XSD from annotated POJOs. What are the advantages and disadvantages of each? Is one overall better than the other?
We basically want to serialize events to a log in XML format.
Ultimately it depends on where you want to focus:
If the XML Schema is the Most Important Thing
Then it is best to start from the XML schema and generate a JAXB model. There are details of an XML schema that a JAXB (JSR-222) implementation just can't generate:
A maxOccurs other than 0, 1, or unbounded
many facets on a simple type
model groups
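As a rough illustration of the kind of detail meant here, the sketch below hand-writes a tiny schema with maxOccurs="3" and a maxLength facet and checks documents against it with the JDK's javax.xml.validation API. The element names (event, tag) and the limits are hypothetical, chosen only for the demonstration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical schema: at most three <tag> children, each at most 8 characters long.
class SchemaOnlyConstraints {

    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='event'><xs:complexType><xs:sequence>"
        + "<xs:element name='tag' maxOccurs='3'>"
        + "<xs:simpleType><xs:restriction base='xs:string'>"
        + "<xs:maxLength value='8'/>"
        + "</xs:restriction></xs:simpleType>"
        + "</xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns true when the document satisfies the schema above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            // With no ErrorHandler set, validation errors surface as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<event><tag>a</tag><tag>b</tag></event>"));
        System.out.println(isValid("<event><tag>a</tag><tag>b</tag><tag>c</tag><tag>d</tag></event>"));
    }
}
```

If constraints like these are part of the contract, writing the schema by hand keeps them explicit.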
If the Object Model is the Most Important Thing
If you will be using the Java model for more than just converting between objects and XML (i.e. using it with JPA for persistence) then I would recommend starting with Java objects. This will give you the greatest control.
It depends on your requirement and scenario with respect to the point of initiation.
Given your requirement, generate the Java files from an XSD, since you want to define the output (XML) format first and have Java support it.
Given that one of the main points of XML is to be a portable data-transfer format, usable regardless of platform or language, I would avoid generating XSD from any specific programming language, as a rule of thumb. This may not matter if you know you are only communicating between Java endpoints (but are you sure this will be true forever?).
It is better, all else being equal, to define the interface/schema in a programming-language neutral way.
There are lots of exceptions to this general principle, especially if you are integrating with existing or legacy code...
If you have the opportunity to design both the POJO and the schema, it's a matter of design - do you design for a "perfect" schema or for a "perfect" Java class?
In some cases, you don't have the luxury of a choice, in system integration scenarios, you might be given a predefined XSD from another system that you need to adapt to, then XSD -> Class will be the only way.

Generate object model out of RelaxNG schema with RNGOM - how to start?

I want to generate an object model out of an RelaxNG Schema.
Therefore I want to use the RNGOM Object Model/Parser (mainly because I could not find any alternative - although I don't even care about the language the parser is written in/generates). Now that I checked out the RNGOM source from SVN, I don't have ANY idea how to use RNGOM, since there is not any piece of information out there about the usage.
A useful hint on how to start with RNGOM - a link, example, or any description which saves me from having to read and understand the whole source code of RNGOM - will be awarded as an answer.
Even better would be a simple example how to use the parser to generate an Object model out of an RNG file.
More infos:
I want to generate Java classes out of the following RelaxNG Schema:
http://libvirt.org/git/?p=libvirt.git;a=tree;f=docs/schemas;hb=HEAD
I found out that the Glassfish guys are using rngom to generate the same object model I need, but I could not yet find out how they are using rngom.
A way to proceed could be to :
use Trang (the converter that accompanies the Jing validator) to convert from Relax NG to XML Schema
use more common tools to generate classes (e.g. JaxB).
Hi I ran into mostly the same requirement except I am concentrating on the Compact Syntax. Here is one way of doing what you want but YMMV.
To give some context, my goal in 2 phases: (a) Trying to slurp RelaxNG Compact Syntax and traverse an object/tree to create Spring 4 POJOs usable in Spring 4 Rest Controller. (b) From there I want to develop a request validator that uses the RNG Compact and automatically validates the request before Spring de-serializes the request. Basically scaffolding JSON REST API development using RelaxNG Compact Syntax as both design/documentation and JSON schema definition/validation.
For the first objective I thought about annotating the CompactSyntax with JJTree, but I am obviously not fluent in JavaCC, so I decided to take a more programmatic approach...
I analyzed and tested the code in several ways to determine if there was a tree implementation in binary, digested and/or nc packages but I don't think there is one (an om/tree) as such.
So my latest, actually successful, approach has been to build upon binary: extend SchemaBuilderImpl, implement the visitor interface, and pass my custom SchemaBuilderImpl to CompactSyntax using the long constructor: CompactSyntax(CompactParseable parseable, Reader r, String sourceUri, SchemaBuilder sb, ErrorHandler eh, String inheritedNs)
When you call CompactParseable.parse you will get structured events in the visitor interface and I think this is good enough to traverse the rng schema and from here you could easily create an OM or tree.
But I am not sure this is the best approach. Maybe I missed something and there is in fact an OM/Tree built by the rngom implementation (in my case CompactSyntax) that you can traverse to determine parent/child relationships more easily. Or maybe there are other approaches to this.
Anyway, this is one approach that seems to be working for what I want. Is mostly visitor pattern based and since the interfaces were there I decided to use them. Maybe it will work for you. Bottom line, I could not find an OM/AST that can be traversed implemented anywhere in the implementation packages (nc, binary, digested).

Creating multithreaded Java server and clients, but messages have to be in XML format

I've got to write a multithreaded chat program, using a server and clients but each message sent has to be in XML.
Is it simpler/easier just to write out all the code in Java, and then try to somehow alter it so the messages are sent in XML format, or would it be simpler just to try to go for it in XML and hope it works? I'll admit I don't know that much about XML. :)
Also any links to any relevant online help/tutorials would be much appreciated.
Thanks.
When messing with XML in Java, PLEASE consider using JAXB or something similar. It allows you to work with a normal object graph in memory and then serialize that to XML in one operation (and the other way around).
Manipulating XML through the DOM API is a slow way to lose your sanity, do not do it for any non-trivial amount of XML.
I fail to see what the program being multithreaded or a server have to do with it though...
Check out XStream. You can use this to marshal a normal Java object into XML, and back again into an object, without having to do anything intrusive like defining interfaces or specifying a schema, i.e. it works out of the box for objects you already have defined. For most cases it's seamless in its default mode.
XStream produces a direct XML serialised representation of a Java object (i.e. XML elements represent each field of a Java object directly). You can customise this further as/when you require. If you want to define persisted objects in terms of schema (XSD) then it's not appropriate. However if you're transporting objects where persistence is short-term and you're not worried about conforming to some schema then it's definitely of use.
e.g.
Person person = new Person("Brian Agnew");
XStream xStream = new XStream();
System.out.println(xStream.toXML(person));
and conversion from XML to the Person object is similarly trivial.
(note XStream is thread-safe)
There is something called XML-RPC. This example pretty much shows what you're looking for:
http://docstore.mik.ua/orelly/xml/jxml/ch11_02.htm
It would be simpler to use existing XMPP clients and servers and not write your own at all.
If this is in fact homework, then I would suggest writing the client and server as you have suggested, using all java, but use a String as the message. You can then easily add parsing of the string to/from XML when all other parts are working.
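As a sketch of that last step, the following wraps a plain chat message in XML and reads it back, using only JDK classes (StAX for writing, DOM for reading). The message element, the from attribute and the class name are hypothetical; an actual assignment may prescribe a different format.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical wire format: <message from="sender">text</message>
class ChatMessageXml {

    // Turns a sender and message text into one XML string; special characters are escaped.
    static String toXml(String sender, String text) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("message");
            xml.writeAttribute("from", sender);
            xml.writeCharacters(text);
            xml.writeEndElement();
            xml.close();
        } catch (Exception e) {
            throw new IllegalStateException("Could not write message", e);
        }
        return out.toString();
    }

    // Parses the XML back into { sender, text }.
    static String[] fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return new String[] { root.getAttribute("from"), root.getTextContent() };
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a valid message: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String wire = toXml("alice", "1 < 2 & so on");
        System.out.println(wire);
        System.out.println(fromXml(wire)[1]);
    }
}
```

The socket and threading code only ever sees a String, so these two methods can be added after the rest of the program works.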
I would suggest also having a look at Betwixt and Digester. For Digester there are some tutorials which can be found in the Digester wiki. Betwixt provides some pretty good tutorials right on its website.
Additionally to these two tools there is a list of alternatives that can be found in the Reference section of http://wiki.apache.org/commons/Digester/WhyUseDigester
You're on the right page trying to break the task into smaller pieces.

What is JAXB and why would I use it? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
There is a guy here swearing that JAXB is the greatest thing since sliced bread. I am curious to see what Stack Overflow users think the use case is for JAXB and what makes it a good or a bad solution for that case.
I'm a big fan of JAXB for manipulating XML. Basically, it provides a solution to this problem (I'm assuming familiarity with XML, Java data structures, and XML Schemas):
Working with XML is difficult. One needs a way to take an XML file - which is basically a text file - and convert it into some sort of data structure, which your program can then manipulate.
JAXB will take an XML Schema that you write and create a set of classes that correspond to that schema. The JAXB utilities will create the hierarchy of data structures for manipulating that XML.
JAXB can then be used to read an XML file, and then create instances of the generated classes - laden with the data from your XML. JAXB also does the reverse: takes java classes, and generates the corresponding XML.
I like JAXB because it is easy to use, and comes with Java 1.6 (if you are using 1.5, you can download the JAXB .jars.) The way it creates the class hierarchy is intuitive, and in my experience, does a decent job abstracting away the "XML" so that I can focus on "data".
So to answer your question: I would expect that, for small XML files, JAXB might be overkill. It requires you to create and maintain an XML schema, and to use "standard textbook methods" of utilizing Java classes for data structures. (Main classes, small inner-classes to represent "nodes", and a huge hierarchy of them.) So, JAXB is probably not that great for a simple linear list of "preferences" for an application.
But if you have a rather complex XML schema, and lots of data contained within it, then JAXB is fantastic. In my project, I was converting large amounts of data between binary (which was consumed by a C program) and XML (so that humans could consume and modify that data). The resulting XML Schema was nontrivial (many levels of hierarchy, some fields could be repeated, others could not) so JAXB was helpful in being able to manipulate that.
Here's a reason not to use it: performance suffers. There is a good deal of overhead when marshaling and unmarshaling. You might also want to consider another API for XML-Object binding -- such as JiBX:
http://jibx.sourceforge.net/
I use JAXB at work all the time and I really love it. It's perfect for complex XML schemas that are always changing and especially good for random access of tags in an XML file.
I hate to pimp but I just started a blog and this is literally the first thing I posted about!
Check it out here:
http://arthur.gonigberg.com/2010/04/21/getting-started-with-jaxb/
It's an "ORM for XML". Most often used alongside JAX-WS (and indeed the Sun implementations are developed together) for WS Death Star systems.
With JAXB you can automatically create XML representations of your objects (marshalling) and object representations of the XML (unmarshalling).
As far as the XML Schema is concerned, you have two choices:
Generate Java classes from an XSD
Generate an XSD from your Java classes
There are also some simpler XML serialization libraries like XStream, Digester or XMLBeans that might be alternatives.
JAXB is great if you have to code to some external XML spec defined as an XML schema (xsd).
For example, you have a trading application and you must report the trades to the Uber Lame Trade Reporting App and they've given you ultra.xsd to be getting on with. Use the $JAVA_HOME/bin/xjc compiler to turn the XSD into a bunch of Java classes (e.g. UltraTrade).
Then you can just write a simple adapter layer to convert your trade objects to UltraTrades and use the JAXB to marshal the data across to Ultra-Corp. Much easier than messing about converting your trades into their XML format.
Where it all breaks down is when Ultra-Corp haven't actually obeyed their own spec, and the trade price which they have down as a xsd:float should actually be expressed as a double!
Why do we need JAXB?
The remote components (written in Java) of web services use XML as a means to exchange messages with each other. Why XML? Because XML is considered a lightweight option for exchanging messages on networks with limited resources.
So we often need to convert these XML documents into objects and vice versa. E.g. a simple Java POJO Employee can be used to send Employee data to a remote component (also a Java program).
class Employee{
String name;
String dept;
....
}
This POJO should be converted (marshalled) into an XML document as follows:
<Employee>
<Name>...</Name>
<Department>...</Department>
</Employee>
And at the remote component, the XML document is converted back into a Java object (unmarshalled).
What is JAXB?
JAXB is a library or a tool to perform this marshalling and unmarshalling. It spares you this headache, as simple as that.
You can also check out JiBX. It is also a very good XML data binder, which specializes in OTA (Open Travel Alliance) and is supported by Axis2 servers. If you're looking for performance and compatibility, you can check it out:
http://jibx.sourceforge.net/index.html
JAXB provides improved performance via default marshalling optimizations.
JAXB defines a programmer API for reading and writing Java objects to and from XML documents, thus simplifying the reading and writing of XML via Java.
