Multi-mode XML processors for Java and/or Scala - java

One of the benefits of using Jackson for JSON processing is:
all modes [i.e. streaming, tree, and binding to Java objects] fully supported, and best of all, in such a way that it is easy to convert between modes, mix and match. For example, to process very large JSON streams, one typically starts with a streaming parser, but uses data binder to bind sub-sections of data into Java objects: this allows processing of huge files without excessive memory usage, but with full convenience of data binding.
Are there XML processors for Java or Scala which also support this scenario?

Maybe you want to check out Smooks
http://smooks.org
HTH
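For comparison, the same mix of streaming and tree modes can be sketched with only the JDK's StAX and DOM APIs. This is not how Smooks does it, just an illustration under stated assumptions: the element name `order`, the attribute `id`, and the class `MixedModeXml` are made up for the example. The sketch streams through the document and uses an identity `Transformer` to copy one subtree at a time into a small DOM. Where the reader is left after each copy depends on the transformer implementation, so the loop re-checks the current event instead of advancing blindly.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stax.StAXSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class: streaming outside, DOM inside each <order>.
final class MixedModeXml {

    /** Streams through the XML and reads the id of every <order> via a per-subtree DOM. */
    static List<String> orderIds(String xml) {
        List<String> ids = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            while (reader.hasNext()) {
                if (reader.isStartElement() && "order".equals(reader.getLocalName())) {
                    // Copy only this subtree into a DOM; the rest of the file stays unread.
                    DOMResult subtree = new DOMResult();
                    identity.transform(new StAXSource(reader), subtree);
                    Element order = ((Document) subtree.getNode()).getDocumentElement();
                    ids.add(order.getAttribute("id"));
                    // No reader.next() here: the transform has already consumed the subtree,
                    // so the current event is examined again on the next iteration.
                } else {
                    reader.next();
                }
            }
            reader.close();
        } catch (XMLStreamException | TransformerException e) {
            throw new IllegalStateException("Could not process XML", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        String xml = "<orders><order id=\"1\"><item>a</item></order>"
                + "<order id=\"2\"><item>b</item></order></orders>";
        System.out.println(orderIds(xml));
    }
}
```

Only one `order` subtree is held in memory at a time, which is the property the question asks for. A data binder could take the place of the DOM step if object binding is preferred over a tree.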

Related

Best Practice for large XML file builder

I have to build an XML file as input to a SOAP service in Java. The input XML can consist of at least 1000 tags. What is the best way to build the XML? I have the XSD files, but using JAXB seems a bit complicated. Is XMLStreamWriter a good option for that?
XMLStreamWriter is one of the better APIs to use for writing XML from a Java application, but it has a few quirks (e.g. its namespace handling is a bit bizarre) and you may find it worthwhile to wrap it in a convenience API that knows about the kind of document you are writing, e.g. what namespaces it uses.
One of the advantages of the XMLStreamWriter interface is that there are plenty of implementations to choose from. For example, Saxon has an implementation that gives you full control over all the XSLT/XQuery serialization options plus Saxon extensions (you can even control the output order of attributes!).
One of the problems I hit with all event-based APIs is that sooner or later you find yourself forgetting to write an end tag, and that can be quite tricky to debug. Using a wrapper API that forces you to include the element name in a call on endElement() can be useful for debugging; if debugging is switched on you can keep a stack of element names and check that endElement() is writing the right tag; with debugging switched off you just drop this check.
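A minimal sketch of such a wrapper, assuming a hypothetical class `CheckedXmlWriter` (the class and its method names are invented here, not part of any library): `endElement(name)` takes the element name, and when checking is enabled the name is compared against a stack of open elements before the end tag is written.

```java
import java.io.StringWriter;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical convenience wrapper around XMLStreamWriter.
final class CheckedXmlWriter {
    private final XMLStreamWriter out;
    private final boolean checking;                 // debugging switch
    private final Deque<String> open = new ArrayDeque<>();

    CheckedXmlWriter(XMLStreamWriter out, boolean checking) {
        this.out = out;
        this.checking = checking;
    }

    CheckedXmlWriter startElement(String name) throws XMLStreamException {
        if (checking) {
            open.push(name);                        // remember what is currently open
        }
        out.writeStartElement(name);
        return this;
    }

    CheckedXmlWriter text(String value) throws XMLStreamException {
        out.writeCharacters(value);
        return this;
    }

    CheckedXmlWriter endElement(String name) throws XMLStreamException {
        if (checking) {
            String expected = open.isEmpty() ? null : open.pop();
            if (!name.equals(expected)) {
                throw new IllegalStateException(
                        "endElement(" + name + ") but innermost open element is " + expected);
            }
        }
        out.writeEndElement();                      // with checking off, the name is ignored
        return this;
    }

    void flush() throws XMLStreamException {
        out.flush();
    }

    public static void main(String[] args) throws XMLStreamException {
        StringWriter sink = new StringWriter();
        CheckedXmlWriter xml = new CheckedXmlWriter(
                XMLOutputFactory.newInstance().createXMLStreamWriter(sink), true);
        xml.startElement("order").startElement("item").text("a")
           .endElement("item").endElement("order").flush();
        System.out.println(sink);
    }
}
```

With checking on, a forgotten or misplaced end tag fails at the call that caused it rather than surfacing later as malformed output.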
Serializing using JAXB is higher-level, of course, but the downside is that it gives you less control.

Best method of sending an object to a Spring MVC controller

What is the most efficient (uses the least amount of bandwidth) method of sending a Java bean from a Java application to a Spring MVC servlet?
I am currently using XML, but I think it's using more bandwidth and more time to serialize the bean into XML because it is more verbose, which I do not need, because it's being transferred directly from one application to another, where no person is actually reading the serialized data.
JSON could be an option, I guess.
As I understand it, the two applications are not in the same VM and you need a way to pass data between them. If so, I would suggest the following approach:
Try using Java's default serialization and stream the output to the next application.
Optionally, use a compression mechanism (such as the gzip API in java.util.zip) to compress the serialized output.
Also, if you want to stick with the XML version, you can add a compression step to reduce the size of the XML. This should be a minimal code change if it is an existing application.
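The two steps above can be sketched as follows. The class name `CompressedSerialization` and its methods are made up for illustration, and the in-memory byte array stands in for whatever stream connects the two applications (for example an HTTP request body). The sketch assumes the bean class is `Serializable` and present on both sides, and that the receiver trusts the sender, since deserializing untrusted data is unsafe.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: Java serialization wrapped in gzip compression.
final class CompressedSerialization {

    /** Serializes the bean, gzip-compresses it, and writes it to target (target is closed). */
    static void write(Serializable bean, OutputStream target) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(target))) {
            out.writeObject(bean);
        }
    }

    /** Reads back one object written by write(); only use this on trusted input. */
    static Object read(InputStream source) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new GZIPInputStream(source))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> bean = new ArrayList<>(Arrays.asList("a", "b"));
        ByteArrayOutputStream wire = new ByteArrayOutputStream();   // stand-in for the connection
        write(bean, wire);
        Object copy = read(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(bean.equals(copy));
    }
}
```

The same `GZIPOutputStream` wrapping applies unchanged if the payload stays XML: wrap the stream the XML is written to instead of the `ObjectOutputStream`.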

XML serialization library interoperability between Java and Python

I have been searching for an XML serialization library that can serialize a (Java/Python) object into XML and deserialize it back. I am using XStream right now for Java. If XStream had a Python version that could deserialize the XML generated by XStream, that would have done it for me. Thrift and similar libraries are not going to work unless they allow the data format to be XML. I am looking for suggestions for any library that can do this. - Thanks
Since Java and Python objects are so different in themselves, it's almost impossible to do this unless you restrict the allowed types on both sides.
And in that case, I'd recommend you use JSON, which is a nice interoperability format, even though it's not XML.
Otherwise you could easily write a library that takes XStream XML and loads it into Python objects, but it will always be limited to whatever is similar between Java and Python.
I don't think you're likely to find an automated way to serialise Java objects and deserialise into Python objects. They're different things, so if you want to translate, you'll have to write some code at one or both ends.
If it's really simple stuff (strings, numbers, booleans, and so on), then you might want to look into JSON, a very simple format with bindings for just about every language. Deserialising a JSON object in Python is as simple as:
json.loads('{"test":false}')
Another way to approach the problem might be to use Jython, an implementation of Python in Java, so you can use Java objects directly.
The problem is (as other answers have more or less suggested) that XStream is a Java object serialization framework, not a general data mapping/binding framework. This is by design (see the XStream FAQ): the upside is that it can nicely serialize and deserialize all kinds of Java objects out of the box. The downside is that the resulting XML structure is fairly rigid, and while you can rename things, there isn't much other configurability.
But XStream is not the only Java XML handling library. I would suggest checking out either the JAXB reference implementation or JiBX as possibly better alternatives, so that you have more control over the XML to process. This may be necessary to achieve good interoperability.
Does it really need to use XML?
For serializing structured data between Java and Python you might want to consider Google Protocol Buffers.

Java Collada Parser - XML Pull based implementation

I am looking at a set of parsers generated for Atom, XAL, KML, etc., seemingly using an automated technique with an XML pull based parser. The clue pointing to automation is the presence of "package.html" in all folders of XML-to-Java mapped classes. I would like to produce a similar one for the rather large Collada 1.4 spec. My first attempt with Altova ran into small problems due to the "enum" keyword. I am sure I can fix it in the next run with appropriate renaming. Khronos admits that the 1.4 spec was not designed to be friendly to automated parser generation.
The actual parsers, i.e. the XAL parser, Atom parser, etc., implement the XMLEventParser interface. I would like to know if anybody has encountered or used this pattern. If so, which tool can be used to map the XSD to a class set that simply gives access to the data components of the nodes using getters and setters?
I'm not sure I understand your question, but it appears that you want to process XML formats like Atom and represent them as objects with getters/setters. This can easily be done with JAXB.
For an example see:
http://bdoughan.blogspot.com/2010/09/processing-atom-feeds-with-jaxb.html

Sending large xml data through a socket

I'm a newbie to XML in Java. I have to write a method that sends large XML data with lots of nodes through a socket to a client application.
What is the suitable method to generate XML?
What is the best method to send large XML through sockets?
Since you are using sockets, you just need to deal with Java InputStream/OutputStream. This gives you a lot of flexibility in your XML handling, as almost all XML technologies handle streams as input/output.
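As a rough illustration of writing XML straight to a stream, here is a sketch using the JDK's XMLStreamWriter. The class `StreamingXmlSender` and the element names `items` and `item` are invented for the example, and a ByteArrayOutputStream stands in for the stream you would get from `socket.getOutputStream()`. Each element goes to the stream as it is produced, so the whole document never has to sit in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sender: writes a list of values as XML to any OutputStream.
final class StreamingXmlSender {

    static void writeItems(OutputStream out, List<String> items) throws XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
        xml.writeStartDocument("UTF-8", "1.0");
        xml.writeStartElement("items");
        for (String item : items) {
            xml.writeStartElement("item");
            xml.writeCharacters(item);      // special characters such as & are escaped
            xml.writeEndElement();
        }
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.flush();                        // push buffered output to the stream
    }

    public static void main(String[] args) throws XMLStreamException {
        // Stand-in for socket.getOutputStream()
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeItems(wire, Arrays.asList("a", "b & c"));
        System.out.println(new String(wire.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

On the receiving side, the matching input stream can be handed to any stream-based parser in the same way.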
You could represent your data as plain old Java objects (POJOs), and then bind them to XML using JAXB. An implementation of JAXB is included in Java SE 6. There are other implementations such as MOXy (I'm the tech lead) and JaxMe.
For an example see:
http://wiki.eclipse.org/EclipseLink/Examples/MOXy/GettingStarted
To generate XML, you can use the DOM implementation provided by any XML DOM parser and generator.
Here is a nice tutorial. But if you only need generation, try a small, lightweight parser such as tinyxml or qdparser, because Xerces and similar libraries are heavyweight for that. If parsing is also involved, libxml or Xerces is a good choice because they provide a nice SAX implementation for parsing, but you need to have a schema defined for your data. Again, try to serialize the data before sending so you can avoid other problems.
