Converting XML files into Java objects by using JSON - Java

I've been trying to convert XML files to Java objects efficiently, but I haven't succeeded yet. I have looked at JAXB annotations and a few other approaches, but they haven't seemed efficient to me, and I need to use JSON. I need help with an efficient code example.

Don't reinvent the wheel. These libraries (GSON, Jackson...) are fast, well tested, and have huge communities. If it were easy to do better, it would already have been done.
And this is not really a question ;-)
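For what it's worth, a minimal sketch of the Jackson route, assuming the jackson-dataformat-xml module is on the classpath; the Customer class and the XML shape are made up for illustration:

    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.dataformat.xml.XmlMapper;

    public class XmlToJsonDemo {

        // Hypothetical POJO matching <customer><id>1</id><name>Ann</name></customer>.
        public static class Customer {
            public int id;
            public String name;
        }

        public static void main(String[] args) throws Exception {
            String xml = "<customer><id>1</id><name>Ann</name></customer>";

            // Read the XML into the POJO with Jackson's XML module...
            Customer customer = new XmlMapper().readValue(xml, Customer.class);

            // ...and, since the question mentions JSON, write the same object back out as JSON.
            String json = new ObjectMapper().writeValueAsString(customer);
            System.out.println(json); // {"id":1,"name":"Ann"}
        }
    }

GSON covers the JSON half in a similar way but, as far as I know, has no XML counterpart of its own, which is why Jackson is the usual pick when both formats are in play.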

Related

Handling messages with Java and JavaScript: JSON or XML?

I'm currently working on a project that needs some server-client communication. We're planning to use WebSockets and a Java server (Jetty) on the server side, so messages must be interpreted with Java on the server and with JavaScript on the client.
Now we're thinking about a protocol and what structure the messages should have. We already have a reference implementation which uses XML messages, but since JSON is designed to be used with JavaScript, we're also considering JSON strings.
Messages will contain data consisting of XML strings plus some meta information needed to process this data (i.e. store it in a database, redirect it to other clients...). It is important that processing the messages (parsing and creating) is easy and fast on both the server and the client side, since the application should run in real time.
Since we don't have the time to test both technologies, I would appreciate suggestions based on personal experience or technical considerations. Is one of the techniques more usable than the other, or are there drawbacks to either of them?
Thanks in advance.
JSON is infinitely easier to work with, in my opinion. It is far easier to access something like data.foo.bar.name than trying to work your way to the corresponding node in XML.
XML is okay for data files, albeit still iffy, but for client-server communication I highly recommend JSON.
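On the Java side of that comparison, a hedged sketch using Jackson's tree model; the data.foo.bar.name path is just the one from the comment above, and the JSON literal is invented:

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;

    public class JsonPathDemo {
        public static void main(String[] args) throws Exception {
            String json = "{\"foo\":{\"bar\":{\"name\":\"example\"}}}";

            // The server can walk the same path the JavaScript client
            // would access as data.foo.bar.name.
            JsonNode data = new ObjectMapper().readTree(json);
            String name = data.path("foo").path("bar").path("name").asText();
            System.out.println(name); // example
        }
    }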
You are opening a can of worms (again, not for the first time).
Have a look at this JSON vs XML comparison; a quick search on Stack Overflow will also be good.
This question might be a duplicate, like this Stack Overflow XML vs JSON question.
In the end the answer stays the same: it depends on you. I do agree with many comments there that sometimes XML is overkill (and sometimes it isn't).
I agree with Kolink.
The reason it is better to use JSON is that XML carries a lot of header and markup overhead, which means each transfer is bigger.
For iOS or Android over WLAN, you should use JSON rather than XML.
I agree with Kolink, but if you already have an XML schema in place, I'd use XML to save yourself some headaches on the Java side. It really depends on who's doing the most work.
Also, JSON is more compact, so you could save bandwidth using its format.
There seem to be some libraries for parsing JSON in Java, so it may not be too hard to switch formats.
http://json.org/java/

XML API for best performance

I have an application that works with a lot of XML data, so I want to ask which is the best API for handling XML in Java. Today I'm using the W3C DOM and, for performance, I want to migrate to another API.
I build XML from scratch, do a lot of transforms, import into databases (MySQL, MSSQL, etc.), export from the database to HTML, modify those XML documents, and more.
Is JDOM the best option? Do you know of anything better than JDOM?
I have heard (by reading around) about Javolution. Does anybody use it?
Which API would you recommend?
If you have vast amounts of data, the main thing is to avoid having to load it all into memory at once (because it will use a vast amount of memory, and because it prevents you from overlapping IO and processing). Sadly, I believe most DOM and DOM-like libraries (like DOM4J) do just that, so they are not well suited for processing vast amounts of XML efficiently.
Instead, look at using a streaming API, like SAX or StAX. StAX is, in my experience, usually easier to use.
There are other APIs that try to give you the convenience of DOM with the performance of SAX. Javolution might be one; VTD-XML is another. But to be honest, I find StAX quite easy to work with - it's basically a fancy stream, so you just think in the same way as if you were reading a text file from a stream.
One thing you might try is combining JAXB with StAX. The idea is that you stream the file using StAX, then use JAXB to unmarshal chunks within it. For instance, if you were processing an Atom feed, you could open it, read past the header, then work in a loop unmarshalling entry elements to objects one at a time. This only really works if your format consists of a sequence of independent elements, like Atom; it would be largely useless on something richer like XHTML. You can see examples of this in the JAXB reference implementation and a guy's blog post.
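A minimal sketch of that StAX-plus-JAXB loop, assuming the older javax.xml.bind API is on the classpath and an Atom-like feed without namespaces; the Entry class, its title field, and the feed.xml filename are made up:

    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.Unmarshaller;
    import javax.xml.bind.annotation.XmlRootElement;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;
    import java.io.FileInputStream;

    public class StaxJaxbDemo {

        // Hypothetical element type; a real Atom entry would carry more fields.
        @XmlRootElement(name = "entry")
        public static class Entry {
            public String title;
        }

        public static void main(String[] args) throws Exception {
            Unmarshaller unmarshaller = JAXBContext.newInstance(Entry.class).createUnmarshaller();
            XMLInputFactory xif = XMLInputFactory.newFactory();

            try (FileInputStream in = new FileInputStream("feed.xml")) {
                XMLStreamReader reader = xif.createXMLStreamReader(in);

                // Stream through the document and unmarshal one <entry> at a time,
                // so only a single entry is ever held in memory.
                int event = reader.getEventType();
                while (reader.hasNext()) {
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "entry".equals(reader.getLocalName())) {
                        Entry entry = unmarshaller.unmarshal(reader, Entry.class).getValue();
                        System.out.println(entry.title);
                        // unmarshal() leaves the cursor just after </entry>,
                        // so re-read the current event rather than skipping one.
                        event = reader.getEventType();
                    } else {
                        event = reader.next();
                    }
                }
                reader.close();
            }
        }
    }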
The answer depends on what performance aspects are important for your application. One factor is whether you are handling large XML documents.
For parsing, DOM-based approaches will not scale well to large documents. If you need to parse large documents, non-DOM parsers such as those using SAX and StAX will be faster and less resource intensive. However, if you need to transform XML after parsing, using either XSL or a DOM API, you are going to need the whole document in memory in any case.
For creating XML from code, StAX provides a nice API. Since the approach is stream-based, it scales well to writing very large documents.
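For example, a hedged sketch of writing a large document record by record with XMLStreamWriter; the element names and the loop standing in for a database cursor are invented:

    import javax.xml.stream.XMLOutputFactory;
    import javax.xml.stream.XMLStreamWriter;
    import java.io.FileOutputStream;

    public class StaxWriterDemo {
        public static void main(String[] args) throws Exception {
            XMLOutputFactory xof = XMLOutputFactory.newFactory();

            try (FileOutputStream out = new FileOutputStream("records.xml")) {
                XMLStreamWriter w = xof.createXMLStreamWriter(out, "UTF-8");
                w.writeStartDocument("UTF-8", "1.0");
                w.writeStartElement("records");

                // Each record is written and then forgotten, so memory use stays flat
                // no matter how many rows come out of the database.
                for (int i = 0; i < 1_000_000; i++) {
                    w.writeStartElement("record");
                    w.writeAttribute("id", Integer.toString(i));
                    w.writeCharacters("row " + i);
                    w.writeEndElement();
                }

                w.writeEndElement();
                w.writeEndDocument();
                w.close();
            }
        }
    }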
Well, most developers I know, and I myself, use dom4j. If you have the time, you could write a small performance test using both frameworks; then you will see the difference. I prefer dom4j.

File Manipulation libraries

I have a project in which I need to manipulate files: things like creating a new file with a defined structure (header, data, trailer), and then things like search/validate/create/read.
Basically I want to map the files to objects and vice versa. (I want to map them to objects because it will be much more comfortable for me to manipulate the fields inside each file via an object.)
I wonder if any of you have dealt with such things before, and whether you could recommend libraries that would ease my work.
thanks,
ray.
You may want to look at serialization and deserialization.
If you want custom mapping, you need custom coding. I would suggest you look at DataInputStream and DataOutputStream.
Using these you can control the header, records and footer in any binary format you want.
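A minimal sketch of that idea; the header/record/trailer layout and the file name are made up:

    import java.io.*;

    public class FixedLayoutDemo {
        public static void main(String[] args) throws IOException {
            File file = new File("records.dat");

            // Write: header, two records, trailer marker.
            try (DataOutputStream out = new DataOutputStream(
                    new BufferedOutputStream(new FileOutputStream(file)))) {
                out.writeUTF("HDR01");          // header: format version
                out.writeInt(2);                // header: number of records that follow
                for (int i = 1; i <= 2; i++) {
                    out.writeLong(i);           // record: id
                    out.writeUTF("name-" + i);  // record: name
                }
                out.writeUTF("TRL");            // trailer marker
            }

            // Read it back in exactly the same order.
            try (DataInputStream in = new DataInputStream(
                    new BufferedInputStream(new FileInputStream(file)))) {
                System.out.println("header: " + in.readUTF());
                int count = in.readInt();
                for (int i = 0; i < count; i++) {
                    System.out.println(in.readLong() + " -> " + in.readUTF());
                }
                System.out.println("trailer: " + in.readUTF());
            }
        }
    }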
I suggest you generate your serialization code (if you need the fastest possible speed) or use reflection to do the translation. Just using reflection is pretty fast and much simpler than generating code. ;)
In the end I found an ORM framework called Canyon which maps files to objects, but I still had difficulties, so I implemented my own mapping from files to objects and vice versa.
If you have a defined file layout with varying content, you should consider using a template engine like FreeMarker or Velocity to generate your files.
You define templates which are then filled with the dynamic content you have to provide. Definitely better than using System.out (I mean hard-coding your template text).
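A hedged sketch with FreeMarker; the template name, its contents, the templates directory, and the version constant are all assumptions:

    import freemarker.template.Configuration;
    import freemarker.template.Template;

    import java.io.File;
    import java.io.FileWriter;
    import java.io.Writer;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical template templates/record-file.ftl, e.g.:
    //   HEADER ${date}
    //   <#list records as r>${r}
    //   </#list>TRAILER ${records?size}
    public class TemplateDemo {
        public static void main(String[] args) throws Exception {
            Configuration cfg = new Configuration(Configuration.VERSION_2_3_31);
            cfg.setDirectoryForTemplateLoading(new File("templates"));

            // The data model supplies the dynamic content the template refers to.
            Map<String, Object> model = new HashMap<>();
            model.put("date", "2012-01-01");
            model.put("records", Arrays.asList("a", "b", "c"));

            Template template = cfg.getTemplate("record-file.ftl");
            try (Writer out = new FileWriter("output.txt")) {
                template.process(model, out);
            }
        }
    }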
A library which helps for basic file manipulation is Apache Commons IO.
If you really want to map your files to objects, then it is a serialization/deserialization problem, as Angelom mentions. Many libraries help you to do this, but then the file format is fixed:
JSON: Jackson, GSON
XML: JAXB
If you want the file to be read by third parties as well, how about using a popular existing exchange format such as CSV or XML?
XML is fully supported in the standard library, and there are plenty of CSV libraries out there, including Apache Commons CSV.
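For instance, a minimal Commons CSV sketch; the column names and file name are invented, and the withHeader/withFirstRecordAsHeader calls are the older pre-builder style of the library:

    import org.apache.commons.csv.CSVFormat;
    import org.apache.commons.csv.CSVPrinter;
    import org.apache.commons.csv.CSVRecord;

    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.Reader;

    public class CsvDemo {
        public static void main(String[] args) throws Exception {
            // Write a small CSV file with a header row.
            try (CSVPrinter printer = new CSVPrinter(new FileWriter("people.csv"),
                    CSVFormat.DEFAULT.withHeader("id", "name"))) {
                printer.printRecord(1, "Alice");
                printer.printRecord(2, "Bob");
            }

            // Read it back, addressing columns by header name.
            try (Reader in = new FileReader("people.csv")) {
                for (CSVRecord record : CSVFormat.DEFAULT.withFirstRecordAsHeader().parse(in)) {
                    System.out.println(record.get("id") + " -> " + record.get("name"));
                }
            }
        }
    }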

Best way to parse large XML document in Jython

I need to parse a large (>800MB) XML file from Jython. The XML is not deeply nested, containing about a million relevant elements. I need to convert these elements into real objects.
I've used nu.xom.* successfully before, but now that I've switched from Java to Jython, the library fails with the following message:
The parser has encountered more than "64,000" entity expansions in this document; this is the limit imposed by the application.
I have not found a way to fix this, so I probably have to look for another XML library. It could be either Java or Jython-compatible Python, and it should be efficient. Pythonic would be great; nu.xom.* is simple but not very Pythonic. Do you have any suggestions?
SAX is the best way to parse large documents.
Sounds like you're hitting the default expansion limit.
See this note:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4843787
You need to set the system property "entityExpansionLimit" to change the default.
(Added) See also the answer to this question.
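A hedged sketch of raising the limit; whether a programmatic System.setProperty call is picked up depends on the JAXP implementation, so the -D command-line flag is the safer bet:

    public class ExpansionLimitDemo {
        public static void main(String[] args) {
            // Raise the limit named in the bug report above; the same value can be
            // passed on the command line instead (YourMainClass is a placeholder):
            //   java -DentityExpansionLimit=1000000 YourMainClass
            System.setProperty("entityExpansionLimit", "1000000");
        }
    }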
Try using the SAX parser; it is great for streaming large XML files.
Does Jython support xml.etree.ElementTree? If so, use the iterparse method to keep your memory size down. Read this and use elem.clear() as described.
There is the lxml Python library, which can parse large files without loading the data into memory, but I don't know if it is Jython-compatible.

XML serialization library interoperability between Java and Python

I have been searching for an XML serialization library that can serialize and deserialize a (Java/Python) object into XML and back. I am using XStream right now for Java. If XStream had a Python version that could deserialize the XML generated by XStream, that would have done it for me. Thrift and other such libraries are not going to work unless they allow the data format to be XML. I am looking for suggestions for any library that can do this. - Thanks
Since Java and Python objects are so different in themselves, it's almost impossible to do this unless you restrict the allowed types and suchlike on both sides.
And in that case, I'd recommend you use JSON, which is a nice interoperability format, even though it's not XML.
Otherwise you could easily write a library that takes XStream XML and loads it into Python objects, but it will always be limited to whatever is similar between Java and Python.
I don't think you're likely to find an automated way to serialise Java objects and deserialise into Python objects. They're different things, so if you want to translate, you'll have to write some code at one or both ends.
If it's really simple stuff - strings, numbers, booleans, and so on, then you might want to look into json, a very simple format with bindings for just about every language. Deserialising a json object in Python is as simple as:
json.loads('{"test":false}')
Another way to approach the problem might be to use Jython, an implementation of Python in Java, so you can use Java objects directly.
The problem is (as sort of suggested by other answers) that XStream is a Java object serialization framework, not a general data mapping/binding framework. This is by design (see the XStream FAQ): the upside is that it can nicely serialize and deserialize all kinds of Java objects out of the box. The downside is that the resulting XML structure is fairly rigid, and while you can rename things there isn't much other configurability.
But XStream is not the only Java XML handling library. I would suggest checking out either the JAXB reference implementation or JiBX as possibly better alternatives, so that you have more control over the XML you produce. This may be necessary to achieve good interoperability.
Does it really need to use XML?
For serializing structured data between Java and Python you might want to consider Google Protocol Buffers.
