Stream-lined xml builder/parser in Android? - java

I'm learning the Android API from a book, and it doesn't seem to mention any streamlined API for dealing with raw XML (reading and writing). The author's suggestion for parsing is XmlPullParser, and his examples look horrendous considering the kind of APIs I'm spoiled by on other platforms (LINQ to XML especially).
Is this the best available technique on the Android platform?
Obviously I can write a wrapper to avoid the repetitive stuff, but I'd be surprised if no such thing already exists.
Also, he doesn't even make mention of creating xml structures in code. What are my options for both?
On a side note, do any Java devs that are familiar with LINQ to XML in .Net know of anything equivalent in Java?

Since you probably don't want to load a DOM of any substantial size into memory on Android, pull and SAX parsers are the preferred way of dealing with XML there. I think it pays to invest in understanding how SAX works and to write a custom handler rather than rely on generic libraries that may be incompatible or bloated. I parse XML in my apps with SAX all the time, and I'm very pleased with the speed (most of the time).
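To give an idea of what a custom SAX handler involves, here is a minimal sketch. The class name SaxTitleReader and the element name "title" are made up for illustration; only the javax.xml.parsers and org.xml.sax calls are standard API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxTitleReader {

    // Collects the text of every <title> element (hypothetical element name).
    static class TitleHandler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private StringBuilder text; // non-null only while inside a <title>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("title".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver one text node in several chunks, so append.
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName) && text != null) {
                titles.add(text.toString().trim());
                text = null;
            }
        }
    }

    // Parses the XML string and returns the collected titles in document order.
    static List<String> readTitles(String xml) {
        try {
            TitleHandler handler = new TitleHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<feed><item><title>First</title></item>"
                + "<item><title>Second</title></item></feed>";
        System.out.println(readTitles(xml));
    }
}
```

The handler only ever holds the text of the element it is currently inside, which is why memory use stays flat however large the document is.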

Well I'm pretty new to Java, but here's what I've gleaned so far about xml parsing on Android:
The XmlPullParser approach is recommended for Android due to resource constraints. There is a DOM parser available in Android, which would let you use XPath to navigate an xml document. Using the DOM means that you have to load the entire document into memory at once, however. The XmlPullParser method is much more efficient in terms of memory used.
The XmlPullParser method takes a little getting used to after being comfortable with LINQ to XML or XPath, but it's really not too bad IMHO (at least with the documents I was parsing). If you're working with small xml documents you could certainly use the DOM with XPath.
There's a decent article about the different methods for reading and writing XML with Android here:
http://www.ibm.com/developerworks/opensource/library/x-android/index.html
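For small documents, the DOM-with-XPath route mentioned above could look roughly like this. It is only a sketch: the class name DomXPathExample and the sample document are invented, and the whole document is held in memory, which is exactly the cost described above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomXPathExample {

    // Loads the whole document into memory, then selects nodes with an XPath expression
    // and returns the text content of each match.
    static List<String> select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document.
        String xml = "<books><book id=\"1\"><title>A</title></book>"
                + "<book id=\"2\"><title>B</title></book></books>";
        System.out.println(select(xml, "/books/book[@id='2']/title"));
    }
}
```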

I had the same issues with parsing xml or xhtml and ended up writing a webservice doing it for me.
Android device -> (request URL) -> web service (fetch and parse) -> (data) -> Android device
You can transmit the data in JSON to work with it on the device.
The advantage of this is you can minimize the traffic on the slow mobile network and change the parsing without releasing a new android app.
Maybe this will work for you too.
regards

Related

Native JavaScript Library for Document Conversion

I know there are tons of libraries for converting between document formats in PHP, Java, etc.
But I want to know whether there is any pure JavaScript library for converting between document formats.
I want the conversion to take place at client side itself without sending it to the server.
Is it possible or is it farfetched?
It is possible. For example, you can use jsPDF to generate PDF documents. But it is generally not advisable to do document conversion on the client side because of all the unknown variables (document versions, client computer capacity, etc.). A server library meant for document conversion will handle the various conditions that can arise when processing documents much more robustly. Hope that helps!

Java - .Net object interchange, not web-based

I have a client-server system implemented in C#, and the client and server exchange .Net objects via serialization / deserialization and communicating via TCP/IP. This runs on a local network, it is not web-based or Internet-based.
Now I want to include Android clients connected by wifi. Again, this is local network only, not via the Internet and not web-based. The Android programming will be in Java. (I am aware of Mono for Android, but prefer not to get into that now.)
Is there some fairly simple way to implement object to object interchange between Java and .Net objects, provided, of course, that they are compatible?
I've looked a bit at JSON (Jackson on the Java end and Json.Net on the .Net end), and I'm guessing it can probably be done, but only with major efforts on remapping things at each end as soon as the objects become fairly complicated.
Any other suggestions? JSON-based or otherwise?
PS. My question is somewhat related to this one Mapping tool for converting Java's JSON to/from C#, but it never got a suitable answer, perhaps due to insufficient info in the question. Also, I don't care whether I end up using a JSON-based transport or XML or something else.
I would suggest either JSON or XML (which is based on a .xsd file) because these are independent of their respective implementations (instead of something like an ObjectOutputStream in java).
The problem with sharing a format between the two components (client and server) is that both need to be on the same version. My practice is to keep one underlying definition of the format (I use XML with an .xsd file that specifies what the XML must look like) and then use JAXB to generate Java classes from it. That way you can marshal to and unmarshal from XML on the Java side.
I am very sure a similar thing exists in the world of .NET.
JSON is smaller than XML in size, but I find XML more readable.
SO user "default locale" should get the honor for this, but he/she has only answered via a comment. So just to make it very clear what my choice was I'll answer my own question.
I've decided to go with Google Protocol Buffers, which in my opinion has much better support for moving objects back and forth between Java and .Net than JSON. Because I have a lot of experience with C#, and a lot of existing C#-defined classes, I've selected Marc Gravell's protobuf-net program for the .Net end, and Google's own support for the Android end (no - see edit). This implies that I'm defining the objects in C#, not in .proto files - protobuf-net generates the .proto files from which I then generate the Java code.
Incidentally, as the transport mechanism I'm using a little-known program called naga on the Android end. http://code.google.com/p/naga/ Naga seems to work fine, and is well-documented and has sample programs, and should be better known in my opinion.
EDIT:
OK, I've got it working now to my satisfaction. Here's what I'm using:
Google Protocol buffers as the interchange format: https://developers.google.com/protocol-buffers/
Marc Gravell's protobuf-net at the C# end: http://code.google.com/p/protobuf-net/
A program called protostuff at the Java end: http://code.google.com/p/protostuff/
(I prefer protostuff to the official Google Java implementation of protocol buffers due to Google's implementation being based on the Java objects being immutable.)
Actually, I'm not using pure protocol buffers as the interchange format - I prefix the data with the name of the (outermost) class being transmitted. This makes the data self-identifying for deserializing at the other end.
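The exact wire layout isn't spelled out above, so the following is only a sketch of one way to prefix a payload with a class name. The layout (4-byte name length, UTF-8 name, 4-byte payload length, payload) and the class NamedFrame are assumptions for illustration, not the format actually used here. Note that DataOutputStream writes big-endian integers, so the .Net end would have to read them in that byte order.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class NamedFrame {
    final String typeName; // name of the outermost class being transmitted
    final byte[] payload;  // the serialized object, e.g. protocol buffer bytes

    NamedFrame(String typeName, byte[] payload) {
        this.typeName = typeName;
        this.payload = payload;
    }

    // Assumed layout: [int nameLength][name as UTF-8][int payloadLength][payload]
    static byte[] encode(String typeName, byte[] payload) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            byte[] name = typeName.getBytes(StandardCharsets.UTF_8);
            out.writeInt(name.length);
            out.write(name);
            out.writeInt(payload.length);
            out.write(payload);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("Could not encode frame", e);
        }
    }

    // Reads the name first, so the receiver knows which type to deserialize the payload into.
    static NamedFrame decode(byte[] frame) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(frame));
            byte[] name = new byte[in.readInt()];
            in.readFully(name);
            byte[] payload = new byte[in.readInt()];
            in.readFully(payload);
            return new NamedFrame(new String(name, StandardCharsets.UTF_8), payload);
        } catch (IOException e) {
            throw new IllegalStateException("Could not decode frame", e);
        }
    }
}
```

Plain length-prefixed UTF-8 is used instead of writeUTF because writeUTF emits Java's modified UTF-8, which is awkward to read from a non-Java peer.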
You can also try wox (https://github.com/codelion/wox), it is a cross platform serialization library for Java and C# based on XML.

Efficient Parser for large XMLs

I have very large XML files to process. I want to convert them to readable PDFs with colors, borders, images, tables and fonts. My machine doesn't have a lot of resources, so the application needs to be very economical with memory and processor time.
I did some research to decide which technology to use, but I couldn't work out which programming language and API best fit my requirements. I believe DOM is not an option because it consumes a lot of memory, but would Java with a SAX parser fulfill my requirements?
Some people also recommended Python for XML parsing. Is it that good?
I would appreciate your kind advice.
SAX is a very good parser, but it is an older API.
A newer streaming API, StAX (the Streaming API for XML, JSR 173), also parses XML files efficiently. Oracle's tutorial covers it:
http://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/1.6/tutorial/doc/SJSXP2.html
The linked page also compares the parsers, including their memory use and features.
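As a rough illustration of the pull style StAX offers, here is a small sketch that counts elements without building a tree. The class name StaxCounter and the element name in the example are hypothetical; the javax.xml.stream calls are the standard API.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxCounter {

    // Counts elements with the given local name, pulling one event at a time
    // so that only the current event is held in memory.
    static int countElements(String xml, String elementName) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && elementName.equals(reader.getLocalName())) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input; a real large file would be read from a stream instead.
        System.out.println(countElements("<rows><row/><row/><other/></rows>", "row"));
    }
}
```

Unlike SAX, the calling code decides when to ask for the next event, which tends to make nested structures easier to handle.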
Thanks,
Pavan
Yes, I think SAX will work for you. DOM is not good for large XML files because it keeps the whole file in memory. You can see a comparison I wrote in my blog here
Not sure if you're interested in using Perl, but if you're open to it, the following are all good options: LibXML, LibXSLT and XML-Twig, which is good for files too large to fit in memory (so is LibXML::Reader). Of course SAX is available there too, but it can be slow. Most people recommend the first two options. Finally, CPAN is an amazing source with a very active community.
If you want the best of DOM without its memory overhead, vtd-xml is the best bet, here is the proof...
http://recipp.ipp.pt/bitstream/10400.22/1847/1/ART_BrunoOliveira_2013.pdf

Simple XML Serialization 3rd Party Library

I'm trying to speed up my XML parsing and I've stumbled upon Simple XML Serialization which looks pretty good, but I have two questions which I was wondering if anybody could help me with:
Does anyone have any performance figures of Simple over the built in SAXParser on an Android device? (or just in general if not on a handset)
Does anyone know if Simple includes support for streamed files? Unfortunately the application I'm working on sometimes needs large XML files, and there's no room for alternatives and I can't for the life of me find any reference to streaming on the website.
You may also want to consider XStream.
After several tests I found that the built in SAXParser is much faster than anything else I could find.

Use jsoup or gquery for plain XML

I was recently wondering about a good library for XML manipulation in Java: A nice Java XML DOM utility
Before re-inventing the wheel, porting jQuery to Java in jOOX, I checked out these libraries:
http://jsoup.org
http://code.google.com/p/gwtquery
But at closer inspection, I can see:
jsoup does not operate on a standard org.w3c.dom document structure. They rolled their own implementation. I checked out the code and I doubt that it is as efficient and tuned as Xerces, for instance. For my use-cases, performance is important
jsoup seems tightly coupled with HTML. I only want to operate on XML, no HTML structure, no CSS
gwtquery is coupled with GWT. I'm not sure how tightly
Does anyone have experience with these libraries when using them only for server-side XML, not for HTML?
I'm interested in
Performance benchmarks (maybe comparing it with standard DOM / XPath)
Compatibility experience (easy to import/export to standard DOM?)
Without an answer after one month, I think that my own library will resolve my problems best:
http://www.jooq.org/products/jOOX
