I'm trying to speed up my XML parsing and I've stumbled upon Simple XML Serialization which looks pretty good, but I have two questions which I was wondering if anybody could help me with:
Does anyone have any performance figures of Simple over the built in SAXParser on an Android device? (or just in general if not on a handset)
Does anyone know if Simple includes support for streamed files? Unfortunately the application I'm working on sometimes needs large XML files, and there's no room for alternatives and I can't for the life of me find any reference to streaming on the website.
You may also want to consider XStream.
After several tests I found that the built in SAXParser is much faster than anything else I could find.
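For reference, here is a minimal sketch of streaming a document through the built-in SAXParser so that only the text currently being read is held in memory. The class name `SaxTitles`, the method `readTitles` and the `<title>` element are made up for illustration; they do not come from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTitles {
    // Streams the input and collects the text of every <title> element.
    // "title" is a hypothetical element name chosen for this sketch.
    public static List<String> readTitles(InputStream in) throws Exception {
        final List<String> titles = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <title>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("title".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node, so append.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName) && text != null) {
                    titles.add(text.toString());
                    text = null;
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, handler);
        return titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items><item><title>First</title></item>"
                + "<item><title>Second</title></item></items>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readTitles(in));
    }
}
```

Because the handler takes an InputStream, the same code should work with a file or network stream in place of the in-memory string used in `main`.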
I am working on a project that currently uses a bunch of very big XSLT files.
We use those XSLTs to translate XML from our system into XML that the other system can read.
Our system actually receives JSON, which we save as XML just for those XSLTs.
We are now thinking about a way to replace the xslt with something simpler, but we have a restriction:
Those XSLTs are modified by outside people (who work on the other system), so just refactoring them is not an option: it would only be a temporary fix until they become ugly again. Also, we still need to find a way to let those people change the way we transform the XML, preferably without teaching them how to code.
Since our system is written in java, we would also like our solution to be supported by one of the major java frameworks.
I was thinking about a sort of rule engine with XQuery for customization, but I am not sure if that is a valid solution.
Another idea I found was to just use Ruby, since many people say that it does the job better, but I fear that the teaching overhead will be too great.
I would really appreciate any ideas you might have for solving this problem.
Thanks :)
I have very large XML files to process. I want to convert them to readable PDFs with colors, borders, images, tables and fonts. My machine doesn't have a lot of resources, so I need my application to be very economical with memory and CPU.
I did some research to make up my mind about the technology to use, but I could not decide which programming language and API best fit my requirements. I believe DOM is not an option because it consumes a lot of memory, but would Java with a SAX parser fulfill my requirements?
Some people also recommended Python for XML parsing. Is it that good?
I would appreciate your kind advice.
SAX is a very good parser, but it is an older, push-style API.
StAX (the Streaming API for XML) is a newer, pull-style API for parsing XML files efficiently, and it has shipped with the JDK since Java 6:
http://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/1.6/tutorial/doc/SJSXP2.html
The linked page also compares the parsers, including their memory utilization and features.
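A minimal sketch of the pull style with StAX, assuming a plain JDK (not Android). The class name `StaxCount` and the `row` element in the sample are made up for illustration.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxCount {
    // Counts elements with the given local name, pulling one event at a time,
    // so the document is never held in memory as a whole.
    public static int countElements(Reader source, String name) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(source);
        int count = 0;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && name.equals(reader.getLocalName())) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample input; a real run would pass a FileReader or similar.
        String xml = "<table><row/><row/><row><cell>x</cell></row></table>";
        System.out.println(countElements(new StringReader(xml), "row"));
    }
}
```

The difference from SAX is that the calling code drives the loop and asks for the next event, instead of the parser calling back into a handler.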
Thanks,
Pavan
Yes, I think SAX will work for you. DOM is not good for large XML files because it keeps the whole XML file in memory. You can see a comparison I wrote on my blog here
Not sure if you're interested in using Perl, but if you're open to it, the following are all good options: LibXML, LibXSLT and XML-Twig, which is good for files too large to fit in memory (so is LibXML::Reader). Of course SAX is there as well, but it can be slow. Most people recommend the first two options. Finally, CPAN is an amazing source with a very active community.
If you want the best of DOM without its memory overhead, vtd-xml is the best bet, here is the proof...
http://recipp.ipp.pt/bitstream/10400.22/1847/1/ART_BrunoOliveira_2013.pdf
I'm working on developing a media-player-like application in Java (it's a Swing-based application) and I want it to run smoothly with as many different file formats as possible. I want to be able to take in a bunch of music files, retrieve their tag information (artist/album/songname/etc), and then later play them. I've done a bit of poking around but it's hard to find a library which will support .m4a, .mp3 and maybe even .flac files. Does anyone know of a library which will do what I want? Thanks!
JMF is, to put it in the nicest possible way, rather out of date, unmaintained, difficult to distribute and in my experience has quite a few annoying bugs that crop up where you least expect them. And if you can get FMJ to work at all, good luck - they pride themselves on it being an up-to-date, drop-in replacement, but my experience begs to differ on both those points.
Personally I wouldn't even consider it - just use separate libraries for each format or bunch of formats you want to support. JLayer would be a good one to start with as it can do a fair few, JFlac will do your flac files on top of that.
There's JMF - see http://en.wikipedia.org/wiki/Java_Media_Framework, which also lists some alternatives. I've had rather mixed success with JMF; it worked well for some static MPEG files but didn't seem compatible with the streaming sources we were using at the time (a couple of years ago).
An alternative to jFLAC for FLAC files is to use the official libFLAC, invoked via the Java native interface. See this blog post, under the headline “FLAC decoding with Java native interface” for an explanation of how it's done, with links to working code.
I know this question has been asked before, but that was several years ago, and of the two answers, Rome and Abdera, the first no longer seems to be maintained (there aren't even any download links on the website, nor can I find documentation). The latter also appears rather complicated, and neither appears up to contemporary standards of Java library design.
Are there any new alternatives out there that are well designed, and well maintained?
Sorry, I do not know of any library, but, that said, seeing as RSS is an XML format you should be able to roll your own using SAX/JAXB/DOM. Which one to use depends on whether you want ease of integration with Java (JAXB) or speed (SAX). There is a middle ground in DOM.
RSS is not a complicated format, so I think you could just develop the features you need as you come across them, and it'll be faster (and the skills you learn more transferable) than exhaustively searching for a library if one cannot be found easily.
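As a rough sketch of the roll-your-own approach using DOM, the following pulls the title of each `<item>` out of an RSS 2.0 feed. The class name `RssTitles` and the sample feed are made up for illustration, and a real reader would also need dates, links, encodings and error handling.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class RssTitles {
    // Loads the whole feed into a DOM tree and returns the title of each <item>.
    // Fine for typical feeds; not intended for very large documents.
    public static List<String> itemTitles(InputStream in) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(in);
        List<String> titles = new ArrayList<>();
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            NodeList titleNodes = item.getElementsByTagName("title");
            if (titleNodes.getLength() > 0) {
                titles.add(titleNodes.item(0).getTextContent().trim());
            }
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical feed content used only to exercise the method.
        String feed = "<rss version=\"2.0\"><channel><title>Feed</title>"
                + "<item><title>A</title><link>http://example.com/a</link></item>"
                + "<item><title>B</title></item>"
                + "</channel></rss>";
        InputStream in = new ByteArrayInputStream(feed.getBytes(StandardCharsets.UTF_8));
        System.out.println(itemTitles(in));
    }
}
```

Note that the channel's own `<title>` is skipped because only titles nested inside `<item>` elements are collected.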
Hope this helps.
I did find this class, RSSDigester. It might help; I don't really have the time to investigate it right now, sorry.
RSS reading hasn't really needed changing for some time. ROME really is quite nice, and as far as fetching it you can get it from http://download.java.net/maven/2/rome/.
I eventually found HorroRSS, which is exactly what I was hoping for. It's simple, easy to use, and appears robust.
I'm learning the Android api from a book, and it seems like there isn't any mention of a stream-lined api for dealing with raw xml (reading and writing). His suggestion for parsing is the XmlPullParser, and his examples look horrendous considering the kind of api's I'm spoiled by in other platforms (LINQ to XML especially).
Is this the best available technique on the Android platform?
Obviously I can write a wrapper to avoid the repetitive stuff, but I'd be surprised if no such thing already exists.
Also, he doesn't even make mention of creating xml structures in code. What are my options for both?
On a side note, do any Java devs that are familiar with LINQ to XML in .Net know of anything equivalent in Java?
Since you probably don't want to load any DOM of substantial size into Android's memory, pull and SAX parsers are the preferred way of dealing with XML on Android. I think it pays to invest in understanding how SAX works and to write a custom handler rather than rely on generic libraries that may be incompatible or bloated. I parse XML in my apps using SAX all the time and I'm very pleased with the speed (most of the time).
Well I'm pretty new to Java, but here's what I've gleaned so far about xml parsing on Android:
The XmlPullParser approach is recommended for Android due to resource constraints. There is a DOM parser available in Android, which would let you use XPath to navigate an xml document. Using the DOM means that you have to load the entire document into memory at once, however. The XmlPullParser method is much more efficient in terms of memory used.
The XmlPullParser method takes a little getting used to after being comfortable with LINQ to XML or XPath, but it's really not too bad IMHO (at least with the documents I was parsing). If you're working with small xml documents you could certainly use the DOM with XPath.
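For the small-document case, here is a minimal sketch of DOM with XPath using the standard javax.xml.xpath API, which is about as close as the built-in libraries get to a query style like LINQ to XML. The class name `XPathNames`, the sample document and the expression are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathNames {
    // Evaluates an XPath expression against the XML text and returns the
    // text content of every matching node. The whole document is loaded
    // into memory, so this is only suitable for small inputs.
    public static List<String> select(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document used only to exercise the method.
        String xml = "<people>"
                + "<person age=\"31\"><name>Ann</name></person>"
                + "<person age=\"17\"><name>Bob</name></person>"
                + "</people>";
        System.out.println(select(xml, "/people/person[@age > 18]/name"));
    }
}
```

The filter lives in the expression string, so changing what is selected does not require changing the traversal code.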
There's a decent article about the different methods for reading and writing XML with Android here:
http://www.ibm.com/developerworks/opensource/library/x-android/index.html
I had the same issues with parsing xml or xhtml and ended up writing a webservice doing it for me.
Android Device -> (Request URL) -> Webservice Get and Parse -> (Data) -> Android Device
You can transmit the data in JSON to work with it on the device.
The advantage of this is you can minimize the traffic on the slow mobile network and change the parsing without releasing a new android app.
Maybe this will work for you too.
regards