map csv / xml files to java objects - java

Can you please help me choose the best Java APIs to map CSV and XML files into Java objects in the context of a Spring Boot application and microservices?
- OpenCSV, Apache Commons, JAXB ...?
What are the best APIs for mapping CSV and XML to Java objects for a
Thank you.

I used OpenCSV a lot, without any issue. You can get a good feel of it from this article.
You will need a different library for XML. First choose between DOM and SAX. The most important criterion is size: does the document fit in memory with ease? If so, use a DOM parser, as it's easier and faster to navigate once the document is loaded. Otherwise use SAX.
A good recommendation for DOM parsing is dom4j.
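To make the DOM option concrete, here is a minimal sketch that maps a small XML document to Java objects. It assumes the JDK's built-in DOM parser rather than dom4j, and the `Customer` class and the element and attribute names are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DomMappingSketch {

    // Hypothetical target type; not part of any library.
    static final class Customer {
        final String id;
        final String name;

        Customer(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Loads the whole document into memory, then walks the <customer> elements.
    static List<Customer> parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList nodes = doc.getElementsByTagName("customer");
        List<Customer> customers = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element element = (Element) nodes.item(i);
            // Assumes every <customer> has exactly one <name> child.
            String name = element.getElementsByTagName("name").item(0).getTextContent();
            customers.add(new Customer(element.getAttribute("id"), name));
        }
        return customers;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<customers>"
                + "<customer id=\"1\"><name>Ada</name></customer>"
                + "<customer id=\"2\"><name>Linus</name></customer>"
                + "</customers>";
        for (Customer c : parse(xml)) {
            System.out.println(c.id + " " + c.name);
        }
    }
}
```

Because the whole tree is held in memory, this approach only suits documents that fit comfortably in the heap.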

Related

which XML write/parse/modify is good for android

I need to parse and modify XML in Android. Can anyone suggest which XML parser is better for parsing and modifying XML in Android?
Currently I'm using XmlPullParser, but with it I'm not able to modify the XML.
XPath is available for Android developers, I believe. I use it all the time for any XML parsing, really.
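As a rough illustration of the XPath approach, the sketch below loads a document into a DOM tree, locates one node with an XPath expression, changes its text and serializes the result. It uses only `javax.xml` classes; the `setText` helper and the sample element names are invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XPathEditSketch {

    // Hypothetical helper: replaces the text of the first node matching xpathExpr.
    static String setText(String xml, String xpathExpr, String newValue) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        Node node = (Node) xpath.evaluate(xpathExpr, doc, XPathConstants.NODE);
        if (node == null) {
            throw new IllegalArgumentException("No node matches " + xpathExpr);
        }
        node.setTextContent(newValue);

        // Serialize the modified tree back to a string.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<settings><volume>3</volume></settings>";
        System.out.println(setText(xml, "/settings/volume", "7"));
    }
}
```

This keeps the whole document in memory, so it fits small configuration-style files better than large data sets.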
If the XML has a simple structure then you can deserialize the file into an object. You modify some properties of that object and serialize it to XML afterwards.
XStream is a simple library to serialize objects to XML and back again. It can be found here.
I think this is a clean way, but it isn't the easiest way if the XML file is very complex (because you have to map its structure to a Java class).
XMLPullParser isn't really designed for complex tasks AFAIK. SAX Parser is much better for advanced tasks, but it's also not as easy to use.
For manipulating XMLs, you could use SAX Filters. Look at this tutorial (IBM tutorials are great!): http://www.ibm.com/developerworks/xml/library/x-tipsaxfilter/

XML API for best performance

I have an application that works with a lot of XML data, so I want to ask which is the best API for handling XML in Java. Today I'm using the W3C DOM API and, for performance reasons, I want to migrate to another API.
I build XML from scratch, run many transforms, import into databases (MySQL, MSSQL, etc.), export from the database to HTML, modify those XML documents, and more.
Is JDOM the best option? Do you know of anything better than JDOM?
I have read about Javolution. Does anybody use it?
Which API do you recommend?
If you have vast amounts of data, the main thing is to avoid having to load it all into memory at once (because it will use a vast amount of memory, and because it prevents you from overlapping IO and processing). Sadly, I believe most DOM and DOM-like libraries (like dom4j) do just that, so they are not well suited for processing vast amounts of XML efficiently.
Instead, look at using a streaming API, like SAX or StAX. StAX is, in my experience, usually easier to use.
There are other APIs that try to give you the convenience of DOM with the performance of SAX. Javolution might be one; VTD-XML is another. But to be honest, I find StAX quite easy to work with. It's basically a fancy stream, so you just think in the same way as if you were reading a text file from a stream.
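The "fancy stream" idea might look like the following minimal sketch, which pulls events from a StAX `XMLStreamReader` and collects the text of each `title` element. The feed layout and the `readTitles` helper are made up for illustration; only the `javax.xml.stream` calls come from the JDK.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class StaxReadSketch {

    // Pulls events one at a time; only the current event is held in memory.
    static List<String> readTitles(Reader input) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(input);
        List<String> titles = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // getElementText() consumes up to the matching end tag.
                    titles.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<feed><entry><title>First</title></entry>"
                + "<entry><title>Second</title></entry></feed>";
        System.out.println(readTitles(new StringReader(xml)));
    }
}
```

In a real application the titles would probably be processed as they arrive instead of being collected into a list, so memory use stays flat regardless of file size.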
One thing you might try is combining JAXB with StAX. The idea is that you stream the file using StAX, then use JAXB to unmarshal chunks within it. For instance, if you were processing an Atom feed, you could open it, read past the header, then work in a loop unmarshalling entry elements to objects one at a time. This only really works if your format consists of a sequence of independent elements, like Atom; it would be largely useless on something richer like XHTML. You can see examples of this in the JAXB reference implementation and a guy's blog post.
The answer depends on what performance aspects are important for your application. One factor is whether you are handling large XML documents.
For parsing, DOM-based approaches will not scale well to large documents. If you need to parse large documents, non-DOM parsers such as those using SAX and StAX will be faster and less resource intensive. However, if you need to transform XML after parsing, using either XSL or a DOM API, you are going to need the whole document in memory in any case.
For creating XML from code, StAX provides a nice API for this. Since the approach is stream-based, this will scale well to writing very large documents.
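A minimal sketch of stream-based writing with StAX follows. The `items`/`item` element names and the `writeItems` helper are assumptions for the example; in practice the writer would usually wrap a file or network stream rather than a `StringWriter`.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

final class StaxWriteSketch {

    // Emits each element as soon as it is written; no document tree is built.
    static void writeItems(List<String> names, Writer out) throws Exception {
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartDocument();
        writer.writeStartElement("items");
        for (String name : names) {
            writer.writeStartElement("item");
            writer.writeCharacters(name); // escapes markup characters such as & and <
            writer.writeEndElement();
        }
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.close();
    }

    public static void main(String[] args) throws Exception {
        StringWriter out = new StringWriter();
        writeItems(Arrays.asList("apple", "fish & chips"), out);
        System.out.println(out);
    }
}
```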
Well, most developers I know, myself included, use dom4j. If you have the time, you could write a small performance test using both frameworks; then you will see the difference. I prefer dom4j.

File Manipulation libraries

I have a project in which I need to manipulate files: for example, create a new file with a defined structure (header, data, trailer), and then search, validate, create and read such files.
Basically, I want to map the files to objects and vice versa. (I would like to map them to objects because it will be much more comfortable for me to manipulate the fields inside each file via an object.)
I wonder whether any of you have dealt with such things before and could recommend libraries that would ease my work.
Thanks,
Ray.
You may want to look at serialization and de-serialization
If you want custom mapping, you need custom coding. I would suggest you look at DataInputStream and DataOutputStream.
Using these you can control the header, records and footer in any binary format you want.
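As a sketch of what that could look like, the example below writes a header (a magic number and a record count), the records, and a trailer (a total character count used as a crude integrity check), then reads them back. The layout, the `MAGIC` value and the class name are all invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RecordFileSketch {

    // Hypothetical magic number identifying this made-up format.
    static final int MAGIC = 0x52454331;

    static byte[] write(List<String> records) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            // Header: magic number and record count.
            out.writeInt(MAGIC);
            out.writeInt(records.size());
            // Data: one length-prefixed string per record.
            long totalChars = 0;
            for (String record : records) {
                out.writeUTF(record);
                totalChars += record.length();
            }
            // Trailer: total character count.
            out.writeLong(totalChars);
        }
        return bytes.toByteArray();
    }

    static List<String> read(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            if (in.readInt() != MAGIC) {
                throw new IOException("Unexpected header");
            }
            int count = in.readInt();
            List<String> records = new ArrayList<>();
            long totalChars = 0;
            for (int i = 0; i < count; i++) {
                String record = in.readUTF();
                records.add(record);
                totalChars += record.length();
            }
            if (in.readLong() != totalChars) {
                throw new IOException("Trailer does not match data");
            }
            return records;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] file = write(Arrays.asList("alpha", "beta"));
        System.out.println(read(file));
    }
}
```

Swapping the byte-array streams for `FileOutputStream` and `FileInputStream` would apply the same layout to real files.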
I suggest you either generate your serialization code (if you need the fastest possible speed) or use reflection to do the translation. Just using reflection is pretty fast and much simpler than generating code. ;)
In the end I found an ORM framework called Canyon, which maps files to objects, but I still had difficulties, so I implemented my own mapping from files to objects and vice versa.
If you have a defined file layout with varying content, you should consider using a template engine like FreeMarker or Velocity to generate your files.
You define templates, which are then filled with the dynamic content you provide. This is definitely better than using System.out (I mean hard-coding your template text).
A library which helps for basic file manipulation is Apache Commons IO.
If you really want to map your files to objects, then that is serialization/deserialization, as Angelom mentions. Many libraries help you do this, but the file format is fixed:
JSON: Jackson, GSON
XML: JAXB
If you want the file to be readable by a third party as well, how about using a popular existing exchange format such as CSV or XML?
XML is fully supported in the standard library. There are plenty of CSV libraries out there, including Apache Commons CSV.

Best way to parse large XML document in Jython

I need to parse a large (>800MB) XML file from Jython. The XML is not deeply nested, containing about a million relevant elements. I need to convert these elements into real objects.
I've used nu.xom.* successfully before, but now that I've switched from Java to Jython, the library fails with the following message:
The parser has encountered more than "64,000" entity expansions in this document; this is the limit imposed by the application.
I have not found a way to fix this, so I probably have to look for another XML library. It could be either Java or Jython-compatible Python and should be efficient. Pythonic would be great, nu.xom.* is simple but not very pythonic. Do you have any suggestions?
SAX is the best way to parse large documents.
Sounds like you're hitting the default expansion limit.
See this note:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4843787
You need to set System property "entityExpansionLimit" to change
the default.
(added) see also the answer to this question.
Try using the SAX parser, it is great for streaming large XML files.
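A minimal sketch of the SAX approach, written in Java (which Jython can call), might look like this. It turns each `item` element into an object and hands it to a callback as soon as the end tag is seen, so the full document is never held in memory. The `Item` class, the element names and the `parse` helper are assumptions for the example.

```java
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class SaxObjectsSketch {

    // Hypothetical target type for one <item> element.
    static final class Item {
        final String id;
        final String text;

        Item(String id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    // Streams the input and passes each completed Item to the sink.
    static void parse(InputSource source, final Consumer<Item> sink) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            private String currentId;
            private StringBuilder text; // non-null only while inside an <item>

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("item".equals(qName)) {
                    currentId = attrs.getValue("id");
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per element, so append.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("item".equals(qName)) {
                    sink.accept(new Item(currentId, text.toString()));
                    text = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<items><item id=\"1\">first</item><item id=\"2\">second</item></items>";
        parse(new InputSource(new StringReader(xml)),
                item -> System.out.println(item.id + ": " + item.text));
    }
}
```

For an 800 MB file the `InputSource` would wrap a file stream, and the callback would store or process each object rather than print it.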
Does jython support xml.etree.ElementTree? If so, use the iterparse method to keep your memory size down. Read this and use elem.clear() as described.
There is also the lxml Python library, which can parse large files without loading all the data into memory,
but I don't know if it is Jython compatible.

What is best practice in converting XML to Java object?

I need to convert XML data to Java objects. What would be the best practice for converting this XML data to objects?
The idea is to fetch data via a web service (it doesn't use WSDL, just HTTP GET queries, so I cannot use any framework), and the responses are in XML. What would be the best practice for handling this situation?
JAXB is a standard API for doing this: http://java.sun.com/developer/technicalArticles/WebServices/jaxb/
Have a look at XStream. It might not be the quickest, but it is one of the most user friendly and straightforward converters in Java, especially if your model is not complex.
For a JMS project we were marshalling and unmarshalling (going from Java to XML and XML to Java) XML embedded in TextMessages (string property). We tried JAXB, JiBX, and XMLBeans. We found that XMLBeans worked best for us. Fast, easily configurable, good documentation, and easy Maven integration.
I have used and will continue to use JDOM -> www.jdom.org
Another option is a SAX parser. It is procedural, i.e. a visitor pattern, but if the XML is fairly lightweight (or even medium weight) I have found it to be very useful for this.
The JAXB API, which comes built into Java.
I have used JiBX in an MQ module. It works very well. The Ant configuration is simple. I used the Xsd2Jibx converter to generate the binding files and Java beans from an XML schema. Marshalling and unmarshalling let you specify a character-set parameter, which was useful in my project for handling a custom character set. However, I found an issue in the binding compiler: if a Java bean has a long path name, it generates a class file with a long file name, which causes problems on Windows XP (it has a maximum file-name length limit).
I haven't used other APIs, so I am not trying to compare it with others. If you decide to use JiBX, I hope this is helpful.
For more details, please refer to the JiBX website.
I've used XStream as well, it is easy to use and customizable. You can add your own custom converters and that was very handy for me...
I'm surprised more people have not mentioned JiBX. It's an amazing library and, I think, a lot simpler to use than JAXB. Performance is also fab!
For this you can also consider Apache's Betwixt and the Simple framework for XML.
