Which is the best JSON rewriter for Java? - java

Which JSON rewriter is the best for applications written in Java? Criteria may vary. I'm personally most interested in stability and performance.

I am using the one from http://www.json.org. The direct link to the Java code is this:
http://www.json.org/java/index.html.
The nice thing about it is that it does not require any dependencies. You just need to add seven source files to your project and you've got yourself a JSON builder.

This one works just fine: http://json-lib.sourceforge.net/

This JsonTools library is very complete. You can find it at Berlios.

Related

java - tf*idf implementation?

I am basically creating a search engine and I want to implement tf*idf to rank my XML documents based on a search query. How do I implement it, and where do I start? Any help is appreciated.
Surprising that the Weka library hasn't been mentioned here. Weka's StringToWordVector class implements TF-IDF.
I did this in the past, and I used Lucene to get the TF*IDF data.
It took a fair amount of fiddling around though, so if there are other solutions people know are easier, then use them.
Start by looking at TermFreqVector and other classes in org.apache.lucene.index.
tfidf is a standalone Java package that calculates Tf-Idf.
Apache Mahout:
https://github.com/apache/mahout/blob/master/mr/src/main/java/org/apache/mahout/vectorizer/TFIDF.java
I believe it requires a Hadoop File System, which is a bit of extra work. But it works great.
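If pulling in Weka, Lucene, or Mahout is more than you need, the tf*idf weighting itself is small enough to sketch by hand. The following is only a minimal illustration, assuming the common variants tf = count / document length and idf = ln(N / df), a naive tokenizer, and documents that have already been reduced to plain text (for XML you would extract the text first). The class and method names are made up for the example and do not come from any of the libraries mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class TfIdfSketch {

    // Naive tokenizer (an assumption): lower-case, split on anything that is not a-z or 0-9.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Term frequency: occurrences of the term divided by the document length.
    static double tf(List<String> doc, String term) {
        if (doc.isEmpty()) {
            return 0.0;
        }
        long count = doc.stream().filter(term::equals).count();
        return (double) count / doc.size();
    }

    // Inverse document frequency: ln(N / number of documents containing the term).
    // Returns 0 when no document contains the term, to avoid dividing by zero.
    static double idf(List<List<String>> docs, String term) {
        long containing = docs.stream().filter(d -> d.contains(term)).count();
        if (containing == 0) {
            return 0.0;
        }
        return Math.log((double) docs.size() / containing);
    }

    // Score of one document for a query: sum of tf*idf over the query terms.
    static double score(List<String> doc, List<List<String>> docs, String query) {
        double total = 0.0;
        for (String term : tokenize(query)) {
            total += tf(doc, term) * idf(docs, term);
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
                tokenize("the cat sat on the mat"),
                tokenize("the dog chased the cat"),
                tokenize("birds sing in the morning"));
        for (int i = 0; i < docs.size(); i++) {
            System.out.println("doc " + i + ": " + score(docs.get(i), docs, "cat mat"));
        }
    }
}
```

Ranking is then just a matter of sorting the documents by their score for the query. For anything beyond a toy corpus, an inverted index (which is what Lucene gives you) avoids rescanning every document per term.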

Intellij parsing java code

I want to use a math-expression parser written in Java. In particular, I would like to convert a math expression given as a String into an abstract syntax tree consisting of separate nodes.
Can anyone recommend a relevant open source tool?
If not, how feasible would it be to use the IntelliJ source code to do this work?
Which classes are responsible for code parsing and analysis?
Are they included in idea.jar? How can I easily access their functionality (methods etc.)?
I am asking exclusively about IntelliJ.
Take a look at the MVEL library.
If you only want the result of the math expression, you should review this question and the answer I selected months ago:
Java 1.5: mathematical formula parser
Brief description: use Java's integration with dynamic languages like JavaScript to let them do the work for you.
I would not use IntelliJ, as much as I love it.
If you need an AST, look no further than ANTLR. If you can write a grammar for your equations, ANTLR can generate a lexer/parser to create it for you.
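For plain arithmetic, a hand-written recursive-descent parser is also an option before reaching for a generator. The sketch below is not ANTLR output and not anything from IntelliJ; it is a minimal illustration that assumes a grammar of numbers, `+ - * /`, unary minus, and parentheses, and all class names in it are hypothetical.

```java
class ExprParserSketch {

    // AST node types: numeric leaf, unary negation, and binary operation.
    interface Node {
        double eval();
    }

    static final class Num implements Node {
        final double value;
        Num(double value) { this.value = value; }
        public double eval() { return value; }
        @Override public String toString() { return String.valueOf(value); }
    }

    static final class Neg implements Node {
        final Node operand;
        Neg(Node operand) { this.operand = operand; }
        public double eval() { return -operand.eval(); }
        @Override public String toString() { return "(-" + operand + ")"; }
    }

    static final class BinOp implements Node {
        final char op;
        final Node left;
        final Node right;
        BinOp(char op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }
        public double eval() {
            double l = left.eval();
            double r = right.eval();
            switch (op) {
                case '+': return l + r;
                case '-': return l - r;
                case '*': return l * r;
                case '/': return l / r;
                default: throw new IllegalStateException("Unknown operator " + op);
            }
        }
        @Override public String toString() { return "(" + left + " " + op + " " + right + ")"; }
    }

    private final String src;
    private int pos = 0;

    private ExprParserSketch(String src) { this.src = src; }

    // Entry point: parses the whole string or throws IllegalArgumentException.
    static Node parse(String text) {
        ExprParserSketch p = new ExprParserSketch(text);
        Node root = p.expr();
        if (p.peek() != '\0') {
            throw new IllegalArgumentException("Unexpected '" + p.peek() + "' at " + p.pos);
        }
        return root;
    }

    // Skips whitespace and returns the next character, or '\0' at end of input.
    private char peek() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    // expr := term (('+' | '-') term)*
    private Node expr() {
        Node left = term();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            left = new BinOp(op, left, term());
        }
        return left;
    }

    // term := factor (('*' | '/') factor)*
    private Node term() {
        Node left = factor();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            left = new BinOp(op, left, factor());
        }
        return left;
    }

    // factor := '-' factor | '(' expr ')' | NUMBER
    private Node factor() {
        char c = peek();
        if (c == '-') { pos++; return new Neg(factor()); }
        if (c == '(') {
            pos++;
            Node inner = expr();
            if (peek() != ')') throw new IllegalArgumentException("Expected ')' at " + pos);
            pos++;
            return inner;
        }
        int start = pos;
        while (pos < src.length() && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
        if (start == pos) throw new IllegalArgumentException("Expected number at " + pos);
        return new Num(Double.parseDouble(src.substring(start, pos)));
    }

    public static void main(String[] args) {
        Node ast = parse("1 + 2 * (3 - 4)");
        System.out.println(ast + " = " + ast.eval());
    }
}
```

Each grammar rule maps to one method, which is why operator precedence falls out of the call structure. Once you need variables, functions, or better error reporting, a generated parser such as ANTLR's is probably less work to maintain.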

Download Pubmed Abstracts in Java

Does anyone have an implementation of a program that downloads pubmed abstracts with title, author, date, and content to separate plaintext files given a MESH term?
http://www.ncbi.nlm.nih.gov/entrez/eutils/soap/v2.0/DOC/esoap_java_help.html has an example. It worked for me like a charm.
I posted the code as a maven project on github
There is a built-in function for downloading different types of files (for example XML, CSV, and plain text files) right on the PubMed homepage. Just run a search and then select "Send to", where you'll be given a plethora of options.
As an alternative to esoap, you can also use a RESTful API.
Assuming that you want to get all articles with the MESH keyword galactosylceramides, your query would look like this:
http://www.ebi.ac.uk/europepmc/webservices/rest/search/resulttype=core&query=KW:galactosylceramides
Of course, you have to parse the XML result, but I don't think that's a big problem.
There is an example here, but not in Java. http://www.ncbi.nlm.nih.gov/books/NBK25500/

Best way to parse large XML document in Jython

I need to parse a large (>800MB) XML file from Jython. The XML is not deeply nested, containing about a million relevant elements. I need to convert these elements into real objects.
I've used nu.xom.* successfully before, but now that I've switched from Java to Jython, the library fails with the following message:
The parser has encountered more than "64,000" entity expansions in this document; this is the limit imposed by the application.
I have not found a way to fix this, so I probably have to look for another XML library. It could be either Java or Jython-compatible Python and should be efficient. Something Pythonic would be great; nu.xom.* is simple but not very Pythonic. Do you have any suggestions?
SAX is the best way to parse large documents.
Sounds like you're hitting the default expansion limit.
See this note:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4843787
You need to set the System property "entityExpansionLimit" to change the default.
(added) see also the answer to this question.
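As a rough sketch of that suggestion: the property name below is the one given in the answer, while the value 1000000 is an arbitrary choice for illustration, and I am assuming the property has to be set before the parser is created. The class and method names are hypothetical. From Jython the same call should be reachable as java.lang.System.setProperty, or the property can be passed to the JVM with -D at startup.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EntityLimitSketch {

    // Raises the entity expansion limit, then streams the document and counts its elements.
    static int countElements(String xml, int entityLimit)
            throws ParserConfigurationException, SAXException, IOException {
        // Set the limit first (assumption: it is read when the parser is created).
        System.setProperty("entityExpansionLimit", Integer.toString(entityLimit));

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        final int[] count = {0};
        parser.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attrs) {
                        count[0]++;
                    }
                });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // 1000000 is an arbitrary illustrative value, not a recommendation.
        System.out.println(countElements("<a><b/><b/></a>", 1000000));
    }
}
```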
Try using the SAX parser; it is great for streaming large XML files.
Does Jython support xml.etree.ElementTree? If so, use the iterparse method to keep your memory size down. Read this and use elem.clear() as described.
There is the lxml Python library, which can parse large files without loading all the data into memory, but I don't know if it is Jython compatible.

What is best practice in converting XML to Java object?

I need to convert XML data to Java objects. What would be best practice to convert this XML data to object?
The idea is to fetch data via a web service (it doesn't use WSDL, just HTTP GET queries, so I cannot use any framework), and the answers are in XML. What would be best practice to handle this situation?
JAXB is a standard API for doing this: http://java.sun.com/developer/technicalArticles/WebServices/jaxb/
Have a look at XStream. It might not be the quickest, but it is one of the most user friendly and straightforward converters in Java, especially if your model is not complex.
For a JMS project we were marshalling and unmarshalling (going from java to xml and xml to java) XML embedded in TextMessages (string property). We tried JAXB, Jibx, and XMLBeans. We found that XMLBeans worked best for us. Fast, easily configurable, good documentation, and easy Maven integration.
I have used and will continue to use JDOM -> www.jdom.org
Another option is a SAX parser. It is procedural (i.e. a visitor pattern), but if the XML is fairly lightweight (or even medium weight), I have found it to be very useful for this.
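To give an idea of what the SAX approach looks like when mapping XML to objects, here is a minimal sketch using only the parser that ships with the JDK. The `Person` type and the `<person id="..."><name/><age/></person>` layout are invented for the example; your element names and fields will differ.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxToObjectsSketch {

    // Hypothetical target type for the example.
    static final class Person {
        String id;
        String name;
        int age;
    }

    // Collects one Person per <person> element as the parser streams through the input.
    static final class PersonHandler extends DefaultHandler {
        final List<Person> people = new ArrayList<>();
        private Person current;
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            text.setLength(0); // start collecting text for the new element
            if ("person".equals(qName)) {
                current = new Person();
                current.id = attrs.getValue("id");
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per element, so append rather than assign.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (current == null) {
                return;
            }
            switch (qName) {
                case "name":
                    current.name = text.toString().trim();
                    break;
                case "age":
                    current.age = Integer.parseInt(text.toString().trim());
                    break;
                case "person":
                    people.add(current);
                    current = null;
                    break;
                default:
                    break;
            }
        }
    }

    static List<Person> parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        PersonHandler handler = new PersonHandler();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.people;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<people>"
                + "<person id=\"1\"><name>Ada</name><age>36</age></person>"
                + "<person id=\"2\"><name>Alan</name><age>41</age></person>"
                + "</people>";
        for (Person p : parse(xml)) {
            System.out.println(p.id + " " + p.name + " " + p.age);
        }
    }
}
```

The handler grows with every element you care about, which is the main cost compared with binding tools like JAXB or XStream. In exchange there are no extra dependencies and nothing is held in memory beyond the objects you build.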
The JAXB API, which comes built into Java (it was bundled with Java SE 6 through 10 and removed from the JDK in Java 11).
I have used JiBX in an MQ module. It works very well, and the Ant config is simple. I used the Xsd2Jibx converter to generate the binding files and Java beans from an XML schema. Marshalling and unmarshalling allow you to specify a character-set parameter, which was useful in my project for handling a custom character set. I did find an issue in the binding compiler, though: if the Java bean has a long path name, it generates a class file with a long file name, which causes problems on Windows XP (it has a maximum path length limit).
I haven't used other APIs, so I am not trying to compare JiBX with them. If you decide to use JiBX, I hope this will be helpful.
For more details, please refer to the JiBX website.
I've used XStream as well, it is easy to use and customizable. You can add your own custom converters and that was very handy for me...
So surprised more people have not mentioned JiBX. Amazing lib, and I think a lot simpler to use than JAXB. Performance is also fab!
For this you can also consider Apache's Betwixt and the Simple framework for XML.
