Python's NLTK vs. related Java Libraries? [closed] - java

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I've used LingPipe, Stanford's NER, RiTa and various sentence similarity libraries for my previous Java projects that focused on text (pre)processing (indexing, XML tagging, topic detection, etc.) of large amounts of English text (around 10,000 documents totalling more than 1 GB of text). Maybe I'm a bad Java programmer, but I find myself typing a lot of code and using a lot of libraries whenever I switch to a different corpus. Overall, I feel like there might be a better tool for the job.
I guess my question is, will I benefit from switching to Python and NLTK for information retrieval / language processing? Or are there enough pros and cons to make it very subjective? Is NLTK intuitive enough to be learned quickly?
I'd get my hands dirty, but I won't have access to a personal machine for the next few days.

NLTK is good for natural language processing. I've used it for my data-mining project. You can train your own analyzer, and the learning curve is not steep.
NLTK ships with a large collection of corpora for training your analyzer. You can also provide your own data set, for example a journal that has been part-of-speech tagged.
Because Python is very good at text processing, you may want to give it a try. Plus, there is an online tutorial.
When this was written, NLTK was best used with Python 2.x (for example Python 2.6) and did not yet work well with Python 3.x. NLTK 3.0 and later support Python 3.

If you already understand the basics of NLP, I think NLTK should be pretty easy to pick up. It's got a bunch of documentation, 2 books, and I've written a number of articles & tutorials on streamhacker.com. And if there's anything from the Java packages you don't want to lose, you could theoretically combine it with NLTK using Jython (and perhaps execnet).
You also may want to take a look at the Pattern library.

Related

FFT Sound analysis [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm trying to write some code that will (a) take in sound from my computer's microphone and (b) output the frequency (i.e. pitch) of the sound. It does not have to be very precise, but it has to work. I have spent many hours perusing various forums on this subject, and the answers there would probably all be very useful if I had more knowledge of the subject. However, I am not a particularly experienced coder and most of the answers I've seen go over my head. I understand that I may have bitten off more than I can chew, considering that I'm a novice, but if anyone could give a really down-to-earth, easy-to-understand walkthrough of how I should go about implementing this, I would be very appreciative. Please forgive my basic question :).
I was looking to write it in Java but have experience in Python and Swift as well.
There are a lot of solutions to your problem. If you're good at math, you can look at the definition of the FFT and implement the formula yourself.
However, that job has already been done by other programmers, and there are a lot of different libraries that implement the FFT.
In Python, you can use numpy. Or, if you prefer Java, you can use this snippet:
http://introcs.cs.princeton.edu/java/97data/FFT.java.html
To read from the microphone, you can use:
https://docs.oracle.com/javase/tutorial/sound/capturing.html
(there's a sample for acquiring audio from microphone here:
Java Sound API - capturing microphone)
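So you just need to take the capture code from the second link, read the data as 16-bit big-endian PCM, and pass it to the FFT function. The FFT bin with the largest magnitude then gives the dominant frequency (bin index × sample rate / FFT size).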
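To make the steps above concrete, here is a minimal sketch in plain Java of the part that comes after capture: decoding 16-bit big-endian PCM bytes into samples, running an FFT, and picking the loudest bin. It does not use the Princeton snippet; the class and method names (PitchSketch, decodePcm16BigEndian, fft, dominantFrequency) are made up for illustration, and it assumes mono audio and a buffer whose sample count is a power of two. The Hann window is an addition of this sketch to reduce spectral leakage, not something the answer requires.

```java
/** Illustrative sketch: PCM bytes -> samples -> FFT -> loudest frequency bin. */
final class PitchSketch {

    /** Decodes signed 16-bit big-endian mono PCM into samples in [-1, 1). */
    static double[] decodePcm16BigEndian(byte[] pcm) {
        double[] samples = new double[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int hi = pcm[2 * i];            // high byte stays signed to keep the sign
            int lo = pcm[2 * i + 1] & 0xFF; // low byte is treated as unsigned
            samples[i] = ((hi << 8) | lo) / 32768.0;
        }
        return samples;
    }

    /** In-place iterative radix-2 FFT; the array length must be a power of two. */
    static void fft(double[] re, double[] im) {
        int n = re.length;
        if (Integer.bitCount(n) != 1 || im.length != n) {
            throw new IllegalArgumentException("length must be a power of two");
        }
        // Reorder the input into bit-reversed index order.
        for (int i = 1, j = 0; i < n; i++) {
            int bit = n >> 1;
            for (; (j & bit) != 0; bit >>= 1) {
                j ^= bit;
            }
            j ^= bit;
            if (i < j) {
                double t = re[i]; re[i] = re[j]; re[j] = t;
                t = im[i]; im[i] = im[j]; im[j] = t;
            }
        }
        // Combine sub-transforms of length 2, 4, 8, ... with butterflies.
        for (int len = 2; len <= n; len <<= 1) {
            double angle = -2 * Math.PI / len;
            for (int start = 0; start < n; start += len) {
                for (int k = 0; k < len / 2; k++) {
                    double wr = Math.cos(angle * k);
                    double wi = Math.sin(angle * k);
                    int a = start + k;
                    int b = a + len / 2;
                    double tr = re[b] * wr - im[b] * wi;
                    double ti = re[b] * wi + im[b] * wr;
                    re[b] = re[a] - tr;
                    im[b] = im[a] - ti;
                    re[a] += tr;
                    im[a] += ti;
                }
            }
        }
    }

    /** Returns the centre frequency (Hz) of the strongest non-DC bin. */
    static double dominantFrequency(double[] samples, double sampleRate) {
        int n = samples.length;
        if (n < 4) {
            throw new IllegalArgumentException("need at least 4 samples");
        }
        double[] re = new double[n];
        double[] im = new double[n];
        for (int i = 0; i < n; i++) {
            // Hann window (assumption of this sketch) to reduce leakage.
            double w = 0.5 * (1 - Math.cos(2 * Math.PI * i / (n - 1)));
            re[i] = samples[i] * w;
        }
        fft(re, im);
        int best = 1; // skip bin 0, which only holds the DC offset
        double bestPower = -1;
        for (int k = 1; k < n / 2; k++) {
            double power = re[k] * re[k] + im[k] * im[k];
            if (power > bestPower) {
                bestPower = power;
                best = k;
            }
        }
        return best * sampleRate / n;
    }
}
```

The result is only accurate to one bin width, which is the sample rate divided by the FFT size (about 7.8 Hz for 1024 samples at 8000 Hz). When capturing with javax.sound.sampled, a format such as new AudioFormat(8000f, 16, 1, true, true) produces the signed, big-endian, mono bytes this sketch expects.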
I've been using Processing for a while now and it has a couple of nice audio libraries with FFT support. Processing is itself a Java library, so you might want to give it a shot (you can use it in Eclipse, NetBeans, etc. if the default minimal IDE isn't suitable).
You don't have to use Processing with these libraries though; they are Java libraries after all.
Minim has an FFT class with forward() and logAverages().
Beads also has an FFT class, and there's a book available that goes into more detail on analysis.
Both libraries also offer support for sound input.

Java: SQL and Statistics/Machine Learning [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have a question concerning Java. I am mainly a Java user and have done most of my work with it. However, in the machine learning classes I took in college, we mostly used Python with the scikit-learn and numpy packages.
Now I want to do a project where I crawl data from the web, store it in SQL databases, and then do machine learning on this data. Maybe some of you have experience with these things and can share some of it? Of course it is possible to do all of this in Java, but perhaps you have had experiences that suggest I should use something else, or that point to things I should consider?
I'd be happy to hear all your thoughts :-)
Have a great weekend!
It turns out that programming language and database implementation are secondary problems. Think first about the machine learning you want to do. Review the existing packages (in any language) and pick one according to how well it fits the needs of the business problem you are trying to solve. Then work with whatever language is most convenient for that package. You will probably find that no single language is suitable for all parts of the problem; you will end up gluing together Java, Python, R, shell scripts, etc, to make a complete solution, and there's nothing wrong with that. Consider that your job is problem solving instead of programming in a specific language and go from there.

Java API for Auto regression (AR), ARIMA, Time Series Analysis [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 6 years ago.
I am looking for an open-source or free Java API for time series analysis using AR, ARIMA, etc. I need this API for DDoS attack analysis.
I googled around and found two candidates, but neither completely solves the problem:
1) The same question was asked earlier on Stack Overflow and an answer suggested the SuanShu API, but that API is not free.
2) The Apache Math library, which provides other forms of regression (simple, OLS, GLS, etc.) but not autoregression.
I also checked machine learning APIs such as Mahout, but no luck yet. Please suggest an appropriate API.
I spent my 4th year Computing project on implementing time series forecasting for Java heap usage prediction using ARIMA, Holt-Winters, etc., so I might be in a good position to advise you on this.
Your best option by far is the R language: you can call the forecasting libraries provided by R from Java by using the JRI library found here. R is well documented, free and open source. You can even run R on a server and make calls to it via the command line using Rserve, which then returns forecasts over HTTP; JRI is the local equivalent, if memory serves me correctly.
If you have any questions, let me know.
Have a look at spark-timeseries. The source code is mostly Scala, but it's relatively simple to use the library from within Java. If you're in a place where you are doing time series analysis on the JVM, then you should consider learning Spark/Scala anyways.
The library is young as of this writing and has room for improvement and growth, but as of version 0.3 it implements AR, ARIMA, simple exponential smoothing (EWMA), and Holt-Winters smoothing. Its areas for improvement are a better automatic ARIMA algorithm, support for seasonal ARIMA, and state space modeling, but it's already very useful.
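For readers who only need the simplest autoregressive model and cannot bring in R or Spark, an AR(1) model x[t] = c + phi * x[t-1] can be estimated by ordinary least squares on the lagged series. The sketch below shows that idea in plain Java. It is not taken from spark-timeseries, R or any other library mentioned here, the names (Ar1Sketch, fit, forecastNext) are made up for illustration, and it covers neither higher orders nor the differencing and moving-average parts of ARIMA.

```java
/** Illustrative sketch: fit x[t] = c + phi * x[t-1] by ordinary least squares. */
final class Ar1Sketch {

    /** Returns {c, phi} minimising the sum of squared one-step errors. */
    static double[] fit(double[] x) {
        int m = x.length - 1; // number of (previous, next) pairs
        if (m < 2) {
            throw new IllegalArgumentException("need at least 3 observations");
        }
        double meanPrev = 0;
        double meanNext = 0;
        for (int t = 1; t <= m; t++) {
            meanPrev += x[t - 1];
            meanNext += x[t];
        }
        meanPrev /= m;
        meanNext /= m;

        // Slope of the regression of x[t] on x[t-1] is cov / var.
        double cov = 0;
        double var = 0;
        for (int t = 1; t <= m; t++) {
            double dPrev = x[t - 1] - meanPrev;
            cov += dPrev * (x[t] - meanNext);
            var += dPrev * dPrev;
        }
        if (var == 0) {
            throw new IllegalArgumentException("series is constant");
        }
        double phi = cov / var;
        double c = meanNext - phi * meanPrev;
        return new double[] {c, phi};
    }

    /** One-step-ahead forecast from the last observation. */
    static double forecastNext(double[] x, double[] coef) {
        return coef[0] + coef[1] * x[x.length - 1];
    }
}
```

Calling Ar1Sketch.fit(series) returns the intercept and the lag-1 coefficient, and forecastNext(series, coef) gives the next predicted value. A production forecast would normally also check stationarity and residuals, which is where R or a dedicated library earns its keep.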

Convert prolog application to a JVM based language? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I have a legacy ISO Prolog application of medium size that I would like to move to a JVM-based language. The application is a command-line tool that parses text files, does some evaluation and transformation, and then exports a text-based file.
My team develops mainly in Java, so we have a lot of existing Java competence and reusable components. Prolog competence, however, is very low.
I don't expect there to be a tool that takes Prolog source code and transforms it into some other language. But I'm trying to understand what would be the easiest solution. Starting from scratch in Java or using a more functional language like Clojure?
But I'm trying to understand what would be the easiest solution.
Some implementations of Prolog run on the JVM platform. Wikipedia lists 5 of them here: http://en.wikipedia.org/wiki/Comparison_of_Prolog_implementations. So maybe the easiest solution is to train someone on your team in Prolog and just port the application to a JVM Prolog implementation. (Which might be a simple thing ...)
Someone on your team is likely to need Prolog skills anyway to successfully translate Prolog to some other language.
However, I recognize that there could be other reasons to translate; e.g. if the existing Prolog code needs a major overhaul anyway.
That mainly depends on your team's skills. You mentioned that your team has pretty good Java skills; why not start with that?
If they don't know LISP, they will spend a lot of time learning it from scratch. Learning LISP is quite an investment, but it definitely pays off in the end.
Although Clojure would help you a lot in your case (because of its strengths in data flow and data transformation), I would say that Java is a better bet since your team is competent with it.
You could consider using Clojure together with core.logic (tutorial) which is a miniKanren implementation. You would need some logic/functional programming skills but you could stay on the JVM.
Prolog is very different from Java and other object-oriented languages. I studied it to see another way of programming.
But I don't think there is a magic solution for converting a Prolog app to a Java app. The logic is not the same, and no other language is like Prolog. I think you will have to analyse exactly what your Prolog app does and start from scratch with a new Java app.

Examples of Object-Oriented Projects Help Procedural Programmers [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 7 years ago.
Please help me identify some small to medium sized open source projects that embody object oriented design (preferably in C++ or Java). I would like to use these projects to demonstrate how real world problems (as opposed to contrived text book examples) can be solved with an object oriented design. I want to be able to present a plausible explanation of why certain things were chosen to be objects and how they all work together to solve a problem.
Google Chromium (C++): windows, tabs, plugins etc. are all classes.
The Unreal Tournament Public Source Code (432 Headers) contains the declarations of the Unreal engine class library written in C++. I found it to be a rich example of a large object-oriented program. It taught me a lot about how to modularize and object-orient my code. It also demonstrates many tactics for getting a handle on a large code base.
Also, because all you can read are header files, you'll have a fun (and educational) time trying to figure out how the whole thing comes together. (I actually ended up writing my own x86 disassembler so I could cheat and read some of the definitions!)
On the same note, the Doom 3 SDK contains a large chunk of the Doom/Quake engine written in very readable C++.
Just about any large project designed in Java is object-oriented, almost by definition. You can take a look at Apache Hadoop as a large-scale, open-source, object-oriented project written in Java. Another is Apache Ant.
Eclipse would be a good example on the Java side: the plugin architecture is all object oriented.
I asked my OO mentor the same thing. He pointed me to the JUnit sources, with the recommendation to see how the code evolved version by version. This would show you how Kent Beck writes Java code.
Another example in this vein would be the sources of Fit by Ward Cunningham.
