Is there any statistical library for JavaScript? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
I need to implement some statistical tests, such as the t-test, ANOVA, and Wilcoxon, in JavaScript.
Similar to Java's Apache Commons Math library, is there a statistical test library or code for JavaScript?

jStat : a JavaScript statistical library
https://github.com/jstat/jstat

OpenEpi is an open-source JavaScript stats library that has ANOVA and t-tests. I've not tried it (it's a bit too focused on epidemiology for my needs), but it might be useful.
jStat is a JavaScript statistical library project, and it looks like it's got a great future, but it might not have all you need right now. Edit: as of Dec 2012, the jStat project page appears to be no longer maintained, but the project is still being developed, and there's more up-to-date documentation on GitHub. It now has ANOVA tests and several varieties of t-test. No sign of the Wilcoxon signed-rank test, though.
If you urgently need very specific statistical processing in JavaScript, you might have the most success browsing Omegahat, which hosts various small tools that bridge the established stats language R with other languages, including JavaScript.
It'll depend on the details of exactly what you want to do, but you might have some success with packages such as RJavascript, a code translator that aims to help turn existing R features into JavaScript (just don't expect quality results the first time). Also, SpiderMonkey builds on R for browsers, so it might be useful for internal or personal use (but it's unlikely to be suitable for public publishing).
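If it turns out no library covers the test you need, the statistic itself is usually only a few lines. Here is a minimal sketch of Welch's two-sample t-test, written in Python rather than JavaScript purely for illustration (the arithmetic ports directly). The function name `welch_t` and the sample data are made up for this example. Turning t into a p-value still needs the Student's t CDF, which is the part a statistics library would normally supply.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(a), len(b)
    # Sample variances (n - 1 denominator), each scaled by its sample size.
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"t = {t:.3f}, df = {df:.2f}")
```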

Some years ago I ported https://code.google.com/p/statistics-distributions-js/ so that I could use it in http://elem.com/~btilly/effective-ab-testing/ - it may have the functionality you need if you only need simple things.

If you're looking for a simple library for descriptive statistics, you could use javascriptstats.com
It does:
Mean
Median
Mode
Range
Variance
Standard Deviation
Best!
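For reference, the six measures listed above are small enough to compute by hand if you'd rather not pull in a dependency. A rough sketch in Python (not JavaScript, and not the API of the site mentioned above), using the standard `statistics` module; the data is invented, and I'm assuming population variance here, which may not be what a given library reports:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "mode": statistics.mode(data),
    "range": max(data) - min(data),
    # Population variance and standard deviation (divide by n, not n - 1).
    "variance": statistics.pvariance(data),
    "std_dev": statistics.pstdev(data),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```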

Leveraging a related answer:
The following blog post lists some recent packages: http://jgoodall.me/posts/2012/02/01/javascript-statistical-libraries/
As mentioned by others, native JS is a far cry from R, which web-wise has progressed from RApache (http://rapache.net/) to shiny (http://www.rstudio.com/shiny/). The latter uses node.js server-side, so this is quite promising. Of course both approaches will require you to code stats in R server-side, instead of using JS either on client or server.
Marc

Related

Bayesian networks in Scala [closed]

Closed 3 years ago.
I'm looking for a library to create Bayes nets and perform learning and inference on them in Scala (or Java, in case of lack of a better solution). The library should be actively maintained, performant, preferably easy, definitely well-documented unless the usage is really straightforward. Free, open-source and commercial alternatives are ok, but for commercial solutions a free trial is required.
An ideal solution would be the equivalent of what in the .NET world is Infer.NET by Microsoft Research, but more documented.
Thanks in advance!
FACTORIE is a young project, but it fits the bill and is implemented in Scala:
FACTORIE is a toolkit for deployable probabilistic modeling,
implemented as a software library in Scala. It provides its users with
a succinct language for creating relational factor graphs, estimating
parameters and performing inference.
It's developed by Andrew McCallum and his lab at UMass, who are also responsible for the hugely useful MALLET machine learning toolkit.
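In case it helps to see what "inference" means at the smallest possible scale, here is a sketch that has nothing to do with FACTORIE's API: a hypothetical two-node network (Rain -> WetGrass) in plain Python, with made-up probabilities, answered by enumerating the joint distribution. Real libraries exist precisely because this brute-force approach stops scaling after a handful of variables.

```python
# Hypothetical two-node network: Rain -> WetGrass. All numbers are invented.
P_RAIN = 0.2
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}


def joint(rain, wet):
    """P(Rain=rain, WetGrass=wet) as a product of the two factors."""
    p_rain = P_RAIN if rain else 1 - P_RAIN
    p_wet = P_WET_GIVEN_RAIN[rain] if wet else 1 - P_WET_GIVEN_RAIN[rain]
    return p_rain * p_wet


def posterior_rain(wet):
    """P(Rain=True | WetGrass=wet) by enumerating the joint distribution."""
    evidence = sum(joint(rain, wet) for rain in (True, False))
    return joint(True, wet) / evidence


print(round(posterior_rain(True), 4))
```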
You might want to look into SMILE. It is free and has a Java API. Other free options in Java are UnBBayes and SamIam.
SMILE
SMILE (Structural Modeling, Inference, and Learning Engine) is a fully
portable library of C++ classes implementing graphical
decision-theoretic methods, such as Bayesian networks and influence
diagrams, directly amenable to inclusion in intelligent systems.
UnBBayes
UnBBayes is a probabilistic network framework written in Java. It has
both a GUI and an API with inference, sampling, learning and
evaluation. It supports BN, ID, MSBN, OOBN, HBN, MEBN/PR-OWL, PRM,
structure, parameter and incremental learning.
SamIam
Samiam includes two main components: a graphical user interface and a
reasoning engine. The graphical interface lets users develop Bayesian
network models and save them in a variety of formats. The reasoning
engine supports many tasks including: classical inference; parameter
estimation; time-space tradeoffs; sensitivity analysis; and
explanation-generation based on MAP and MPE.
Pure Scala and free options are FACTORIE (already mentioned) and Figaro. However, Figaro currently lacks a learning component.
Figaro - Probabilistic Modeling
Figaro models are data structures in the Scala programming language,
which is interoperable with Java, and can be constructed, manipulated,
and used directly within any Scala or Java program.
Perhaps Banjo fits the bill? I'm not sure how actively it is developed, but I know it has been around for at least a few years ... (never used it myself).
Banjo: Bayesian Network Inference with Java Objects
Some Java alternatives to Infer.NET were presented as answers to this question. So I think you're basically asking for either a follow-up to that question (it was asked during the second half of 2010) with respect to Java, or a fully Scala-based solution.
There is a Scala lib out there by now:
https://github.com/danielkorzekwa/bayes-scala

Double-entry accounting libraries for Java? [closed]

Closed 5 years ago.
What double-entry accounting libraries are available for Java?
I did write a library for myself, but since it was for a really trivial application, I don't know if it would suit a general purpose accounting need.
It has an interface like:
ledger.newPosting(new Date(), "Received $10 from Anne")
    .debit("Cash:Anne", 1000)
    .credit("Dues Received", 1000)
    .post();
int cashBalance = ledger.getAccount("Cash").getTrialBalance();
assertEquals(-1000, cashBalance);
int anneBalance = ledger.getAccount("Cash:Anne").getTrialBalance();
assertEquals(-1000, anneBalance);
int duesBalance = ledger.getAccount("Dues Received").getTrialBalance();
assertEquals(1000, duesBalance);
Is this the kind of thing you're looking for? Anyone else actually INTERESTED in this code? I wrote it generically, but never published it because I didn't think anyone would want something this trivial.
There's a Swedish project called fribok.org (free (as in GNU free) accounting). It's an application too, but might be componentized and contain what you look for (given that GPL is a viable option for you).
I've seen JMoney used with custom plug-ins. What are you trying to do?
Well, I am not aware of any such libraries. Personally, I think a double-entry accounting framework would boil down to a couple of interfaces and minimal code to enforce the accounting-equation invariants. Hence no libs for that: try borrowing a relevant code snippet from JMoney or something like that...
;)
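To make the "minimal code to ensure the invariants" point concrete, here is a rough sketch in Python (not Java, and not taken from JMoney or any library named in this thread). Everything in it is an assumption for illustration: the `Ledger` class, integer cents as amounts, and the sign convention (debits positive, credits negative), which is not necessarily the convention used by the snippet earlier in this thread. The one real invariant is that a posting is rejected unless its lines sum to zero.

```python
from collections import defaultdict


class Ledger:
    """Toy ledger: amounts are integer cents; debits positive, credits negative."""

    def __init__(self):
        self._balances = defaultdict(int)

    def post(self, description, lines):
        """Apply (account, amount) lines together; reject unbalanced postings."""
        if sum(amount for _, amount in lines) != 0:
            raise ValueError(f"unbalanced posting: {description}")
        for account, amount in lines:
            self._balances[account] += amount

    def balance(self, account):
        """Balance of an account plus any 'Parent:Child' sub-accounts."""
        return sum(
            amount
            for name, amount in self._balances.items()
            if name == account or name.startswith(account + ":")
        )

    def trial_balance(self):
        """Sum over all accounts; zero as long as every posting balanced."""
        return sum(self._balances.values())


ledger = Ledger()
ledger.post("Received $10 from Anne", [("Cash:Anne", 1000), ("Dues Received", -1000)])
print(ledger.balance("Cash"), ledger.balance("Dues Received"), ledger.trial_balance())
```

Because the check runs before any balance is touched, a rejected posting leaves the ledger unchanged.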
How about jLedger - Java Business Accounting API?
Citing the project's home page: "This is a Java Business Accounting API that consist of invoicing, general ledger, stock/inventory control and other business API that will assist java developer to build a business software with ease."
Note, however, that this project releases the software under the GNU GPL v2 license, not the Apache license that's usually associated with Java-related projects.
GNU GPL is a copyleft license and libraries licensed under it may not be appropriate for internally developed or commercial software.
There is this more recent implementation using JTA and Spring. As it states:
The Double-entry bookkeeping concept implemented with Spring 4, the
Java Transaction API and the H2 database in embedded mode
The best I have seen is a jPOS module called minigl, which is part of jpos-ee. The jPOS framework is used widely in many production-grade deployments. I have personally used it at scale on some high-profile projects.
You will need to get up to speed on jpos-ee, a very solid Java framework for all things payment and fintech related. It is worth the learning curve because, if you are asking about ledgers, you probably have other needs that are likely already addressed in the jPOS codebase.
I just wrote a java library for accounting. The beauty of my library is that it uses a 4GL to do the credits, debits and ledgers. You can also import other functions to handle inventory, payroll and things like that. Fetal Libraries

C++ code coverage tool [closed]

Closed 7 years ago.
I am looking for a C++ code coverage tool that fares well in a multi-server setup and on both Windows and Linux, without licensing issues (if non-free).
I have done some research and found 2 free tools: Covtool and gcov. Any disadvantages on these or any other suggestions?
Although I don't remember all the details of my research for code coverage tools, I seem to remember the following about gcov and covtool:
They require custom modifications to your build system
They need custom compiler flags and/or link steps
They both provide minimal output and formatting
We needed support for Windows/Linux and gcc/MSVC and settled on BullseyeCoverage which is commercial and non-free. We estimated that it would cost us more, in money, to change our build system to use the free products than it would to pay for a BullseyeCoverage license. Their support was great and responsive and I was very pleased with the quality of the tool.
Some benefits:
Great query support both in command line and GUI form
Required no changes to our build system
Had minimal impact on both compile time and run time
Provides tools to integrate with build bots such as CruiseControl and Hudson
Nice GUI for visualization and navigation of coverage results
AQTime is popular for Delphi/C++Builder users, but like the other recommendation, it is not free.
Use the gcov tool along with LCOV. LCOV is a graphical front-end for gcov.
The OovAide program is a free open source tool that will instrument source files
and generate code coverage statistics as well as show which lines were never
run. It is thread safe and efficient.
It is fairly transparent, meaning that the code that it produces is all visible
and can be modified for your project if there are special needs required.
The basic idea of the source code modifications is that it inserts a macro
at every grouping of statements in the AST that Clang is processing.
This is typically after the conditionals or at braces. The macro can be
modified, but the default is that it increments a value at an offset in
an array. I have also modified it to write to a file in some cases,
and this allows a program trace of execution.
One problem may be that its build system is limited, and the project must be
buildable using Clang. It may not work on certain types of projects. But since
it just modifies source code by inserting the macro, it is possible to
use it to modify the source code, then use the existing build system to
build the modified source code.
There is a document describing how it works here. http://oovaide.sourceforge.net/articles/TestCoverage.html
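The counter-array idea described above is easy to picture with a toy example. This is a sketch in Python, not C++, and not OovAide's actual macro or output: `probe`, `hits`, and the offsets are hypothetical stand-ins for what an instrumenter would insert at each statement group.

```python
# One counter per instrumented statement group; the offsets would be assigned
# by the (hypothetical) instrumenter when it rewrites the source.
hits = [0] * 3


def probe(offset):
    """Stand-in for the inserted macro: bump the counter at this offset."""
    hits[offset] += 1


def classify(n):
    probe(0)                # function entry
    if n < 0:
        probe(1)            # 'then' branch
        return "negative"
    probe(2)                # fall-through path
    return "non-negative"


classify(5)
classify(7)

# Offsets whose counter is still zero were never executed.
never_run = [offset for offset, count in enumerate(hits) if count == 0]
print(hits, never_run)
```

Swapping the body of `probe` for a file write is what turns the same instrumentation into the execution trace mentioned above.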

Starting out NLP - Python + large data set [closed]

Closed 3 years ago.
I've been wanting to learn Python and do some NLP, so I have finally gotten round to starting. I downloaded the English Wikipedia mirror for a nice chunky dataset to start on, and have been playing around a bit, at this stage just getting some of it into a sqlite db (I haven't worked with databases in the past, unfortunately).
But I'm guessing sqlite is not the way to go for a full-blown NLP project (or experiment :) - what sort of things should I look at? HBase (and Hadoop) seem interesting; I guess I could run them in Java, prototype in Python, and maybe migrate the really slow bits to Java... Alternatively I could just run MySQL, but the dataset is 12 GB, and I wonder if that will be a problem. I also looked at Lucene, but I'm not sure how (other than breaking the wiki articles into chunks) I'd get that to work.
What comes to mind for a really flexible NLP platform? (I don't really know at this stage WHAT I want to do; I just want to learn large-scale language analysis, to be honest.)
Many thanks.
NLTK is where you should start (it's Python-based -- not sure why you're already thinking about parallelizing your processing at such an early stage... start with a more flexible experimental setup, is my advice). sqlite should be fine for a few GB -- if you need more advanced and standard SQL power, you could consider PostgreSQL.
There is a related talk from PyCon 2010: "The Python and the Elephant: Large Scale Natural Language Processing with NLTK and Dumbo".
The link has introductory information, slides and video.
I think sqlite is still a good choice for 12 GB of data. I have a text classification training set of a similar size, and both sqlite and plain text are fine as long as you just iterate over it line by line.
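Roughly what I mean by iterating, as a minimal sketch with the standard `sqlite3` module. The `articles` table, its columns, and the sample rows are made up for the example, and it uses an in-memory database so it runs anywhere; a real run would open the file on disk instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real run would open a file on disk
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO articles (body) VALUES (?)",
    [("first article text",), ("second article",), ("third",)],
)

# Iterating the cursor fetches rows as they are needed, so the whole table
# never has to be held in memory at once.
total_words = 0
for (body,) in conn.execute("SELECT body FROM articles ORDER BY id"):
    total_words += len(body.split())

print(total_words)
conn.close()
```

The thing to avoid on a table this size is `fetchall()`, which materializes every row before you see the first one.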
It is most likely that you are going to use a vector space model to represent the text while doing the analysis.
In which case, you should look at platforms that can help you store term vectors with term frequencies. It makes your life so much easier.
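To show what a term vector with term frequencies looks like before any platform gets involved, here is a small sketch in plain Python. The tokenizer (lowercase, letters only), the function name `term_frequencies`, and the two sample documents are assumptions for illustration; this is not how Lucene or any other engine tokenizes or stores them.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Lowercase, split on non-letters, and count each term."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


docs = {
    "d1": "The cat sat on the mat.",
    "d2": "The dog chased the cat.",
}
vectors = {doc_id: term_frequencies(text) for doc_id, text in docs.items()}

# A shared, sorted vocabulary turns each Counter into a fixed-length vector.
vocabulary = sorted(set().union(*vectors.values()))
dense = {doc_id: [tf[term] for term in vocabulary] for doc_id, tf in vectors.items()}

print(vocabulary)
print(dense["d1"])
```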
Have a look at Apache Lucene, which has a Python library for accessing Java Lucene. Elasticsearch is also a good alternative; it uses Apache Lucene underneath and has a really good Python package. Elasticsearch also exposes a REST API.
PostgreSQL is also really good at storing tokens. Check out this article to learn more.
I have worked with sizable language data before and I personally prefer Lucene/Elasticsearch for analysis projects.
Cheers.
Summary from the internet:
spaCy is a natural language processing (NLP) library for Python designed for fast performance, and with word embedding models built in, it's perfect for a quick and easy start.
Gensim is a topic modelling library for Python that provides access to Word2Vec and other word embedding algorithms for training, and it can also load pre-trained word embeddings that you download from the internet.
NLTK details already given above.
Stanford NLP has recently launched a Python framework that supports 50+ languages. You should definitely check it out.
There are many others, but the above four are the most usable in terms of community support and up-to-date features.
I personally prefer spaCy.
spaCy is one of the fastest of all and can use gensim and other APIs integrated into its model.
Moreover, spaCy has models for many languages in its alpha stage, making it a good choice for multilingual apps.
Scaling is a whole different thing (you can use a lot of tools). But let's stick to scaling in NLP: spaCy gives you so much control over its pipeline that you can disable unwanted components, making it faster.
Look into it, try it yourself, and explore.

How to get started writing a code coverage tool? [closed]

Closed 2 years ago.
Looking for books or other references that discuss actually how to write a code coverage tool in Java; some of the various techniques or tricks - source vs. byte code instrumentation.
This is for a scripting language that generates Java byte code under the hood.
Does your scripting language generate bytecode? Does it generate debug metadata? If so, bytecode instrumentation is probably the way to go. In fact existing tools will probably work (perhaps with minimal modification).
The typical problem with such tools is that they are written to work with Java and assume that a class com.foo.Bar.class corresponds to a file com/foo/Bar.java. Unwinding that assumption can be tedious.
EMMA is a ClassLoader that does byte-code re-writing for code-coverage collection in Java. The coding style is a little funky, but I recommend reading source code for some ideas.
If your scripting language is interpreted then you will need a higher-level class loader (at a source level) that hooks into the interpreter.
You can also get the source of an open-source code coverage tool and learn from it.
Thx, Mc! http://asm.objectweb.org/ is another one. Excellent documentation on byte code instrumentation, but nothing "directly" aimed at writing a coverage tool - just some hints or ideas.
You might also want to use something like BCEL to analyse which lines of source actually exist in the byte-code. You don't want to report that things like blank lines and comments haven't been covered.
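To illustrate the idea (not BCEL, and not Java): the sketch below does the equivalent for Python bytecode using the standard `dis` module, collecting only the source lines that actually have instructions attached. In Java class files the corresponding information lives in the LineNumberTable attribute, which is only present if debug metadata was emitted. The helper name `executable_lines` and the sample source are made up for the example.

```python
import dis


def executable_lines(code):
    """Collect line numbers that have bytecode, including nested functions."""
    # 'if line' skips entries with no real source line (None or 0).
    lines = {line for _, line in dis.findlinestarts(code) if line}
    for const in code.co_consts:
        if hasattr(const, "co_code"):  # nested code object (function, class body)
            lines |= executable_lines(const)
    return lines


source = "x = 1\n\n# a comment\ny = x + 1\n"
code = compile(source, "<demo>", "exec")
print(sorted(executable_lines(code)))
```

A coverage report would then use this set as the denominator, so the blank line 2 and the comment on line 3 are never counted as "not covered".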
If you're talking about ColdFusion (which I assume you are from the tags) then I'm not sure this is doable but I may be very wrong here...
IIRC, when CF compiles, it essentially compiles the CFML into an interpreted form as a plain old Java source file, which is then compiled into the class. Therefore, any instrumentation that you may have will apply to the intermediary version rather than the CFML itself.
That said, Adobe now has the CF debugger, which can step through code, so please prove me wrong - I'd love code coverage in CFML.
