I am using the Apache Commons Math library to calculate the p-value with its ChiSquareTest:
I use the method chiSquareTest(double[] expected, long[] observed), but the values I get back don't make sense to me. So I tried numerous chi-square online calculators to find out what this function actually computes.
An example:
Group 1: {25,25}
Group 2: {30,20}
(Taken from Wikipedia, German Chi Square Test article)
P-values from http://www.quantpsy.org/chisq/chisq.htm and http://vassarstats.net/newcs.html:
P = 0.3149 and 0.31490284 (without Yates correction)
P = 0.42154642 and 0.4201 (with Yates correction)
Apache Commons: 0.1489146731787664
Code:
ChiSquareTest tester = new ChiSquareTest();
long[] b = {25, 25};   // observed counts
double[] a = {30, 20}; // expected counts
tester.chiSquareTest(a, b); // returns 0.1489...
Another thing I do not understand is the need to have a long and a double array. Why not two long arrays?
There are two functions in the lib:
chiSquareTest(double[] expected, long[] observed)
chiSquareTest(long[][] values)
The first one (which I used in the question above) computes a goodness-of-fit test: it compares observed counts against expected counts. With expected {30, 20} and observed {25, 25}, the statistic is (25-30)^2/30 + (25-20)^2/20 ≈ 2.083 on one degree of freedom, which gives exactly the p = 0.1489 that Apache Commons returned. What I expected was the result of the second one, the test of independence on a contingency table, which is what the online calculators compute.
The answer was given to me on the Apache Commons user mailing list; I will add a link to the archive once it is there. But it is also written in the JavaDoc.
Update:
Mailinglist Archive
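For completeness, here is a minimal sketch contrasting the two calls (assuming org.apache.commons.math3; the printed values match the hand calculation and the online calculators above):

import org.apache.commons.math3.stat.inference.ChiSquareTest;

public class ChiSquareDemo {
   public static void main(String[] args) {
      ChiSquareTest tester = new ChiSquareTest();

      // Goodness of fit: observed {25, 25} against expected {30, 20}
      double pFit = tester.chiSquareTest(new double[] {30, 20}, new long[] {25, 25});
      System.out.println(pFit); // ~0.1489

      // Test of independence on the 2x2 contingency table;
      // this is what the online calculators report (no Yates correction)
      long[][] counts = {{25, 25}, {30, 20}};
      double pIndependence = tester.chiSquareTest(counts);
      System.out.println(pIndependence); // ~0.3149
   }
}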
I am using MLeap to run a PySpark logistic regression model in a Java program. Once I run the pipeline I am able to get a DefaultLeapFrame object with one row, Stream(Row(1.3,12,3.6,DenseTensor([D@538613b3,List(2)),1.0), ?).
But I am not sure how to actually inspect the DenseTensor object. When I use getTensor(3) on this row I just get an object back. I am not familiar with Scala, but that seems to be how this is meant to be interacted with. In Java, how can I get the values within this DenseTensor?
Here is roughly what I am doing. I'm guessing using Object is not right for the type...
DefaultLeapFrame df = leapFrameSupport.select(frame2, Arrays.asList("feat1", "feat2", "feat3", "probability", "prediction"));
Tensor<Object> tensor = df.dataset().head().getTensor(3);
Thanks
So the MLeap documentation for the Java DSL is not so good, but I was able to look over some unit tests (link) that pointed me to the right thing to use. In case anyone else is interested, this is what I did.
DefaultLeapFrame df = leapFrameSupport.select(frame, Arrays.asList("feat1", "feat2", "feat3", "probability", "prediction"));
TensorSupport tensorSupport = new TensorSupport();
// toArray() converts the tensor's values into a plain Java List<Double>
List<Double> tensor_vals = tensorSupport.toArray(df.dataset().head().getTensor(3));
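From there the values are ordinary Java doubles. For a binary classifier the probability tensor has two entries, so something like the following reads the positive-class probability (the index is an assumption, not from the MLeap docs):

double pPositive = tensor_vals.get(1); // assumes entry 1 is the positive class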
In this question, I was trying to import Java classes into Octave. In my particular example, I was (and am) working with javaplex, a set of Java tools that comes with code for use in Matlab. The answer to that question shows that, whereas in Matlab you would do the following:
import edu.stanford.math.plex4.*;
api.Plex4.createExplicitSimplexStream();
In Octave, the equivalent is
javaMethod( 'createExplicitSimplexStream', 'edu.stanford.math.plex4.api.Plex4')
This was working excellently, but then I ran into a strange problem. There is another method called createVietorisRipsStream. In Matlab, I would run this with a line such as the following:
api.Plex4.createVietorisRipsStream(parameters);
So I would think that the equivalent command in Octave would be
javaMethod( 'createVietorisRipsStream', 'edu.stanford.math.plex4.api.Plex4')
However, when I do this, I get the following error:
error: [java] java.lang.NoSuchMethodException: createVietorisRipsStream
I'm not sure why this error is coming up, since both methods are in the same Java file (Plex4). I did take a look at the Plex4 file, and there are two differences between createExplicitSimplexStream and createVietorisRipsStream that I noticed:
There are two overloads of createExplicitSimplexStream and six overloads of createVietorisRipsStream
There is a bit that says <double[]>. I don't know if that is relevant, however (I haven't read or written much Java; up to this point, I've been able to use the tutorial they provided to only use Matlab and not look under the hood).
Here is one example of the code from the Plex4 file for a createExplicitSimplexStream:
public static ExplicitSimplexStream createExplicitSimplexStream(double maxFiltrationValue) {
   return new ExplicitSimplexStream(maxFiltrationValue);
}
Here is one example of the code from the Plex4 file for a createVietorisRipsStream:
public static VietorisRipsStream<double[]> createVietorisRipsStream(double[][] points, int maxDimension, double maxFiltrationValue, int numDivisions) {
   return FilteredStreamInterface.createPlex4VietorisRipsStream(points, maxDimension, maxFiltrationValue, numDivisions);
}
Any idea of why I'm getting the error I'm getting?
Read the Octave documentation for the Java section properly; it's only four pages, and it explains this well!
As I mentioned in the comments on the previous question, your call fails because javaMethod with only a method name and class name looks for a zero-argument createVietorisRipsStream, and no such overload exists. The way to call a Java method with arguments is:
javaMethod(
name of method as a string,
name of class fully qualified with packages as a string,
method's first argument,
method's second argument,
... etc
)
This is the only way to call 'static' methods. With normal 'instance' methods, you can either use javaMethod and replace the name of the class with the Java object itself, or simply use it as you would in Java, i.e. objectname.methodname(arg1, arg2, ... etc).
I have implemented the tutorial (page 14 in the pdf) here for you to have a look at. (Don't forget to run the modified load_javaplex script first.)
octave:2> max_dimension = 3;
octave:3> max_filtration_value = 4;
octave:4> num_divisions = 1000;
octave:5> point_cloud = javaMethod( 'getHouseExample', 'edu.stanford.math.plex4.examples.PointCloudExamples')
point_cloud =
<Java object: double[][]>
octave:6> stream = javaMethod( 'createVietorisRipsStream', 'edu.stanford.math.plex4.api.Plex4', point_cloud, max_dimension, max_filtration_value, num_divisions)
stream =
<Java object: edu.stanford.math.plex4.streams.impl.VietorisRipsStream>
octave:7> persistence = javaMethod( 'getModularSimplicialAlgorithm', 'edu.stanford.math.plex4.api.Plex4', max_dimension, 2)
persistence =
<Java object: edu.stanford.math.plex4.autogen.homology.IntAbsoluteHomology>
octave:8> intervals = persistence.computeIntervals(stream)
intervals =
<Java object: edu.stanford.math.plex4.homology.barcodes.BarcodeCollection>
(I have not gone further because plot_barcodes needs to be modified a bit too; it's only a couple of lines, but it would be too much to post here. The reasoning is the same, though.)
Also, if you're not sure what is meant by class constructors, class methods, and static vs instance-specific methods, that is unfortunately more to do with Java, although it should be pretty introductory stuff. It is well worth reading up a bit on it first.
Good luck!
I want to configure the QNMinimizer from the Stanford CoreNLP library to get nearly the same optimization results as scipy.optimize's L-BFGS-B implementation, or to get a standard L-BFGS configuration that is suitable for most things. I set the standard parameters as follows:
The python example I want to copy:
scipy.optimize.minimize(neuralNetworkCost, input_theta, method = 'L-BFGS-B', jac = True)
My try to do the same in Java:
QNMinimizer qn = new QNMinimizer(10, true);
qn.terminateOnMaxItr(batch_iterations);
//qn.setM(10);
output = qn.minimize(neuralNetworkCost, 1e-5, input, 15000);
What I need is a solid and general L-BFGS configuration that is suitable for most problems.
I'm also not sure whether I need to set some of these parameters for a standard L-BFGS configuration:
useAveImprovement = ?;
useRelativeNorm = ?;
useNumericalZero = ?;
useEvalImprovement = ?;
Thanks for your help in advance; I'm new to this field.
Resources for Information:
Stanford Core NLP QNMinimizer:
http://nlp.stanford.edu/nlp/javadoc/javanlp-3.5.2/edu/stanford/nlp/optimization/QNMinimizer.html#setM-int-
https://github.com/stanfordnlp/CoreNLP/blob/master/src/edu/stanford/nlp/optimization/QNMinimizer.java
Scipy Optimize L-BFGS-B:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html
http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.optimize.fmin_l_bfgs_b.html
What you have should be just fine. (Have you actually had any problems with it?)
Setting termination on both max iterations and max function evaluations is probably overkill, so you might omit the last argument to qn.minimize(); but it seems from the documentation that scipy does use both, with a default of 15000 function evaluations.
In general, using the robust options (the second constructor argument of true, as you do) should give a reliable minimizer, similar to scipy's pgtol convergence criterion. The other options are for special situations or just to experiment with how they work.
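To illustrate, here is a minimal, self-contained sketch of that configuration. The toy quadratic objective is my own stand-in for your neuralNetworkCost; the QNMinimizer calls are those from the javadoc linked above:

import edu.stanford.nlp.optimization.DiffFunction;
import edu.stanford.nlp.optimization.QNMinimizer;

public class QNDemo {
   public static void main(String[] args) {
      // Stand-in objective f(x) = sum_i (x_i - 3)^2, with gradient 2*(x_i - 3)
      DiffFunction cost = new DiffFunction() {
         public int domainDimension() { return 2; }
         public double valueAt(double[] x) {
            double v = 0.0;
            for (double xi : x) { v += (xi - 3.0) * (xi - 3.0); }
            return v;
         }
         public double[] derivativeAt(double[] x) {
            double[] g = new double[x.length];
            for (int i = 0; i < x.length; i++) { g[i] = 2.0 * (x[i] - 3.0); }
            return g;
         }
      };

      QNMinimizer qn = new QNMinimizer(10, true); // memory m = 10, robust options on
      qn.terminateOnMaxItr(1000);                 // iteration cap, like scipy's maxiter
      // Omitting the final max-function-evaluations argument leaves the
      // iteration cap and the 1e-5 function tolerance as the stopping criteria.
      double[] best = qn.minimize(cost, 1e-5, new double[] {0.0, 0.0});
      System.out.println(java.util.Arrays.toString(best)); // approximately [3.0, 3.0]
   }
}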
I need to display my example code directly in my library's JavaDoc documentation, including its output. But I want to automate this process, so the example code can be unit tested by an external process and not displayed unless it actually works.
I have not figured out a way to do this except by manually copy-pasting the source code (and output) each time a change is made, which is unmanageable given there are now well over a hundred example classes in my various projects. Alternatively, I could simply not display those examples and instead provide links to them.
Both of these solutions are unacceptable, and I am hoping there might be a better way to do this.
How do you automate the insertion of your example code, so it is displayed directly in your JavaDoc?
Thanks.
Not exactly what you want, but maybe another interesting approach is the documentation of the Play Framework:
They document with Markdown and integrate code samples through special annotations (described in the Guidelines for writing Play documentation), so all the code samples can be tested before they are included in the documentation.
Their (unfortunately custom) solution for generating the docs can be found in the play-doc GitHub repository.
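From memory (so treat the exact syntax as an assumption; the path and label here are hypothetical), the Markdown references a labeled snippet in a real source file that is compiled and tested alongside the docs:

@[hello](code/javaguide/Hello.java)

and in Hello.java the snippet is delimited with matching label markers:

//#hello
System.out.println("Hello, world"); // compiled and exercised by the doc tests
//#hello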
This is the question I've tried to answer with Codelet (GitHub link).
Codelet automates the insertion of already unit-tested example code into your JavaDoc, using taglets. As with all taglets, Codelet is executed as part of javadoc.exe. It is now released in beta (and needs beta testers!).
There are four Codelet taglets:
{@.codelet.and.out}: Displays source code immediately followed by its output
{@.codelet}: Displays source code only
{@.codelet.out}: Displays output only
{@.file.textlet}: Displays the contents of any plain-text file, such as the input for an example
A common example:
{@.codelet.and.out com.github.aliteralmind.codelet.examples.adder.AdderDemo%eliminateCommentBlocksAndPackageDecl()}
which uses the eliminateCommentBlocksAndPackageDecl() "customizer" to eliminate the package declaration line and all multi-line comments (such as the license and JavaDoc blocks).
Output (between the horizontal rules):
Example
public class AdderDemo {
   public static final void main(String[] ignored) {
      Adder adder = new Adder();
      System.out.println(adder.getSum());
      adder = new Adder(5, -7, 20, 27);
      System.out.println(adder.getSum());
   }
}
Output
0
45
An alternative is to display only a portion of the example's code, as a snippet:
{@.codelet.and.out com.github.aliteralmind.codelet.examples.adder.AdderDemo%lineRange(1, false, "Adder adder", 2, false, "println(adder.getSum())", "^ ")}
This displays the same example as above, starting with (the line containing) Adder adder, and ending with the second println(adder.getSum()). This also eliminates the extra indentation, which in this case is six spaces.
Output (between the horizontal rules):
Example
Adder adder = new Adder();
System.out.println(adder.getSum());
adder = new Adder(5, -7, 20, 27);
System.out.println(adder.getSum());
Output:
0
45
All taglets accept customizers.
It is possible to write your own customizers which, for example, can "linkify" function names, change the template in which source-and-output is displayed, and do any arbitrary alteration to any or all lines. Examples include highlighting something in yellow, or making regular expression replacements.
As a final example, and as a contrast to those above, here is a taglet that blindly prints all lines of the example code, without any changes. It uses no customizer:
{@.codelet.and.out com.github.aliteralmind.codelet.examples.adder.AdderDemo}
Output (between the horizontal rules):
Example
/*license*\
Codelet: Copyright (C) 2014, Jeff Epstein (aliteralmind __DASH__ github __AT__ yahoo __DOT__ com)
This software is dual-licensed under the:
- Lesser General Public License (LGPL) version 3.0 or, at your option, any later version;
- Apache Software License (ASL) version 2.0.
Either license may be applied at your discretion. More information may be found at
- http://en.wikipedia.org/wiki/Multi-licensing.
The text of both licenses is available in the root directory of this project, under the names "LICENSE_lgpl-3.0.txt" and "LICENSE_asl-2.0.txt". The latest copies may be downloaded at:
- LGPL 3.0: https://www.gnu.org/licenses/lgpl-3.0.txt
- ASL 2.0: http://www.apache.org/licenses/LICENSE-2.0.txt
\*license*/
package com.github.aliteralmind.codelet.examples.adder;
/**
   <P>Demonstration of {@code com.github.aliteralmind.codelet.examples.adder.Adder}.</P>
   <P>{@code java com.github.aliteralmind.codelet.examples.AdderDemo}</P>
   @since 0.1.0
   @author Copyright (C) 2014, Jeff Epstein ({@code aliteralmind __DASH__ github __AT__ yahoo __DOT__ com}), dual-licensed under the LGPL (version 3.0 or later) or the ASL (version 2.0). See source code for details. <A HREF="http://codelet.aliteralmind.com">{@code http://codelet.aliteralmind.com}</A>, <A HREF="https://github.com/aliteralmind/codelet">{@code https://github.com/aliteralmind/codelet}</A>
 **/
public class AdderDemo {
   public static final void main(String[] ignored) {
      Adder adder = new Adder();
      System.out.println(adder.getSum());
      adder = new Adder(5, -7, 20, 27);
      System.out.println(adder.getSum());
   }
}
Output:
0
45
Codelet is now released in beta. Please consider giving it a try, and posting your comments and criticisms in the GitHub issue tracker.
I am getting a wrong eigenvector (I also checked by running multiple times to be sure) when I am using matrix.eig(). The matrix is:
1.2290 1.2168 2.8760 2.6370 2.2949 2.6402
1.2168 0.9476 2.5179 2.1737 1.9795 2.2828
2.8760 2.5179 8.8114 8.6530 7.3910 8.1058
2.6370 2.1737 8.6530 7.6366 6.9503 7.6743
2.2949 1.9795 7.3910 6.9503 6.2722 7.3441
2.6402 2.2828 8.1058 7.6743 7.3441 7.6870
The function returns the eigenvectors:
-0.1698 0.6764 0.1442 -0.6929 -0.1069 0.0365
-0.1460 0.6478 0.1926 0.6898 0.0483 -0.2094
-0.5239 0.0780 -0.5236 0.1621 -0.2244 0.6072
-0.4906 -0.0758 -0.4573 -0.1279 0.2842 -0.6688
-0.4428 -0.2770 0.4307 0.0226 -0.6959 -0.2383
-0.4884 -0.1852 0.5228 -0.0312 0.6089 0.2865
Matlab gives the following eigenvectors for the same input:
0.1698 -0.6762 -0.1439 0.6931 0.1069 0.0365
0.1460 -0.6481 -0.1926 -0.6895 -0.0483 -0.2094
0.5237 -0.0780 0.5233 -0.1622 0.2238 0.6077
0.4907 0.0758 0.4577 0.1278 -0.2840 -0.6686
0.4425 0.2766 -0.4298 -0.0227 0.6968 -0.2384
0.4888 0.1854 -0.5236 0.0313 -0.6082 0.2857
The eigenvalues from Matlab and Jama match, but the first five columns of the eigenvectors are reversed in sign; only the last column agrees exactly.
Is there any issue with the kind of input that Jama.Matrix.EigenvalueDecomposition.eig() accepts, or any other problem here? Please tell me how I can fix the error. Thanks in advance.
There is no error here; both results are correct, as is any other nonzero scalar multiple of the eigenvectors.
There are infinitely many eigenvectors that work; it's just convention that most software packages report vectors normalized to length one. That Jama reports eigenvectors equal to -1 times those of Matlab is probably just an artifact of the algorithm used.
For a given matrix, the eigenvalues are unique; counted with multiplicity, their number equals the dimension of the matrix. The corresponding eigenvectors, however, can differ between implementations, because each eigenvector may be scaled arbitrarily along its direction. In your results, both the Java and Matlab versions are correct.
Also, you can check the D matrix, which holds the eigenvalues on its diagonal; you will find they are the same.
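To convince yourself, check the defining equation directly: a valid eigenvector matrix V satisfies A*V = V*D, and that remains true after multiplying any column of V by -1. A minimal sketch with Jama, using the matrix from the question:

import Jama.EigenvalueDecomposition;
import Jama.Matrix;

public class EigCheck {
   public static void main(String[] args) {
      double[][] a = {
         {1.2290, 1.2168, 2.8760, 2.6370, 2.2949, 2.6402},
         {1.2168, 0.9476, 2.5179, 2.1737, 1.9795, 2.2828},
         {2.8760, 2.5179, 8.8114, 8.6530, 7.3910, 8.1058},
         {2.6370, 2.1737, 8.6530, 7.6366, 6.9503, 7.6743},
         {2.2949, 1.9795, 7.3910, 6.9503, 6.2722, 7.3441},
         {2.6402, 2.2828, 8.1058, 7.6743, 7.3441, 7.6870}
      };
      Matrix A = new Matrix(a);
      EigenvalueDecomposition eig = A.eig();
      Matrix V = eig.getV(); // eigenvectors, one per column
      Matrix D = eig.getD(); // eigenvalues on the diagonal

      // The Frobenius norm of A*V - V*D should be numerically zero,
      // regardless of the sign convention chosen for the columns of V.
      double residual = A.times(V).minus(V.times(D)).normF();
      System.out.println("||A*V - V*D||_F = " + residual);
   }
}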