In the code below I wanted to compute the sample standard deviation, but I got the population standard deviation instead. What am I doing wrong?
public void compute(View view) {
    no1 = Double.parseDouble(et1.getText().toString());
    no2 = Double.parseDouble(et2.getText().toString());
    no3 = Double.parseDouble(et3.getText().toString());
    m = (no1 + no2 + no3) / 3;
    mm1 = (no1 - m);
    mm1 = mm1 * mm1;
    mm2 = (no2 - m);
    mm2 = mm2 * mm2;
    mm3 = (no3 - m);
    mm3 = mm3 * mm3;
    std = (mm1 + mm2 + mm3) / 3;
    tv1.setText(String.valueOf(Math.sqrt(std)));
}
If you're trying to estimate the standard deviation of a population from a random sample of that population (the "sample standard deviation"), the calculation is almost the same, but the divisor needs to be decreased by one (n - 1 instead of n).
In other words, your sample size is three, so you need to divide by two to adjust for the fact that you're working from a sample and not the entire population. Your final line of calculation therefore needs to look like this:
std = (mm1 + mm2 + mm3) / 2;
You can find numerous pages online which give a detailed explanation about the difference between population and sample standard deviation, such as this article on macroption.com.
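For reference, here is a minimal, self-contained sketch of both variants for an arbitrary number of values; the class and method names are just illustrative and not part of your code:

public class StdDevDemo {

    /** Population standard deviation: divide the summed squared deviations by n. */
    static double populationStdDev(double[] values) {
        return Math.sqrt(sumOfSquaredDeviations(values) / values.length);
    }

    /** Sample standard deviation: divide by (n - 1) instead (Bessel's correction). */
    static double sampleStdDev(double[] values) {
        return Math.sqrt(sumOfSquaredDeviations(values) / (values.length - 1));
    }

    private static double sumOfSquaredDeviations(double[] values) {
        double mean = 0;
        for (double v : values) mean += v;
        mean /= values.length;
        double sum = 0;
        for (double v : values) sum += (v - mean) * (v - mean);
        return sum;
    }

    public static void main(String[] args) {
        double[] values = {2.0, 4.0, 9.0}; // example inputs in place of your three EditText values
        System.out.println("population: " + populationStdDev(values));
        System.out.println("sample:     " + sampleStdDev(values));
    }
}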
I am new to Java, and I am reading a book on it now. The book does not give me the answer. I am using the following code:
package loanpayments;

public class LoanPayments {

    public static void main(String[] args) {
        double years = Double.parseDouble(args[0]);
        double P = Double.parseDouble(args[1]);
        double r = Double.parseDouble(args[2]);
        double R = r / 100;
        double A = P * (Math.E * Math.exp(R * years));
        System.out.println(A);
    }
}
I am testing the code with the following values:
years = 3
P = 2340
r = 3.1
First I have to divide r by 100 to get the rate as a decimal (in this case it becomes 0.031). This new value of 0.031 becomes the capitalized R. Then I use the formula to find A.
I am getting an incorrect output of ~6980.712, when the output should instead be ~2568.060.
I am thinking that I put the formula in wrong. It should be this:
A = P·e^(R·years)
where e is Euler's number (~2.71828).
If anyone could advise me on how to fix the formula, or point out some other mistake, I would much appreciate it. Thanks.
You don't need to multiply by another e, because Math.exp() is already the exponential function: Math.exp(z) returns e^z, so P * (Math.E * Math.exp(R * years)) computes P·e^(R·years + 1) instead of P·e^(R·years), which is why you got ~6980.712.
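In other words, the corrected line (a minimal fix to the formula in your code) would be:

double A = P * Math.exp(R * years);  // A = P * e^(R * years)

With years = 3, P = 2340 and r = 3.1 (so R = 0.031), this prints approximately 2568.060, the value you expected.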
I have implemented logistic regression with gradient descent in Java. It doesn't seem to work well (it does not classify records properly; the predicted probability of y = 1 is high for far too many records). I don't know whether my implementation is correct. I have gone through the code several times and I am unable to find any bug. I have been following Andrew Ng's machine learning tutorials on Coursera. My Java implementation has 3 classes, namely:
DataSet.java: reads the data set.
Instance.java: has two members, double[] x and double label.
Logistic.java: the main class that implements logistic regression with gradient descent.
This is my cost function:
J(Θ) = -(1/m) Σ_{i=1..m} [ y^(i) log(h_Θ(x^(i))) + (1 - y^(i)) log(1 - h_Θ(x^(i))) ]
For the above Cost function, this is my Gradient Descent algorithm:
Repeat {
    Θ_j := Θ_j - α Σ_{i=1..m} (h_Θ(x^(i)) - y^(i)) x_j^(i)
    (simultaneously update all Θ_j)
}
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class Logistic {

    /** the learning rate */
    private double alpha;

    /** the weights to learn */
    private double[] theta;

    /** the number of iterations */
    private int ITERATIONS = 3000;

    public Logistic(int n) {
        this.alpha = 0.0001;
        theta = new double[n];
    }

    private double sigmoid(double z) {
        return (1 / (1 + Math.exp(-z)));
    }

    public void train(List<Instance> instances) {
        double[] temp = new double[3];

        // Gradient descent algorithm for minimizing theta
        for (int i = 1; i <= ITERATIONS; i++) {
            for (int j = 0; j < 3; j++) {
                temp[j] = theta[j] - (alpha * sum(j, instances));
            }
            // simultaneous update of all theta
            for (int j = 0; j < 3; j++) {
                theta[j] = temp[j];
            }
            System.out.println(Arrays.toString(theta));
        }
    }

    private double sum(int j, List<Instance> instances) {
        double[] x;
        double prediction, sum = 0, y;

        for (int i = 0; i < instances.size(); i++) {
            x = instances.get(i).getX();
            y = instances.get(i).getLabel();
            prediction = classify(x);
            sum += ((prediction - y) * x[j]);
        }
        return (sum / instances.size());
    }

    private double classify(double[] x) {
        double logit = .0;
        for (int i = 0; i < theta.length; i++) {
            logit += (theta[i] * x[i]);
        }
        return sigmoid(logit);
    }

    public static void main(String... args) throws FileNotFoundException {
        // DataSet is a class with a static method readDataSet which reads the data set
        // Instance is a class with two members: double[] x, double label y
        // x contains the features and y is the label.
        List<Instance> instances = DataSet.readDataSet("data.txt");

        // 3 : number of theta parameters corresponding to the features x
        // x0 is always 1
        Logistic logistic = new Logistic(3);
        logistic.train(instances);

        // Test data
        double[] x = new double[3];
        x[0] = 1;
        x[1] = 45;
        x[2] = 85;

        System.out.println("Prob: " + logistic.classify(x));
    }
}
Can anyone tell me what I am doing wrong?
Thanks in advance! :)
As I am studying logistic regression, I took the time to review your code in detail.
TLDR
In fact, it appears the algorithm is correct.
The reason you had so many false negatives or false positives is, I think, the hyperparameters you chose.
The model was under-trained, so the hypothesis was under-fitting.
Details
I had to create the DataSet and Instance classes because you did not publish them, and set up a train data set and a test data set based on the Cryotherapy dataset.
See http://archive.ics.uci.edu/ml/datasets/Cryotherapy+Dataset+.
Then, using your exact same code (for the logistic regression part), with a learning rate alpha of 0.001 and 100000 iterations, I got a precision rate of 80.64516129032258 percent on the test data set, which is not so bad.
I tried to get a better precision rate by manually tweaking those hyperparameters, but could not obtain any better result.
At this point, an enhancement would be to implement regularization, I suppose.
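For completeness, this is roughly how I computed that percentage: the share of correctly classified test instances, thresholding the sigmoid output at 0.5. The helper below is my own sketch (the method name, the 0.5 threshold and the 0.001 / 100000 hyperparameters are my choices), reusing the Logistic and Instance classes from the question:

// Hypothetical evaluation helper built around the question's classes,
// used after training with alpha = 0.001 and ITERATIONS = 100000.
static double precisionRate(Logistic logistic, List<Instance> testSet) {
    int correct = 0;
    for (Instance instance : testSet) {
        // classify() returns the sigmoid output, interpreted as P(y = 1 | x)
        double predicted = logistic.classify(instance.getX()) >= 0.5 ? 1.0 : 0.0;
        if (predicted == instance.getLabel()) {
            correct++;
        }
    }
    return 100.0 * correct / testSet.size();
}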
Gradient descent formula
In Andrew Ng's video about the cost function and gradient descent, it is correct that the 1/m term is omitted.
A possible explanation is that the 1/m term is included in the alpha term.
Or maybe it's just an oversight.
See https://www.youtube.com/watch?v=TTdcc21Ko9A&index=36&list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN&t=6m53s at 6m53s.
But if you watch Andrew Ng's video about regularization and logistic regression you'll notice that the term 1/m is clearly present in the formula.
See https://www.youtube.com/watch?v=IXPgm1e0IOo&index=42&list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN&t=2m19s at 2m19s.
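If you do add regularization, a sketch of the change (my own adaptation of the sum method from the question, with a hypothetical lambda parameter; not code from the videos) could look like this. Note that your sum already applies the 1/m factor, and the bias weight Θ_0 is conventionally left unregularized:

// Hypothetical regularized gradient for theta_j, meant to live inside the Logistic class.
private double regularizedSum(int j, List<Instance> instances, double lambda) {
    double sum = 0;
    for (Instance instance : instances) {
        double[] x = instance.getX();
        double y = instance.getLabel();
        sum += (classify(x) - y) * x[j];           // same term as in the original sum(...)
    }
    double gradient = sum / instances.size();      // the 1/m factor
    if (j > 0) {                                   // do not regularize the bias weight theta_0
        gradient += (lambda / instances.size()) * theta[j];
    }
    return gradient;
}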
I wanted to get regression parameters using Apache's Commons Math3 library and OLSMultipleLinearRegression.
The regression should be polynomial of degree 2.
It worked fine with test data, but when I use the experimental data below, the method gives me a completely wrong regression.
public static void poly() {
    OLSMultipleLinearRegression quadRegression = new OLSMultipleLinearRegression();
    double[] y = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
                  26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
                  51, 52, 53, 54, 55, 56, 57, 58, 59};
    double[][] x = {{1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{1.00,1.00},{0.95,0.90},{0.96,0.91},{0.96,0.92},{0.96,0.92},{0.96,0.92},{0.92,0.84},{0.92,0.85},
                    {0.92,0.86},{0.93,0.86},{0.93,0.87},{0.89,0.80},{0.90,0.81},{0.90,0.81},{0.90,0.82},{0.89,0.80},{0.90,0.81},{0.90,0.82},{0.91,0.82},{0.91,0.83},{0.90,0.80},{0.90,0.80},{0.90,0.81},{0.91,0.82},
                    {0.89,0.79},{0.89,0.80},{0.90,0.80},{0.90,0.81},{0.88,0.77},{0.88,0.77},{0.88,0.78},{0.88,0.78},{0.86,0.73},{0.86,0.74},{0.86,0.74},{0.86,0.74},{0.84,0.71},{0.85,0.72},{0.85,0.72},{0.85,0.73},
                    {0.84,0.71},{0.84,0.71},{0.84,0.71},{0.84,0.71},{0.83,0.69},{0.83,0.69},{0.83,0.69},{0.82,0.68},{0.82,0.68},{0.82,0.68},{0.82,0.68}};
    quadRegression.newSampleData(y, x);
    quadRegression.setNoIntercept(false);
    double[] results = quadRegression.estimateRegressionParameters();
}
For this input data I get the equation y = 117.54x² - 504.83x + 389.088, which would result in a y-value of about 379,760.85 for x = 59 - way beyond my input values.
So either I handled the class absolutely wrong or I fell into a mathematical pitfall.
If someone could please explain to me what I did wrong or misinterpreted - this problem is driving me insane.
I'm doing K-means clustering on a 12-dimensional matrix. I managed to get the result as K sets of clusters. I want to show the result by plotting it in a 2D graph, but I can't figure out how to convert the 12-dimensional data into 2 dimensions.
Any suggestions on how I can do the conversion, or any alternative ways of visualizing the result? I tried Multidimensional Scaling for Java (MDSJ) but it did not work.
The K-means algorithm I'm using is from the Java Machine Learning Library: Clustering basics.
I would do Principal Component Analysis (probably the easiest of the multidimensional scaling algorithms). (By the way, PCA has nothing to do with K-means; it is a general method for dimensionality reduction.)
I assume variables are in columns, observations are in rows.
Standardize the data - convert the variables to z-scores. That means: from each cell, subtract the mean of the column and divide the result by the standard deviation of the column. That way you get zero mean and unit variance. The former is obligatory; the latter, I would say, is good to do. If you standardize to unit variance, you can calculate the eigenvectors from the covariance matrix; otherwise you have to use the correlation matrix, which kind of standardizes the data automatically. (See this for an explanation.)
Calculate the eigenvectors and eigenvalues of the covariance matrix. Sort the eigenvectors by their eigenvalues. (Many libraries already give you the eigenvectors sorted that way.)
Take the first two columns of the eigenvector matrix, multiply the original matrix (converted to z-scores) by them, and visualize this data.
Using the colt library, you can do the following. It will be similar with other matrix libraries:
import cern.colt.matrix.DoubleMatrix1D;
import cern.colt.matrix.DoubleMatrix2D;
import cern.colt.matrix.doublealgo.Statistic;
import cern.colt.matrix.impl.DenseDoubleMatrix2D;
import cern.colt.matrix.impl.SparseDoubleMatrix2D;
import cern.colt.matrix.linalg.Algebra;
import cern.colt.matrix.linalg.EigenvalueDecomposition;
import hep.aida.bin.DynamicBin1D;

public class Pca {

    // to show matrix creation; it does not make much sense to calculate PCA on random data
    public static void main(String[] x) {
        double[][] data = {
            {2.0,4.0,1.0,4.0,4.0,1.0,5.0,5.0,5.0,2.0,1.0,4.0},
            {2.0,6.0,3.0,1.0,1.0,2.0,6.0,4.0,4.0,4.0,1.0,5.0},
            {3.0,4.0,4.0,4.0,2.0,3.0,5.0,6.0,3.0,1.0,1.0,1.0},
            {3.0,6.0,3.0,3.0,1.0,2.0,4.0,6.0,1.0,2.0,4.0,4.0},
            {1.0,6.0,4.0,2.0,2.0,2.0,3.0,4.0,6.0,3.0,4.0,1.0},
            {2.0,5.0,5.0,3.0,1.0,1.0,6.0,6.0,3.0,2.0,6.0,1.0}
        };
        DoubleMatrix2D matrix = new DenseDoubleMatrix2D(data);
        DoubleMatrix2D pm = pcaTransform(matrix);

        // print the first two dimensions of the transformed matrix - they capture most of the variance of the original data
        System.out.println(pm.viewPart(0, 0, pm.rows(), 2).toString());
    }

    /** Returns a matrix in the space of principal components; take the first n columns. */
    public static DoubleMatrix2D pcaTransform(DoubleMatrix2D matrix) {
        DoubleMatrix2D zScoresMatrix = toZScores(matrix);
        final DoubleMatrix2D covarianceMatrix = Statistic.covariance(zScoresMatrix);
        // compute eigenvalues and eigenvectors of the covariance matrix (flip needed since they are sorted ascending)
        final EigenvalueDecomposition decomp = new EigenvalueDecomposition(covarianceMatrix);
        // columns of Vs are eigenvectors = principal components = basis of the new space, ordered by decreasing variance
        final DoubleMatrix2D Vs = decomp.getV().viewColumnFlip();
        // eigenvalues: ev(i) / sum(ev) is the percentage of variance captured by the i-th column of Vs
        // final DoubleMatrix1D ev = decomp.getRealEigenvalues().viewFlip();
        // project the original matrix to the PCA space
        return Algebra.DEFAULT.mult(zScoresMatrix, Vs);
    }

    /**
     * Converts a matrix to a matrix of z-scores (by columns).
     */
    public static DoubleMatrix2D toZScores(final DoubleMatrix2D matrix) {
        final DoubleMatrix2D zMatrix = new SparseDoubleMatrix2D(matrix.rows(), matrix.columns());
        for (int c = 0; c < matrix.columns(); c++) {
            final DoubleMatrix1D column = matrix.viewColumn(c);
            final DynamicBin1D bin = Statistic.bin(column);
            if (bin.standardDeviation() == 0) { // use epsilon
                for (int r = 0; r < matrix.rows(); r++) {
                    zMatrix.set(r, c, 0.0);
                }
            } else {
                for (int r = 0; r < matrix.rows(); r++) {
                    double zScore = (column.get(r) - bin.mean()) / bin.standardDeviation();
                    zMatrix.set(r, c, zScore);
                }
            }
        }
        return zMatrix;
    }
}
You could also use Weka. I would first load your data into Weka, then run PCA using the GUI (under attribute selection). You will see which classes are called with which parameters and can then do the same thing from your code. The catch is that you will need to convert/wrap your matrix into the data format Weka works with (an Instances object).
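If you go that route, a rough sketch of the programmatic equivalent might look like the following. The file name is hypothetical, the data is assumed to already be in ARFF format, and the attribute-selection setup mirrors what the GUI reports, so verify it against your Weka version:

import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.PrincipalComponents;
import weka.attributeSelection.Ranker;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class WekaPcaSketch {
    public static void main(String[] args) throws Exception {
        // hypothetical ARFF file containing the 12-dimensional data
        Instances data = new DataSource("clusters.arff").getDataSet();

        AttributeSelection selection = new AttributeSelection();
        selection.setEvaluator(new PrincipalComponents());
        selection.setSearch(new Ranker());      // ranks components by eigenvalue
        selection.SelectAttributes(data);

        // data projected onto the principal components; plot the first two columns
        Instances transformed = selection.reduceDimensionality(data);
        System.out.println(transformed);
    }
}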
A similar question has been discussed on Cross Validated. The basic idea is to find an appropriate projection that separates the clusters (e.g., with discproj in R) and then to plot the clusters in that new, projected space.
In addition to what the other answers suggest, you should probably have a look at multidimensional scaling too.