What's the difference between Matrix.times() and Matrix.arrayTimes() in JAMA (a Java library for matrix calculations)?
If I have a d-dimensional vector x and a k-dimensional vector z, and I want to compute xz^T (x times z transpose), should I use Matrix.times or Matrix.arrayTimes?
How can I calculate this multiplication using JAMA?
arrayTimes is simply element-by-element multiplication:
C[i][j] = A[i][j] * B[i][j];
(the entries are treated as corresponding individual numbers, so both matrices must have identical dimensions),
while times is ordinary matrix multiplication,
where each element of the product is the dot product of a row of the first matrix with a column of the second.
The dimensions must match as required by the operation you want to perform.
Given your problem of computing x z^T, the only viable approach is to turn x and z into d×1 and k×1 matrices respectively and perform x.times(z.transpose()). The result will be a matrix of d×k dimensions.
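A minimal sketch of that outer product, assuming JAMA's Jama.Matrix is on the classpath (the sample values here are made up):

import Jama.Matrix;

public class OuterProductDemo {
    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0}; // d = 3
        double[] z = {4.0, 5.0};      // k = 2

        // Matrix(double[] vals, int m) packs the array into an m x 1 column matrix
        Matrix xCol = new Matrix(x, x.length); // d x 1
        Matrix zCol = new Matrix(z, z.length); // k x 1

        // Matrix multiplication of d x 1 by 1 x k: the outer product x z^T
        Matrix outer = xCol.times(zCol.transpose()); // d x k, outer.get(i, j) == x[i] * z[j]
        outer.print(8, 2); // column width 8, 2 digits after the decimal point
    }
}

arrayTimes, by contrast, would throw an IllegalArgumentException here, since the two operands do not have identical dimensions.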
I'm trying to use L2 normalization on a double vector in Java.
double[] vector = {0.00423823948, 0.00000000000823285934, 0.0000342523505342, 0.000040240234023423, 0, 0};
Now if I apply L2 normalization:
double squareVectorSum = 0;
for (double i : vector) {
    squareVectorSum += i * i;
}
double normalizationFactor = Math.sqrt(squareVectorSum);
// System.out.println(squareVectorSum + " " + normalizationFactor);
double[] vector_result = new double[vector.length];
for (int i = 0; i < vector.length; i++) {
    vector_result[i] = vector[i] / normalizationFactor;
}
My normalized vector looks like this:
Normalized vector (l2 normalization)
0.9999222784309146 1.9423676996312713E-9 0.008081112110203743 0.009493825603572155 0.0 0.0
Now if I compute the squared sum of all the normalized-vector components, I should get a sum that is equal to one:
double sum = 0;
for (double i : vector_result) {
    sum += i * i;
}
Instead, the squared sum of the normalized vector is
1.0000000000000004
Why is my sum not equal to one?
Are there problems in my code?
Or is it just because my numbers are too small and there is some approximation error with doubles?
As indicated above, this is a common issue, and one you're going to have to deal with whenever you use floating-point binary arithmetic. The problem mostly crops up when you want to compare two floating-point binary numbers for equality. Since the operations applied to arrive at the two values may not be identical, neither will their binary representations be.
There are at least a couple of strategies you can consider for dealing with this situation. The first involves comparing the absolute difference of two floating-point numbers x and y to some small value ε > 0, rather than testing for strict equality. This would look something like
if (Math.abs(y-x) < epsilon) {
// Assume x == y
} else {
// Assume x != y
}
This works well when the possible values of x and y have a relatively tight bound on their exponents. When that is not the case, the values of x and y may be such that the difference always dominates the ε you choose (if the exponents are too large), or ε always dominates the difference (if the exponents are too small). To get around this, instead of comparing the absolute difference, you could compare the ratio of x and y to 1.0 and check whether that ratio differs from 1.0 by more than ε. That would look like:
if (Math.abs(x/y-1.0) < epsilon) {
// Assume x == y
} else {
// Assume x != y
}
You will likely need to add another check to ensure y!=0 to avoid division by zero, but that's the general idea.
Other options include using a fixed point library for Java or a rational number library for Java. I have no recommendations for that, though.
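Applied to the question above, the first strategy makes the unit-length check robust. A small sketch (the choice of epsilon = 1e-9 is arbitrary, for illustration only):

double sum = 0;
for (double v : vector_result) {
    sum += v * v;
}
double epsilon = 1e-9;
if (Math.abs(sum - 1.0) < epsilon) {
    // Treat the vector as unit length, up to floating-point error
}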
I am following the tutorial at http://jeremykun.com/2011/07/27/eigenfaces/.
I am trying to implement this solution in Java using the JAMA linear algebra package.
I am stuck on calculating the covariance matrix. I calculated all the difference vectors and stored each of them in a Matrix. However, I don't see how to turn these into a covariance matrix.
How do I best go about doing this in Java?
What you are missing is the covariance matrix C itself.
This is calculated as:
C = (1 / (N - 1)) * W * W'
where W is the matrix whose columns are your difference vectors (deviations from the averages) and N is the number of samples. This means that you need just two things now:
multiply the matrix of difference vectors by its own transpose
multiply the result by 1 / (N - 1); note: N - 1 rather than N, to get unbiased estimates from a sample
I have created this spreadsheet example to show how to do it step by step.
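In JAMA this boils down to a couple of calls. A minimal sketch, assuming the difference vectors are already stored as the columns of a Jama.Matrix named w:

import Jama.Matrix;

// C = (1 / (N - 1)) * W * W', the unbiased sample covariance
static Matrix covariance(Matrix w) {
    int n = w.getColumnDimension(); // number of samples (one column per sample)
    return w.times(w.transpose()).times(1.0 / (n - 1));
}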
You may do something like this (I import Jama to deal with the matrices). The eigenfaces are actually implemented below, because there was a problem getting this function working in Java.
import Jama.Matrix;
import Jama.SingularValueDecomposition;

private static void evaluateEigenface(int M, int N, Matrix x, double[] average,
                                      double[] eigenvalues, Matrix eigenfaces) {
    // x is (widthProcessedImage*heightProcessedImage) x (numberProcessedImages)
    Matrix w = new Matrix(M, N, 0.0);

    // Average of each row: the mean image
    for (int i = 0; i < M; i++) {
        average[i] = 0;
        for (int j = 0; j < N; j++) {
            average[i] += x.get(i, j);
        }
        average[i] /= (double) N;
    }

    // Subtract the average from every column to get the deviation matrix w
    for (int i = 0; i < M; i++) {
        for (int j = 0; j < N; j++) {
            w.set(i, j, x.get(i, j) - average[i]);
        }
    }

    Matrix auxMat = w.transpose().times(w); // = w'*w, an N x N matrix (much smaller than w*w')
    SingularValueDecomposition SVD = new SingularValueDecomposition(auxMat);
    double[] mu = SVD.getSingularValues(); // eigenvalues of w'w
    Matrix d = SVD.getU(); // left singular vectors of w'w: each column is an eigenvector
    Matrix e = w.times(d); // eigenvectors of ww'

    for (int i = 0; i < N; i++) eigenvalues[i] = mu[i];

    // Normalize each column; norma2 is a helper returning the Euclidean norm of an array
    double theNorm;
    double[] auxArray = new double[M];
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < M; j++) auxArray[j] = e.get(j, i);
        theNorm = norma2(M, auxArray);
        // Eigenfaces are the normalized eigenvectors of ww'
        for (int j = 0; j < M; j++) eigenfaces.set(j, i, e.get(j, i) / theNorm);
    }
}
Alternatively, after removing the average, you can use the SVD function, which is also contained in Jama.
The eigendecomposition computes W'*W = V*D*V', while the SVD computes W = U*S*V', with U, V orthogonal and S, D diagonal, the diagonal entries of S non-negative and in descending order. Comparing the two, one gets
W'*W = (U*S*V')' * (U*S*V') = V*S*U' * U*S*V' = V*S²*V'
so one can recover the eigendecomposition from the SVD via D = S².
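As a sketch with Jama, applied directly to the mean-subtracted matrix w from the code above (note that Jama's SVD expects at least as many rows as columns, which holds here because the pixel count far exceeds the image count):

SingularValueDecomposition svd = w.svd();
Matrix u = svd.getU(); // columns are already-normalized eigenvectors of w*w' (the eigenfaces)
double[] s = svd.getSingularValues();
double[] eig = new double[s.length];
for (int i = 0; i < s.length; i++) {
    eig[i] = s[i] * s[i]; // eigenvalues of w*w', via D = S^2
}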
I have been looking for a good optimization algorithm for almost a year now.
My problem consists of taking a matrix of observed values, let's call it M, and using a function F which, by transforming each of M's cells one by one, produces another matrix N.
Matrices M and N are then compared using the least-squares method, and the distance between them should be minimized by changing the variables of F.
There is an array of variables, let's call it a, and a single variable b, which are used in the function F.
The variable b is shared by all of the calculations required to get the matrix N.
The length of array a matches the number of rows; one entry of a corresponds to each row.
So, let's say, to calculate the 3rd row of N I apply F to the value of each cell in the 3rd row of M, together with the variables a[3] and b.
To calculate the 4th row of N I apply F to the value of each cell in the 4th row of M, in turn, together with a[4] and b.
And so on.
Once I have calculated the whole of N, I need to compare it to M and minimize their distance by adjusting the array of variables a[] and the variable b.
I have been using the CMA-ES optimizer from Apache Commons Math for smaller matrices, but it doesn't work as well as MATLAB's solver on large matrices.
EDIT
So I'll try to describe this algorithmically rather than mathematically, as that is my stronger side.
double[w,h] m // matrix M
double[w,h] n // matrix N
double[] hv   // an array of constant, hard-coded values
double[] a    // initialised to an initial guess
double b      // also initialised to an initial guess
double total  // target value; this needs to be minimised
// w and h are constant

for (i = 0; i < h; i++) {
    for (j = 0; j < w; j++) {
        m[i,j] = getObservedValue(i, j) // observed values are not under my control
    }
}

for (i = 0; i < h; i++) {
    for (j = 0; j < w; j++) {
        n[i,j] = 0.75 / (1 + e^(-b * (hv[i] - a[i]))) + 25
    }
}

// once N has been calculated using the initial guesses for a[] and b:
for (i = 0; i < h; i++) {
    for (j = 0; j < w; j++) {
        total = total + m[i,j] * (m[i,j] - n[i,j])^2 // weighted sum of squared distances
    }
}
Now the objective is to minimise total (the distance between M and N) by finding the optimum values for a[] and b.
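For reference, here is the same objective written as a plain Java function, with the decision variables packed into a single array as [a[0], ..., a[h-1], b] so that a generic optimizer (for instance Commons Math's CMAESOptimizer) can drive them; the packing convention is my own assumption, not part of the original setup:

// Computes the weighted sum of squared distances between M and N(a, b).
static double objective(double[] vars, double[][] m, double[] hv) {
    int h = m.length, w = m[0].length;
    double b = vars[h]; // last entry is b; the first h entries are a[0..h-1]
    double total = 0;
    for (int i = 0; i < h; i++) {
        // n[i][j] depends only on the row, so compute it once per row
        double nij = 0.75 / (1 + Math.exp(-b * (hv[i] - vars[i]))) + 25;
        for (int j = 0; j < w; j++) {
            total += m[i][j] * (m[i][j] - nij) * (m[i][j] - nij);
        }
    }
    return total;
}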
Perhaps if someone has done something similar they could point me to a library?
Or give a quick demo of how I could find the optimal values myself?
Thanks very much for reading this,
Erik
I know how to rotate an entire 2D array by 90 degrees around the center (my 2D array lengths are always odd numbers), but I need to find an algorithm that rotates specific indices of a 2D array of known length.
For example, I know that the 2D array is a 17 by 17 grid, and I want the method to rotate the indices [4][5] around the center by 90 degrees and return the new indices as two separate ints (y, x).
Please point me in the right direction, or, if you're feeling charitable, I would very much appreciate some bits of code, preferably in Java. Thanks!
Assuming Cartesian coordinates (i.e. x points right and y points up) and that your coordinates are in the form array[y][x], the center [cx, cy] of your 17x17 grid is [8, 8].
Calculate the offset [dx, dy] of your point [px, py] = [4, 5] from the center, i.e. [-4, -3].
For a clockwise rotation, the new location will be [cx - dy, cy + dx].
If your array uses a Y axis pointing "downwards", then you will need to reverse some of the signs in the formulae.
For a non-geometric solution, consider that the element [0][16] needs to be mapped to [16][16], and [0][0] to [0][16]; i.e. the first row maps to the last column, the second row maps to the second-last column, and so on.
If n is one less than the size of the grid (i.e. 16), that just means that point [y][x] maps to [x][n - y].
In theory, the geometric solution should provide the same answer - here's the equivalence:
n = 17 - 1;
c = n / 2;
dx = x - c;
dy = y - c;
nx = c - dy = c - (y - c) = 2 * c - y = n - y
ny = c + dx = c + (x - c) = x
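A minimal Java sketch of that mapping for an odd-sized square grid (the method name and the {y, x} return convention are my own choices):

// Rotates the index pair [y][x] by 90 degrees around the center of a
// size-by-size grid, using the [y][x] -> [x][n - y] mapping derived above.
static int[] rotateIndex90(int y, int x, int size) {
    int n = size - 1;
    return new int[] { x, n - y }; // {newY, newX}
}

For the question's example, rotateIndex90(4, 5, 17) returns {5, 12}, i.e. [4][5] maps to [5][12].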
If you have a square array with N elements in each row/column, a 90-degree anti-/counter-clockwise turn sends (x, y) to (N + 1 - y, x), doesn't it?
That is, if, like me, you think of the top-left element in a square array as (1,1), with row numbers increasing downwards and column numbers increasing to the right. Someone who counts from 0 will have to adjust the formula somewhat.
The point (x, y) in Cartesian space rotated 90 degrees counterclockwise maps to (-y, x).
An array with N columns and M rows would map to an array of M columns and N rows. The new "x" index will be non-positive, and is made zero-based again by adding M - 1:
a[x][y] maps to a[M - 1 - y][x]
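As a sketch, the same idea applied to a whole zero-based array (a hypothetical helper, rotating counterclockwise):

// Rotates an R x C array 90 degrees counterclockwise into a C x R array.
// The element at (row r, col c) lands at (row C - 1 - c, col r).
static double[][] rotateCCW(double[][] a) {
    int rows = a.length, cols = a[0].length;
    double[][] out = new double[cols][rows];
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            out[cols - 1 - c][r] = a[r][c];
        }
    }
    return out;
}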
I developed a Java program to calculate cosine similarity on the basis of TF*IDF. It works very well, but there is one problem... :(
For example:
If I have the following two matrices and I want to calculate the cosine similarity, it does not work, as the rows are not the same length:
doc 1
1 2 3
4 5 6
doc 2
1 2 3 4 5 6
7 8 5 2 4 9
If the rows and columns are the same length then my program works very well, but it does not when the rows and columns differ in length.
Any tips?
I'm not sure of your implementation, but the cosine distance of two vectors is equal to their normalized dot product.
The dot product of two matrices can be expressed as a · b = a^T b. As a result, if the matrices have different dimensions, you can't take the dot product to identify the cosine.
Now, in a standard TF*IDF approach the terms in your matrix should be indexed by (term, document); as a result, any term not appearing in a document should appear as a zero in your matrix.
The way you have it set up seems to suggest there are two different matrices for your two documents. I'm not sure if this is your intent, but it seems incorrect.
On the other hand, if one of your matrices is supposed to be your query, then it should be a vector and not a matrix, so that the transpose produces the correct result.
A full explanation of TF*IDF follows:
OK, in a classic TF*IDF you construct a term-document matrix a. Each value in matrix a is characterized as a_{i,j}, where i is the term and j is the document. This value is a combination of local, global and normalization weights (although if you normalize your documents, the normalization weight should be 1). Thus a_{i,j} = f_{i,j} * D / d_i, where f_{i,j} is the frequency of word i in document j, D is the total number of documents, and d_i is the number of documents containing term i.
Your query is a vector of terms designated b. Each entry b_{i,q} refers to term i in query q, and b_{i,q} = f_{i,q}, where f_{i,q} is the frequency of term i in query q. In this case each query is a vector, and multiple queries form a matrix.
We can then calculate the unit vectors of each so that when we take the dot product it produces the correct cosine. To achieve the unit vectors we divide both the matrix a and the query b by their Frobenius norms.
Finally we compute the cosine distance by taking the transpose of the vector b for a given query, one query (or vector) per calculation. This is denoted b^T a. The final result is a vector with a score for each document, where a higher score denotes a higher document rank.
Simple Java cosine similarity:

import java.util.HashSet;
import java.util.Map;
import java.util.Set;

static double cosine_similarity(Map<String, Double> v1, Map<String, Double> v2) {
    // Only terms present in both vectors contribute to the dot product,
    // so intersect the key sets (retainAll, not removeAll)
    Set<String> both = new HashSet<>(v1.keySet());
    both.retainAll(v2.keySet());
    double scalar = 0, norm1 = 0, norm2 = 0;
    for (String k : both) scalar += v1.get(k) * v2.get(k);
    for (String k : v1.keySet()) norm1 += v1.get(k) * v1.get(k);
    for (String k : v2.keySet()) norm2 += v2.get(k) * v2.get(k);
    return scalar / Math.sqrt(norm1 * norm2);
}
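A quick hypothetical usage check (Map.of requires Java 9+; the values are made up):

Map<String, Double> d1 = Map.of("cat", 1.0, "dog", 2.0);
Map<String, Double> d2 = Map.of("cat", 2.0, "bird", 1.0);
// Only "cat" is shared: 1.0 * 2.0 / sqrt(5.0 * 5.0) = 0.4
System.out.println(cosine_similarity(d1, d2));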