In the code listed below I am able to correctly compute the sum, product, and transposes of two matrices. I am unsure how to find the cofactor matrix and determinant using the same kind of setup I have for the other operations. Any help would be appreciated. Thanks!
public class MatrixMult {
public MatrixMult(int first[][], int second[][], int m, int n, int p, int q) {
doMatrixMultiply(first, second, m, n, p, q);
}
public void doMatrixMultiply(int first[][], int second[][], int m, int n,
int p, int q) {
if (n != p)
System.out
.println("Matrices with entered orders can't be multiplied with each other.");
else {
int multiply[][] = new int[m][q];
int addition[][] = new int[m][q];
int transpose[][] = new int[m][q];
int transpose2[][] = new int[m][q];
int cofactor[][] = new int[m][q];
int mult = 0;
int sum = 0;
int tran = 0;
int co = 0;
for (int c = 0; c < m; c++) {
for (int d = 0; d < q; d++) {
for (int k = 0; k < p; k++) {
mult = mult + first[c][k] * second[k][d];
}
multiply[c][d] = mult;
mult = 0;
}
}
System.out.println("Product of entered matrices:-");
for (int c = 0; c < m; c++) {
for (int d = 0; d < q; d++)
System.out.print(multiply[c][d] + "\t");
System.out.print("\n");
}
for (int c = 0; c < m; c++) {
for (int d = 0; d < q; d++) {
// Element-wise sum; no inner loop over k is needed
addition[c][d] = first[c][d] + second[c][d];
}
}
System.out.println("Sum of entered matrices:-");
for (int c = 0; c < m; c++) {
for (int d = 0; d < q; d++)
System.out.print(addition[c][d] + "\t");
System.out.print("\n");
}
int c;
int d;
for (c = 0; c < m; c++) {
for (d = 0; d < q; d++)
transpose[d][c] = first[c][d];
}
for (c = 0; c < m; c++) {
for (d = 0; d < q; d++)
transpose2[d][c] = second[c][d];
}
System.out.println("Transpose of first entered matrix:-");
for (c = 0; c < n; c++) {
for (d = 0; d < m; d++)
System.out.print(transpose[c][d] + "\t");
System.out.print("\n");
}
System.out.println("Transpose of second entered matrix:-");
for (c = 0; c < n; c++) {
for (d = 0; d < m; d++)
System.out.print(transpose2[c][d] + "\t");
System.out.print("\n");
}
}
}
}
The following is an implementation of the determinant using your matrix representation, based on the code from the linked question "Determining Cofactor Matrix in Java":
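public int determinant(int[][] result, int rows, int cols) {
// 1x1 base case, needed when cofactor() is called on a 2x2 matrix
if (rows == 1)
return result[0][0];
if (rows == 2)
return result[0][0] * result[1][1] - result[0][1] * result[1][0];
// Rule of Sarrus: sum of wrapped diagonals minus sum of wrapped anti-diagonals.
// This shortcut is only valid for 3x3 matrices; larger ones need cofactor expansion.
int determinant1 = 0, determinant2 = 0;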
for (int i = 0; i < rows; i++) {
int temp = 1, temp2 = 1;
for (int j = 0; j < cols; j++) {
temp *= result[(i + j) % cols][j];
temp2 *= result[(i + j) % cols][rows - 1 - j];
}
determinant1 += temp;
determinant2 += temp2;
}
return determinant1 - determinant2;
}
And for calculating the cofactor matrix, also based on the code from the linked question:
public int[][] cofactor(int[][] matrix, int rows, int cols) {
int[][] result = new int[rows][cols];
for (int i = 0; i < rows; i++) {
for (int j = 0; j < cols; j++) {
result[i][j] = (int) (Math.pow(-1, i + j) * determinant(
removeRowCol(matrix, rows, cols, i, j), rows - 1,
cols - 1));
}
}
return result;
}
public int[][] removeRowCol(int[][] matrix, int rows, int cols,
int row, int col) {
int[][] result = new int[rows - 1][cols - 1];
int k = 0, l = 0;
for (int i = 0; i < rows; i++) {
if (i == row)
continue;
for (int j = 0; j < cols; j++) {
if (j == col)
continue;
result[l][k] = matrix[i][j];
k = (k + 1) % (rows - 1);
if (k == 0)
l++;
}
}
return result;
}
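Note that the diagonal shortcut in determinant above (the rule of Sarrus) only holds for 3x3 matrices. As a rough sketch, assuming square int matrices, a general version could use cofactor (Laplace) expansion along the first row. The class and method names here (LaplaceDeterminant, minor) are illustrative and not taken from the linked code:

```java
// Hypothetical helper class; names are illustrative, not from the linked answer.
final class LaplaceDeterminant {

    // Returns a copy of the square matrix m with row `row` and column `col` removed.
    static int[][] minor(int[][] m, int row, int col) {
        int n = m.length;
        int[][] result = new int[n - 1][n - 1];
        int r = 0;
        for (int i = 0; i < n; i++) {
            if (i == row) continue;
            int c = 0;
            for (int j = 0; j < n; j++) {
                if (j == col) continue;
                result[r][c++] = m[i][j];
            }
            r++;
        }
        return result;
    }

    // Determinant of a square matrix by cofactor expansion along the first row.
    static int determinant(int[][] m) {
        int n = m.length;
        if (n == 1) return m[0][0];
        if (n == 2) return m[0][0] * m[1][1] - m[0][1] * m[1][0];
        int det = 0;
        int sign = 1; // alternates +1, -1, +1, ... across the first row
        for (int j = 0; j < n; j++) {
            det += sign * m[0][j] * determinant(minor(m, 0, j));
            sign = -sign;
        }
        return det;
    }
}
```

For example, LaplaceDeterminant.determinant(new int[][] {{1, 2}, {3, 4}}) evaluates to -2. The recursion takes factorial time, so for large matrices an elimination-based method is usually preferred.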
A simple Google search will find you a lot of examples. E.g. http://mrbool.com/how-to-use-java-for-performing-matrix-operations/26800
I was trying to solve the XOR problem, but the output always converged to 0.5, so I tried a simpler problem, NOT, and the same thing happened.
I really don't know what's going on. I have checked the code many times and everything seems right, but when I debugged it by saving the network's state, I saw that either the weights or the biases were growing very large. I followed the 3Blue1Brown YouTube series on neural networks, along with some other videos.
This is my code:
PS: I included the entire code here, but I think the main problem is inside the backPropag function.
class NeuralNetwork {
int inNum, hiddenLayersNum, outNum, netSize;
int[] hiddenLayerSize;
Matrix[] weights;
Matrix[] biases;
Matrix[] sums;
Matrix[] activations;
Matrix[] error;
Matrix inputs;
long samples = 0;
float learningRate;
//Constructor------------------------------------------------------------------------------------------------------
NeuralNetwork(int inNum, int hiddenLayersNum, int[] hiddenLayerSize, int outNum, float learningRate) {
this.inNum = inNum;
this.hiddenLayersNum = hiddenLayersNum;
this.hiddenLayerSize = hiddenLayerSize;
this.outNum = outNum;
this.netSize = hiddenLayersNum + 1;
this.learningRate = learningRate;
//output layer plus the hidden layer size
//Note: I'm not adding the input layer because it doesn't have weights
weights = new Matrix[netSize];
//no biases added to the output layer
biases = new Matrix[netSize - 1];
sums = new Matrix[netSize];
activations = new Matrix[netSize];
error = new Matrix[netSize];
initializeHiddenLayer();
initializeOutputLayer();
}
//Initializing Algorithms------------------------------------------------------------------------------------------
void initializeHiddenLayer() {
for (int i = 0; i < hiddenLayersNum; i++) {
if (i == 0) {//only the first hidden layer takes the inputs
weights[i] = new Matrix(hiddenLayerSize[i], inNum);
} else {
weights[i] = new Matrix(hiddenLayerSize[i], hiddenLayerSize[i - 1]);
}
biases[i] = new Matrix(hiddenLayerSize[i], 1);
sums[i] = new Matrix(hiddenLayerSize[i], 1);
activations[i] = new Matrix(hiddenLayerSize[i], 1);
error[i] = new Matrix(hiddenLayerSize[i], 1);
}
}
void initializeOutputLayer() {
//the output layer takes the last hidden layer activation values
weights[netSize - 1] = new Matrix(outNum, hiddenLayerSize[hiddenLayerSize.length - 1]);
activations[netSize - 1] = new Matrix(outNum, 1);
sums[netSize - 1] = new Matrix(outNum, 1);
error[netSize - 1] = new Matrix(outNum, 1);
for (Matrix m : weights) {
for (int i = 0; i < m.i; i++) {
for (int j = 0; j < m.j; j++) {
m.values[i][j] = random(-1, 1);
}
}
}
for (Matrix m : biases) {
for (int i = 0; i < m.i; i++) {
for (int j = 0; j < m.j; j++) {
m.values[i][j] = 1;
}
}
}
for (Matrix m : sums) {
for (int i = 0; i < m.i; i++) {
for (int j = 0; j < m.j; j++) {
m.values[i][j] = 0;
}
}
}
}
//Calculation------------------------------------------------------------------------------------------------------
void calculate(float[] inputs) {
this.inputs = new Matrix(0, 0);
this.inputs = this.inputs.arrayToCollumn(inputs);
sums[0] = (weights[0].matrixMult(this.inputs)).sum(biases[0]);
activations[0] = sigM(sums[0]);
for (int i = 1; i < netSize - 1; i++) {
sums[i] = weights[i].matrixMult(activations[i - 1]);
activations[i] = sigM(sums[i]).sum(biases[i]);
}
//there's no biases in the output layer
//And the output layer uses sigmoid function
sums[netSize - 1] = weights[netSize - 1].matrixMult(activations[netSize - 1 - 1]);
activations[netSize - 1] = sigM(sums[netSize - 1]);
}
//Sending outputs--------------------------------------------------------------------------------------------------
Matrix getOuts() {
return activations[netSize - 1];
}
//Backpropagation--------------------------------------------------------------------------------------------------
void calcError(float[] exp) {
Matrix expected = new Matrix(0, 0);
expected = expected.arrayToCollumn(exp);
//E = (output - expected)
error[netSize - 1] = this.getOuts().diff(expected);
samples++;
}
void backPropag(int layer) {
if (layer == netSize - 1) {
error[layer].scalarDiv(samples);
for (int i = layer - 1; i >= 0; i--) {
prevLayerCost(i);
}
weightError(layer);
backPropag(layer - 1);
} else {
weightError(layer);
biasError(layer);
if (layer != 0)
backPropag(layer - 1);
}
}
void weightError(int layer) {
if (layer != 0) {
for (int i = 0; i < weights[layer].i; i++) {
for (int j = 0; j < weights[layer].j; j++) {
float changeWeight = 0;
if (layer != netSize - 1)
changeWeight = activations[layer - 1].values[j][0] * deriSig(sums[layer].values[i][0]) * error[layer].values[i][0];
else
changeWeight = activations[layer - 1].values[j][0] * deriSig(sums[layer].values[i][0]) * error[layer].values[i][0];
weights[layer].values[i][j] += -learningRate * changeWeight;
}
}
} else {
for (int i = 0; i < weights[layer].i; i++) {
for (int j = 0; j < weights[layer].j; j++) {
float changeWeight = this.inputs.values[j][0] * deriSig(sums[layer].values[i][0]) * error[layer].values[i][0];
weights[layer].values[i][j] += -learningRate * changeWeight;
}
}
}
}
void biasError(int layer) {
for (int i = 0; i < biases[layer].i; i++) {
for (int j = 0; j < biases[layer].j; j++) {
float changeBias = 0;
if (layer != netSize - 1)
changeBias = deriSig(sums[layer].values[i][0]) * error[layer].values[i][0];
biases[layer].values[i][j] += -learningRate * changeBias;
}
}
}
void prevLayerCost(int layer) {
for (int i = 0; i < activations[layer].i; i++) {
for (int j = 0; j < activations[layer + 1].j; j++) {//for all conections of that neuron to the next layer
if (layer != netSize - 1)
error[layer].values[i][0] += weights[layer + 1].values[j][i] * deriSig(sums[layer + 1].values[j][0]) * error[layer + 1].values[j][0];
else
error[layer].values[i][0] += weights[layer + 1].values[j][i] * deriSig(sums[layer + 1].values[j][0]) * error[layer + 1].values[j][0];
}
}
}
//Activation Functions---------------------------------------------------------------------------------------------
Matrix reLUM(Matrix m) {
Matrix temp = m.copyM();
for (int i = 0; i < temp.i; i++) {
for (int j = 0; j < temp.j; j++) {
temp.values[i][j] = ReLU(m.values[i][j]);
}
}
return temp;
}
float ReLU(float x) {
return max(0, x);
}
float deriReLU(float x) {
if (x <= 0)
return 0;
else
return 1;
}
Matrix sigM(Matrix m) {
Matrix temp = m.copyM();
for (int i = 0; i < temp.i; i++) {
for (int j = 0; j < temp.j; j++) {
temp.values[i][j] = sig(m.values[i][j]);
}
}
return temp;
}
float sig(float x) {
return 1 / (1 + exp(-x));
}
float deriSig(float x) {
return sig(x) * (1 - sig(x));
}
//Saving Files-----------------------------------------------------------------------------------------------------
void SaveNeuNet() {
for (int i = 0; i < weights.length; i++) {
weights[i].saveM("weights\\weightLayer" + i);
}
for (int i = 0; i < biases.length; i++) {
biases[i].saveM("biases\\biasLayer" + i);
}
for (int i = 0; i < activations.length; i++) {
activations[i].saveM("activations\\activationLayer" + i);
}
for (int i = 0; i < error.length; i++) {
error[i].saveM("errors\\errorLayer" + i);
}
}
}
and this is the Matrix code:
class Matrix {
int i, j, size;
float[][] values;
Matrix(int i, int j) {
this.i = i;
this.j = j;
this.size = i * j;
values = new float[i][j];
}
Matrix sum (Matrix other) {
if (other.i == this.i && other.j == this.j) {
for (int x = 0; x < this.i; x++) {
for (int z = 0; z < this.j; z++) {
values[x][z] += other.values[x][z];
}
}
return this;
}
return null;
}
Matrix diff(Matrix other) {
if (other.i == this.i && other.j == this.j) {
for (int x = 0; x < this.i; x++) {
for (int z = 0; z < this.j; z++) {
values[x][z] -= other.values[x][z];
}
}
return this;
}
return null;
}
Matrix scalarMult(float k) {
for (int i = 0; i < this.i; i++) {
for (int j = 0; j < this.j; j++) {
values[i][j] *= k;
}
}
return this;
}
Matrix scalarDiv(float k) {
if (k != 0) {
for (int i = 0; i < this.i; i++) {
for (int j = 0; j < this.j; j++) {
values[i][j] /= k;
}
}
return this;
} else
return null;
}
Matrix matrixMult(Matrix other) {
if (this.j != other.i)
return null;
else {
Matrix temp = new Matrix(this.i, other.j);
for (int i = 0; i < temp.i; i++) {
for (int j = 0; j < temp.j; j++) {
for (int k = 0; k < this.j; k++) {
temp.values[i][j] += this.values[i][k] * other.values[k][j];
}
}
}
return temp;
}
}
Matrix squaredValues(){
for (int i = 0; i < this.i; i++){
for (int j = 0; j < this.j; j++){
values[i][j] = sq(values[i][j]);
}
}
return this;
}
void printM() {
for (int x = 0; x < this.i; x++) {
print("| ");
for (int z = 0; z < this.j; z++) {
print(values[x][z] + " | ");
}
println();
}
}
void saveM(String name) {
String out = "";
for (int x = 0; x < this.i; x++) {
out += "| ";
for (int z = 0; z < this.j; z++) {
out += values[x][z] + " | ";
}
out += "\n";
}
saveStrings("outputs\\" + name + ".txt", new String[] {out});
}
Matrix arrayToCollumn(float[] array) {
Matrix temp = new Matrix(array.length, 1);
for (int i = 0; i < array.length; i++)
temp.values[i][0] = array[i];
return temp;
}
Matrix arrayToLine(float[] array) {
Matrix temp = new Matrix(1, array.length);
for (int j = 0; j < array.length; j++)
temp.values[0][j] = array[j];
return temp;
}
Matrix copyM(){
Matrix temp = new Matrix(i, j);
for (int i = 0; i < this.i; i++){
for (int j = 0; j < this.j; j++){
temp.values[i][j] = this.values[i][j];
}
}
return temp;
}
}
As I said, the outputs always converge to 0.5 instead of the expected value of 1 or 0.
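I rewrote the code and it is working now! I never pinned down the exact bug in the old version, but comparing the two, the rewrite differs in a few ways that may matter: in hidden layers after the first, the biases are now added before the sigmoid rather than after it; the output error is computed on a copy of the activations (Matrix.diff modifies its receiver in place); the error matrices are reset after every training step instead of accumulating; and the error loop over the next layer runs over its rows (.i) rather than its columns (.j). It also relies on two Matrix methods not shown above, randomize and reset. This version works: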
class NeuralNetwork {
int netSize;
float learningRate;
Matrix[] weights;
Matrix[] biases;
Matrix[] activations;
Matrix[] sums;
Matrix[] errors;
NeuralNetwork(int inNum, int hiddenNum, int[] hiddenLayerSize, int outNum, float learningRate) {
netSize = hiddenNum + 1;
this.learningRate = learningRate;
weights = new Matrix[netSize];
biases = new Matrix[netSize - 1];
activations = new Matrix[netSize];
sums = new Matrix[netSize];
errors = new Matrix[netSize];
initializeMatrices(inNum, hiddenNum, hiddenLayerSize, outNum);
}
//INITIALIZING MATRICES
void initializeMatrices(int inNum, int hiddenNum, int[] layerSize, int outNum) {
for (int i = 0; i < hiddenNum; i++) {
if (i == 0)
weights[i] = new Matrix(layerSize[0], inNum);
else
weights[i] = new Matrix(layerSize[i], layerSize[i - 1]);
biases[i] = new Matrix(layerSize[i], 1);
activations[i] = new Matrix(layerSize[i], 1);
errors[i] = new Matrix(layerSize[i], 1);
sums[i] = new Matrix(layerSize[i], 1);
weights[i].randomize(-1, 1);
biases[i].randomize(-1, 1);
activations[i].randomize(-1, 1);
}
weights[netSize - 1] = new Matrix(outNum, layerSize[layerSize.length - 1]);
activations[netSize - 1] = new Matrix(outNum, 1);
errors[netSize - 1] = new Matrix(outNum, 1);
sums[netSize - 1] = new Matrix(outNum, 1);
weights[netSize - 1].randomize(-1, 1);
activations[netSize - 1].randomize(-1, 1);
}
//---------------------------------------------------------------------------------------------------------------
void forwardPropag(float[] ins) {
Matrix inputs = new Matrix(0, 0);
inputs = inputs.arrayToCollumn(ins);
sums[0] = (weights[0].matrixMult(inputs)).sum(biases[0]);
activations[0] = sigM(sums[0]);
for (int i = 1; i < netSize - 1; i++) {
sums[i] = (weights[i].matrixMult(activations[i - 1])).sum(biases[i]);
activations[i] = sigM(sums[i]);
}
//output layer does not have biases
sums[netSize - 1] = weights[netSize - 1].matrixMult(activations[netSize - 2]);
activations[netSize - 1] = sigM(sums[netSize - 1]);
}
Matrix predict(float[] inputs) {
forwardPropag(inputs);
return activations[netSize - 1].copyM();
}
//SUPERVISED LEARNING - BACKPROPAGATION
void train(float[] inps, float[] expec) {
Matrix expected = new Matrix(0, 0);
expected = expected.arrayToCollumn(expec);
errors[netSize - 1] = predict(inps).diff(expected);
calcErorrPrevLayers();
adjustWeights(inps);
adjustBiases();
for (Matrix m : errors){
m.reset();
}
}
void calcErorrPrevLayers() {
for (int l = netSize - 2; l >= 0; l--) {
for (int i = 0; i < activations[l].i; i++) {
for (int j = 0; j < activations[l + 1].i; j++) {
errors[l].values[i][0] += weights[l + 1].values[j][i] * dSig(sums[l + 1].values[j][0]) * errors[l + 1].values[j][0];
}
}
}
}
void adjustWeights(float[] inputs) {
for (int l = 0; l < netSize; l++) {
if (l == 0) {
//for ervery neuron n in the first layer
for (int n = 0; n < activations[l].i; n++) {
//for every weight w of the first layer
for (int w = 0; w < inputs.length; w++) {
float weightChange = inputs[w] * dSig(sums[l].values[n][0]) * errors[l].values[n][0];
weights[l].values[n][w] += -learningRate * weightChange;
}
}
} else {
//for ervery neuron n in the first layer
for (int n = 0; n < activations[l].i; n++) {
//for every weight w of the first layer
for (int w = 0; w < activations[l - 1].i; w++) {
float weightChange = activations[l - 1].values[w][0] * dSig(sums[l].values[n][0]) * errors[l].values[n][0];
weights[l].values[n][w] += -learningRate * weightChange;
}
}
}
}
}
void adjustBiases() {
for (int l = 0; l < netSize - 1; l++) {
//for ervery neuron n in the first layer
for (int n = 0; n < activations[l].i; n++) {
float biasChange = dSig(sums[l].values[n][0]) * errors[l].values[n][0];
biases[l].values[n][0] += -learningRate * biasChange;
}
}
}
//ACTIVATION FUNCTION
float sig(float x) {
return 1 / (1 + exp(-x));
}
float dSig(float x) {
return sig(x) * (1 - sig(x));
}
Matrix sigM(Matrix m) {
Matrix temp = m.copyM();
for (int i = 0; i < m.i; i++) {
for (int j = 0; j < m.j; j++) {
temp.values[i][j] = sig(m.values[i][j]);
}
}
return temp;
}
}
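The rewritten network calls randomize and reset on Matrix, which the Matrix class shown earlier does not define, and it relies on diff being applied to a copy. A minimal sketch of what those helpers might look like follows. MiniMatrix is a hypothetical stand-in, and java.util.Random is used here in place of Processing's random(), so treat the details as assumptions rather than the author's actual code:

```java
import java.util.Random;

// Hypothetical stand-in for the Matrix class; randomize and reset are assumed helpers.
final class MiniMatrix {
    final int rows, cols;
    final float[][] values;
    private static final Random RNG = new Random();

    MiniMatrix(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.values = new float[rows][cols];
    }

    // Fills every entry with a uniform random value between lo and hi.
    MiniMatrix randomize(float lo, float hi) {
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                values[r][c] = lo + RNG.nextFloat() * (hi - lo);
        return this;
    }

    // Sets every entry back to zero, e.g. to clear accumulated errors.
    MiniMatrix reset() {
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                values[r][c] = 0f;
        return this;
    }

    // Returns an independent copy, so later in-place operations leave this untouched.
    MiniMatrix copy() {
        MiniMatrix m = new MiniMatrix(rows, cols);
        for (int r = 0; r < rows; r++)
            m.values[r] = values[r].clone();
        return m;
    }

    // In-place subtraction, like diff in the question: modifies and returns this.
    MiniMatrix diff(MiniMatrix other) {
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                values[r][c] -= other.values[r][c];
        return this;
    }
}
```

With this shape, output.copy().diff(expected) yields the error while leaving the stored activations intact, whereas output.diff(expected) would overwrite them.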
I am trying to create a program that returns the largest square submatrix of 1's from a square matrix of 0's and 1's. So far I have figured out how to break the matrix up into square submatrices, one starting at each entry that equals 1. The problem is that as the program gets farther from the top-left corner of the matrix, it suddenly goes out of bounds, which I suspect has to do with how it calculates where each submatrix starts.
Here is my code:
public static void main(String[] args) {
Scanner input = new Scanner(System.in);
System.out.print("Enter the number of rows and columns in the matrix (only one input, this is a square matrix): ");
int dimensions = input.nextInt();
int[][] matrix = new int[dimensions][dimensions];
for (int i = 0; i < matrix.length; i++) {
for (int j = 0; j < matrix[i].length; j++) {
int n = input.nextInt();
if (n == 0 || n == 1)
matrix[i][j] = n;
else
System.out.print("Input only 0 or 1");
}
}
int[] largestBlock = findLargestBlock(matrix);
}
public static int[] findLargestBlock(int[][] m) {
int[] solution = new int[3];
//find rows with most consecutive 1's, then find columns with the same # of consecutive 1's
for (int i = 0; i < m.length; i++) {
for (int j = 0; j < m[i].length; j++) {
//"origin" for each iteration is (i, j)
if (m[i][j] == 1)
if (isSquare(m, i, j) == true) {
solution[0] = i; solution[1] = j; solution[2] = getSize(m, i, j);
}
}
}
return solution;
}
public static boolean isSquare(int[][] m, int i, int j) {
int k = m.length - i;
if (m[0].length - j < k)
k = m.length - j;
if (k < 2)
return false;
int[][] testSquare = new int[k][k];
for (int y = i; y < m.length - i; y++) {
for (int x = j; x < m[i].length - j; x++) {
testSquare[y - i][x - j] = m[y][x];
}
}
for (int y = 0; y < testSquare.length; y++) {
for (int x = 1; x < testSquare[y].length; x++) {
if (testSquare[y][x] != testSquare[y][x - 1])
return false;
}
}
for (int x = 0; x < testSquare[0].length; x++) {
for (int y = 1; y < testSquare.length; y++) {
if (testSquare[y][x] != testSquare[y - 1][x])
return false;
}
}
return true;
}
public static int getSize(int[][] m, int i, int j) {
int k = m.length - i;
if (m[0].length - j < k)
k = m.length - j;
return k;
}
I determined that this part of the program was causing the issue. Apparently there is some flaw in it that sends the x or y index out of bounds:
public static boolean isSquare(int[][] m, int i, int j) {
int k = m.length - i;
if (m[0].length - j < k)
k = m.length - j;
if (k < 2)
return false;
int[][] testSquare = new int[k][k];
for (int y = i; y < m.length - i; y++) {
for (int x = j; x < m[i].length - j; x++) {
**testSquare[y - i][x - j] = m[y][x];**
}
}
I'm very confused about the line marked with asterisks (bold), as I think it is the one causing the issue. However, I'm not sure how it's causing it.
I think this is the loop you are looking for. Since testSquare is k-by-k, iterate over its own indices from 0 to k - 1 and derive the indices into m from them. Because k is the smaller of the remaining rows and columns, m is only read from (i, j) up to (i + k - 1, j + k - 1), which stays in bounds.
if (m[i].length - j < k)
k = m[i].length - j;
for (int y = 0; y < k; y++) {
for (int x = 0; x < k; x++) {
testSquare[y][x] = m[i+y][j+x];
}
}
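The same index arithmetic can also be used to test a block in place, without copying it into testSquare first. The following is only a sketch under that assumption: the class SquareBlocks and the method isAllOnes are hypothetical names, and findLargestBlock is a simplified brute-force variant rather than the asker's method:

```java
// Hypothetical helper class; a brute-force sketch, not the asker's original approach.
final class SquareBlocks {

    // True if the k-by-k block with top-left corner (i, j) lies inside m and holds only 1s.
    static boolean isAllOnes(int[][] m, int i, int j, int k) {
        if (k < 1 || i + k > m.length || j + k > m[i].length) return false;
        for (int y = 0; y < k; y++)
            for (int x = 0; x < k; x++)
                if (m[i + y][j + x] != 1) return false;
        return true;
    }

    // Returns {row, col, size} of the largest all-1 square, or {0, 0, 0} if there is none.
    static int[] findLargestBlock(int[][] m) {
        int[] best = new int[3];
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                // Largest square that still fits when anchored at (i, j)
                int maxK = Math.min(m.length - i, m[i].length - j);
                // Only sizes larger than the current best are worth checking
                for (int k = best[2] + 1; k <= maxK; k++) {
                    if (!isAllOnes(m, i, j, k)) break;
                    best[0] = i;
                    best[1] = j;
                    best[2] = k;
                }
            }
        }
        return best;
    }
}
```

Because every index into m is computed as i + y or j + x with y and x below k, and k never exceeds the rows and columns remaining from (i, j), the reads cannot leave the matrix.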
I am trying to iterate through a 2D array to find a row whose sum equals the sum of two other rows in the same array.
I am having a hard time figuring out how to compare the sums before sum2 and sum3 are reset to zero:
* for sum2: its sum will be just the sum at row (n-1), same as for sum3
* Just need to find a way to compare before resetting sum2 and sum3 to zero
boolean compare(int n, int [][] A)
{
int i, j, k, x, y, p, sum, sum2, sum3, total;
//row
for ( i = 0; i < n; i++)
{
sum = 0;
//col
for ( j = 0; j < n; j++)
sum+= A[i][j];
//row
for ( k = 0; k < n; k++)
{
sum2 = 0;
//col
if (k != i)
for ( x = 0; x < n; x++)
sum2 += A[k][x];
}
for ( y = 0; y < n; y++)
{
sum3 = 0;
if ( (y != k) && (y != i) )
for ( p = 0; p < n; p++)
sum3 += A[y][p];
}
total = sum2 + sum3;
if ( sum == (total) )
return true;
}//for ( i = 0; i < n; i++)
return false;
}
Any input is greatly appreciated
**** Here we go, I updated my code as below:
boolean compare(int n, int [][] A)
{
int i, j, k, x, y, sum;
int [] sumArray = new int[n];
for (i = 0; i < n; i++)
{
sum = 0;
for(j = 0; j < n; j++)
sum += A[i][j];
sumArray[i] = sum;
}
for ( k = 0; k < n; k++)
{
for(x = 0; x < n; x++)
{
if( x != k)
{
for(y = 0; y < n; y++)
{
if( (y != x) && (y != k) )
{
if( sumArray[k] == (sumArray[x] + sumArray[y]) )
return true;
}
}
}
}
}
return false;
}
Seems like it would be easier to calculate the sum of each row and put them in a 1D array. Then you can compare sums of each row in a more concise way and you also avoid computing the sum of each row more than once.
Also, the parameter int n is not needed for the compare() method, since you can just check the length property of the array that gets passed in.
public boolean compare(int[][] arr) {
final int rowLen = arr.length;
int[] sums = new int[rowLen];
// Compute sum of each row
for (int row = 0; row < rowLen; row++) {
int rowSum = 0;
int[] rowArr = arr[row];
for (int col = 0; col < rowArr.length; col++)
rowSum += rowArr[col];
sums[row] = rowSum;
}
// Check if row n equals the sum of any other 2 rows
for (int n = 0; n < sums.length; n++) {
for (int i = 0; i < sums.length; i++) {
for (int j = i + 1; j < sums.length; j++)
if (n != i && n != j && sums[n] == sums[i] + sums[j]) {
// sum of row n equals sums of rows i+j
System.out.println("Sum of row " + n + " is equal to the sums of rows " + i + " and " + j);
return true;
}
}
}
return false;
}
Disclaimer: untested code, but it gets my point across.
This is the code for my sorting attempt. After running for about ten minutes in Eclipse's debug mode, it threw a StackOverflowError. This was the output:
Exception in thread "main" java.lang.StackOverflowError
at TestSorter.Tester.sort(Tester.java:6)
... (x112 repetitions of at TestSorter.Tester.sort(Tester.java:49))
at TestSorter.Tester.sort(Tester.java:49)
public static int[] sort(int[] a) {
int prod = (a.length)/2, b = lessThan(a, prod), c = greaterThan(a, prod), d = equalTo(a, prod);
int[] first, last, mid;
first = new int[b];
last = new int[c];
mid = new int[d];
int[] fina = new int[a.length];
int f = 0, l = 0, m = 0;
if (isSorted(a))
return a;
for (int x = 0; x < a.length; x++) {
if (a[x] < prod) {
first[f] = a[x];
f++;
}
else if (a[x] > prod) {
last[l] = a[x];
l++;
}
else if (a[x] == prod) {
mid[m] = a[x];
m++;
}
}
if (m == a.length)
return a;
first = sort(first);
last = sort(last);
for (int x = 0; x < b; x++) {
fina[x] += first[x];
}
for (int x = 0; x < d; x++) {
fina[x + b] = mid[x];
}
for (int x = 0; x < c; x++) {
fina[x + b + c] = last[x];
}
return fina;
}
My support methods are as follows:
private static int lessThan(int[] a, int prod) {
int less = 0;
for (int x = 0; x < a.length; x++) {
if (a[x] < prod) {
less++;
}
}
return less;
}
private static int greaterThan(int[] a, int prod) {
int greater = 0;
for (int x = 0; x < a.length; x++) {
if (a[x] > prod) {
greater++;
}
}
return greater;
}
private static int equalTo(int[] a, int prod) {
int equal = 0;
for (int x = 0; x < a.length; x++) {
if (a[x] == prod) {
equal++;
}
}
return equal;
}
private static boolean isSorted(int[] a) {
for (int x = 0; x < a.length - 1; x++) {
if (a[x] > a[x + 1])
return false;
}
return true;
}
Presumably the trouble is that your "prod" is not within the domain of your array. Thus either "first" or "last" is the same size as the input array, and you have an infinite recursion. Try setting prod to be an element in the array you are trying to sort.
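To make the point above concrete, here is a small sketch that checks whether a given pivot leaves one partition as large as the input, which is the situation where the recursion cannot make progress. PivotDemo and partitionStalls are hypothetical names introduced only for illustration:

```java
// Hypothetical helper class illustrating why the pivot must come from the array.
final class PivotDemo {

    // Number of elements strictly less than pivot.
    static int countLess(int[] a, int pivot) {
        int n = 0;
        for (int v : a) if (v < pivot) n++;
        return n;
    }

    // Number of elements strictly greater than pivot.
    static int countGreater(int[] a, int pivot) {
        int n = 0;
        for (int v : a) if (v > pivot) n++;
        return n;
    }

    // True if one side of the split is as large as a itself,
    // so recursing on that side would repeat the same call forever.
    static boolean partitionStalls(int[] a, int pivot) {
        return a.length > 0
                && (countLess(a, pivot) == a.length || countGreater(a, pivot) == a.length);
    }
}
```

For the array {30, 20, 10}, the pivot a.length / 2 is 1, every element is greater than it, and the split stalls. With a[a.length / 2], which is 20, at least one element equals the pivot, so neither side can contain the whole array.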
THREE POINTS:
1. prod should be the middle element of the array, NOT half of the array's length.
So it should be prod = a[(a.length) / 2],
NOT prod = (a.length) / 2
2. If the array first has only one element, there is no need to call sort on it again.
The same goes for last.
So, add an if statement:
if (1 < first.length) {
first = sort(first);
}
3. When you append the elements of last to fina, the index should be x + b + d, that is, offset by the first elements (b) plus the mid elements (d), NOT x + b + c.
So, change fina[x + b + c] = last[x]; to fina[x + b + d] = last[x];
With these changes, the sort method might look like this:
public static int[] sort(int[] a) {
int prod =a[(a.length) / 2], b = lessThan(a, prod), c = greaterThan(a,
prod), d = equalTo(a, prod);
int[] first, last, mid;
first = new int[b];
last = new int[c];
mid = new int[d];
int[] fina = new int[a.length];
int f = 0, l = 0, m = 0;
if (isSorted(a) )
return a;
for (int x = 0; x < a.length; x++) {
if (a[x] < prod) {
first[f] = a[x];
f++;
} else if (a[x] > prod) {
last[l] = a[x];
l++;
} else if (a[x] == prod) {
mid[m] = a[x];
m++;
}
}
if (m == a.length)
return a;
if (1 < first.length) {
first = sort(first);
}
if (1 < last.length) {
last = sort(last);
}
for (int x = 0; x < b; x++) {
fina[x] += first[x];
}
for (int x = 0; x < d; x++) {
fina[x + b] = mid[x];
}
for (int x = 0; x < c; x++) {
fina[x + b + d] = last[x];
}
return fina;
}
I am trying to train a two-state hidden Markov model with a scaled Baum-Welch algorithm, but I noticed that when my emission sequence is too short, my probabilities turn to NaN in Java. Is this normal? I have posted my Java code below:
import java.util.ArrayList;
/*
Scaled Baum-Welch Algorithm implementation
author: Ricky Chang
*/
public class HMModeltest {
public static double[][] stateTransitionMatrix = new double[2][2]; // State Transition Matrix
public static double[][] emissionMatrix; // Emission Probability Matrix
public static double[] pi = new double[2]; // Initial State Distribution
double[] scaler; // This is used for scaling to prevent underflow
private static int emissions_id = 1; // To identify if the emissions are for the price changes or spread changes
private static int numEmissions = 0; // The amount of emissions
private static int numStates = 2; // The number of states in hmm
public static double improvementVar; // Used to assess how much the model has improved
private static double genState; // Generated state, it is used to generate observations below
// Create an ArrayList to store the emissions
public static ArrayList<Integer> eSequence = new ArrayList<Integer>();
// Initialize H, emission_id: 1 is price change, 2 are spreads; count is for the amount of different emissions
public HMModeltest(int id, int count){
emissions_id = id;
numEmissions = count;
stateTransitionMatrix = set2DValues(numStates,numStates); // Give the STM row stochastic values
emissionMatrix = new double[numStates][numEmissions];
emissionMatrix = set2DValues(numStates,numEmissions); // Give the Emission probability matrix row stochastic values
pi = set1DValues(numStates); // Give the initial matrix row stochastic values
}
// Categorize the price change emissions; I may want to put these in the Implementation.
private int identifyE1(double e){
if( e == 0) return 4;
if( e > 0){
if(e == 1) return 5;
else if(e == 3) return 6;
else if(e == 5) return 7;
else return 8;
}
else{
if(e == -1) return 3;
else if(e == -3) return 2;
else if(e == -5) return 1;
else return 0;
}
}
// Categorize the spread emissions
private int identifyE2(double e){
if(e == 1) return 0;
else if(e == 3) return 1;
else return 2;
}
public void updateE(int emission){
if(emissions_id == 1) eSequence.add( identifyE1(emission) );
else eSequence.add( identifyE2(emission) );
}
// Used to initialize random row stochastic values to vectors
private double[] set1DValues(int col){
double sum = 0;
double temp = 0;
double [] returnVector = new double[col];
for(int i = 0; i < col; i++){
temp = Math.round(Math.random() * 1000);
returnVector[i] = temp;
sum = sum + temp;
}
for(int i = 0; i < col; i++){
returnVector[i] = returnVector[i] / sum;
}
return returnVector;
}
// Used to initialize random row stochastic values to matrices
public double[][] set2DValues(int row, int col){
double sum = 0;
double temp = 0;
double[][] returnMatrix = new double[row][col];
for(int i = 0; i < row; i++){
for(int j = 0; j < col; j++){
temp = Math.round(Math.random() * 1000);
returnMatrix[i][j] = temp;
sum = sum + temp;
}
for(int j = 0; j < col; j++){
returnMatrix[i][j] = returnMatrix[i][j] / sum;
}
sum = 0;
}
return returnMatrix;
}
// Use forward algorithm to calculate alpha for all states and times
public double[][] forwardAlgo(int time){
double alpha[][] = new double[numStates][time];
scaler = new double[time];
// Initialize alpha for time 0
scaler[0] = 0; // c0 is for scaling purposes to avoid underflow
for(int i = 0; i < numStates; i ++){
alpha[i][0] = pi[i] * emissionMatrix[i][eSequence.get(0)];
scaler[0] = scaler[0] + alpha[i][0];
}
// Scale alpha_0
scaler[0] = 1 / scaler[0];
for(int i = 0; i < numStates; i++){
alpha[i][0] = scaler[0] * alpha[i][0];
}
// Use recursive method to calculate alpha
double tempAlpha = 0;
for(int t = 1; t < time; t++){
scaler[t] = 0;
for(int i = 0; i < numStates; i++){
for(int j = 0; j < numStates; j++){
tempAlpha = tempAlpha + alpha[j][t-1] * stateTransitionMatrix[j][i];
}
alpha[i][t] = tempAlpha * emissionMatrix[i][eSequence.get(t)];
scaler[t] = scaler[t] + alpha[i][t];
tempAlpha = 0;
}
scaler[t] = 1 / scaler[t];
for(int i = 0; i < numStates; i++){
alpha[i][t] = scaler[t] * alpha[i][t];
}
}
System.out.format("scaler: ");
for(int t = 0; t < time; t++){
System.out.format("%f, ", scaler[t]);
}
System.out.print('\n');
return alpha;
}
// Use backward algorithm to calculate beta for all states
public double[][] backwardAlgo(int time){
double beta[][] = new double[numStates][time];
// Initialize beta for current time
for(int i = 0; i < numStates; i++){
beta[i][time-1] = scaler[time-1];
}
// Use recursive method to calculate beta
double tempBeta = 0;
for(int t = time-2; t >= 0; t--){
for(int i = 0; i < numStates; i++){
for(int j = 0; j < numStates; j++){
tempBeta = tempBeta + (stateTransitionMatrix[i][j] * emissionMatrix[j][eSequence.get(t+1)] * beta[j][t+1]);
}
beta[i][t] = tempBeta;
beta[i][t] = scaler[t] * beta[i][t];
tempBeta = 0;
}
}
return beta;
}
// Calculate the probability of emission sequence given the model (it is also the denominator to calculate gamma and digamma)
public double calcP(int t, double[][] alpha, double[][] beta){
double p = 0;
for(int i = 0; i < numStates; i++){
for(int j = 0; j < numStates; j++){
p = p + (alpha[i][t] * stateTransitionMatrix[i][j] * emissionMatrix[j][eSequence.get(t+1)] * beta[j][t+1]);
}
}
return p;
}
// Calculate digamma; i and j are both states
public double calcDigamma(double p, int t, int i, int j, double[][] alpha, double[][] beta){
double digamma = (alpha[i][t] * stateTransitionMatrix[i][j] * emissionMatrix[j][eSequence.get(t+1)] * beta[j][t+1]) / p;
return digamma;
}
public void updatePi(double[][] gamma){
for(int i = 0; i < numStates; i++){
pi[i] = gamma[i][0];
}
}
public void updateAll(){
int time = eSequence.size();
double alpha[][] = forwardAlgo(time);
double beta[][] = backwardAlgo(time);
double initialp = calcLogEProb(time);
double nextState0, nextState1;
double p = 0;
double[][][] digamma = new double[numStates][numStates][time];
double[][] gamma = new double[numStates][time];
for(int t = 0; t < time-1; t++){
p = calcP(t, alpha, beta);
for(int i = 0; i < numStates; i++){
gamma[i][t] = 0;
for(int j = 0; j < numStates; j++){
digamma[i][j][t] = calcDigamma(p, t, i, j, alpha, beta);
gamma[i][t] = gamma[i][t] + digamma[i][j][t];
}
}
}
updatePi(gamma);
updateA(digamma, gamma);
updateB(gamma);
// Re-run the forward pass so scaler[] reflects the updated parameters
alpha = forwardAlgo(time);
double postp = calcLogEProb(time);
improvementVar = postp - initialp;
}
// Update the state transition matrix
public void updateA(double[][][] digamma, double[][] gamma){
int time = eSequence.size();
double num = 0;
double denom = 0;
for(int i = 0; i < numStates; i++){
for(int j = 0; j < numStates; j++){
for(int t = 0; t < time-1; t++){
num = num + digamma[i][j][t];
denom = denom + gamma[i][t];
}
stateTransitionMatrix[i][j] = num/denom;
num = 0;
denom = 0;
}
}
}
public void updateB(double[][] gamma){
int time = eSequence.size();
double num = 0;
double denom = 0;
// k is an observation, i is a state, t is time
for(int i = 0; i < numStates; i++){
for(int k = 0; k < numEmissions; k++){
for(int t = 0; t < time-1; t++){
if( eSequence.get(t) == k) num = num + gamma[i][t];
denom = denom + gamma[i][t];
}
emissionMatrix[i][k] = num/denom;
num = 0;
denom = 0;
}
}
}
public double calcLogEProb(int time){
double logProb = 0;
for(int t = 0; t < time; t++){
logProb = logProb + Math.log(scaler[t]);
}
return -logProb;
}
public double calcNextState(int time, int state, double[][] gamma){
double p = 0;
for(int i = 0; i < numStates; i++){
for(int j = 0; j < numStates; j++){
p = p + gamma[i][time-2] * stateTransitionMatrix[i][j] * stateTransitionMatrix[j][state];
}
}
return p;
}
// Print parameters
public void print(){
System.out.println("Pi:");
System.out.print('[');
for(int i = 0; i < 2; i++){
System.out.format("%f, ", pi[i]);
}
System.out.print(']');
System.out.print('\n');
System.out.println("A:");
for(int i = 0; i < 2; i++){
System.out.print('[');
for(int j = 0; j < 2; j++){
System.out.format("%f, ", stateTransitionMatrix[i][j]);
}
System.out.print(']');
System.out.print('\n');
}
System.out.println("B:");
for(int i = 0; i < 2; i++){
System.out.print('[');
for(int j = 0; j < 9; j++){
System.out.format("%f, ", emissionMatrix[i][j]);
}
System.out.print(']');
System.out.print('\n');
}
System.out.print('\n');
}
/* Generate sample data to test HMM training with the following params:
* A: [ .3, .7 ]
*    [ .8, .2 ]
* B: [ .45 .1 .08 .05 .03 .02 .05 .2 .02 ]
*    [ .36 .02 .06 .15 .04 .05 .2 .1 .02 ]
* With these as observations: {-10, -5, -3, -1, 0, 1, 3, 5, 10}
*/
public static int sampleDataGen(){
double rand = 0;
rand = Math.random();
if(genState == 1){
if(rand < .3) genState = 1;
else genState = 2;
rand = Math.random();
if(rand < .45) return -10;
else if(rand < .55) return -5;
else if(rand < .63) return -3;
else if(rand < .68) return -1;
else if(rand < .71) return 0;
else if(rand < .73) return 1;
else if(rand < .78) return 3;
else if(rand < .98) return 5;
else return 10;
}
else {
if(rand < .8) genState = 1;
else genState = 2;
rand = Math.random();
if(rand < .36) return -10;
else if(rand < .38) return -5;
else if(rand < .44) return -3;
else if(rand < .59) return -1;
else if(rand < .63) return 0;
else if(rand < .68) return 1;
else if(rand < .88) return 3;
else if(rand < .98) return 5;
else return 10;
}
}
public static void main(String[] args){
HMModeltest test = new HMModeltest(1,9);
test.print();
System.out.print('\n');
for(int i = 0; i < 20; i++){
test.updateE(sampleDataGen());
}
test.updateAll();
System.out.print('\n');
test.print();
System.out.print('\n');
for(int i = 0; i < 10; i++){
test.updateE(sampleDataGen());
}
test.updateAll();
System.out.print('\n');
test.print();
System.out.print('\n');
}
}
My guess is that because the sample is so small, some observations never occur, so their probabilities don't exist. It would be nice to have some confirmation.
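One way to check this guess, as a rough sketch: in updateB above, num only grows when symbol k actually occurs in eSequence, so a symbol that never occurs ends up with emission probability 0 in every state. If that symbol is observed later, every alpha[i][t] is 0, the scaling sum is 0, and 1 / 0.0 followed by a multiplication by 0 gives NaN. The class and method names below are made up for illustration, and the sequence is assumed to hold symbol indices in the range 0 to numEmissions - 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class UnseenSymbols {
    // Returns the symbol indices in [0, numEmissions) that never occur in the sequence.
    static List<Integer> unseen(List<Integer> sequence, int numEmissions) {
        boolean[] seen = new boolean[numEmissions];
        for (int symbol : sequence) {
            seen[symbol] = true;
        }
        List<Integer> missing = new ArrayList<>();
        for (int k = 0; k < numEmissions; k++) {
            if (!seen[k]) {
                missing.add(k);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical short sequence of observation indices (9 possible symbols).
        List<Integer> sequence = Arrays.asList(0, 0, 7, 3, 0, 6, 7, 1);
        System.out.println("Unseen symbols: " + unseen(sequence, 9)); // [2, 4, 5, 8]

        // What happens in the forward pass once an emission probability is 0
        // in every state and that symbol is observed:
        double sum = 0.0;          // sum of alpha[i][t] over all states
        double scale = 1 / sum;    // Infinity
        System.out.println(scale * 0.0); // NaN
    }
}
```

If the list of unseen symbols is non-empty after a short training run, that would be consistent with the guess above.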
You could refer to the "Scaling" section in Rabiner's paper, which addresses the underflow problem.
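A minimal sketch of the scaling idea, written independently of the code in the question: normalize alpha at every time step so it sums to 1, and accumulate the log of each normalizing sum to recover the log-likelihood. The class name, method name, and parameter layout (pi, a, b, obs) are assumptions for illustration. The explicit check for a zero sum is my addition, so that an impossible observation yields negative infinity instead of NaN.

```java
final class ScaledForward {
    /**
     * Scaled forward pass. Returns log P(observations | model).
     * pi[i]: initial state probabilities, a[i][j]: transition probabilities,
     * b[i][k]: emission probabilities, obs[t]: observation indices.
     */
    static double logLikelihood(double[] pi, double[][] a, double[][] b, int[] obs) {
        int n = pi.length;
        double[] alpha = new double[n];
        double logProb = 0.0;
        for (int t = 0; t < obs.length; t++) {
            double[] next = new double[n];
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                double prior;
                if (t == 0) {
                    prior = pi[i];
                } else {
                    prior = 0.0;
                    for (int j = 0; j < n; j++) {
                        prior += alpha[j] * a[j][i];
                    }
                }
                next[i] = prior * b[i][obs[t]];
                sum += next[i];
            }
            if (sum == 0.0) {
                // The observation has probability 0 under the model;
                // dividing by sum here would produce Infinity and then NaN.
                return Double.NEGATIVE_INFINITY;
            }
            for (int i = 0; i < n; i++) {
                next[i] /= sum; // scaled alpha sums to 1 at every step
            }
            logProb += Math.log(sum);
            alpha = next;
        }
        return logProb;
    }
}
```

The accumulated value corresponds to what calcLogEProb computes from scaler[], since each scaler[t] is the reciprocal of the per-step sum.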
You could also do the calculations in log space; that's what HTK and R do. Multiplication and division become addition and subtraction. For addition and subtraction themselves, look at the LAdd/LSub and logspace_add/logspace_sub functions in the respective toolkits.
The log-sum-exp trick might be helpful too.
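For reference, a small sketch of log-space addition and subtraction using the log-sum-exp idea: factor out the larger exponent so the remaining exp never overflows. This is not the HTK or R implementation; the class and method names are illustrative only.

```java
final class LogSpace {
    // Returns log(exp(x) + exp(y)) without exponentiating x or y directly.
    static double logAdd(double x, double y) {
        if (x == Double.NEGATIVE_INFINITY) return y; // log(0 + exp(y))
        if (y == Double.NEGATIVE_INFINITY) return x; // log(exp(x) + 0)
        double max = Math.max(x, y);
        double min = Math.min(x, y);
        return max + Math.log1p(Math.exp(min - max));
    }

    // Returns log(exp(x) - exp(y)); requires x >= y.
    static double logSub(double x, double y) {
        if (y > x) {
            throw new IllegalArgumentException("logSub requires x >= y");
        }
        if (x == y) return Double.NEGATIVE_INFINITY; // log(0)
        return x + Math.log1p(-Math.exp(y - x));
    }

    // Returns log(sum of exp(v)) over all values (log-sum-exp).
    static double logSumExp(double[] values) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            max = Math.max(max, v);
        }
        if (max == Double.NEGATIVE_INFINITY) return max; // empty or all zeros
        double sum = 0.0;
        for (double v : values) {
            sum += Math.exp(v - max);
        }
        return max + Math.log(sum);
    }

    public static void main(String[] args) {
        // exp(-1000) underflows to 0 in a double, but the log-space sum is fine.
        System.out.println(logAdd(-1000.0, -1000.0)); // about -999.31
        System.out.println(Math.exp(logSub(Math.log(0.75), Math.log(0.25)))); // about 0.5
    }
}
```

In a log-space forward pass, each product such as alpha[j][t-1] * stateTransitionMatrix[j][i] becomes a sum of logs, and the sum over j becomes a logSumExp over those terms.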