I have been trying to understand the correct implementation of Bellman-Ford by following these resources: 1 & 2
If we already know that the given weighted digraph doesn't contain a cycle (and hence no negative cycle either), is the following a correct implementation of the Bellman-Ford algorithm?
int src = 0;
int V = nodes.length; // 0 to n-1 nodes
int E = edges.length;
double[] distTo = new double[V];
for (int i = 0; i < V; i++) {
distTo[i] = Double.POSITIVE_INFINITY;
}
int[] edgeTo = new int[V];
distTo[src] = 0.0;
for (int i = 1; i < V - 1; i++) {
double[] distToLocal = new double[V];
for (int j = 0; j < V; j++) {
distToLocal[j] = Double.POSITIVE_INFINITY;
}
for (int j = 0; j < E; j++) {
int to = edges[j].to;
int from = edges[j].from;
int weight = edges[j].weight;
if (distToLocal[to] > distTo[to] && distToLocal[to] > distTo[from] + weight) {
distToLocal[to] = distTo[from] + weight;
edgeTo[to] = from;
}
distToLocal[to] = Math.min(distToLocal[to],distTo[to]);
}
distTo = distToLocal;
}
The first issue that I am having with the above implementation is that if there are only 2 nodes in the graph, with a directed edge from the source node to the destination node, then the first for loop needs to start at 0 instead of 1, as follows:
for (int i = 0; i < V - 1; i++) {
If I make the above change, is it still a correct implementation?
Variation in implementation
If there is no need to find the shortest distance from src to a node using at most K edges, where K is in [0, V-1], then the following variation also seems to give the correct result.
int src = 0;
int V = nodes.length; // 0 to n-1 nodes
int E = edges.length;
double[] distTo = new double[V];
for (int i = 0; i < V; i++) {
distTo[i] = Double.POSITIVE_INFINITY;
}
int[] edgeTo = new int[V];
distTo[src] = 0.0;
for (int i = 1; i < V - 1; i++) {
/*double[] distToLocal = new double[V];
for (int j = 0; j < V; j++) {
distToLocal[j] = Double.POSITIVE_INFINITY;
}*/
for (int j = 0; j < E; j++) {
int to = edges[j].to;
int from = edges[j].from;
int weight = edges[j].weight;
if (distTo[to] > distTo[from] + weight) {
distTo[to] = distTo[from] + weight;
edgeTo[to] = from;
}
}
//distTo = distToLocal;
}
I think I understand why the variation works; however, I am curious why resource 1 doesn't mention it.
Are there any downsides to implementing this variation? Clearly, the variation has a lower memory requirement.
Note: I know that I can use the topological-sort shortest-path algorithm when there are no cycles in the weighted digraph, but here I am trying to understand the correctness of Bellman-Ford.
The Bellman-Ford algorithm states that after V-1 phases, each of which relaxes every edge, the minimum distance from the source to every destination has been computed. In your implementation you run only V-2 such phases. Apart from that, the two implementations are the same: you can simply reuse the old array of distances instead of allocating a fresh local one in every phase.
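For reference, a minimal sketch of that standard formulation, assuming the same Edge fields (from, to, weight) and the same nodes/edges arrays as in the question:
int src = 0;
int V = nodes.length;
int E = edges.length;
double[] distTo = new double[V];
int[] edgeTo = new int[V];
for (int i = 0; i < V; i++) {
    distTo[i] = Double.POSITIVE_INFINITY;
}
distTo[src] = 0.0;
// V-1 phases; each phase relaxes every edge once, updating distTo in place.
for (int pass = 0; pass < V - 1; pass++) {
    for (int j = 0; j < E; j++) {
        int from = edges[j].from;
        int to = edges[j].to;
        int weight = edges[j].weight;
        if (distTo[from] + weight < distTo[to]) {
            distTo[to] = distTo[from] + weight;
            edgeTo[to] = from;
        }
    }
}
Since the question already assumes there are no negative cycles, no extra detection pass is needed after the V-1 phases.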
In an effort to learn and use hidden Markov models, I am writing my own code to implement them. I am using this wiki article to help with my work. I do not wish to resort to pre-written libraries, because I have found I can achieve a better understanding if I write it myself. And no, this isn't a school assignment! :)
Unfortunately, my highest level of education consists of high school computer science and statistics. I have no background in Machine Learning besides the casual poking around with ANN libraries and TensorFlow. I am therefore having a bit of trouble translating mathematical equations into code. Specifically, I'm worried my implementations of the alpha and beta functions aren't functionally correct. If anyone can assist in describing where I messed up and how to correct my mistakes to have a functioning HMM implementation, it'd be greatly appreciated.
Here are my class-wide globals:
public int n; //number of states
public int t; //number of observations
public int time; //iteration holder
public double[][] emitprob; //Emission parameter
public double[][] stprob; //State transition parameter
public ArrayList<String> states, observations, x, y;
My constructor:
public Model(ArrayList<String> sts, ArrayList<String> obs)
{
//the most important algorithm we need right now is
//unsupervised learning through BM. Supervised is
//pretty easy.
//need hashtable of count objects... Aya...
//perhaps a learner...?
states = sts;
observations = obs;
n = states.size();
t = observations.size();
x = new ArrayList();
y = new ArrayList();
time = 0;
stprob = new double[n][n];
emitprob = new double[n][t];
stprob = newDistro(n,n);
emitprob = newDistro(n,t);
}
The newDistro method is for creating a new, uniform, normal distribution:
public double[][] newDistro(int x, int y)
{
Random r = new Random(System.currentTimeMillis());
double[][] returnme = new double[x][y];
double sum = 0;
for(int i = 0; i < x; i++)
{
for(int j = 0; j < y; j++)
{
returnme[i][j] = Math.abs(r.nextInt());
sum += returnme[i][j];
}
}
for(int i = 0; i < x; i++)
{
for(int j = 0; j < y; j++)
{
returnme[i][j] /= sum;
}
}
return returnme;
}
My Viterbi algorithm implementation:
public ArrayList<String> viterbi(ArrayList<String> obs)
{
//K means states
//T means observations
//T arrays should be constructed as K * T (N * T)
ArrayList<String> path = new ArrayList();
String firstObservation = obs.get(0);
int firstObsIndex = observations.indexOf(firstObservation);
double[] pi = new double[n]; //initial probs of first obs for each st
int ts = obs.size();
double[][] t1 = new double[n][ts];
double[][] t2 = new double[n][ts];
int[] y = new int[obs.size()];
for(int i = 0; i < obs.size(); i++)
{
y[i] = observations.indexOf(obs.get(i));
}
for(int i = 0; i < n; i++)
{
pi[i] = emitprob[i][firstObsIndex];
}
for(int i = 0; i < n; i++)
{
t1[i][0] = pi[i] * emitprob[i][y[0]];
t2[i][0] = 0;
}
for(int i = 1; i < ts; i++)
{
for(int j = 0; j < n; j++)
{
double maxValue = 0;
int maxIndex = 0;
//first we compute the max value
for(int q = 0; q < n; q++)
{
double value = t1[q][i-1] * stprob[q][j];
if(value > maxValue)
{
maxValue = value; //the max
maxIndex = q; //the argmax
}
}
t1[j][i] = emitprob[j][y[i]] * maxValue;
t2[j][i] = maxIndex;
}
}
int[] z = new int[ts];
int maxIndex = 0;
double maxValue = 0.0d;
for(int k = 0; k < n; k++)
{
double myValue = t1[k][ts-1];
if(myValue > maxValue)
{
myValue = maxValue;
maxIndex = k;
}
}
path.add(states.get(maxIndex));
for(int i = ts-1; i >= 2; i--)
{
z[i-1] = (int)t2[z[i]][i];
path.add(states.get(z[i-1]));
}
System.out.println(path.size());
for(String s: path)
{
System.out.println(s);
}
return path;
}
My forward algorithm, which takes place of the alpha function as described later:
public double forward(ArrayList<String> obs)
{
double result = 0;
int length = obs.size()-1;
for(int i = 0; i < n; i++)
{
result += alpha(i, length, obs);
}
return result;
}
The remaining functions are for implementing the Baum-Welch Algorithm.
The alpha function is the one I'm most afraid I'm getting wrong here. I had trouble understanding which "direction" it needs to iterate over the sequence: do I start from the last element (size-1) or from the first (index zero)?
public double alpha(int j, int t, ArrayList<String> obs)
{
double sum = 0;
if(t == 0)
{
return stprob[0][j];
}
else
{
String lastObs = obs.get(t);
int obsIndex = observations.indexOf(lastObs);
for(int i = 0; i < n; i++)
{
sum += alpha(i, t-1, obs) * stprob[i][j] * emitprob[j][obsIndex];
}
}
return sum;
}
I'm having similar "correctness" issues with my beta function:
public double beta(int i, int t, ArrayList<String> obs)
{
double result = 0;
int obsSize = obs.size()-1;
if(t == obsSize)
{
return 1;
}
else
{
String lastObs = obs.get(t+1);
int obsIndex = observations.indexOf(lastObs);
for(int j = 0; j < n; j++)
{
result += beta(j, t+1, obs) * stprob[i][j] * emitprob[j][obsIndex];
}
}
return result;
}
I'm more confident in my gamma function; however, since it explicitly requires the use of alpha and beta, I'm obviously worried it'll be "off" somehow.
public double gamma(int i, int t, ArrayList<String> obs)
{
double top = alpha(i, t, obs) * beta(i, t, obs);
double bottom = 0;
for(int j = 0; j < n; j++)
{
bottom += alpha(j, t, obs) * beta(j, t, obs);
}
return top / bottom;
}
The same goes for my "squiggle" function. I do apologize for the naming; I'm not sure of the actual name of the symbol.
public double squiggle(int i, int j, int t, ArrayList<String> obs)
{
String lastObs = obs.get(t+1);
int obsIndex = observations.indexOf(lastObs);
double top = alpha(i, t, obs) * stprob[i][j] * beta(j, t+1, obs) * emitprob[j][obsIndex];
double bottom = 0;
double innerSum = 0;
double outterSum = 0;
for(i = 0; i < n; i++)
{
for(j = 0; j < n; j++)
{
innerSum += alpha(i, t, obs) * stprob[i][j] * beta(j, t+1, obs) * emitprob[j][obsIndex];
}
outterSum += innerSum;
}
return top / bottom;
}
Lastly, to update my state transition and emission probability arrays, I have implemented these functions as aStar and bStar.
public double aStar(int i, int j, ArrayList<String> obs)
{
double squiggleSum = 0;
double gammaSum = 0;
int T = obs.size()-1;
for(int t = 0; t < T; t++)
{
squiggleSum += squiggle(i, j, t, obs);
gammaSum += gamma(i, t, obs);
}
return squiggleSum / gammaSum;
}
public double bStar(int i, String v, ArrayList<String> obs)
{
double top = 0;
double bottom = 0;
for(int t = 0; t < obs.size()-1; t++)
{
if(obs.get(t).equals(v))
{
top += gamma(i, t, obs);
}
bottom += gamma(i, t, obs);
}
return top / bottom;
}
In my understanding, since the b* function includes a piecewise (indicator) term that is either 1 or 0, implementing it as an "if" statement and only adding the gamma term when the string equals the observation is the same as what is described, since otherwise the indicator would make that gamma term zero anyway; this just saves a little computation. Is this correct?
In summary, I want to get my math right to ensure a successful (albeit simple) HMM implementation. As for the Baum-Welch algorithm, I am having trouble understanding how to implement the complete function: would it be as simple as running aStar over all states (as an n * n for loop) and bStar for all observations, inside a loop with a convergence check? Also, what would be a best-practice way of checking for convergence without overfitting?
Please let me know of everything I need to do in order to get this right.
Thank you heavily for any help you can give me!
To avoid underflow, you should use a scaling factor in the forward and backward algorithms. To get the correct result, use nested for loops over time and states rather than recursion, and step forward through the observation sequence in the forward method.
The backward method is structured the same way, but steps backward through the sequence.
You invoke both from the Baum-Welch method.
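As a rough illustration only (not the asker's code), here is a sketch of such an iterative, scaled forward pass, assuming it lives in the question's Model class (using its n, stprob, emitprob, and observations fields) and taking a hypothetical initial state distribution pi as a parameter:
// Iterative forward pass with per-time-step scaling to avoid underflow.
// pi[] is a hypothetical initial state distribution, not present in the original code.
public double[][] forwardScaled(ArrayList<String> obs, double[] pi) {
    int T = obs.size();
    double[][] alpha = new double[T][n];
    double[] scale = new double[T];
    int o0 = observations.indexOf(obs.get(0));
    for (int i = 0; i < n; i++) {
        alpha[0][i] = pi[i] * emitprob[i][o0];
        scale[0] += alpha[0][i];
    }
    for (int i = 0; i < n; i++) {
        alpha[0][i] /= scale[0];          // rescale so the row sums to 1
    }
    for (int t = 1; t < T; t++) {         // step forward through the observations
        int ot = observations.indexOf(obs.get(t));
        for (int j = 0; j < n; j++) {
            double sum = 0;
            for (int i = 0; i < n; i++) {
                sum += alpha[t - 1][i] * stprob[i][j];
            }
            alpha[t][j] = sum * emitprob[j][ot];
            scale[t] += alpha[t][j];
        }
        for (int j = 0; j < n; j++) {
            alpha[t][j] /= scale[t];
        }
    }
    // The log-likelihood of the sequence is the sum of Math.log(scale[t]).
    return alpha;
}
The backward pass mirrors this structure but iterates from t = T-2 down to 0 and can reuse the same scale factors.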
I have been implementing a simple genetic algorithm (GA) in Java. The steps of my GA are basically binary encoding, tournament selection, single-point crossover, and bit-wise mutation. Each individual of the population is represented by a class consisting of binary genes and a fitness value.
public class Individual {
int gene[];
int fitness;
public Individual(int n){
this.gene = new int[n];
}
}
The code below does not include the bit-wise mutation part, as I have been facing a problem with the single-point crossover part of the GA. The way I have implemented single-point crossover is by randomly choosing a point for two consecutive Individual array elements and then swapping their tails. The tail swapping is then repeated for each pair of Individuals. I have also created the printGenome() method to print out all the arrays for comparison; the resulting arrays after the crossover process are not properly swapped. I have tested my single-point crossover algorithm separately, and it works. However, when I run it in the code below, the crossover simply does not work. Could it be that there is something wrong within the tournament selection algorithm? Or is it something else (a silly mistake)? I have been reworking it and still cannot pinpoint the error.
I would be grateful for any help and information provided! :)
public class GeneticAlgorithm {
public static void main(String[] args) {
int p = 10;
int n = 10;
Individual population[];
//create new population
population = new Individual[p];
for (int i = 0; i < p; i++) {
population[i] = new Individual(n);
}
//fills individual's gene with binary randomly
for (int i = 0; i < p; i++) {
for (int j = 0; j < n; j++) {
population[i].gene[j] = (Math.random() < 0.5) ? 0 : 1;
}
population[i].fitness = 0;
}
//evaluate each individual
for (int i = 0; i < p; i++) {
for (int j = 0; j < n; j++) {
if (population[i].gene[j] == 1) {
population[i].fitness++;
}
}
}
//total fitness check
System.out.println("Total fitness check #1 before tournament selection: " + getTotalFitness(population, p));
System.out.println("Mean fitness check #1 before tournament selection: " + getMeanFitness(population, p));
System.out.println("");
//tournament selection
Individual offspring[] = new Individual[p];
for (int i = 0; i < p; i++) {
offspring[i] = new Individual(n);
}
int parent1, parent2;
Random rand = new Random();
for (int i = 0; i < p; i++) {
parent1 = rand.nextInt(p); //randomly choose parent
parent2 = rand.nextInt(p); //randomly choose parent
if (population[parent1].fitness >= population[parent2].fitness) {
offspring[i] = population[parent1];
} else {
offspring[i] = population[parent2];
}
}
//total fitness check
System.out.println("Total fitness check #2 after tournament selection: " + getTotalFitness(offspring, p));
System.out.println("Mean fitness check #2 after tournament selection: " + getMeanFitness(offspring, p));
System.out.println("");
//genome check
System.out.println("Before Crossover: ");
printGenome(offspring, p, n);
//crossover
for (int i = 0; i < p; i = i + 2) {
int splitPoint = rand.nextInt(n);
for (int j = splitPoint; j < n; j++) {
int temp = offspring[i].gene[j];
offspring[i].gene[j] = offspring[i + 1].gene[j];
offspring[i + 1].gene[j] = temp;
}
}
//genome check
System.out.println("After Crossover:");
printGenome(offspring, p, n);
//evaluate each individual by counting the number of 1s after crossover
for (int i = 0; i < p; i++) {
offspring[i].fitness = 0;
for (int j = 0; j < n; j++) {
if (offspring[i].gene[j] == 1) {
offspring[i].fitness++;
}
}
}
//total fitness check
System.out.println("Total fitness check #3 after crossover: " + getTotalFitness(offspring, p));
System.out.println("Mean fitness check #3 after crossover: " + getMeanFitness(offspring, p));
}
public static void printGenome(Individual pop[], int p, int n) {
for (int i = 0; i < p; i++) {
for (int j = 0; j < n; j++) {
System.out.print(pop[i].gene[j]);
}
System.out.println("");
}
}
public static int getTotalFitness(Individual pop[], int p) {
int totalFitness = 0;
for (int i = 0; i < p; i++) {
totalFitness = totalFitness + pop[i].fitness;
}
return totalFitness;
}
public static double getMeanFitness(Individual pop[], int p) {
double meanFitness = getTotalFitness(pop, p) / (double) p;
return meanFitness;
}
}
The problem is that in your selection you are (most likely) duplicating individuals. When you write:
offspring[i] = population[parent1]
You are actually storing a reference to population[parent1] in offspring[i]. As a result your offspring array can contain the same reference multiple times, hence the same object will participate in crossover multiple times with multiple partners.
As a solution, you can store a clone instead of a reference to the same object. In Individual add:
public Individual clone(){
Individual clone = new Individual(gene.length);
clone.gene = gene.clone();
clone.fitness = fitness; // also copy the fitness, so the post-selection fitness check still reports correct values
return clone;
}
And in your selection (note the added .clone()):
for (int i = 0; i < p; i++) {
parent1 = rand.nextInt(p); //randomly choose parent
parent2 = rand.nextInt(p); //randomly choose parent
if (population[parent1].fitness >= population[parent2].fitness) {
offspring[i] = population[parent1].clone();
} else {
offspring[i] = population[parent2].clone();
}
}
This way every element in offspring is a different object, even if the genome is the same.
That solves the Java part. Regarding the GA theory, I hope some things, for instance your fitness measure, are just placeholders, right?
I have this method where I compute the distances with a Euclidean distance algorithm and save the values as doubles in an array. Now I need to find the minimum value for each test and return it.
public static double distance() {
for (int i = 0; i < GetFile.testMatrix.length;) {
double[] distances = new double[4000];
double minDistance = 999999;
for (int j = 0; j < GetFile.trainingMatrix.length; j++) {
distances[j] = EuclideanDistance.findED(GetFile.trainingMatrix[j], GetFile.testMatrix[i]);
}
return minDistance;
}
return 0;
}
I would appreciate any help. Thanks in advance
It doesn't look as though you need to store the result in an array at all, since that's being discarded. You should consider tracking the result in minDistance every time you get a result from findED.
This also looks like the kind of thing that would be a lot easier to understand if you used Java's streams, if you're using a version of Java that has access to them.
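A sketch of that first suggestion (tracking minDistance directly, without the array), keeping the question's GetFile and EuclideanDistance classes and, like the original method, only looking at the first row of testMatrix:
public static double distance() {
    // Track the minimum as each distance is computed; no intermediate array needed.
    double minDistance = Double.POSITIVE_INFINITY;
    for (int j = 0; j < GetFile.trainingMatrix.length; j++) {
        double d = EuclideanDistance.findED(GetFile.trainingMatrix[j], GetFile.testMatrix[0]);
        minDistance = Math.min(minDistance, d);
    }
    return minDistance;
}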
There are several ways to do so.
One way would be to iterate over the values, always keeping the smallest:
double minDistance = distances[0];
for(int j =1 ;j < GetFile.trainingMatrix.length; j++){
if(distances[j]<minDistance)
minDistance=distances[j];
}
or alternatively
double minDistance = distances[0];
for(int j =1 ;j < GetFile.trainingMatrix.length; j++){
minDistance = Math.min(minDistance, distances[j]);
}
or using streams (with distances as List):
double minDistance = distances.stream().mapToDouble(e -> e).min().getAsDouble();
or even nicer (with your for-loop completely implemented):
double minDistance = Stream.iterate(0,j -> j+1)
.limit(GetFile.trainingMatrix.length)
.mapToDouble(j->EuclideanDistance.findED(GetFile.trainingMatrix[j], GetFile.testMatrix[i]))
.min().orElse(-1);
Another way along with your code:
public static double distance() {
for (int i = 0; i < GetFile.testMatrix.length;) {
double[] distances = new double[4000];
for (int j = 0; j < GetFile.trainingMatrix.length; j++) {
distances[j] = EuclideanDistance.findED(GetFile.trainingMatrix[j], GetFile.testMatrix[i]);
}
return getMinDistance(distances);
}
return 0;
}
static double getMinDistance(double[] distances) {
double minDistance = Double.MAX_VALUE;
for (double distance : distances) {
minDistance = Math.min(distance, minDistance);
}
return minDistance;
}
Initialize minDistance with distances[0] unless you are very sure about the maximum value that the distances array will contain.
public static double distance() {
for (int i = 0; i < GetFile.testMatrix.length;) {
double[] distances = new double[4000];
double minDistance;
for (int j = 0; j < GetFile.trainingMatrix.length; j++) {
distances[j] = EuclideanDistance.findED(GetFile.trainingMatrix[j], GetFile.testMatrix[i]);
}
minDistance = distances[0];
for(int j = 1; j < distances.length; j++) {
if(minDistance > distances[j]) {
minDistance = distances[j];
}
}
return minDistance;
}
return 0;
}
I am working on fingerprint image enhancement with Fast Fourier Transformation. I got the idea from this site.
I have implemented the FFT function using a 32*32 window, and after that, as the referenced site suggests, I want to multiply the FFT by the power spectrum. But I do not understand:
How do I calculate the power spectrum of an image? Or is there an ideal value for the power spectrum?
Code for FFT:
public FFT(int[] pixels, int w, int h) {
// progress = 0;
input = new TwoDArray(pixels, w, h);
intermediate = new TwoDArray(pixels, w, h);
output = new TwoDArray(pixels, w, h);
transform();
}
void transform() {
for (int i = 0; i < input.size; i+=32) {
for(int j = 0; j < input.size; j+=32){
ComplexNumber[] cn = recursiveFFT(input.getWindow(i,j));
output.putWindow(i,j, cn);
}
}
for (int j = 0; j < output.values.length; ++j) {
for (int i = 0; i < output.values[0].length; ++i) {
intermediate.values[i][j] = output.values[i][j];
input.values[i][j] = output.values[i][j];
}
}
}
static ComplexNumber[] recursiveFFT(ComplexNumber[] x) {
int N = x.length;
// base case
if (N == 1) return new ComplexNumber[] { x[0] };
// radix 2 Cooley-Tukey FFT
if (N % 2 != 0) { throw new RuntimeException("N is not a power of 2"); }
// fft of even terms
ComplexNumber[] even = new ComplexNumber[N/2];
for (int k = 0; k < N/2; k++) {
even[k] = x[2*k];
}
ComplexNumber[] q = recursiveFFT(even);
// fft of odd terms
ComplexNumber[] odd = even; // reuse the array
for (int k = 0; k < N/2; k++) {
odd[k] = x[2*k + 1];
}
ComplexNumber[] r = recursiveFFT(odd);
// combine
ComplexNumber[] y = new ComplexNumber[N];
for (int k = 0; k < N/2; k++) {
double kth = -2 * k * Math.PI / N;
ComplexNumber wk = new ComplexNumber(Math.cos(kth), Math.sin(kth));
ComplexNumber tmp = ComplexNumber.cMult(wk, r[k]);
y[k] = ComplexNumber.cSum(q[k], tmp);
ComplexNumber temp = ComplexNumber.cMult(wk, r[k]);
y[k + N/2] = ComplexNumber.cDif(q[k], temp);
}
return y;
}
I'm thinking that the power spectrum is the squared magnitude of the output of the Fourier transform:
power at a given frequency = x(x*) = |x|^2, where x* is the complex conjugate.
The total power in the image block would then be the sum of this over all frequencies (which, by Parseval's theorem, matches the sum over space).
I have no idea if this helps.
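If that is what you need, a sketch for a single transformed block might look like this; it assumes the question's ComplexNumber class exposes its real and imaginary parts (the getReal()/getImaginary() accessors here are hypothetical):
// Power spectrum of one FFT'd block: |X|^2 = re^2 + im^2 for each coefficient.
// getReal()/getImaginary() are assumed accessors on the question's ComplexNumber class.
static double[] powerSpectrum(ComplexNumber[] block) {
    double[] power = new double[block.length];
    for (int k = 0; k < block.length; k++) {
        double re = block[k].getReal();
        double im = block[k].getImaginary();
        power[k] = re * re + im * im;   // x times its complex conjugate
    }
    return power;
}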
I want to port Matlab's Fast Fourier transform function fft() to native Java code.
As a starting point I am using the code of JMathLib where the FFT is implemented as follows:
// given double[] x as the input signal
n = x.length; // assume n is a power of 2
nu = (int)(Math.log(n)/Math.log(2));
int n2 = n/2;
int nu1 = nu - 1;
double[] xre = new double[n];
double[] xim = new double[n];
double[] mag = new double[n2];
double tr, ti, p, arg, c, s;
for (int i = 0; i < n; i++) {
xre[i] = x[i];
xim[i] = 0.0;
}
int k = 0;
for (int l = 1; l <= nu; l++) {
while (k < n) {
for (int i = 1; i <= n2; i++) {
p = bitrev (k >> nu1);
arg = 2 * (double) Math.PI * p / n;
c = (double) Math.cos (arg);
s = (double) Math.sin (arg);
tr = xre[k+n2]*c + xim[k+n2]*s;
ti = xim[k+n2]*c - xre[k+n2]*s;
xre[k+n2] = xre[k] - tr;
xim[k+n2] = xim[k] - ti;
xre[k] += tr;
xim[k] += ti;
k++;
}
k += n2;
}
k = 0;
nu1--;
n2 = n2/2;
}
k = 0;
int r;
while (k < n) {
r = bitrev (k);
if (r > k) {
tr = xre[k];
ti = xim[k];
xre[k] = xre[r];
xim[k] = xim[r];
xre[r] = tr;
xim[r] = ti;
}
k++;
}
// The result
// -> real part stored in xre
// -> imaginary part stored in xim
Unfortunately it doesn't give me the right results when I unit test it, for example with the array
double[] x = { 1.0d, 5.0d, 9.0d, 13.0d };
the result in Matlab:
28.0
-8.0 - 8.0i
-8.0
-8.0 + 8.0i
the result in my implementation:
28.0
-8.0 + 8.0i
-8.0
-8.0 - 8.0i
Note how the signs of the imaginary parts are flipped.
When I use longer, more complex signals, the differences between the implementations also affect the magnitudes, so the discrepancy is not just a sign error.
My question: how can I adapt my implementation to make it match the Matlab one?
Or: is there already a library that does exactly this?
In order to use JTransforms for an FFT on a matrix, you need to do the FFT column by column and then join the columns back into a matrix. Here is my code, which I compared against Matlab's fft:
double [][] newRes = new double[samplesPerWindow*2][Matrixres.numberOfSegments];
double [] colForFFT = new double [samplesPerWindow*2];
DoubleFFT_1D fft = new DoubleFFT_1D(samplesPerWindow);
for(int y = 0; y < Matrixres.numberOfSegments; y++)
{
//copy the original column (and a column of zeros) into colForFFT before the FFT
for(int x = 0; x < samplesPerWindow; x++)
{
colForFFT[x] = Matrixres.res[x][y];
}
//fft on each col of the matrix
fft.realForwardFull(colForFFT); //Y=fft(y,nfft);
//copy the output of col*2 size into a new matrix
for(int x = 0; x < samplesPerWindow*2; x++)
{
newRes[x][y] = colForFFT[x];
}
}
Hope this is what you are looking for. Note that JTransforms represents complex numbers as
array[2*k] = Re[k], array[2*k+1] = Im[k]
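So, to read the complex coefficients back out of a transformed column (a small sketch using the variables from the snippet above):
// Unpack JTransforms' interleaved output into separate real and imaginary arrays.
double[] re = new double[samplesPerWindow];
double[] im = new double[samplesPerWindow];
for (int k = 0; k < samplesPerWindow; k++) {
    re[k] = colForFFT[2 * k];       // Re[k]
    im[k] = colForFFT[2 * k + 1];   // Im[k]
}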