I wrote a function:
private static LinearFunction[] aproxFunction(List<Point> list) {
try{
int amountOfClusters = getAmountOfClusters(list);
//System.out.println(amountOfClusters); for debug
LinearFunction[] linear = new LinearFunction[amountOfClusters];
int[][] clusters = new int[amountOfClusters][2]; // 2nd field 0 == r, 1 == g, 2 == b
clusters = getClusters(list, amountOfClusters);
for(int i = 0; i < amountOfClusters; i++) {
List<Point> pointsList = new ArrayList<>(getPointsInCluster(list, clusters[i][0], convertIdToString(clusters[i][1])));
int[][] points = new int[pointsList.size()][3];
for(int j = 0; j < points.length; j++) {
points[j][0] = pointsList.get(j).getX();
points[j][1] = pointsList.get(j).getY();
points[j][2] = pointsList.get(j).getValue();
}
pointsToFile(pointsList, clusters[i][0], convertIdToString(clusters[i][1]), "_points_in_cluster_");
int[][] array = new int[2][2];
array = aprox(removeDuplicates(points));
if((array[1][0] - array[0][0]) == 0) {
linear[i].a = 0;
linear[i].b = 0;
linear[i].flag = true;
linear[i].c = array[0][0];
} else {
linear[i].a = (array[1][1] - array[0][1]) / (array[1][0] - array[0][0]);
linear[i].b = array[1][1] - array[1][0] * linear[i].a;
}
linear[i].cluster = clusters[i][0];
linear[i].id = convertIdToString(clusters[i][1]);
}
return linear;
} catch (Exception e) {
System.out.println("Error - aproxFunction: " + e.getMessage());
}
LinearFunction[] error = new LinearFunction[1];
error[0].a = -1;
return error;
}
and I constantly receive an error message of either division by zero or just null. I can't find the reason for that.
The only division in this function happens here:
linear[i].a = (array[1][1] - array[0][1]) / (array[1][0] - array[0][0]);
but above that there is a check for division by zero, so how can I still get that error message?
As for the error message null, I just don't know what it means. I read that it might be caused by an object being null, but how do I find out which one and where?
You are initialising an array of LinearFunctions, but you are not initialising each LinearFunction object inside it.
Have you tried this:
private static LinearFunction[] aproxFunction(List<Point> list) {
try{
int amountOfClusters = getAmountOfClusters(list);
//System.out.println(amountOfClusters); for debug
LinearFunction[] linear = new LinearFunction[amountOfClusters];
int[][] clusters = new int[amountOfClusters][2]; // 2nd field 0 == r, 1 == g, 2 == b
clusters = getClusters(list, amountOfClusters);
for(int i = 0; i < amountOfClusters; i++) {
linear[i] = new LinearFunction(); // create the object before writing to its fields
...
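To illustrate the point, here is a minimal sketch. The LinearFunction class below is a hypothetical stand-in with only two fields, since the real class is not shown in the question. It demonstrates that new LinearFunction[n] only creates n null references, and that each slot needs its own new LinearFunction() before its fields can be touched. The fallback at the end of the question's function (error[0].a = -1) has the same problem.

```java
// Minimal sketch; LinearFunction here is a hypothetical stand-in for the question's class.
class LinearFunctionDemo {
    static class LinearFunction {
        int a;
        int b;
    }

    // Allocates the array and then fills every slot with a real object.
    static LinearFunction[] createInitialised(int n) {
        LinearFunction[] linear = new LinearFunction[n]; // n null references so far
        for (int i = 0; i < n; i++) {
            linear[i] = new LinearFunction();
        }
        return linear;
    }

    public static void main(String[] args) {
        LinearFunction[] empty = new LinearFunction[2];
        System.out.println(empty[0] == null); // the slot holds no object yet

        LinearFunction[] ready = createInitialised(2);
        ready[0].a = 5; // safe, because the slot now holds an object
        System.out.println(ready[0].a);
    }
}
```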
I solved this NullPointerException like this:
for(int i = 0; i < amountOfClusters; i++) {
LinearFunction lf = new LinearFunction();
List<Point> pointsList = new ArrayList<>(getPointsInCluster(list, clusters[i][0], convertIdToString(clusters[i][1])));
int[][] points = new int[pointsList.size()][3];
for(int j = 0; j < points.length; j++) {
points[j][0] = pointsList.get(j).getX();
points[j][1] = pointsList.get(j).getY();
points[j][2] = pointsList.get(j).getValue();
}
pointsToFile(pointsList, clusters[i][0], convertIdToString(clusters[i][1]), "_points_in_cluster_");
int[][] array = new int[2][2];
array = aprox(removeDuplicates(points));
if((array[1][0] - array[0][0]) == 0) {
lf.a = 0;
lf.b = 0;
lf.flag = true;
lf.c = array[0][0];
} else {
lf.a = (array[1][1] - array[0][1]) / (array[1][0] - array[0][0]);
lf.b = array[1][1] - array[1][0] * lf.a; // use lf here: linear[i] is still null at this point
}
lf.cluster = clusters[i][0];
lf.id = convertIdToString(clusters[i][1]);
linear[i] = lf;
}
So I create a separate LinearFunction object at the start of each iteration, fill it in, and only at the end of the loop assign it to the array element.
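As for finding out which object is null and where: e.getMessage() only returns the exception's detail message, and for a NullPointerException on older JVMs that message is typically null, which would explain the bare "null" output. The stack trace is more useful, because its top frame names the method and line that threw. A rough sketch follows; the class and method names are made up for illustration.

```java
// Minimal sketch; class and method names are hypothetical.
class WhereDidItFail {
    static int divide(int x, int y) {
        return x / y; // throws ArithmeticException when y == 0
    }

    // Builds "exception class in method (line N)" from the top stack frame.
    static String describe(Exception e) {
        StackTraceElement[] trace = e.getStackTrace();
        if (trace.length == 0) {
            return e.getClass().getName();
        }
        StackTraceElement top = trace[0];
        return e.getClass().getName() + " in " + top.getMethodName()
                + " (line " + top.getLineNumber() + ")";
    }

    public static void main(String[] args) {
        try {
            divide(1, 0);
        } catch (Exception e) {
            System.out.println("getMessage(): " + e.getMessage());
            System.out.println("describe():   " + describe(e));
            e.printStackTrace(); // full trace, including the calling methods
        }
    }
}
```

Replacing the println in the catch block with e.printStackTrace() (or logging the exception itself) is usually enough to see the offending line.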
I'm working on a brute-force approach to the traveling salesman problem. One particular line throws an ArrayIndexOutOfBoundsException, even though all the arrays used there have more than enough space. The line in question:
testCity[0][a] = cities[0][(int) cityList[a]];
This is where I initialize testCity:
int[][] testCity = new int[2][CITIES+10];
cities:
public static int[][] cities = new int[2][CITIES+10];
And, finally, cityList:
Object[] cityList = new Integer[CITIES+10];
This is the entire error message:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 4
at BruteF.permute(BruteF.java:39)
at BruteF.permute(BruteF.java:30)
at BruteF.permute(BruteF.java:30)
at BruteF.permute(BruteF.java:30)
at BruteF.main(BruteF.java:11)
And here is the code:
public class BruteF {
public static final int CITIES = 5;
public static int[][] cities = new int[2][CITIES+10];
public static int[][] bestCity = new int[2][CITIES+10];
public static double bestDistance = 1000;
public static int[][] testCity = new int[2][CITIES+10];
public static Object[] cityList = new Integer[CITIES+10];
public static void main(String[] args)
{
permute(java.util.Arrays.asList(1,2,3,4), 0);
for (int i = 0;i < CITIES;i++)
{
System.out.println(bestCity[0][i] + "," + bestCity[1][i]);
}
}
static void permute(java.util.List<Integer> arr, int k){
cities[0][0] = 1;
cities[1][0] = 1;
cities[0][1] = 2;
cities[1][1] = 5;
cities[0][2] = 3;
cities[1][2] = 2;
cities[0][3] = 4;
cities[1][3] = 3;
int originalX = cities[0][0];
int originalY = cities[1][0];
for(int i = k; i < arr.size(); i++){
java.util.Collections.swap(arr, i, k);
permute(arr, k+1);
java.util.Collections.swap(arr, k, i);
}
if (k == arr.size() -1){
for (int i = 0;i < CITIES;i++)
{
cityList = arr.toArray();
for (int a = 0;a < CITIES;a++)
{
testCity[0][a] = cities[0][(int) cityList[a]];
}
if (distance(testCity,CITIES,originalX, originalY) < bestDistance)
{
bestCity = testCity;
bestDistance = distance(testCity,CITIES, originalX, originalY);
}
}
}
}
static double distance (int[][] cities, int CITIES, int originalX, int originalY)
{
int[][] taken = new int[2][CITIES+1];
int takenCounter = 0;
double distance = 0;
cities[0][CITIES] = cities[0][0];
cities[1][CITIES] = cities[1][0];
for (int i = 0;i <= CITIES;i++)
{
for (int z = 0;z <= CITIES;z++)
{
if (cities[0][i] == taken[0][z] && cities[1][i] == taken[1][z])
{
return CITIES*1000; //possible error here
}
else {
taken[0][takenCounter] = cities[0][i];
taken[1][takenCounter] = cities[1][i];
}
}
if (cities[0][0] != originalX && cities[1][0] != originalY)
{
return CITIES*1000; //POSSIBLE BUG HERE
}
distance = distance + Math.sqrt(Math.pow(cities[0][i+1]-cities[0][i],2) + Math.pow(cities[1][i+1]-cities[1][i],2));
}
return distance;
}
}
Why is this happening? What can I do to fix it?
The exception reports index 4 as out of bounds.
When you assign cityList = arr.toArray(); the variable cityList refers to a new array {1,2,3,4}, i.e. of size 4 with valid indices 0 to 3. The earlier new Integer[CITIES+10] array is discarded.
You then run the for loop
for (int a = 0;a < CITIES;a++)
from a=0 to CITIES-1, i.e. up to 4, so the moment a=4 is reached, cityList[a] throws the out-of-bounds error.
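A small sketch of the same effect, using a hypothetical ToArrayLength class: the array returned by toArray() is as long as the list, not as long as the array that was previously stored in the variable. Looping up to cityList.length (or arr.size()) instead of CITIES is one way to stay in bounds.

```java
import java.util.Arrays;
import java.util.List;

// Minimal sketch; the class and method names are hypothetical.
class ToArrayLength {
    static final int CITIES = 5;

    // Returns the length of cityList after it is reassigned from the list.
    static int lengthAfterToArray(List<Integer> arr) {
        Object[] cityList = new Integer[CITIES + 10]; // length 15, as in the question
        cityList = arr.toArray();                     // now refers to a new, shorter array
        return cityList.length;
    }

    // Copies the visiting order, bounded by the array's own length.
    static int[] copyOrder(List<Integer> arr) {
        Object[] cityList = arr.toArray();
        int[] order = new int[cityList.length];
        for (int a = 0; a < cityList.length; a++) { // not a < CITIES
            order[a] = (Integer) cityList[a];
        }
        return order;
    }

    public static void main(String[] args) {
        List<Integer> arr = Arrays.asList(1, 2, 3, 4);
        System.out.println(lengthAfterToArray(arr));
        System.out.println(Arrays.toString(copyOrder(arr)));
    }
}
```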
Hi guys, I am trying to sort a 2D array using thread pools so I can measure the execution time and compare it as I increase the number of threads and the total number of elements in the array. I've written some code, but it runs very slowly and I am getting null pointer exceptions for some reason. I think the reason lies in the use of the Integer array, since Callable can't work with primitive int[] arrays. Help will be much appreciated.
public class MatrixBubbleSort {
public static void main(String[] args) throws InterruptedException, ExecutionException {
MatrixBubbleSort obj = new MatrixBubbleSort();
String resultMatrix = "";
resultMatrix = obj.sort2D();
System.out.println(resultMatrix);
}
public String sort2D() throws InterruptedException, ExecutionException {
long start = 0;
long end = 0;
int rows = 5;
int cols = 5;
Integer[][] matrix = new Integer[rows][cols];
Future<Integer[]>[] returned;
returned = new Future[rows];
Sorter[] tasks = new Sorter[rows];
ExecutorService executer = Executors.newFixedThreadPool(5);
for(int r = 0; r< rows; r++ ){
for(int c = 0; c < cols; cols++){
matrix[r][c] = (int) (Math.random() * (rows*cols));
}
}
System.out.print(printArr(matrix) + "\n");
start = System.currentTimeMillis();
for(int r = 0; r< rows; r++ ){
tasks[r] = new Sorter(matrix[r]);
returned[r] = executer.submit(tasks[r]);
}
executer.shutdown();
executer.awaitTermination(1, TimeUnit.DAYS);
end = System.currentTimeMillis();
for(int r = 0; r< rows; r++ ){
matrix[r] = returned[r].get();
}
System.out.print("Time taken = " + (end - start) + "\n");
return printArr(matrix);
}
public static String printArr(Integer[][] arr){
String out = "";
for(int i = 0; i < arr.length;i++ ){
for(int c = 0; c < arr[i].length;c++ ){
out += arr[i][c].intValue();
}
out += "\n";
}
return out;
}
}
class Sorter implements Callable{
private final Integer[] array;
Sorter(Integer[] array){
this.array = array.clone();
}
@Override
public Integer[] call() throws Exception {
boolean swap = true;
int buffer = 0;
while(swap){
swap = false;
for(int i = 0; i < array.length -1; i++){
if(array[i].intValue() > array[i+1].intValue()){
buffer = array[i].intValue();
array[i]= array[i+1].intValue();
array[i+1] = buffer;
swap = true;
}
}
}
return array;
}
}
Your code appears to run OK, after a small change:
for (int r = 0; r < rows; r++) {
for (int c = 0; c < cols; c++) { // instead of cols++
matrix[r][c] = (int) (Math.random() * (rows * cols));
}
}
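Pretty fast and no null pointer exceptions so far - care to double check? With cols++ the inner loop never advances c: it keeps rewriting matrix[r][0] while cols grows until it overflows to a negative value, so the loop takes a very long time and all the remaining elements stay null, which is where the NullPointerException in printArr comes from.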
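On the Integer versus int question: Callable's type parameter can be an array type, so Callable<int[]> is allowed and the boxing is not required. Below is a rough sketch of per-row sorting with primitive rows. The class and method names are invented for this example, and Arrays.sort is swapped in for the bubble sort to keep it short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Minimal sketch; RowSortSketch and sortRows are hypothetical names.
class RowSortSketch {
    // Sorts each row on a fixed-size pool and returns sorted copies of the rows.
    static int[][] sortRows(int[][] matrix, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<int[]>> futures = new ArrayList<>();
            for (int[] row : matrix) {
                Callable<int[]> task = () -> {
                    int[] copy = row.clone(); // leave the input row untouched
                    Arrays.sort(copy);        // stand-in for the bubble sort
                    return copy;
                };
                futures.add(pool.submit(task));
            }
            int[][] result = new int[matrix.length][];
            for (int r = 0; r < result.length; r++) {
                result[r] = futures.get(r).get(); // blocks until row r is done
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[][] sorted = sortRows(new int[][]{{3, 1, 2}, {9, 8, 7}}, 2);
        System.out.println(Arrays.deepToString(sorted));
    }
}
```

Declaring the task as Callable<int[]> also removes the unchecked raw-type usage that "implements Callable" produces.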
I want to calculate an x and a y value in my Coordinaten class. I've got another class called ArrayReader, which just reads a text file and returns its contents.
public class Coordinaten
{
public static final double SCHERMBREEDTE = 1200;
public static final double SCHERMHOOGTE = 1000;
private double tMax, tMin, sMax, sMin;
private Double[] ecgWaardes, xWaardes, yWaardes, coordinaatX;
public Coordinaten()
{
ArrayReader ecg1 = new ArrayReader();
ecg1.leesBestand();
ecg1.getLijnen();
ecgWaardes = ecg1.getLijnen();
}
public double bepaalTMax()
/**
* This method returns the duration of the ECG in milliseconds.
*/
{
tMax = (ecgWaardes.length * 2);
return tMax;
}
public double bepaalSMax()
{
for (int i = 0; i < ecgWaardes.length; i++)
{
sMax = ecgWaardes[0];
if (ecgWaardes[i] > sMax)
{
sMax = ecgWaardes[i];
}
}
return sMax;
}
public double bepaalSMin()
{
for (int i = 0; i < ecgWaardes.length; i++)
{
sMin = ecgWaardes[0];
if (ecgWaardes[i] < sMin)
{
sMin = ecgWaardes[i];
}
}
return sMin;
}
public Double[] berekenX()
{
for (int i = 0; i < ecgWaardes.length; i++)
{
coordinaatX = new Double[i];
xWaardes [i] = (double)((((i+1) *2) - 0) * (SCHERMBREEDTE-1) / (tMax - 0));
}
return xWaardes;
}
public Double[] berekenY()
{
for (int i = 0; i < ecgWaardes.length; i++)
{
yWaardes = new Double[i];
yWaardes [i] = (double)(((ecgWaardes[i] - sMax) * (SCHERMHOOGTE-1)) / (sMin - sMax));
}
return yWaardes;
}
It just keeps giving me null pointer exceptions and I really don't know why.
Can anyone help?
When you initialise yWaardes, xWaardes, ... you put the initialisation inside your for loop! This causes the array to be reallocated (and resized) on every iteration. You should only initialise each array once, before the for loop, with length ecgWaardes.length:
public Double[] berekenY()
{
yWaardes = new Double[ecgWaardes.length];
for (int i = 0; i < ecgWaardes.length; i++)
{
yWaardes [i] = (double)(((ecgWaardes[i] - sMax) * (SCHERMHOOGTE-1)) / (sMin - sMax));
}
return yWaardes;
}
EDIT: Note that the above only fixes the initialisation of yWaardes; similar mistakes can be found for the other arrays. In berekenX, for example, the loop allocates coordinaatX but writes to xWaardes, which is never allocated. So make sure all your arrays are properly initialised!
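Here is a sketch of the same allocate-once pattern applied to the x values, together with min/max helpers that set their starting value before the loop. In the posted bepaalSMax and bepaalSMin the starting value is reset on every iteration, so only the first and last samples end up being compared. The class below is a simplified, hypothetical stand-in that takes plain double[] input instead of reading a file.

```java
// Minimal sketch; EcgScaling and its method names are hypothetical.
class EcgScaling {
    static double max(double[] values) {
        double max = values[0]; // initialise once, before the loop
        for (double v : values) {
            if (v > max) {
                max = v;
            }
        }
        return max;
    }

    static double min(double[] values) {
        double min = values[0]; // initialise once, before the loop
        for (double v : values) {
            if (v < min) {
                min = v;
            }
        }
        return min;
    }

    // Maps sample index i to an x pixel, following the question's formula
    // with tMax = 2 * n (one sample every 2 ms).
    static double[] computeX(int n, double screenWidth) {
        double tMax = n * 2;
        double[] xs = new double[n]; // allocated once, with the final size
        for (int i = 0; i < n; i++) {
            xs[i] = ((i + 1) * 2) * (screenWidth - 1) / tMax;
        }
        return xs;
    }

    public static void main(String[] args) {
        double[] samples = {1, 5, 3};
        System.out.println(max(samples) + " " + min(samples));
        System.out.println(java.util.Arrays.toString(computeX(samples.length, 1200)));
    }
}
```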
I'm trying to implement a feed-forward neural network in Java.
I've created three classes: NNeuron, NLayer and NNetwork. The "simple" calculations seem fine (I get correct sums/activations/outputs), but when it comes to the training process, I don't seem to get correct results. Can anyone please tell me what I'm doing wrong?
The whole code for the NNetwork class is quite long, so I'm posting the part that is causing the problem:
[EDIT]: this is actually pretty much all of the NNetwork class
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class NNetwork
{
public static final double defaultLearningRate = 0.4;
public static final double defaultMomentum = 0.8;
private NLayer inputLayer;
private ArrayList<NLayer> hiddenLayers;
private NLayer outputLayer;
private ArrayList<NLayer> layers;
private double momentum = NNetwork.defaultMomentum; // alpha: momentum, default! 0.3
private ArrayList<Double> learningRates;
public NNetwork (int nInputs, int nOutputs, Integer... neuronsPerHiddenLayer)
{
this(nInputs, nOutputs, Arrays.asList(neuronsPerHiddenLayer));
}
public NNetwork (int nInputs, int nOutputs, List<Integer> neuronsPerHiddenLayer)
{
// the number of neurons on the last layer built so far (i.e. the number of inputs for each neuron of the next layer)
int prvOuts = 1;
this.layers = new ArrayList<>();
// input layer
this.inputLayer = new NLayer(nInputs, prvOuts, this);
this.inputLayer.setAllWeightsTo(1.0);
this.inputLayer.setAllBiasesTo(0.0);
this.inputLayer.useSigmaForOutput(false);
prvOuts = nInputs;
this.layers.add(this.inputLayer);
// hidden layers
this.hiddenLayers = new ArrayList<>();
for (int i=0 ; i<neuronsPerHiddenLayer.size() ; i++)
{
this.hiddenLayers.add(new NLayer(neuronsPerHiddenLayer.get(i), prvOuts, this));
prvOuts = neuronsPerHiddenLayer.get(i);
}
this.layers.addAll(this.hiddenLayers);
// output layer
this.outputLayer = new NLayer(nOutputs, prvOuts, this);
this.layers.add(this.outputLayer);
this.initCoeffs();
}
private void initCoeffs ()
{
this.learningRates = new ArrayList<>();
// learning rates of the hidden layers
for (int i=0 ; i<this.hiddenLayers.size(); i++)
this.learningRates.add(NNetwork.defaultLearningRate);
// learning rate of the output layer
this.learningRates.add(NNetwork.defaultLearningRate);
}
public double getLearningRate (int layerIndex)
{
if (layerIndex > 0 && layerIndex <= this.hiddenLayers.size()+1)
{
return this.learningRates.get(layerIndex-1);
}
else
{
return 0;
}
}
public ArrayList<Double> getLearningRates ()
{
return this.learningRates;
}
public void setLearningRate (int layerIndex, double newLearningRate)
{
if (layerIndex > 0 && layerIndex <= this.hiddenLayers.size()+1)
{
this.learningRates.set(
layerIndex-1,
newLearningRate);
}
}
public void setLearningRates (Double... newLearningRates)
{
this.setLearningRates(Arrays.asList(newLearningRates));
}
public void setLearningRates (List<Double> newLearningRates)
{
int len = (this.learningRates.size() <= newLearningRates.size())
? this.learningRates.size()
: newLearningRates.size();
for (int i=0; i<len; i++)
this.learningRates
.set(i,
newLearningRates.get(i));
}
public double getMomentum ()
{
return this.momentum;
}
public void setMomentum (double momentum)
{
this.momentum = momentum;
}
public NNeuron getNeuron (int layerIndex, int neuronIndex)
{
if (layerIndex == 0)
return this.inputLayer.getNeurons().get(neuronIndex);
else if (layerIndex == this.hiddenLayers.size()+1)
return this.outputLayer.getNeurons().get(neuronIndex);
else
return this.hiddenLayers.get(layerIndex-1).getNeurons().get(neuronIndex);
}
public ArrayList<Double> getOutput (ArrayList<Double> inputs)
{
ArrayList<Double> lastOuts = inputs; // the last computed outputs of the last 'called' layer so far
// input layer
//lastOuts = this.inputLayer.getOutput(lastOuts);
lastOuts = this.getInputLayerOutputs(lastOuts);
// hidden layers
for (NLayer layer : this.hiddenLayers)
lastOuts = layer.getOutput(lastOuts);
// output layer
lastOuts = this.outputLayer.getOutput(lastOuts);
return lastOuts;
}
public ArrayList<ArrayList<Double>> getAllOutputs (ArrayList<Double> inputs)
{
ArrayList<ArrayList<Double>> outs = new ArrayList<>();
// input layer
outs.add(this.getInputLayerOutputs(inputs));
// hidden layers
for (NLayer layer : this.hiddenLayers)
outs.add(layer.getOutput(outs.get(outs.size()-1)));
// output layer
outs.add(this.outputLayer.getOutput(outs.get(outs.size()-1)));
return outs;
}
public ArrayList<ArrayList<Double>> getAllSums (ArrayList<Double> inputs)
{
//*
ArrayList<ArrayList<Double>> sums = new ArrayList<>();
ArrayList<Double> lastOut;
// input layer
sums.add(inputs);
lastOut = this.getInputLayerOutputs(inputs);
// hidden nodes
for (NLayer layer : this.hiddenLayers)
{
sums.add(layer.getSums(lastOut));
lastOut = layer.getOutput(lastOut);
}
// output layer
sums.add(this.outputLayer.getSums(lastOut));
return sums;
}
public ArrayList<Double> getInputLayerOutputs (ArrayList<Double> inputs)
{
ArrayList<Double> outs = new ArrayList<>();
for (int i=0 ; i<this.inputLayer.getNeurons().size() ; i++)
outs.add(this
.inputLayer
.getNeuron(i)
.getOutput(inputs.get(i)));
return outs;
}
public void changeWeights (
ArrayList<ArrayList<Double>> deltaW,
ArrayList<ArrayList<Double>> inputSet,
ArrayList<ArrayList<Double>> targetSet,
boolean checkError)
{
for (int i=0 ; i<deltaW.size()-1 ; i++)
this.hiddenLayers.get(i).changeWeights(deltaW.get(i), inputSet, targetSet, checkError);
this.outputLayer.changeWeights(deltaW.get(deltaW.size()-1), inputSet, targetSet, checkError);
}
public int train2 (
ArrayList<ArrayList<Double>> inputSet,
ArrayList<ArrayList<Double>> targetSet,
double maxError,
int maxIterations)
{
ArrayList<Double>
input,
target;
ArrayList<ArrayList<ArrayList<Double>>> prvNetworkDeltaW = null;
double error;
int i = 0, j = 0, traininSetLength = inputSet.size();
do // during each iteration...
{
error = 0.0;
for (j = 0; j < traininSetLength; j++) // ... for each training element...
{
input = inputSet.get(j);
target = targetSet.get(j);
prvNetworkDeltaW = this.train2_bp(input, target, prvNetworkDeltaW); // ... do backpropagation, and return the new weight deltas
error += this.getInputMeanSquareError(input, target);
}
i++;
} while (error > maxError && i < maxIterations); // iterate as much as necessary/possible
return i;
}
public ArrayList<ArrayList<ArrayList<Double>>> train2_bp (
ArrayList<Double> input,
ArrayList<Double> target,
ArrayList<ArrayList<ArrayList<Double>>> prvNetworkDeltaW)
{
ArrayList<ArrayList<Double>> layerSums = this.getAllSums(input); // the sums for each layer
ArrayList<ArrayList<Double>> layerOutputs = this.getAllOutputs(input); // the outputs of each layer
// get the layer deltas (inc the input layer that is null)
ArrayList<ArrayList<Double>> layerDeltas = this.train2_getLayerDeltas(layerSums, layerOutputs, target);
// get the weight deltas
ArrayList<ArrayList<ArrayList<Double>>> networkDeltaW = this.train2_getWeightDeltas(layerOutputs, layerDeltas, prvNetworkDeltaW);
// change the weights
this.train2_updateWeights(networkDeltaW);
return networkDeltaW;
}
public void train2_updateWeights (ArrayList<ArrayList<ArrayList<Double>>> networkDeltaW)
{
for (int i=1; i<this.layers.size(); i++)
this.layers.get(i).train2_updateWeights(networkDeltaW.get(i));
}
public ArrayList<ArrayList<ArrayList<Double>>> train2_getWeightDeltas (
ArrayList<ArrayList<Double>> layerOutputs,
ArrayList<ArrayList<Double>> layerDeltas,
ArrayList<ArrayList<ArrayList<Double>>> prvNetworkDeltaW)
{
ArrayList<ArrayList<ArrayList<Double>>> networkDeltaW = new ArrayList<>(this.layers.size());
ArrayList<ArrayList<Double>> layerDeltaW;
ArrayList<Double> neuronDeltaW;
for (int i=0; i<this.layers.size(); i++)
networkDeltaW.add(new ArrayList<ArrayList<Double>>());
double
deltaW, x, learningRate, prvDeltaW, d;
int i, j, k;
for (i=this.layers.size()-1; i>0; i--) // for each layer
{
learningRate = this.getLearningRate(i);
layerDeltaW = new ArrayList<>();
networkDeltaW.set(i, layerDeltaW);
for (j=0; j<this.layers.get(i).getNeurons().size(); j++) // for each neuron of this layer
{
neuronDeltaW = new ArrayList<>();
layerDeltaW.add(neuronDeltaW);
for (k=0; k<this.layers.get(i-1).getNeurons().size(); k++) // for each weight (i.e. each neuron of the previous layer)
{
d = layerDeltas.get(i).get(j);
x = layerOutputs.get(i-1).get(k);
prvDeltaW = (prvNetworkDeltaW != null)
? prvNetworkDeltaW.get(i).get(j).get(k)
: 0.0;
deltaW = -learningRate * d * x + this.momentum * prvDeltaW;
neuronDeltaW.add(deltaW);
}
// the bias !!
d = layerDeltas.get(i).get(j);
x = 1;
prvDeltaW = (prvNetworkDeltaW != null)
? prvNetworkDeltaW.get(i).get(j).get(prvNetworkDeltaW.get(i).get(j).size()-1)
: 0.0;
deltaW = -learningRate * d * x + this.momentum * prvDeltaW;
neuronDeltaW.add(deltaW);
}
}
return networkDeltaW;
}
ArrayList<ArrayList<Double>> train2_getLayerDeltas (
ArrayList<ArrayList<Double>> layerSums,
ArrayList<ArrayList<Double>> layerOutputs,
ArrayList<Double> target)
{
// get output deltas
ArrayList<Double> outputDeltas = new ArrayList<>(); // the output layer deltas
double
oErr, // output error given a target
s, // sum
o, // output
d; // delta
int
nOutputs = target.size(), // #TODO ?== this.outputLayer.size()
nLayers = this.hiddenLayers.size()+2; // #TODO ?== layerOutputs.size()
for (int i=0; i<nOutputs; i++) // for each neuron...
{
s = layerSums.get(nLayers-1).get(i);
o = layerOutputs.get(nLayers-1).get(i);
oErr = (target.get(i) - o);
d = -oErr * this.getNeuron(nLayers-1, i).sigmaPrime(s); // #TODO "s" or "o" ??
outputDeltas.add(d);
}
// get hidden deltas
ArrayList<ArrayList<Double>> hiddenDeltas = new ArrayList<>();
for (int i=0; i<this.hiddenLayers.size(); i++)
hiddenDeltas.add(new ArrayList<Double>());
NLayer nextLayer = this.outputLayer;
ArrayList<Double> nextDeltas = outputDeltas;
int
h, k,
nHidden = this.hiddenLayers.size(),
nNeurons = this.hiddenLayers.get(nHidden-1).getNeurons().size();
double
wdSum = 0.0;
for (int i=nHidden-1; i>=0; i--) // for each hidden layer
{
hiddenDeltas.set(i, new ArrayList<Double>());
for (h=0; h<nNeurons; h++)
{
wdSum = 0.0;
for (k=0; k<nextLayer.getNeurons().size(); k++)
{
wdSum += nextLayer.getNeuron(k).getWeight(h) * nextDeltas.get(k);
}
s = layerSums.get(i+1).get(h);
d = this.getNeuron(i+1, h).sigmaPrime(s) * wdSum;
hiddenDeltas.get(i).add(d);
}
nextLayer = this.hiddenLayers.get(i);
nextDeltas = hiddenDeltas.get(i);
}
ArrayList<ArrayList<Double>> deltas = new ArrayList<>();
// input layer deltas: void
deltas.add(null);
// hidden layers deltas
deltas.addAll(hiddenDeltas);
// output layer deltas
deltas.add(outputDeltas);
return deltas;
}
public double getInputMeanSquareError (ArrayList<Double> input, ArrayList<Double> target)
{
double diff, mse=0.0;
ArrayList<Double> output = this.getOutput(input);
for (int i=0; i<target.size(); i++)
{
diff = target.get(i) - output.get(i);
mse += (diff * diff);
}
mse /= 2.0;
return mse;
}
}
Some of the method names (with their return values/types) are quite self-explanatory, like "this.getAllSums", which returns the sums (sum(x_i*w_i) for each neuron) of each layer, "this.getAllOutputs", which returns the outputs (sigmoid(sum) for each neuron) of each layer, and "this.getNeuron(i,j)", which returns the j'th neuron of the i'th layer.
Thank you in advance for your help :)
Here is a very simple Java implementation with tests in the main method:
import java.util.Arrays;
import java.util.Random;
public class MLP {
public static class MLPLayer {
float[] output;
float[] input;
float[] weights;
float[] dweights;
boolean isSigmoid = true;
public MLPLayer(int inputSize, int outputSize, Random r) {
output = new float[outputSize];
input = new float[inputSize + 1];
weights = new float[(1 + inputSize) * outputSize];
dweights = new float[weights.length];
initWeights(r);
}
public void setIsSigmoid(boolean isSigmoid) {
this.isSigmoid = isSigmoid;
}
public void initWeights(Random r) {
for (int i = 0; i < weights.length; i++) {
weights[i] = (r.nextFloat() - 0.5f) * 4f;
}
}
public float[] run(float[] in) {
System.arraycopy(in, 0, input, 0, in.length);
input[input.length - 1] = 1;
int offs = 0;
Arrays.fill(output, 0);
for (int i = 0; i < output.length; i++) {
for (int j = 0; j < input.length; j++) {
output[i] += weights[offs + j] * input[j];
}
if (isSigmoid) {
output[i] = (float) (1 / (1 + Math.exp(-output[i])));
}
offs += input.length;
}
return Arrays.copyOf(output, output.length);
}
public float[] train(float[] error, float learningRate, float momentum) {
int offs = 0;
float[] nextError = new float[input.length];
for (int i = 0; i < output.length; i++) {
float d = error[i];
if (isSigmoid) {
d *= output[i] * (1 - output[i]);
}
for (int j = 0; j < input.length; j++) {
int idx = offs + j;
nextError[j] += weights[idx] * d;
float dw = input[j] * d * learningRate;
weights[idx] += dweights[idx] * momentum + dw;
dweights[idx] = dw;
}
offs += input.length;
}
return nextError;
}
}
MLPLayer[] layers;
public MLP(int inputSize, int[] layersSize) {
layers = new MLPLayer[layersSize.length];
Random r = new Random(1234);
for (int i = 0; i < layersSize.length; i++) {
int inSize = i == 0 ? inputSize : layersSize[i - 1];
layers[i] = new MLPLayer(inSize, layersSize[i], r);
}
}
public MLPLayer getLayer(int idx) {
return layers[idx];
}
public float[] run(float[] input) {
float[] actIn = input;
for (int i = 0; i < layers.length; i++) {
actIn = layers[i].run(actIn);
}
return actIn;
}
public void train(float[] input, float[] targetOutput, float learningRate, float momentum) {
float[] calcOut = run(input);
float[] error = new float[calcOut.length];
for (int i = 0; i < error.length; i++) {
error[i] = targetOutput[i] - calcOut[i]; // negative error
}
for (int i = layers.length - 1; i >= 0; i--) {
error = layers[i].train(error, learningRate, momentum);
}
}
public static void main(String[] args) throws Exception {
float[][] train = new float[][]{new float[]{0, 0}, new float[]{0, 1}, new float[]{1, 0}, new float[]{1, 1}};
float[][] res = new float[][]{new float[]{0}, new float[]{1}, new float[]{1}, new float[]{0}};
MLP mlp = new MLP(2, new int[]{2, 1});
mlp.getLayer(1).setIsSigmoid(false);
Random r = new Random();
int en = 500;
for (int e = 0; e < en; e++) {
for (int i = 0; i < res.length; i++) {
int idx = r.nextInt(res.length);
mlp.train(train[idx], res[idx], 0.3f, 0.6f);
}
if ((e + 1) % 100 == 0) {
System.out.println();
for (int i = 0; i < res.length; i++) {
float[] t = train[i];
System.out.printf("%d epoch\n", e + 1);
System.out.printf("%.1f, %.1f --> %.3f\n", t[0], t[1], mlp.run(t)[0]);
}
}
}
}
}
I tried going over your code, but as you stated, it was pretty long.
Here's what I suggest:
To verify that your network is learning properly, try to train a simple network, like a network that recognizes the XOR operator. This shouldn't take all that long.
Use the simplest back-propagation algorithm. Stochastic backpropagation (where the weights are updated after the presentation of each training input) is the easiest. Implement the algorithm without the momentum term initially, and with a constant learning rate (i.e., don't start with adaptive learning-rates). Once you're satisfied that the algorithm is working, you can introduce the momentum term. Doing too many things at the same time increases the chances that more than one thing can go wrong. This makes it harder for you to see where you went wrong.
If you want to go over some code, you can check out some code that I wrote; you want to look at Backpropagator.java. I've basically implemented the stochastic backpropagation algorithm with a momentum term. I also have a video where I provide a quick explanation of my implementation of the backpropagation algorithm.
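As a rough sketch of the second suggestion, the per-weight update can be isolated in one small function so that the momentum term is easy to switch off. The formula mirrors the deltaW line in the question's train2_getWeightDeltas. The class and method names are invented for this example.

```java
// Minimal sketch; MomentumUpdate and deltaW are hypothetical names.
class MomentumUpdate {
    // One stochastic weight change: gradient step plus an optional momentum term.
    // delta is the neuron's error term, input is the value fed through this weight.
    static double deltaW(double learningRate, double delta, double input,
                         double momentum, double previousDeltaW) {
        return -learningRate * delta * input + momentum * previousDeltaW;
    }

    public static void main(String[] args) {
        // With momentum = 0 the previous change is ignored: plain stochastic backprop.
        double plain = deltaW(0.5, 2.0, 1.0, 0.0, 123.0);
        // With momentum > 0 part of the previous change is carried over.
        double withMomentum = deltaW(0.5, 2.0, 1.0, 0.5, plain);
        System.out.println(plain + " " + withMomentum);
    }
}
```

Starting with momentum fixed at 0 and a single constant learning rate leaves only one term to verify against a hand calculation.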
Hopefully this is of some help!