I've been trying to animate a model in OpenGL using Assimp.
The result of my attempts is this.
Loading bones:
List<Bone> getBones(AIMesh mesh) {
List<Bone> bones = new ArrayList<>();
for (int i = 0; i < mesh.mNumBones(); i++) {
AIBone aiBone = AIBone.create(mesh.mBones().get(i));
Bone bone = new Bone(aiBone.mName().dataString());
bone.setOffset(aiMatrixToMatrix(aiBone.mOffsetMatrix()).transpose());
bones.add(bone);
}
return bones;
}
Loading vertices:
VertexData processVertices(AIMesh mesh) {
float[] weights = null;
int[] boneIds = null;
float[] vertices = new float[mesh.mNumVertices() * 3];
boolean calculateBones = mesh.mNumBones() != 0;
if (calculateBones) {
weights = new float[mesh.mNumVertices() * 4];
boneIds = new int[mesh.mNumVertices() * 4];
}
int i = 0;
int k = 0;
for (AIVector3D vertex : mesh.mVertices()) {
vertices[i++] = vertex.x();
vertices[i++] = vertex.y();
vertices[i++] = vertex.z();
//bone data if any
if (calculateBones) {
for (int j = 0; j < mesh.mNumBones(); j++) {
AIBone bone = AIBone.create(mesh.mBones().get(j));
for (AIVertexWeight weight : bone.mWeights()) {
if (weight.mVertexId() == i - 3) {
k++;
boneIds[k] = j;
weights[k] = weight.mWeight();
}
}
}
}
}
What am I doing wrong?
Are all the matrices required for the bind pose, or can I use only the offset matrix for testing?
If I read your code correctly, you are not getting the indices from the faces, right? You need to iterate over the faces of your mesh to get the correct indices, if I understand the approach you are using.
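Separately from the face indices, the bone arrays themselves may be worth a second look: in the posted loop, `i - 3` is an offset into the float array rather than a vertex id, and `k` is a single global counter rather than a per-vertex slot. A minimal sketch of one way to fill the arrays follows. It uses plain arrays instead of the Assimp types; `vertexIds`, `vertexWeights` and `MAX_BONES_PER_VERTEX` are illustrative stand-ins for what `AIBone.mWeights()` provides and are not part of the original code.

```java
import java.util.Arrays;

final class BoneWeightSketch {
    // Assumption: at most four bone influences per vertex, matching the "* 4" arrays above.
    static final int MAX_BONES_PER_VERTEX = 4;

    /**
     * vertexIds[b][n] and vertexWeights[b][n] hold the n-th (vertex id, weight)
     * pair of bone b. boneIds and weights must have length numVertices * 4.
     */
    static void assign(int numVertices, int[][] vertexIds, float[][] vertexWeights,
                       int[] boneIds, float[] weights) {
        int[] used = new int[numVertices]; // slots already filled for each vertex
        for (int b = 0; b < vertexIds.length; b++) {
            for (int n = 0; n < vertexIds[b].length; n++) {
                int v = vertexIds[b][n];
                if (used[v] >= MAX_BONES_PER_VERTEX) {
                    continue; // drop influences beyond the fourth
                }
                int slot = v * MAX_BONES_PER_VERTEX + used[v]++;
                boneIds[slot] = b;
                weights[slot] = vertexWeights[b][n];
            }
        }
    }

    public static void main(String[] args) {
        int[][] ids = {{0, 1}, {1, 2}};             // bone 0 -> vertices 0,1; bone 1 -> vertices 1,2
        float[][] w = {{1.0f, 0.5f}, {0.5f, 1.0f}};
        int[] boneIds = new int[3 * MAX_BONES_PER_VERTEX];
        float[] weights = new float[3 * MAX_BONES_PER_VERTEX];
        assign(3, ids, w, boneIds, weights);
        System.out.println(Arrays.toString(boneIds));
        System.out.println(Arrays.toString(weights));
    }
}
```

Iterating bones once and writing into `v * 4 + used[v]` also avoids rescanning every bone's weight list for every vertex.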
I'm currently coding my own convolutional neural network in Java. First I implemented the fully-connected-layers which worked perfectly fine (it worked correctly with the MNIST dataset).
Now I have also implemented the convolutional layer and tried it with a really simple example:
Network nn = new Network(new SGD(0.01), new CrossEntropy(), new Convolution(6, 3, 3, 2), new Convolution(4, 2, 2, 1), new Flatten());
ConvTrainPair[] trainPairs = new ConvTrainPair[] {
new ConvTrainPair(Cube.ones(6, 6, 3), Vector.from(0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.8))
};
//System.out.println(nn.run(new Cube[] {Cube.ones(6, 6, 3)}));
for(int e = 0; e < 100000000; ++e) {
nn.train(trainPairs, 2);
Matrix out = (Matrix) nn.run(new Cube[] {Cube.ones(6, 6, 3)});
System.out.println("out: " + out);
}
But somehow after a few iterations the network is only returning NaN and I have no idea why. The network is just a simple network with one convolutional-layer (with input-size 6 and input-depth 3 and kernel-size 3 and numKernels 2), another one (with input-size 4 and input-depth 2 and kernel-size 2 and numKernels 1) and a final Flatten layer which turns the output into a matrix. I'm using stochastic-gradient-descent and the cross-entropy loss.
I'm quite sure that the error must be in my code of the convolution-layer:
public class Convolution extends Layer {
private Matrix[][] filters, filterGradients;
private Matrix[] bias, biasGradients;
private Cube[] lastInputBatch;
private Trainer trainer;
private int trainerWW, trainerWH, trainerBL;
public Convolution(int inputSize, int inputDepth, int filterSize, int numFilters) {
int outputSize = inputSize - filterSize + 1;
this.trainerWW = numFilters * filterSize;
this.trainerWH = inputDepth * filterSize;
this.trainerBL = (numFilters + outputSize) * outputSize;
this.bias = new Matrix[numFilters];
this.filters = new Matrix[numFilters][inputDepth];
for(int i = 0; i < numFilters; ++i) {
bias[i] = Matrix.random(outputSize, outputSize);
for(int j = 0; j < inputDepth; ++j) {
filters[i][j] = Matrix.random(filterSize, filterSize);
}
}
}
@Override
public void init(Optimizer optimizer) {
this.trainer = new Trainer(trainerWW, trainerWH, trainerBL, optimizer);
}
@Override
public Object feedforward(Cube[] batch) {
this.lastInputBatch = batch;
Cube[] out = new Cube[batch.length];
for(int b = 0; b < batch.length; ++b) {
Cube current = batch[b];
Matrix[] fMaps = new Matrix[filters.length];
for(int i = 0; i < filters.length; ++i) {
fMaps[i] = bias[i];
for(int j = 0; j < filters[i].length; ++j) {
fMaps[i].addE(crossCorrelate(current.data[j], filters[i][j]));
}
}
out[b] = new Cube(fMaps);
}
return out;
}
@Override
public Object backward(Cube[] deltaBatch, boolean needsActivationGradient) {
Cube[] inputDeltaBatch = new Cube[deltaBatch.length];
for(int b = 0; b < deltaBatch.length; ++b) {
Cube delta = deltaBatch[b];
Cube lastInput = lastInputBatch[b];
filterGradients = new Matrix[filters.length][filters[0].length];
biasGradients = new Matrix[filterGradients.length];
Matrix[] inputGradients = new Matrix[filters[0].length];
for(int i = 0; i < filterGradients.length; ++i) {
for(int j = 0; j < filterGradients[i].length; ++j) {
Matrix filter = filters[i][j];
filterGradients[i][j] = crossCorrelate(lastInput.data[j], delta.data[i]);
if(i == 0) inputGradients[j] = new Matrix(lastInput.width(), lastInput.height());
inputGradients[j].addE(crossCorrelate(delta.data[i].padding(filter.w - 1), filter.rotate180()));
}
biasGradients[i] = delta.data[i];
}
inputDeltaBatch[b] = new Cube(inputGradients);
}
return inputDeltaBatch;
}
public static Matrix crossCorrelate(Matrix input, Matrix kernel) {
int nmW = input.w - kernel.w + 1;
int nmH = input.h - kernel.h + 1;
Matrix nm = new Matrix(nmW, nmH);
for(int i = 0; i < nmW; ++i) {
for(int j = 0; j < nmH; ++j) {
for(int a = 0; a < kernel.w; ++a) {
for(int b = 0; b < kernel.h; ++b) {
nm.e[i][j] += input.e[i + a][j + b] * kernel.e[a][b];
}
}
}
}
return nm;
}
@Override
public void optimize(int episode) {
double lr = 0.001;
for(int i = 0; i < filters.length; ++i) {
for(int j = 0; j < filters[i].length; ++j) {
//System.out.println(filters[i][j]);
filters[i][j].subE(filterGradients[i][j].mul(lr));
}
}
for(int i = 0; i < bias.length; ++i) {
bias[i].subE(biasGradients[i].mul(lr));
}
//System.out.println();
} }
I already tried using only one convolutional layer, which solves the problem, but I don't understand why. I also tried changing the learning rate and the loss function, but that didn't solve the problem. I also inserted activation functions between the two convolutional layers and one after the flatten layer. With the sigmoid function it no longer returned NaN, but it also didn't seem to learn anything during training. With tanh and softmax it still returned NaN.
If you have any idea what the problem might be, I would be very grateful.
Thank you in advance!
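One thing that may be worth checking, independent of the math: in `feedforward`, the statement `fMaps[i] = bias[i]` copies the reference, not the values. If `addE` adds in place (an assumption, since the `Matrix` class is not shown), every forward pass accumulates into the bias itself. The sketch below uses a hypothetical stand-in class `Mat` to show the effect next to a copy-based alternative; it is not the original `Matrix` API.

```java
final class AliasSketch {
    /** Minimal stand-in for the Matrix class, which is not shown (hypothetical). */
    static final class Mat {
        final double[] e;
        Mat(double... values) { e = values.clone(); }
        Mat copy() { return new Mat(e); }
        /** In-place element-wise add, as addE is assumed to behave. */
        void addE(Mat other) {
            for (int i = 0; i < e.length; i++) {
                e[i] += other.e[i];
            }
        }
    }

    /** Aliased version: the result is the bias object itself, so the bias is modified. */
    static Mat forwardAliased(Mat bias, Mat conv) {
        Mat out = bias;
        out.addE(conv);
        return out;
    }

    /** Copy first: the bias keeps its value between calls. */
    static Mat forwardCopied(Mat bias, Mat conv) {
        Mat out = bias.copy();
        out.addE(conv);
        return out;
    }

    public static void main(String[] args) {
        Mat conv = new Mat(1.0, 1.0);
        Mat biasA = new Mat(0.5, 0.5);
        Mat biasB = new Mat(0.5, 0.5);
        for (int pass = 0; pass < 3; pass++) {
            forwardAliased(biasA, conv);
            forwardCopied(biasB, conv);
        }
        System.out.println("aliased bias[0] = " + biasA.e[0]);
        System.out.println("copied  bias[0] = " + biasB.e[0]);
    }
}
```

The same aliasing question applies to `biasGradients[i] = delta.data[i]` in `backward`. Whether this is the actual source of the NaN values cannot be confirmed from the posted code alone.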
At the moment I'm making a 2D strategy game that uses pathfinding to navigate my units over the (still small) tile map. The tiles are 32x32 and the map is 50x100 tiles, so it's very small :). Everything works so far, but the more units I create, the more it lags. Up to 30 units it works as it should, but with more the program lags heavily.
I use an ArrayList for my open set and (after some googling) I know that's bad. So I need to keep my open list sorted by using a TreeSet, but a TreeSet requires overriding compareTo(), and I'm not comfortable with comparisons like this.
What exactly must I compare, the f value or the signum? I don't know, and I need some help.
Here is the A* Algorithm:
public static List<Tile> findPath(int startx,int starty,int endx,int endy){
for(int i = 0; i < width; i++){
for(int j = 0;j < height;j++){
tiles[i][j] = new Tile(i,j,size,size,obstacles[i][j],false);
}
}
for(int i = 0; i < width; i++){
for(int j = 0;j < height;j++){
tiles[i][j].addNeighbours(tiles,width,height);
}
}
List<Tile> openList = new ArrayList<Tile>(); // Here i want a TreeSet
HashSet<Tile> closedList = new HashSet<Tile>();
List<Tile> path = null;
Tile start = tiles[startx][starty];
Tile end = tiles[endx][endy];
Tile closest = start;
closest.h = heuristic(closest,end);
openList.add(start);
while(!openList.isEmpty()) {
int winner = 0;
for (int i = 0; i < openList.size(); i++) {
if (openList.get(i).f < openList.get(winner).f) {
winner = i;
}
}
Tile current = openList.get(winner);
openList.remove(current);
if (current == end) {
path = new ArrayList<Tile>();
Tile tmp = current;
path.add(tmp);
while (tmp.previous != null) {
path.add(tmp);
tmp = tmp.previous;
}
return path;
}
closedList.add(current);
List<Tile> neighbours = current.neighbours;
for (int i = 0; i < neighbours.size(); i++) {
Tile neighbour = neighbours.get(i);
int cost = current.g + heuristic(current,neighbour);
if (openList.contains(neighbour) && cost < neighbour.g) {
openList.remove(neighbour);
}
if (closedList.contains(neighbour) && cost < neighbour.g) {
closedList.remove(neighbour);
}
int newcost = heuristic(neighbour, end);
if (!openList.contains(neighbour) && !closedList.contains(neighbour) && !neighbour.obstacle) {
neighbour.h = newcost;
if (neighbour.h < closest.h) {
closest = neighbour;
}
}
if (!openList.contains(neighbour) && !closedList.contains(neighbour) && !neighbour.obstacle) {
neighbour.g = cost;
openList.add(neighbour);
neighbour.f = neighbour.g + neighbour.h;
neighbour.previous = current;
}
}
}
Tile tmp = closest;
path = new ArrayList<Tile>();
path.add(tmp);
while (tmp.previous != null) {
path.add(tmp);
tmp = tmp.previous;
}
return path;
}
public static int heuristic(Tile A,Tile B) {
int dx = Math.abs(A.x - B.x);
int dy = Math.abs(A.y - B.y);
return 1 * (dx + dy) + (1 - 2 * 1) * Math.min(dx,dy);
}
And I have another problem. I load the entire map, including its obstacles, every time findPath is called, because I haven't found a way to load it only once. And I really tried a lot, believe me.
So here are my two questions:
What exactly must I compare in the compareTo method to make it work?
Where can I load my tile map once, so that A* doesn't have to rebuild it every time it is called?
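For the first question, a minimal sketch of an ordering on the f value is shown below. `Node` is a hypothetical stand-in for `Tile` that carries only the fields needed for ordering. The tie-breakers on x and y matter for a `TreeSet`, because it treats two elements whose `compareTo` returns 0 as the same element and would silently drop a tile with an equal f.

```java
import java.util.TreeSet;

final class OpenSetSketch {
    /** Hypothetical stand-in for Tile; only the fields used for ordering. */
    static final class Node implements Comparable<Node> {
        final int x, y;
        int f;

        Node(int x, int y, int f) {
            this.x = x;
            this.y = y;
            this.f = f;
        }

        @Override
        public int compareTo(Node o) {
            int c = Integer.compare(f, o.f);   // lowest f first
            if (c != 0) return c;
            c = Integer.compare(x, o.x);       // tie-breakers: two different tiles
            if (c != 0) return c;              // must never compare as equal
            return Integer.compare(y, o.y);
        }
    }

    /** Builds an open set that keeps its elements sorted by compareTo. */
    static TreeSet<Node> openSetOf(Node... nodes) {
        TreeSet<Node> open = new TreeSet<>();
        for (Node n : nodes) {
            open.add(n);
        }
        return open;
    }

    public static void main(String[] args) {
        TreeSet<Node> open = openSetOf(new Node(0, 0, 7), new Node(1, 0, 3), new Node(2, 0, 3));
        Node best = open.pollFirst(); // removes and returns the lowest-f node
        System.out.println(best.x + "," + best.y + " f=" + best.f + " remaining=" + open.size());
    }
}
```

Because the set is ordered by a mutable field, a node should be removed before its f changes and re-added afterwards; otherwise the set can no longer find it. A `java.util.PriorityQueue` with the same `compareTo` is a common alternative: it keeps duplicates and has an O(log n) `poll()`, at the cost of O(n) `contains` and `remove(Object)`.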
I am having trouble creating a genetic algorithm in Java. I am competing in an online GA contest. I am trying to save the best result back into index 0 each generation, but it just becomes a reference to the original index. This means that when I evolve the rest of the indexes, if the best member's original index is evolved, I lose it.
I have tried shimming it with a getClone method that converts the object's data to an int array and creates a new object from it.
Individual class:
class Individual {
public int[] angle;
public int[] thrust;
public double fitness;
public Individual(){
angle = new int[2];
thrust = new int[2];
for (int i = 0; i < 2; i++) {
this.angle[i] = ThreadLocalRandom.current().nextInt(0, 37) - 18;
this.thrust[i] = ThreadLocalRandom.current().nextInt(0, 202);
this.thrust[i] = ( (this.thrust[i] == 201) ? 650 : this.thrust[i] );
}
this.fitness = Double.MIN_VALUE;
}
public Individual(int[][] genes, double f){
this.fitness = f;
angle = new int[2];
thrust = new int[2];
this.angle[0] = genes[0][0];
this.angle[1] = genes[0][1];
this.thrust[0] = genes[1][0];
this.thrust[1] = genes[1][1];
}
public Individual getClone() {
int[][] genes = new int[2][2];
genes[0][0] = (int)this.angle[0];
genes[0][1] = (int)this.angle[1];
genes[1][0] = (int)this.thrust[0];
genes[1][1] = (int)this.thrust[1];
return ( new Individual(genes, this.fitness) );
}
public Individual crossover(Individual other) {
int[][] genes = new int[2][2];
genes[0][0] = (int)( (this.angle[0] + other.angle[0])/2 );
genes[0][1] = (int)( (this.angle[1] + other.angle[1])/2 );
genes[1][0] = ( (this.thrust[0] == 650 || other.thrust[0] == 650) ? 650: (int)( (this.thrust[0] + other.thrust[0])/2 ) );
genes[1][1] = ( (this.thrust[1] == 650 || other.thrust[1] == 650) ? 650: (int)( (this.thrust[1] + other.thrust[1])/2 ) );
return ( new Individual(genes, Double.MIN_VALUE) );
}
public void mutate() {
for (int i = 0; i < 2; i++) {
if(ThreadLocalRandom.current().nextInt(0, 2)==1) {
this.angle[i] = ThreadLocalRandom.current().nextInt(0, 37) - 18;
}
if(ThreadLocalRandom.current().nextInt(0, 2)==1) {
this.thrust[i] = ThreadLocalRandom.current().nextInt(0, 202);
this.thrust[i] = ( (this.thrust[i] == 201) ? 650 : this.thrust[i] );
}
}
}
}
Population class:
class Population {
public Individual[] individuals;
public Population(int populationSize) {
individuals = new Individual[populationSize];
for (int i = 0; i < populationSize; i ++) {
individuals[i] = new Individual();
}
}
public void resetFitness() {
for (int i = 0; i < individuals.length; i++) {
individuals[i].fitness = Double.MIN_VALUE;
}
}
public void setIndividual(int i, Individual indiv) {
individuals[i] = indiv.getClone();
}
public Individual getIndividual(int i) {
return individuals[i].getClone();
}
public int size() {
return this.individuals.length;
}
public Individual getFittest() {
int fittest = 0;
// Loop through individuals to find fittest
for (int i = 0; i < individuals.length; i++) {
if (individuals[i].fitness > individuals[fittest].fitness) {
fittest = i;
}
}
return individuals[fittest].getClone();
}
}
The necessaries from the sim class:
class simGA {
private Population pop;
private final static int TSIZE = 5; // tournament size
public simGA (int poolsize) {
this.pop = new Population(poolsize);
}
public Individual search(int generations, int totalMoves) {
//this.pop.resetFitness();
for (int g = 0; g < generations; g++) {
for (int i = 0; i < this.pop.individuals.length; i++) {
this.pop.individuals[i].fitness = sim(this.pop.individuals[i],totalMoves);
}
System.err.print("Generation " + g + " ");
this.pop = evolvePopulation(this.pop);
}
return pop.getFittest();
}
private Population evolvePopulation(Population p) {
//save fittest
Population tempPop = new Population(p.individuals.length);
tempPop.setIndividual(0, p.getFittest().getClone() );
System.err.print("Best move: " + tempPop.individuals[0].fitness);
System.err.println();
for (int i = 1; i < p.individuals.length; i++) {
Individual indiv1 = tournamentSelection(p);
Individual indiv2 = tournamentSelection(p);
Individual newIndiv = indiv1.crossover(indiv2);
newIndiv.mutate();
tempPop.setIndividual(i, newIndiv.getClone() );
}
return tempPop;
}
// Select individuals for crossover
private Individual tournamentSelection(Population pop) {
// Create a tournament population
Population tournament = new Population(TSIZE);
// For each place in the tournament get a random individual
for (int i = 0; i < TSIZE; i++) {
int randomId = ThreadLocalRandom.current().nextInt(1, this.pop.individuals.length);
tournament.setIndividual(i, pop.getIndividual(randomId).getClone() );
}
// Get the fittest
return tournament.getFittest().getClone();
}
private double sim(Individual s, int moves) {
return score; //score of simmed moves
}
How can I make sure that the best individual is saved as a copy, not as a reference? When I print the best score to stderr, sometimes it is lost and a worse-scoring move is chosen. I don't think it is necessarily an object cloning issue; I can clone the simulated game objects just fine, resetting them each run.
As I said, this is for a contest, so I cannot use any libraries on the site. That is also why I am not posting the full code: the intricacies of the simulator itself, which scores the moves, are not to be given away. Suffice it to say that the scores come back as expected for the move when worked out on paper.
In response to NWS: I thought my getClone method was doing a deep copy.
Reference used beside wiki and other knowledge on Genetic Algorithms: http://www.theprojectspot.com/tutorial-post/creating-a-genetic-algorithm-for-beginners/3
Individual newIndiv = indiv1.crossover(indiv2);
The line above resets the fitness to Double.MIN_VALUE. So whenever evolvePopulation is called, only the individual at index 0 is the fittest.
I have fixed it by not resimming the individual at index 0. However, this means there are other issues with my code not related to the question, since resimming the same individual from the same point in time as before should not change its fitness.
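A side note that may or may not be related to the lost scores: `Double.MIN_VALUE` is the smallest positive double, not the most negative one. If `sim` can return zero or negative scores (an assumption, since the simulator is not shown), an individual that has not been evaluated yet would outrank every evaluated one. A minimal sketch with made-up fitness values:

```java
final class FitnessSentinelSketch {
    /** Index of the largest value, mirroring the loop in getFittest(). */
    static int indexOfFittest(double[] fitness) {
        int fittest = 0;
        for (int i = 0; i < fitness.length; i++) {
            if (fitness[i] > fitness[fittest]) {
                fittest = i;
            }
        }
        return fittest;
    }

    public static void main(String[] args) {
        // Index 1 stands for an individual that has not been simulated yet.
        double[] withMinValue = {-120.0, Double.MIN_VALUE, -35.5};
        double[] withNegInf = {-120.0, Double.NEGATIVE_INFINITY, -35.5};

        System.out.println("MIN_VALUE > 0: " + (Double.MIN_VALUE > 0));
        System.out.println("fittest with MIN_VALUE sentinel: " + indexOfFittest(withMinValue));
        System.out.println("fittest with NEGATIVE_INFINITY sentinel: " + indexOfFittest(withNegInf));
    }
}
```

With `Double.NEGATIVE_INFINITY` (or `-Double.MAX_VALUE`) as the "not evaluated" marker, an unevaluated individual can never win the comparison.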
I am using libgdx to generate a 3D mesh in code. Right now I am attempting to generate a flat plane with many vertices in the middle just to test things out. However, I get an Exception in thread "LWJGL Application" java.nio.BufferOverflowException when I call mesh.setIndices(indices);, where the variable indices is a short array.
I don't have any trouble with fewer than roughly 150-180 indices. I wasn't able to figure out the exact number by trial and error, but I am certain that the exception is thrown if I have more than 180 indices.
Here is my code for creating the mesh:
First, how I generate my vertices (I don't think this is the problem, but I'll include it anyway). Note that my vertex attributes are (VertexAttribute.Position(), VertexAttribute.Normal(), VertexAttribute.ColorUnpacked()).
private float[] generateVertices(int width, int height) {
int index = 0;
float vertices[] = new float[width*height*10];
for(int i = 0; i < width; i ++) {
for(int j = 0; j < height; j++) {
//vertex coordinates
vertices[index] = i;
vertices[index+1] = 0;
vertices[index+2] = j;
// normal
vertices[index+3] = 0;
vertices[index+4] = 1;
vertices[index+5] = 0;
// random colors!!!
vertices[index+6] = MathUtils.random(0.3f, 0.99f);
vertices[index+7] = MathUtils.random(0.3f, 0.99f);
vertices[index+8] = MathUtils.random(0.3f, 0.99f);
vertices[index+9] = 1;
index+=10;
}
}
return vertices;
}
Secondly here is how I generate my indices:
private short[] generateIndices(int width, int height) {
int index = 0;
short indices[] = new short[(width-1)*(height-1)*3 * 2];
for(int i = 0; i < width-1; i ++) {
for (int j = 0; j < height-1; j++) {
indices[index] = (short)((j*height) + i);
indices[index+1] = (short)((j*height) + i+1);
indices[index+2] = (short)(((j+1)*height) + i);
indices[index+3] = (short)(((j+1)*height) + i);
indices[index+4] = (short)((j*height) + i+1);
indices[index+5] = (short)(((j+1)*height) + i + 1);
index+= 6;
}
}
return indices;
}
Thirdly this is how I set the vertices and indices (note that 6 and 7 are the width and height of the plane):
mesh.setVertices(generateVertices(6, 7));
mesh.setIndices(generateIndices(6, 7));
Finally, here is how I rendered my mesh through a custom shader.
shaderProgram.setUniformMatrix("u_projectionViewMatrix", camera.combined);
shaderProgram.setUniformMatrix("uMVMatrix", mat4);
// rendering with triangles
mesh.render(shaderProgram, GL20.GL_TRIANGLES);
I don't know what is causing this exception. Any help or suggestions are appreciated.
Thanks in advance.
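Two observations on the code above, offered as things to check rather than a confirmed diagnosis. First, with the vertex loops shown, vertex (i, j) is stored at position `i * height + j`, while the index code addresses `j * height + i`; for a 6 x 7 grid that reaches index 47 although only 42 vertices exist. Second, a 6 x 7 grid produces exactly 180 indices, and the libgdx `Mesh` constructor takes a `maxIndices` argument; the constructor call is not shown, so whether that capacity is the limit being hit is an assumption. The sketch below is plain Java without libgdx and generates indices consistent with the vertex layout; `vertexIndex` and `maxIndex` are helper names introduced here for illustration.

```java
final class GridIndexSketch {
    /** Position of vertex (i, j) when vertices are written with i outer and j inner. */
    static int vertexIndex(int i, int j, int height) {
        return i * height + j;
    }

    /** Two triangles per grid cell, six indices per cell. */
    static short[] generateIndices(int width, int height) {
        short[] indices = new short[(width - 1) * (height - 1) * 6];
        int index = 0;
        for (int i = 0; i < width - 1; i++) {
            for (int j = 0; j < height - 1; j++) {
                short a = (short) vertexIndex(i, j, height);
                short b = (short) vertexIndex(i + 1, j, height);
                short c = (short) vertexIndex(i, j + 1, height);
                short d = (short) vertexIndex(i + 1, j + 1, height);
                indices[index++] = a;
                indices[index++] = b;
                indices[index++] = c;
                indices[index++] = c;
                indices[index++] = b;
                indices[index++] = d;
            }
        }
        return indices;
    }

    /** Largest index value; it must stay below width * height. */
    static int maxIndex(short[] indices) {
        int max = -1;
        for (short s : indices) {
            max = Math.max(max, s);
        }
        return max;
    }

    public static void main(String[] args) {
        short[] indices = generateIndices(6, 7);
        System.out.println("index count: " + indices.length);
        System.out.println("largest index: " + maxIndex(indices) + " of " + (6 * 7) + " vertices");
    }
}
```

The winding order of the two triangles may need to be flipped depending on the face-culling settings in use.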
I'm trying to implement a feed-forward neural network in Java.
I've created three classes: NNeuron, NLayer and NNetwork. The "simple" calculations seem fine (I get correct sums/activations/outputs), but when it comes to the training process, I don't seem to get correct results. Can anyone please tell me what I'm doing wrong?
The whole code for the NNetwork class is quite long, so I'm posting the part that is causing the problem:
[EDIT]: this is actually pretty much all of the NNetwork class
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class NNetwork
{
public static final double defaultLearningRate = 0.4;
public static final double defaultMomentum = 0.8;
private NLayer inputLayer;
private ArrayList<NLayer> hiddenLayers;
private NLayer outputLayer;
private ArrayList<NLayer> layers;
private double momentum = NNetwork.defaultMomentum; // alpha: momentum
private ArrayList<Double> learningRates;
public NNetwork (int nInputs, int nOutputs, Integer... neuronsPerHiddenLayer)
{
this(nInputs, nOutputs, Arrays.asList(neuronsPerHiddenLayer));
}
public NNetwork (int nInputs, int nOutputs, List<Integer> neuronsPerHiddenLayer)
{
// the number of neurons on the last layer build so far (i.e. the number of inputs for each neuron of the next layer)
int prvOuts = 1;
this.layers = new ArrayList<>();
// input layer
this.inputLayer = new NLayer(nInputs, prvOuts, this);
this.inputLayer.setAllWeightsTo(1.0);
this.inputLayer.setAllBiasesTo(0.0);
this.inputLayer.useSigmaForOutput(false);
prvOuts = nInputs;
this.layers.add(this.inputLayer);
// hidden layers
this.hiddenLayers = new ArrayList<>();
for (int i=0 ; i<neuronsPerHiddenLayer.size() ; i++)
{
this.hiddenLayers.add(new NLayer(neuronsPerHiddenLayer.get(i), prvOuts, this));
prvOuts = neuronsPerHiddenLayer.get(i);
}
this.layers.addAll(this.hiddenLayers);
// output layer
this.outputLayer = new NLayer(nOutputs, prvOuts, this);
this.layers.add(this.outputLayer);
this.initCoeffs();
}
private void initCoeffs ()
{
this.learningRates = new ArrayList<>();
// learning rates of the hidden layers
for (int i=0 ; i<this.hiddenLayers.size(); i++)
this.learningRates.add(NNetwork.defaultLearningRate);
// learning rate of the output layer
this.learningRates.add(NNetwork.defaultLearningRate);
}
public double getLearningRate (int layerIndex)
{
if (layerIndex > 0 && layerIndex <= this.hiddenLayers.size()+1)
{
return this.learningRates.get(layerIndex-1);
}
else
{
return 0;
}
}
public ArrayList<Double> getLearningRates ()
{
return this.learningRates;
}
public void setLearningRate (int layerIndex, double newLearningRate)
{
if (layerIndex > 0 && layerIndex <= this.hiddenLayers.size()+1)
{
this.learningRates.set(
layerIndex-1,
newLearningRate);
}
}
public void setLearningRates (Double... newLearningRates)
{
this.setLearningRates(Arrays.asList(newLearningRates));
}
public void setLearningRates (List<Double> newLearningRates)
{
int len = (this.learningRates.size() <= newLearningRates.size())
? this.learningRates.size()
: newLearningRates.size();
for (int i=0; i<len; i++)
this.learningRates
.set(i,
newLearningRates.get(i));
}
public double getMomentum ()
{
return this.momentum;
}
public void setMomentum (double momentum)
{
this.momentum = momentum;
}
public NNeuron getNeuron (int layerIndex, int neuronIndex)
{
if (layerIndex == 0)
return this.inputLayer.getNeurons().get(neuronIndex);
else if (layerIndex == this.hiddenLayers.size()+1)
return this.outputLayer.getNeurons().get(neuronIndex);
else
return this.hiddenLayers.get(layerIndex-1).getNeurons().get(neuronIndex);
}
public ArrayList<Double> getOutput (ArrayList<Double> inputs)
{
ArrayList<Double> lastOuts = inputs; // the last computed outputs of the last 'called' layer so far
// input layer
//lastOuts = this.inputLayer.getOutput(lastOuts);
lastOuts = this.getInputLayerOutputs(lastOuts);
// hidden layers
for (NLayer layer : this.hiddenLayers)
lastOuts = layer.getOutput(lastOuts);
// output layer
lastOuts = this.outputLayer.getOutput(lastOuts);
return lastOuts;
}
public ArrayList<ArrayList<Double>> getAllOutputs (ArrayList<Double> inputs)
{
ArrayList<ArrayList<Double>> outs = new ArrayList<>();
// input layer
outs.add(this.getInputLayerOutputs(inputs));
// hidden layers
for (NLayer layer : this.hiddenLayers)
outs.add(layer.getOutput(outs.get(outs.size()-1)));
// output layer
outs.add(this.outputLayer.getOutput(outs.get(outs.size()-1)));
return outs;
}
public ArrayList<ArrayList<Double>> getAllSums (ArrayList<Double> inputs)
{
//*
ArrayList<ArrayList<Double>> sums = new ArrayList<>();
ArrayList<Double> lastOut;
// input layer
sums.add(inputs);
lastOut = this.getInputLayerOutputs(inputs);
// hidden nodes
for (NLayer layer : this.hiddenLayers)
{
sums.add(layer.getSums(lastOut));
lastOut = layer.getOutput(lastOut);
}
// output layer
sums.add(this.outputLayer.getSums(lastOut));
return sums;
}
public ArrayList<Double> getInputLayerOutputs (ArrayList<Double> inputs)
{
ArrayList<Double> outs = new ArrayList<>();
for (int i=0 ; i<this.inputLayer.getNeurons().size() ; i++)
outs.add(this
.inputLayer
.getNeuron(i)
.getOutput(inputs.get(i)));
return outs;
}
public void changeWeights (
ArrayList<ArrayList<Double>> deltaW,
ArrayList<ArrayList<Double>> inputSet,
ArrayList<ArrayList<Double>> targetSet,
boolean checkError)
{
for (int i=0 ; i<deltaW.size()-1 ; i++)
this.hiddenLayers.get(i).changeWeights(deltaW.get(i), inputSet, targetSet, checkError);
this.outputLayer.changeWeights(deltaW.get(deltaW.size()-1), inputSet, targetSet, checkError);
}
public int train2 (
ArrayList<ArrayList<Double>> inputSet,
ArrayList<ArrayList<Double>> targetSet,
double maxError,
int maxIterations)
{
ArrayList<Double>
input,
target;
ArrayList<ArrayList<ArrayList<Double>>> prvNetworkDeltaW = null;
double error;
int i = 0, j = 0, traininSetLength = inputSet.size();
do // during each iteration...
{
error = 0.0;
for (j = 0; j < traininSetLength; j++) // ... for each training element...
{
input = inputSet.get(j);
target = targetSet.get(j);
prvNetworkDeltaW = this.train2_bp(input, target, prvNetworkDeltaW); // ... do backpropagation, and return the new weight deltas
error += this.getInputMeanSquareError(input, target);
}
i++;
} while (error > maxError && i < maxIterations); // iterate as much as necessary/possible
return i;
}
public ArrayList<ArrayList<ArrayList<Double>>> train2_bp (
ArrayList<Double> input,
ArrayList<Double> target,
ArrayList<ArrayList<ArrayList<Double>>> prvNetworkDeltaW)
{
ArrayList<ArrayList<Double>> layerSums = this.getAllSums(input); // the sums for each layer
ArrayList<ArrayList<Double>> layerOutputs = this.getAllOutputs(input); // the outputs of each layer
// get the layer deltas (inc the input layer that is null)
ArrayList<ArrayList<Double>> layerDeltas = this.train2_getLayerDeltas(layerSums, layerOutputs, target);
// get the weight deltas
ArrayList<ArrayList<ArrayList<Double>>> networkDeltaW = this.train2_getWeightDeltas(layerOutputs, layerDeltas, prvNetworkDeltaW);
// change the weights
this.train2_updateWeights(networkDeltaW);
return networkDeltaW;
}
public void train2_updateWeights (ArrayList<ArrayList<ArrayList<Double>>> networkDeltaW)
{
for (int i=1; i<this.layers.size(); i++)
this.layers.get(i).train2_updateWeights(networkDeltaW.get(i));
}
public ArrayList<ArrayList<ArrayList<Double>>> train2_getWeightDeltas (
ArrayList<ArrayList<Double>> layerOutputs,
ArrayList<ArrayList<Double>> layerDeltas,
ArrayList<ArrayList<ArrayList<Double>>> prvNetworkDeltaW)
{
ArrayList<ArrayList<ArrayList<Double>>> networkDeltaW = new ArrayList<>(this.layers.size());
ArrayList<ArrayList<Double>> layerDeltaW;
ArrayList<Double> neuronDeltaW;
for (int i=0; i<this.layers.size(); i++)
networkDeltaW.add(new ArrayList<ArrayList<Double>>());
double
deltaW, x, learningRate, prvDeltaW, d;
int i, j, k;
for (i=this.layers.size()-1; i>0; i--) // for each layer
{
learningRate = this.getLearningRate(i);
layerDeltaW = new ArrayList<>();
networkDeltaW.set(i, layerDeltaW);
for (j=0; j<this.layers.get(i).getNeurons().size(); j++) // for each neuron of this layer
{
neuronDeltaW = new ArrayList<>();
layerDeltaW.add(neuronDeltaW);
for (k=0; k<this.layers.get(i-1).getNeurons().size(); k++) // for each weight (i.e. each neuron of the previous layer)
{
d = layerDeltas.get(i).get(j);
x = layerOutputs.get(i-1).get(k);
prvDeltaW = (prvNetworkDeltaW != null)
? prvNetworkDeltaW.get(i).get(j).get(k)
: 0.0;
deltaW = -learningRate * d * x + this.momentum * prvDeltaW;
neuronDeltaW.add(deltaW);
}
// the bias !!
d = layerDeltas.get(i).get(j);
x = 1;
prvDeltaW = (prvNetworkDeltaW != null)
? prvNetworkDeltaW.get(i).get(j).get(prvNetworkDeltaW.get(i).get(j).size()-1)
: 0.0;
deltaW = -learningRate * d * x + this.momentum * prvDeltaW;
neuronDeltaW.add(deltaW);
}
}
return networkDeltaW;
}
ArrayList<ArrayList<Double>> train2_getLayerDeltas (
ArrayList<ArrayList<Double>> layerSums,
ArrayList<ArrayList<Double>> layerOutputs,
ArrayList<Double> target)
{
// get output deltas
ArrayList<Double> outputDeltas = new ArrayList<>(); // the output layer deltas
double
oErr, // output error given a target
s, // sum
o, // output
d; // delta
int
nOutputs = target.size(), // #TODO ?== this.outputLayer.size()
nLayers = this.hiddenLayers.size()+2; // #TODO ?== layerOutputs.size()
for (int i=0; i<nOutputs; i++) // for each neuron...
{
s = layerSums.get(nLayers-1).get(i);
o = layerOutputs.get(nLayers-1).get(i);
oErr = (target.get(i) - o);
d = -oErr * this.getNeuron(nLayers-1, i).sigmaPrime(s); // #TODO "s" or "o" ??
outputDeltas.add(d);
}
// get hidden deltas
ArrayList<ArrayList<Double>> hiddenDeltas = new ArrayList<>();
for (int i=0; i<this.hiddenLayers.size(); i++)
hiddenDeltas.add(new ArrayList<Double>());
NLayer nextLayer = this.outputLayer;
ArrayList<Double> nextDeltas = outputDeltas;
int
h, k,
nHidden = this.hiddenLayers.size(),
nNeurons = this.hiddenLayers.get(nHidden-1).getNeurons().size();
double
wdSum = 0.0;
for (int i=nHidden-1; i>=0; i--) // for each hidden layer
{
hiddenDeltas.set(i, new ArrayList<Double>());
for (h=0; h<nNeurons; h++)
{
wdSum = 0.0;
for (k=0; k<nextLayer.getNeurons().size(); k++)
{
wdSum += nextLayer.getNeuron(k).getWeight(h) * nextDeltas.get(k);
}
s = layerSums.get(i+1).get(h);
d = this.getNeuron(i+1, h).sigmaPrime(s) * wdSum;
hiddenDeltas.get(i).add(d);
}
nextLayer = this.hiddenLayers.get(i);
nextDeltas = hiddenDeltas.get(i);
}
ArrayList<ArrayList<Double>> deltas = new ArrayList<>();
// input layer deltas: void
deltas.add(null);
// hidden layers deltas
deltas.addAll(hiddenDeltas);
// output layer deltas
deltas.add(outputDeltas);
return deltas;
}
public double getInputMeanSquareError (ArrayList<Double> input, ArrayList<Double> target)
{
double diff, mse=0.0;
ArrayList<Double> output = this.getOutput(input);
for (int i=0; i<target.size(); i++)
{
diff = target.get(i) - output.get(i);
mse += (diff * diff);
}
mse /= 2.0;
return mse;
}
}
Some method names (with their return values/types) are fairly self-explanatory: "this.getAllSums" returns the sums (sum(x_i*w_i) for each neuron) of each layer, "this.getAllOutputs" returns the outputs (sigmoid(sum) for each neuron) of each layer, and "this.getNeuron(i,j)" returns the j'th neuron of the i'th layer.
Thank you in advance for your help :)
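Regarding the `#TODO "s" or "o" ??` comment in `train2_getLayerDeltas`: for the logistic sigmoid the derivative can be written in terms of either quantity, but the two forms are not interchangeable. The sketch below shows both; `primeFromSum` and `primeFromOutput` are names made up here, and which one corresponds to `NNeuron.sigmaPrime` is unknown because that class is not shown.

```java
final class SigmoidSketch {
    static double sigmoid(double s) {
        return 1.0 / (1.0 + Math.exp(-s));
    }

    /** Derivative expressed in the weighted sum s: sigmoid(s) * (1 - sigmoid(s)). */
    static double primeFromSum(double s) {
        double o = sigmoid(s);
        return o * (1.0 - o);
    }

    /** The same derivative expressed in the output o = sigmoid(s): o * (1 - o). */
    static double primeFromOutput(double o) {
        return o * (1.0 - o);
    }

    public static void main(String[] args) {
        double s = 2.0;
        double o = sigmoid(s);
        System.out.println("from sum:    " + primeFromSum(s));
        System.out.println("from output: " + primeFromOutput(o));
        // Passing the output to the sum-based form gives a different, wrong value.
        System.out.println("mixed up:    " + primeFromSum(o));
    }
}
```

If `sigmaPrime` applies the sigmoid internally, passing `s` is consistent; if it computes `x * (1 - x)` directly, it needs `o` instead.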
Here is a very simple Java implementation with tests in the main method:
import java.util.Arrays;
import java.util.Random;
public class MLP {
public static class MLPLayer {
float[] output;
float[] input;
float[] weights;
float[] dweights;
boolean isSigmoid = true;
public MLPLayer(int inputSize, int outputSize, Random r) {
output = new float[outputSize];
input = new float[inputSize + 1];
weights = new float[(1 + inputSize) * outputSize];
dweights = new float[weights.length];
initWeights(r);
}
public void setIsSigmoid(boolean isSigmoid) {
this.isSigmoid = isSigmoid;
}
public void initWeights(Random r) {
for (int i = 0; i < weights.length; i++) {
weights[i] = (r.nextFloat() - 0.5f) * 4f;
}
}
public float[] run(float[] in) {
System.arraycopy(in, 0, input, 0, in.length);
input[input.length - 1] = 1;
int offs = 0;
Arrays.fill(output, 0);
for (int i = 0; i < output.length; i++) {
for (int j = 0; j < input.length; j++) {
output[i] += weights[offs + j] * input[j];
}
if (isSigmoid) {
output[i] = (float) (1 / (1 + Math.exp(-output[i])));
}
offs += input.length;
}
return Arrays.copyOf(output, output.length);
}
public float[] train(float[] error, float learningRate, float momentum) {
int offs = 0;
float[] nextError = new float[input.length];
for (int i = 0; i < output.length; i++) {
float d = error[i];
if (isSigmoid) {
d *= output[i] * (1 - output[i]);
}
for (int j = 0; j < input.length; j++) {
int idx = offs + j;
nextError[j] += weights[idx] * d;
float dw = input[j] * d * learningRate;
weights[idx] += dweights[idx] * momentum + dw;
dweights[idx] = dw;
}
offs += input.length;
}
return nextError;
}
}
MLPLayer[] layers;
public MLP(int inputSize, int[] layersSize) {
layers = new MLPLayer[layersSize.length];
Random r = new Random(1234);
for (int i = 0; i < layersSize.length; i++) {
int inSize = i == 0 ? inputSize : layersSize[i - 1];
layers[i] = new MLPLayer(inSize, layersSize[i], r);
}
}
public MLPLayer getLayer(int idx) {
return layers[idx];
}
public float[] run(float[] input) {
float[] actIn = input;
for (int i = 0; i < layers.length; i++) {
actIn = layers[i].run(actIn);
}
return actIn;
}
public void train(float[] input, float[] targetOutput, float learningRate, float momentum) {
float[] calcOut = run(input);
float[] error = new float[calcOut.length];
for (int i = 0; i < error.length; i++) {
error[i] = targetOutput[i] - calcOut[i]; // negative error
}
for (int i = layers.length - 1; i >= 0; i--) {
error = layers[i].train(error, learningRate, momentum);
}
}
public static void main(String[] args) throws Exception {
float[][] train = new float[][]{new float[]{0, 0}, new float[]{0, 1}, new float[]{1, 0}, new float[]{1, 1}};
float[][] res = new float[][]{new float[]{0}, new float[]{1}, new float[]{1}, new float[]{0}};
MLP mlp = new MLP(2, new int[]{2, 1});
mlp.getLayer(1).setIsSigmoid(false);
Random r = new Random();
int en = 500;
for (int e = 0; e < en; e++) {
for (int i = 0; i < res.length; i++) {
int idx = r.nextInt(res.length);
mlp.train(train[idx], res[idx], 0.3f, 0.6f);
}
if ((e + 1) % 100 == 0) {
System.out.println();
for (int i = 0; i < res.length; i++) {
float[] t = train[i];
System.out.printf("%d epoch\n", e + 1);
System.out.printf("%.1f, %.1f --> %.3f\n", t[0], t[1], mlp.run(t)[0]);
}
}
}
}
}
I tried going over your code, but as you stated, it was pretty long.
Here's what I suggest:
To verify that your network is learning properly, try to train a simple network, like a network that recognizes the XOR operator. This shouldn't take all that long.
Use the simplest back-propagation algorithm. Stochastic backpropagation (where the weights are updated after the presentation of each training input) is the easiest. Implement the algorithm without the momentum term initially, and with a constant learning rate (i.e., don't start with adaptive learning-rates). Once you're satisfied that the algorithm is working, you can introduce the momentum term. Doing too many things at the same time increases the chances that more than one thing can go wrong. This makes it harder for you to see where you went wrong.
If you want to go over some code, you can check out some code that I wrote; you want to look at Backpropagator.java. I've basically implemented the stochastic backpropagation algorithm with a momentum term. I also have a video where I provide a quick explanation of my implementation of the backpropagation algorithm.
Hopefully this is of some help!