Is it possible to get the RMSPE after cross-validating a model? I see I can easily get the RMSE — but what about the root mean square percentage error?
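(By RMSPE I mean sqrt((1/n) * Σ_i ((actual_i − predicted_i) / actual_i)²), i.e. the RMSE computed on the relative errors.)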
Sample code I've put together for WEKA linear-regression cross-validation:
// loads data and set class index
final ArrayList<Attribute> attributes = new ArrayList<>();
attributes.add(new Attribute("x"));
attributes.add(new Attribute("y"));
Instances data = new Instances("name", attributes, 0);
data.add(new DenseInstance(1d, new double[]{5, 80}));
// ... add more data
// -c last
data.setClassIndex(data.numAttributes() - 1);
// classifier
final LinearRegression cls = new LinearRegression();
// other options
int seed = 129;
int folds = 3;
// randomize data (note: crossValidateModel below also randomizes and stratifies
// internally, so this block only matters if you run the folds manually)
Random rand = new Random(seed);
Instances randData = new Instances(data);
randData.randomize(rand);
if (randData.classAttribute().isNominal())
    randData.stratify(folds);
// perform cross-validation
Evaluation eval = new Evaluation(data);
eval.crossValidateModel(cls, data, folds, new Random(seed));
System.out.println("rootMeanSquaredError " + eval.rootMeanSquaredError());
System.out.println("rootRelativeSquaredError " + eval.rootRelativeSquaredError());
System.out.println("rootMeanPriorSquaredError " + eval.rootMeanPriorSquaredError());
// output evaluation
System.out.println();
System.out.println("=== Setup ===");
System.out.println("Classifier: " + cls.getClass().getName() + " " + Utils.joinOptions(cls.getOptions()));
System.out.println("Dataset: " + data.relationName());
System.out.println("Folds: " + folds);
System.out.println("Seed: " + seed);
System.out.println();
System.out.println(eval.toSummaryString("=== " + folds + "-fold Cross-validation ===", true));
/*
=== Setup ===
Classifier: weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8 -num-decimal-places 4
Dataset: name
Folds: 3
Seed: 129
=== 3-fold Cross-validation ===
Correlation coefficient 0.6289
Mean absolute error 7.5177
Root mean squared error 8.262
Relative absolute error 85.7748 %
Root relative squared error 77.9819 %
Total Number of Instances 15
*/
Weka doesn't compute the RMSPE by default. I've put together a small Weka package that should do the trick for numeric classes (NB: it has only had limited testing), called rmspe-weka-package.
After an evaluation run (with that package installed), you should be able to retrieve the statistic as follows:
Evaluation eval = ... // initialize your evaluation object
... // perform your evaluation
double rmspe = eval.getPluginMetric("weka.classifiers.evaluation.RMSPE").getStatistic("RMSPE");
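Alternatively, if you'd rather not install a package, you can run the folds manually and accumulate the squared percentage errors yourself. A minimal sketch, reusing randData and folds from your snippet (and assuming no actual class value is zero):

// manual cross-validation, accumulating ((actual - predicted) / actual)^2
double sumSq = 0;
int count = 0;
for (int i = 0; i < folds; i++) {
    Instances train = randData.trainCV(folds, i);
    Instances test = randData.testCV(folds, i);
    Classifier copy = AbstractClassifier.makeCopy(cls);
    copy.buildClassifier(train);
    for (int j = 0; j < test.numInstances(); j++) {
        double actual = test.instance(j).classValue();
        double predicted = copy.classifyInstance(test.instance(j));
        sumSq += Math.pow((actual - predicted) / actual, 2); // assumes actual != 0
        count++;
    }
}
double rmspe = Math.sqrt(sumSq / count); // multiply by 100 to report as a percentage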
I have been trying to build a CNN model using DL4J, but it is giving me an error. The code:
RecordReader rr;
rr = new CSVRecordReader();
rr.initialize(new FileSplit(new File(dataLocalPath, nameTrain)));
DataSetIterator trainIter = new RecordReaderDataSetIterator(rr, batchSize, 0, 2);
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder().seed(123).updater(new Adam(0.1))
.list()
.layer(0,new Convolution1DLayer.Builder().kernelSize(3).padding(1).nIn(371).nOut(64).build())
.layer(1,new Subsampling1DLayer.Builder().kernelSize(3).padding(1).build())
.layer(2,new Convolution1DLayer.Builder().kernelSize(3).activation(Activation.RELU).padding(1).nIn(64)
.nOut(32).build())
.layer(3,new Subsampling1DLayer.Builder().kernelSize(3).padding(1).build())
.layer(4,new DenseLayer.Builder().activation(Activation.RELU).nIn(32).nOut(16).build())
.layer(5,new OutputLayer.Builder(LossFunction.RECONSTRUCTION_CROSSENTROPY).activation(Activation.SIGMOID)
.nIn(16).nOut(2).build())
.build();
MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();
model.setListeners(new ScoreIterationListener(10));
final DataSet trainData = trainIter.next();
INDArray a = trainData.getFeatures();
final INDArray b = trainData.getLabels();
a = a.reshape(new int[] { (int) a.size(0), (int) a.size(1), 1 });
model.fit(a, b);
I've added the error below:
Exception in thread "main" org.deeplearning4j.exception.DL4JInvalidInputException: Input that is not a matrix; expected matrix (rank 2), got rank 3 array with shape [128, 32, 1]. Missing preprocessor or wrong input type? (layer name: layer4, layer index: 4, layer type: DenseLayer)
at org.deeplearning4j.nn.layers.BaseLayer.preOutputWithPreNorm(BaseLayer.java:306)
at org.deeplearning4j.nn.layers.BaseLayer.preOutput(BaseLayer.java:289)
at org.deeplearning4j.nn.layers.BaseLayer.activate(BaseLayer.java:337)
at org.deeplearning4j.nn.layers.AbstractLayer.activate(AbstractLayer.java:257)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.ffToLayerActivationsInWs(MultiLayerNetwork.java:1129)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2741)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2699)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:170)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:63)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fitHelper(MultiLayerNetwork.java:2303)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:2261)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:2248)
at com.rssoftware.efrm.AnnModelFromKeras.trainModel(AnnModelFromKeras.java:73)
at com.rssoftware.efrm.AnnModelFromKeras.main(AnnModelFromKeras.java:89)
I have tried using an input preprocessor (the CNN-to-feed-forward preprocessor), but it is not working.
The expected input shape for a Conv1D layer is [minibatchSize, convNIn, length], i.e. [minibatchSize, featureSize, sequenceLength] in terms of a time series. The reshape in your code sets the length to 1. Did you perhaps intend to set featureSize/convNIn to 1 instead?
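For example, if each CSV row is a univariate sequence, a sketch of that alternative (assuming 371 time steps and a single input channel):

// shape [minibatchSize, convNIn, length] = [batch, 1, 371] instead of [batch, 371, 1]
INDArray a = trainData.getFeatures();
a = a.reshape(a.size(0), 1, a.size(1));
// the first layer would then declare one input channel:
// .layer(0, new Convolution1DLayer.Builder().kernelSize(3).padding(1).nIn(1).nOut(64).build())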
When I run the code below:
model.output(samples).getDouble(0);
I receive this error:
org.deeplearning4j.exception.DL4JInvalidInputException:
Cannot do forward pass in Convolution layer (layer name = conv1d_1, layer index = 0): input array channels does not match CNN layer configuration
(data input channels = 80, [minibatch,inputDepth,height,width]=[1, 80, 3, 1]; expected input channels = 3)
(layer name: conv1d_1, layer index: 0, layer type: Convolution1DLayer)
I created the data as a float[] array of length 240.
Creation of INDArray:
INDArray features = Nd4j.create(data, new int[]{1, 240}, 'c');
Here is my Keras model:
model = Sequential()
model.add(Reshape((const.PERIOD, const.N_FEATURES), input_shape=(240,)))
model.add(Conv1D(100, 10, activation='relu', input_shape=(const.PERIOD, const.N_FEATURES)))
model.add(Conv1D(100, 10, activation='relu'))
model.add(MaxPooling1D(const.N_FEATURES))
model.add(Conv1D(160, 10, activation='relu'))
model.add(Conv1D(160, 10, activation='relu'))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(7, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
where PERIOD = 80 and N_FEATURES = 3.
If I set the shape as:
INDArray features = Nd4j.create(data, new int[]{240, 1});
then the error is:
IllegalStateException: Input shape [240, 1] and output shape[240, 1] do not match
at org.deeplearning4j.nn.modelimport.keras.preprocessors.ReshapePreprocessor.preProcess(ReshapePreprocessor.java:103)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.outputOfLayerDetached(MultiLayerNetwork.java:1256)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2340)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2303)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2294)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2281)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.output(MultiLayerNetwork.java:2377)
Can you file an issue? This looks like a bug. Thanks.
https://github.com/eclipse/deeplearning4j/issues
I have a problem and I can't find what's wrong with my code.
From the beginning: I have an application that recognizes digits and the basic math operators (+, -, /, *). I have a trained model that produces a graph, and this graph is imported by an Android application. In my opinion it should work, so I don't understand why it doesn't.
Model.py
import tensorflow as tf
import numpy as np
import _pickle as pickle
from matplotlib import pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data
# mnist_data = input_data.read_data_sets('MNIST_data', one_hot=True)
# read CROHME_test_data
with open("test.pickle", 'rb') as f:
data = pickle.load(f)
# np.random.shuffle(data)
CROHME_test_data = {"features": np.array([d["features"] for d in data]), "labels": np.array([d["label"] for d in data])}
# read CROHME_train_data
with open("train.pickle", 'rb') as f:
data = pickle.load(f)
# np.random.shuffle(data)
CROHME_train_data = {"features": np.array([d["features"] for d in data]),
"labels": np.array([d["label"] for d in data])}
# function for pulling batches from custom data
def next_batch(num, data):
    idx = np.arange(0, len(data["features"]))
    np.random.shuffle(idx)
    idx = idx[:num]
    features_shuffle = [data["features"][i] for i in idx]
    labels_shuffle = [data["labels"][i] for i in idx]
    return np.asarray(features_shuffle), np.asarray(labels_shuffle)
# Function to create a weight variable from random values. Training will assign the real weights later
def weight_variable(shape, name):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial, name=name)
# Function to create a bias variable. A bias of 0.1 helps to prevent any one neuron from being chosen too often
def biases_variable(shape, name):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial, name=name)
# Function to create a 2-D convolution with stride 1 and SAME padding
def conv_2d(x, W, name):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME', name=name)
# Function to create a 2x2 max-pooling op, which halves the spatial dimensions
def max_pool(x, name):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name=name)
labels_number = 14
# A way to input images (as 784 element arrays of pixel values 0 - 1)
x_input = tf.placeholder(dtype=tf.float32, shape=[None, 784], name='x_input')
# A way to input labels to show model what the correct answer is during training
y_input = tf.placeholder(dtype=tf.float32, shape=[None, labels_number], name='y_input')
# First convolutional layer - reshape/resize images
# A weight variable that examines patches of 5x5 pixels and produces 32 feature maps
W_conv1 = weight_variable([5, 5, 1, 32], 'W_conv1')
# Bias variable to add to each of the 32 features
b_conv1 = biases_variable([32], 'b_conv1')
# Reshape each input image into a 28 x 28 x 1 pixel matrix
x_image = tf.reshape(x_input, [-1, 28, 28, 1], name='x_image')
# Flattens filter (W_conv1) to [5 * 5 * 1, 32], multiplies by [None, 28, 28, 1] to associate each 5x5 batch with the
# 32 features, and adds biases
h_conv1 = tf.nn.relu(conv_2d(x_image, W_conv1, name='conv1') + b_conv1, name='h_conv1')
# Takes windows of size 2x2 and computes a reduction on the output of h_conv1 (computes max, used for better prediction)
# Images are reduced to size 14 x 14 for analysis
h_pool1 = max_pool(h_conv1, name='h_pool1')
# Second convolutional layer, reshape/resize images
# Does mostly the same as above but converts each 32 unit output tensor from layer 1 to a 64 feature tensor
W_conv2 = weight_variable([5, 5, 32, 64], 'W_conv2')
b_conv2 = biases_variable([64], 'b_conv2')
h_conv2 = tf.nn.relu(conv_2d(h_pool1, W_conv2, name='conv2') + b_conv2, name='h_conv2')
# Images at this point are reduced to size 7 x 7 for analysis
h_pool2 = max_pool(h_conv2, name='h_pool2')
# First dense layer, performing calculation based on previous layer output
# Each image is 7 x 7 at the end of the previous section and outputs 64 features, we want 32 x 32 neurons = 1024
W_dense1 = weight_variable([7 * 7 * 64, 1024], name='W_dense1')
# bias variable added to each output feature
b_dense1 = biases_variable([1024], name='b_dense1')
# Flatten each of the images into size [None, 7 x 7 x 64]
h_pool_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64], name='h_pool_flat')
# Multiply weights by the outputs of the flatten neuron and add biases
h_dense1 = tf.nn.relu(tf.matmul(h_pool_flat, W_dense1, name='matmul_dense1') + b_dense1, name='h_dense1')
# Dropout layer prevents overfitting or recognizing patterns where none exist
# Depending on what value we enter into keep_prob, it will apply or not apply dropout layer
keep_prob = tf.placeholder(dtype=tf.float32, name='keep_prob')
# Dropout layer will be applied during training but not testing or predicting
h_drop1 = tf.nn.dropout(h_dense1, keep_prob, name='h_drop1')
# Readout layer used to format output
# Weight variable takes inputs from each of the 1024 neurons from before and outputs an array of 14 elements
W_readout1 = weight_variable([1024, labels_number], name='W_readout1')
# Apply bias to each of the 14 outputs
b_readout1 = biases_variable([labels_number], name='b_readout1')
# Perform final calculation by multiplying each of the neurons from dropout layer by weights and adding biases
y_readout1 = tf.add(tf.matmul(h_drop1, W_readout1, name='matmul_readout1'), b_readout1, name='y_readout1')
# Softmax cross entropy loss function compares expected answers (labels) vs actual answers (logits)
cross_entropy_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_input, logits=y_readout1))
# Adam optimizer aims to minimize loss
train_step = tf.train.AdamOptimizer(0.0001).minimize(cross_entropy_loss)
# Compare actual vs expected outputs to see if highest number is at the same index, true if they match and false if not
correct_prediction = tf.equal(tf.argmax(y_input, 1), tf.argmax(y_readout1, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Used to save the graph and weights
saver = tf.train.Saver()
# Run in with statement so session only exists within it
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Save the graph shape and node names to a pbtxt file
    tf.train.write_graph(sess.graph_def, '.', 'advanced_mnist.pbtxt', False)
    # Train the model, running through the data 1000 times in batches of 50
    # Print out the step # and accuracy every 100 steps and the final accuracy at the end of training
    # Train by running train_step and apply dropout by setting keep_prob to 0.5
    for i in range(1000):
        batch = next_batch(50, CROHME_train_data)
        if i % 100 == 0:
            train_accuracy = accuracy.eval(feed_dict={x_input: batch[0], y_input: batch[1], keep_prob: 1.0})
            print("step %d, training accuracy %g" % (i, train_accuracy))
        train_step.run(feed_dict={x_input: batch[0], y_input: batch[1], keep_prob: 0.5})
    print("test accuracy %g" % accuracy.eval(feed_dict={x_input: CROHME_test_data["features"],
                                                        y_input: CROHME_test_data["labels"], keep_prob: 1.0}))
    # Save the session with graph shape and node weights
    saver.save(sess, './advanced_mnist.ckpt')
    # Make a prediction for a random image
    img_num = np.random.randint(0, len(CROHME_test_data["features"]))
    print("expected :", CROHME_test_data["labels"][img_num])
    print("predicted :",
          sess.run(y_readout1, feed_dict={x_input: [CROHME_test_data["features"][img_num]], keep_prob: 1.0}))
    for j in range(28):
        print(CROHME_test_data["features"][img_num][j * 28:(j + 1) * 28])
    # this code can be used to display the image
    first_image = CROHME_test_data["features"][img_num]
    first_image = np.array(first_image, dtype='float')
    pixels = first_image.reshape((28, 28))
    plt.imshow(pixels, cmap='gray')
    plt.show()
Graph.py
import tensorflow as tf
from tensorflow.python.tools import freeze_graph, optimize_for_inference_lib
# Saving the graph as a pb file taking data from pbtxt and ckpt files and providing a few operations
freeze_graph.freeze_graph('advanced_mnist.pbtxt',       # input_graph
                          '',                           # input_saver
                          True,                         # input_binary
                          'advanced_mnist.ckpt',        # input_checkpoint
                          'y_readout1',                 # output_node_names
                          'save/restore_all',           # restore_op_name
                          'save/Const:0',               # filename_tensor_name
                          'frozen_advanced_mnist.pb',   # output_graph
                          True,                         # clear_devices
                          '')                           # initializer_nodes
# Read the data form the frozen graph pb file
input_graph_def = tf.GraphDef()
with tf.gfile.Open('frozen_advanced_mnist.pb', 'rb') as f:
    data = f.read()
    input_graph_def.ParseFromString(data)
# Optimize the graph with input and output nodes
output_graph_def = optimize_for_inference_lib.optimize_for_inference(
    input_graph_def,
    ['x_input', 'keep_prob'],
    ['y_readout1'],
    tf.float32.as_datatype_enum)
# Save the optimized graph to the optimized pb file ('wb': SerializeToString() returns binary data)
f = tf.gfile.FastGFile('optimized_advanced_mnist.pb', 'wb')
f.write(output_graph_def.SerializeToString())
MainActivity
package com.example.owl.advanced_mnist_android;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.support.v7.app.AppCompatActivity;
import android.os.Bundle;
import android.view.View;
import android.widget.ImageView;
import android.widget.TextView;
import org.tensorflow.contrib.android.TensorFlowInferenceInterface;
public class MainActivity extends AppCompatActivity {
ImageView imageView;
TextView resultsTextView;
static {
System.loadLibrary("tensorflow_inference");
}
private static final String MODEL_FILE = "file:///android_asset/optimized_advanced_mnist.pb";
private static final String INPUT_NODE = "x_input";
private static final int[] INPUT_SIZE = {1, 784};
private static final String KEEP_PROB = "keep_prob";
private static final int[] KEEP_PROB_SIZE = {1};
private static final String OUTPUT_NODE = "y_readout1";
private TensorFlowInferenceInterface inferenceInterface;
private int imageIndex = 10;
private int[] imageResourceIDs = {
R.drawable.digit0,
R.drawable.digit1,
R.drawable.digit2,
R.drawable.digit3,
R.drawable.digit4,
R.drawable.digit5,
R.drawable.digit6,
R.drawable.digit7,
R.drawable.digit8,
R.drawable.digit9,
R.drawable.rsz_przechwytywanie,
};
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
imageView = (ImageView) findViewById(R.id.image_view);
resultsTextView = (TextView) findViewById(R.id.results_text_view);
inferenceInterface = new TensorFlowInferenceInterface();
inferenceInterface.initializeTensorFlow(getAssets(), MODEL_FILE);
}
public void loadImageAction(View view) {
imageIndex = (imageIndex == 10) ? 0 : imageIndex + 1 ;
imageView.setImageResource(imageResourceIDs[imageIndex]);
}
public void predictDigitAction(View view) {
float[] pixelBuffer = convertImage();
float[] results = predictDigit(pixelBuffer);
formatResults(results);
}
private void formatResults(float[] results) {
float max = 0;
float secondMax = 0;
int maxIndex = 0;
int secondMaxIndex = 0;
for (int i = 0; i < results.length; i++) { // results has 14 entries; a bound of 15 would overrun it
if (results[i] > max) {
secondMax = max;
secondMaxIndex = maxIndex;
max = results[i];
maxIndex = i;
} else if (results[i] < max && results[i] > secondMax) {
secondMax = results[i];
secondMaxIndex = i;
}
}
String output = "Model predicts: " + String.valueOf(maxIndex) +
", second choice: " + String.valueOf(secondMaxIndex);
resultsTextView.setText(output);
}
private float[] predictDigit(float[] pixelBuffer) {
inferenceInterface.fillNodeFloat(INPUT_NODE, INPUT_SIZE, pixelBuffer);
inferenceInterface.fillNodeFloat(KEEP_PROB, KEEP_PROB_SIZE, new float[] {1.0f}); // no dropout at inference
inferenceInterface.runInference(new String[] {OUTPUT_NODE});
float[] outputs = new float[14];
inferenceInterface.readNodeFloat(OUTPUT_NODE, outputs);
return outputs;
}
private float[] convertImage() {
Bitmap bitmap = BitmapFactory.decodeResource(getResources(), imageResourceIDs[imageIndex]);
Bitmap scaledBitmap = Bitmap.createScaledBitmap(bitmap, 28, 28, true);
imageView.setImageBitmap(scaledBitmap);
int[] intArray = new int[784];
float[] floatArray = new float[784];
scaledBitmap.getPixels(intArray, 0, 28, 0, 0, 28, 28);
for (int i = 0; i < 784; i++) {
floatArray[i] = intArray[i] / -16777216f; // float division: 0xFF000000 (opaque black) -> 1.0
}
return floatArray;
}
}
Everything looks fine to me, but I get this error:
E/AndroidRuntime: FATAL EXCEPTION: main
Process: com.example.owl.advanced_mnist_android, PID: 6855
java.lang.IllegalStateException: Could not execute method for android:onClick
at android.support.v7.app.AppCompatViewInflater$DeclaredOnClickListener.onClick(AppCompatViewInflater.java:293)
at android.view.View.performClick(View.java:6891)
at android.widget.TextView.performClick(TextView.java:12651)
at android.view.View$PerformClick.run(View.java:26083)
at android.os.Handler.handleCallback(Handler.java:789)
at android.os.Handler.dispatchMessage(Handler.java:98)
at android.os.Looper.loop(Looper.java:164)
at android.app.ActivityThread.main(ActivityThread.java:6938)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.Zygote$MethodAndArgsCaller.run(Zygote.java:327)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1374)
Caused by: java.lang.reflect.InvocationTargetException
at java.lang.reflect.Method.invoke(Native Method)
at android.support.v7.app.AppCompatViewInflater$DeclaredOnClickListener.onClick(AppCompatViewInflater.java:288)
at android.view.View.performClick(View.java:6891)
at android.widget.TextView.performClick(TextView.java:12651)
at android.view.View$PerformClick.run(View.java:26083)
at android.os.Handler.handleCallback(Handler.java:789)
at android.os.Handler.dispatchMessage(Handler.java:98)
at android.os.Looper.loop(Looper.java:164)
at android.app.ActivityThread.main(ActivityThread.java:6938)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.Zygote$MethodAndArgsCaller.run(Zygote.java:327)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1374)
Caused by: java.lang.IllegalArgumentException: No Operation named [x_input] in the Graph
at org.tensorflow.Session$Runner.operationByName(Session.java:297)
at org.tensorflow.Session$Runner.feed(Session.java:115)
at org.tensorflow.contrib.android.TensorFlowInferenceInterface.addFeed(TensorFlowInferenceInterface.java:437)
at org.tensorflow.contrib.android.TensorFlowInferenceInterface.fillNodeFloat(TensorFlowInferenceInterface.java:186)
at com.example.owl.advanced_mnist_android.MainActivity.predictDigit(MainActivity.java:89)
at com.example.owl.advanced_mnist_android.MainActivity.predictDigitAction(MainActivity.java:63)
at java.lang.reflect.Method.invoke(Native Method)
at android.support.v7.app.AppCompatViewInflater$DeclaredOnClickListener.onClick(AppCompatViewInflater.java:288)
at android.view.View.performClick(View.java:6891)
at android.widget.TextView.performClick(TextView.java:12651)
at android.view.View$PerformClick.run(View.java:26083)
at android.os.Handler.handleCallback(Handler.java:789)
at android.os.Handler.dispatchMessage(Handler.java:98)
at android.os.Looper.loop(Looper.java:164)
at android.app.ActivityThread.main(ActivityThread.java:6938)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.Zygote$MethodAndArgsCaller.run(Zygote.java:327)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1374)
I have developed a TensorFlow model in Python on Linux, based on the tutorial at http://cv-tricks.com/tensorflow-tutorial/training-convolutional-neural-network-for-image-classification/. I trained and saved the model using tf.train.Saver, and I can deploy it and run predictions successfully in the Linux environment. Now I need to load this saved model in Java on Windows. Through extensive research online I have read that this does not work with tf.train.Saver, and that I have to change my code to use "Serving" to be able to load a saved TF model in Java. I therefore followed the tutorial at https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/mnist_saved_model.py and changed my code. However, I now get an error with tf.FixedLenFeature asking me to use FixedLenSequenceFeature. Here is the complete error message:
"ValueError: First dimension of shape for feature x unknown. Consider using FixedLenSequenceFeature."
which is happening here:
feature_configs = {'x': tf.FixedLenFeature(shape=[None, img_size,img_size,num_channels], dtype=tf.float32),}
I am not sure this is the right path to take, since I have batches of images of size [batchsize*128*128*3] and should not need a sequence feature! It would be great if someone could clear this up for me and answer these questions:
1- Do I have to change my code from tf.train.Saver to "serving" to be able to load the saved model and deploy it in Java?
2- If the answer to the above is yes, how can I feed the data correctly and resolve the error above? (I've sketched what I suspect parse_example expects right after this list.)
3- Is there any example of how to DEPLOY a model that was saved using "serving"? (I've also sketched what I'm hoping for on the Java side after the training code below.)
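Here is the per-example feature spec I suspect tf.parse_example expects (my assumption being that parse_example adds the batch dimension itself, so the leading None should be dropped):

# Per-example shape, without the leading None; tf.parse_example prepends the batch dimension.
feature_configs = {'x': tf.FixedLenFeature(shape=[img_size, img_size, num_channels],
                                           dtype=tf.float32)}
tf_example = tf.parse_example(serialized_tf_example, feature_configs)
x = tf.identity(tf_example['x'], name='x')  # -> [batch, img_size, img_size, num_channels]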
Here is my training code that throws the error:
import dataset
import tensorflow as tf
import time
from datetime import timedelta
import math
import random
import numpy as np
import os
#Adding Seed so that random initialization is consistent
from numpy.random import seed
seed(1)
from tensorflow import set_random_seed
set_random_seed(2)
batch_size = 32
#Prepare input data
classes = ['class1','class2','class3']
num_classes = len(classes)
# 20% of the data will automatically be used for validation
validation_size = 0.2
img_size = 128
num_channels = 3
train_path='/home/user1/Downloads/Expression/Augmented/Data/Train'
# We shall load all the training and validation images and labels into memory using openCV and use that during training
data = dataset.read_train_sets(train_path, img_size, classes, validation_size=validation_size)
print("Complete reading input data. Will Now print a snippet of it")
print("Number of files in Training-set:\t\t{}".format(len(data.train.labels)))
print("Number of files in Validation-set:\t{}".format(len(data.valid.labels)))
session = tf.Session()
serialized_tf_example = tf.placeholder(tf.string, name='tf_example')
feature_configs = {'x': tf.FixedLenFeature(shape=[None, img_size,img_size,num_channels], dtype=tf.float32),}
tf_example = tf.parse_example(serialized_tf_example, feature_configs)
x = tf.identity(tf_example['x'], name='x') # use tf.identity() to assign name
# x = tf.placeholder(tf.float32, shape=[None, img_size,img_size,num_channels], name='x')
## labels
y_true = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_true')
y_true_cls = tf.argmax(y_true, dimension=1)
##Network graph params
filter_size_conv1 = 3
num_filters_conv1 = 32
filter_size_conv2 = 3
num_filters_conv2 = 32
filter_size_conv3 = 3
num_filters_conv3 = 64
fc_layer_size = 128
def create_weights(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.05))
def create_biases(size):
    return tf.Variable(tf.constant(0.05, shape=[size]))
def create_convolutional_layer(input,
                               num_input_channels,
                               conv_filter_size,
                               num_filters):
    ## We shall define the weights that will be trained using the create_weights function.
    weights = create_weights(shape=[conv_filter_size, conv_filter_size, num_input_channels, num_filters])
    ## We create biases using the create_biases function. These are also trained.
    biases = create_biases(num_filters)
    ## Creating the convolutional layer
    layer = tf.nn.conv2d(input=input,
                         filter=weights,
                         strides=[1, 1, 1, 1],
                         padding='SAME')
    layer += biases
    ## We shall be using max-pooling.
    layer = tf.nn.max_pool(value=layer,
                           ksize=[1, 2, 2, 1],
                           strides=[1, 2, 2, 1],
                           padding='SAME')
    ## The output of pooling is fed to ReLU, which is our activation function.
    layer = tf.nn.relu(layer)
    return layer
def create_flatten_layer(layer):
    # We know that the shape of the layer will be [batch_size img_size img_size num_channels]
    # But let's get it from the previous layer.
    layer_shape = layer.get_shape()
    ## The number of features will be img_height * img_width * num_channels; we calculate it instead of hard-coding it.
    num_features = layer_shape[1:4].num_elements()
    ## Now we flatten the layer, so we reshape to num_features
    layer = tf.reshape(layer, [-1, num_features])
    return layer
def create_fc_layer(input,
                    num_inputs,
                    num_outputs,
                    use_relu=True):
    # Let's define trainable weights and biases.
    weights = create_weights(shape=[num_inputs, num_outputs])
    biases = create_biases(num_outputs)
    # A fully connected layer takes input x and produces wx+b. Since these are matrices, we use TensorFlow's matmul
    layer = tf.matmul(input, weights) + biases
    if use_relu:
        layer = tf.nn.relu(layer)
    return layer
layer_conv1 = create_convolutional_layer(input=x,
                                         num_input_channels=num_channels,
                                         conv_filter_size=filter_size_conv1,
                                         num_filters=num_filters_conv1)
layer_conv2 = create_convolutional_layer(input=layer_conv1,
                                         num_input_channels=num_filters_conv1,
                                         conv_filter_size=filter_size_conv2,
                                         num_filters=num_filters_conv2)
layer_conv3 = create_convolutional_layer(input=layer_conv2,
                                         num_input_channels=num_filters_conv2,
                                         conv_filter_size=filter_size_conv3,
                                         num_filters=num_filters_conv3)
layer_flat = create_flatten_layer(layer_conv3)
layer_fc1 = create_fc_layer(input=layer_flat,
                            num_inputs=layer_flat.get_shape()[1:4].num_elements(),
                            num_outputs=fc_layer_size,
                            use_relu=True)
layer_fc2 = create_fc_layer(input=layer_fc1,
                            num_inputs=fc_layer_size,
                            num_outputs=num_classes,
                            use_relu=False)
y_pred = tf.nn.softmax(layer_fc2, name='y_pred')
y_pred_cls = tf.argmax(y_pred, dimension=1)
values, indices = tf.nn.top_k(y_pred, 3)
table = tf.contrib.lookup.index_to_string_table_from_tensor(
    tf.constant([str(i) for i in xrange(3)]))
prediction_classes = table.lookup(tf.to_int64(indices))
session.run(tf.global_variables_initializer())
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,
                                                        labels=y_true)
cost = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(cost)
correct_prediction = tf.equal(y_pred_cls, y_true_cls)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
session.run(tf.global_variables_initializer())
def show_progress(epoch, feed_dict_train, feed_dict_validate, val_loss):
    acc = session.run(accuracy, feed_dict=feed_dict_train)
    val_acc = session.run(accuracy, feed_dict=feed_dict_validate)
    msg = "Training Epoch {0} --- Training Accuracy: {1:>6.1%}, Validation Accuracy: {2:>6.1%}, Validation Loss: {3:.3f}"
    print(msg.format(epoch + 1, acc, val_acc, val_loss))
total_iterations = 0
# saver = tf.train.Saver()
def train(num_iteration):
    global total_iterations
    for i in range(total_iterations,
                   total_iterations + num_iteration):
        x_batch, y_true_batch, _, cls_batch = data.train.next_batch(batch_size)
        x_valid_batch, y_valid_batch, _, valid_cls_batch = data.valid.next_batch(batch_size)
        feed_dict_tr = {x: x_batch,
                        y_true: y_true_batch}
        feed_dict_val = {x: x_valid_batch,
                         y_true: y_valid_batch}
        session.run(optimizer, feed_dict=feed_dict_tr)
        if i % int(data.train.num_examples/batch_size) == 0:
            print(i)
            val_loss = session.run(cost, feed_dict=feed_dict_val)
            epoch = int(i / int(data.train.num_examples/batch_size))
            show_progress(epoch, feed_dict_tr, feed_dict_val, val_loss)
            print("Saving the model Now!")
            # saver.save(session, save_path_full, global_step=i)
    total_iterations += num_iteration
train(num_iteration=10000)#3000
# Export model
# WARNING(break-tutorial-inline-code): The following code snippet is
# in-lined in tutorials, please update tutorial documents accordingly
# whenever code changes.
export_path_base = './SavedModel/'
export_path = os.path.join(
    tf.compat.as_bytes(export_path_base),
    tf.compat.as_bytes(str(1)))
print 'Exporting trained model to', export_path
builder = tf.saved_model.builder.SavedModelBuilder(export_path)
# Build the signature_def_map.
classification_inputs = tf.saved_model.utils.build_tensor_info(
    serialized_tf_example)
classification_outputs_classes = tf.saved_model.utils.build_tensor_info(
    prediction_classes)
classification_outputs_scores = tf.saved_model.utils.build_tensor_info(values)
classification_signature = (
    tf.saved_model.signature_def_utils.build_signature_def(
        inputs={
            tf.saved_model.signature_constants.CLASSIFY_INPUTS:
                classification_inputs
        },
        outputs={
            tf.saved_model.signature_constants.CLASSIFY_OUTPUT_CLASSES:
                classification_outputs_classes,
            tf.saved_model.signature_constants.CLASSIFY_OUTPUT_SCORES:
                classification_outputs_scores
        },
        method_name=tf.saved_model.signature_constants.CLASSIFY_METHOD_NAME))
tensor_info_x = tf.saved_model.utils.build_tensor_info(x)
tensor_info_y = tf.saved_model.utils.build_tensor_info(y_pred)
prediction_signature = (
    tf.saved_model.signature_def_utils.build_signature_def(
        inputs={'images': tensor_info_x},
        outputs={'scores': tensor_info_y},
        method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))
legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
builder.add_meta_graph_and_variables(
    session, [tf.saved_model.tag_constants.SERVING],  # 'sess' was undefined; the session above is named 'session'
    signature_def_map={
        'predict_images':
            prediction_signature,
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
            classification_signature,
    },
    legacy_init_op=legacy_init_op)
builder.save()
print 'Done exporting!'
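And for question 3, here is a sketch of what I'm hoping for on the Java side. This is an assumption on my part: it uses the org.tensorflow Java API (libtensorflow plus its JNI library on the classpath), and the node names 'x' and 'y_pred' are the ones from my export above.

import org.tensorflow.SavedModelBundle;
import org.tensorflow.Tensor;

public class Predict {
    public static void main(String[] args) {
        // load the SavedModel directory written by builder.save() above
        try (SavedModelBundle model = SavedModelBundle.load("./SavedModel/1", "serve")) {
            float[][][][] image = new float[1][128][128][3]; // one dummy 128x128x3 image
            try (Tensor input = Tensor.create(image)) {
                Tensor output = model.session().runner()
                        .feed("x", input)   // feed the input tensor by node name
                        .fetch("y_pred")    // fetch the softmax output
                        .run().get(0);
                float[][] scores = new float[1][3]; // 3 classes
                output.copyTo(scores);
                System.out.println(java.util.Arrays.toString(scores[0]));
                output.close();
            }
        }
    }
}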