I have training and test files, both named "labeled.arff". I build a classifier from them and write it to a "modelFile.model" file.
I also have an "unlabeled.arff" file in which the last attribute of every row is "?".
How can I make predictions for it in Java or C#?
I have some code, but it is not right: it always gives me the same prediction.
Thank you.
// Write to Model
public static void Classify()
{
Instances train = new Instances(new java.io.FileReader(dirTrain + "labeled.arff"));
Instances test = new Instances(new java.io.FileReader(dirTest + "labeled.arff"));
train.setClassIndex(train.numAttributes() - 1);
test.setClassIndex(test.numAttributes() - 1);
// train Classifier
Classifier cl = new J48();
// Randomize the order of the instances in the dataset
weka.filters.Filter myRandom = new weka.filters.unsupervised.instance.Randomize();
myRandom.setInputFormat(train);
train = weka.filters.Filter.useFilter(train, myRandom);
// Build the classifier
cl.buildClassifier(train);
// evaluate classifier and print some statistics
Evaluation eval = new Evaluation(train);
eval.evaluateModel(cl, test);
Console.WriteLine(eval.toSummaryString("\nResults Decision Tree\n======\n", false));
SerializationHelper.write(dirModel + "modelFile.model", cl);
}
// Make predictions
public void Predictions()
{
Classifier cl = (Classifier)SerializationHelper.read(dirModel + "modelFile.model");
// load unlabeled data
Instances unlabeled = new Instances(new java.io.FileReader(pathFeatures + "unlabeled.arff"));
// set class attribute
unlabeled.setClassIndex(unlabeled.numAttributes() - 1);
// create copy
Instances labeled = new Instances(unlabeled);
// label instances
for (int i = 0; i < unlabeled.numInstances(); i++)
{
double clsLabel = cl.classifyInstance(unlabeled.instance(i));
labeled.instance(i).setClassValue(clsLabel);
}
int numCorrect = 0;
for (int i = 0; i < unlabeled.numInstances(); i++)
{
double pred = cl.classifyInstance(unlabeled.instance(i));
// value(i) used the row index as an attribute index; this assumes the ID is the first attribute
Console.Write("ID: " + unlabeled.instance(i).value(0));
//Console.Write(", actual: " + unlabeled.classAttribute().value((int)unlabeled.instance(i).classValue()));
Console.WriteLine(", predicted: " + unlabeled.classAttribute().value((int)pred));
}
Console.WriteLine("Correct predictions: " + numCorrect);
}
I'm trying to train a model with Deeplearning4j in Java. When I start training on the training data, it gives this error:
Invalid classification data: expect label value (at label index column = 0) to be in range 0 to 1 inclusive (0 to numClasses-1, with numClasses=2); got label value of 2
I don't understand the error, since I am a beginner with Deeplearning4j. My data set describes relationships between pairs of people: the class label is 1 if there is a relationship between the two people, and 0 otherwise.
The Java code:
public class SNA {
private static Logger log = LoggerFactory.getLogger(SNA.class);
public static void main(String[] args) throws Exception {
int seed = 123;
double learningRate = 0.01;
int batchSize = 50;
int nEpochs = 30;
int numInputs = 2;
int numOutputs = 2;
int numHiddenNodes = 20;
//load the training data
RecordReader rr = new CSVRecordReader(0,",");
rr.initialize(new FileSplit(new File("C:\\Users\\GTS\\Desktop\\SNA project\\experiments\\First experiment\\train\\slashdotTrain.csv")));
DataSetIterator trainIter = new RecordReaderDataSetIterator(rr, batchSize,0, 2);
// load test data
RecordReader rrTest = new CSVRecordReader();
// initialize the test reader (rrTest), not the training reader (rr)
rrTest.initialize(new FileSplit(new File("C:\\Users\\GTS\\Desktop\\SNA project\\experiments\\First experiment\\test\\slashdotTest.csv")));
DataSetIterator testIter = new RecordReaderDataSetIterator(rrTest, batchSize,0, 2);
log.info("**** Building Model ****");
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
.seed(seed)
.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
.iterations(1)
.learningRate(learningRate)
.updater(Updater.NESTEROVS).momentum(0.9)
.list()
.layer(0, new DenseLayer.Builder()
.nIn(numInputs)
.nOut(numHiddenNodes)
.activation("relu")
.weightInit(WeightInit.XAVIER)
.build())
.layer(1, new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
.activation("softmax")
.weightInit(WeightInit.XAVIER)
.nIn(numHiddenNodes)
.nOut(numOutputs)
.build())
.pretrain(false).backprop(true)
.build();
MultiLayerNetwork model = new MultiLayerNetwork(conf);
model.init();
// Listener to show how the network is training in the log
model.setListeners(new ScoreIterationListener(10));
log.info(" **** Train Model **** ");
for (int i = 0; i < nEpochs; i++) {
model.fit(trainIter);
}
System.out.println("**** Evaluate Model ****");
Evaluation evaluation = new Evaluation(numOutputs);
while (testIter.hasNext()) {
DataSet t = testIter.next();
INDArray feature = t.getFeatureMatrix();
INDArray labels = t.getLabels();
INDArray predicted = model.output(feature, false);
evaluation.eval(labels, predicted);
}
System.out.println(evaluation.stats());
}
}
Any help, please?
Thanks a lot.
Problem solved:
Change the third parameter (the label index) of RecordReaderDataSetIterator from 0 to 2 in lines such as
DataSetIterator testIter = new RecordReaderDataSetIterator(rrTest, batchSize,0, 2);
The data set has three columns and the class label is in the third column, so its index is 2.
Solution:
DataSetIterator trainIter = new RecordReaderDataSetIterator(rr, batchSize,2, 2);
DataSetIterator testIter = new RecordReaderDataSetIterator(rrTest, batchSize,2, 2);
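To see why the label index matters, here is a minimal plain-Java sketch (no Deeplearning4j involved) of the kind of range check the error message describes. The rows are made up, and the column layout personA,personB,label is only an assumption about the CSV file; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class LabelIndexCheck {
    // Returns true if every value in column labelIndex is an integer
    // in the range 0 .. numClasses - 1.
    public static boolean labelsInRange(List<String> rows, int labelIndex, int numClasses) {
        for (String row : rows) {
            String[] cols = row.split(",");
            int label = Integer.parseInt(cols[labelIndex].trim());
            if (label < 0 || label >= numClasses) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up rows in the assumed layout: personA,personB,label
        List<String> rows = Arrays.asList("0,1,1", "2,5,0", "3,4,1");
        // Column 0 holds person ids, so values such as 2 or 3 are not valid labels.
        System.out.println("label index 0 valid: " + labelsInRange(rows, 0, 2));
        // Column 2 holds only 0 or 1, which fits numClasses = 2.
        System.out.println("label index 2 valid: " + labelsInRange(rows, 2, 2));
    }
}
```

With label index 0 the check fails as soon as a person id of 2 or more appears, which matches the "got label value of 2" message in the question.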
refrences:
enter link description here
I am trying to get a learning curve for an automated Weka experiment. I currently have the following Java code.
public static void EvaluateModel(AbstractClassifier cl, String datapath, String outfile) throws Exception {
Experiment exp = new Experiment();
ClassifierSplitEvaluator se = new ClassifierSplitEvaluator();
se.setClassifier(cl);
Classifier sec = ((ClassifierSplitEvaluator) se).getClassifier();
CrossValidationResultProducer cvrp = new CrossValidationResultProducer();
cvrp.setNumFolds(10);
cvrp.setSplitEvaluator(se);
PropertyNode[] propertyPath = new PropertyNode[2];
try {
propertyPath[0] = new PropertyNode(
se,
new PropertyDescriptor("splitEvaluator",
CrossValidationResultProducer.class),
CrossValidationResultProducer.class);
propertyPath[1] = new PropertyNode(sec,
new PropertyDescriptor("classifier", se.getClass()),
se.getClass());
} catch (IntrospectionException e) {
e.printStackTrace();
}
exp.setResultProducer(cvrp);
exp.setPropertyPath(propertyPath);
exp.setPropertyArray(new Classifier[]{cl});
DefaultListModel model = new DefaultListModel();
model.addElement(new File(datapath));
exp.setDatasets(model);
InstancesResultListener irl = new InstancesResultListener();
irl.setOutputFile(new File(outfile));
exp.setResultListener(irl);
System.out.println("Initializing...");
exp.initialize();
System.out.println("Running...");
exp.runExperiment();
System.out.println("Finishing...");
exp.postProcess();
System.out.println("Evaluating...");
PairedTTester tester = new PairedCorrectedTTester();
FileReader reader = new FileReader(irl.getOutputFile());
Instances result = new Instances(reader);
tester.setInstances(result);
tester.setSortColumn(-1);
tester.setRunColumn(result.attribute("Key_Run").index());
tester.setFoldColumn(result.attribute("Key_Fold").index());
tester.setDatasetKeyColumns(
new Range(
""
+ (result.attribute("Key_Dataset").index() + 1)));
tester.setResultsetKeyColumns(
new Range(
""
+ (result.attribute("Key_Scheme").index() + 1)
+ ","
+ (result.attribute("Key_Scheme_options").index() + 1)
+ ","
+ (result.attribute("Key_Scheme_version_ID").index() + 1)));
tester.setResultMatrix(new ResultMatrixPlainText());
tester.setDisplayedResultsets(null);
tester.setSignificanceLevel(0.05);
tester.setShowStdDevs(true);
// fill result matrix (but discarding the output)
tester.multiResultsetFull(0, result.attribute("Percent_correct").index());
// output results for each dataset
System.out.println("\nResult:");
ResultMatrix matrix = tester.getResultMatrix();
for (int i = 0; i < matrix.getColCount(); i++) {
System.out.println(matrix.getColName(i));
System.out.println(" Perc. correct: " + matrix.getMean(i, 0));
System.out.println(" StdDev: " + matrix.getStdDev(i, 0));
}
}
What I would like to do is either save or display the learning curve in this method. I cannot find any information on how to do this programmatically.
For the code below, the classifyInstance() line gives an error:
Exception in thread "main" java.lang.NullPointerException
at weka.classifiers.functions.LinearRegression.classifyInstance(LinearRegression.java:272)
at LR.main(LR.java:45)
I tried to debug it, but without success. How can I use my saved model to predict the class attribute of my test file? This is a regression problem.
for (int i = 0; i < unlabeled.numInstances(); i++) {
    double clsLabel = cls.classifyInstance(unlabeled.instance(i));
    labeled.instance(i).setClassValue(clsLabel);
    System.out.println(clsLabel + " -> " + unlabeled.classAttribute().value((int) clsLabel));
}
This is the actual code:
public class LR{
public static void main(String[] args) throws Exception
{
BufferedReader datafile = new BufferedReader(new FileReader("C:\\dataset.arff"));
Instances data = new Instances(datafile);
data.setClassIndex(data.numAttributes()-1); //setting class attribute
datafile.close();
LinearRegression lr = new LinearRegression(); //build model
int folds=10;
Evaluation eval = new Evaluation(data);
eval.crossValidateModel(lr, data, folds, new Random(1));
System.out.println(eval.toSummaryString());
//save the model
weka.core.SerializationHelper.write("C:\\lr.model", lr);
//load the model
Classifier cls = (Classifier)weka.core.SerializationHelper.read("C:\\lr.model");
Instances unlabeled = new Instances(new BufferedReader(new FileReader("C:\\testfile.arff")));
// set class attribute
unlabeled.setClassIndex(unlabeled.numAttributes() - 1);
// create copy
Instances labeled = new Instances(unlabeled);
double clsLabel;
// label instances
for (int i = 0; i < unlabeled.numInstances(); i++)
{
clsLabel = cls.classifyInstance(unlabeled.instance(i));
labeled.instance(i).setClassValue(clsLabel);
System.out.println(clsLabel + " -> " + unlabeled.classAttribute().value((int) clsLabel));
}
// save labeled data
BufferedWriter writer = new BufferedWriter(new FileWriter("C:\\final.arff"));
writer.write(labeled.toString());
writer.newLine();
writer.flush();
writer.close();
}
}
Did you train your classifier?
It looks to me like you are trying to classify without ever having trained your classifier.
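In the question's code, crossValidateModel evaluates copies of the classifier, so lr itself is never built before it is written to disk. Calling lr.buildClassifier(data) before SerializationHelper.write is the likely fix. The sketch below shows the same effect with a toy stand-in and plain Java serialization. ToyModel and roundTrip are hypothetical names, not Weka classes: a model saved before fitting comes back unfitted and fails on prediction.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class UntrainedModelDemo {
    // Hypothetical toy model (not a Weka class): predicts y = slope * x.
    public static class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        private double[] coefficients; // stays null until fit() is called

        public void fit(double[] x, double[] y) {
            double sxy = 0.0;
            double sxx = 0.0;
            for (int i = 0; i < x.length; i++) {
                sxy += x[i] * y[i];
                sxx += x[i] * x[i];
            }
            coefficients = new double[] { sxy / sxx };
        }

        public double predict(double x) {
            return coefficients[0] * x; // NullPointerException if never fitted
        }
    }

    // Serializes the model to bytes and reads it back, like a save/load round trip.
    public static ToyModel roundTrip(ToyModel model) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(model);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ToyModel) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Saved before training: loading it back does not make it trained.
        ToyModel untrained = roundTrip(new ToyModel());
        try {
            untrained.predict(3.0);
        } catch (NullPointerException e) {
            System.out.println("untrained model: NullPointerException");
        }

        // Trained first, then saved: predictions work after loading.
        ToyModel trained = new ToyModel();
        trained.fit(new double[] {1, 2, 3}, new double[] {2, 4, 6});
        System.out.println("trained model: " + roundTrip(trained).predict(3.0));
    }
}
```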
I am using the LingPipe tool for the naive Bayes algorithm. I trained it on my training data and it successfully tests my test data. But every time I run the program, it trains again. I don't want to train it each time; instead I want to build a model once and then apply the test data to it.
public class ClassifyNews {
private static File TRAINING_DIR= new File("train");
private static File TESTING_DIR= new File("test");
private static String[] CATEGORIES
= { "c1",
"c2",
"c3"};
private static int NGRAM_SIZE = 6;
public static void main(String[] args)throws ClassNotFoundException, IOException
{
DynamicLMClassifier<NGramProcessLM> classifier
=DynamicLMClassifier.createNGramProcess(CATEGORIES,NGRAM_SIZE);
for(int i=0; i<CATEGORIES.length; ++i)
{
File classDir = new File(TRAINING_DIR,CATEGORIES[i]);
if (!classDir.isDirectory())
{
String msg = "Could not find training directory="+ classDir
+ "\nTraining directory not found";
System.out.println(msg);
throw new IllegalArgumentException(msg);
}
String[] trainingFiles = classDir.list();
for (int j = 0; j < trainingFiles.length; ++j)
{
File file = new File(classDir,trainingFiles[j]);
String text = Files.readFromFile(file,"ISO-8859-1");
System.out.println("Training on " + CATEGORIES[i] + "/" + trainingFiles[j]);
Classification classification= new Classification(CATEGORIES[i]);
Classified<CharSequence> classified= new Classified<CharSequence>(text,classification);
classifier.handle(classified);}
}
System.out.println("Compiling");
JointClassifier<CharSequence> compiledClassifier
= (JointClassifier<CharSequence>)
AbstractExternalizable.compile(classifier);
boolean storeCategories = true;
JointClassifierEvaluator<CharSequence> evaluator =
new JointClassifierEvaluator
<CharSequence> (compiledClassifier,CATEGORIES,storeCategories);
for(int i = 0; i < CATEGORIES.length; ++i)
{
File classDir = new File(TESTING_DIR,CATEGORIES[i]);
String[] testingFiles = classDir.list();
for (int j=0; j<testingFiles.length; ++j)
{
String text= Files.readFromFile(new File(classDir,testingFiles[j]),"ISO-8859-1");
System.out.print("\nTesting on " + CATEGORIES[i] + "/" + testingFiles[j] + " ");
Classification classification= new Classification(CATEGORIES[i]);
Classified<CharSequence> classified= new Classified<CharSequence>(text,classification);
evaluator.handle(classified);
JointClassification jc =compiledClassifier.classify(text);
String bestCategory = jc.bestCategory();
String details = jc.toString();
System.out.println("\tGot best category of: " + bestCategory);
System.out.println(jc.toString());
}}
}
}
I'm reading several images that have exactly the same name apart from a number from 1 to 6, so I've used an array to read in the images, for example AstroWalkLeft1 and AstroWalkLeft2 into arimgAstroWalkLeft[]. This is what I have:
public void GetImages() {
    imgMonster = new ImageIcon("Assets\\MonsterSingle.png").getImage();
    for (int i = 1; i <= nASTROIMGMAX; i++) {
        arimgAstroWalkLeft[i] = new ImageIcon("Assets\\AstroWalkLeft" + i + ".png").getImage();
        arimgAstroWalkRight[i] = new ImageIcon("Assets\\AstroWalkRight" + i + ".png").getImage();
    }
    imgAstroStandLeft = new ImageIcon("Assets\\AstroStandLeft.png").getImage();
    imgAstroStandRight = new ImageIcon("Assets\\AstroStandRight.png").getImage();
    imgBackground1 = new ImageIcon("Assets\\Hallway.png").getImage();
    imgBackground2 = new ImageIcon("Assets\\Observation Room.png").getImage();
}
My problem is replacing the number in the image's name with the variable in my loop. I'm wondering how to put that variable where the number once was.
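String concatenation with + i + already puts the loop variable where the number was. One thing to watch is that Java arrays are zero-based: if the arrays have length nASTROIMGMAX (an assumption, since their declaration is not shown), a loop that writes to indices 1 through nASTROIMGMAX runs past the end. Here is a minimal sketch of building the numbered names while storing them from index 0. FrameNames and frameNames are hypothetical names used only for illustration.

```java
class FrameNames {
    // Builds "<prefix>1.png" through "<prefix><count>.png".
    // Frame number k is stored at array index k - 1.
    public static String[] frameNames(String prefix, int count) {
        String[] names = new String[count];
        for (int i = 0; i < count; i++) {
            names[i] = prefix + (i + 1) + ".png";
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : frameNames("Assets\\AstroWalkLeft", 6)) {
            System.out.println(name);
        }
    }
}
```

Each name can then be passed to new ImageIcon(name).getImage() and stored at the same index in the image array.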