For the code below, the classifyInstance() line gives an error:
Exception in thread "main" java.lang.NullPointerException
at weka.classifiers.functions.LinearRegression.classifyInstance(LinearRegression.java:272)
at LR.main(LR.java:45)
I tried debugging it without success. How can I use my saved model to predict the class attribute of my test file? This is a regression problem.
for (int i = 0; i < unlabeled.numInstances(); i++) {
double clsLabel = cls.classifyInstance(unlabeled.instance(i));
labeled.instance(i).setClassValue(clsLabel);
System.out.println(clsLabel + " -> " + unlabeled.classAttribute().value((int) clsLabel));
}
This is the actual code:
public class LR{
public static void main(String[] args) throws Exception
{
BufferedReader datafile = new BufferedReader(new FileReader("C:\\dataset.arff"));
Instances data = new Instances(datafile);
data.setClassIndex(data.numAttributes()-1); //setting class attribute
datafile.close();
LinearRegression lr = new LinearRegression(); //build model
int folds=10;
Evaluation eval = new Evaluation(data);
eval.crossValidateModel(lr, data, folds, new Random(1));
System.out.println(eval.toSummaryString());
//save the model
weka.core.SerializationHelper.write("C:\\lr.model", lr);
//load the model
Classifier cls = (Classifier)weka.core.SerializationHelper.read("C:\\lr.model");
Instances unlabeled = new Instances(new BufferedReader(new FileReader("C:\\testfile.arff")));
// set class attribute
unlabeled.setClassIndex(unlabeled.numAttributes() - 1);
// create copy
Instances labeled = new Instances(unlabeled);
double clsLabel;
// label instances
for (int i = 0; i < unlabeled.numInstances(); i++)
{
clsLabel = cls.classifyInstance(unlabeled.instance(i));
labeled.instance(i).setClassValue(clsLabel);
System.out.println(clsLabel + " -> " + unlabeled.classAttribute().value((int) clsLabel));
}
// save labeled data
BufferedWriter writer = new BufferedWriter(new FileWriter("C:\\final.arff"));
writer.write(labeled.toString());
writer.newLine();
writer.flush();
writer.close();
}
}
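Did you train your classifier?
It looks like you are trying to classify without having trained it. crossValidateModel() builds and evaluates internal copies of the classifier, so lr itself is still untrained when it is written to lr.model. Call lr.buildClassifier(data) before saving the model.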
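The failure mode can be reproduced without Weka. The sketch below uses a hypothetical TinyModel class (a stand-in written for illustration, not Weka's LinearRegression) whose coefficients stay null until train() is called. Serializing and reloading an untrained instance preserves that null state, so prediction fails with a NullPointerException, while a trained instance survives the round trip and predicts normally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a trainable model; NOT part of Weka.
class TinyModel implements Serializable {
    private static final long serialVersionUID = 1L;
    private double[] coefficients; // {slope, intercept}; null until train() runs

    // Fit y = slope * x + intercept by ordinary least squares on one feature.
    void train(double[] x, double[] y) {
        double meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) { meanX += x[i]; meanY += y[i]; }
        meanX /= x.length;
        meanY /= y.length;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        coefficients = new double[] { slope, meanY - slope * meanX };
    }

    // Throws NullPointerException if the model was never trained.
    double predict(double x) {
        return coefficients[0] * x + coefficients[1];
    }

    // Serialize to memory and read back, mimicking save-then-load of a model file.
    static TinyModel roundTrip(TinyModel model) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(model);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TinyModel) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TinyModel trained = new TinyModel();
        trained.train(new double[] {1, 2, 3}, new double[] {3, 5, 7});
        System.out.println(roundTrip(trained).predict(4.0));
        try {
            roundTrip(new TinyModel()).predict(4.0); // never trained
        } catch (NullPointerException e) {
            System.out.println("untrained model cannot predict");
        }
    }
}
```

The order is what matters: train first, then save, then load and predict.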
Related
I have training and test files, both named "labeled.arff". I build a classifier from them and write it to a "modelFile.model" file.
I also have an "unlabeled.arff" file in which the last attribute of each row is "?".
How can I make predictions in Java or C#?
I have some code, but it is not right: it always gives me the same prediction.
Thank you
// Write to Model
public static void Classify()
{
Instances train = new Instances(new java.io.FileReader(dirTrain + "labeled.arff"));
Instances test = new Instances(new java.io.FileReader(dirTest + "labeled.arff"));
train.setClassIndex(train.numAttributes() - 1);
test.setClassIndex(test.numAttributes() - 1);
// train Classifier
Classifier cl = new J48();
// Randomize the order of the instances in the dataset
weka.filters.Filter myRandom = new weka.filters.unsupervised.instance.Randomize();
myRandom.setInputFormat(train);
train = weka.filters.Filter.useFilter(train, myRandom);
// Build the classifier
cl.buildClassifier(train);
// evaluate classifier and print some statistics
Evaluation eval = new Evaluation(train);
eval.evaluateModel(cl, test);
Console.WriteLine(eval.toSummaryString("\nResults Decision Tree\n======\n", false));
SerializationHelper.write(dirModel + "modelFile.model", cl);
}
// Make predictions
public void Predictions()
{
Classifier cl = (Classifier)SerializationHelper.read(dirModel + "modelFile.model");
// load unlabeled data
Instances unlabeled = new Instances(new java.io.FileReader(pathFeatures + "unlabeled.arff"));
// set class attribute
unlabeled.setClassIndex(unlabeled.numAttributes() - 1);
// create copy
Instances labeled = new Instances(unlabeled);
// label instances
for (int i = 0; i < unlabeled.numInstances(); i++)
{
double clsLabel = cl.classifyInstance(unlabeled.instance(i));
labeled.instance(i).setClassValue(clsLabel);
}
int numCorrect = 0;
for (int i = 0; i < unlabeled.numInstances(); i++)
{
double pred = cl.classifyInstance(unlabeled.instance(i));
Console.Write("ID: " + unlabeled.instance(i).value(i));
//Console.Write(", actual: " + unlabeled.classAttribute().value((int)unlabeled.instance(i).classValue()));
Console.WriteLine(", predicted: " + unlabeled.classAttribute().value((int)pred));
}
Console.WriteLine("Correct predictions: " + numCorrect);
}
I am trying to get a learning curve for an automated Weka experiment. I currently have the following Java code.
public static void EvaluateModel(AbstractClassifier cl, String datapath, String outfile) throws Exception {
Experiment exp = new Experiment();
ClassifierSplitEvaluator se = new ClassifierSplitEvaluator();
se.setClassifier(cl);
Classifier sec = ((ClassifierSplitEvaluator) se).getClassifier();
CrossValidationResultProducer cvrp = new CrossValidationResultProducer();
cvrp.setNumFolds(10);
cvrp.setSplitEvaluator(se);
PropertyNode[] propertyPath = new PropertyNode[2];
try {
propertyPath[0] = new PropertyNode(
se,
new PropertyDescriptor("splitEvaluator",
CrossValidationResultProducer.class),
CrossValidationResultProducer.class);
propertyPath[1] = new PropertyNode(sec,
new PropertyDescriptor("classifier", se.getClass()),
se.getClass());
} catch (IntrospectionException e) {
e.printStackTrace();
}
exp.setResultProducer(cvrp);
exp.setPropertyPath(propertyPath);
exp.setPropertyArray(new Classifier[]{cl});
DefaultListModel model = new DefaultListModel();
model.addElement(new File(datapath));
exp.setDatasets(model);
InstancesResultListener irl = new InstancesResultListener();
irl.setOutputFile(new File(outfile));
exp.setResultListener(irl);
System.out.println("Initializing...");
exp.initialize();
System.out.println("Running...");
exp.runExperiment();
System.out.println("Finishing...");
exp.postProcess();
System.out.println("Evaluating...");
PairedTTester tester = new PairedCorrectedTTester();
FileReader reader = new FileReader(irl.getOutputFile());
Instances result = new Instances(reader);
tester.setInstances(result);
tester.setSortColumn(-1);
tester.setRunColumn(result.attribute("Key_Run").index());
tester.setFoldColumn(result.attribute("Key_Fold").index());
tester.setDatasetKeyColumns(
new Range(
""
+ (result.attribute("Key_Dataset").index() + 1)));
tester.setResultsetKeyColumns(
new Range(
""
+ (result.attribute("Key_Scheme").index() + 1)
+ ","
+ (result.attribute("Key_Scheme_options").index() + 1)
+ ","
+ (result.attribute("Key_Scheme_version_ID").index() + 1)));
tester.setResultMatrix(new ResultMatrixPlainText());
tester.setDisplayedResultsets(null);
tester.setSignificanceLevel(0.05);
tester.setShowStdDevs(true);
// fill result matrix (but discarding the output)
tester.multiResultsetFull(0, result.attribute("Percent_correct").index());
// output results for each dataset
System.out.println("\nResult:");
ResultMatrix matrix = tester.getResultMatrix();
for (int i = 0; i < matrix.getColCount(); i++) {
System.out.println(matrix.getColName(i));
System.out.println(" Perc. correct: " + matrix.getMean(i, 0));
System.out.println(" StdDev: " + matrix.getStdDev(i, 0));
}
}
What I would like to do is either save or display the learning curve in this method. I cannot find any information on how to do this programmatically.
I am using the LingPipe tool for the naive Bayes algorithm. I trained it on my training data and it successfully classifies my test data. But every time I run the program, it trains again. I don't want to retrain each time; instead I want to build a model once and then apply it to the test data.
public class ClassifyNews {
private static File TRAINING_DIR= new File("train");
private static File TESTING_DIR= new File("test");
private static String[] CATEGORIES
= { "c1",
"c2",
"c3"};
private static int NGRAM_SIZE = 6;
public static void main(String[] args)throws ClassNotFoundException, IOException
{
DynamicLMClassifier<NGramProcessLM> classifier
=DynamicLMClassifier.createNGramProcess(CATEGORIES,NGRAM_SIZE);
for(int i=0; i<CATEGORIES.length; ++i)
{
File classDir = new File(TRAINING_DIR,CATEGORIES[i]);
if (!classDir.isDirectory())
{
String msg = "Could not find training directory="+ classDir
+ "\nTraining directory not found";
System.out.println(msg);
throw new IllegalArgumentException(msg);
}
String[] trainingFiles = classDir.list();
for (int j = 0; j < trainingFiles.length; ++j)
{
File file = new File(classDir,trainingFiles[j]);
String text = Files.readFromFile(file,"ISO-8859-1");
System.out.println("Training on " + CATEGORIES[i] + "/" + trainingFiles[j]);
Classification classification= new Classification(CATEGORIES[i]);
Classified<CharSequence> classified= new Classified<CharSequence>(text,classification);
classifier.handle(classified);}
}
System.out.println("Compiling");
JointClassifier<CharSequence> compiledClassifier
= (JointClassifier<CharSequence>)
AbstractExternalizable.compile(classifier);
boolean storeCategories = true;
JointClassifierEvaluator<CharSequence> evaluator =
new JointClassifierEvaluator
<CharSequence> (compiledClassifier,CATEGORIES,storeCategories);
for(int i = 0; i < CATEGORIES.length; ++i)
{
File classDir = new File(TESTING_DIR,CATEGORIES[i]);
String[] testingFiles = classDir.list();
for (int j=0; j<testingFiles.length; ++j)
{
String text= Files.readFromFile(new File(classDir,testingFiles[j]),"ISO-8859-1");
System.out.print("\nTesting on " + CATEGORIES[i] + "/" + testingFiles[j] + " ");
Classification classification= new Classification(CATEGORIES[i]);
Classified<CharSequence> classified= new Classified<CharSequence>(text,classification);
evaluator.handle(classified);
JointClassification jc =compiledClassifier.classify(text);
String bestCategory = jc.bestCategory();
String details = jc.toString();
System.out.println("\tGot best category of: " + bestCategory);
System.out.println(jc.toString());
}}
}
}
I am using opencsv to parse two CSV files. I only copy some values from the two files.
I have a separate function that processes CDax.csv, which looks like this:
public HashMap<String,String> readCDax() throws Exception {
String csvDaxFile = "C:\\Users\\CDAX.csv";
CSVReader reader = new CSVReader(new FileReader(csvDaxFile), ';');
String [] line;
HashMap<String, String> cdaxMap = new HashMap<String, String>();
while ((line = reader.readNext()) != null) {
cdaxMap.put(line[0], line[7]);
}
System.out.println("Process CDax File!");
reader.close();
return cdaxMap;
}
The main work is done in run(), which I call from my main method:
public void run() throws Exception {
while ((firstLine = reader.readNext()) != null && (secondLine = reader.readNext()) != null && i<10) {
//fileName of the String
fileName = firstLine[0];
writerPath = "C:\\Users\\" + fileName + ".csv";
//write csv file
CSVWriter writer = new CSVWriter(new FileWriter(writerPath), ';');
//write Header
//String[] entries = "Name;Date;TotalReturn;Currency".split(";");
String [] entries = {"Name","Date", "TotalReturn", "Currency"};
writer.writeNext(entries);
//create Content
//companyName of the String
companyName = secondLine[1];
//currency
currency = secondLine[2];
//dates
dateList = new ArrayList<String>();
for(int p = 3; p < firstLine.length; p++) {
dateList.add(firstLine[p]);
}
//total returns
returnList = new ArrayList<String>();
for(int j = 3; j < secondLine.length; j++) {
returnList.add(secondLine[j]);
}
// cDaxList
cDaxList = new ArrayList<String>();
for(int j = 1; j <dateList.size(); j++) {
if(cDaxMethodValuesMap.containsKey(dateList.get(j))){
cDaxList.add(cDaxMethodValuesMap.get(dateList.get(j)));
} else{
dateList.add("na"); // I get the error here!
}
}
if(dateList.size()!=returnList.size()) {
System.out.println("Dates and Returns do not have the same length!");
}
int minSize = Math.min(dateList.size(), returnList.size());
//"Name;Date;TotalReturn;Currency"
List<String[]> data = new ArrayList<String[]>();
for(int m = 0; m < minSize; m++) {
data.add(new String[] {companyName, dateList.get(m), returnList.get(m), currency, cDaxList.get(m)});
}
writer.writeAll(data);
//close Writer
writer.close();
i++;
System.out.println(fileName + " parsed successfully!");
}
System.out.println("Done");
}
However when I run my program I get:
Process CDax File!
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2760)
at java.util.Arrays.copyOf(Arrays.java:2734)
at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
at java.util.ArrayList.add(ArrayList.java:351)
at com.TransformCSV.main.ParseCSV.run(ParseCSV.java:109)
at com.TransformCSV.main.ParseCSV.main(ParseCSV.java:21)
I am getting the error in this method:
cDaxList = new ArrayList<String>();
for(int j = 1; j <dateList.size(); j++) {
if(cDaxMethodValuesMap.containsKey(dateList.get(j))){
cDaxList.add(cDaxMethodValuesMap.get(dateList.get(j)));
} else{
dateList.add("na"); //I get the error here!!!
}
}
I tried increasing the heap size via the VM settings, but I do not think that should be necessary because I only read 3000 values from the two CSV files.
I appreciate your reply!
Your loop:
for(int j = 1; j <dateList.size(); j++) {
is looping through dateList but in that loop you are adding to dateList:
dateList.add("na"); //I get the error here!!!
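so dateList will get bigger and bigger until you run out of memory. dateList.size() is evaluated every time through the loop, not once at the beginning. Each appended "na" is later visited by the loop as well, and since "na" is presumably not a key in the map either, it causes yet another "na" to be appended, so the loop never terminates.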
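A minimal sketch of one possible fix, assuming the intent was to store "na" in cDaxList (not in dateList) whenever a date has no CDAX value. The class and method names here are hypothetical. The sketch also starts at index 0 instead of 1, on the assumption that cDaxList should line up element by element with dateList when the rows are written out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; names are illustrative and not from the original program.
class CDaxLookup {
    // Returns one value per date, substituting "na" where the map has no entry.
    // The list being iterated is never modified, so the loop always terminates.
    static List<String> lookup(List<String> dates, Map<String, String> cdaxByDate) {
        List<String> result = new ArrayList<>(dates.size());
        for (String date : dates) {
            result.add(cdaxByDate.getOrDefault(date, "na"));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> cdaxByDate = new HashMap<>();
        cdaxByDate.put("2013-01-02", "101.5");
        List<String> dates = new ArrayList<>();
        dates.add("2013-01-02");
        dates.add("2013-01-03");
        System.out.println(lookup(dates, cdaxByDate));
    }
}
```

Because the result has exactly as many elements as the input list, a later cDaxList.get(m) for any m below dateList.size() stays in bounds.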
I have a JTable which displays the following when triggered.
How can I write all of its contents to a text file while keeping the original format? The text file should look something like this:
Line Number    Error    Solution    Percentage(%)
6              in       int         33%
This is what I have tried so far, but only the value 6 is written to the file. Any help is appreciated.
My code (only the main parts):
private static final String[] columnNames = {"Line Number", "Error","Solution","Percentage (%)"};
static DefaultTableModel model = new DefaultTableModel(null,columnNames);
public static void DisplayMyJList(List<CaptureErrors> x) throws IOException
{
String [] myErrorDetails = new String[x.size()];
int i = 0;
int line,percentage;
String err, sol;
String aLine;
StringBuffer fileContent = new StringBuffer();
for(CaptureErrors e: x)
{
Vector row = new Vector();
row.add(e.getLinenumber());
row.add(e.getMyfounderror());
row.add(e.getMycorrection());
row.add(e.getMyPercentage()+"%");
model.addRow( row );
for (int i1 = 0; i1 < model.getRowCount(); i1++) {
Object cellValue = model.getValueAt(i1, 0);
// ... continue to read each cell in a row
fileContent.append(cellValue);
// ... continue to append each cell value
FileWriter fileWriter = new FileWriter(new File("C:\\Users\\Antish\\Desktop\\data.txt"));
fileWriter.write(fileContent.toString());
fileWriter.flush();
fileWriter.close();
}
Update: I tried this with two loops and it gives me the following. I lost the original format:
Code:
String separator = System.getProperty( "line.separator" );
try
{
BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter(file,true));
PrintWriter fileWriter = new PrintWriter(bufferedWriter);
for(int i1=0; i1<model.getRowCount(); ++i1)
{
for(int j=0; j<model.getColumnCount(); ++j)
{ String names = columnNames[counter];
String s = model.getValueAt(i1,j).toString();
fileWriter.print(names +" ");
fileWriter.append( separator );
fileWriter.print(s + " ");
counter ++;
}
fileWriter.println("");
}
fileWriter.close();
}catch(Exception e)
{
Your code isn't complete, but I can see one, maybe two errors:
1) You need a double loop: one over the rows and a second over every column in the row. The code you posted only gets the value from the first column, which would explain why you only see "6".
2) The code that opens and writes the file needs to be outside your two loops. The way the code is written now, you create a new file for every row, which means you will only ever have a single row of data in the file.
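Both points can be sketched as follows. This is one possible layout under stated assumptions, not the only way to do it: the TableExport class, its write method, and the fixed column width are all hypothetical choices. The writer is created once by the caller, the header is written once, and a nested loop pads every cell to the same width so the columns stay aligned.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;
import javax.swing.table.DefaultTableModel;
import javax.swing.table.TableModel;

// Hypothetical helper; the name and the fixed-width layout are illustrative.
class TableExport {
    // Writes the column names, then one line per row, left-aligning each cell
    // in a field of the given width. The writer is opened once by the caller.
    static void write(TableModel model, Writer out, int width) {
        PrintWriter pw = new PrintWriter(out);
        String cellFormat = "%-" + width + "s";
        for (int c = 0; c < model.getColumnCount(); c++) {
            pw.printf(cellFormat, model.getColumnName(c));
        }
        pw.println();
        for (int r = 0; r < model.getRowCount(); r++) {         // every row
            for (int c = 0; c < model.getColumnCount(); c++) {  // every column
                pw.printf(cellFormat, String.valueOf(model.getValueAt(r, c)));
            }
            pw.println();
        }
        pw.flush();
    }

    public static void main(String[] args) {
        String[] columnNames = {"Line Number", "Error", "Solution", "Percentage (%)"};
        DefaultTableModel model = new DefaultTableModel(columnNames, 0);
        model.addRow(new Object[] {6, "in", "int", "33%"});
        StringWriter text = new StringWriter();
        write(model, text, 16);
        System.out.print(text);
    }
}
```

To write to a file instead, the same method can be given a FileWriter opened in a try-with-resources block after all rows have been added to the model.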