TimeSeries Forecasting with Weka - Java API

I am trying to implement time-series forecasting in a Java service in webMethods. My code is not working and I am completely lost, so I would be glad if you could help me! FYI: I used this tutorial.
This is the exception I get: com.wm.lang.flow.FlowException: weka.core.expressionlanguage.parser.Parser.getSymbolFactory()Ljava_cup/runtime/SymbolFactory;
I will just post the part which is not webMethods-specific (plain Java).
In the first part I build an ARFF file, which works fine: I saved the file, opened it with the Weka Explorer, and everything looks correct.
The ARFF file looks like this:
@relation Rel
@attribute Count numeric
@data
2758
2797
2861
575
505
4029
(plus more values, 59 in total)
I want to forecast the next 3 values.
Forecasting Part:
// At the beginning I create and save the ARFF file, so I have an Instances
// object called 'dataset'
WekaForecaster forecaster = new WekaForecaster();
try {
    forecaster.setFieldsToForecast("Count");
} catch (Exception e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
forecaster.setBaseForecaster(new GaussianProcesses());
forecaster.getTSLagMaker().setTimeStampField("Date");
forecaster.getTSLagMaker().setMinLag(1);
forecaster.getTSLagMaker().setMaxLag(12);
forecaster.getTSLagMaker().setAddMonthOfYear(true);
forecaster.getTSLagMaker().setAddQuarterOfYear(true);

PrintStream stream = null;
List<List<NumericPrediction>> forecast = null;
try {
    stream = new PrintStream("./path/forecast.txt");
    forecaster.buildForecaster(dataset, stream);
    forecaster.primeForecaster(dataset);
    forecast = forecaster.forecast(3, dataset, stream);
} catch (Exception e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

// output the predictions
for (int i = 0; i < 3; i++) {
    List<NumericPrediction> predsAtStep = forecast.get(i);
    NumericPrediction predForTarget = predsAtStep.get(0);
    stream.print("" + predForTarget.predicted() + " ");
    stream.println();
}
The Java code is hard to debug in webMethods, but it seems that forecaster.buildForecaster(dataset, stream); is what throws the exception.
What am I missing?
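One mismatch I notice while staring at this: the code calls setTimeStampField("Date") (and setAddMonthOfYear/setAddQuarterOfYear, which presumably need a real date), but my ARFF declares only the Count attribute. If a real date column is required, the ARFF might need to look roughly like this (the attribute name, date format string, and values are assumptions for illustration, not my actual data):

```text
@relation Rel

@attribute Date date "yyyy-MM-dd"
@attribute Count numeric

@data
2016-01-01,2758
2016-02-01,2797
2016-03-01,2861
```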


weka.core.UnassignedClassException: Class index is negative (not set)!

I am trying to implement linear regression over a CSV file. Here is the content of the CSV file:
X1;X2;X3;X4;X5;X6;X7;X8;Y1;Y2;
0.98;514.50;294.00;110.25;7.00;2;0.00;0;15.55;21.33;
0.98;514.50;294.00;110.25;7.00;3;0.00;0;15.55;21.33;
0.98;514.50;294.00;110.25;7.00;4;0.00;0;15.55;21.33;
0.98;514.50;294.00;110.25;7.00;5;0.00;0;15.55;21.33;
0.90;563.50;318.50;122.50;7.00;2;0.00;0;20.84;28.28;
0.90;563.50;318.50;122.50;7.00;3;0.00;0;21.46;25.38;
0.90;563.50;318.50;122.50;7.00;4;0.00;0;20.71;25.16;
0.90;563.50;318.50;122.50;7.00;5;0.00;0;19.68;29.60;
0.86;588.00;294.00;147.00;7.00;2;0.00;0;19.50;27.30;
0.86;588.00;294.00;147.00;7.00;3;0.00;0;19.95;21.97;
0.86;588.00;294.00;147.00;7.00;4;0.00;0;19.34;23.49;
0.86;588.00;294.00;147.00;7.00;5;0.00;0;18.31;27.87;
0.82;612.50;318.50;147.00;7.00;2;0.00;0;17.05;23.77;
...
0.71;710.50;269.50;220.50;3.50;2;0.40;5;12.43;15.59;
0.71;710.50;269.50;220.50;3.50;3;0.40;5;12.63;14.58;
0.71;710.50;269.50;220.50;3.50;4;0.40;5;12.76;15.33;
0.71;710.50;269.50;220.50;3.50;5;0.40;5;12.42;15.31;
0.69;735.00;294.00;220.50;3.50;2;0.40;5;14.12;16.63;
0.69;735.00;294.00;220.50;3.50;3;0.40;5;14.28;15.87;
0.69;735.00;294.00;220.50;3.50;4;0.40;5;14.37;16.54;
0.69;735.00;294.00;220.50;3.50;5;0.40;5;14.21;16.74;
0.66;759.50;318.50;220.50;3.50;2;0.40;5;14.96;17.64;
0.66;759.50;318.50;220.50;3.50;3;0.40;5;14.92;17.79;
0.66;759.50;318.50;220.50;3.50;4;0.40;5;14.92;17.55;
0.66;759.50;318.50;220.50;3.50;5;0.40;5;15.16;18.06;
0.64;784.00;343.00;220.50;3.50;2;0.40;5;17.69;20.82;
0.64;784.00;343.00;220.50;3.50;3;0.40;5;18.19;20.21;
0.64;784.00;343.00;220.50;3.50;4;0.40;5;18.16;20.71;
0.64;784.00;343.00;220.50;3.50;5;0.40;5;17.88;21.40;
0.62;808.50;367.50;220.50;3.50;2;0.40;5;16.54;16.88;
0.62;808.50;367.50;220.50;3.50;3;0.40;5;16.44;17.11;
0.62;808.50;367.50;220.50;3.50;4;0.40;5;16.48;16.61;
0.62;808.50;367.50;220.50;3.50;5;0.40;5;16.64;16.03;
I read this CSV file and run linear regression on it. Here is the Java source code:
public static void main(String[] args) throws IOException
{
    String csvFile = null;
    CSVLoader loader = null;
    Remove remove = null;
    Instances data = null;
    LinearRegression model = null;
    int numberofFeatures = 0;
    try
    {
        csvFile = "C:\\Users\\Taha\\Desktop/ENB2012_data.csv";
        loader = new CSVLoader();
        // load CSV
        loader.setSource(new File(csvFile));
        data = loader.getDataSet();
        //System.out.println(data);
        numberofFeatures = data.numAttributes();
        System.out.println("number of features: " + numberofFeatures);
        data.setClassIndex(data.numAttributes() - 2);
        // remove last attribute Y2
        remove = new Remove();
        remove.setOptions(new String[]{"-R", data.numAttributes() + ""});
        remove.setInputFormat(data);
        data = Filter.useFilter(data, remove);
        // data.setClassIndex(data.numAttributes() - 2);
        model = new LinearRegression();
        model.buildClassifier(data);
        System.out.println(model);
    } catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
I am getting the error weka.core.UnassignedClassException: Class index is negative (not set)! at the line model.buildClassifier(data);. The number of features is 1, however, it is expected to be 9. They are X1;X2;X3;X4;X5;X6;X7;X8;Y1;Y2. What am I missing?
Thanks in advance.
You can add, after the line data = loader.getDataSet(), the following lines, which will resolve your exception:
if (data.classIndex() == -1) {
    System.out.println("reset index...");
    data.setClassIndex(data.numAttributes() - 1);
}
This worked for me.
Since I could not find any solution to this problem, I decided to put the data into an Oracle database and read it from Oracle instead. There is an import utility in Oracle SQL Developer, and I used it. That solved my problem. I am writing this for people who have the same problem.
Here is detailed information about connecting Weka to an Oracle database:
http://tahasozgen.blogspot.com.tr/2016/10/connection-to-oracle-database-in-weka.html

QR code not decoded same string two formats

We have the string 117355|1.3,-0.6|1.68,1.25|2.95,-0.6|1.68,1.25 encoded into a QR code. We have been using an online generator, which has produced two variations, shown here:
This one is easily decoded by our app.
This one returns a ChecksumException.
Both scan fine with a third-party scanning app. However, we are using the ZXing library within our app, and instead of a video feed we are using a still image, so we get only one attempt; this is because the scan is part of a wider image-processing workflow.
The reason I said "two formats" is that the sections of each QR code differ according to this. We have different strings, and whenever we have the "3 dots" under the top-right registration marker it scans; when we don't, the scan is unreliable.
Here is the reading section of code:
QRCodeReader reader;
try {
    reader = new QRCodeReader();
    Map<DecodeHintType, Object> tmpHintsMap = new EnumMap<DecodeHintType, Object>(DecodeHintType.class);
    tmpHintsMap.put(DecodeHintType.TRY_HARDER, Boolean.TRUE);
    BinaryBitmap img = new BinaryBitmap(new HybridBinarizer(
            new GreyscaleLuminanceSource(greyscaleHalf, w, h)));
    Result scanResult = reader.decode(img, tmpHintsMap);
    return scanResult;
} catch (ChecksumException e) {
    e.printStackTrace();
    mError = "The QR Code could not be decoded";
} catch (NotFoundException e) {
    mError = "The QR Code could not be found";
    e.printStackTrace();
} catch (FormatException e) {
    mError = "The QR Code Format";
    e.printStackTrace();
}
return null;
As you can see, I have tried the DecodeHintType.TRY_HARDER hint, but it does not help. We were working with ZXing 3.2.0, and I have recently tried 3.2.1.
We have had 22,000 of these labels printed, and I need to find a solution that makes them scan reliably.

java: reading large file with charset

My file is 14 GB, and I would like to read it line by line and export it to an Excel file.
As the file includes different languages, such as Chinese and English, I tried to use a FileInputStream with UTF-16 for reading the data,
but that resulted in java.lang.OutOfMemoryError: Java heap space.
I have tried to increase the heap space, but the problem still exists.
How should I change my file-reading code?
createExcel(); // open an Excel file
try {
    // succeeds, but cannot read and output the different languages
    //br = new BufferedReader(
    //        new FileReader("C:\\Users\\brian_000\\Desktop\\appdatafile.json"));

    // results in java.lang.OutOfMemoryError: Java heap space
    br = new BufferedReader(new InputStreamReader(
            new FileInputStream("C:\\Users\\brian_000\\Desktop\\appdatafile.json"),
            "UTF-16"));
} catch (FileNotFoundException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (UnsupportedEncodingException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

System.out.println("cann be print");
String line;
int i = 0;
try {
    while ((line = br.readLine()) != null) {
        // process the line.
        try {
            System.out.println("cannot be print");
            // some statements for storing the data in variables.
            // a function for writing the variables into Excel
            writeToExcel(platform, kind, title, shareUrl, contentRating, userRatingCount,
                    averageUserRating, marketLanguage, pricing,
                    majorVersionNumber, releaseDate, downloadsCount);
        } catch (com.google.gson.JsonSyntaxException exception) {
            System.out.println("error");
        }
        // trying to get the first 1000 rows
        i++;
        if (i == 1000) {
            br.close();
            break;
        }
    }
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
closeExcel();
public static void writeToExcel(String platform, String kind, String title, String shareUrl,
        String contentRating, String userRatingCount, String averageUserRating,
        String marketLanguage, String pricing, String majorVersionNumber,
        String releaseDate, String downloadsCount) {
    currentRow++;
    System.out.println(currentRow);
    if (currentRow > 1000000) {
        currentsheet++;
        sheet = workbook.createSheet("apps" + currentsheet, 0);
        createFristRow();
        currentRow = 1;
    }
    try {
        // character id
        Label label = new Label(0, currentRow, String.valueOf(currentRow), cellFormat);
        sheet.addCell(label);
        // 12 such statements write the data to Excel
        label = new Label(1, currentRow, platform, cellFormat);
        sheet.addCell(label);
    } catch (WriteException e) {
        e.printStackTrace();
    }
}
Excel, UTF-16
As mentioned, the problem is most likely caused by the Excel document construction. Try whether UTF-8 yields a smaller size; for instance, Chinese HTML is still compressed better with UTF-8 than with UTF-16 because of the many ASCII characters.
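To make the size claim concrete, here is a small check using only the standard library (the sample strings are my own, not from the question):

```java
import java.nio.charset.StandardCharsets;

public class CharsetSize {
    public static void main(String[] args) {
        // mostly-ASCII text: UTF-8 uses 1 byte per char, UTF-16 uses 2
        String ascii = "platform,kind,title,shareUrl";
        // Chinese text: UTF-8 uses 3 bytes per char, UTF-16 uses 2
        String chinese = "中文文本";

        System.out.println(ascii.getBytes(StandardCharsets.UTF_8).length);      // 28
        System.out.println(ascii.getBytes(StandardCharsets.UTF_16LE).length);   // 56
        System.out.println(chinese.getBytes(StandardCharsets.UTF_8).length);    // 12
        System.out.println(chinese.getBytes(StandardCharsets.UTF_16LE).length); // 8
        // (UTF_16LE is used to exclude the 2-byte BOM that UTF_16 prepends)
    }
}
```

So for mostly-ASCII data, UTF-8 halves the byte count; only for CJK-heavy text does UTF-16 win, and by a smaller margin.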
Object creation in Java
You can share common small Strings; this is useful for String.valueOf(row) and the like. Cache only strings with a small length. I assume the cellFormat is fixed.
DIY with xlsx
Excel builds a costly DOM.
If CSV text (with a Unicode BOM marker) is no option (you could give it the extension .xls so it is opened by Excel), try generating an xlsx yourself:
Create an example workbook in xlsx.
This is a zip format you can process in Java most easily with a zip filesystem.
For Excel there is a content XML and a shared-strings XML, sharing cell values via an index from the content to the shared strings.
Then no overflow happens, as you write buffer-wise.
Or use a JDBC driver for Excel. (No recent experience on my side; maybe via JDBC/ODBC.)
Best
Excel is hard to use with that much data. Consider putting more effort into a database, or write every N rows into a separate, proper Excel file. Maybe you can later merge them with Java into one document. (I doubt it.)
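For the reading side itself, a buffered, streaming read with an explicit charset keeps memory usage flat no matter how large the file is; the heap problem far more likely comes from what accumulates per line (here, the growing Excel workbook) than from the reader. A minimal stdlib-only sketch (the temp-file demo content stands in for the real 14 GB file):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class StreamingRead {
    // Counts lines without ever holding more than one line in memory.
    static long countLines(Path file) throws IOException {
        long count = 0;
        // try-with-resources closes the reader even if processing throws
        try (BufferedReader br = Files.newBufferedReader(file, StandardCharsets.UTF_16)) {
            while (br.readLine() != null) {
                count++; // process each line here instead of collecting them
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // demo on a small temp file; point this at the real file instead
        Path tmp = Files.createTempFile("appdata", ".json");
        Files.write(tmp, "中文行\n{\"kind\":\"app\"}\n".getBytes(StandardCharsets.UTF_16));
        System.out.println(countLines(tmp) + " lines"); // prints "2 lines"
    }
}
```

Files.newBufferedReader also consumes the UTF-16 BOM for you, which the raw InputStreamReader approach leaves in the first line.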

using CSVWriter to export database tables with BLOB

I have already tried exporting my database tables to CSV using CSVWriter, but my tables contain BLOB data. How can I include it in my export?
Later on, I am going to import the exported CSV using CSVReader. Can anyone share some concepts?
This is a part of my code for the export:
ResultSet res = st.executeQuery("select * from " + db + "." + obTableNames[23]);
int columnCount = getColumnCount(res);
try {
    File filename = new File(dir, "" + obTableNames[23] + ".csv");
    fw = new FileWriter(filename);
    CSVWriter writer = new CSVWriter(fw);
    writer.writeAll(res, false);
    int colType = res.getMetaData().getColumnType(columnCount);
    dispInt(colType);
    fw.flush();
    fw.close();
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
Did you take a look at the encodeBase64String(byte[] data) method from the Base64 class provided by Apache Commons Codec?
It encodes binary data using the Base64 algorithm but does not chunk the output.
This should allow you to turn your Binary Large Objects into encoded strings and incorporate them into your CSV.
People on the other side can then use decodeBase64(String data) to get the BLOBs back again.
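If pulling in the Apache dependency is undesirable, java.util.Base64 (built in since Java 8) does the same round trip; a sketch with made-up byte values standing in for a real BLOB column:

```java
import java.util.Base64;

public class BlobCsv {
    public static void main(String[] args) {
        byte[] blob = {0x01, 0x02, (byte) 0xFF, 0x00, 0x7F}; // stand-in for the BLOB column

        // encode for the CSV cell: plain text, no delimiters that confuse CSV parsing
        String cell = Base64.getEncoder().encodeToString(blob);
        System.out.println(cell); // AQL/AH8=

        // decode on the import side
        byte[] back = Base64.getDecoder().decode(cell);
        System.out.println(back.length == blob.length); // true
    }
}
```

Base64 output never contains commas, quotes, or newlines, so the encoded cell needs no extra CSV escaping.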

how to intentionally corrupt a file in java

Note: Please do not judge this question. To those who think that I am doing this to "cheat": you are mistaken, as I am no longer in school anyway. In addition, if I actually were trying to cheat, I would simply use services that already exist for this instead of recreating the program. I took on this project because I thought it might be fun, nothing else. Before you down-vote, please consider the value of the question itself, and not its speculative uses, as the purpose of SO is not to judge but simply to give the public information.
I am developing a program in Java that is supposed to intentionally corrupt a file (specifically a .doc, .txt, or .pdf, but others would be good as well).
I initially tried this:
public void corruptFile(String pathInName, String pathOutName) {
    curroptMethod method = new curroptMethod();
    ArrayList<Integer> corruptHash = corrupt(getBytes(pathInName));
    writeBytes(corruptHash, pathOutName);
    new MimetypesFileTypeMap().getContentType(new File(pathInName));
    // "/home/ephraim/Desktop/testfile"
}
public ArrayList<Integer> getBytes(String filePath) {
    ArrayList<Integer> fileBytes = new ArrayList<Integer>();
    try {
        FileInputStream myInputStream = new FileInputStream(new File(filePath));
        do {
            int currentByte = myInputStream.read();
            if (currentByte == -1) {
                System.out.println("broke loop");
                break;
            }
            fileBytes.add(currentByte);
        } while (true);
    } catch (FileNotFoundException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    System.out.println(fileBytes);
    return fileBytes;
}
public void writeBytes(ArrayList<Integer> hash, String pathName) {
    try {
        OutputStream myOutputStream = new FileOutputStream(new File(pathName));
        for (int currentHash : hash) {
            myOutputStream.write(currentHash);
        }
    } catch (FileNotFoundException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
    //System.out.println(hash);
}
public ArrayList<Integer> corrupt(ArrayList<Integer> hash) {
    ArrayList<Integer> corruptHash = new ArrayList<Integer>();
    ArrayList<Integer> keywordCodeArray = new ArrayList<Integer>();
    Integer keywordIndex = 0;
    String keyword = "corruptthisfile";
    for (int i = 0; i < keyword.length(); i++) {
        keywordCodeArray.add(keyword.codePointAt(i));
    }
    for (Integer currentByte : hash) {
        //Integer currentByteProduct = (keywordCodeArray.get(keywordIndex) + currentByte) / 2;
        Integer currentByteProduct = currentByte - keywordCodeArray.get(keywordIndex);
        if (currentByteProduct < 0) currentByteProduct += 255;
        corruptHash.add(currentByteProduct);
        if (keywordIndex == (keyword.length() - 1)) {
            keywordIndex = 0;
        } else {
            keywordIndex++;
        }
    }
    //System.out.println(corruptHash);
    return corruptHash;
}
but the problem is that the file can still be opened. When you open it, all of the words are changed (they may not make any sense, and they may not even be letters), but it can still be opened.
So here is my actual question:
Is there a way to make a file so corrupt that the computer doesn't know how to open it at all (i.e., when you open it, the computer will say something along the lines of "this file is not recognized and cannot be opened")?
I think you want to look into RandomAccessFile. Also, it is almost always the case that a program recognizes its files by their very start, so open the file and scramble the first 5 bytes.
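A minimal sketch of that suggestion, using only the standard library (the file name, content, and byte count are made up for the demo):

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.SecureRandom;

public class HeaderScramble {
    // Overwrites the first n bytes of the file in place with random data,
    // destroying the magic number most applications use to recognize the format.
    static void scrambleHeader(Path file, int n) throws IOException {
        byte[] junk = new byte[n];
        new SecureRandom().nextBytes(junk);
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.seek(0);     // position at the very start of the file
            raf.write(junk); // overwrite; the file length stays the same
        }
    }

    public static void main(String[] args) throws IOException {
        // demo on a temp file standing in for a real document
        Path tmp = Files.createTempFile("victim", ".bin");
        Files.write(tmp, "HEADERrest-of-the-content".getBytes());
        scrambleHeader(tmp, 6);
        byte[] after = Files.readAllBytes(tmp);
        // only the first 6 bytes changed; the rest of the file is untouched
        System.out.println(new String(after, 6, after.length - 6)); // prints "rest-of-the-content"
    }
}
```

There is a vanishingly small chance the random bytes reproduce the original header; XOR-ing the bytes with a fixed non-zero mask instead would guarantee a change (and make the damage reversible).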
The only way to fully corrupt an arbitrary file is to replace all of its contents with random garbage. Even then, there is an infinitesimally small probability that the random garbage will actually be something meaningful.
Depending on the file type, it may be possible to recover from limited - or even from not so limited - corruption. E.g.:
Streaming-media codecs are designed with network packet loss taken into account. Limited corruption may show up as picture artifacts, or even as a few lost frames, but the content is usually still viewable.
Block-based compression algorithms, such as bzip2, allow undamaged blocks to be recovered.
File-based compression systems such as rar and zip may be able to recover those files whose compressed data has not been damaged, regardless of damage to the rest of the archive.
Human-readable text, such as text files and source code, is still viewable in a text editor even if parts of it are corrupt - not to mention that its size does not change. Unless you corrupted the whole thing, any casual reader would be able to tell whether an assignment was done and whether the retransmitted file was the same as the one that got corrupted.
Apart from the ethical issue, have you considered that this would be a one-time thing only? Data corruption does happen, but it's not that frequent, and it's never that convenient...
If you are that desperate for more time, you would be better off breaking your leg and getting yourself admitted to a hospital.
There are better ways:
Your professor accepts Word documents? Infect one with a macro virus before sending.
"Forget" to attach the file to the email.
Forge the send date on your email. If your prof is the kind that accepts Word docs, this may work.
