My application needs to find associations via the Apriori algorithm, and I use the Weka dependency to get the results. But instead of the associations themselves, it prints memory locations. I have attached the output as well. Thanks.
Here is my code:
public class App
{
    static Logger log = Logger.getLogger(App.class.getName());

    public static BufferedReader readDataFile(String filename) {
        BufferedReader inputReader = null;
        try {
            inputReader = new BufferedReader(new FileReader(filename));
        } catch (FileNotFoundException ex) {
            log.error("Data file not found: " + filename, ex);
        }
        return inputReader;
    }
    public static void main(String[] args) throws Exception {
        // Define ArrayList to add association information
        ArrayList<FastVector[]> associationInfoArrayList = new ArrayList<FastVector[]>();
        Apriori apriori = new Apriori();
        apriori.setNumRules(15);
        BufferedReader datafile = readDataFile("/media/jingi_ingi/IR1_CPRA_X6/Documents/ss.arff");
        Instances data = new Instances(datafile);
        // Instances instances = new Instances(datafile);
        apriori.buildAssociations(data);
        log.debug("Testing Apriori Algo Results Started ....");
        log.debug("-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-");
        log.debug("Number of Associations : " + apriori.getNumRules());
        log.debug("Adding Association Information to ArrayList ..");
        Object[] objectArrayOfAssociations = new Object[apriori.getNumRules()];
        log.debug(apriori.getAllTheRules().toString());
        for (int i = 0; i < apriori.getNumRules(); i++) {
            objectArrayOfAssociations[i] = apriori.getAllTheRules();
            log.debug("Associations Discovered : " + objectArrayOfAssociations[i].toString());
        }
    }
}
Output of the Application:
2015-04-05 20:16:42 DEBUG App:48 - Associations Discovered :
[Lweka.core.FastVector;@19a96bae
2015-04-05 20:16:42 DEBUG App:48 - Associations Discovered :
[Lweka.core.FastVector;@19a96bae
2015-04-05 20:16:42 DEBUG App:48 - Associations Discovered :
[Lweka.core.FastVector;@19a96bae
2015-04-05 20:16:42 DEBUG App:48 - Associations Discovered :
[Lweka.core.FastVector;@19a96bae
apriori.getAllTheRules()
returns an array of FastVector objects, but neither arrays nor FastVector override toString() to dump their contents the way you intend. You can extend FastVector and add your own toString(), or write a little helper method to dump the contents as desired. Here's an example.
Something like:
for (FastVector fastVector : apriori.getAllTheRules())
    log.debug(fastVector.getRevision()); // or whichever attribute you want to show
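The underlying issue isn't Weka-specific: Java arrays never override toString(), so any array prints as a type descriptor plus hash code. Here is a minimal plain-Java sketch (no Weka needed, the rule strings are made up) showing the same behavior and the generic Arrays.deepToString fix:

```java
import java.util.Arrays;

public class ArrayToStringDemo {
    public static void main(String[] args) {
        // Stand-in for the array of rules returned by getAllTheRules()
        String[][] rules = { { "bread", "butter" }, { "milk" } };

        // Default toString(): type descriptor + hash code, e.g. [[Ljava.lang.String;@1b6d3586
        System.out.println(rules.toString());

        // Arrays.deepToString walks nested arrays and prints their elements
        System.out.println(Arrays.deepToString(rules));
    }
}
```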
I have created an OWL ontology using Protégé, describing a patient database system. I am now attempting to write Java code using Apache Jena to read the OWL file I created and perform a number of operations on it. My primary goal is for the code to find a specific individual by name (a patient name, for example), access a specific object property of that individual, and output its value. For example, a patient "John" has an object property "Treated_By" whose value is another individual "Amy" (Amy is an individual of type doctor). However, I have been unable to figure out which Jena method retrieves object property values from a given individual.
Here is my code (Please ignore comments, they are fragments of previous attempts for this task):
public class Main {
    public static void main(String[] args) {
        OntModel model = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM);
        String fileName = "C:/Users/Ahmed Medhat/Documents/assignment1ontv3.0.owl";
        try {
            InputStream inputStream = new FileInputStream(fileName);
            model.read(inputStream, "RDF/XML");
            //model.read(inputStream, "OWL/XML");
            inputStream.close();
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter Patient Name: ");
        String patientName = sc.next();
        ExtendedIterator<Individual> itI = model.listIndividuals();
        while (itI.hasNext()) {
            Individual i = itI.next();
            String localName = i.getLocalName();
            //System.out.println(patientName);
            //System.out.println(localName);
            if (localName.equals(patientName)) {
                //System.out.println("Conditional Code Accessed.");
                OntClass Class = i.getOntClass();
                System.out.println("Patient Disease is: " + Class.listDeclaredProperties());
            }
            System.out.println("Failed.");
        }
    }
}
Try this (replace the property URI accordingly):
final Property p = model.createObjectProperty("http://example.org/Treated_by");
final RDFNode object = i.getPropertyValue(p);
I am trying to implement linear regression over a CSV file. Here is the content of the CSV file:
X1;X2;X3;X4;X5;X6;X7;X8;Y1;Y2;
0.98;514.50;294.00;110.25;7.00;2;0.00;0;15.55;21.33;
0.98;514.50;294.00;110.25;7.00;3;0.00;0;15.55;21.33;
0.98;514.50;294.00;110.25;7.00;4;0.00;0;15.55;21.33;
0.98;514.50;294.00;110.25;7.00;5;0.00;0;15.55;21.33;
0.90;563.50;318.50;122.50;7.00;2;0.00;0;20.84;28.28;
0.90;563.50;318.50;122.50;7.00;3;0.00;0;21.46;25.38;
0.90;563.50;318.50;122.50;7.00;4;0.00;0;20.71;25.16;
0.90;563.50;318.50;122.50;7.00;5;0.00;0;19.68;29.60;
0.86;588.00;294.00;147.00;7.00;2;0.00;0;19.50;27.30;
0.86;588.00;294.00;147.00;7.00;3;0.00;0;19.95;21.97;
0.86;588.00;294.00;147.00;7.00;4;0.00;0;19.34;23.49;
0.86;588.00;294.00;147.00;7.00;5;0.00;0;18.31;27.87;
0.82;612.50;318.50;147.00;7.00;2;0.00;0;17.05;23.77;
...
0.71;710.50;269.50;220.50;3.50;2;0.40;5;12.43;15.59;
0.71;710.50;269.50;220.50;3.50;3;0.40;5;12.63;14.58;
0.71;710.50;269.50;220.50;3.50;4;0.40;5;12.76;15.33;
0.71;710.50;269.50;220.50;3.50;5;0.40;5;12.42;15.31;
0.69;735.00;294.00;220.50;3.50;2;0.40;5;14.12;16.63;
0.69;735.00;294.00;220.50;3.50;3;0.40;5;14.28;15.87;
0.69;735.00;294.00;220.50;3.50;4;0.40;5;14.37;16.54;
0.69;735.00;294.00;220.50;3.50;5;0.40;5;14.21;16.74;
0.66;759.50;318.50;220.50;3.50;2;0.40;5;14.96;17.64;
0.66;759.50;318.50;220.50;3.50;3;0.40;5;14.92;17.79;
0.66;759.50;318.50;220.50;3.50;4;0.40;5;14.92;17.55;
0.66;759.50;318.50;220.50;3.50;5;0.40;5;15.16;18.06;
0.64;784.00;343.00;220.50;3.50;2;0.40;5;17.69;20.82;
0.64;784.00;343.00;220.50;3.50;3;0.40;5;18.19;20.21;
0.64;784.00;343.00;220.50;3.50;4;0.40;5;18.16;20.71;
0.64;784.00;343.00;220.50;3.50;5;0.40;5;17.88;21.40;
0.62;808.50;367.50;220.50;3.50;2;0.40;5;16.54;16.88;
0.62;808.50;367.50;220.50;3.50;3;0.40;5;16.44;17.11;
0.62;808.50;367.50;220.50;3.50;4;0.40;5;16.48;16.61;
0.62;808.50;367.50;220.50;3.50;5;0.40;5;16.64;16.03;
I read this CSV file and run the linear regression. Here is the source code in Java:
public static void main(String[] args) throws IOException {
    String csvFile = null;
    CSVLoader loader = null;
    Remove remove = null;
    Instances data = null;
    LinearRegression model = null;
    int numberofFeatures = 0;
    try {
        csvFile = "C:\\Users\\Taha\\Desktop\\ENB2012_data.csv";
        loader = new CSVLoader();
        // load CSV
        loader.setSource(new File(csvFile));
        data = loader.getDataSet();
        //System.out.println(data);
        numberofFeatures = data.numAttributes();
        System.out.println("number of features: " + numberofFeatures);
        data.setClassIndex(data.numAttributes() - 2);
        // remove last attribute Y2
        remove = new Remove();
        remove.setOptions(new String[] { "-R", data.numAttributes() + "" });
        remove.setInputFormat(data);
        data = Filter.useFilter(data, remove);
        // data.setClassIndex(data.numAttributes() - 2);
        model = new LinearRegression();
        model.buildClassifier(data);
        System.out.println(model);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
I am getting an error, weka.core.UnassignedClassException: Class index is negative (not set)!, at the line model.buildClassifier(data);. The number of features is 1, however it is expected to be 9. They are X1; X2; X3; X4; X5; X6; X7; X8; Y1; Y2. What am I missing?
Thanks in advance.
You can add the following lines after data = loader.getDataSet(); they will resolve your exception:
if (data.classIndex() == -1) {
    System.out.println("reset index...");
    data.setClassIndex(data.numAttributes() - 1);
}
This worked for me.
Since I could not find any solution to this problem, I decided to put the data into an Oracle database and read it from there. There is an import utility in Oracle SQL Developer, and I used it. That solved my problem. I am writing this for people who have the same problem.
Here is detailed information about connecting Weka to an Oracle database:
http://tahasozgen.blogspot.com.tr/2016/10/connection-to-oracle-database-in-weka.html
I am currently trying to read in multiple CSV files using beanReader before taking a few columns from each and parsing them into one bean.
So far I cannot seem to parse columns from different files into one bean object. Is this even possible with ICsvBeanReader?
Yes, it's possible :) As of Super CSV 2.2.0 you can read into an existing bean (see javadoc).
The following example uses 3 readers simultaneously (operating on 3 different files) - the first reader is used to create the bean, the other 2 just update the existing bean. This approach assumes that each file has the same number of rows (and that each row number represents the same person). If they don't, but they share some unique identifier, you'll have to read all the records from the first file into memory first, then update from the second/third matching on the identifier.
I've tried to make it a little bit smart, so you don't have to hard-code the name mapping - it just nulls out the headers it doesn't know about (so that Super CSV doesn't attempt to map fields that don't exist in your bean - see the partial reading examples on the website). Of course this will only work if your file has headers - otherwise you'll just have to hard code the mapping arrays with nulls in the appropriate places.
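The header-nulling trick on its own is plain array manipulation, so it can be sketched without Super CSV at all (the header names below are just the ones from the example files):

```java
import java.util.Arrays;
import java.util.List;

public class HeaderFilterDemo {
    // Null out any header not in the desired set so the bean reader skips that column
    static String[] filterHeader(String[] header, List<String> desired) {
        String[] result = header.clone();
        for (int i = 0; i < result.length; i++) {
            if (!desired.contains(result[i])) {
                result[i] = null;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] header = { "firstName", "lastName" };
        String[] filtered = filterHeader(header, Arrays.asList("firstName", "sex", "country"));
        System.out.println(Arrays.toString(filtered)); // [firstName, null]
    }
}
```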
Person bean
public class Person {
    private String firstName;
    private String sex;
    private String country;
    // getters/setters, plus a toString() that produces the output shown below
}
Example code
public class Example {

    private static final String FILE1 = "firstName,lastName\nJohn,Smith\nSally,Jones";
    private static final String FILE2 = "age,sex\n21,male\n24,female";
    private static final String FILE3 = "city,country\nBrisbane,Australia\nBerlin,Germany";

    private static final List<String> DESIRED_HEADERS = Arrays.asList("firstName", "sex", "country");

    @Test
    public void testMultipleFiles() throws Exception {
        try (
            ICsvBeanReader reader1 = new CsvBeanReader(new StringReader(FILE1), CsvPreference.STANDARD_PREFERENCE);
            ICsvBeanReader reader2 = new CsvBeanReader(new StringReader(FILE2), CsvPreference.STANDARD_PREFERENCE);
            ICsvBeanReader reader3 = new CsvBeanReader(new StringReader(FILE3), CsvPreference.STANDARD_PREFERENCE);) {

            String[] mapping1 = getNameMappingFromHeader(reader1);
            String[] mapping2 = getNameMappingFromHeader(reader2);
            String[] mapping3 = getNameMappingFromHeader(reader3);

            Person person;
            while ((person = reader1.read(Person.class, mapping1)) != null) {
                reader2.read(person, mapping2);
                reader3.read(person, mapping3);
                System.out.println(person);
            }
        }
    }

    private String[] getNameMappingFromHeader(ICsvBeanReader reader) throws IOException {
        String[] header = reader.getHeader(true);
        // only read in the desired fields (set unknown headers to null to ignore)
        for (int i = 0; i < header.length; i++) {
            if (!DESIRED_HEADERS.contains(header[i])) {
                header[i] = null;
            }
        }
        return header;
    }
}
Output
Person [firstName=John, sex=male, country=Australia]
Person [firstName=Sally, sex=female, country=Germany]
I'm doing an animation in Processing. I'm using random points and I need to execute the code twice for stereo vision.
I have lots of random variables in my code, so I need to either save them somewhere for the second run or re-generate the SAME sequence of "random" numbers every time I run the program (as described here: http://www.coderanch.com/t/372076/java/java/save-random-numbers).
Is this approach possible? How? If I save the numbers in a txt file and then read it, will my program run slower? What's the best way to do this?
Thanks.
If you just need to be able to generate the same sequence for a limited time, seeding the random number generator with the same value is most likely the easiest and fastest way to go. Just make sure that any parallel threads always request their pseudo-random numbers in the same order, or you'll be in trouble.
Note, though, that as far as I know nothing guarantees the same sequence if you update your Java VM or even apply a patch, so if you want long-term storage for your sequence, or want to be able to use it outside of your Java program, you need to save it to a file.
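As a concrete sketch of the seeding approach with java.util.Random (Processing also exposes randomSeed() for its own random(), but the idea is the same):

```java
import java.util.Random;

public class SeededSequenceDemo {
    public static void main(String[] args) {
        long seed = 42L; // any fixed seed reproduces the same sequence

        Random firstRun = new Random(seed);
        Random secondRun = new Random(seed);

        // Both generators yield identical values, so the "second run"
        // just needs the same seed instead of a saved file.
        for (int i = 0; i < 5; i++) {
            double a = firstRun.nextDouble();
            double b = secondRun.nextDouble();
            System.out.println(a + " == " + b);
        }
    }
}
```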
Here is an example:
public static void writeRandomDoublesToFile(String filePath, int numbersCount) throws IOException {
    FileOutputStream fos = new FileOutputStream(new File(filePath));
    BufferedOutputStream bos = new BufferedOutputStream(fos);
    DataOutputStream dos = new DataOutputStream(bos);
    dos.writeInt(numbersCount);
    for (int i = 0; i < numbersCount; i++) {
        dos.writeDouble(Math.random());
    }
    dos.close(); // flushes and closes the underlying streams as well
}

public static double[] readRandomDoublesFromFile(String filePath) throws IOException {
    FileInputStream fis = new FileInputStream(new File(filePath));
    BufferedInputStream bis = new BufferedInputStream(fis);
    DataInputStream dis = new DataInputStream(bis);
    int numbersCount = dis.readInt();
    double[] result = new double[numbersCount];
    for (int i = 0; i < numbersCount; i++) {
        result[i] = dis.readDouble();
    }
    dis.close();
    return result;
}
Well, there's a couple of ways to approach this problem. One of them would be to save the random values to a file and pass that file's name as a parameter to your program.
And you could do that in one of two ways, the first of which would be to use the args[] parameter:
import java.io.*;
import java.util.*;

public class bla {
    public static void main(String[] args) throws FileNotFoundException {
        // You'd need to put some verification code here to make
        // sure that input was actually sent to the program.
        Scanner in = new Scanner(new File(args[0]));
        while (in.hasNextLine()) {
            System.out.println(in.nextLine());
        }
    }
}
Another way would be to use Scanner to read from console input. It's the same code as above, except you'd substitute Scanner in = new Scanner(System.in); for the file-based Scanner (and drop the verification code). Either way, that's just to load the file.
The process of generating those points could be done in the following manner:
import java.util.*;
import java.io.*;

public class generator {
    public static void main(String[] args) throws IOException {
        // You'd get some user input (or not) here
        // that would ask for the file to save to.
        // That could be done with the Scanner class as in the
        // input example above, or via args, but in this case
        // we'll just say:
        String fileName = "somefile.txt";
        FileWriter fstream = new FileWriter(fileName);
        BufferedWriter out = new BufferedWriter(fstream);
        out.write("Stuff");
        out.close();
    }
}
Both of those solutions are simple ways to read and write to and from a file in Java. However, if you deploy either of those solutions, you're still left with some kind of parsing of the data.
If it were me, I'd go for object serialization and store a binary copy of the data structure I've already generated to disk, rather than parsing and re-parsing that information inefficiently. (Text files usually take up more disk space, too.)
And here's how you would do that (Here, I'm going to reuse code that has already been written, and comment on it along the way) Source
You declare some wrapper class that holds data (you don't always have to do this, by the way.)
public class Employee implements java.io.Serializable {
    public String name;
    public String address;
    public transient int SSN;
    public int number;

    public void mailCheck() {
        System.out.println("Mailing a check to " + name + " " + address);
    }
}
And then, to serialize:
import java.io.*;

public class SerializeDemo {
    public static void main(String[] args) {
        Employee e = new Employee();
        e.name = "Reyan Ali";
        e.address = "Phokka Kuan, Ambehta Peer";
        e.SSN = 11122333;
        e.number = 101;
        try {
            FileOutputStream fileOut = new FileOutputStream("employee.ser");
            ObjectOutputStream out = new ObjectOutputStream(fileOut);
            out.writeObject(e);
            out.close();
            fileOut.close();
        } catch (IOException i) {
            i.printStackTrace();
        }
    }
}
And then, to deserialize:
import java.io.*;

public class DeserializeDemo {
    public static void main(String[] args) {
        Employee e = null;
        try {
            FileInputStream fileIn = new FileInputStream("employee.ser");
            ObjectInputStream in = new ObjectInputStream(fileIn);
            e = (Employee) in.readObject();
            in.close();
            fileIn.close();
        } catch (IOException i) {
            i.printStackTrace();
            return;
        } catch (ClassNotFoundException c) {
            System.out.println("Employee class not found");
            c.printStackTrace();
            return;
        }
        System.out.println("Deserialized Employee...");
        System.out.println("Name: " + e.name);
        System.out.println("Address: " + e.address);
        System.out.println("SSN: " + e.SSN);
        System.out.println("Number: " + e.number);
    }
}
Another alternative solution, one that does not involve storing data, is to create a lazy generator for whatever function provides your random values, and supply the same seed each and every time. That way, you don't have to store any data at all. However, that is still (I think) quite a bit slower than serializing the object to disk and loading it back up. (Of course, that's a really subjective claim, and I'm not going to enumerate the cases where it is not true.) The advantage is that it requires no storage whatsoever.
Another way, which you may not have thought of, is to create a wrapper around your generator function that memoizes the output -- meaning that data which has already been generated is retrieved from memory and doesn't have to be generated again for the same inputs. You can find some resources on that here: Memoization source
The idea behind memoizing your function calls is that you save time without persisting to disk. This is ideal if the same values are generated over and over and over again. Of course, for a set of random points, this isn't going to work very well if every point is unique, but keep that in the back of your mind.
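To make the memoization idea concrete, here is a minimal generic memoizer sketch using a HashMap cache (the class and method names are just illustrative):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class Memoizer {
    // Wraps a function so repeated calls with the same input hit the cache
    public static <T, R> Function<T, R> memoize(Function<T, R> fn) {
        Map<T, R> cache = new HashMap<>();
        return input -> cache.computeIfAbsent(input, fn);
    }

    public static void main(String[] args) {
        Function<Integer, Double> slow = n -> {
            System.out.println("computing for " + n);
            return Math.sqrt(n);
        };
        Function<Integer, Double> fast = memoize(slow);

        fast.apply(9); // computes and caches
        fast.apply(9); // served from cache; "computing" prints only once
    }
}
```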
The really interesting part comes when considering the ways that all the previous strategies I've described in this post can be combined together.
It'd be interesting to set up a Memoizer class, like the one described in the second of those resources, and have it implement java.io.Serializable. After that, you could add save(String fileName) and load(String fileName) methods to the memoizer class to make serialization and deserialization easier, so you can persist the cache used to memoize the function. Very useful.
Anyway, enough is enough. In short, just use the same seed value, and generate the same point pairs on the fly.
I have some questions regarding reading and writing to CSV files (or if there is a simpler alternative).
Scenario:
I need to have a simple database of people and some basic information about them. I need to be able to add new entries and search through the file for entries. I also need to be able to find an entry and modify it (i.e change their name or fill in a currently empty field).
Now I'm not sure whether a CSV reader/writer is the best route or not. I wouldn't know where to begin with SQL in Java, but if anyone knows of a good resource for learning that, it would be great.
Currently I am using SuperCSV, I put together a test project based around some example code:
class ReadingObjects {
    // private static UserBean userDB[] = new UserBean[2];
    private static ArrayList<UserBean> arrUserDB = new ArrayList<UserBean>();

    static final CellProcessor[] userProcessors = new CellProcessor[] {
        new StrMinMax(5, 20),
        new StrMinMax(8, 35),
        new ParseDate("dd/MM/yyyy"),
        new Optional(new ParseInt()),
        null
    };

    public static void main(String[] args) throws Exception {
        ICsvBeanReader inFile = new CsvBeanReader(new FileReader("foo.csv"), CsvPreference.EXCEL_PREFERENCE);
        try {
            final String[] header = inFile.getCSVHeader(true);
            UserBean user;
            int i = 0;
            while ((user = inFile.read(UserBean.class, header, userProcessors)) != null) {
                UserBean addMe = new UserBean(user.getUsername(), user.getPassword(), user.getTown(), user.getDate(), user.getZip());
                arrUserDB.add(addMe);
                i++;
            }
        } finally {
            inFile.close();
        }
        for (UserBean currentUser : arrUserDB) {
            if (currentUser.getUsername().equals("Klaus")) {
                System.out.println("Found Klaus! :D");
            }
        }
        WritingMaps.add();
    }
}
And a writer class:
class WritingMaps {
    public static void add() throws Exception {
        ICsvMapWriter writer = new CsvMapWriter(new FileWriter("foo.csv", true), CsvPreference.EXCEL_PREFERENCE);
        try {
            final String[] header = new String[] { "username", "password", "date", "zip", "town" };
            String test = System.getProperty("line.separator"); // note: "separator", not "seperator"
            // set up some data to write
            final HashMap<String, ? super Object> data1 = new HashMap<String, Object>();
            data1.put(header[0], "Karlasa");
            data1.put(header[1], "fdsfsdfsdfs");
            data1.put(header[2], "17/01/2010");
            data1.put(header[3], 1111);
            data1.put(header[4], "New York");
            System.out.println(data1);
            // the actual writing
            // writer.writeHeader(header);
            writer.write(data1, header);
            // writer.write(data2, header);
        } finally {
            writer.close();
        }
    }
}
Issues:
I'm struggling to get the writer to add a new line to the CSV file. This is purely for human readability, so not a big deal.
I'm not sure how I would modify the data in an existing record. (Remove it and add it again? I'm not sure how to do that.)
Thanks.
Have you considered an embedded database like H2, HSQL or SQLite? They can all persist to the filesystem and you'll discover a more flexible datastore with less code.
The easiest solution is to read the file at application startup into an in-memory structure (list of UserBean, for example), to add, remove, modify beans in this in-memory structure, and to write the whole list of UserBean to the file when the app closes, or when the user chooses to Save.
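A minimal sketch of that load-modify-save pattern with plain java.nio (the file name and column layout are assumptions, and Super CSV would replace the naive split/join in real code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class InMemoryCsvDemo {
    public static void main(String[] args) throws IOException {
        Path file = Paths.get("people.csv"); // hypothetical file name

        // Load everything at startup
        List<String> rows = new ArrayList<>(Files.readAllLines(file));

        // Modify an entry in memory: find the row for "Klaus" and change the town
        for (int i = 0; i < rows.size(); i++) {
            String[] fields = rows.get(i).split(",", -1);
            if (fields.length > 2 && fields[0].equals("Klaus")) {
                fields[2] = "Berlin"; // assumes column 2 is the town
                rows.set(i, String.join(",", fields));
            }
        }

        // Write the whole list back when saving (or at app shutdown)
        Files.write(file, rows);
    }
}
```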
Regarding newlines when writing: the javadoc seems to indicate that the writer takes care of that. Just call write for each of your user beans, and the writer will automatically insert newlines between rows.