I'm trying to get the attribute table from ESRI shapefiles in Java, but I only managed to read the header using the GeoTools library. How can I read the attribute records as well?
Here is my source code:
File dbfFile = new File("C:/Users/ilyasse2.0/Desktop/shapefiles/marocShp/mar_admbndp_admALL_unhcr_itos_20201203.dbf");
FileInputStream fis = new FileInputStream(dbfFile);
DbaseFileReader dbfReader = new DbaseFileReader(fis.getChannel(), false, Charset.forName("ISO-8859-1"));
DbaseFileHeader dbfHeader = dbfReader.getHeader();
System.out.println(dbfHeader.getRecordLength());
List<String> names = new ArrayList<String>();
int n = dbfHeader.getNumFields();
for (int i = 0; i < n; i++) {
    names.add(dbfHeader.getFieldName(i));
}
System.out.println(names);
dbfReader.close();
You need to use a ShapefileDataStore to read a shapefile; you can then iterate through the features and extract the attributes. Note: the file name you pass should end in .shp, and GeoTools will find the other necessary files.
List<List<Object>> attributes = new ArrayList<>();
FileDataStore store = FileDataStoreFinder.getDataStore(file);
SimpleFeatureSource source = store.getFeatureSource();
SimpleFeatureType schema = source.getSchema();
SimpleFeatureCollection features = source.getFeatures();
try (SimpleFeatureIterator iterator = features.features()) {
    while (iterator.hasNext()) {
        // copy the attributes of each feature
        SimpleFeature feature = iterator.next();
        attributes.add(feature.getAttributes());
    }
}
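If you also need the column names you were reading from the DBF header, they are available from the schema. A minimal sketch, assuming the schema variable from the snippet above:
// List the attribute (column) names from the schema,
// analogous to reading field names from the DBF header.
List<String> columnNames = new ArrayList<>();
for (AttributeDescriptor descriptor : schema.getAttributeDescriptors()) {
    columnNames.add(descriptor.getLocalName());
}
System.out.println(columnNames);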
I'm trying to get the differences between two JSON arrays stored in different files, and print them to a new file.
What is the best practice for this?
I hope you can help me.
Thanks!
I'm using Eclipse. I've tried Maps.difference, reading the files with a FileReader.
// Reading the files (the original post does not show how the first
// file is read; "../MENU_OLD.json" below is a placeholder path)
File jsonInputFile = new File("../MENU_OLD.json");
InputStream is = new FileInputStream(jsonInputFile);
JsonReader reader = Json.createReader(is);
JsonArray empObj = reader.readArray();
reader.close();
File jsonInputFileMod = new File("../MENU.json");
InputStream isMod = new FileInputStream(jsonInputFileMod);
// Create JsonReader from Json.
JsonReader readerMod = Json.createReader(isMod);
// Get the JsonArray structure from JsonReader.
JsonArray empObjMod = readerMod.readArray();
readerMod.close();
// Creating maps
Map[] mapArray = new Map[empObj.size()];
for (int i = 0; i < empObj.size(); i++) {
    mapArray[i] = (Map) empObj.get(i);
}
Map[] mapArrayMod = new Map[empObjMod.size()];
for (int i = 0; i < empObjMod.size(); i++) {
    mapArrayMod[i] = (Map) empObjMod.get(i);
}
// Comparison
if (mapArray.length == mapArrayMod.length) {
    String[] dif = new String[mapArray.length];
    FileWriter salida = new FileWriter("../diferences.json");
    for (int i = 0; i < mapArray.length; i++) {
        dif[i] = Maps.difference(mapArray[i], mapArrayMod[i]).toString();
        salida.write("\n\n JSON : " + i + "\n\n");
        //salida.write(Maps.difference(mapArray[i], mapArrayMod[i]).toString().replace("[", "\n\t["));
        salida.write(dif[i]);
    }
    salida.close();
}
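For what it's worth, here is a minimal, self-contained sketch of the Guava call mentioned above, assuming the array entries have already been converted to Map<String, Object> instances (the class and method names are made up for illustration):
import com.google.common.collect.MapDifference;
import com.google.common.collect.Maps;
import java.util.Map;

public class JsonDiffExample {

    // Prints which entries differ between two maps parsed from JSON.
    static void printDifference(Map<String, Object> left, Map<String, Object> right) {
        MapDifference<String, Object> diff = Maps.difference(left, right);
        System.out.println("Only in left:  " + diff.entriesOnlyOnLeft());
        System.out.println("Only in right: " + diff.entriesOnlyOnRight());
        System.out.println("Differing:     " + diff.entriesDiffering());
    }
}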
I have used Weka and made a Naive Bayes classifier by using the Weka GUI. Then I saved this model by following this tutorial. Now I want to load this model through Java code, but I am unable to find any way to load a saved model using Weka.
My requirement is that I have to build the model separately and then use it in a separate program.
If anyone can guide me in this regard, I will be thankful to you.
You can easily load a saved model in Java using this command:
Classifier myCls = (Classifier) weka.core.SerializationHelper.read(pathToModel);
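The symmetric call for saving is weka.core.SerializationHelper.write, so a minimal save/load round trip might look like this (myTrainedCls is a placeholder for a classifier you have already built):
// Save a trained classifier to disk...
weka.core.SerializationHelper.write(pathToModel, myTrainedCls);
// ...and load it back later, in a separate program.
Classifier myCls = (Classifier) weka.core.SerializationHelper.read(pathToModel);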
For a complete workflow in Java, I wrote the following article in SO Documentation, now copied here:
Text Classification in Weka
Text Classification with LibLinear
Create training instances from .arff file
private static Instances getDataFromFile(String path) throws Exception {
    DataSource source = new DataSource(path);
    Instances data = source.getDataSet();
    if (data.classIndex() == -1) {
        // use the last attribute as the class index
        data.setClassIndex(data.numAttributes() - 1);
    }
    return data;
}
Instances trainingData = getDataFromFile(pathToArffFile);
Use StringToWordVector to transform your string attributes to a numeric representation:
Important features of this filter:
tf-idf representation
stemming
lowercase words
stopwords
n-gram representation
StringToWordVector filter = new StringToWordVector();
filter.setWordsToKeep(1000000);
if (useIdf) {
    filter.setIDFTransform(true);
}
filter.setTFTransform(true);
filter.setLowerCaseTokens(true);
filter.setOutputWordCounts(true);
filter.setMinTermFreq(minTermFreq);
filter.setNormalizeDocLength(new SelectedTag(StringToWordVector.FILTER_NORMALIZE_ALL, StringToWordVector.TAGS_FILTER));
NGramTokenizer t = new NGramTokenizer();
t.setNGramMaxSize(maxGrams);
t.setNGramMinSize(minGrams);
filter.setTokenizer(t);
WordsFromFile stopwords = new WordsFromFile();
stopwords.setStopwords(new File("data/stopwords/stopwords.txt"));
filter.setStopwordsHandler(stopwords);
if (useStemmer) {
    Stemmer s = new /*Iterated*/LovinsStemmer();
    filter.setStemmer(s);
}
filter.setInputFormat(trainingData);
Apply the filter to trainingData: trainingData = Filter.useFilter(trainingData, filter);
Create the LibLinear Classifier
SVMType 0 below corresponds to the L2-regularized logistic regression
Set setProbabilityEstimates(true) to print the output probabilities
Classifier cls = null;
LibLINEAR liblinear = new LibLINEAR();
liblinear.setSVMType(new SelectedTag(0, LibLINEAR.TAGS_SVMTYPE));
liblinear.setProbabilityEstimates(true);
// liblinear.setBias(1); // default value
cls = liblinear;
cls.buildClassifier(trainingData);
Save model
System.out.println("Saving the model...");
ObjectOutputStream oos;
oos = new ObjectOutputStream(new FileOutputStream(path+"mymodel.model"));
oos.writeObject(cls);
oos.flush();
oos.close();
Create testing instances from .arff file
Instances testingData = getDataFromFile(pathToArffFile);
Load classifier
Classifier myCls = (Classifier) weka.core.SerializationHelper.read(path+"mymodel.model");
Use the same StringToWordVector filter as above, or create a new one for testingData, but remember to use the trainingData for this command: filter.setInputFormat(trainingData); This will make training and testing instances compatible.
Alternatively you could use InputMappedClassifier
Apply the filter to testingData: testingData = Filter.useFilter(testingData, filter);
Classify!
1. Get the class value for every instance in the testing set
for (int j = 0; j < testingData.numInstances(); j++) {
    double res = myCls.classifyInstance(testingData.get(j));
}
res is a double value that corresponds to the nominal class that is defined in the .arff file. To get the nominal class, use: testingData.classAttribute().value((int) res)
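Put together, printing the predicted label for each test instance might look like this sketch, reusing the names defined above:
for (int j = 0; j < testingData.numInstances(); j++) {
    double res = myCls.classifyInstance(testingData.get(j));
    // convert the numeric prediction back to its nominal label
    String label = testingData.classAttribute().value((int) res);
    System.out.println("Instance " + j + " -> " + label);
}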
2. Get the probability distribution for every instance
for (int j = 0; j < testingData.numInstances(); j++) {
    double[] dist = myCls.distributionForInstance(testingData.get(j));
}
dist is a double array that contains the probabilities for every class defined in the .arff file.
Note: the classifier should support probability distributions; for LibLINEAR, enable them with myClassifier.setProbabilityEstimates(true);
I have created a model in Weka using the SMO algorithm. I am trying to evaluate a test sample using the mentioned model to classify it in my two-class problem, but I am a bit confused about how to evaluate the sample using the Weka SMO code. I have built an empty .arff file which contains only the meta-data of the file. I calculate the sample features and add the vector to the .arff file. I have created the following function Evaluate in order to evaluate a sample. The file template.arff is the template which contains the meta-data of an .arff file, and models/smo is my model.
public static void Evaluate(ArrayList<Float> temp) throws Exception {
    temp.add(Float.parseFloat("1"));
    System.out.println(temp.size());
    double dt[] = new double[temp.size()];
    for (int index = 0; index < temp.size(); index++) {
        dt[index] = temp.get(index);
    }
    double data[][] = new double[1][];
    data[0] = dt;
    weka.classifiers.Classifier c = loadModel(new File("models/"), "/smo"); // loads smo model
    File tmp = new File("template.arff"); // loads data template
    Instances dataset = new weka.core.converters.ConverterUtils.DataSource(tmp.getAbsolutePath()).getDataSet();
    int numInstances = data.length;
    for (int inst = 0; inst < numInstances; inst++) {
        dataset.add(new Instance(1.0, data[inst]));
    }
    dataset.setClassIndex(dataset.numAttributes() - 1);
    Evaluation eval = new Evaluation(dataset);
    // returned evaluated index
    double a = eval.evaluateModelOnceAndRecordPrediction(c, dataset.instance(0));
    double arr[] = c.distributionForInstance(dataset.instance(0));
    System.out.println(" Confidence Scores");
    for (int idx = 0; idx < arr.length; idx++) {
        System.out.print(arr[idx] + " ");
    }
    System.out.println();
}
I am not sure if I am right here. I create the sample file, and afterwards I load my model. I am wondering if my code is what I need in order to evaluate the class of the sample temp. If this code is OK, how can I extract the confidence score instead of the binary decision about the class? The structure of the template.arff file is:
@relation Dataset
@attribute Attribute0 numeric
@attribute Attribute1 numeric
@attribute Attribute2 numeric
...
@ATTRIBUTE class {1, 2}
@data
Moreover, the loadModel function is the following:
public static SMO loadModel(File path, String name) throws Exception {
    SMO classifier;
    FileInputStream fis = new FileInputStream(path + name + ".model");
    ObjectInputStream ois = new ObjectInputStream(fis);
    classifier = (SMO) ois.readObject();
    ois.close();
    return classifier;
}
I found this post here which suggests locating the SMO.java file and changing the following line: smo.buildClassifier(train, cl1, cl2, true, -1, -1); // from false to true
However, when I did so, I got the same binary output.
My training function:
public void weka_train(File input, String[] options) throws Exception {
    long start = System.nanoTime();
    File tmp = new File("data.arff");
    TwitterTrendSetters obj = new TwitterTrendSetters();
    Instances data = new weka.core.converters.ConverterUtils.DataSource(
            tmp.getAbsolutePath()).getDataSet();
    data.setClassIndex(data.numAttributes() - 1);
    Classifier c = null;
    String ctype = null;
    boolean newmodel = false;
    ctype = "SMO";
    c = new SMO();
    for (int i = 0; i < options.length; i++) {
        System.out.print(options[i]);
    }
    c.setOptions(options);
    c.buildClassifier(data);
    newmodel = true;
    if (newmodel) {
        obj.saveModel(c, ctype, new File("models"));
    }
}
I have some suggestions, but I have no idea whether they will work. Let me know if this works for you.
First, use SMO, not just the parent Classifier class. I created a new method loadModelSMO as an example of this.
SMO Class
public static SMO loadModelSMO(File path, String name) throws Exception {
    SMO classifier;
    FileInputStream fis = new FileInputStream(path + name + ".model");
    ObjectInputStream ois = new ObjectInputStream(fis);
    classifier = (SMO) ois.readObject();
    ois.close();
    return classifier;
}
and then
SMO c = loadModelSMO(new File("models/"), "/smo");
...
I found an article that might help you out, from a mailing-list thread titled
I used SMO with logistic regression but I always get a confidence of 1.0
It suggests using the -M option to fit your logistic model, which can be set through the method
setOptions(java.lang.String[] options)
Also, you may need to set your classifier to build logistic models to get a confidence score in SMO:
c.setBuildLogisticModels(true);
Let me know if this helped at all.
Basically you should try to use the option "-M" for SMO to fit logistic models during training. Check the solution proposed here. It should work!
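For illustration, a minimal sketch of training SMO with logistic models enabled and reading the class distribution; trainInstances and testInstances are placeholders for your own datasets:
// Train an SMO with logistic models so that distributionForInstance
// returns probability estimates rather than 0/1 votes.
SMO smo = new SMO();
smo.setBuildLogisticModels(true); // equivalent to the -M option
smo.buildClassifier(trainInstances);

// Confidence scores for the first test instance.
double[] dist = smo.distributionForInstance(testInstances.instance(0));
for (double p : dist) {
    System.out.print(p + " ");
}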
I need to export values in 4 columns. Values for 3 of the columns are populating properly.
I am having trouble with the 4th column, the organization column. It is a multi-valued column, i.e. it has multiple values.
I have tried to convert the organization column from Object to String, but it didn't help.
Please see the code below:
String appname = "abc";
String path = "//home/exportfile//";
String filename = path + "ApplicationExport-" + appname + ".txt";
String ret = "false";
QueryOptions ops = new QueryOptions();
Filter[] filters = new Filter[1];
filters[0] = Filter.eq("application.name", appname);
ops.add(filters);
List props = new ArrayList();
props.add("identity.name");
// Do search
Iterator it = context.search(Link.class, ops, props);
// Build file and export header row
BufferedWriter out = new BufferedWriter(new FileWriter(filename));
out.write("Name,UserName,WorkforceID,organization");
out.newLine();
// Iterate search results
if (it != null) {
    while (it.hasNext()) {
        // Get link and create object
        Object[] record = it.next();
        String identityName = (String) record[0];
        Identity user = (Identity) context.getObject(Identity.class, identityName);
        // Get identity attributes for export
        String workforceid = (String) user.getAttribute("workforceID");
        // Get application attributes for export
        String userid = "";
        List links = user.getLinks();
        if (links != null) {
            Iterator lit = links.iterator();
            while (lit.hasNext()) {
                Link l = lit.next();
                String lname = l.getApplicationName();
                if (lname.equalsIgnoreCase(appname)) {
                    userid = (String) l.getAttribute("User Name");
                    List orgList = l.getAttribute("Organization");
                }
            }
        }
        // Output file
        out.write(identityName + "," + userid + "," + workforceid + "," + org);
        out.newLine();
        out.flush();
    }
    ret = "true";
}
// Close file and return
out.close();
return ret;
The output of this code should be, for example:
Name,UserName,WorkforceID,organization
abc,abc,123,xy
qwe,q01,234,xy
Any help correcting this code will be greatly appreciated.
EDIT:
This should give you the output you want:
out.write(identityName + "," + userid + "," + workforceid + "," + Arrays.toString(orgList.toArray()));
You probably want to declare List orgList outside the while loop, since at the moment it is re-created on every iteration. Also, you are using org in the write call, but I haven't seen org declared anywhere else in your code.
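Putting both fixes together, the relevant part of the loop might look like this sketch based on the question's code (the Arrays.toString formatting is just one way to flatten the multi-valued field):
// Declare orgList once, outside the link loop, so its value
// survives until the write call.
List orgList = new ArrayList();
Iterator lit = links.iterator();
while (lit.hasNext()) {
    Link l = (Link) lit.next();
    if (l.getApplicationName().equalsIgnoreCase(appname)) {
        userid = (String) l.getAttribute("User Name");
        orgList = (List) l.getAttribute("Organization");
    }
}
// Write the multi-valued column as a single CSV field.
out.write(identityName + "," + userid + "," + workforceid + ","
        + Arrays.toString(orgList.toArray()));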
I'm trying to open an MS Word 2003 document in Java, search for a specified String, and replace it with a new String. I use Apache POI to do that. My code is like the following:
public void searchAndReplace(String inputFilename, String outputFilename,
        HashMap<String, String> replacements) {
    File outputFile = null;
    File inputFile = null;
    FileInputStream fileIStream = null;
    FileOutputStream fileOStream = null;
    BufferedInputStream bufIStream = null;
    BufferedOutputStream bufOStream = null;
    POIFSFileSystem fileSystem = null;
    HWPFDocument document = null;
    Range docRange = null;
    Paragraph paragraph = null;
    CharacterRun charRun = null;
    Set<String> keySet = null;
    Iterator<String> keySetIterator = null;
    int numParagraphs = 0;
    int numCharRuns = 0;
    String text = null;
    String key = null;
    String value = null;
    try {
        // Create an instance of the POIFSFileSystem class and
        // attach it to the Word document using an InputStream.
        inputFile = new File(inputFilename);
        fileIStream = new FileInputStream(inputFile);
        bufIStream = new BufferedInputStream(fileIStream);
        fileSystem = new POIFSFileSystem(bufIStream);
        document = new HWPFDocument(fileSystem);
        docRange = document.getRange();
        numParagraphs = docRange.numParagraphs();
        keySet = replacements.keySet();
        for (int i = 0; i < numParagraphs; i++) {
            paragraph = docRange.getParagraph(i);
            text = paragraph.text();
            numCharRuns = paragraph.numCharacterRuns();
            for (int j = 0; j < numCharRuns; j++) {
                charRun = paragraph.getCharacterRun(j);
                text = charRun.text();
                System.out.println("Character Run text: " + text);
                keySetIterator = keySet.iterator();
                while (keySetIterator.hasNext()) {
                    key = keySetIterator.next();
                    if (text.contains(key)) {
                        value = replacements.get(key);
                        charRun.replaceText(key, value);
                        docRange = document.getRange();
                        paragraph = docRange.getParagraph(i);
                        charRun = paragraph.getCharacterRun(j);
                        text = charRun.text();
                    }
                }
            }
        }
        bufIStream.close();
        bufIStream = null;
        outputFile = new File(outputFilename);
        fileOStream = new FileOutputStream(outputFile);
        bufOStream = new BufferedOutputStream(fileOStream);
        document.write(bufOStream);
    } catch (Exception ex) {
        System.out.println("Caught an: " + ex.getClass().getName());
        System.out.println("Message: " + ex.getMessage());
        System.out.println("Stacktrace follows.............");
        ex.printStackTrace(System.out);
    }
}
I call this function with the following arguments:
HashMap<String, String> replacements = new HashMap<String, String>();
replacements.put("AAA", "BBB");
searchAndReplace("C:/Test.doc", "C:/Test1.doc", replacements);
When the Test.doc file contains a simple line like "AAA EEE", it works successfully; but when I use a complicated file, it reads the content successfully and generates the Test1.doc file, yet when I try to open it, I get the following error:
Word unable to read this document. It may be corrupt.
Try one or more of the following:
* Open and repair the file.
* Open the file with Text Recovery converter.
(C:\Test1.doc)
Please tell me what to do, because I'm a beginner in POI and I have not found a good tutorial for it.
First of all, you should be closing your document.
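For instance, using the variable names from the question, flushing and closing the output stream right after the write:
document.write(bufOStream);
// Flush and close the output stream so Word sees a complete file.
bufOStream.flush();
bufOStream.close();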
Besides that, what I suggest is resaving your original Word document as a Word XML document, then changing the extension manually from .xml to .doc. Then look at the XML of the actual document you're working with and trace the content to make sure you're not accidentally editing hexadecimal values (AAA and EEE could be hex values in other fields).
Without seeing the actual Word document it's hard to say what's going on.
There is not much documentation about POI at all, especially for Word documents, unfortunately.
I don't know if it's OK to answer myself, but just to share the knowledge, I'll answer my own question.
After searching the web, the final solution I found is:
The library called docx4j is very good for dealing with MS docx files. Although its documentation is not sufficient yet and its forum is still in its early steps, overall it helped me do what I needed.
Thanks to all who helped me.
You could try the OpenOffice API, but there aren't many resources out there to tell you how to use it.
You can also try this one: http://www.dancrintea.ro/doc-to-pdf/
Looks like this could be the issue.