I am reading from ini files and passing their contents via data providers to test cases.
(The data provider reads these files and returns an Ini.Section[][] array. If there are several sections, TestNG runs the test that many times.)
Let's imagine there is a section like this:
[sectionx]
key1=111
key2=222
key3=aaa,bbb,ccc
What I want, in the end, is to read this data and execute the test case three times, each time with a different value of key3, the other keys staying the same.
One way would be to copy and paste the section as many times as needed... which is clearly not an ideal solution.
The way to go about it would seem to be to create further copies of the section and then change the key values to aaa, bbb and ccc. The data provider would return the new array and TestNG would do the rest.
However, I cannot seem to create a new instance of the section object. Ini.Section is actually an interface; the implementing class org.ini4j.BasicProfileSection is not visible. It does not appear to be possible to create a copy of the object, or to inherit from the class. I can only manipulate existing objects of this type, not create new ones. Is there any way around this?
It seems that it is not possible to create copies of sections or of the ini files themselves. I ended up using this workaround:
First, create an 'empty' ini file that will serve as a sort of placeholder. It will look like this:
[env]
test1=1
test2=2
test3=3
[1]
[2]
[3]
...with a sufficiently large number of sections, equal to or greater than the number of sections in the other ini files.
Second, read the data in the data provider. When a key contains several values, create a new Ini object for each value. Each new Ini object must be created from a new file object. (You can read the placeholder file over and over, creating any number of Ini objects.)
Finally, you have to copy the content of the actual ini file into the placeholder file.
The following code works for me:
public static Ini copyIniFile(Ini originalFile) {
    Set<Entry<String, Section>> entries = originalFile.entrySet();
    Ini emptyFile;
    try {
        // Each read of the placeholder file yields a fresh, independent Ini instance.
        FileInputStream file = new FileInputStream(new File(EMPTY_DATA_FILE_NAME));
        emptyFile = new Ini(file);
        file.close();
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    }
    // Copy every section of the original into the placeholder-backed Ini.
    for (Entry<String, Section> entry : entries) {
        String key = entry.getKey();
        Section section = entry.getValue();
        copySection(key, section, emptyFile);
    }
    return emptyFile;
}
public static Ini.Section copySection(String key, Ini.Section origin, Ini destinationFile) {
    // Sections cannot be instantiated directly, so the destination file
    // must already contain a section with this name.
    Ini.Section newSection = destinationFile.get(key);
    if (newSection == null) throw new IllegalArgumentException();
    for (Entry<String, String> entry : origin.entrySet()) {
        newSection.put(entry.getKey(), entry.getValue());
    }
    return newSection;
}
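For completeness, here is a sketch of how a data provider can use this to run the test once per value of a multi-value key. This is only an illustration: expandMultiValueKey and its parameters are made-up names, and the comma split matches the key3=aaa,bbb,ccc example above.

public static Ini.Section[][] expandMultiValueKey(Ini original, String sectionName, String multiKey) {
    // Assumes sectionName exists both in the original and in the placeholder file.
    String[] values = original.get(sectionName).get(multiKey).split(",");
    Ini.Section[][] result = new Ini.Section[values.length][1];
    for (int i = 0; i < values.length; i++) {
        Ini copy = copyIniFile(original);               // fresh copy backed by the placeholder
        copy.get(sectionName).put(multiKey, values[i]); // keep exactly one of the values
        result[i][0] = copy.get(sectionName);
    }
    return result;
}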
I have a countries Map with the following design:
England=24
Spain=21
Italy=10
etc
Then, I have a different citiesMap with the following design:
London=10
Manchester=5
Madrid=7
Barcelona=4
Roma=3
etc
Currently, I am printing these results on screen:
System.out.println("\nCountries:");
Map<String, Long> countryMap = countTotalResults(orderDataList, OrderData::getCountry);
writeResultInCsv(countryMap);
countryMap.entrySet().stream().forEach(System.out::println);
System.out.println("\nCities:\n");
Map<String, Long> citiesMap = countTotalResults(orderDataList, OrderData::getCity);
writeResultInCsv(citiesMap);
citiesMap.entrySet().stream().forEach(System.out::println);
I want to write each line of my 2 maps in the same CSV file. I have the following code:
public void writeResultInCsv(Map<String, Long> resultMap) throws Exception {
    File csvOutputFile = new File(RUTA_FICHERO_RESULTADO);
    try (PrintWriter pw = new PrintWriter(csvOutputFile)) {
        resultMap.entrySet().stream()
                .map(this::convertToCSV)
                .forEach(pw::println);
    }
}
public String convertToCSV(String[] data) {
    return Stream.of(data)
            .map(this::escapeSpecialCharacters)
            .collect(Collectors.joining("="));
}

public String escapeSpecialCharacters(String data) {
    String escapedData = data.replaceAll("\\R", " ");
    if (data.contains(",") || data.contains("\"") || data.contains("'")) {
        data = data.replace("\"", "\"\"");
        escapedData = "\"" + data + "\"";
    }
    return escapedData;
}
But I get a compilation error in the writeResultInCsv method, on the following line:
.map(this::convertToCSV)
This is the compilation error I get:
reason: Incompatible types: Entry is not convertible to String[]
How can I write this result to a CSV file in Java 8 in a simple way?
This is the result and design that I want my CSV file to have:
Countries:
England=24
Spain=21
Italy=10
etc
Cities:
London=10
Manchester=5
Madrid=7
Barcelona=4
Roma=3
etc
Your resultMap.entrySet() is a Set<Map.Entry<String, Long>>. You then turn that into a Stream<Map.Entry<String, Long>> and run .map on it. Thus, the mapper you provide there needs to map objects of type Map.Entry<String, Long> to whatever you like. But you pass it the convertToCSV method, which maps string arrays.
Your code tries to join on comma (Collectors.joining(",")), but your desired output contains zero commas.
It feels like one of two things is going on:
Either you copy/pasted this code from someplace, or it was provided to you, and you have no idea what any of it does. I would advise tearing this code into pieces: take each individual piece, experiment with it until you understand it, then put it back together again, and now you know what you're looking at. At that point you would know that having Collectors.joining(",") in this makes no sense whatsoever, and that you're trying to map an entry of String, Long using a mapping function that maps string arrays, which obviously doesn't work.
Or you know all this stuff but haven't bothered to actually look at your code. That seems a bit surprising, so I don't think this is it. But if it is, the code you have is so unrelated to the job you want to do that you might as well remove your code entirely and turn this question into: "I have this. I want this. How do I do it?"
NB: A text file listing key=value pairs is not usually called a CSV file.
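That said, here is a minimal sketch of the entry-based mapper the first point describes, reusing escapeSpecialCharacters and RUTA_FICHERO_RESULTADO from the question:

public void writeResultInCsv(Map<String, Long> resultMap) throws Exception {
    File csvOutputFile = new File(RUTA_FICHERO_RESULTADO);
    try (PrintWriter pw = new PrintWriter(csvOutputFile)) {
        resultMap.entrySet().stream()
                // Map each entry straight to a "key=value" line.
                .map(e -> escapeSpecialCharacters(e.getKey()) + "=" + e.getValue())
                .forEach(pw::println);
    }
}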
Consider the following class definition for an item sold at a supermarket / food shop:
There is a binary data file named “objects.dat” containing the details of 5 objects of type GroceryItem that were previously in memory before being saved directly to the data file.
Write code for a method named processFiles which will open the “objects.dat” file, read in the 5 individual GroceryItem objects, placing them into an ArrayList. Then, it will create a text file named “report.txt”, and write the barcode, name, and price of each GroceryItem out to the file, one GroceryItem per line. Include appropriate exception handling code, to display user-friendly messages when things go wrong.
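A minimal sketch of such a method, assuming GroceryItem implements Serializable and exposes getBarcode(), getName() and getPrice() accessors (those names are assumptions, since the class definition is not reproduced here):

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;

public static void processFiles() {
    List<GroceryItem> items = new ArrayList<>();
    // Read the five serialized objects back in the order they were written.
    try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("objects.dat"))) {
        for (int i = 0; i < 5; i++) {
            items.add((GroceryItem) in.readObject());
        }
    } catch (IOException | ClassNotFoundException e) {
        System.out.println("Sorry, the grocery items could not be read: " + e.getMessage());
        return;
    }
    // Write one line per item to the report file.
    try (PrintWriter out = new PrintWriter("report.txt")) {
        for (GroceryItem item : items) {
            out.println(item.getBarcode() + " " + item.getName() + " " + item.getPrice());
        }
    } catch (FileNotFoundException e) {
        System.out.println("Sorry, the report could not be written: " + e.getMessage());
    }
}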
I understand from your question that you want to read binary data and store it in an array.
public static void main(String[] args) {
    Path testPath = Paths.get("D:/test", "test.txt");
    try {
        byte[] testArray = Files.readAllBytes(testPath);
        String wikiString = new String(testArray, "ISO-8859-1");
        // Print the decoded text rather than the array reference.
        System.out.println(wikiString);
    } catch (IOException io) {
        System.out.println(io);
    }
}
You want to populate an associative array in order to perform a map-side join. You’ve decided to
put this information in a text file, place that file into the DistributedCache and read it in your
Mapper before any records are processed.
Identify which method in the Mapper you should use to implement code for reading the file and
populating the associative array.
map or configure?
I believe you're looking for the setup() method.
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Mapper.html#setup%28org.apache.hadoop.mapreduce.Mapper.Context%29
It gets called once at the initialization of each mapper task. So if there's anything you want to do before the map task starts to read the key/value pairs through the map method (such as, in your question, to read a file off the distributed cache and populate some member vars with the info), then that is the place to do it.
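A sketch of what that could look like with the new mapreduce API; the tab-separated key/value layout of the cached file is an assumption for illustration:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {
    private final Map<String, String> joinTable = new HashMap<String, String>();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Runs once per mapper task, before any calls to map().
        Path[] cacheFiles = DistributedCache.getLocalCacheFiles(context.getConfiguration());
        if (cacheFiles != null && cacheFiles.length > 0) {
            BufferedReader reader = new BufferedReader(new FileReader(cacheFiles[0].toString()));
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split("\t", 2);
                    if (parts.length == 2) {
                        joinTable.put(parts[0], parts[1]);
                    }
                }
            } finally {
                reader.close();
            }
        }
    }

    // map() can now look each record up in joinTable to perform the map-side join.
}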
class IndexMapperExample implements Mapper {
    void configure(JobConf conf) {
        try {
            String stopwordCacheName = new Path(HDFS_STOPWORD_LIST).getName();
            Path[] cacheFiles = DistributedCache.getLocalCacheFiles(conf);
            if (null != cacheFiles && cacheFiles.length > 0) {
                for (Path cachePath : cacheFiles) {
                    if (cachePath.getName().equals(stopwordCacheName)) {
                        loadStopWords(cachePath);
                        break;
                    }
                }
            }
        } catch (IOException ioe) {
            System.err.println("IOException reading from distributed cache");
            System.err.println(ioe.toString());
        }
    }
}
In the above code snippet, the distributed cache file gets read in the configure method, which is the old mapred API's counterpart of setup.
Currently, I'm copying one instance at a time from one dataset to the other. Is there a way to do this so that string mappings remain intact? The mergeInstances works horizontally, is there an equivalent vertical merge?
This is one step of a loop I use to read datasets of the same structure from multiple arff files into one large dataset. There has got to be a simpler way.
Instances iNew = new ConverterUtils.DataSource(name).getDataSet();
for (int i = 0; i < iNew.numInstances(); i++) {
    Instance nInst = iNew.instance(i);
    inst.add(nInst);
}
If you want a fully automated method that also copies string and nominal attributes properly, you can use the following function:
public static Instances merge(Instances data1, Instances data2)
    throws Exception
{
    // Check where the string and nominal attributes are
    int asize = data1.numAttributes();
    boolean[] strings_pos = new boolean[asize];
    for (int i = 0; i < asize; i++) {
        Attribute att = data1.attribute(i);
        strings_pos[i] = ((att.type() == Attribute.STRING) ||
                          (att.type() == Attribute.NOMINAL));
    }

    // Create a new dataset seeded with data1's header and instances
    Instances dest = new Instances(data1);
    dest.setRelationName(data1.relationName() + "+" + data2.relationName());

    DataSource source = new DataSource(data2);
    Instances instances = source.getStructure();
    Instance instance = null;
    while (source.hasMoreElements(instances)) {
        instance = source.nextElement(instances);
        dest.add(instance);
        // Copy string and nominal values by their text so that the
        // destination header's value indices stay consistent
        for (int i = 0; i < asize; i++) {
            if (strings_pos[i]) {
                dest.instance(dest.numInstances() - 1)
                    .setValue(i, instance.stringValue(i));
            }
        }
    }
    return dest;
}
Please note that the following conditions should hold (they are not checked in the function; a sketch of such a check follows the list):
Datasets must have the same attributes structure (number of attributes, type of attributes)
Class index has to be the same
Nominal values have to exactly correspond
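A quick structural guard using Instances.equalHeadersMsg, the same check the append code further down relies on (available in recent Weka versions), might look like:

String msg = data1.equalHeadersMsg(data2);
if (msg != null) {
    throw new IllegalArgumentException("Incompatible datasets: " + msg);
}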
To modify the values of the nominal attributes of data2 on the fly so they match the ones in data1, you can use:

data2.renameAttributeValue(
    data2.attribute("att_name_in_data2"),
    "att_value_in_data2",
    "att_value_in_data1");
Why not make a new ARFF file which has the data from both of the originals? A simple
cat 1.arff > tmp.arff
tail -n+20 2.arff >> tmp.arff
where 20 is replaced by however many lines long your arff header is. This would then produce a new arff file with all of the desired instances, and you could read this new file with your existing code:
Instances iNew = new ConverterUtils.DataSource(name).getDataSet();
You could also invoke weka on the command line using this documentation: http://old.nabble.com/how-to-merge-two-data-file-a.arff-and-b.arff-into-one-data-list--td22890856.html
java weka.core.Instances append filename1 filename2 > output-file
However, there is no function in the documentation http://weka.sourceforge.net/doc.dev/weka/core/Instances.html#main%28java.lang.String which will allow you to append multiple arff files natively within your java code. As of Weka 3.7.6, the code that appends two arff files is this:
// read two files, append them and print result to stdout
else if ((args.length == 3) && (args[0].toLowerCase().equals("append"))) {
    DataSource source1 = new DataSource(args[1]);
    DataSource source2 = new DataSource(args[2]);
    String msg = source1.getStructure().equalHeadersMsg(source2.getStructure());
    if (msg != null)
        throw new Exception("The two datasets have different headers:\n" + msg);
    Instances structure = source1.getStructure();
    System.out.println(source1.getStructure());
    while (source1.hasMoreElements(structure))
        System.out.println(source1.nextElement(structure));
    structure = source2.getStructure();
    while (source2.hasMoreElements(structure))
        System.out.println(source2.nextElement(structure));
}
Thus it looks like Weka itself simply iterates through all of the instances in a data set and prints them, the same process your code uses.
Another possible solution is to use addAll from java.util.AbstractCollection, since Instances implements it.
instances1.addAll(instances2);
I've just shared an extended weka.core.Instances class with methods like innerJoin, leftJoin, fullJoin, update and union.

table1.makeIndex(table1.attribute("Continent_ID"));
table2.makeIndex(table2.attribute("Continent_ID"));
Instances result = table1.leftJoin(table2);

Instances can have different numbers of attributes; levels of NOMINAL and STRING variables are merged together if necessary.
Sources and some examples are here on GitHub: weka.join.
OK, this method reads a directory, verifies the file paths are OK, and then passes each file to a method and updates a Map object.
But how can I explain this for Javadoc? I want to create Javadoc, and I need to know how to describe this method for documentation purposes. Please tell me; if you can help me with this example, I can do the same for my whole project. Thank you:
private void chckDir() {
    File[] files = Dir.listFiles();
    if (files == null) {
        System.out.println("Error");
        break;
    }
    for (int i = 0; i < files.length; i++) {
        File file = new File(files[i].getAbsoluteFile().toString());
        Map = getMap(file);
    }
}
Your method doesn't do what you said in your first sentence (it doesn't verify file paths, and it throws the result of getMap() away), but there's nothing wrong with putting that kind of sentence in the Javadoc.
There are some issues with your code:
The break statement will give a compilation error, I think. It should be a return.
It is bad style to name a field with a capital letter as the first character. If Dir and Map are field names, they should be dir and map respectively.
The statement Map = getMap(file); is going to repeatedly replace the Map field, and when you exit the loop, the field will refer to the object returned by the last getMap call. This is probably wrong.
Finally, change the file declaration as follows. (There is no need to create a new File object, because getAbsoluteFile() returns a File.)
File file = files[i].getAbsoluteFile();
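To come back to the documentation question: once those fixes are in, a Javadoc comment for the method could look something like this (a sketch only; adjust it to what the method actually ends up doing, and note it assumes the fields have been renamed to dir and map):

/**
 * Reads every file in the {@code dir} directory and passes each one to
 * {@link #getMap(File)}, storing the result in the {@code map} field.
 * <p>
 * If the directory cannot be listed, an error message is printed and the
 * method returns without changing {@code map}.
 */
private void chckDir() {
    // ...
}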