creating tree in recursive function - java

I am trying to implement the C4.5 algorithm in Java. To get an initial idea of the algorithm, I used some Python code from this link as a reference. That project has a file named mine.py, which contains the following function.
def mine_c45(table, result):
    """ An entry point for C45 algorithm.
    _table_ - a dict representing data table in the following format:
    {
        '<column name>': [<column values>],
        '<column name>': [<column values>],
        ...
    }
    _result_: a string representing a name of column indicating a result.
    """
    col = max([(k, gain(table, k, result)) for k in table.keys() if k != result],
              key=lambda x: x[1])[0]
    tree = []
    for subt in get_subtables(table, col):
        v = subt[col][0]
        if is_mono(subt[result]):
            tree.append(['%s=%s' % (col, v),
                         '%s=%s' % (result, subt[result][0])])
        else:
            del subt[col]
            tree.append(['%s=%s' % (col, v)] + mine_c45(subt, result))
    return tree
Using the code at that link, I converted this function to Java with some modifications. I get the output I want, but I am not able to build the tree recursively.
Here is the code converted to Java.
public void mineC45(Map<String, Attribute> table, String result) {
    int maxGain = 0;
    double[] gains = new double[table.size()];
    int counter = 0;
    SplitPoints point = null;
    for (Entry<String, Attribute> entry : table.entrySet()) {
        if (!entry.getKey().equals(result)) {
            boolean nominal = entry.getValue().isNominal();
            if (nominal)
                gains[counter++] = Utils
                        .gain(table, entry.getKey(), result);
            else {
                point = Utils.numericGain(table, entry.getKey(), result);
                gains[counter++] = point.getGain();
            }
        }
    }
    // calculate maximum gain column index
    maxGain = Utils.getMax(gains);
    List<String> keys = new ArrayList<String>(table.keySet());
    String column = keys.get(maxGain);
    if (table.get(column).isNominal()) {
        for (Map<String, Attribute> subTable : Utils.createSubTables(table,
                column)) {
            String value = subTable.get(column).getValues().get(0);
            if (Utils.isMono(subTable.get(result))) {
                System.out.println("\t" + column + " = " + value + " "
                        + result + " = "
                        + subTable.get(result).getValues().get(0));
            } else {
                subTable.remove(column);
                System.out.println(column + " = " + value + " ");
                mineC45(subTable, result);
            }
        }
    } else {
        boolean first = true;
        for (Map<String, Attribute> subTable : Utils.createNSubtables(
                table, column, result, point.getSplitValue())) {
            String sign = "";
            sign = first ? "<=" : ">";
            first = false;
            if (Utils.isMono(subTable.get(result))) {
                System.out.println("\t" + column + " "
                        + point.getSplitValue().toString() + " " + result
                        + " = " + subTable.get(result).getValues().get(0));
            } else {
                subTable.remove(column);
                System.out.println(column + " "
                        + point.getSplitValue().toString() + " ");
                mineC45(subTable, result);
            }
        }
    }
}
Here I created a Map<String, Attribute> that represents a table: the String key is the column name and the Attribute stores the list of values. Can anyone explain how I could turn the output into a tree so that I can form rules?
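One possible approach, sketched under assumptions: have the recursive method return the branches it finds instead of printing them, the way the Python version returns a list. The Node class, the row-based table layout, and the method names below are hypothetical and not taken from the original code. To stay self-contained, the sketch splits on the first remaining column rather than the maximum-gain column, and it uses Map.of and List.of, so it assumes Java 9 or later.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TreeSketch {

    // Hypothetical node type; not part of the original code.
    static class Node {
        final String label; // e.g. "outlook = sunny" or "play = no"
        final List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
    }

    // Returns the branches for this table instead of printing them.
    // Simplification: splits on the first remaining column, not the max-gain column.
    static List<Node> mine(List<Map<String, String>> rows, List<String> columns, String result) {
        List<Node> branches = new ArrayList<>();
        if (columns.isEmpty()) {
            return branches; // nothing left to split on
        }
        String column = columns.get(0);
        List<String> rest = columns.subList(1, columns.size());

        // Group rows by their value in the chosen column, keeping first-seen order.
        Map<String, List<Map<String, String>>> groups = new LinkedHashMap<>();
        for (Map<String, String> row : rows) {
            groups.computeIfAbsent(row.get(column), k -> new ArrayList<>()).add(row);
        }

        for (Map.Entry<String, List<Map<String, String>>> g : groups.entrySet()) {
            Node branch = new Node(column + " = " + g.getKey());
            Set<String> outcomes = new LinkedHashSet<>();
            for (Map<String, String> row : g.getValue()) {
                outcomes.add(row.get(result));
            }
            if (outcomes.size() == 1) {
                // Pure subtable: attach a leaf holding the result.
                branch.children.add(new Node(result + " = " + outcomes.iterator().next()));
            } else {
                // Mixed subtable: the recursive call supplies this branch's children.
                // (If no columns remain, the branch stays childless; a fuller
                // implementation would pick a majority class here.)
                branch.children.addAll(mine(g.getValue(), rest, result));
            }
            branches.add(branch);
        }
        return branches;
    }

    // Turns every root-to-leaf path into a rule such as "a AND b => c".
    static void rules(Node node, String path, List<String> out) {
        if (node.children.isEmpty()) {
            out.add(path + " => " + node.label);
            return;
        }
        String next = path.isEmpty() ? node.label : path + " AND " + node.label;
        for (Node child : node.children) {
            rules(child, next, out);
        }
    }

    // Builds a tree from three toy rows and returns the extracted rules.
    static List<String> demo() {
        List<Map<String, String>> rows = new ArrayList<>();
        rows.add(Map.of("outlook", "sunny", "windy", "no", "play", "yes"));
        rows.add(Map.of("outlook", "sunny", "windy", "yes", "play", "no"));
        rows.add(Map.of("outlook", "rainy", "windy", "no", "play", "no"));
        List<String> out = new ArrayList<>();
        for (Node branch : mine(rows, List.of("outlook", "windy"), "play")) {
            rules(branch, "", out);
        }
        return out;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In mineC45 the same idea would mean changing the return type from void to a list of nodes, creating a node where each System.out.println currently sits, and adding the result of the recursive call as that node's children.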

Related

Map some names and values using java

I have a set of values in a response like this:
4,0,1581664239228,6,799,0,845,253,0,0,0,0,0,0,0,0,0,0,1448,594,0,1276257,0,0,0,0,1100,0,0,0,0,0,0,0,2047,2158,0,13,1
I have to map these values to the names below. The order should be the same, e.g. version: 4, build: 0, tuneStartBaseUTCMS: 1581664239228, and so on.
version,build,tuneStartBaseUTCMS,ManifestDLStartTime,ManifestDLTotalTime,ManifestDLFailCount,VideoPlaylistDLStartTime,VideoPlaylistDLTotalTime,VideoPlaylistDLFailCount,AudioPlaylistDLStartTime,AudioPlaylistDLTotalTime,AudioPlaylistDLFailCount,VideoInitDLStartTime,VideoInitDLTotalTime,VideoInitDLFailCount,AudioInitDLStartTime,AudioInitDLTotalTime,AudioInitDLFailCount,VideoFragmentDLStartTime,VideoFragmentDLTotalTime,VideoFragmentDLFailCount,VideoBitRate,AudioFragmentDLStartTime,AudioFragmentDLTotalTime,AudioFragmentDLFailCount,AudioBitRate,drmLicenseAcqStartTime,drmLicenseAcqTotalTime,drmFailErrorCode,LicenseAcqPreProcessingDuration,LicenseAcqNetworkDuration,LicenseAcqPostProcDuration,VideoFragmentDecryptDuration,AudioFragmentDecryptDuration,gstPlayStartTime,gstFirstFrameTime,contentType,streamType,firstTune
I have written the following, but it is not working as expected.
String abcd = "4,0,1581664239228,6,799,0,845,253,0,0,0,0,0,0,0,0,0,0,1448,594,0,1276257,0,0,0,0,1100,0,0,0,0,0,0,0,2047,2158,0,13,1";
String valueName = "version,build,tuneStartBaseUTCMS,ManifestDLStartTime,ManifestDLTotalTime,ManifestDLFailCount,VideoPlaylistDLStartTime,VideoPlaylistDLTotalTime,VideoPlaylistDLFailCount,AudioPlaylistDLStartTime,AudioPlaylistDLTotalTime,AudioPlaylistDLFailCount,VideoInitDLStartTime,VideoInitDLTotalTime,VideoInitDLFailCount,AudioInitDLStartTime,AudioInitDLTotalTime,AudioInitDLFailCount,VideoFragmentDLStartTime,VideoFragmentDLTotalTime,VideoFragmentDLFailCount,VideoBitRate,AudioFragmentDLStartTime,AudioFragmentDLTotalTime,AudioFragmentDLFailCount,AudioBitRate,drmLicenseAcqStartTime,drmLicenseAcqTotalTime,drmFailErrorCode,LicenseAcqPreProcessingDuration,LicenseAcqNetworkDuration,LicenseAcqPostProcDuration,VideoFragmentDecryptDuration,AudioFragmentDecryptDuration,gstPlayStartTime,gstFirstFrameTime,contentType,streamType,firstTune";
String[] valueArr = abcd.split(",");
String[] valueNameArr = valueName.split(",");
List<String> valueList = Arrays.asList(valueArr);
List<String> valueNameList = Arrays.asList(valueNameArr);
System.out.println(valueList.size() + "jjj: " + "valueNameList::: " + valueNameList.size());
LinkedHashMap<String, String> result = new LinkedHashMap<String, String>();
for (String name : valueNameList) {
    System.out.println("name: " + name);
    for (String value : valueList) {
        System.out.println("value: " + value);
        result.put(name, value);
    }
}
System.out.println("RESULT::::::::::::::::::::::::::::" + result);
Result prints:
{version=1, build=1, tuneStartBaseUTCMS=1, ManifestDLStartTime=1, ManifestDLTotalTime=1, ManifestDLFailCount=1, VideoPlaylistDLStartTime=1, VideoPlaylistDLTotalTime=1, VideoPlaylistDLFailCount=1, AudioPlaylistDLStartTime=1, AudioPlaylistDLTotalTime=1, AudioPlaylistDLFailCount=1, VideoInitDLStartTime=1, VideoInitDLTotalTime=1, VideoInitDLFailCount=1, AudioInitDLStartTime=1, AudioInitDLTotalTime=1, AudioInitDLFailCount=1, VideoFragmentDLStartTime=1, VideoFragmentDLTotalTime=1, VideoFragmentDLFailCount=1, VideoBitRate=1, AudioFragmentDLStartTime=1, AudioFragmentDLTotalTime=1, AudioFragmentDLFailCount=1, AudioBitRate=1, drmLicenseAcqStartTime=1, drmLicenseAcqTotalTime=1, drmFailErrorCode=1, LicenseAcqPreProcessingDuration=1, LicenseAcqNetworkDuration=1, LicenseAcqPostProcDuration=1, VideoFragmentDecryptDuration=1, AudioFragmentDecryptDuration=1, gstPlayStartTime=1, gstFirstFrameTime=1, contentType=1, streamType=1, firstTune=1}
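Your loop is wrong: the inner loop overwrites each name with every value in turn, so every key ends up holding the last value.
Try this:
for (int i = 0; i < valueList.size(); i++) {
    result.put(valueNameList.get(i), valueList.get(i));
}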
Isn't there supposed to be a one-to-one relationship between the abcd values and the valueName entries? If there is, then the inner loop is wrong.
String abcd = "4,0,1581664239228,6,799,0,845,253,0,0,0,0,0,0,0,0,0,0,1448,594,0,1276257,0,0,0,0,1100,0,0,0,0,0,0,0,2047,2158,0,13,1";
String valueName = "version,build,tuneStartBaseUTCMS,ManifestDLStartTime,ManifestDLTotalTime,ManifestDLFailCount,VideoPlaylistDLStartTime,VideoPlaylistDLTotalTime,VideoPlaylistDLFailCount,AudioPlaylistDLStartTime,AudioPlaylistDLTotalTime,AudioPlaylistDLFailCount,VideoInitDLStartTime,VideoInitDLTotalTime,VideoInitDLFailCount,AudioInitDLStartTime,AudioInitDLTotalTime,AudioInitDLFailCount,VideoFragmentDLStartTime,VideoFragmentDLTotalTime,VideoFragmentDLFailCount,VideoBitRate,AudioFragmentDLStartTime,AudioFragmentDLTotalTime,AudioFragmentDLFailCount,AudioBitRate,drmLicenseAcqStartTime,drmLicenseAcqTotalTime,drmFailErrorCode,LicenseAcqPreProcessingDuration,LicenseAcqNetworkDuration,LicenseAcqPostProcDuration,VideoFragmentDecryptDuration,AudioFragmentDecryptDuration,gstPlayStartTime,gstFirstFrameTime,contentType,streamType,firstTune";
String[] list1 = abcd.split(",");
String[] list2 = valueName.split(",");
if (list1.length == list2.length) {
    for (int x = 0; x < list1.length; x++) {
        System.out.println(list2[x] + ":" + list1[x]);
    }
}
Simply split and iterate.
Result:
version:4
build:0
tuneStartBaseUTCMS:1581664239228
ManifestDLStartTime:6
ManifestDLTotalTime:799
ManifestDLFailCount:0
VideoPlaylistDLStartTime:845
VideoPlaylistDLTotalTime:253
VideoPlaylistDLFailCount:0
AudioPlaylistDLStartTime:0
AudioPlaylistDLTotalTime:0
AudioPlaylistDLFailCount:0
VideoInitDLStartTime:0
VideoInitDLTotalTime:0
VideoInitDLFailCount:0
AudioInitDLStartTime:0
AudioInitDLTotalTime:0
AudioInitDLFailCount:0
VideoFragmentDLStartTime:1448
VideoFragmentDLTotalTime:594
VideoFragmentDLFailCount:0
VideoBitRate:1276257
AudioFragmentDLStartTime:0
AudioFragmentDLTotalTime:0
AudioFragmentDLFailCount:0
AudioBitRate:0
drmLicenseAcqStartTime:1100
drmLicenseAcqTotalTime:0
drmFailErrorCode:0
LicenseAcqPreProcessingDuration:0
LicenseAcqNetworkDuration:0
LicenseAcqPostProcDuration:0
VideoFragmentDecryptDuration:0
AudioFragmentDecryptDuration:0
gstPlayStartTime:2047
gstFirstFrameTime:2158
contentType:0
streamType:13
firstTune:1
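A minimal sketch that combines the index-based loop with the length check, assuming the goal is a LinkedHashMap that keeps the names in their original order. The class name ZipDemo and the method zip are hypothetical, and only the first three columns are used to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ZipDemo {

    // Pairs the i-th name with the i-th value; LinkedHashMap preserves insertion order.
    static Map<String, String> zip(String names, String values) {
        String[] nameArr = names.split(",");
        String[] valueArr = values.split(",");
        if (nameArr.length != valueArr.length) {
            throw new IllegalArgumentException(
                    "Expected " + nameArr.length + " values but got " + valueArr.length);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < nameArr.length; i++) {
            result.put(nameArr[i], valueArr[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {version=4, build=0, tuneStartBaseUTCMS=1581664239228}
        System.out.println(zip("version,build,tuneStartBaseUTCMS", "4,0,1581664239228"));
    }
}
```

Note that split(",") drops trailing empty strings, so a response ending in a comma would fail the length check; split(",", -1) keeps them if empty trailing fields are possible.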

Arraylist find the count of consecutive duplicate elements

I am trying to find the count of consecutive repeated elements in an ArrayList.
For example, if the list named "answerSheerPacketList" contains values like {20,20,30,40,40,20,20,20}, I need to show output like {20=2,30=1,40=2,20=3}.
Map<String, Integer> hm = new HashMap<String, Integer>();
for (String a : answerSheerPacketList) {
    Integer j = hm.get(a);
    hm.put(a, (j == null) ? 1 : j + 1);
}
// displaying the occurrence of elements in the arraylist
for (Map.Entry<String, Integer> val : hm.entrySet()) {
    System.out.println("Element " + val.getKey() + " "
            + "occurs" + ": " + val.getValue() + " times");
}
When I executed the above code I got output like {20=5,30=1,40=2}, but I am trying to get output like {20=2,30=1,40=2,20=3}.
A simple approach here would be to just iterate the arraylist once, and then keep tallies as we go along:
List<Integer> list = new ArrayList<>();
list.add(20);
list.add(20);
list.add(30);
list.add(40);
list.add(40);
list.add(20);
list.add(20);
list.add(20);
Integer curr = null;
int count = 0;
System.out.print("{");
for (int val : list) {
    if (curr == null) {
        curr = val;
        count = 1;
    }
    else if (curr != val) {
        System.out.print("(" + curr + ", " + count + ")");
        curr = val;
        count = 1;
    }
    else {
        ++count;
    }
}
System.out.print("(" + curr + ", " + count + ")");
System.out.print("}");
{(20, 2)(30, 1)(40, 2)(20, 3)}
This is a classic problem of counting runs of consecutive elements in an array. I have renamed the array to arr in the code for brevity.
int run = 1;
for (int i = 0; i < n; ++i) { // n is the size of array
    if (i + 1 < n && arr[i] == arr[i + 1]) {
        run++; // increment run if consecutive elements are equal
    } else {
        System.out.println(arr[i] + "=" + run + ", ");
        run = 1; // reset run if they are not equal
    }
}
Performance-wise, this approach is asymptotically optimal and runs in O(n), where n is the number of elements in the array.
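The run-counting idea can also be sketched as a reusable method that returns the runs instead of printing them. This is only an illustration: the class name RunLengthDemo and the method runs are hypothetical, and the sketch uses Objects.equals so it works for a List of Integer or String rather than a primitive array.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class RunLengthDemo {

    // Returns one (value, length) entry per run of consecutive equal elements.
    static <T> List<Map.Entry<T, Integer>> runs(List<T> values) {
        List<Map.Entry<T, Integer>> out = new ArrayList<>();
        int i = 0;
        while (i < values.size()) {
            T current = values.get(i);
            int j = i;
            // Advance j to the first index whose element differs from current.
            while (j < values.size() && Objects.equals(values.get(j), current)) {
                j++;
            }
            out.add(new SimpleEntry<>(current, j - i));
            i = j; // the next run starts where this one ended
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [20=2, 30=1, 40=2, 20=3]
        System.out.println(runs(Arrays.asList(20, 20, 30, 40, 40, 20, 20, 20)));
    }
}
```

A list of entries is used rather than a Map because the same value (20 here) can start more than one run, and a Map cannot hold the same key twice.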
// Counts total occurrences per distinct value (not consecutive runs)
Set<Integer> distinctSet = new HashSet<>(answerSheerPacketList);
Map<Integer, Integer> elementCountSet = new HashMap<>();
for (Integer element : distinctSet) {
    elementCountSet.put(element, Collections.frequency(answerSheerPacketList, element));
}
If total counts per value are enough, this is basically frequency counting. The following code does it with a single pass through your answerSheerPacketList array (it counts all occurrences of a value, not consecutive runs):
int[] answerSheerPacketList = // initialization
Map<Integer, Integer> frequencyCount = new LinkedHashMap<>();
for (int i : answerSheerPacketList) {
    Integer key = Integer.valueOf(i);
    if (frequencyCount.containsKey(key)) {
        frequencyCount.put(key, Integer.valueOf(frequencyCount.get(key) + 1));
    } else {
        frequencyCount.put(key, Integer.valueOf(1));
    }
}
for (Integer key : frequencyCount.keySet()) {
    System.out.println("Element " + key + " occurs: " + frequencyCount.get(key)
            + " times");
}

Take a list/array of names and count the number of times each unique name is listed

The input for this code is:
"John, Mary, Joe, John, John, John, Mary, Mary, Steve."
My goal is to print out:
"(Name) got (# of votes) votes."
Ending with a statement of the winner.
I can't seem to debug my code though. This is my code:
static void popularity_contest(List<String> name_list) {
largest_count = "";
largest_count = 0;
int n = name_list.size();
int count = 1;
int y = sorted(name_list);
for(i=1; i<name_list.length; i++){
if (n[i] == n[i-1]){
count += 1;
}
else
{
name = n[i-1];
System.out.println(n[i-1] + " got " + str.length(count) + " votes.");
if (count > largest_count)
{
largest_count = count;
largest_name = name;
count = 1;
}
System.out.println(str.length(y)-1 + " got " + str.length(count) + " votes.");
name = str.length(y)-1;
}
if (count > largest_count)
{
largest_count = count;
largest_name = name;
System.out.print(largest_name + " Wins!");
}
}
}
If you are allowed to use Java 8, this can be done very easily with streams and Collectors.groupingBy():
Map<String, Long> collect = name_list.stream()
        .collect(Collectors.groupingBy(Function.identity(),
                Collectors.counting()));
You will get a Map<String, Long> in which each key is a name and each value is the number of times that name occurs. Example:
{Tom=2, Timmy=1, Elena=1}
This might be too advanced if you are new to Java, though.
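For completeness, here is a small self-contained sketch of the groupingBy approach applied to the names from the question, with the winner picked from the resulting map. The class name VoteCountDemo and the methods countVotes and winner are hypothetical; on a tie, this sketch simply returns one of the tied names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class VoteCountDemo {

    // Groups equal names together and counts each group.
    static Map<String, Long> countVotes(List<String> names) {
        return names.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Returns a name with the highest count, or null for an empty map.
    // With a tie, which of the tied names is returned is not specified.
    static String winner(Map<String, Long> votes) {
        return votes.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(null);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "John", "Mary", "Joe", "John", "John", "John", "Mary", "Mary", "Steve");
        Map<String, Long> votes = countVotes(names);
        votes.forEach((name, count) -> System.out.println(name + " got " + count + " votes."));
        System.out.println(winner(votes) + " Wins!");
    }
}
```

The map returned by groupingBy is a HashMap by default, so the per-name lines are not printed in any guaranteed order.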
I like the answer by @SchiduLuca, but I thought I would present a solution that does not use streams. (In my code there might be a draw between two or more winners.)
static void popularity_contest(List<String> name_list) {
    Map<String, Integer> result = new HashMap<>();
    for (String name : name_list) {
        Integer count = result.get(name);
        if (count == null) {
            result.put(name, 1);
        } else {
            result.put(name, count + 1);
        }
    }
    // Print result and look for max # votes
    int maxVotes = 0;
    for (Entry<String, Integer> contestant : result.entrySet()) {
        System.out.println(String.format("%s got %d votes", contestant.getKey(), contestant.getValue().intValue()));
        if (contestant.getValue() > maxVotes) {
            maxVotes = contestant.getValue();
        }
    }
    // Print all winners
    System.out.println("*** Winner(s) ***");
    for (Entry<String, Integer> contestant : result.entrySet()) {
        // maxVotes is a primitive int, so == compares values rather than Integer references
        if (contestant.getValue() == maxVotes) {
            System.out.println(String.format("%s got %d votes and is a winner", contestant.getKey(), contestant.getValue().intValue()));
        }
    }
}
The biggest error lies in your comparison of the names. You are using == to compare strings when you need to use the equals() or equalsIgnoreCase() method. Also, you can work directly with the list you are given and use its get() method to access the element at a particular index.
if (name_list.get(i).equalsIgnoreCase(name_list.get(i - 1)))
{
    count += 1;
}

Get value from hashmap/keyset in java?

I have code where I am placing two values into a Hashmap, and then accessing them from within another method. I am iterating through one value "dog", but at the end of the method, I need to print out the "race" relating to that "dog" value...
Here's what I have so far:
DecimalFormat df = new DecimalFormat("#.##");
for (String dog: data.keySet()) { // use the dog
    String dogPage = "http://www.gbgb.org.uk/raceCard.aspx?dogName=" + dog;
    Document doc1 = Jsoup.connect(dogPage).get();
    // System.out.println("Dog name: " + dog);
    Element tblHeader = doc1.select("tbody").first();
    for (Element element1 : tblHeader.children()){
        String position = element1.select("td:eq(4)").text();
        int starts = (position.length() + 1) / 4;
        int starts1 = starts;
        // System.out.println("Starts: " + starts);
        Pattern p = Pattern.compile("1st");
        Matcher m = p.matcher(position);
        int count = 0;
        while (m.find()){
            count +=1;
        }
        double firsts = count / (double)starts1 * 100;
        String firstsStr = (df.format(firsts));
        // System.out.println("Firsts: " + firstsStr + "%");
        Pattern p2 = Pattern.compile("2nd");
        Matcher m2 = p2.matcher(position);
        int count2 = 0;
        while (m2.find()){
            count2 +=1;
        }
        double seconds = count2 / (double)starts1 * 100;
        String secondsStr = (df.format(seconds));
        // System.out.println("Seconds: " + secondsStr + "%");
        Pattern p3 = Pattern.compile("3rd");
        Matcher m3 = p3.matcher(position);
        int count3 = 0;
        while (m3.find()){
            count3 +=1;
        }
        double thirds = count3 / (double)starts1 * 100;
        String thirdsStr = (df.format(thirds));
        // System.out.println("Thirds: " + thirdsStr + "%");
        if (starts1 > 20 && firsts < 20 && seconds > 30 && thirds > 20){
            System.out.println("Dog name: " + dog);
            // System.out.println("Race: " + race);
            System.out.println("Firsts: " + firstsStr + "%");
            System.out.println("Seconds: " + secondsStr + "%");
            System.out.println("Thirds: " + thirdsStr + "%");
            System.out.println("");
        }
    }
Am I able to use something similar to "for (String dog : data.keySet())" to get the value of "race"? For example: "for (String race : data.keySet())"?
Previous method:
Document doc = Jsoup.connect(
        "http://www.sportinglife.com/greyhounds/abc-guide").get();
Element tableHeader = doc.select("tbody").first();
Map<String, String> data = new HashMap<>();
for (Element element : tableHeader.children()) {
    // Here you can do something with each element
    if (element.text().indexOf("Pelaw Grange") > 0
            || element.text().indexOf("Shawfield") > 0
            || element.text().indexOf("Shelbourne Park") > 0
            || element.text().indexOf("Harolds Cross") > 0) {
        // do nothing
    } else {
        String dog = element.select("td:eq(0)").text();
        String race = element.select("td:eq(1)").text();
        data.put(dog, race);
    }
Any help is much appreciated, thanks!
Rob
I am assuming that the value part of the HashMap is the race.
If so, you can do the following:
String race = data.get(dog);
In your current code, that would look like this:
for (String dog: data.keySet()) { // use the dog
    String race = data.get(dog); // this will give the value of race for the key dog
    // using dog to fetch details from the site...
}
You could also iterate over the entries, which gives you the key and the value together:
for (Entry<String, String> entry: data.entrySet()) {
    String dog = entry.getKey();
    String race = entry.getValue();
    // using dog to fetch details from the site...
}

Listing all the combinations of the elements of a set

I want to list all the 3-combinations of the elements of a set.
Is there a way to do it?
If it were a list:
for (int i = 0; i < list.size() - 2; i++)
    for (int j = i + 1; j < list.size() - 1; j++)
        for (int k = j + 1; k < list.size(); k++)
            System.out.println(list.get(i) + " " + list.get(j) + " " + list.get(k));
But how can I do this with a set without converting it to a list?
Converting to a list and using the logic from your code would be my first choice. If you want to do it without converting to list, you can do it with iterators, like this:
Set<String> set = ...;
for (String a : set) {
    boolean bGo = false;
    for (String b : set) {
        if (bGo) {
            boolean cGo = false;
            for (String c : set) {
                if (cGo) {
                    System.out.println(a + " " + b + " " + c);
                } else if (b.equals(c)) {
                    cGo = true;
                }
            }
        } else if (a.equals(b)) {
            bGo = true;
        }
    }
}
The logic above freely iterates the set in the outer loop. When it starts iterating the set in the first nested loop, it skips elements until the current element of the outer loop is found (i.e. until a.equals(b)). After that, it runs the third nested loop, which skips all data in the set until b, at which point it starts producing the combinations output.
Here is a demo on ideone.
Is your set a NavigableSet (for example a TreeSet)? If so, you can do this:
for (V x: mySet) {
    for (V y: mySet.tailSet(x, false)) {
        for (V z: mySet.tailSet(y, false)) {
            System.out.println(x + " " + y + " " + z);
        }
    }
}
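A runnable sketch of the tailSet approach, assuming the set is a TreeSet. The two-argument tailSet(element, inclusive) is declared on NavigableSet, which TreeSet implements; passing false excludes the element itself, so each inner loop only sees strictly greater elements. The class name TripleDemo and the method triples are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

class TripleDemo {

    // Lists every 3-combination of the set, each as a space-separated string.
    static <T> List<String> triples(NavigableSet<T> set) {
        List<String> out = new ArrayList<>();
        for (T x : set) {
            // tailSet(x, false) is the view of elements strictly greater than x.
            for (T y : set.tailSet(x, false)) {
                for (T z : set.tailSet(y, false)) {
                    out.add(x + " " + y + " " + z);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableSet<String> set = new TreeSet<>(Arrays.asList("a", "b", "c", "d"));
        // prints [a b c, a b d, a c d, b c d]
        System.out.println(triples(set));
    }
}
```

Unlike the flag-based version, this never visits the elements that would be skipped, but it only works for sets with a defined ordering.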
