How to add parameters configuration of new operator with rapidminer? - java

After creating a new operator and testing it, I need to make some of its settings configurable, such as minsupp (the minimum support) for frequent itemset algorithms. At the moment I define this parameter inside my Java code. I would like the minsupp parameter to appear in the parameter list when I select the new operator in the RapidMiner GUI.

If I understand you correctly, you want to add parameters to the operator, which are displayed in the GUI.
For that, you have to override the method getParameterTypes() of your operator. You can find usage examples in almost every other operator. An operator with many different parameters that can serve as a good reference is, for example, the k-Means operator, implemented in the class KMeans.
The basic concept is to add instances of ParameterType to a list and return that list. The RapidMiner framework will do the rest.

The solution is to add instances of ParameterType to a list and return that list. The following is an example:
@Override
public List<ParameterType> getParameterTypes() {
    List<ParameterType> types = super.getParameterTypes();
    // MinSupp is the constant that holds the parameter key
    types.add(new ParameterTypeDouble(MinSupp, "Defines the minimum frequency of an itemset", 0.0, 1.0));
    types.addAll(RandomGenerator.getRandomGeneratorParameters(this));
    return types;
}
Thanks, Marius.

Related

What is the benefit of using a custom class over a map? [duplicate]

I have a piece of code that returns the minimum and maximum values of some input it takes. What are the benefits of using a custom class with a minimum and a maximum field over using a map that holds these two values?
// this is the class that holds the min and max values
public class MaxAndMinValues {
    private double minimum;
    private double maximum;
    // rest of the class code omitted
}
// this is the map that holds the min and max values
Map<String, Double> minAndMaxValuesMap;
The most apparent answer would be object-oriented programming aspects, like the possibility to bundle data with functionality and the possibility to derive from that class.
But let's assume for the moment that this is not a major factor. Your example is so simple that I wouldn't use a Map either. What I would use is the Pair class from Apache Commons: https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/tuple/Pair.html
(ImmutablePair):
https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/tuple/ImmutablePair.html
The Pair class is generic and has two type parameters, one for each field. You can basically define a Pair of something and get type safety, IDE support, autocompletion, and the big benefit of knowing what is inside. A Pair also offers things that a Map cannot. For example, a Pair is potentially Comparable. See also ImmutablePair if you want to use it as a key in another Map.
public Pair<Double, Double> foo(...) {
    // ...
    Pair<Double, Double> range = Pair.of(minimum, maximum);
    return range;
}
The big advantage of this class is, that the type you return exposes the contained types. So if you need to, you could return different types from a single method execution (without using a map or complicated inner class).
e.g. Pair<String, Double> or Pair<String, List<Double>>...
In a simple situation where you just need to store the min and max values from user input, your custom class is a better fit than a Map. In Java, a Map can be a HashMap, a LinkedHashMap, or a TreeMap, and each of them spends some time putting your data into its structure and again when you look a value up. So in the simple case you described, just use your custom class. Moreover, you can write methods in your class to process the user input, which a Map could not do for you.
I would suggest looking at this from the perspective of how the programming language is used. In any language there are multiple ways to achieve a result (easy, bad, complicated, performant, ...). For an object-oriented language like Java, this question is really about the design of your solution.
Think of accessibility. The values in a Map are effectively public: you can modify the contents from any part of the code. If you had a condition that min and max must be in the range [-100, 100] and some part of your code inserts 200 into the map, you have a bug. You could cover that with validation, but how many validation checks would you write? With an object, encapsulation is always available.
Think of re-use. If you had the same requirement in another part of the code, you would have to write the map logic again (probably with all the validations). That doesn't look good, right?
Think of extensibility. If you wanted one more value, such as the median or the average, you would either have to clutter the map with more keys or create a new map. An object is always easy to extend.
So it all comes down to design. If you think it is a one-time use, a map will probably do (though it is not a standard design anyway; a map should contain one kind of data, technically and functionally).
Last but not least, think of code readability and cognitive complexity. Objects with relevant responsibilities will always be better than unclear generic storage.
Hope I made some sense!
The benefit is simple: it makes your code clearer and more robust.
The MaxAndMinValues name and its class definition (two fields) convey a min and a max value. Above all, the class makes sure that only these two things are accepted, and its API is self-explanatory about how to store and get the values.
Map<String, Double> minAndMaxValuesMap also conveys the idea that a min and a max value are stored in it, but it has multiple design drawbacks:
we don't know how to retrieve the values without looking at how they were added.
Related to that, how should the keys be named when we add entries to the map? The String type is too broad for the key: "MIN", "min", and "Minimum" would all be accepted. An enum would solve this issue, but not the others.
we cannot ensure that both values (min and max) were added (whereas a constructor with arguments can do that).
we can add any other value to the map, since a Map is not a fixed structure in terms of data.
Beyond the idea of clearer code in general, I would add that if MaxAndMinValues were used only as an implementation detail inside a specific method or in a lambda, using a Map or even an array {15F, 20F} would be acceptable. But if these data are manipulated through methods, you should make their meaning as clear as possible.
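To make the points about key naming and missing values concrete, here is a minimal sketch. The class name MinMaxRange and the rule that the minimum must not exceed the maximum are assumptions made for illustration, not something from the question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical value class: both values are required and validated once, in the constructor.
final class MinMaxRange {
    private final double minimum;
    private final double maximum;

    MinMaxRange(double minimum, double maximum) {
        if (minimum > maximum) {
            throw new IllegalArgumentException("minimum must not exceed maximum");
        }
        this.minimum = minimum;
        this.maximum = maximum;
    }

    double getMinimum() { return minimum; }
    double getMaximum() { return maximum; }
}

class MinMaxDemo {
    public static void main(String[] args) {
        // The map accepts any key, so an inconsistent key is only noticed at lookup time.
        Map<String, Double> minAndMaxValuesMap = new HashMap<>();
        minAndMaxValuesMap.put("min", 1.5);
        minAndMaxValuesMap.put("Max", 9.0);                 // inconsistent key, compiles fine
        System.out.println(minAndMaxValuesMap.get("max"));  // prints "null"

        // The class cannot be built without both values, and the accessor names are compiler-checked.
        MinMaxRange range = new MinMaxRange(1.5, 9.0);
        System.out.println(range.getMinimum() + " .. " + range.getMaximum()); // prints "1.5 .. 9.0"
    }
}
```

The validation lives in one place, so every caller gets it without repeating the check.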
We used a custom class instead of a HashMap so that we could sort the entries by their values.

Correct use of guava Predicate on two types

I'm not sure if I completely understand how Guava's Predicate<T> should be used. I have two classes, Promotion and Customer, and I want to check which of the promotions is applicable to a customer.
public Optional<Promotion> getApplicablePromotionToCustomer(final List<Promotion> activePromotions,
        final Customer customer) {
    return FluentIterable.from(activePromotions).firstMatch(new Predicate<Promotion>() {
        @Override
        public boolean apply(final Promotion input) {
            return input.getCustomerType().equals(customer.getType())
                    && new DateRangeComparator().overlaps(input.getDateRange(), customer.getDateRange());
        }
    });
}
My questions are related to the correct typing of Predicate. Is it correct to make Predicate of type Promotion or should I build a wrapper class with Promotion and Customer? I'm not even sure of how to word it. Am I using a "Predicate with a Customer and applying it to a Promotion"?
If I want to extract the anonymous implementation of Predicate to its own class, I'll have to make the constructor take a Customer, and even a DateRangeComparator if I want to make that customizable. Is this fine, or is this approach totally wrong?
Is it correct to make Predicate of type Promotion or should I build a
wrapper class with Promotion and Customer?
Yes, it's correct. The filtering function you want to implement has the form f(x) = y, where y belongs to {false, true}.
Here the type of x is the type of the element to which you want to apply the function (so it is the type parameter of the Predicate). Since you filter a List<Promotion>, the type of the predicate will be Predicate<Promotion>. The logic used to test the element (with the Customer and the DateRangeComparator) is the function itself, but the input type is definitely Promotion.
Am I using a "Predicate with a Customer and applying it to a
Promotion"?
Everyone has their own wording; as long as you are clear, I think it does not really matter. But yes, you apply the predicate to a Promotion.
If I want to extract the anonymous implementation of Predicate to it's
own class, I'll have to make the constructor take a Customer, and even
a DateRangeComparator if want to make that customizable. Is this fine
or approach is totally wrong?
Yes, there's nothing wrong with doing that. The only thing I would keep in mind is that you should try to implement a stateless predicate whenever possible, i.e. a predicate that doesn't retain what it filtered, so that each item can be processed independently. This is very useful when you start doing parallel computations, because every object that has to be tested can use the predicate as a standalone "box".
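As a rough sketch of the extracted class, the example below uses java.util.function.Predicate as a stand-in for Guava's Predicate (whose method is named apply rather than test), so that it runs without extra libraries. The Customer and Promotion classes are simplified, hypothetical versions of the ones in the question, and the date range check is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical, minimal stand-ins for the classes in the question.
final class Customer {
    private final String type;
    Customer(String type) { this.type = type; }
    String getType() { return type; }
}

final class Promotion {
    private final String name;
    private final String customerType;
    Promotion(String name, String customerType) {
        this.name = name;
        this.customerType = customerType;
    }
    String getName() { return name; }
    String getCustomerType() { return customerType; }
}

// The predicate is typed by what it tests (Promotion); the Customer is fixed configuration.
final class ApplicableTo implements Predicate<Promotion> {
    private final Customer customer; // never modified after construction

    ApplicableTo(Customer customer) { this.customer = customer; }

    @Override
    public boolean test(Promotion input) {
        return input.getCustomerType().equals(customer.getType());
    }
}

class PromotionDemo {
    static Optional<Promotion> firstApplicable(List<Promotion> active, Customer customer) {
        return active.stream().filter(new ApplicableTo(customer)).findFirst();
    }

    public static void main(String[] args) {
        List<Promotion> active = Arrays.asList(
                new Promotion("student-10", "STUDENT"),
                new Promotion("vip-20", "VIP"));
        System.out.println(firstApplicable(active, new Customer("VIP")).get().getName()); // prints "vip-20"
    }
}
```

Because the predicate only reads its final field, the same instance can test any number of promotions independently.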

Switch statement or remotely invoke methods

I have a switch statement that compares a String with a set of Strings, where each match calls a different method.
switch (((Operation) expr.getData()).getValue()) {
    case "+":
        return add(expr.getNext());
    case "car":
        return car(expr.getNext());
    case "cdr":
        return cdr(expr.getNext());
    case "cons":
        return cons(expr.getNext(), expr.getNext().getNext());
    case "quote":
        return quote(expr.getNext());
    case "define":
        handleDefine(expr.getNext());
        break;
    default:
        return null;
}
However, to me this sounds like something that could be achieved far more elegantly and efficiently using a HashMap that maps each name to an Operation containing a Method and the number of parameters, so I could add each method to a HashMap like:
nameToOperation.put("+", new Operation("+", 1, Driver.class.getMethod("add")));
nameToOperation.put("car", new Operation("car", 1, Driver.class.getMethod("car")));
So there would be N different instances of the Operation class, each containing the String, the Method, and the number of parameters.
And then I could simply call the method using something similar to this (I understand this isn't how you use invoke):
Operation op = ((Operation) expr.getData());
if (op.getNumPars() == 1)
    return op.getMethod().invoke(expr.getNext());
else
    return op.getMethod().invoke(expr.getNext(), expr.getNext().getNext());
However, I still don't fully like this solution as I am losing type safety and it still doesn't look that great. Another example I have seen on stackoverflow that looked quite elegant but I don't fully understand is the first solution of the top answer on: How to call a method stored in a HashMap? (Java)
What does everyone on Stackoverflow think the best solution is?
Edit: Just in case anybody searches this and was wondering about my solution, I made each operation such as Add, Car, Cdr have their own class that implemented Command. I then had to make the majority of my methods static, which I suppose by nature each of them were anyway. This seems way more elegant than the original case statement.
Basically, the answer recommends going with the Command pattern.
"The main advantage of the command design pattern is that it decouples the object that invokes the operation from the one that know how to perform it. And this advantage must be kept. There are implementations of this design pattern in which the invoker is aware of the concrete commands classes. This is wrong making the implementation more tightly coupled. The invoker should be aware only about the abstract command class"
Basically, your map would be type safe by declaring
Map<String, Command>
It would also be open to extension.
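A minimal sketch of such a command map follows. The Command interface, the CommandTable class, and the choice to pass arguments as a List<Object> are assumptions for illustration; the asker's own Command implementations are not shown in the thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical command interface: every operation takes its argument list and returns a result.
interface Command {
    Object execute(List<Object> args);
}

final class CommandTable {
    private final Map<String, Command> commands = new HashMap<>();

    CommandTable() {
        // Each entry replaces one case label of the switch statement.
        commands.put("+", args -> {
            int sum = 0;
            for (Object arg : args) {
                sum += (Integer) arg;
            }
            return sum;
        });
        commands.put("car", args -> ((List<?>) args.get(0)).get(0));
        commands.put("cdr", args -> {
            List<?> list = (List<?>) args.get(0);
            return list.subList(1, list.size());
        });
    }

    // The invoker only knows the Command interface, not the concrete operations.
    Object invoke(String name, List<Object> args) {
        Command command = commands.get(name);
        if (command == null) {
            throw new IllegalArgumentException("Unknown operation: " + name);
        }
        return command.execute(args);
    }
}

class CommandDemo {
    public static void main(String[] args) {
        CommandTable table = new CommandTable();
        System.out.println(table.invoke("+", Arrays.<Object>asList(1, 2, 3)));                   // prints "6"
        System.out.println(table.invoke("car", Arrays.<Object>asList(Arrays.asList("a", "b")))); // prints "a"
    }
}
```

Adding a new operation means adding one map entry; the invoke method stays unchanged, and no reflection is needed.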
It looks like you are trying to write a Scheme interpreter. In that case you're going to need a map anyway, since you need to store all the user-defined values and functions.
When the user writes e.g. (define (add a b) (+ a b)), you store the function in the map using "add" as key.
But your functions should use lists as inputs, i.e. each function has exactly one argument which is a list. In Scheme all expressions are lists by the way. Usually a Scheme interpreter consists of a reader and an evaluator. The reader converts the code into a bunch of nested lists.
So basically "(define (add a b) (+ a b))" could be converted into a list structure similar to this.
List<Object> list = new ArrayList<Object>();
List<Object> list2 = new ArrayList<Object>();
list2.add("add"); list2.add("a"); list2.add("b");
List<Object> list3 = new ArrayList<Object>();
list3.add("+"); list3.add("a"); list3.add("b");
list.add("define"); list.add(list2); list.add(list3);
Of course your code doesn't actually look like this, instead the lists are constructed by recursive methods parsing the input code.
Those lists don't just contain strings, by the way; they also contain numbers and boolean values. Nested lists like this are the simplest form of an abstract syntax tree (AST). Since the syntax of Scheme is much simpler than that of most other languages, a very simple list structure is enough to store the parsed code.
The evaluator then processes those lists.
To evaluate a list, you first recursively evaluate every element in the list and then apply the first element to the rest of the list. That first element must therefore be a user-defined function or a built-in command, e.g. "define".
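The evaluation rule described above can be sketched roughly as follows. MiniEvaluator and its two built-ins are hypothetical names; the sketch treats every String as a symbol and handles only built-in functions. Special forms such as define and quote, and user-defined functions, would need their own handling in a real interpreter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class MiniEvaluator {
    // Environment: symbol names mapped to values or to functions that take one argument list.
    private final Map<String, Object> environment = new HashMap<>();

    MiniEvaluator() {
        Function<List<Object>, Object> plus = args -> {
            int sum = 0;
            for (Object arg : args) {
                sum += (Integer) arg;
            }
            return sum;
        };
        Function<List<Object>, Object> times = args -> {
            int product = 1;
            for (Object arg : args) {
                product *= (Integer) arg;
            }
            return product;
        };
        environment.put("+", plus);
        environment.put("*", times);
    }

    void define(String name, Object value) {
        environment.put(name, value);
    }

    @SuppressWarnings("unchecked")
    Object eval(Object expr) {
        if (expr instanceof String) {
            // A symbol: look it up in the environment.
            Object value = environment.get(expr);
            if (value == null) {
                throw new IllegalStateException("Unbound symbol: " + expr);
            }
            return value;
        }
        if (expr instanceof List) {
            // Evaluate every element, then apply the first element to the rest.
            List<Object> evaluated = new ArrayList<>();
            for (Object element : (List<Object>) expr) {
                evaluated.add(eval(element));
            }
            Function<List<Object>, Object> function = (Function<List<Object>, Object>) evaluated.get(0);
            return function.apply(evaluated.subList(1, evaluated.size()));
        }
        return expr; // numbers and booleans evaluate to themselves
    }
}

class MiniEvaluatorDemo {
    public static void main(String[] args) {
        MiniEvaluator evaluator = new MiniEvaluator();
        // (+ 1 (* 2 3))
        Object expr = Arrays.<Object>asList("+", 1, Arrays.<Object>asList("*", 2, 3));
        System.out.println(evaluator.eval(expr)); // prints "7"
    }
}
```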

Abstraction: Optional Methods? [java] (Modeling Filters)

Background
For my assignment I am required to model filters, such as those in signal processing, at a basic level. Filters can take inputs of any type and output a different type if that's how the filter is implemented. The simplest filter outputs its input. Other example filters are arithmetic mean, max, or min filters, which return the mean, maximum, or minimum of the inputs. Similar filters return the mean/max/min of only the last N inputs. Some filters are reset-able and have a method "reset" which takes an input of the same type. So, for example, a max3 filter returns the maximum of the last three inputs, or of the inputs since the last reset, including the input that the reset method takes. The assignment goes into further detail describing other, more complicated filters, but I'm having trouble with abstraction at the most basic level.
Attempt
So my first attempt was to create an interface "Filter" which had one method, "filter". This would be implemented by filters to suit their own needs. I created an abstract class "StorageFilter" which stored a list of inputs accessible through protected set/get methods. Then I extended that class to implement the reset function in another abstract class, "ResetableFilter". So filters that can't reset would extend the first abstract filter and filters that can reset would extend the second. But my implementation didn't really work out, since filters are a little more complicated than that. There are a few main types of filters that I can pin down. Filters that:
store one input, such as max/min filters: just compare the stored value with the input and, if the input is the new max/min, set the stored value to the input. We'll call this type 1.
store a list of inputs, such as max/min filters of the last N: only the last N inputs are stored, so the filter method can iterate through the list and find the max/min. We'll call this type 2. (This could also be implemented in another way, storing two values, one representing the current max/min and the other its "age".)
store a list of inputs and a list of outputs, such as complicated scalar linear filters, which use equations over both lists to calculate the new output. We'll call this type 3.
store nothing at all, such as simple filters like the first example or a filter that just returns double the input. We'll call this type 4.
So there are many types of things that a filter can store but not all filters are reset-able.
Problem
My general question is how can I implement optional methods while maintaining abstraction? This type (filter) can have an optional method (reset) that its subtypes can have. I can't just have an empty method "reset" that does nothing but the filter is still able to call. What is the best way to implement an optional method such as in this case while maintaining abstraction?
Potential Solutions
Use @Optional: Using this annotation means that filters that don't use this method will throw an UnsupportedOperationException. This is an acceptable way of maintaining abstraction, since Java's Collections use it, but it's not desirable since exceptions are involved.
Create a "Reset" interface and have each Filter that can reset implement it: This is also bad because it will almost double the types of Filters I have to think about (i.e. Type 1 filters, reset-able Type 1 filters, ..., and Type 4 filters; since type 4 filters don't store anything, they never reset).
The book "Code Complete 2" describes a scenario modeling a Cat. Cats can scratch but some cats are declawed and therefore can't scratch. Creating different classes for things that a cat can or cannot do complicates it so that you'll end up with classes like a ScratchlessTailess...Cat. The solution the book offers is to create an inner class "Claws" that is contained within the Cat class or build a constructor that includes whether the cat scratches. From this I think an optimal solution would be to create an inner interface "ResetableContainer" which has the one method reset which can be implemented to fit the different types. It could hold whatever the filters need to store and the reset will be implemented depending on what was stored. The problem is still how can I implement it to avoid all this complication with the different possibilities of storage (a single input or a list of inputs)?
It looks like you are running into a conceptual design problem, in that you are expecting the user of the Filter to always know exactly what it can and can't do. But at the same time, you want filters to be able to do lots of different things... these two ideas can't entirely mesh.
What you can do is create these 'optional' methods to return a value on execution:
/**
 * Resets the filter.
 *
 * @return false if this operation is not supported by the filter
 */
public boolean reset() {
    return false;
}
A much better method: include an additional method that must be overridden, which is a more common design pattern; see this question for an example. As stated there, it's also probably good to just have more than one interface.
This sounds like a school assignment, so you may be constrained in a way that you can't do this, but what I would do is keep it simple: a single interface:
public interface Filter<In, Out> {
    public Out filter(In toFilter);
    public void reset();
    public boolean canReset();
}
And then probably an abstract base class to provide good default implementations for those methods that have them:
public abstract class BaseFilter<In, Out> implements Filter<In, Out> {
    public void reset() {}
    public boolean canReset() { return false; }
}
I wouldn't have even included canReset except for the possibility of filters which are sometimes resetable and sometimes not. If that's not a possibility you want to support then you can remove the canReset and just always call reset() whenever you would reset it if it were a resetable filter.
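To illustrate how concrete filters might sit on top of that interface and base class, here is a small sketch. DoublingFilter and MaxFilter are hypothetical examples for the question's type 4 and type 1 filters. The sketch follows this answer's parameterless reset() rather than the assignment's variant that takes an input.

```java
interface Filter<In, Out> {
    Out filter(In toFilter);
    void reset();
    boolean canReset();
}

abstract class BaseFilter<In, Out> implements Filter<In, Out> {
    @Override public void reset() {}
    @Override public boolean canReset() { return false; }
}

// Stores nothing, so the inherited no-op reset() is all it needs.
final class DoublingFilter extends BaseFilter<Double, Double> {
    @Override
    public Double filter(Double toFilter) {
        return 2 * toFilter;
    }
}

// Stores one value (the running maximum) and supports reset.
final class MaxFilter extends BaseFilter<Double, Double> {
    private Double max; // null until the first input arrives

    @Override
    public Double filter(Double toFilter) {
        if (max == null || toFilter > max) {
            max = toFilter;
        }
        return max;
    }

    @Override public void reset() { max = null; }
    @Override public boolean canReset() { return true; }
}

class FilterDemo {
    public static void main(String[] args) {
        Filter<Double, Double> max = new MaxFilter();
        System.out.println(max.filter(3.0)); // prints "3.0"
        System.out.println(max.filter(1.0)); // prints "3.0"
        max.reset();
        System.out.println(max.filter(1.0)); // prints "1.0"
    }
}
```

Callers that only hold a Filter reference can call reset() unconditionally, or check canReset() first if the distinction matters to them.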

functional java: what's this P1 thing?

I'm looking at Functional Java and I don't understand what a P1 is. Could anyone explain and/or give an example?
(background: I do know what currying and closures are)
This is taken straight from the Google Code project for Functional Java:
Joint union types (tuples) are products of other types. Products of arities 1-8 are provided (fj.P1 - fj.P8). These are useful for when you want to return more than one value from a function, or when you want to accept several values when implementing an interface method that accepts only one argument. They can also be used to get products over other datatypes, such as lists (zip function).
// Regular Java
public Integer albuquerqueToLA(Map<String, Map<String, Integer>> map) {
    Map<String, Integer> m = map.get("Albuquerque");
    if (m != null)
        return m.get("Los Angeles"); // May return null.
    return null;
}
// Functional Java with product and option types.
public Option<Integer> albuquerqueToLA(TreeMap<P2<String, String>, Integer> map) {
    return map.get(p("Albuquerque", "Los Angeles"));
}
P1 looks like the 1-element, trivial product type. In Haskell it would be written as:
data P1 a = P1 a
(the Identity type in Haskell).
That is, it is a container that holds a value of some other type a.
This type also implements the simplest monad, Identity, which allows for functions to be opaquely applied to the contents of the box.
Computationally, there is no reason to use the Identity monad instead of the much simpler act of simply applying functions to their arguments, however, it can be useful in the design of monad transformer stacks.
The monad implementation of the identity monad is trivial,
return a = P1 a
(P1 m) >>= k = k m
As you can see, this is just function application.
aha, found this post:
>>> Also, P1 is potentially lazy. We use it for the implementation of
>>> Stream, for example.
So instead of returning type T directly, I can have something that returns P1<T>, much like Google Collections Supplier<T>, and have it compute the contained value only when P1._1() is called.
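The lazy idea can be sketched with the standard java.util.function.Supplier, without Functional Java on the classpath. The Lazy class below is a hypothetical illustration, not Functional Java's P1 implementation, and caching the result after the first call is an assumption added here.

```java
import java.util.function.Supplier;

// Hypothetical memoizing wrapper: computes the value on the first get() and caches it.
final class Lazy<T> implements Supplier<T> {
    private Supplier<T> computation;
    private T value;
    private boolean evaluated;

    Lazy(Supplier<T> computation) {
        this.computation = computation;
    }

    @Override
    public synchronized T get() {
        if (!evaluated) {
            value = computation.get();
            evaluated = true;
            computation = null; // the computation is no longer needed
        }
        return value;
    }
}

class LazyDemo {
    static int calls = 0;

    public static void main(String[] args) {
        Lazy<Integer> answer = new Lazy<Integer>(() -> {
            calls++;
            return 6 * 7;
        });
        System.out.println(calls);        // prints "0": nothing has been computed yet
        System.out.println(answer.get()); // prints "42"
        answer.get();
        System.out.println(calls);        // prints "1": the second get() reused the cached value
    }
}
```

A method can then return a Lazy<T> in place of a T, and the caller decides whether and when the work happens.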
(huh, this blog post Lazy Error Handling in Java was interesting too.....)
