I've implemented the Apriori algorithm. It works pretty well, but I ran into a strange problem: I've defined a Rule class to maintain the generated rules.
Here it is:
public class Rule
{
private Set<Integer> left;
private Set<Integer> right;
private LookupArtist lookupArtist;
public Rule(LookupArtist lookupArtist){
left = new HashSet<>();
right = new HashSet<>();
this.lookupArtist = lookupArtist;
}
@Override
public boolean equals(Object another){
Rule rule = (Rule) another;
if(this.left.equals(rule.getLeft()) && this.right.equals(rule.getRight()))
return true;
else
return false;
}
@Override
public String toString(){
/* print the object */
}
public void addToLeft(Integer toAdd){
left.add(toAdd);
}
public void addToRight(Integer toAdd){
right.add(toAdd);
}
public Set<Integer> getLeft(){
return left;
}
public Set<Integer> getRight(){
return right;
}
}
I also implemented the equals() method in a different way just to try:
@Override
public boolean equals(Object another){
Rule rule = (Rule) another;
boolean flag = true;
for(Integer artist : left){
if(flag)
if(!rule.left.contains(artist))
flag=false;
}
if(flag)
for(Integer artist : right){
if(flag)
if(!rule.right.contains(artist))
flag=false;
}
return flag;
}
The LookupArtist object is used to map the integers to some Strings.
The problem is that when I print out the rules, some of them appear twice. I also found the duplicated rules in debug mode, so it isn't a printing problem. The rules are saved in a map like this:
static Map<Rule, Float> rules;
.
.
.
Rule rule = new Rule(lookupArtist);
for(int j=0;j<i;j++){
rule.addToLeft(a[j]);
}
for(int j=i;j<a.length;j++){
rule.addToRight(a[j]);
}
if(!rules.containsKey(rule)){
rules.put(rule, getRuleConfidence(rule));
}
Any idea where the problem can be?
When using a HashSet for storing objects of a class that has a custom equals implementation, you must have a matching custom implementation for hashCode.
If two objects are equal (according to the custom equals implementation), they must have the same hashCode. In the code you posted, I don't see an override of hashCode in the Rule class.
When you add an instance to the HashSet, hashCode method is used to determine the index in the hash table in which the instance will be stored. Then, the linked list of instances stored in this index is iterated to see if the instance is already there. When iterating over that list, equals is used. If two objects that are equal are mapped by hashCode to different indices in the HashSet, the duplication won't be detected, since they would be stored in separate linked lists.
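The bucket behaviour described above can be observed with a small experiment. This is only a sketch: the classes EqualsOnly and EqualsAndHash are hypothetical stand-ins for Rule, not the asker's code.

```java
import java.util.HashSet;
import java.util.Set;

class HashContractDemo {
    // Overrides equals() only; hashCode() is inherited from Object.
    static final class EqualsOnly {
        final int id;
        EqualsOnly(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof EqualsOnly && ((EqualsOnly) o).id == id;
        }
    }

    // Overrides both, so equal objects always share a hash code.
    static final class EqualsAndHash {
        final int id;
        EqualsAndHash(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof EqualsAndHash && ((EqualsAndHash) o).id == id;
        }
        @Override public int hashCode() { return Integer.hashCode(id); }
    }

    static int sizeWithEqualsOnly() {
        Set<EqualsOnly> set = new HashSet<>();
        set.add(new EqualsOnly(1));
        set.add(new EqualsOnly(1)); // equal, but very likely a different hash code
        return set.size();
    }

    static int sizeWithEqualsAndHash() {
        Set<EqualsAndHash> set = new HashSet<>();
        set.add(new EqualsAndHash(1));
        set.add(new EqualsAndHash(1)); // same hash code, then equals() detects the duplicate
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithEqualsOnly());    // typically 2
        System.out.println(sizeWithEqualsAndHash()); // 1
    }
}
```

The first result is "typically 2" rather than guaranteed, because two identity hash codes could coincide by chance; that is exactly why the bug shows up as occasional duplicates instead of failing consistently.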
This is stated in the Javadoc of equals :
* Note that it is generally necessary to override the <tt>hashCode</tt>
* method whenever this method is overridden, so as to maintain the
* general contract for the <tt>hashCode</tt> method, which states
* that equal objects must have equal hash codes.
And in the Javadoc of hashCode :
* <li>If two objects are equal according to the <tt>equals(Object)</tt>
* method, then calling the <code>hashCode</code> method on each of
* the two objects must produce the same integer result.
You should always override hashCode when you override equals and vice versa.
Add something like this to your Rule class:
@Override
public int hashCode() {
return left.hashCode()
^ right.hashCode()
^ lookupArtist.hashCode();
}
Here is a good answer explaining why it's important to override both.
Also, your equals method can be written as
@Override
public boolean equals(Object another){
Rule rule = (Rule) another;
return left.equals(rule.left)
&& right.equals(rule.right)
&& lookupArtist.equals(rule.lookupArtist);
}
A final remark: Your other attempt at the equals-implementation is not symmetrical, i.e. it's not the case that rule1.equals(rule2) if and only if rule2.equals(rule1). That's a violation of the contract of equals.
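To make the asymmetry concrete, here is a minimal sketch that reproduces the loop-based check on plain sets. The helper name loopEquals is made up for illustration, and only one side of the rule is modelled.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SymmetryDemo {
    // Mirrors the loop-based check: true when every element of mine is also in theirs.
    static boolean loopEquals(Set<Integer> mine, Set<Integer> theirs) {
        for (Integer item : mine) {
            if (!theirs.contains(item)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<Integer> small = new HashSet<>(Arrays.asList(1));
        Set<Integer> large = new HashSet<>(Arrays.asList(1, 2));
        System.out.println(loopEquals(small, large)); // true
        System.out.println(loopEquals(large, small)); // false
        System.out.println(small.equals(large));      // false
    }
}
```

The loop only checks that one set is a subset of the other, whereas Set.equals also compares sizes, which is what makes it symmetric.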
And where is your hashCode() method? It is also very important :)
I have a POJO/DTO class with multiple list attribute like
class Boo {
private List<Foo> foos;
private List<Integer> pointers;
}
I want to compare if both lists contain the same values ignoring the order of the lists. Is it possible to achieve this without opening the object and ordering the lists?
Help would be appreciated. Thanks in Advance
"I want to compare if both contains same values instead of the order of list."
There is no universal equality operator. Sometimes you want to compare objects by certain properties only. The canonical example is comparing strings: depending on the context, "computer" may or may not be equal to "Computer", and "Vesterålen" may or may not be equal to "Vesteralen".
In Java, you can redefine the default equivalence relation between objects by overriding equals (this changes the default behavior!).
By default, a List uses the equals method of its elements and compares them position by position, so order matters.
The following example ignores the element order for only one of the two properties:
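import java.util.List;
import static java.util.Arrays.asList;
import static java.util.stream.Collectors.toList;
class My {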
private final List<String> xs;
private final List<Integer> ys;
My(List<String> xs, List<Integer> ys) {
this.xs = xs;
this.ys = ys;
}
public List<Integer> getYs() {
return ys;
}
public List<String> getXs() {
return xs;
}
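@Override
public int hashCode() {
// xs is compared ignoring order in equals(), so hash a sorted copy to keep both methods consistent
return xs.stream().sorted().collect(toList()).hashCode() + 7 * ys.hashCode();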
}
@Override
public boolean equals(Object obj) {
if(!(obj instanceof My))
return false;
My o = (My) obj;
return
// ignoring order
getXs().stream().sorted().collect(toList()).equals(o.getXs().stream().sorted().collect(toList()))
// checking order
&& getYs().equals(o.getYs());
}
}
public class Callme {
public static void main(String... args) {
My m1 = new My(asList("a", "b"), asList(1, 2));
My m2 = new My(asList("b", "a"), asList(1, 2));
My m3 = new My(asList("a", "b"), asList(2, 1));
System.out.println(m1.equals(m2));
System.out.println(m1.equals(m3));
}
}
with output
true
false
But I can't define the equivalence relation YOU need. For example, my version does not ignore the case where one list contains more elements than the other, but maybe you want that (e.g. for you {a, b, a} may be equal to {b, a}).
So define the equivalence relation for your object and override hashCode and equals accordingly.
This boils down to comparing the lists. If the order of the items is irrelevant anyways you might fare better using Set instead of List.
Your equals then would look like
public boolean equals(Object other) {
//here be class and null checks
Boo boo = (Boo) other;
return foos.equals(boo.foos) && pointers.equals(boo.pointers);
}
If you cannot use Set - either because you can have the same item multiple times or because order matters - you can do the same as above with a reciprocal containsAll() call. This still would not take duplicate entries into consideration but will work quite fine otherwise.
You state that you cannot edit the class Boo. One solution would be to have a service class which does this for you a bit similar to Objects.equals().
class BooComparer {
public static boolean equals(Boo a, Boo b) {
//again do some null checks here; this assumes foos and pointers are accessible from this class
return a.foos.containsAll(b.foos)
&& b.foos.containsAll(a.foos)
&& a.pointers.containsAll(b.pointers)
&& b.pointers.containsAll(a.pointers);
}
}
If this works for you - fine. Maybe you have to compare other members, too. And again: this will ignore if one of the lists has an entry twice.
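If duplicates do matter, one possible alternative is to compare how often each element occurs. The sketch below is not part of the answer above: the names counts and sameElements are made up, and it assumes the element type (e.g. Foo) has consistent equals and hashCode.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class UnorderedListCompare {
    // Counts how often each element occurs in the list.
    static <T> Map<T, Integer> counts(List<T> items) {
        Map<T, Integer> result = new HashMap<>();
        for (T item : items) {
            result.merge(item, 1, Integer::sum);
        }
        return result;
    }

    // True when both lists hold the same elements the same number of times, in any order.
    static <T> boolean sameElements(List<T> a, List<T> b) {
        return a.size() == b.size() && counts(a).equals(counts(b));
    }

    public static void main(String[] args) {
        List<String> x = Arrays.asList("a", "b", "a");
        List<String> y = Arrays.asList("b", "a");
        System.out.println(x.containsAll(y) && y.containsAll(x));     // true
        System.out.println(sameElements(x, y));                       // false
        System.out.println(sameElements(y, Arrays.asList("a", "b"))); // true
    }
}
```

The first line of output shows the reciprocal containsAll() check accepting lists that differ only in a repeated entry, while the count-based comparison rejects them.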
According to the official documentation for the Java Hashtable class (https://docs.oracle.com/javase/7/docs/api/java/util/Hashtable.html), the get() operation will return one of its recorded values if that value's key returns true when the parameter is passed to the key's equals() operation.
So, in theory, the following code should return "Hello!" for both of the Hashtable's get() queries:
public static class Coordinates implements Serializable {
private int ex;
private int why;
public Coordinates(int xCo, int yCo) {
ex = xCo;
why = yCo;
}
public final int x() {
return ex;
}
public final int y() {
return why;
}
public boolean equals(Object o) {
if(o == null) {
return false;
} else if(o instanceof Coordinates) {
Coordinates c = (Coordinates) o;
return this.x() == c.x() && this.y() == c.y();
} else {
return false;
}
}
}
Hashtable<Coordinates, String> testTable = new Hashtable<Coordinates, String>();
Coordinates testKey = new Coordinates(3, 1);
testTable.put(testKey, "Hello!");
testTable.get(testKey); //This will return the "Hello" String as expected.
testTable.get(new Coordinates(3, 1)); //This will only return a null value.
However, get() doesn't work as it's supposed to. It seems to work only if you literally pass it the exact same object that was used as the original key.
Is there any way to correct this and get the Hashtable to function the way it's described in the documentation? Do I need to make any adjustments to the custom equals() operation in the Coordinates class?
To be able to store and retrieve objects from hash-based collections, you should override both the equals() and hashCode() methods of the Object class. In your case, you have overridden equals() and left hashCode() with its default implementation inherited from Object.
Here is the general contract of the hashCode() method you must consider while implementing it:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
And here is an example implementation generated by my IDE (as already mentioned by @Peter in the comment area), which you can modify to suit your requirements:
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ex;
result = prime * result + why;
return result;
}
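With a hashCode() like the one above in place, a freshly constructed key should find the stored value. The following is a self-contained sketch of that check; it uses Objects.hash instead of the IDE-generated arithmetic, and the wrapper class CoordinatesDemo and the method lookupWithFreshKey are made-up names.

```java
import java.util.Hashtable;
import java.util.Objects;

class CoordinatesDemo {
    static final class Coordinates {
        private final int ex;
        private final int why;

        Coordinates(int xCo, int yCo) {
            ex = xCo;
            why = yCo;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Coordinates)) {
                return false;
            }
            Coordinates c = (Coordinates) o;
            return ex == c.ex && why == c.why;
        }

        // Built from the same fields that equals() compares.
        @Override public int hashCode() {
            return Objects.hash(ex, why);
        }
    }

    static String lookupWithFreshKey() {
        Hashtable<Coordinates, String> testTable = new Hashtable<>();
        testTable.put(new Coordinates(3, 1), "Hello!");
        // A different but equal key object now lands in the same bucket.
        return testTable.get(new Coordinates(3, 1));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithFreshKey()); // Hello!
    }
}
```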
I am trying to write a custom hashCode fn, but I am not able to figure out the correct way to do that.
public class Person {
String name;
List<String> attributes;
@Override
public boolean equals(Object o) {
// Persons are equal if name is equal & if >= 2 of attributes are equal
// This I have implemented
}
@Override
public int hashCode() {
final int PRIME = 59;
int result = 1;
result = (result*PRIME) + (this.name == null ? 0 : this.name.hashCode());
//Not sure what to do here to account for attributes
return result;
}
}
I want the hashCode fn to be such that:
"If object1 and object2 are equal according to their equals() method, they must also have the same hash code"
Not sure how to do that?
As Oli points out in the comments, you cannot solve this by implementing equals() and relying on a Set to de-duplicate for you. Weird things could happen.
Thus you must resort to coding this yourself. Add the first item from your list into your new de-duplicated list. Then for each remaining item in your original list, compare it with those already present in your de-duplicated list and only add it if it passes your non-duplicate test.
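The manual procedure just described could look roughly like this. It is a sketch, not the asker's code: the dedupe helper and its isDuplicate predicate are hypothetical, and a case-insensitive string comparison stands in for the real duplicate test.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

class ManualDedupe {
    // Keeps an item only if no previously kept item counts as its duplicate.
    static <T> List<T> dedupe(List<T> items, BiPredicate<T, T> isDuplicate) {
        List<T> kept = new ArrayList<>();
        for (T candidate : items) {
            boolean alreadyPresent = false;
            for (T existing : kept) {
                if (isDuplicate.test(existing, candidate)) {
                    alreadyPresent = true;
                    break;
                }
            }
            if (!alreadyPresent) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "Ann", "bob");
        System.out.println(dedupe(names, String::equalsIgnoreCase)); // [ann, bob]
    }
}
```

This runs in quadratic time, which is the price of not relying on hashing. If the duplicate test is not transitive, the result also depends on the order of the input list.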
The easiest way to fulfill the contract of the equals/hashCode methods is to return a constant (at the cost of putting every object into the same bucket):
@Override
public int hashCode() {return 13;}
Otherwise your solution with a hash code based only on name will work.
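A name-only hash code satisfies the contract because equal persons always share a name, whereas their attribute lists may differ. Here is a sketch of that idea; the equals() body is a guess at what the asker implemented, not their actual code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class Person {
    final String name;
    final List<String> attributes;

    Person(String name, List<String> attributes) {
        this.name = name;
        this.attributes = attributes;
    }

    // Hypothetical stand-in for the asker's equals(): same name and at least two shared attributes.
    @Override public boolean equals(Object o) {
        if (!(o instanceof Person)) {
            return false;
        }
        Person other = (Person) o;
        Set<String> shared = new HashSet<>(attributes);
        shared.retainAll(other.attributes);
        return Objects.equals(name, other.name) && shared.size() >= 2;
    }

    // Uses only the name, the one thing two equal persons are guaranteed to have in common.
    @Override public int hashCode() {
        return Objects.hashCode(name);
    }

    public static void main(String[] args) {
        Person a = new Person("Ann", Arrays.asList("tall", "kind", "left-handed"));
        Person b = new Person("Ann", Arrays.asList("kind", "tall", "quiet"));
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

Note that this only fixes the hash code side. An "at least two shared attributes" rule is not transitive, so hash-based collections can still behave unexpectedly with such keys.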
I have a problem when retrieving values from a hashmap. The hashmap is declared as follows:
HashMap<TRpair,A> aTable = new HashMap<TRpair,A>();
I then put 112 values into the map as follows:
aTable.put(new TRpair(new T(<value>),new Integer(<value>)),new Ai());
where Ai is any one of 4 subclasses that extend A.
I then proceed to check what values are in the map, as follows:
int i = 0;
for (Map.Entry<TRpair,A> entry : aTable.entrySet()) {
System.out.println(entry.getKey().toString() + " " + entry.getValue().toString());
System.out.println(entry.getKey().equals(new TRpair(new T("!"),new Integer(10))));
i++;
}
i holds the value 112 at the end, as one would expect and the equality test prints true for exactly one entry, as expected.
However, when I do
System.out.println(aTable.get(new TRpair(new T("!"), new Integer(10))));
null is output, despite the above code snippet confirming that there is indeed one entry in the map with exactly this key.
If it helps, the class TRpair is declared as follows:
public class TRpair {
private final T t;
private final Integer r;
protected TRpair(Integer r1, T t1) {
t = t1;
r = r1;
}
protected TRpair(T t1, Integer r1) {
t = t1;
r = r1;
}
@Override
public boolean equals(Object o) {
TRpair p = (TRpair)o;
return (p.t.equals(t)) && (p.r.equals(r));
}
@Override
public String toString() {
StringBuilder sbldr = new StringBuilder();
sbldr.append("(");
sbldr.append(t.toString());
sbldr.append(",");
sbldr.append(r.toString());
sbldr.append(")");
return sbldr.toString();
}
}
The equals() and toString() methods in each of the Ai (extending A) and in the T class are overridden similarly and appear to behave as expected.
Why is the value output from the hashmap aTable null, when previously it has been confirmed that the value for the corresponding key is indeed in the map?
With many thanks,
Froskoy.
The keys/elements of a hash collection must override hashCode() if equals() is overridden.
You could use:
public int hashCode() {
return t.hashCode() * 31 ^ r.hashCode();
}
BTW: It appears from your code that Integer r cannot be null in which case using int r makes more sense.
From Object.equals()
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
IIRC HashMap looks up by hashCode() first and only then checks equality. Since you did not implement hashCode(), you get the default implementation, which is consistent with object identity (pointer equality).
You need to implement a proper hashCode() function that takes the T field into account, as well as the Integer (or not).
It is good practice that hashCode() and equals() are consistent, but not strictly necessary if you know what you are doing.
Ok, I have heard from many places and sources that whenever I override the equals() method, I need to override the hashCode() method as well. But consider the following piece of code
package test;
public class MyCustomObject {
int intVal1;
int intVal2;
public MyCustomObject(int val1, int val2){
intVal1 = val1;
intVal2 = val2;
}
public boolean equals(Object obj){
return (((MyCustomObject)obj).intVal1 == this.intVal1) &&
(((MyCustomObject)obj).intVal2 == this.intVal2);
}
public static void main(String a[]){
MyCustomObject m1 = new MyCustomObject(3,5);
MyCustomObject m2 = new MyCustomObject(3,5);
MyCustomObject m3 = new MyCustomObject(4,5);
System.out.println(m1.equals(m2));
System.out.println(m1.equals(m3));
}
}
Here the output is true, false, exactly the way I want it to be, and I don't bother overriding the hashCode() method at all. This suggests that overriding hashCode() is optional rather than mandatory, as everyone says.
I want a second confirmation.
It works for you because your code does not use any functionality (HashMap, Hashtable) which needs the hashCode() API.
However, you don't know whether your class (presumably not written as a one-off) will later be used in code that does use its objects as hash keys, in which case things will be affected.
As per the documentation for Object class:
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Because HashMap/Hashtable will look up an object by hashCode() first.
If the hash codes are not the same, the map assumes the objects are not the same and reports that the key does not exist in the map.
The reason why you need to @Override neither or both is the way they interrelate with the rest of the API.
You'll find that if you put m1 into a HashSet<MyCustomObject>, then the set doesn't contains(m2). This is inconsistent behavior and can cause a lot of bugs and chaos.
The Java library has tons of functionalities. In order to make them work for you, you need to play by the rules, and making sure that equals and hashCode are consistent is one of the most important ones.
Most of the other comments already gave you the answer: you need to do it because there are collections (e.g. HashSet, HashMap) that use hashCode as an optimization to "index" object instances, and those optimizations expect that a.equals(b) ==> a.hashCode() == b.hashCode() (NOTE that the inverse doesn't hold).
As additional information, you can try this exercise:
class Box {
private String value;
/* some boring setters and getters for value */
public int hashCode() { return value.hashCode(); }
public boolean equals(Object obj) {
if (obj != null && getClass().equals(obj.getClass())) {
return ((Box) obj).value.equals(value);
} else { return false; }
}
}
Then do this:
Set<Box> s = new HashSet<Box>();
Box b = new Box();
b.setValue("hello");
s.add(b);
s.contains(b); // TRUE
b.setValue("other");
s.contains(b); // FALSE
s.iterator().next() == b // TRUE!!! b is in s but contains(b) returns false
What you learn from this example is that implementing equals or hashCode with properties that can be changed (mutable) is a really bad idea.
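One common way to avoid this trap is to make the fields used by equals and hashCode immutable. The sketch below shows an immutable variant of Box; the withValue method and the wrapper class ImmutableBoxDemo are made-up names for illustration.

```java
import java.util.HashSet;
import java.util.Set;

class ImmutableBoxDemo {
    static final class Box {
        private final String value; // final: the hash code cannot change after construction

        Box(String value) {
            this.value = value;
        }

        // Instead of mutating, return a new Box carrying the new value.
        Box withValue(String newValue) {
            return new Box(newValue);
        }

        @Override public int hashCode() {
            return value.hashCode();
        }

        @Override public boolean equals(Object obj) {
            return obj instanceof Box && ((Box) obj).value.equals(value);
        }
    }

    static boolean originalStillFound() {
        Set<Box> s = new HashSet<>();
        Box b = new Box("hello");
        s.add(b);
        Box changed = b.withValue("other"); // a new object; b itself is untouched
        return s.contains(b) && !s.contains(changed);
    }

    public static void main(String[] args) {
        System.out.println(originalStillFound()); // true
    }
}
```

If a stored element really has to change, the alternative is to remove it from the set, modify it, and add it back.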
It is primarily important when searching for an object by its hashCode() value in a collection (e.g. HashMap, HashSet, etc.). By default, distinct objects generally return different hashCode() values even when they are equal according to equals(), so you must override this method to consistently generate a hash code based on the state of the object. This lets the collection locate values in the hash table.