Equals method in HashSet - java

I am trying to understand TreeSet. I have created an Employee object and am trying to add Employee objects to a TreeSet. To do this I created a class called sortName, which sorts the Employee objects by name, and I wrote an equals method too (just to understand the execution flow). I read that to keep the elements of a TreeSet in sorted order we must implement the Comparator interface and override two methods (compare and equals), although equals is optional. When I run the program it works, but I observed that the equals method is never invoked. Why is that?
Let's compare HashSet and TreeSet. In a HashSet, equals is checked only when the hash codes are the same. How does this work in a TreeSet?
Can anyone give me an example where the equals method is invoked for a TreeSet?
public int compare(Object Obj1, Object Obj2) {
System.out.println("compare");
if (Obj1 instanceof Employee19 && Obj2 instanceof Employee19) {
Employee19 emp1=(Employee19) Obj1;
Employee19 emp2=(Employee19) Obj2;
return emp1.sname.compareTo(emp2.sname);
}
return 0;
}
public boolean equals(Object obj){
System.out.println("equals");
return true;
}
I even checked this link, but it was not what I was looking for:
HashSet with two equals object?

The TreeSet doesn't need to use the equals method, because it gets all the information it needs from the Comparator's compare method (or the compareTo method if it is relying on its elements being Comparable). Either way, it treats two elements as equivalent when the appropriate method returns 0.
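As a rough illustration of the point above, the sketch below builds a TreeSet with a case-insensitive comparator. The class and helper names are hypothetical, and String stands in for the Employee class from the question. Two strings that are not equal according to equals are still treated as duplicates, because compare returns 0 for them.

```java
import java.util.TreeSet;

class TreeSetCompareDemo {
    // Hypothetical helper: builds a TreeSet ordered by a case-insensitive comparator.
    static TreeSet<String> namesIgnoringCase(String... names) {
        TreeSet<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (String name : names) {
            set.add(name);
        }
        return set;
    }

    public static void main(String[] args) {
        TreeSet<String> set = namesIgnoringCase("alice", "ALICE", "bob");
        // "alice".equals("ALICE") is false, but the comparator returns 0,
        // so the TreeSet rejects "ALICE" as a duplicate without calling equals.
        System.out.println(set.size()); // 2
        System.out.println(set);        // [alice, bob]
    }
}
```

This is also why the TreeSet documentation asks for an ordering that is consistent with equals: when the two disagree, the set follows the comparator.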

Related

How Set interface enforces no duplicates add and not preserving insertion order stipulations

Q.1) The documentation of AbstractSet says: "This class does not override any of the implementations from the AbstractCollection class." If it does not override or change add(Object o) or any other part of the Collection contract implemented by AbstractCollection, and merely inherits them, and the same goes for HashSet,
how do HashSet and other Set objects enforce stipulations such as the no-duplicates check or hash-table-style insertion, which are totally different from how List or other Collection objects add elements?
Q.2) The documentation for AbstractSet says that it merely adds implementations for equals and hashCode. However, the method details section says that these methods override equals and hashCode in class Object. Does AbstractSet only inherit these two methods without changing them? If so, what is the purpose of the AbstractSet class? Please clarify.
Q1: How does HashSet enforce duplicate checks?
If you take a look at the implementation in java.util.HashSet, you'll see the following code:-
private static final Object PRESENT = new Object();
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
What happens is fairly simple; we use a private HashMap instance, which takes our provided value and inserts it as the key of the HashMap. The map's PRESENT value is never actually used or retrieved, but it allows us to use this backing map to verify whether or not the item exists in the Set.
If our provided value does not yet exist in the map, the call to map.put() places it in the map and returns null, so add returns true. Otherwise, put() keeps the existing key, re-associates it with PRESENT, and returns the previous value (PRESENT, which is non-null), so add returns false. The HashMap is doing the hard work for the HashSet here.
This is different from the implementation inherited from AbstractCollection (whose add simply throws UnsupportedOperationException), hence the need to override.
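The return value of put can be checked with a small sketch. The class and helper names here are hypothetical, and a plain HashMap stands in for the private map field inside HashSet.

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnDemo {
    // Dummy value, playing the same role as PRESENT in HashSet.
    private static final Object PRESENT = new Object();

    // Same shape as HashSet.add: true only when the key was not already mapped.
    static boolean addLikeHashSet(Map<String, Object> map, String e) {
        return map.put(e, PRESENT) == null;
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        System.out.println(addLikeHashSet(map, "a")); // true: no previous mapping, put returned null
        System.out.println(addLikeHashSet(map, "a")); // false: put returned the previous value
        System.out.println(map.size());               // 1
    }
}
```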
Q2: AbstractSet's use of equals() & hashCode()
I think you have slightly misunderstood what AbstractSet is doing here. The purpose of AbstractSet is to provide a collection-safe implementation of equals and hashCode.
Equals checks are performed by verifying that we are comparing two Set objects, that they are of equal size, and that they contain the same items.
public boolean equals(Object o) {
if (o == this)
return true;
if (!(o instanceof Set))
return false;
Collection<?> c = (Collection<?>) o;
if (c.size() != size())
return false;
try {
return containsAll(c);
} catch (ClassCastException unused) {
return false;
} catch (NullPointerException unused) {
return false;
}
}
The hashCode is produced by looping over the Set instance, and hashing each item iteratively:
public int hashCode() {
int h = 0;
Iterator<E> i = iterator();
while (i.hasNext()) {
E obj = i.next();
if (obj != null)
h += obj.hashCode();
}
return h;
}
Any class extending AbstractSet will use this implementation of equals() and hashCode() unless it overrides them explicitly. This implementation takes precedence over the default equals and hashCode methods defined in java.lang.Object.
The documentation you linked is for Java 7, whereas I checked the Java 8 source and found the following, so it may not be identical in Java 7. Still, you can use the same approach of reading the source when the documentation isn't clear:
Q1: HashSet overrides the add method of AbstractCollection; you can easily check this by opening the HashSet source in an IDE. Just because a parent class doesn't override a method doesn't mean its children can't.
Q2: Again, checking the code shows that AbstractSet defines its own implementations of equals and hashCode. It also overrides the removeAll method of AbstractCollection.
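One visible consequence is that two different Set implementations holding the same elements compare equal and share a hash code. The sketch below assumes HashSet and TreeSet both inherit these methods from AbstractSet. The helper sumOfElementHashes is hypothetical and simply repeats the loop shown above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class AbstractSetEqualsDemo {
    // Hypothetical helper: recomputes the hash the same way AbstractSet.hashCode does.
    static int sumOfElementHashes(Set<?> set) {
        int h = 0;
        for (Object o : set) {
            if (o != null) {
                h += o.hashCode();
            }
        }
        return h;
    }

    public static void main(String[] args) {
        Set<String> hashSet = new HashSet<>(Arrays.asList("x", "y", "z"));
        Set<String> treeSet = new TreeSet<>(Arrays.asList("z", "y", "x"));
        // Same elements, different implementations and iteration orders.
        System.out.println(hashSet.equals(treeSet));                           // true
        System.out.println(hashSet.hashCode() == treeSet.hashCode());          // true
        System.out.println(hashSet.hashCode() == sumOfElementHashes(hashSet)); // true
    }
}
```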

Sort ArrayList using Comparable

I'm working on a project where I need to be able to sort an ArrayList of Car objects by price. In my Car class, I have
public class Car implements Comparable
and in the body of the code is the compareTo method:
public int compareTo(Object o)
{
Car rhs = (Car)o;
if (price > rhs.price)
return 1;
else if (price < rhs.price)
return -1;
else
return 0;
}
I just don't understand how to implement this method to sort by price- what does carList need to be compared to? I know this isn't correct but so far this is the sorting method.
public void sortByPrice()
{
Collections.sort(carList.compareTo(o));
}
Two problems: one syntactical and one conceptual.
The first issue is that while your compareTo is technically correct, you want to type-bind it to Car instead of Object.
public class Car implements Comparable<Car>
Inside your compareTo method you'd then substitute Car for Object. You would also want to check for null.
The second is that sortByPrice sounds specific, but since compareTo is comparing based on price, that's somewhat okay.
All you'd need to do is call Collections.sort on the actual collection:
Collections.sort(carList);
Normally, one sorts a collection using
Collections.sort(collection)
where the elements of collection have to implement Comparable; sort uses their compareTo method to order them.
Your Car class must implement Comparable<Car>. Then your compareTo method will have signature:
public int compareTo(Car other) {}
As per the documentation, this method should:
Returns a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified object.
Then given a List<Car>, say list, you can call Collections.sort(list).
You're almost done! Just call Collections.sort(carList); and it will use your compareTo by itself.
Note that there is no default compareTo to fall back on: if Car did not implement Comparable, Collections.sort(carList) would not compile for a List<Car>, and with raw types it would throw a ClassCastException at runtime.
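Putting the advice from these answers together, a minimal sketch might look like the following. The name field, the double type for price, and the CarSortDemo helper are assumptions made for illustration; the original Car class is not shown in full.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class Car implements Comparable<Car> {
    final String name;  // assumed field, used only to identify cars in the output
    final double price; // assumed type; the question does not show the declaration

    Car(String name, double price) {
        this.name = name;
        this.price = price;
    }

    @Override
    public int compareTo(Car other) {
        // Negative, zero, or positive as this car is cheaper, equal, or pricier.
        return Double.compare(price, other.price);
    }
}

class CarSortDemo {
    // Hypothetical helper: returns a copy of the list sorted by price, ascending.
    static List<Car> sortedByPrice(List<Car> cars) {
        List<Car> copy = new ArrayList<>(cars);
        Collections.sort(copy); // uses Car.compareTo
        return copy;
    }

    public static void main(String[] args) {
        List<Car> carList = Arrays.asList(
                new Car("sedan", 20000), new Car("compact", 12000), new Car("suv", 35000));
        for (Car car : sortedByPrice(carList)) {
            System.out.println(car.name); // compact, sedan, suv (one per line)
        }
    }
}
```

Double.compare replaces the three-way if/else from the question and behaves the same for ordinary prices.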

Java (ArrayList check with object's int)

I have to make an ArrayList of objects, where each object has one int for the year, let's say 1,
and I don't want another object with the same year 1.
If one object has the int = 1, I don't want another object with that int (1) in my list.
I want to reject it.
Should I try using equals?
Something like
@Override
public boolean equals(Object o){
Object object = (Object)o;
return this.getInt.equals(object.getInt());
}
Either use a Set...which explicitly disallows duplicates, or check if the list contains the element on insertion.
@Override
public boolean add(T element) {
if(contains(element)) {
return false;
} else {
return super.add(element);
}
}
Overriding equals wouldn't get you very far, as you'd be overriding it for the List itself (i.e. you'd be checking if two lists were equal).
Perhaps you can try using a HashMap that links that int to the object. That could be:
Map<Integer, Object> map = new HashMap<>();
map.put(object.getInt(), object);
...
//Each time you put a new object you could try this:
if(!map.containsKey(object.getInt()))
map.put(object.getInt(), object);
//And you can retrieve your object by an int
int a = 1;
Object obj = map.get(a);
In this case, as the value is of type int, you can use the == operator.
public boolean equals(Object o){
if (!(o instanceof YourClass)) return false;
YourClass object = (YourClass)o;
return (this.getInt()==object.getInt());
}
For this kind of requirement, an ArrayList is not recommended. As mentioned in the other answers, try using a HashMap.
Yes, you can. When you call
myArrayList.contains(myObject);
the ArrayList will invoke myObject's equals method, so you can tell whether the object is already in your list.
And I think you can change your method a little,
@Override
public boolean equals(Object o){
if (!(o instanceof YourClass))
return false;
YourClass object = (YourClass)o;
return this.getInt() == object.getInt();
}
because if you don't, the cast can throw a ClassCastException when o is not a YourClass.
Well, that is one way to approach the problem.
Your equals will probably work provided that you change Object object = (Object)o; to cast to the real class.
However, equals ought to cope with the case where o is not of the expected type. The contract requires you should return false rather than throwing a ClassCastException ...
You would then use list.contains(o) to test if an object with the same int value exists in the list. For example:
if (!list.contains(o)) {
list.add(o);
}
But when you override equals, it is best practice to also override hashCode ... so that your class continues to satisfy the equals / hashCode invariants. (If you neglect to do that, hash-based data structures will break for your class.)
However, this won't scale well, because the contains operation on an ArrayList has to test each element in the list, one at a time. As the list gets longer, the contains call takes longer ... in direct proportion; i.e. O(N) ... using Big O complexity notation.
So it may be better to use a Set implementation of some kind instead of an ArrayList. Depending on which set implementation you choose, you will get complexity of O(1) or O(logN). But the catch is that you will either have to implement hashCode (for a HashSet or LinkedHashSet), or implement either Comparable or a Comparator (for a TreeSet).
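A minimal sketch of the Set-based approach follows, assuming a hypothetical YearItem class with a single int field (the question does not name its class). LinkedHashSet is used here so that insertion order is preserved, which is the closest match to list-like behavior.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical class standing in for the object from the question.
class YearItem {
    private final int year;

    YearItem(int year) {
        this.year = year;
    }

    int getYear() {
        return year;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof YearItem)) return false; // also covers null
        return year == ((YearItem) o).year;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(year); // consistent with equals
    }
}

class UniqueYearDemo {
    public static void main(String[] args) {
        Set<YearItem> items = new LinkedHashSet<>();
        System.out.println(items.add(new YearItem(1))); // true
        System.out.println(items.add(new YearItem(1))); // false: same year is rejected
        System.out.println(items.add(new YearItem(2))); // true
        System.out.println(items.size());               // 2
    }
}
```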

Understanding HashSet

Well, here is my question: can HashSet objects contain duplicate elements?
If I read the Set Interface definition, I see:
A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction.
And now we are going to write a simple example:
Define class A:
public class A {
@Override
public boolean equals(Object obj) {
return true;
}
}
Now execute this code;
Set<A> set = new HashSet<A>();
set.add(new A());
set.add(new A());
System.out.println(set.toString());
And this is the result:
[com.maths.graphs.A@b9e9a3, com.maths.graphs.A@18806f7]
Why does a class that implements the Set interface, like HashSet, contain duplicate elements?
Thanks!!
You have broken the equals-hashcode contract.
If you override the equals method you must also override the hashCode() method such that:
Two objects which are equal give the same hash, and preferably unequal
objects are highly likely to give different hashcodes
This is important because many classes (unsurprisingly including HashSet) use the hash code as a quick, efficient early step to eliminate unequal objects. This is what has happened here: the hash codes of the different As differ, because A still uses the implementation of hashCode() provided by Object.
If you were to create the class A as follows, it would not allow more than one A in the set:
public class A {
@Override
public boolean equals(Object obj) {
return true;
}
@Override
public int hashCode() {
int hash = 1; //any number since in this case all objects of class A are equal to everything
return hash;
}
}
From the javadoc
public int hashCode()
Returns a hash code value for the object. This method is supported for the benefit of hash tables such as those provided by HashMap.
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
Most IDEs will warn you if you override equals without also overriding hashCode, and can generate a hashCode method for you.
Notes
Strictly speaking my hashCode() method doesn't completely satisfy the contract. Since A#equals(Object obj) equals anything including objects which are not of type A it is impossible to fully satisfy the contract. Ideally the equals method would be changed to the following as well to cover all bases
@Override
public boolean equals(Object obj) {
if (obj instanceof A){
return true;
}else{
return false;
}
}
Here the HashSet does not contain duplicates as far as it can tell: the two add calls insert two new, distinct objects, and because A does not override hashCode, their hash codes differ. Try changing the code to:
Set<A> set = new HashSet<A>();
A a = new A();
set.add(a);
set.add(a);
System.out.println(set.toString());
and you will see that there is only one value in the set.
Or just add the following to your code and check
@Override
public int hashCode() {
return 31;
}
You have violated the hashCode() contract, i.e. objects that are equal must return the same hash code every time.
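The "hash first, equals second" behavior described in these answers can be made visible with a deliberately broken class. In this sketch the hypothetical FixedHashKey considers every instance equal to every other, but takes its hash code from the constructor, so the contract is violated on purpose whenever two instances are given different hashes.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class, intentionally violating the equals/hashCode contract
// when two instances are constructed with different hash values.
class FixedHashKey {
    private final int hash;

    FixedHashKey(int hash) {
        this.hash = hash;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof FixedHashKey; // all FixedHashKeys are "equal"
    }

    @Override
    public int hashCode() {
        return hash;
    }
}

class HashFirstDemo {
    // Adds two keys with the given hash codes and reports the resulting set size.
    static int sizeAfterAdding(int firstHash, int secondHash) {
        Set<FixedHashKey> set = new HashSet<>();
        set.add(new FixedHashKey(firstHash));
        set.add(new FixedHashKey(secondHash));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAdding(1, 2)); // 2: hashes differ, equals is never consulted
        System.out.println(sizeAfterAdding(7, 7)); // 1: hashes match, equals confirms the duplicate
    }
}
```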

Why should I override hashCode() when I override equals() method?

Ok, I have heard from many places and sources that whenever I override the equals() method, I need to override the hashCode() method as well. But consider the following piece of code
package test;
public class MyCustomObject {
int intVal1;
int intVal2;
public MyCustomObject(int val1, int val2){
intVal1 = val1;
intVal2 = val2;
}
public boolean equals(Object obj){
return (((MyCustomObject)obj).intVal1 == this.intVal1) &&
(((MyCustomObject)obj).intVal2 == this.intVal2);
}
public static void main(String a[]){
MyCustomObject m1 = new MyCustomObject(3,5);
MyCustomObject m2 = new MyCustomObject(3,5);
MyCustomObject m3 = new MyCustomObject(4,5);
System.out.println(m1.equals(m2));
System.out.println(m1.equals(m3));
}
}
Here the output is true, false, exactly the way I want it to be, and I did not have to override hashCode() at all. This suggests that overriding hashCode() is optional rather than mandatory, as everyone says.
I want a second confirmation.
It works for you because your code does not use any functionality (HashMap, Hashtable) which needs the hashCode() API.
However, you don't know whether your class (presumably not written as a one-off) will later be used by code that does use its objects as hash keys, in which case things will break.
As per the documentation for Object class:
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
Because HashMap/Hashtable looks up an object by its hashCode() first.
If the hash codes are not the same, the map concludes that the objects are not the same and reports that the key does not exist in the map.
The reason why you need to override neither or both is the way they interrelate with the rest of the API.
You'll find that if you put m1 into a HashSet<MyCustomObject>, then set.contains(m2) returns false. This is inconsistent behavior and can cause a lot of bugs and chaos.
The Java library has tons of functionalities. In order to make them work for you, you need to play by the rules, and making sure that equals and hashCode are consistent is one of the most important ones.
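For comparison, here is a sketch of the class from the question with hashCode added and equals made type-safe. Using Objects.hash to combine the fields is one common choice, not the only one, and the MyCustomObjectDemo class is hypothetical.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MyCustomObject {
    final int intVal1;
    final int intVal2;

    MyCustomObject(int val1, int val2) {
        intVal1 = val1;
        intVal2 = val2;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof MyCustomObject)) return false; // no ClassCastException
        MyCustomObject other = (MyCustomObject) obj;
        return intVal1 == other.intVal1 && intVal2 == other.intVal2;
    }

    @Override
    public int hashCode() {
        // Built from the same fields that equals compares.
        return Objects.hash(intVal1, intVal2);
    }
}

class MyCustomObjectDemo {
    public static void main(String[] args) {
        MyCustomObject m1 = new MyCustomObject(3, 5);
        MyCustomObject m2 = new MyCustomObject(3, 5);
        MyCustomObject m3 = new MyCustomObject(4, 5);
        Set<MyCustomObject> set = new HashSet<>();
        set.add(m1);
        System.out.println(set.contains(m2)); // true: equal fields, equal hash codes
        System.out.println(set.contains(m3)); // false
    }
}
```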
Most of the other answers already gave you the reason: you need to do it because there are collections (e.g. HashSet, HashMap) that use hashCode as an optimization to "index" object instances, and those optimizations expect that a.equals(b) ==> a.hashCode() == b.hashCode() (NOTE that the converse doesn't hold).
As additional information, you can try this exercise:
class Box {
private String value;
/* some boring setters and getters for value */
public int hashCode() { return value.hashCode(); }
public boolean equals(Object obj) {
if (obj != null && getClass().equals(obj.getClass())) {
return ((Box) obj).value.equals(value);
} else { return false; }
}
}
Then do this:
Set<Box> s = new HashSet<Box>();
Box b = new Box();
b.setValue("hello");
s.add(b);
s.contains(b); // TRUE
b.setValue("other");
s.contains(b); // FALSE
s.iterator().next() == b // TRUE!!! b is in s but contains(b) returns false
What you learn from this example is that implementing equals or hashCode with properties that can be changed (mutable) is a really bad idea.
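One way to avoid that trap is to make the fields used by equals and hashCode immutable. The sketch below uses a hypothetical ImmutableBox whose value is final; "changing" the value produces a new object instead of mutating one that may already sit in a set.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical immutable variant of the Box class above.
final class ImmutableBox {
    private final String value;

    ImmutableBox(String value) {
        this.value = value;
    }

    // Returns a new box rather than modifying this one.
    ImmutableBox withValue(String newValue) {
        return new ImmutableBox(newValue);
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == null || getClass() != obj.getClass()) return false;
        return Objects.equals(value, ((ImmutableBox) obj).value);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(value); // can never change after construction
    }
}

class ImmutableBoxDemo {
    public static void main(String[] args) {
        Set<ImmutableBox> s = new HashSet<>();
        ImmutableBox b = new ImmutableBox("hello");
        s.add(b);
        ImmutableBox other = b.withValue("other");
        System.out.println(s.contains(b));     // true: b is unchanged and still findable
        System.out.println(s.contains(other)); // false: a different value was never added
    }
}
```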
It is primarily important when searching for an object by its hashCode() value in a collection (e.g. HashMap, HashSet, etc.). By default, distinct objects typically return different hashCode() values, so you must override this method to generate a hash code consistently from the state of the object; this lets the collection locate equal values in the hash table.
