An Easier Way to Detect Differences Between Objects? (Java) - java

For instance, I have an arraylist called StudentListA and StudentListB that contains many students.
StudentListA = {Student 1, Student 2, Student 3....}
StudentListB = {Student A, Student B, Student C....}
In each student, they have their own attributes such as name, address, gpa etc.
How do I compare if Student 1 has the same attribute value with Student A, and so on.
For now I am thinking of something like this:
int i = 0;
for (Student student : StudentListA){
if(student.getName().equals(studentListB.get(i).getName() &&
student.getAddress().equals(studentListB.get(i).getAddress()....){
//Do smtg
}
i++;
}
Is there an easier way to do this? Because I have quite a handful of attributes. I want to know if the first and second list have the exact same students or not.

You should move comparison of attributes to Student#equals method.
Then you will use Student#equals in the cycle:
for (Student student : StudentListA){
if(student.equals(studentListB.get(i))){
//Do smtg
}
i++;
}
You can look at Lombock EqualsAndHashCode - it would be easiest way to generate the #equals method.
If you aren't allowed to use lombock, then you will have to write the Student#equals yourself.
In real world it's rare to write implementation for #equals because IDE can help you to generate it. E.g. this is what Idea generated for me:
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Student student = (Student) o;
if (this.name != null ? !name.equals(student.name) : student.name != null) return false;
return true;
}

Is there an easier way to do this? Because I have quite a handful of attributes.
There is no shortcut in Java itself for testing whether two distinct objects have all corresponding attributes equal to each other. You can, however, write a method that performs such a test, so that you can reuse it, and so that your code is better factored. You can also choose that as your definition of value equality for objects of a given type, by making the method implementing that comparison an override of Object.equals().
If you do override Object.equals() then be sure to override Object.hashcode() as well, to maintain consistency with equals(). Objects that test equal to each other should have the same hash code, though the reverse is not necessarily true.

As suggested in other post, Key implementation is to override equals and hashCode, that is must. This can be easily done code generation using IDE/eclipse.
Implements the java.lang.Comparable interface
Override the compareTo method
//sample implementation,
public int compareTo(Student o){
if (o.hashCode() < this.hashCode()){return -1}
else if(o.hashCode() > this.hashCode()) {return 1}
return 0;
}
Use Collections.sort for sorting
Now you have two sorted list, you can use easily compare two sorted list.

Related

Why doesn't distinct work for a custom class in Java Streams? [duplicate]

Recently I read through this
Developer Works Document.
The document is all about defining hashCode() and equals() effectively and correctly, however I am not able to figure out why we need to override these two methods.
How can I take the decision to implement these methods efficiently?
Joshua Bloch says on Effective Java
You must override hashCode() in every class that overrides equals(). Failure to do so will result in a violation of the general contract for Object.hashCode(), which will prevent your class from functioning properly in conjunction with all hash-based collections, including HashMap, HashSet, and Hashtable.
Let's try to understand it with an example of what would happen if we override equals() without overriding hashCode() and attempt to use a Map.
Say we have a class like this and that two objects of MyClass are equal if their importantField is equal (with hashCode() and equals() generated by eclipse)
public class MyClass {
private final String importantField;
private final String anotherField;
public MyClass(final String equalField, final String anotherField) {
this.importantField = equalField;
this.anotherField = anotherField;
}
#Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result
+ ((importantField == null) ? 0 : importantField.hashCode());
return result;
}
#Override
public boolean equals(final Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
final MyClass other = (MyClass) obj;
if (importantField == null) {
if (other.importantField != null)
return false;
} else if (!importantField.equals(other.importantField))
return false;
return true;
}
}
Imagine you have this
MyClass first = new MyClass("a","first");
MyClass second = new MyClass("a","second");
Override only equals
If only equals is overriden, then when you call myMap.put(first,someValue) first will hash to some bucket and when you call myMap.put(second,someOtherValue) it will hash to some other bucket (as they have a different hashCode). So, although they are equal, as they don't hash to the same bucket, the map can't realize it and both of them stay in the map.
Although it is not necessary to override equals() if we override hashCode(), let's see what would happen in this particular case where we know that two objects of MyClass are equal if their importantField is equal but we do not override equals().
Override only hashCode
If you only override hashCode then when you call myMap.put(first,someValue) it takes first, calculates its hashCode and stores it in a given bucket. Then when you call myMap.put(second,someOtherValue) it should replace first with second as per the Map Documentation because they are equal (according to the business requirement).
But the problem is that equals was not redefined, so when the map hashes second and iterates through the bucket looking if there is an object k such that second.equals(k) is true it won't find any as second.equals(first) will be false.
Hope it was clear
Collections such as HashMap and HashSet use a hashcode value of an object to determine how it should be stored inside a collection, and the hashcode is used again in order to locate the object
in its collection.
Hashing retrieval is a two-step process:
Find the right bucket (using hashCode())
Search the bucket for the right element (using equals() )
Here is a small example on why we should overrride equals() and hashcode().
Consider an Employee class which has two fields: age and name.
public class Employee {
String name;
int age;
public Employee(String name, int age) {
this.name = name;
this.age = age;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public int getAge() {
return age;
}
public void setAge(int age) {
this.age = age;
}
#Override
public boolean equals(Object obj) {
if (obj == this)
return true;
if (!(obj instanceof Employee))
return false;
Employee employee = (Employee) obj;
return employee.getAge() == this.getAge()
&& employee.getName() == this.getName();
}
// commented
/* #Override
public int hashCode() {
int result=17;
result=31*result+age;
result=31*result+(name!=null ? name.hashCode():0);
return result;
}
*/
}
Now create a class, insert Employee object into a HashSet and test whether that object is present or not.
public class ClientTest {
public static void main(String[] args) {
Employee employee = new Employee("rajeev", 24);
Employee employee1 = new Employee("rajeev", 25);
Employee employee2 = new Employee("rajeev", 24);
HashSet<Employee> employees = new HashSet<Employee>();
employees.add(employee);
System.out.println(employees.contains(employee2));
System.out.println("employee.hashCode(): " + employee.hashCode()
+ " employee2.hashCode():" + employee2.hashCode());
}
}
It will print the following:
false
employee.hashCode(): 321755204 employee2.hashCode():375890482
Now uncomment hashcode() method , execute the same and the output would be:
true
employee.hashCode(): -938387308 employee2.hashCode():-938387308
Now can you see why if two objects are considered equal, their hashcodes must
also be equal? Otherwise, you'd never be able to find the object since the default
hashcode method in class Object virtually always comes up with a unique number
for each object, even if the equals() method is overridden in such a way that two
or more objects are considered equal. It doesn't matter how equal the objects are if
their hashcodes don't reflect that. So one more time: If two objects are equal, their
hashcodes must be equal as well.
You must override hashCode() in every
class that overrides equals(). Failure
to do so will result in a violation of
the general contract for
Object.hashCode(), which will prevent
your class from functioning properly
in conjunction with all hash-based
collections, including HashMap,
HashSet, and Hashtable.
   from Effective Java, by Joshua Bloch
By defining equals() and hashCode() consistently, you can improve the usability of your classes as keys in hash-based collections. As the API doc for hashCode explains: "This method is supported for the benefit of hashtables such as those provided by java.util.Hashtable."
The best answer to your question about how to implement these methods efficiently is suggesting you to read Chapter 3 of Effective Java.
Why we override equals() method
In Java we can not overload how operators like ==, +=, -+ behave. They are behaving a certain way. So let's focus on the operator == for our case here.
How operator == works.
It checks if 2 references that we compare point to the same instance in memory. Operator == will resolve to true only if those 2 references represent the same instance in memory.
So now let's consider the following example
public class Person {
private Integer age;
private String name;
..getters, setters, constructors
}
So let's say that in your program you have built 2 Person objects on different places and you wish to compare them.
Person person1 = new Person("Mike", 34);
Person person2 = new Person("Mike", 34);
System.out.println ( person1 == person2 ); --> will print false!
Those 2 objects from business perspective look the same right? For JVM they are not the same. Since they are both created with new keyword those instances are located in different segments in memory. Therefore the operator == will return false
But if we can not override the == operator how can we say to JVM that we want those 2 objects to be treated as same. There comes the .equals() method in play.
You can override equals() to check if some objects have same values for specific fields to be considered equal.
You can select which fields you want to be compared. If we say that 2 Person objects will be the same if and only if they have the same age and same name, then the IDE will create something like the following for automatic generation of equals()
#Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Person person = (Person) o;
return age == person.age &&
name.equals(person.name);
}
Let's go back to our previous example
Person person1 = new Person("Mike", 34);
Person person2 = new Person("Mike", 34);
System.out.println ( person1 == person2 ); --> will print false!
System.out.println ( person1.equals(person2) ); --> will print true!
So we can not overload == operator to compare objects the way we want but Java gave us another way, the equals() method, which we can override as we want.
Keep in mind however, if we don't provide our custom version of .equals() (aka override) in our class then the predefined .equals() from Object class and == operator will behave exactly the same.
Default equals() method which is inherited from Object will check whether both compared instances are the same in memory!
Why we override hashCode() method
Some Data Structures in java like HashSet, HashMap store their elements based on a hash function which is applied on those elements. The hashing function is the hashCode()
If we have a choice of overriding .equals() method then we must also have a choice of overriding hashCode() method. There is a reason for that.
Default implementation of hashCode() which is inherited from Object considers all objects in memory unique!
Let's get back to those hash data structures. There is a rule for those data structures.
HashSet can not contain duplicate values and HashMap can not contain duplicate keys
HashSet is implemented with a HashMap behind the scenes where each value of a HashSet is stored as a key in a HashMap.
So we have to understand how a HashMap works.
In a simple way a HashMap is a native array that has some buckets. Each bucket has a linkedList. In that linkedList our keys are stored. HashMap locates the correct linkedList for each key by applying hashCode() method and after that it iterates through all elements of that linkedList and applies equals() method on each of these elements to check if that element is already contained there. No duplicate keys are allowed.
When we put something inside a HashMap, the key is stored in one of those linkedLists. In which linkedList that key will be stored is shown by the result of hashCode() method on that key. So if key1.hashCode() has as a result 4, then that key1 will be stored on the 4th bucket of the array, in the linkedList that exists there.
By default hashCode() method returns a different result for each different instance. If we have the default equals() which behaves like == which considers all instances in memory as different objects we don't have any problem.
But in our previous example we said we want Person instances to be considered equal if their ages and names match.
Person person1 = new Person("Mike", 34);
Person person2 = new Person("Mike", 34);
System.out.println ( person1.equals(person2) ); --> will print true!
Now let's create a map to store those instances as keys with some string as pair value
Map<Person, String> map = new HashMap();
map.put(person1, "1");
map.put(person2, "2");
In Person class we have not overridden the hashCode method but we have overridden equals method. Since the default hashCode provides different results for different java instances person1.hashCode() and person2.hashCode() have big chances of having different results.
Our map might end with those persons in different linkedLists.
This is against the logic of a HashMap
A HashMap is not allowed to have multiple equal keys!
But ours now has and the reason is that the default hashCode() which was inherited from Object Class was not enough. Not after we have overridden the equals() method on Person Class.
That is the reason why we must override hashCode() method after we have overridden equals method.
Now let's fix that. Let's override our hashCode() method to consider the same fields that equals() considers, namely age, name
public class Person {
private Integer age;
private String name;
..getters, setters, constructors
#Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Person person = (Person) o;
return age == person.age &&
name.equals(person.name);
}
#Override
public int hashCode() {
return Objects.hash(name, age);
}
}
Now let's try again to save those keys in our HashMap
Map<Person, String> map = new HashMap();
map.put(person1, "1");
map.put(person2, "2");
person1.hashCode() and person2.hashCode() will definitely be the same. Let's say it is 0.
HashMap will go to bucket 0 and in that LinkedList will save the person1 as key with the value "1". For the second put HashMap is intelligent enough and when it goes again to bucket 0 to save person2 key with value "2" it will see that another equal key already exists there. So it will overwrite the previous key. So in the end only person2 key will exist in our HashMap.
Now we are aligned with the rule of Hash Map that says no multiple equal keys are allowed!
Identity is not equality.
equals operator == test identity.
equals(Object obj) method compares equality test(i.e. we need to tell equality by overriding the method)
Why do I need to override the equals and hashCode methods in Java?
First we have to understand the use of equals method.
In order to identity differences between two objects we need to override equals method.
For example:
Customer customer1=new Customer("peter");
Customer customer2=customer1;
customer1.equals(customer2); // returns true by JVM. i.e. both are refering same Object
------------------------------
Customer customer1=new Customer("peter");
Customer customer2=new Customer("peter");
customer1.equals(customer2); //return false by JVM i.e. we have two different peter customers.
------------------------------
Now I have overriden Customer class equals method as follows:
#Override
public boolean equals(Object obj) {
if (this == obj) // it checks references
return true;
if (obj == null) // checks null
return false;
if (getClass() != obj.getClass()) // both object are instances of same class or not
return false;
Customer other = (Customer) obj;
if (name == null) {
if (other.name != null)
return false;
} else if (!name.equals(other.name)) // it again using bulit in String object equals to identify the difference
return false;
return true;
}
Customer customer1=new Customer("peter");
Customer customer2=new Customer("peter");
Insteady identify the Object equality by JVM, we can do it by overring equals method.
customer1.equals(customer2); // returns true by our own logic
Now hashCode method can understand easily.
hashCode produces integer in order to store object in data structures like HashMap, HashSet.
Assume we have override equals method of Customer as above,
customer1.equals(customer2); // returns true by our own logic
While working with data structure when we store object in buckets(bucket is a fancy name for folder). If we use built-in hash technique, for above two customers it generates two different hashcode. So we are storing the same identical object in two different places. To avoid this kind of issues we should override the hashCode method also based on the following principles.
un-equal instances may have same hashcode.
equal instances should return same hashcode.
Simply put, the equals-method in Object check for reference equality, where as two instances of your class could still be semantically equal when the properties are equal. This is for instance important when putting your objects into a container that utilizes equals and hashcode, like HashMap and Set. Let's say we have a class like:
public class Foo {
String id;
String whatevs;
Foo(String id, String whatevs) {
this.id = id;
this.whatevs = whatevs;
}
}
We create two instances with the same id:
Foo a = new Foo("id", "something");
Foo b = new Foo("id", "something else");
Without overriding equals we are getting:
a.equals(b) is false because they are two different instances
a.equals(a) is true since it's the same instance
b.equals(b) is true since it's the same instance
Correct? Well maybe, if this is what you want. But let's say we want objects with the same id to be the same object, regardless if it's two different instances. We override the equals (and hashcode):
public class Foo {
String id;
String whatevs;
Foo(String id, String whatevs) {
this.id = id;
this.whatevs = whatevs;
}
#Override
public boolean equals(Object other) {
if (other instanceof Foo) {
return ((Foo)other).id.equals(this.id);
}
}
#Override
public int hashCode() {
return this.id.hashCode();
}
}
As for implementing equals and hashcode I can recommend using Guava's helper methods
Let me explain the concept in very simple words.
Firstly from a broader perspective we have collections, and hashmap is one of the datastructure in the collections.
To understand why we have to override the both equals and hashcode method, if need to first understand what is hashmap and what is does.
A hashmap is a datastructure which stores key value pairs of data in array fashion. Lets say a[], where each element in 'a' is a key value pair.
Also each index in the above array can be linked list thereby having more than one values at one index.
Now why is a hashmap used?
If we have to search among a large array then searching through each if them will not be efficient, so what hash technique tells us that lets pre process the array with some logic and group the elements based on that logic i.e. Hashing
EG: we have array 1,2,3,4,5,6,7,8,9,10,11 and we apply a hash function mod 10 so 1,11 will be grouped in together. So if we had to search for 11 in previous array then we would have to iterate the complete array but when we group it we limit our scope of iteration thereby improving speed. That datastructure used to store all the above information can be thought of as a 2d array for simplicity
Now apart from the above hashmap also tells that it wont add any Duplicates in it. And this is the main reason why we have to override the equals and hashcode
So when its said that explain the internal working of hashmap , we need to find what methods the hashmap has and how does it follow the above rules which i explained above
so the hashmap has method called as put(K,V) , and according to hashmap it should follow the above rules of efficiently distributing the array and not adding any duplicates
so what put does is that it will first generate the hashcode for the given key to decide which index the value should go in.if nothing is present at that index then the new value will be added over there, if something is already present over there then the new value should be added after the end of the linked list at that index. but remember no duplicates should be added as per the desired behavior of the hashmap. so lets say you have two Integer objects aa=11,bb=11.
As every object derived from the object class, the default implementation for comparing two objects is that it compares the reference and not values inside the object. So in the above case both though semantically equal will fail the equality test, and possibility that two objects which same hashcode and same values will exists thereby creating duplicates. If we override then we could avoid adding duplicates.
You could also refer to Detail working
import java.util.HashMap;
public class Employee {
String name;
String mobile;
public Employee(String name,String mobile) {
this.name = name;
this.mobile = mobile;
}
#Override
public int hashCode() {
System.out.println("calling hascode method of Employee");
String str = this.name;
int sum = 0;
for (int i = 0; i < str.length(); i++) {
sum = sum + str.charAt(i);
}
return sum;
}
#Override
public boolean equals(Object obj) {
// TODO Auto-generated method stub
System.out.println("calling equals method of Employee");
Employee emp = (Employee) obj;
if (this.mobile.equalsIgnoreCase(emp.mobile)) {
System.out.println("returning true");
return true;
} else {
System.out.println("returning false");
return false;
}
}
public static void main(String[] args) {
// TODO Auto-generated method stub
Employee emp = new Employee("abc", "hhh");
Employee emp2 = new Employee("abc", "hhh");
HashMap<Employee, Employee> h = new HashMap<>();
//for (int i = 0; i < 5; i++) {
h.put(emp, emp);
h.put(emp2, emp2);
//}
System.out.println("----------------");
System.out.println("size of hashmap: "+h.size());
}
}
hashCode() :
If you only override the hash-code method nothing happens, because it always returns a new hashCode for each object as an Object class.
equals() :
If you only override the equals method, if a.equals(b) is true it means the hashCode of a and b must be the same but that does not happen since you did not override the hashCode method.
Note : hashCode() method of Object class always returns a new hashCode for each object.
So when you need to use your object in the hashing based collection, you must override both equals() and hashCode().
Java puts a rule that
"If two objects are equal using Object class equals method, then the hashcode method should give the same value for these two objects."
So, if in our class we override equals() we should override hashcode() method also to follow this rule.
Both methods, equals() and hashcode(), are used in Hashtable, for example, to store values as key-value pairs. If we override one and not the other, there is a possibility that the Hashtable may not work as we want, if we use such object as a key.
Adding to #Lombo 's answer
When will you need to override equals() ?
The default implementation of Object's equals() is
public boolean equals(Object obj) {
return (this == obj);
}
which means two objects will be considered equal only if they have the same memory address which will be true only if you are
comparing an object with itself.
But you might want to consider two objects the same if they have the same value for one
or more of their properties (Refer the example given in #Lombo 's answer).
So you will override equals() in these situations and you would give your own conditions for equality.
I have successfully implemented equals() and it is working great.So why are they asking to override hashCode() as well?
Well.As long as you don't use "Hash" based Collections on your user-defined class,it is fine.
But some time in the future you might want to use HashMap or HashSet and if you don't override and "correctly implement" hashCode(), these Hash based collection won't work as intended.
Override only equals (Addition to #Lombo 's answer)
myMap.put(first,someValue)
myMap.contains(second); --> But it should be the same since the key are the same.But returns false!!! How?
First of all,HashMap checks if the hashCode of second is the same as first.
Only if the values are the same,it will proceed to check the equality in the same bucket.
But here the hashCode is different for these 2 objects (because they have different memory address-from default implementation).
Hence it will not even care to check for equality.
If you have a breakpoint inside your overridden equals() method,it wouldn't step in if they have different hashCodes.
contains() checks hashCode() and only if they are the same it would call your equals() method.
Why can't we make the HashMap check for equality in all the buckets? So there is no necessity for me to override hashCode() !!
Then you are missing the point of Hash based Collections.
Consider the following :
Your hashCode() implementation : intObject%9.
The following are the keys stored in the form of buckets.
Bucket 1 : 1,10,19,... (in thousands)
Bucket 2 : 2,20,29...
Bucket 3 : 3,21,30,...
...
Say,you want to know if the map contains the key 10.
Would you want to search all the buckets? or Would you want to search only one bucket?
Based on the hashCode,you would identify that if 10 is present,it must be present in Bucket 1.
So only Bucket 1 will be searched !!
Because if you do not override them you will be use the default implentation in Object.
Given that instance equality and hascode values generally require knowledge of what makes up an object they generally will need to be redefined in your class to have any tangible meaning.
In order to use our own class objects as keys in collections like HashMap, Hashtable etc.. , we should override both methods ( hashCode() and equals() ) by having an awareness on internal working of collection. Otherwise, it leads to wrong results which we are not expected.
class A {
int i;
// Hashing Algorithm
if even number return 0 else return 1
// Equals Algorithm,
if i = this.i return true else false
}
put('key','value') will calculate the hash value using hashCode() to determine the
bucket and uses equals() method to find whether the value is already
present in the Bucket. If not it will added else it will be replaced with current value
get('key') will use hashCode() to find the Entry (bucket) first and
equals() to find the value in Entry
if Both are overridden,
Map<A>
Map.Entry 1 --> 1,3,5,...
Map.Entry 2 --> 2,4,6,...
if equals is not overridden
Map<A>
Map.Entry 1 --> 1,3,5,...,1,3,5,... // Duplicate values as equals not overridden
Map.Entry 2 --> 2,4,6,...,2,4,..
If hashCode is not overridden
Map<A>
Map.Entry 1 --> 1
Map.Entry 2 --> 2
Map.Entry 3 --> 3
Map.Entry 4 --> 1
Map.Entry 5 --> 2
Map.Entry 6 --> 3 // Same values are Stored in different hasCodes violates Contract 1
So on...
HashCode Equal Contract
Two keys equal according to equal method should generate same hashCode
Two Keys generating same hashCode need not be equal (In above example all even numbers generate same hash Code)
1) The common mistake is shown in the example below.
public class Car {
private String color;
public Car(String color) {
this.color = color;
}
public boolean equals(Object obj) {
if(obj==null) return false;
if (!(obj instanceof Car))
return false;
if (obj == this)
return true;
return this.color.equals(((Car) obj).color);
}
public static void main(String[] args) {
Car a1 = new Car("green");
Car a2 = new Car("red");
//hashMap stores Car type and its quantity
HashMap<Car, Integer> m = new HashMap<Car, Integer>();
m.put(a1, 10);
m.put(a2, 20);
System.out.println(m.get(new Car("green")));
}
}
the green Car is not found
2. Problem caused by hashCode()
The problem is caused by the un-overridden method hashCode(). The contract between equals() and hashCode() is:
If two objects are equal, then they must have the same hash code.
If two objects have the same hash code, they may or may not be equal.
public int hashCode(){
return this.color.hashCode();
}
It is useful when using Value Objects. The following is an excerpt from the Portland Pattern Repository:
Examples of value objects are things
like numbers, dates, monies and
strings. Usually, they are small
objects which are used quite widely.
Their identity is based on their state
rather than on their object identity.
This way, you can have multiple copies
of the same conceptual value object.
So I can have multiple copies of an
object that represents the date 16 Jan
1998. Any of these copies will be equal to each other. For a small
object such as this, it is often
easier to create new ones and move
them around rather than rely on a
single object to represent the date.
A value object should always override
.equals() in Java (or = in Smalltalk).
(Remember to override .hashCode() as
well.)
Assume you have class (A) that aggregates two other (B) (C), and you need to store instances of (A) inside hashtable. Default implementation only allows distinguishing of instances, but not by (B) and (C). So two instances of A could be equal, but default wouldn't allow you to compare them in correct way.
Consider collection of balls in a bucket all in black color. Your Job is to color those balls as follows and use it for appropriate game,
For Tennis - Yellow, Red.
For Cricket - White
Now bucket has balls in three colors Yellow, Red and White. And that now you did the coloring Only you know which color is for which game.
Coloring the balls - Hashing.
Choosing the ball for game - Equals.
If you did the coloring and some one chooses the ball for either cricket or tennis they wont mind the color!!!
I was looking into the explanation " If you only override hashCode then when you call myMap.put(first,someValue) it takes first, calculates its hashCode and stores it in a given bucket. Then when you call myMap.put(first,someOtherValue) it should replace first with second as per the Map Documentation because they are equal (according to our definition)." :
I think 2nd time when we are adding in myMap then it should be the 'second' object like myMap.put(second,someOtherValue)
The methods equals and hashcode are defined in the object class. By default if the equals method returns true, then the system will go further and check the value of the hash code. If the hash code of the 2 objects is also same only then the objects will be considered as same. So if you override only equals method, then even though the overridden equals method indicates 2 objects to be equal , the system defined hashcode may not indicate that the 2 objects are equal. So we need to override hash code as well.
Equals and Hashcode methods in Java
They are methods of java.lang.Object class which is the super class of all the classes (custom classes as well and others defined in java API).
Implementation:
public boolean equals(Object obj)
public int hashCode()
public boolean equals(Object obj)
This method simply checks if two object references x and y refer to the same object. i.e. It checks if x == y.
It is reflexive: for any reference value x, x.equals(x) should return true.
It is symmetric: for any reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
It is transitive: for any reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
It is consistent: for any reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the object is modified.
For any non-null reference value x, x.equals(null) should return
false.
public int hashCode()
This method returns the hash code value for the object on which this method is invoked. This method returns the hash code value as an integer and is supported for the benefit of hashing based collection classes such as Hashtable, HashMap, HashSet etc. This method must be overridden in every class that overrides the equals method.
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified.
This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Equal objects must produce the same hash code as long as they are
equal, however unequal objects need not produce distinct hash codes.
Resources:
JavaRanch
Picture
If you override equals() and not hashcode(), you will not find any problem unless you or someone else uses that class type in a hashed collection like HashSet.
People before me have clearly explained the documented theory multiple times, I am just here to provide a very simple example.
Consider a class whose equals() need to mean something customized :-
public class Rishav {
private String rshv;
public Rishav(String rshv) {
this.rshv = rshv;
}
/**
* #return the rshv
*/
public String getRshv() {
return rshv;
}
/**
* #param rshv the rshv to set
*/
public void setRshv(String rshv) {
this.rshv = rshv;
}
#Override
public boolean equals(Object obj) {
if (obj instanceof Rishav) {
obj = (Rishav) obj;
if (this.rshv.equals(((Rishav) obj).getRshv())) {
return true;
} else {
return false;
}
} else {
return false;
}
}
#Override
public int hashCode() {
return rshv.hashCode();
}
}
Now consider this main class :-
import java.util.HashSet;
import java.util.Set;
public class TestRishav {
public static void main(String[] args) {
Rishav rA = new Rishav("rishav");
Rishav rB = new Rishav("rishav");
System.out.println(rA.equals(rB));
System.out.println("-----------------------------------");
Set<Rishav> hashed = new HashSet<>();
hashed.add(rA);
System.out.println(hashed.contains(rB));
System.out.println("-----------------------------------");
hashed.add(rB);
System.out.println(hashed.size());
}
}
This will yield the following output :-
true
-----------------------------------
true
-----------------------------------
1
I am happy with the results. But if I have not overridden hashCode(), it will cause nightmare as objects of Rishav with same member content will no longer be treated as unique as the hashCode will be different, as generated by default behavior, here's the would be output :-
true
-----------------------------------
false
-----------------------------------
2
In the example below, if you comment out the override for equals or hashcode in the Person class, this code will fail to look up Tom's order. Using the default implementation of hashcode can cause failures in hashtable lookups.
What I have below is a simplified code that pulls up people's order by Person. Person is being used as a key in the hashtable.
public class Person {
String name;
int age;
String socialSecurityNumber;
public Person(String name, int age, String socialSecurityNumber) {
this.name = name;
this.age = age;
this.socialSecurityNumber = socialSecurityNumber;
}
#Override
public boolean equals(Object p) {
//Person is same if social security number is same
if ((p instanceof Person) && this.socialSecurityNumber.equals(((Person) p).socialSecurityNumber)) {
return true;
} else {
return false;
}
}
#Override
public int hashCode() { //I am using a hashing function in String.java instead of writing my own.
return socialSecurityNumber.hashCode();
}
}
public class Order {
String[] items;
public void insertOrder(String[] items)
{
this.items=items;
}
}
import java.util.Hashtable;
public class Main {
public static void main(String[] args) {
Person p1=new Person("Tom",32,"548-56-4412");
Person p2=new Person("Jerry",60,"456-74-4125");
Person p3=new Person("Sherry",38,"418-55-1235");
Order order1=new Order();
order1.insertOrder(new String[]{"mouse","car charger"});
Order order2=new Order();
order2.insertOrder(new String[]{"Multi vitamin"});
Order order3=new Order();
order3.insertOrder(new String[]{"handbag", "iPod"});
Hashtable<Person,Order> hashtable=new Hashtable<Person,Order>();
hashtable.put(p1,order1);
hashtable.put(p2,order2);
hashtable.put(p3,order3);
//The line below will fail if Person class does not override hashCode()
Order tomOrder= hashtable.get(new Person("Tom", 32, "548-56-4412"));
for(String item:tomOrder.items)
{
System.out.println(item);
}
}
}
hashCode() method is used to get a unique integer for given object. This integer is used for determining the bucket location, when this object needs to be stored in some HashTable, HashMap like data structure. By default, Object’s hashCode() method returns and integer representation of memory address where object is stored.
The hashCode() method of objects is used when we insert them into a HashTable, HashMap or HashSet. More about HashTables on Wikipedia.org for reference.
To insert any entry in map data structure, we need both key and value. If both key and values are user define data types, the hashCode() of the key will be determine where to store the object internally. When require to lookup the object from the map also, the hash code of the key will be determine where to search for the object.
The hash code only points to a certain "area" (or list, bucket etc) internally. Since different key objects could potentially have the same hash code, the hash code itself is no guarantee that the right key is found. The HashTable then iterates this area (all keys with the same hash code) and uses the key's equals() method to find the right key. Once the right key is found, the object stored for that key is returned.
So, as we can see, a combination of the hashCode() and equals() methods are used when storing and when looking up objects in a HashTable.
NOTES:
Always use same attributes of an object to generate hashCode() and equals() both. As in our case, we have used employee id.
equals() must be consistent (if the objects are not modified, then it must keep returning the same value).
Whenever a.equals(b), then a.hashCode() must be same as b.hashCode().
If you override one, then you should override the other.
http://parameshk.blogspot.in/2014/10/examples-of-comparable-comporator.html
String class and wrapper classes have different implementation of equals() and hashCode() methods than Object class. equals() method of Object class compares the references of the objects, not the contents. hashCode() method of Object class returns distinct hashcode for every single object whether the contents are same.
It leads problem when you use Map collection and the key is of Persistent type, StringBuffer/builder type. Since they don't override equals() and hashCode() unlike String class, equals() will return false when you compare two different objects even though both have same contents. It will make the hashMap storing same content keys. Storing same content keys means it is violating the rule of Map because Map doesnt allow duplicate keys at all.
Therefore you override equals() as well as hashCode() methods in your class and provide the implementation(IDE can generate these methods) so that they work same as String's equals() and hashCode() and prevent same content keys.
You have to override hashCode() method along with equals() because equals() work according hashcode.
Moreover overriding hashCode() method along with equals() helps to intact the equals()-hashCode() contract: "If two objects are equal, then they must have the same hash code."
When do you need to write custom implementation for hashCode()?
As we know that internal working of HashMap is on principle of Hashing. There are certain buckets where entrysets get stored. You customize the hashCode() implementation according your requirement so that same category objects can be stored into same index.
when you store the values into Map collection using put(k,v)method, the internal implementation of put() is:
put(k, v){
hash(k);
index=hash & (n-1);
}
Means, it generates index and the index is generated based on the hashcode of particular key object. So make this method generate hashcode according your requirement because same hashcode entrysets will be stored into same bucket or index.
That's it!
IMHO, it's as per the rule says - If two objects are equal then they should have same hash, i.e., equal objects should produce equal hash values.
Given above, default equals() in Object is == which does comparison on the address, hashCode() returns the address in integer(hash on actual address) which is again distinct for distinct Object.
If you need to use the custom Objects in the Hash based collections, you need to override both equals() and hashCode(), example If I want to maintain the HashSet of the Employee Objects, if I don't use stronger hashCode and equals I may endup overriding the two different Employee Objects, this happen when I use the age as the hashCode(), however I should be using the unique value which can be the Employee ID.
To help you check for duplicate Objects, we need a custom equals and hashCode.
Since hashcode always returns a number its always fast to retrieve an object using a number rather than an alphabetic key. How will it do? Assume we created a new object by passing some value which is already available in some other object. Now the new object will return the same hash value as of another object because the value passed is same. Once the same hash value is returned, JVM will go to the same memory address every time and if in case there are more than one objects present for the same hash value it will use equals() method to identify the correct object.
When you want to store and retrieve your custom object as a key in Map, then you should always override equals and hashCode in your custom Object .
Eg:
Person p1 = new Person("A",23);
Person p2 = new Person("A",23);
HashMap map = new HashMap();
map.put(p1,"value 1");
map.put(p2,"value 2");
Here p1 & p2 will consider as only one object and map size will be only 1 because they are equal.
public class Employee {
private int empId;
private String empName;
public Employee(int empId, String empName) {
super();
this.empId = empId;
this.empName = empName;
}
public int getEmpId() {
return empId;
}
public void setEmpId(int empId) {
this.empId = empId;
}
public String getEmpName() {
return empName;
}
public void setEmpName(String empName) {
this.empName = empName;
}
#Override
public String toString() {
return "Employee [empId=" + empId + ", empName=" + empName + "]";
}
#Override
public int hashCode() {
return empId + empName.hashCode();
}
#Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (!(this instanceof Employee)) {
return false;
}
Employee emp = (Employee) obj;
return this.getEmpId() == emp.getEmpId() && this.getEmpName().equals(emp.getEmpName());
}
}
Test Class
public class Test {
public static void main(String[] args) {
Employee emp1 = new Employee(101,"Manash");
Employee emp2 = new Employee(101,"Manash");
Employee emp3 = new Employee(103,"Ranjan");
System.out.println(emp1.hashCode());
System.out.println(emp2.hashCode());
System.out.println(emp1.equals(emp2));
System.out.println(emp1.equals(emp3));
}
}
In Object Class equals(Object obj) is used to compare address comparesion thats why when in Test class if you compare two objects then equals method giving false but when we override hashcode() the it can compare content and give proper result.
Both the methods are defined in Object class. And both are in its simplest implementation. So when you need you want add some more implementation to these methods then you have override in your class.
For Ex: equals() method in object only checks its equality on the reference. So if you need compare its state as well then you can override that as it is done in String class.
There's no mention in this answer of testing the equals/hashcode contract.
I've found the EqualsVerifier library to be very useful and comprehensive. It is also very easy to use.
Also, building equals() and hashCode() methods from scratch involves a lot of boilerplate code. The Apache Commons Lang library provides the EqualsBuilder and HashCodeBuilder classes. These classes greatly simplify implementing equals() and hashCode() methods for complex classes.
As an aside, it's worth considering overriding the toString() method to aid debugging. Apache Commons Lang library provides the ToStringBuilder class to help with this.

Overwriting the hashCode() and equals() method in Java

As the heading suggests, my question has something to do with overriding the hashCode() and equals() method. However, this is also not completely true, I just did not know another way to summarize my question.
I have an object Label that contains multiple components, one of which is a List that contains multiple objects of type Node. An example of Label would be: [(n1, n2, n3), (n4, n5)]. I want to store all unique Label objects generated in a LinkedHashSet. However, this is not working as expected. Suppose that the LinkedHashSet currently contains the Label described above, and that we now generated a new Label called other which turned out to contain the same nodes as the already added label, thus also [(n1, n2, n3), (n4, n5)]. Since it has the same list of nodes, the other components in Label are also identical. I won't explain here why, just assume that it is, because that is the case. However, when checking if the LinkedHashSet already contains the Label it returns false, since the objects have different object ID's.
One approach would be to write a for-loop over the LinkedHashSet and compare the new label with all labels in the LinkedHashSet, but that would be very expensive in terms of running time, so I am looker for a cheaper option. Any suggestion is welcome!
Another approach would be to adapt the equals() method, but this was not working out, since I also have to adapt the hashCode() method and I do not know what to change this to to get this working.
Implementing equals(...) and hashCode(...) is the right approach here. Usually people use their IDE to generate those methods for them, here is what this would look like with Java 7+:
#Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Label label = (Label) o;
return Objects.equal(nodes, label.nodes);
}
#Override
public int hashCode() {
return Objects.hashCode(nodes);
}
If you have multiple fields this become:
#Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Label label = (Label) o;
return Objects.equal(nodes, label.nodes) &&
Objects.equal(other, label.other);
}
#Override
public int hashCode() {
return Objects.hashCode(nodes, other);
}
However this will only work if the implementations for Node also implement equals(...) and hashCode(...).
You definitely don't want to "for-loop over the LinkedHashSet and compare..." Detecting duplicates is what the LinkedHashSet is supposed to do. In order to do that, it needs appropriate implementations of equals and hashCode.
You probably understand that you need to implement equals such that it returns true when a Label is "equal to" another Label, however you define that. If two labels are equal when they have the same nodes, then yeah, you pretty much have to look at all the nodes and check that they're the same.
That leaves hashCode. You must implement hashCode to be consistent with equals, that is, if two labels are equal, they must have the same hash code. That's because LinkedHashSet is going to use the hash code to determine the bucket in which a Label resides, and then use equals to compare the new Label with the ones that already exist in bucket. If two labels are equal but generate different hash codes, LinkedHashSet won't be able to detect them as duplicates.
The simplest thing to do would be to incorporate the hash codes of all the nodes into the hash code of the Label. Something like:
int hashCode() {
int hc = 1;
for (Node n : allMyNodes) {
hc = hc * 31 + n.hashCode();
}
return hc;
}
If there are lots of nodes and it's unlikely that two labels will share the same node unless they're "equal," you could just use the hash code of the first node, instead of rummaging through them all.
Use #EqualsAndHashCode from lombok at your Node (or Label) class.
Then you can add nodes to your LinkedHashSet.

Is it a bad practice to return super.equals and super.hashcode in a class?

Let's say i have this class:
public class Person {
private Integer idFromDatabase;
private String name;
//Getters and setters
}
The field idFromDatabase is the attribute that should be verified in equals and used to create the hashCode. But sometimes, i am working with a list of People in memory, and have not yet stored the objects on the database, so the idFromDatabase is null for all objects, which would cause hashCode to return the same value for every object.
I solved this issue by adding the following to equals and hashCode metods:
if(idFromDatabase == null) return super.equals(o);
and
if(idFromDatabase == null) return super.hashCode();
It worked, but is it safe? Can i do it for every class that relies on a database field for equality check?
if(idFromDatabase == null) return super.equals(o); is incorrect as super's equals (if implemented correctly) does a getClass() check, which will of course be different, thus super.equals will always be false.
As already noted by #Jeroen Vannevel, if you are likely to end up with having 2 or more objects not stored in database holding the exact same information, then this technique will not help you in identifying this.
#Solver is also quite true in that a subclass is meant to have different behavior than its superclass, so you shouldn't return that they're equal.
However, in your particular example, you are just extending the Object class, so your assumption that it is safe is true (if we exclude the possibility of having 2 not-yet-persisted same Persons in memory).
Object provides the most basic equals method:
For any non-null reference values x and y, this method returns true
if and only if x and y refer to the same object
(x == y has the value true).
The hashCode method of Object:
As much as is reasonably practical, [...] does return distinct integers for distinct objects
These definitions make it clear that if you're only extending Object, then this technique is safe.
From your description I'm inferring that when comparing two People objects:
If both have an ID, they are equal if they have the same ID, even if they have different names
Otherwise, they are only equal if they are the same instance.
If that's correct, then:
#Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (this.idFromDatabase == null)
return false;
if (! (obj instanceof People))
return false;
People that = (People)obj;
if (that.idFromDatabase == null)
return false;
return this.idFromDatabase.equals(that.idFromDatabase);
}
#Override
public int hashCode() {
// Use super.hashCode to distribute objects without an idFromDatabase
return (this.idFromDatabase != null ? this.idFromDatabase.hashCode() : super.hashCode());
}
There are a few problems with your reasoning.
equals and hashcode are not subtype-friendly so it doesn't make sense to start thinking about super calls,
super is Object anyway so it's equals and hashcode are useless in this context.
What if you have two Person objects referring to the same person, but only one is stored in the database. Are they the same or different?
One universal solution is to make two classes
Person which stores a 'local' person. Doesn't contain idFromDatabase,
StoredPerson which contains idFromDatabase and a Person (or all fields of Person, but this is harder to maintain)
This way, at least equals and hashcode are well-defined and well-behaved at all times.
Implementation and usage
If you use any kind of Set/Map to store people, you now have two of them. When you save new Persons to database, you remove them from the 'local' Set/Map, wrap them in StoredPerson, and put them in the 'database' Set/Map.
If you want a searchable list of all people, make one with all Persons from both datasets into one. When you find a Person you're interested in and want to retrieve the idFromDatabase, if any, then you'd do good to prepare a map from Person to StoredPerson beforehand.
Thus you need at least,
Set<Person> localPeople = new HashSet<>();
Map<Person, StoredPerson> storedPeople = new HashMap<>();
and something like this:
void savePerson(Person person) {
synchronized (lockToPreserveInvariants) {
int id = db.insert(person);
StoredPerson sp = new StoredPerson(id, person);
localPeople.remove(person);
storedPeople.put(person, sp);
}
}

Why does my custom equals method (doubles and integers) not work?

I have a custom equals to check the equality of my object called Pair.
class Pair implements Comparable <Parr> {
double coef;
int power;
Pair(double a, int b) {
coef = a;
power = b;
}
My custom equals method is (located in class pair):
#Override
public boolean equals(Object o) {
if (!(o instanceof Pair))
return false;
Pair that = (Pair) o;
return that.coef == this.coef && that.power == this.power;
}
I've checked with print my object if the objects are the same, and they are indeed the same.
1.0 1 2.0 0
1.0 1 2.0 0
I call my custom equals from a different file, called Test.
class Test {
public static void main(String[] args) {
orig = pol1.differentiate().integrate();
System.out.print(orig);
if (orig.equals(pol1))
System.out.println(" (is equal.)");
else
System.out.println(" (is not equal.)");
And my class Polynomial, which is an arraylist with objects of Pair inside.
class Polynominal implements PolynominalInterface {
ArrayList<Pair> terms = new ArrayList<Pair>();
I looked on the internet, and I found that I cannot use == in my Equals method, but I'm using Intergers and Doubles, so equals() would not work.
Can anyone point me in the right direction?
If orig and pol1 are instances of Polynomial then this
if (orig.equals(pol1))
would only work if you implement Polynomial#equals() as well; which would iterate the two ArrayLists and make sure individual Pairs are equal (using Pair#equals() of course).
Ok, thanks to Ravi Thapliyal I found the solution.
After adding an custom equals method in my Polynominal class, the problem was fixed.
#Override
public boolean equals(Object o) {
if (!(o instanceof Polynomial))
return false;
Polynomial that = (Polynomial) o;
return that.terms.equals(terms);
}
Use the Double.compare(double, double) method instead of ==.
Floating point comparison is "fuzzy" in Java.
You would need to implement a Polynomail.equals() method something like the following:
public boolean equals(Object o) {
if (!(o instanceof Polynomial)) return false;
Polynomial other = (Polynomial) o;
if (this.terms==null && other.terms==null) return true;
// A suitable equals() method already exists for ArrayList, so we can use that
// this will in turn use Pair.equals() which looks OK to me
if (this.terms!=null && other.terms!=null) return this.terms.equals(other.terms);
return false;
}
Two issues come to mind: the first is that the default hashCode() method will seldom return the same value for any two distinct object instances, regardless of their contents. This is a good thing if the equals() method will never report two distinct object instances as equal, but is a bad thing if it will. Every object which overrides Object.equals() should also override Object.hashCode() so that if x.equals(y), then x.hashCode()==y.hashCode(); this is important because even non-hashed generic collections may use objects' hash codes to expedite comparisons. If you don't want to write a "real" hash function, simply pick some arbitrary integer and have your type's hashCode() method always return that. Any hashed collection into which your type is stored will perform slowly, but all collections into which it is stored should behave correctly.
The second issue you may be seeing is that floating-point comparisons are sometimes dodgy. Two numbers may be essentially equal but compare unequal. Worse, the IEEE decided for whatever reason that floating-point "not-a-number" values should compare unequal to everything--even themselves.
Factoring both of these issues together, I would suggest that you might want to rewrite your equals method to chain to the equals method of double. Further, if neither field of your object will be modified while it's stored in a collection, have your hashCode() method compute the hashCode of the int, multiply it by some large odd number, and then add or xor that with the hashCode of the double. If your object might be modified while stored in a collection, have hashCode() return a constant. If you don't override hashCode() you cannot expect the equals methods of any objects which contain yours to work correctly.

how can I implement the union and intersection of set theory with sets of objects

hi i've seen this post how to implement the union and intersection when it you have two sets of data,that are strings.how can i do the same when my sets contain objects,and i want to get the union of only one property of each object?
But I want to override them somehow so that it wont add an object if there's an object already in my set that has the same value in a selected property.If i'm not clear enough tell me so i can write an example.
I think the best way to do this is to use a ConcurrentMap.
ConcurrentMap<String, MyType> map = new ConcurrentHashMap<>();
// the collection to retain.
for(MyType mt: retainedSet) map.put(mt.getKey(), mt);
// the collection to keep if not duplicated
for(MyType mt: onlyIfNewSet) map.putIfAbsent(mt.getKey(), mt);
// to get the intersection.
Set<String> toKeep = new HashSet<>();
for(MyType mt: onlyIfNewSet) toKeep.add(mt.getKey());
// keep only the keys which match.
map.keySet().retainAll(toKeep);
Google Guava, has Sets class which contains these methods and many more.
As in this answer, use Collection methods retainAll(Collection)- intersection and #addAll(Collection)- union.
Since those methods use equals, you also have to override equals method in your Class and implement one-property based comparison.
In case it's simple comparison, here's an example (generated by my IDEA):
public class Main {
private Integer age;
...
#Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Main main = (Main) o;
if (age != null ? !age.equals(main.age) : main.age != null) return false;
return true;
}
Intersection is using contains, which uses equals. You should implement equals() method on the class that you want to do intersection.
I didn't find specific comments about set.addAll() implementation, but most probably it also uses equals() to determine if an object is already on the set.
If you want to compare only by a field, your equals() should only compare this field.

Categories