I have a class Reminder that has both hashcode and equals overridden like this:
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((cronExpression == null) ? 0 : cronExpression.hashCode());
result = prime * result + ((subject == null) ? 0 : subject.hashCode());
result = prime * result + timeout;
result = prime * result + ((type == null) ? 0 : type.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (!(obj instanceof Reminder))
return false;
Reminder other = (Reminder) obj;
if (cronExpression == null) {
if (other.cronExpression != null)
return false;
} else if (!cronExpression.equals(other.cronExpression))
return false;
if (subject == null) {
if (other.subject != null)
return false;
} else if (!subject.equals(other.subject))
return false;
if (timeout != other.timeout)
return false;
if (type == null) {
if (other.type != null)
return false;
} else if (!type.equals(other.type))
return false;
return true;
}
Both overrides were automatically generated using Eclipse. I'm using the Reminder in a HashSet instantiated like this: private Set<Reminder> localReminders = new HashSet<Reminder>();
When updating this set, I call localReminders.contains(anotherReminder), and for some reason that I've been trying to figure out for a while now, it does not call the overridden equals method. Even though cronExpression, subject, timeout and type of the compared reminders are the same, contains returns false.
So far I've only come across answers where equals and/or hashCode were implemented incorrectly or not at all. Any help would be very much appreciated!
Let me know if you need more information like additional code for this!
EDIT: the properties used in hashCode and equals are all String, except for timeout, which is int.
EDIT2: while debugging, I currently have these two reminders in my HashSet:
Reminder [cronExpression=0 10 10 ? * *, subject=, type=OTHER_TYPE, audioPath=/other_type_reminder.mp3, muted=false, future=DelegatingErrorHandlingRunnable for Task#af94b0, timeout=35940]
Reminder [cronExpression=50 53 10 ? * *, subject=sub, type=TYPE, audioPath=/type_reminder.mp3, muted=false, future=DelegatingErrorHandlingRunnable for ReminderTask#f1f373, timeout=35940]
The one that I am checking whether it is contained in my set looks like this:
Reminder [cronExpression=50 53 10 ? * *, subject=sub, type=TYPE, audioPath=/type_reminder.mp3, muted=false, future=null, timeout=35940]
The only difference I can spot here is that in one, the future is null while it is actually set in the other. But since the future property is not included in either hashCode or equals, this should not matter.
As you can see in your equals implementation, you call cronExpression.equals(other.cronExpression), subject.equals(other.subject) and type.equals(other.type). If any one of these is not implemented correctly, you will get a wrong result. Please check that every property used in this method has a correct equals implementation.
By the way, also check the implementations of cronExpression.hashCode(), subject.hashCode() and type.hashCode(). They are used in your hashCode method.
Edit: If, as you said, cronExpression, subject and type are Strings, then it should be easy to write a main method that populates two Reminder objects with the same data and tests the methods. To pin down where the problem is, you can check if (firstReminder.equals(secondReminder)) directly.
In my experience, strings can cause problems here. For example, one string may have a trailing space that the other lacks, or something similar.
Edit 2: OK, from your input it seems these objects do have the same strings.
Is it possible that the Reminder class is extended and you are comparing a subclass object with a Reminder object? If so, the subclass may implement its own equals and hashCode, and the result can then be wrong.
Also, just to be sure, can you log the length of each string? This is very strange.
Maybe one of them contains a hidden character. See this for more information: Is there an invisible character that is not regarded as whitespace?
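To make the hidden-character check concrete, here is a minimal sketch. The class name HiddenCharDemo, the helper codeUnits and the zero-width space in the sample string are all made up for illustration; the idea is only to print the length and the individual code units of each string so that an invisible difference shows up.

```java
import java.util.stream.Collectors;

class HiddenCharDemo {
    // Renders every UTF-16 code unit as U+XXXX so invisible characters become visible.
    static String codeUnits(String s) {
        return s.chars()
                .mapToObj(c -> String.format("U+%04X", c))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String fromSet = "sub";
        // Hypothetical input: same visible text, but with a zero-width space appended.
        String candidate = "sub\u200B";

        System.out.println(fromSet.equals(candidate));                       // false
        System.out.println(fromSet.length() + " vs " + candidate.length());  // 3 vs 4
        System.out.println(codeUnits(fromSet));
        System.out.println(codeUnits(candidate));
    }
}
```

If both strings print the same length and the same code units, hidden characters can be ruled out.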
Good luck!
The problem may be with your hashCode() method. It does not have to produce a unique code, but equal objects must produce the same hash code. There are some guidelines for overriding hashCode(): Hashcode Best Practice
If the hash codes of two objects differ, equals() will not be called even if the objects are equal.
That is because HashSet first compares the hash codes of both objects, and only if they match does it call equals() to check whether the objects are really equal.
Read the Oracle Javadoc on the hashCode override contract.
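A small sketch of the hash-first lookup described above. BrokenKey is a hypothetical class whose hashCode deliberately violates the contract, and the counter only exists to show whether equals was reached. The behaviour shown is that of the HashMap-backed HashSet in OpenJDK; other implementations are not guaranteed to behave identically.

```java
import java.util.HashSet;
import java.util.Set;

class HashFirstDemo {
    static int equalsCalls = 0;

    // Hypothetical key: equal by name, but the hash code is supplied by the caller.
    static final class BrokenKey {
        final String name;
        final int hash;

        BrokenKey(String name, int hash) {
            this.name = name;
            this.hash = hash;
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++; // count how often the set actually consults equals
            return o instanceof BrokenKey && ((BrokenKey) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            return hash;
        }
    }

    // Looks up a key that is equal by name but reports a different hash code.
    static boolean containsEqualKeyWithOtherHash() {
        Set<BrokenKey> set = new HashSet<>();
        set.add(new BrokenKey("a", 1));
        return set.contains(new BrokenKey("a", 2));
    }

    public static void main(String[] args) {
        System.out.println(containsEqualKeyWithOtherHash());
        System.out.println("equals calls: " + equalsCalls);
    }
}
```

On OpenJDK the lookup fails and the counter stays at zero, which matches the symptom of equals never being called.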
You need to show us the imports of the Reminder class if you want us to be able to help you.
For your information and curiosity, here is java.util.HashSet.contains(Object o); reading the code, it points to:
public boolean containsKey(Object key) {
return getNode(hash(key), key) != null;
}
which itself points to :
static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
As you can see, the important part of your implementation is Reminder.hashCode().
Regarding your specific issue: as you are probably using Quartz for org.quartz.CronExpression, you can see that the org.quartz.CronExpression.hashCode() method is not implemented, so it calls its parent's hashCode(), which is Object.hashCode().
From the documentation (JRE 7), you can read :
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)
So two otherwise identical items holding different instances of org.quartz.CronExpression will have different hashCode() results.
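The effect can be reproduced without Quartz. In this sketch, Expr is a made-up stand-in for any field type that overrides neither equals nor hashCode, and Item is a hypothetical owner class whose generated-style methods delegate to that field.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class IdentityHashDemo {
    // Hypothetical field type that inherits equals and hashCode from Object.
    static final class Expr {
        final String text;
        Expr(String text) { this.text = text; }
    }

    // Hypothetical owner whose equals/hashCode delegate to the field.
    static final class Item {
        final Expr expr;
        Item(Expr expr) { this.expr = expr; }

        @Override
        public boolean equals(Object o) {
            return o instanceof Item && Objects.equals(expr, ((Item) o).expr);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(expr);
        }
    }

    // Two Expr instances with the same text are still different objects to Object.equals.
    static boolean foundWithFreshExpr() {
        Set<Item> set = new HashSet<>();
        set.add(new Item(new Expr("0 10 10 ? * *")));
        return set.contains(new Item(new Expr("0 10 10 ? * *")));
    }

    // Sharing one Expr instance makes both equals and hashCode agree.
    static boolean foundWithSharedExpr() {
        Expr shared = new Expr("0 10 10 ? * *");
        Set<Item> set = new HashSet<>();
        set.add(new Item(shared));
        return set.contains(new Item(shared));
    }

    public static void main(String[] args) {
        System.out.println(foundWithFreshExpr());  // false
        System.out.println(foundWithSharedExpr()); // true
    }
}
```

If all fields really are Strings, as the question's edit says, this cause does not apply, since String overrides both methods.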
Related
I regularly used Eclipse's code generation tools (Source / Generate hashCode() and equals()...) to create the equals() implementation for simple POJO classes. If I choose to "Use instanceof to compare types" this produces an equals() implementation similar to this:
@Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj == null) {
return false;
}
if (!(obj instanceof MyClass)) {
return false;
}
MyClass other = (MyClass) obj;
// check the relevant fields for equality
}
Today a colleague pointed out, that the second if statement is not necessary at all, since the instanceof type check will return false whenever obj is null. (See question 3328138.)
Now, I guess that the folks writing the code templates for Eclipse JDT are worth their salt, too. So I figure there must be some reason for that null check, but I'm not really sure what it is?
(Also question 7570764 might give a hint: if we use getClass() comparison for type checking instead instanceof, obj.getClass() is not null safe. Maybe the code template is just not clever enough to omit the null check if we use instanceof.)
EDIT: Dragan noticed in his answer, that the instanceof type check is not the default setting in Eclipse, so I edited that out of the question. But that does not change anything.
Also please do not suggest that I should use getClass() or (even better!) a different IDE. That's not the point, that does not answer the question. I didn't ask for advice on how to write an equals() implementation, whether to use instanceof or getClass(), etc.
The question roughly is: is this a minor bug in Eclipse? And if it's not, then why does it qualify as a feature?
It is unnecessary because instanceof has a built in null check.
But instanceof is a lot more than a simple foo == null. It is a full instruction preparing a class check doing unnecessary work before the null check is done. (see for more details http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.instanceof)
So a separate null check could be a performance improvement.
I did a quick measurement and, no surprise, foo == null is faster than a null check via instanceof.
But usually you do not have a ton of nulls in an equals(), which leaves you with a redundant null check most of the time... and that will likely eat up any improvement gained on null comparisons.
My conclusion: It is unnecessary.
Code used for testing, for completeness (remember to use -Djava.compiler=NONE, else you will only measure the power of Java):
public class InstanceOfTest {
public static void main(String[] args) {
Object nullObject = null;
long start = System.nanoTime();
for(int i = Integer.MAX_VALUE; i > 0; i--) {
if (nullObject instanceof InstanceOfTest) {}
}
long timeused = System.nanoTime() - start;
long start2 = System.nanoTime();
for(int i = Integer.MAX_VALUE; i > 0; i--) {
if (nullObject == null) {}
}
long timeused2 = System.nanoTime() - start2;
System.out.println("instanceof");
System.out.println(timeused);
System.out.println("nullcheck");
System.out.println(timeused2);
}
}
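Separately from the timing question, the language-level claim that instanceof already covers null is easy to check. A minimal sketch (the class and method names are made up):

```java
class InstanceofNullDemo {
    static boolean isString(Object o) {
        // No explicit null check: instanceof evaluates to false for a null reference.
        return o instanceof String;
    }

    public static void main(String[] args) {
        System.out.println(isString(null));   // false
        System.out.println(isString("text")); // true
        System.out.println(isString(42));     // false
    }
}
```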
Indeed, it is unnecessary, and it is a mistake by the authors of the Eclipse template. And it is not the first one; I have found other small errors there. For example, the generation of the toString() method when I want to omit null values:
public class A {
private Integer a;
private Integer b;
@Override
public String toString() {
StringBuilder builder = new StringBuilder();
builder.append("A [");
if (a != null)
builder.append("a=").append(a).append(", ");
if (b != null)
builder.append("b=").append(b);
builder.append("]");
return builder.toString();
}
}
If a is not null and b is, there will be an extra comma before the closing ].
So, regarding your statement: "Now, I guess that the folks writing the code templates for Eclipse JDT are worth their salt, too.", I assume they are, but it would not hurt them to pay more attention to these tiny inconsistencies. :)
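For what it's worth, one way to avoid the dangling separator by hand is java.util.StringJoiner, which only inserts the delimiter between elements that were actually added. This is just a sketch of an alternative, not what Eclipse generates; the wrapper class name is made up.

```java
import java.util.StringJoiner;

class JoinerToStringDemo {
    static final class A {
        private final Integer a;
        private final Integer b;

        A(Integer a, Integer b) {
            this.a = a;
            this.b = b;
        }

        @Override
        public String toString() {
            // Delimiter ", " is placed only between added elements; prefix and suffix wrap the result.
            StringJoiner joiner = new StringJoiner(", ", "A [", "]");
            if (a != null)
                joiner.add("a=" + a);
            if (b != null)
                joiner.add("b=" + b);
            return joiner.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(new A(1, null)); // A [a=1]
        System.out.println(new A(1, 2));    // A [a=1, b=2]
        System.out.println(new A(null, null)); // A []
    }
}
```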
I've created a class which can be inherited to create both a Stack and a Queue using LinkedLists. I've passed all the JUnit tests except the equals one, and I still have no idea why it doesn't work.
@Override public boolean equals(Object o) {
if( o == null) return false;
if(o == this) return true;
if(!(o instanceof PushPop)) return false;
PushPop test1= this;
PushPop test = (PushPop)o;
while(!test.isEmpty() && !test1.isEmpty()){
if(test1.pop() != test.pop()) return false;
}
return true;
}
The test throws an assertion error whenever it's comparing the values, specifically whenever one stack/queue has one more value than the other.
OK, I have located your problem:
In your test, you do the following [taken from comment]
stack.push(i);
Assert.assertFalse(stack.equals(stack2));
stack2.push(i);
Assert.assertTrue(stack.equals(stack2));
This seems reasonable. However, equals clears both stacks so when you push i onto stack2, it is no longer equal to the now-empty stack.
Hence your error.
The solution: Don't Modify Your Object In An equals Method.
I'd suggest cloning or comparing whatever your underlying data structure is (e.g., if you're using Nodes, having Node implement equals).
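A minimal sketch of that suggestion, assuming PushPop is backed by a java.util.LinkedList (the question does not show the real fields, so the field name items and the method bodies here are guesses). The comparison is delegated to the list, which checks size and elements in order without removing anything.

```java
import java.util.LinkedList;

class PushPop<T> {
    // Assumed backing storage; the original class may use its own node type instead.
    private final LinkedList<T> items = new LinkedList<>();

    public void push(T item) { items.addFirst(item); }

    public T pop() { return items.removeFirst(); }

    public boolean isEmpty() { return items.isEmpty(); }

    @Override
    public boolean equals(Object o) {
        if (o == this) return true;
        if (!(o instanceof PushPop)) return false;
        // List equality compares length and element order and leaves both lists untouched.
        return items.equals(((PushPop<?>) o).items);
    }

    @Override
    public int hashCode() {
        return items.hashCode(); // kept consistent with equals
    }
}
```

With this version, the test sequence quoted above passes, because neither stack is emptied by the first comparison. It also uses equals on the elements rather than !=, which matters for boxed values.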
I need clarification on whether my approach is right or wrong, and whether any modifications are required.
Let me explain. I will have an Excel file with a country code, a country name, and one extra column per month (mm/yyyy):
countrycode country Name 12/2000 11/2000 10/2000 09/2000 08/2000 07/2000 06/2000 05/2000 04/2000 03/2000 02/2000 01/2000
IND India 10.1 10.2 10.3 10.4 10.5 10.6 10.7 10.8 10.9 11.1 11.2 11.3
USA United States 8.1 8.2 8.3 8.4 8.5 8.6 8.7 8.8 8.9 9.1 9.2 9.3
If any price is repeated in a row for that particular year and country, I need to show the message "Duplicate present in Excel file".
I implemented this as follows. For a VO, I override hashCode() with the hash code of (countrycode + year + price), and the equals method too.
While inserting into the database, I put the VOs into a HashSet to eliminate duplicates and compare the size of the original list with the HashSet size.
But sometimes I get the duplicate message even when the prices are unique.
Please tell me whether my approach is right or wrong, or whether there is another way I can implement this.
You have the right idea and approach; you are just missing one piece of information needed to solve the problem correctly.
I would like to offer a hint that I believe can help fix the problem and clarify the basics.
If you look at the documentation (or the source code) of hashCode for String and Double, it states:
STRING
Returns a hash code for this string. The hash code for a String object is computed as
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using int arithmetic, where s[i] is the ith character of the string, n is the length of the string, and ^ indicates exponentiation. (The hash value of the empty string is zero.)
Returns: a hash code value for this object.
DOUBLE
Returns a hash code for this Double object. The result is the exclusive OR of the two halves of the long integer bit representation, exactly as produced by the method doubleToLongBits(double), of the primitive double value represented by this Double object. That is, the hash code is the value of the expression:
(int)(v^(v>>>32))
where v is defined by:
long v = Double.doubleToLongBits(this.doubleValue());
Returns: a hash code value for this object.
So hashCode() returns distinct values in most cases, but there are many cases where it returns the same int value for two different objects.
I think you are running into that scenario.
One more hint: you can use a HashMap<Integer, List<String>>, where the Integer key is the hash code you calculated and the List<String> holds the actual values, formed as the String countrycode + year + price.
The last part is the comparison: look up the List<String> stored under the hashCode() of the new value and check whether the same String value already exists in the list.
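To illustrate the collision point with a well-known pair: "Aa" and "BB" have the same String hash code, yet a HashSet still keeps them apart because it falls back to equals within a bucket. So a collision alone only produces a false duplicate when equals is also too loose. The class and method names below are made up.

```java
import java.util.HashSet;
import java.util.Set;

class CollisionDemo {
    // Adds two different strings that share a hash code and reports how many the set keeps.
    static int distinctCount(String first, String second) {
        Set<String> set = new HashSet<>();
        set.add(first);
        set.add(second);
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true, both are 2112
        System.out.println("Aa".equals("BB"));                  // false
        System.out.println(distinctCount("Aa", "BB"));          // 2
    }
}
```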
Hash-based collections depend on the hashCode() and equals() methods to correctly identify duplicates. If you modify these to fit exactly one use case, you are likely to get all sorts of side effects in other use cases.
To put it more explicitly: if you change the methods of your VO to use only a subset of the data, you are likely to encounter unforeseen problems somewhere else where you store VOs in hash-based collections.
You should keep hashCode() and equals() consistent with data equality, i.e. using all attributes for the tests, as suggested in many sources (source generators in Eclipse, the @EqualsAndHashCode annotation from Lombok, 'Effective Java' by Joshua Bloch, etc.).
In your specific case you could create a dedicated wrapper that calculates hash codes and equality based on the subset.
As an example:
public void doit(List<VO> vos) {
Set<VOWrapper> dups = new HashSet<>();
for (VO vo : vos) {
if (dups.contains(new VOWrapper(vo))) {
System.out.println("Found a duplicate");
} else {
dups.add(new VOWrapper(vo));
// Process vo
}
}
}
Based on this VO
@Data // Lombok generates getters/setters/equals/hashCode (using all fields)
public class VO {
private String countrycode;
private String country;
private int month;
private int year;
private double price;
}
And this wrapper
public class VOWrapper {
private final VO vo;
public VOWrapper(VO vo) { this.vo = vo; }
// Equals method with only 3 fields used
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
VO other = ((VOWrapper) obj).vo;
if (vo.getCountry() == null) {
if (other.getCountry() != null)
return false;
} else if (!vo.getCountry().equals(other.getCountry()))
return false;
if (vo.getCountrycode() == null) {
if (other.getCountrycode() != null)
return false;
} else if (!vo.getCountrycode().equals(other.getCountrycode()))
return false;
if (Double.doubleToLongBits(vo.getPrice()) != Double.doubleToLongBits(other.getPrice()))
return false;
return true;
}
//Hashcode method with only 3 fields used
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((vo.getCountry() == null) ? 0 : vo.getCountry().hashCode());
result = prime * result + ((vo.getCountrycode() == null) ? 0 : vo.getCountrycode().hashCode());
long temp;
temp = Double.doubleToLongBits(vo.getPrice());
result = prime * result + (int) (temp ^ (temp >>> 32));
return result;
}
}
It is perfectly valid to write code like:
List<CountryInstance> list = ...;
Set<CountryInstance> set = new HashSet<CountryInstance>(list);
if(set.size() < list.size()){
/* There are duplicates */
}
For it to work you need value class instances. To create one you need to override equals and hashcode. Before you do that read What issues should be considered when overriding equals and hashCode in Java?
If you are just parsing all the values into Strings then your approach sounds logical to me.
I read your description. You seem to be saying that unexpected duplicates are detected. I think this means that the equals method is not behaving as you expect. If hashCode were incorrect, I think you would get the opposite problem (duplicates NOT detected).
If you are still experiencing issues then attach the implementation of 'hashCode' and 'equals' and it might help to quickly answer the problem.
One more thing. I assume that all sample countries are unique in the file? I mean no countries are duplicated later on in the file?
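Since the question's own equals and hashCode were never posted, here is only a rough sketch of what a value class for this check could look like. PriceKey, its three fields and findDuplicates are hypothetical names; the point is that equals and hashCode use exactly the same fields, and that Set.add returning false identifies the duplicate directly instead of comparing sizes afterwards.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class DuplicateFinder {
    // Hypothetical value class: one price cell identified by country code, period and price.
    static final class PriceKey {
        final String countryCode;
        final String period; // e.g. "12/2000"
        final double price;

        PriceKey(String countryCode, String period, double price) {
            this.countryCode = countryCode;
            this.period = period;
            this.price = price;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof PriceKey)) return false;
            PriceKey other = (PriceKey) o;
            return Objects.equals(countryCode, other.countryCode)
                    && Objects.equals(period, other.period)
                    && Double.compare(price, other.price) == 0;
        }

        @Override
        public int hashCode() {
            // Same three fields as equals, so equal keys always share a hash code.
            return Objects.hash(countryCode, period, price);
        }
    }

    // Returns every key that was already seen earlier in the list.
    static List<PriceKey> findDuplicates(List<PriceKey> keys) {
        Set<PriceKey> seen = new HashSet<>();
        List<PriceKey> duplicates = new ArrayList<>();
        for (PriceKey key : keys) {
            if (!seen.add(key)) { // add returns false when an equal key is already present
                duplicates.add(key);
            }
        }
        return duplicates;
    }
}
```

Which fields belong in the key depends on what counts as a duplicate in the file, so adjust the field list to the actual rule.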
While working on my application I've encountered a problem when trying to remove an object from a Java collection (a Set pulled from the database with EclipseLink).
The object which I want to remove is an instance of an entity class which has an overridden equals method.
I've even checked whether any of the objects in the collection is equal to the one I want to remove, with the following code:
for(AlbumEntity entity : deleteGroup.getAlbums()){
System.out.println("VAL: " + deleteAlbum.equals(entity));
}
In this case, one of the values returned is true. However, if I do:
boolean result = deleteGroup.getAlbums().remove(deleteAlbum);
the value of result is false and the size of collection stays the same.
Thanks for your help in advance
edit:
@Override
public int hashCode() {
int hash = 0;
hash += (id != null ? id.hashCode() : 0);
return hash;
}
@Override
public boolean equals(Object object) {
if (!(object instanceof AlbumEntity)) {
return false;
}
AlbumEntity other = (AlbumEntity) object;
if ((this.id == null && other.id != null) || (this.id != null && !this.id.equals(other.id))) {
return false;
}
return true;
}
A few possibilities:
1) There is a problem with the implementation of id's equals or hashCode methods. In this case, you could have id1.equals(id2) but id1.hashCode() != id2.hashCode(). This would cause inconsistency between equals and hashCode() for the album objects and could cause the symptoms you're seeing.
2) The id for one or more albums changes at some point after it was added to the Set. If an id changes for an album, the remove() method may not be able to find it, even though a loop that checks deleteAlbum.equals(entity) for each album still reports a match. An id could change from null to some non-null number if the album got saved to the database - EclipseLink might do this for you without you explicitly asking it to.
3) Because of EclipseLink's meddling, deleteGroup might not actually be a HashSet when you run your code. The docs for EclipseLink suggest it will give you an "indirection object" instead of the java.util.Set (or java.util.HashSet I presume) declared in your class, depending on how it is configured. In that case, the contains and remove methods might not do what you expect them to.
See Overriding equals and hashCode in Java for more details on these and other possible problems involving equals and hashCode, which can cause bizarre behavior with Sets.
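Possibility 2 can be reproduced in isolation. In this sketch, Album is a made-up class reduced to the id-based equals and hashCode from the question, and the id is changed by hand where a persistence provider might assign it. The result shown is what OpenJDK's HashMap-backed HashSet does.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class MutatedKeyDemo {
    // Hypothetical entity with the same id-only equality as in the question.
    static final class Album {
        Long id;
        Album(Long id) { this.id = id; }

        @Override
        public int hashCode() {
            return id != null ? id.hashCode() : 0;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Album && Objects.equals(id, ((Album) o).id);
        }
    }

    // Adds an album with a null id, changes the id, then tries to remove the same object.
    static boolean removeAfterIdChange() {
        Set<Album> albums = new HashSet<>();
        Album album = new Album(null); // stored under hash code 0
        albums.add(album);
        album.id = 42L;                // e.g. an id assigned when the entity is persisted
        return albums.remove(album);   // searches under the new hash code and misses
    }

    // Iterating still finds an equal element, just like the loop in the question.
    static boolean iterationFindsEqual() {
        Set<Album> albums = new HashSet<>();
        Album album = new Album(null);
        albums.add(album);
        album.id = 42L;
        for (Album entity : albums) {
            if (album.equals(entity)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(iterationFindsEqual()); // true
        System.out.println(removeAfterIdChange());
    }
}
```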
Okay let's try a bit of testing:
1:
Iterator<AlbumEntity> it = deleteGroup.getAlbums().iterator();
while(it.hasNext()){
AlbumEntity entity = it.next();
Assert.assertTrue(deleteGroup.getAlbums().contains(entity));
}
Does this test run successfully?
I am trying to optimize a piece of code which compares elements of list.
Eg.
public void compare(Set<Record> firstSet, Set<Record> secondSet){
for(Record firstRecord : firstSet){
for(Record secondRecord : secondSet){
// comparing logic
}
}
}
Please take into account that the number of records in sets will be high.
Thanks
Shekhar
firstSet.equals(secondSet)
It really depends on what you want to do in the comparison logic... ie what happens if you find an element in one set not in the other? Your method has a void return type so I assume you'll do the necessary work in this method.
More fine-grained control if you need it:
if (!firstSet.containsAll(secondSet)) {
// do something if needs be
}
if (!secondSet.containsAll(firstSet)) {
// do something if needs be
}
If you need to get the elements that are in one set and not the other.
EDIT: set.removeAll(otherSet) returns a boolean, not a set. To use removeAll(), you'll have to copy the set then use it.
Set<Record> one = new HashSet<>(firstSet);
Set<Record> two = new HashSet<>(secondSet);
one.removeAll(secondSet);
two.removeAll(firstSet);
If the contents of one and two are both empty, then you know that the two sets were equal. If not, then you've got the elements that made the sets unequal.
You mentioned that the number of records might be high. If the underlying implementation is a HashSet then the fetching of each record is done in O(1) time, so you can't really get much better than that. TreeSet is O(log n).
If you simply want to know if the sets are equal, the equals method on AbstractSet is implemented roughly as below:
public boolean equals(Object o) {
if (o == this)
return true;
if (!(o instanceof Set))
return false;
Collection c = (Collection) o;
if (c.size() != size())
return false;
return containsAll(c);
}
Note how it optimizes the common cases where:
the two objects are the same
the other object is not a set at all, and
the two sets' sizes are different.
After that, containsAll(...) will return false as soon as it finds an element in the other set that is not also in this set. But if all elements are present in both sets, it will need to test all of them.
The worst case performance therefore occurs when the two sets are equal but not the same objects. That cost is typically O(N) or O(NlogN) depending on the implementation of this.containsAll(c).
And you get close-to-worst case performance if the sets are large and only differ in a tiny percentage of the elements.
UPDATE
If you are willing to invest time in a custom set implementation, there is an approach that can improve the "almost the same" case.
The idea is that you need to pre-calculate and cache a hash for the entire set so that you could get the set's current hashcode value in O(1). Then you can compare the hashcode for the two sets as an acceleration.
How could you implement a hashcode like that? Well if the set hashcode was:
zero for an empty set, and
the XOR of all of the element hashcodes for a non-empty set,
then you could cheaply update the set's cached hashcode each time you added or removed an element. In both cases, you simply XOR the element's hashcode with the current set hashcode.
Of course, this assumes that element hashcodes are stable while the elements are members of sets. It also assumes that the element classes hashcode function gives a good spread. That is because when the two set hashcodes are the same you still have to fall back to the O(N) comparison of all elements.
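As a rough sketch of that cached-hash idea (not a full java.util.Set implementation): XorHashedSet and sameElements are invented names, the wrapper delegates storage to a HashSet, and it assumes element hash codes do not change while the elements are in the set.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class XorHashedSet<E> {
    private final Set<E> elements = new HashSet<>();
    private int cachedHash = 0; // XOR of the hash codes of all current elements; 0 when empty

    public boolean add(E e) {
        boolean added = elements.add(e);
        if (added) {
            cachedHash ^= Objects.hashCode(e); // XOR the new element in
        }
        return added;
    }

    public boolean remove(E e) {
        boolean removed = elements.remove(e);
        if (removed) {
            cachedHash ^= Objects.hashCode(e); // XOR is its own inverse, so this takes it out
        }
        return removed;
    }

    public int size() {
        return elements.size();
    }

    // Cheap O(1) rejection on size and cached hash, then the usual O(N) comparison.
    public boolean sameElements(XorHashedSet<E> other) {
        if (other == this) return true;
        if (size() != other.size() || cachedHash != other.cachedHash) return false;
        return elements.equals(other.elements);
    }
}
```

The fast path only helps when the sets differ; equal sets still pay for the full element comparison.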
You could take this idea a bit further ... at least in theory.
WARNING - This is highly speculative. A "thought experiment" if you like.
Suppose that your set element class has a method to return a crypto checksum for the element. Now implement the set's checksum by XORing the checksums returned for the elements.
What does this buy us?
Well, if we assume that nothing underhand is going on, the probability that any two unequal set elements have the same N-bit checksum is 2^-N. And the probability that two unequal sets have the same N-bit checksum is also 2^-N. So my idea is that you can implement equals as:
public boolean equals(Object o) {
if (o == this)
return true;
if (!(o instanceof Set))
return false;
Collection c = (Collection) o;
if (c.size() != size())
return false;
return checksums.equals(c.checksums);
}
Under the assumptions above, this will only give you the wrong answer with probability 2^-N. If you make N large enough (e.g. 512 bits) the probability of a wrong answer becomes negligible (e.g. roughly 10^-150).
The downside is that computing the crypto checksums for elements is very expensive, especially as the number of bits increases. So you really need an effective mechanism for memoizing the checksums. And that could be problematic.
And the other downside is that a non-zero probability of error may be unacceptable no matter how small the probability is. (But if that is the case ... how do you deal with the case where a cosmic ray flips a critical bit? Or if it simultaneously flips the same bit in two instances of a redundant system?)
There is a method in Guava Sets which can help here:
public static <E> boolean equals(Set<? extends E> set1, Set<? extends E> set2){
return Sets.symmetricDifference(set1,set2).isEmpty();
}
There's an O(N) solution for very specific cases where:
the sets are both sorted
both sorted in the same order
The following code assumes that both sets are ordered by the records' Comparable implementation. A similar method could be based on a Comparator.
public class SortedSetComparitor <Foo extends Comparable<Foo>>
implements Comparator<SortedSet<Foo>> {
@Override
public int compare( SortedSet<Foo> arg0, SortedSet<Foo> arg1 ) {
Iterator<Foo> otherRecords = arg1.iterator();
for (Foo thisRecord : arg0) {
// Shorter sets sort first.
if (!otherRecords.hasNext()) return 1;
int comparison = thisRecord.compareTo(otherRecords.next());
if (comparison != 0) return comparison;
}
// Shorter sets sort first
if (otherRecords.hasNext()) return -1;
else return 0;
}
}
You have the following solution from https://www.mkyong.com/java/java-how-to-compare-two-sets/
public static boolean equals(Set<?> set1, Set<?> set2){
if(set1 == null || set2 ==null){
return false;
}
if(set1.size() != set2.size()){
return false;
}
return set1.containsAll(set2);
}
Or if you prefer to use a single return statement:
public static boolean equals(Set<?> set1, Set<?> set2){
return set1 != null
&& set2 != null
&& set1.size() == set2.size()
&& set1.containsAll(set2);
}
If you are using Guava library it's possible to do:
SetView<Record> added = Sets.difference(secondSet, firstSet);
SetView<Record> removed = Sets.difference(firstSet, secondSet);
And then make a conclusion based on these.
I would put the secondSet in a HashMap before the comparison. This way you will reduce the second collection's lookup time to O(1). Like this:
HashMap<Integer,Record> hm = new HashMap<Integer,Record>(secondSet.size());
int i = 0;
for(Record secondRecord : secondSet){
hm.put(i,secondRecord);
i++;
}
for(Record firstRecord : firstSet){
for(int j=0; j<secondSet.size(); j++){
//use hm for comparison
}
}
public boolean equals(Object o) {
if (o == this)
return true;
if (!(o instanceof Set))
return false;
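Set<String> a = this; // assumes the enclosing class implements Set<String>
@SuppressWarnings("unchecked")
Set<String> b = (Set<String>) o; // unchecked cast; o is known to be a Set at this point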
Set<String> thedifference_a_b = new HashSet<String>(a);
thedifference_a_b.removeAll(b);
if(thedifference_a_b.isEmpty() == false) return false;
Set<String> thedifference_b_a = new HashSet<String>(b);
thedifference_b_a.removeAll(a);
if(thedifference_b_a.isEmpty() == false) return false;
return true;
}
I think a method reference to the equals method can be used. We assume that the element type has its own proper equals implementation. Here is a plain and simple example:
Set<String> set = new HashSet<>();
set.addAll(Arrays.asList("leo","bale","hanks"));
Set<String> set2 = new HashSet<>();
set2.addAll(Arrays.asList("hanks","leo","bale"));
Predicate<Set<String>> pred = set::equals;
boolean result = pred.test(set2);
System.out.println(result); // true