Should all classes have a .equals and .hashcode method?

I'm working on a unit testing project where the asserts call the .equals method. However, the people who worked on the project before me didn't generate these methods.
Is it considered best practice to auto-generate these methods as you code? Should all coders be doing this?
I searched for more information on the .equals and .hashCode methods, and most of what I found seems to be geared towards how to implement or override them.

It's mostly a question of taste - if you don't expect to use the equals method (e.g., aren't using assertEquals, never mean to use this class as a key in a Map, etc), writing it means you may be writing dead code, and some conventions would advocate avoiding it.
Here, there doesn't seem to be a question: if you intend to use assertEquals, you need the equals method implemented. If you're going to implement it, you should probably also implement hashCode in order to future-proof your code against sneaky, hard-to-find bugs.

Auto-generating these methods means settling on some standard implementation. One standard implementation is already coded in Object: it compares references and computes a native hash code.
Unless you can imagine some other standard implementation that fits all entities in your project, you probably shouldn't auto-generate equals and hashCode: implement them manually, once you know all the conditions of future comparisons.
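To make the "standard implementation coded in Object" concrete, here is a minimal sketch. The Point class is hypothetical and deliberately overrides nothing, so it inherits the reference comparison from Object.

```java
// Hypothetical class with no equals/hashCode overrides.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class DefaultEqualsDemo {
    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);

        // Object.equals compares references, so two distinct instances
        // are never equal, even when every field matches.
        System.out.println(a.equals(b)); // false
        System.out.println(a.equals(a)); // true
    }
}
```

An assertEquals on a and b would therefore fail until Point defines its own notion of equality.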

Related

Customize equals / hashCode in Idea's "Generate..."

When creating a new class, it is useful to generate all the boilerplate by IDE (unless e.g. Lombok is used, of course). I tried to do it with IntelliJ Idea and I didn't like the equals and hashCode methods.
In fact, not even Idea itself liked the equals method. The code inspection says that the statement can be simplified. Well, it looks slightly better after applying Simplify n+1 times, where n is the number of fields used in the methods, but it is still not the intended result.
Objects.equals(objA, objB) and Objects.hash(Object...) are considered best practice where I work. Is it possible to modify the templates used in Quick Generation feature?
If not, is there any update planned to enhance its behavior so that it at least passes inspections?
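For reference, the style the question asks for might look like the following sketch. The Person class and its fields are hypothetical; only Objects.equals and Objects.hash come from the question.

```java
import java.util.Objects;

// Hypothetical class used only to illustrate the Objects.equals / Objects.hash style.
final class Person {
    private final String name; // may be null
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Person other = (Person) o;
        // Objects.equals is null-safe, so no explicit null checks per field.
        return age == other.age && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Uses the same fields as equals, which keeps the two consistent.
        return Objects.hash(name, age);
    }
}
```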

The latest IDEA 14.1 EAP (https://confluence.jetbrains.com/display/IDEADEV/IDEA+14.1+EAP) contains this possibility, please try it.

Should you avoid Guavas Ordering.usingToString()?

This question was prompted after reading Joshua Bloch's "Effective Java". Specifically in Item #10, he argues that it is bad practice to parse an object's string representation and use it for anything except a friendlier printout/debug. The reason is that such a use "is error-prone, results in fragile systems that break if you change the format".
To me it looks like Guava's Ordering.usingToString() is a spot on example of this. So is it bad practice to use it?

Well, if the sorting is only used for deciding in which order to display things to a user, I'd argue it's part of "friendlier printout/debug".
If, however, your code's correctness depends on the ordering, then I'd argue that it's indeed a bad idea to depend on toString.

As the author of that method, I would agree: it's really just a crutch. For those "look, I just need an Ordering<Object>, dammit" cases. It should probably be removed, since you can get its behavior with Ordering.natural().onResultOf(Functions.toStringFunction()) anyway.

If your program ever uses toString() for lexical sorting by natural ordering in such a way that program execution depends on it, then it would be wise to override the default toString() in the class in question. In that case you should make the toString() method final and clearly document that it is used for ordering.
It would however be much better to create another method returning a String and create an ordering depending on that result, possibly by creating a specific Comparator to do the sorting. See for instance the final method name() used for enumerations in Java. In general it returns the same String as toString(), but it is still possible to perform ordering with it even if toString() has been overridden.
If you take that approach, Ordering.usingToString() would of course not be of much use.
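A minimal sketch of the "separate method plus Comparator" idea, using only the JDK rather than Guava. The Product class, its sortKey() method and the BY_SORT_KEY constant are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class Product {
    private final String code;
    private final String label;

    Product(String code, String label) {
        this.code = code;
        this.label = label;
    }

    // Dedicated, documented sort key. It can stay stable even if toString() changes.
    String sortKey() {
        return code;
    }

    @Override
    public String toString() {
        return "Product[" + label + "]";
    }
}

class SortKeyDemo {
    // The ordering depends on sortKey(), not on the string representation.
    static final Comparator<Product> BY_SORT_KEY = Comparator.comparing(Product::sortKey);

    public static void main(String[] args) {
        List<Product> products = new ArrayList<>();
        products.add(new Product("B-2", "Anvil"));
        products.add(new Product("A-1", "Zipper"));
        products.sort(BY_SORT_KEY);
        System.out.println(products); // [Product[Zipper], Product[Anvil]]
    }
}
```

Sorting by toString() would have put Anvil first here, which shows how the two orderings can diverge once the printout format is free to change.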

There are some obvious cases where it actually makes sense like StringBuffer etc. Obviously it doesn't make sense for most "business" classes to depend on toString().

Issues with using objects as Map keys with Java

Given an object we will call loc that simply holds 2 int member values, I believe I need to come up with a mechanism to generate a hashcode for the object. What I tried below doesn't work as it uses an object reference, and the 2 references will be different despite having the same member variables.
Map<Loc,String> mapTest = new HashMap<Loc,String>();
mapTest.put(new Loc(1,2), "String 1");
mapTest.put(new Loc(0,1), "String 2");
mapTest.put(new Loc(2,2), "String 3");
System.out.println("Should be String 2 " + mapTest.get(new Loc(0,1)));
After some reading it seems I need to roll my own hashcode for this object, and use that hashcode as the key. Just wanted to confirm that I am on the right track here, and if someone could guide me to simple implementations to look at that would be excellent.
Thanks

Yes, you need to override equals() and hashCode(), and they need to behave consistently (that is, equal objects have to have the same hash code). No, you do not use the hash code directly; the Map uses it.

Yes, you're on the right track.
See articles like this for more details.
There are a lot of different ways to implement a hash code; you'll probably just want to combine the hash codes of the two int fields.
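As a rough sketch of what that could look like for the Loc class from the question: the field names x and y are assumptions (the question only says the class holds two ints), and 31 * x + y is just one common way to combine them.

```java
import java.util.HashMap;
import java.util.Map;

final class Loc {
    private final int x; // field names are assumed, not from the question
    private final int y;

    Loc(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Loc)) return false;
        Loc other = (Loc) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() {
        // Combines both fields; equal Locs always produce the same value.
        return 31 * x + y;
    }
}

class LocDemo {
    public static void main(String[] args) {
        Map<Loc, String> mapTest = new HashMap<>();
        mapTest.put(new Loc(1, 2), "String 1");
        mapTest.put(new Loc(0, 1), "String 2");
        mapTest.put(new Loc(2, 2), "String 3");
        // The lookup now succeeds with a fresh, equal key.
        System.out.println("Should be String 2 " + mapTest.get(new Loc(0, 1)));
    }
}
```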

Writing correct equals and hashCode methods can be tricky, and the consequences of getting it wrong can be subtle and annoying. If you are able to, I would use the Apache commons-lang library and take advantage of the HashCodeBuilder and EqualsBuilder classes. They will make it much easier to get the implementations right. The benefit of using these libraries is that it is much harder to get the boilerplate wrong, they hide the visual noise these methods tend to create, and they make it harder for someone to come along later and mess it up. Of course, another alternative is to let your IDE generate those methods for you, which works but just creates more of the noisy code vomit Java is known for.

If you want to use your type as a key type in a map, it's essential that it provides sane implementations of equals and hashCode. Fortunately, you don't have to write these implementations manually. Eclipse (and I guess other IDEs as well) can generate this boilerplate for you. Or you can even use Project Lombok for that.
Ideally, an object used as a key in a map should be immutable. This saves you from many bugs caused by equality changing when the key is mutated.
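To illustrate the kind of bug meant here, a small sketch with a deliberately mutable key. MutableKey is a hypothetical class written only for this demonstration.

```java
import java.util.HashMap;
import java.util.Map;

final class MutableKey {
    int id; // deliberately mutable to show the problem

    MutableKey(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && ((MutableKey) o).id == id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}

class MutableKeyDemo {
    public static void main(String[] args) {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "value");

        key.id = 2; // the hash code changes after the entry was stored

        System.out.println(map.get(key));               // null
        System.out.println(map.get(new MutableKey(1))); // null
        System.out.println(map.size());                 // 1
    }
}
```

The entry is still in the map but can no longer be found through either the mutated key or a key with the original value, which is why final fields are the safer default for key types.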

You need to implement both hashCode() and equals(). Joshua Bloch's Effective Java should be the definitive source on the "how" part of your question, and I'm not sure if it's okay to reproduce it here, so I'll just refer you to it.

JPA : not overriding equals() and hashCode() in the entities?

After reading this article, I'm leaning toward not overriding equals() and hashCode() at all.
According to the summary of that article, in the "no eq/hC at all" column, the only consequence is that I couldn't do comparison operations like:
contains() in a List for detached entities, or
compare the same entities from different sessions
and expect the correct result.
But I'm still in doubt, and would like to ask about your experience: is it bad practice to skip equals and hashCode altogether, and what other consequences are there that I don't know about yet?
One more point of information: I'm leaning towards using List collections over Set, and my assumption is that I don't really need to override hashCode and equals when storing entities in a List.

Read this very nice article on the subject: Don't Let Hibernate Steal Your Identity.
The conclusion of the article goes like this:
Object identity is deceptively hard to implement correctly when
objects are persisted to a database. However, the problems stem
entirely from allowing objects to exist without an id before they are
saved. We can solve these problems by taking the responsibility of
assigning object IDs away from object-relational mapping frameworks
such as Hibernate. Instead, object IDs can be assigned as soon as the
object is instantiated. This makes object identity simple and
error-free, and reduces the amount of code needed in the domain model.
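A minimal sketch of the idea in that conclusion, without any persistence annotations. The Customer class is hypothetical, and using java.util.UUID as the ID source is an assumption made for the example, not something the quote prescribes.

```java
import java.util.UUID;

// Hypothetical entity whose ID exists from the moment it is instantiated.
final class Customer {
    private final UUID id = UUID.randomUUID(); // assigned before any save
    private String name;

    Customer(String name) {
        this.name = name;
    }

    UUID getId() {
        return id;
    }

    void setName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        // Identity is the ID alone; mutable state such as name is ignored.
        return o instanceof Customer && id.equals(((Customer) o).id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}
```

Because the ID never changes, the hash code stays the same before and after the object is saved, so the entity can sit in a HashSet across that transition.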

whether it is a bad practice to skip equals and hashCode altogether
Yes. You should always override your equals and hashCode. Period. The reason is that these methods are already present in your class, implemented in Object. It turns out that this implementation is generic, and nearly 100% of the time it's the wrong implementation for your own objects. So, by skipping equals/hashCode you are in fact providing a wrong implementation and will (in the best case scenario) confuse whoever uses these classes. It may be your colleagues, or it may be some framework you are using (which can lead to unpredictable and hard-to-debug issues).
There's no reason not to implement these methods. Most IDEs provide a generator for equals/hashCode. You just need to inform the IDE about your business key.

You got the exact opposite conclusion from that article of what it was trying to convey.
Hibernate heavily relies on equals being implemented properly. It will malfunction if you don't.
In fact, almost everything does; including standard java collections.
The default implementation does not work when using persistence. You should always implement both equals and hashCode. There's a simple rule on how to do it, too:
For entities, use the key of the object.
For value objects, use the values.
Always make sure the values you use in your equals/hashCode are immutable. If you pass these out (like in a getter), preferably pass them out in an immutable form.
This advice will improve your life :)
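The two rules can be sketched side by side as follows. Account and Money are hypothetical classes, and treating the account number as the entity's key is an assumption for the example.

```java
import java.util.Objects;

// Entity: equality is based on the key only.
final class Account {
    private final String accountNumber; // immutable key (assumed for this sketch)
    private long balanceInCents;        // mutable state, ignored by equals/hashCode

    Account(String accountNumber, long balanceInCents) {
        this.accountNumber = Objects.requireNonNull(accountNumber);
        this.balanceInCents = balanceInCents;
    }

    void deposit(long cents) {
        balanceInCents += cents;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Account
                && accountNumber.equals(((Account) o).accountNumber);
    }

    @Override
    public int hashCode() {
        return accountNumber.hashCode();
    }
}

// Value object: equality is based on all of the (immutable) values.
final class Money {
    private final long cents;
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = Objects.requireNonNull(currency);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Money)) return false;
        Money other = (Money) o;
        return cents == other.cents && currency.equals(other.currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(cents, currency);
    }
}
```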

How should one unit test the hashCode-equals contract?

In a nutshell, the hashCode contract, according to Java's Object.hashCode():
1. The hash code shouldn't change unless something affecting equals() changes
2. equals() implies hash codes are ==
Let's assume interest primarily in immutable data objects - their information never changes after they're constructed, so #1 is assumed to hold. That leaves #2: the problem is simply one of confirming that equals implies hash code ==.
Obviously, we can't test every conceivable data object unless that set is trivially small. So, what is the best way to write a unit test that is likely to catch the common cases?
Since the instances of this class are immutable, there are limited ways to construct such an object; this unit test should cover all of them if possible. Off the top of my head, the entry points are the constructors, deserialization, and constructors of subclasses (which should be reducible to the constructor call problem).
[I'm going to try to answer my own question via research. Input from other StackOverflowers is a welcome safety mechanism to this process.]
[This could be applicable to other OO languages, so I'm adding that tag.]

EqualsVerifier is a relatively new open source project and it does a very good job at testing the equals contract. It doesn't have the issues the EqualsTester from GSBase has. I would definitely recommend it.

My advice would be to think of why/how this might ever not hold true, and then write some unit tests which target those situations.
For example, let's say you had a custom Set class. Two sets are equal if they contain the same elements, but it's possible for the underlying data structures of two equal sets to differ if those elements are stored in a different order. For example:
MySet s1 = new MySet( new String[]{"Hello", "World"} );
MySet s2 = new MySet( new String[]{"World", "Hello"} );
assertEquals(s1, s2);
assertTrue( s1.hashCode()==s2.hashCode() );
In this case, the order of the elements in the sets might affect their hash, depending on the hashing algorithm you've implemented. So this is the kind of test I'd write, since it tests the case where I know it would be possible for some hashing algorithm to produce different results for two objects I've defined to be equal.
You should use a similar standard with your own custom class, whatever that is.
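For illustration, one possible MySet that would pass the test above might look like this. It is only a sketch: the internal list, the duplicate handling and the summing hash are assumptions, chosen because addition is order-independent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;

final class MySet {
    // Insertion order is preserved, so two equal sets can differ internally.
    private final List<String> elements;

    MySet(String[] elements) {
        // Drop duplicates while keeping first-seen order.
        this.elements = new ArrayList<>(new LinkedHashSet<>(Arrays.asList(elements)));
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MySet)) return false;
        MySet other = (MySet) o;
        // Same size plus mutual containment means the same elements.
        return elements.size() == other.elements.size()
                && elements.containsAll(other.elements);
    }

    @Override
    public int hashCode() {
        // Addition is commutative, so element order cannot affect the result.
        int hash = 0;
        for (String element : elements) {
            hash += Objects.hashCode(element);
        }
        return hash;
    }
}
```

Had hashCode delegated to elements.hashCode() instead, the result would depend on element order, and the second assertion in the test above would be the one to expose it.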

It's worth using the junit-addons library for this. Check out the class EqualsHashCodeTestCase (http://junit-addons.sourceforge.net/): you can extend it and implement createInstance and createNotEqualInstance, and it will check that the equals and hashCode methods are correct.

I would recommend the EqualsTester from GSBase. It does basically what you want. I have two (minor) problems with it though:
The constructor does all the work, which I don't consider to be good practice.
It fails when an instance of class A is equal to an instance of a subclass of class A. This is not necessarily a violation of the equals contract.

[At the time of this writing, three other answers were posted.]
To reiterate, the aim of my question is to find standard cases of tests to confirm that hashCode and equals are agreeing with each other. My approach to this question is to imagine the common paths taken by programmers when writing the classes in question, namely, immutable data. For example:
1. Wrote equals() without writing hashCode(). This often means equality was defined to mean equality of the fields of two instances.
2. Wrote hashCode() without writing equals(). This may mean the programmer was seeking a more efficient hashing algorithm.
In the case of #2, the problem seems nonexistent to me. No additional instances have been made equals(), so no additional instances are required to have equal hash codes. At worst, the hash algorithm may yield poorer performance for hash maps, which is outside the scope of this question.
In the case of #1, the standard unit test entails creating two instances of the same object with the same data passed to the constructor, and verifying equal hash codes. What about false positives? It's possible to pick constructor parameters that just happen to yield equal hash codes on a nonetheless unsound algorithm. A unit test that tends to avoid such parameters would fulfill the spirit of this question. The shortcut here is to inspect the source code for equals(), think hard, and write a test based on that, but while this may be necessary in some cases, there may also be common tests that catch common problems - and such tests also fulfill the spirit of this question.
For example, if the class to be tested (call it Data) has a constructor that takes a String, and instances constructed from Strings that are equals() yielded instances that were equals(), then a good test would probably test:
new Data("foo")
another new Data("foo")
We could even check the hash code for new Data(new String("foo")), to force the String to not be interned, although that's more likely to yield a correct hash code than Data.equals() is to yield a correct result, in my opinion.
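A small sketch of that check. The Data class body is an assumption (the text only says it has a constructor taking a String), and the check is written with plain AssertionError rather than a test framework to keep it self-contained.

```java
// Hypothetical immutable data class wrapping a single String.
final class Data {
    private final String value;

    Data(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        // Compares content; an unsound version using == on value would
        // pass for two literals but fail for the non-interned copy below.
        return o instanceof Data && value.equals(((Data) o).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }
}

class DataCheck {
    static void checkEqualInstances() {
        Data fromLiteral = new Data("foo");
        // new String(...) forces a distinct, non-interned String instance.
        Data fromCopy = new Data(new String("foo"));

        if (!fromLiteral.equals(fromCopy)) {
            throw new AssertionError("instances built from equal Strings should be equal");
        }
        if (fromLiteral.hashCode() != fromCopy.hashCode()) {
            throw new AssertionError("equal instances should have equal hash codes");
        }
    }

    public static void main(String[] args) {
        checkEqualInstances();
        System.out.println("equals and hashCode agree");
    }
}
```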
Eli Courtwright's answer is an example of thinking hard of a way to break the hash algorithm based on knowledge of the equals specification. The example of a special collection is a good one, as user-made Collections do turn up at times, and are quite prone to muckups in the hash algorithm.

This is one of the only cases where I would have multiple asserts in a test. Since you need to test the equals method you should also check the hashCode method at the same time. So on each of your equals method test cases check the hashCode contract as well.
A one = new A(...);
A two = new A(...);
assertEquals("These should be equal", one, two);
int oneCode = one.hashCode();
assertEquals("HashCodes should be equal", oneCode, two.hashCode());
assertEquals("HashCode should not change", oneCode, one.hashCode());
And of course checking for a good hashCode is another exercise. Honestly, I wouldn't bother with the double check that the hashCode isn't changing within the same run; that sort of problem is better handled by catching it in a code review and helping the developer understand why that's not a good way to write hashCode methods.

You can also use something similar to http://code.google.com/p/guava-libraries/source/browse/guava-testlib/src/com/google/common/testing/EqualsTester.java
to test equals and hashCode.

If I have a class Thing, as most others do I write a class ThingTest, which holds all the unit tests for that class. Each ThingTest has a method
public static void checkInvariants(final Thing thing) {
...
}
and if the Thing class overrides hashCode and equals it has a method
public static void checkInvariants(final Thing thing1, Thing thing2) {
ObjectTest.checkInvariants(thing1, thing2);
... invariants that are specific to Thing
}
That method is responsible for checking all invariants that are designed to hold between any pair of Thing objects. The ObjectTest method it delegates to is responsible for checking all invariants that must hold between any pair of objects. As equals and hashCode are methods of all objects, that method checks that hashCode and equals are consistent.
I then have some test methods that create pairs of Thing objects, and pass them to the pairwise checkInvariants method. I use equivalence partitioning to decide what pairs are worth testing. I usually create each pair to be different in only one attribute, plus a test that tests two equivalent objects.
I also sometimes have a 3-argument checkInvariants method, although I find that it is less useful in finding defects, so I do not do this often.
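The answer does not show the body of ObjectTest.checkInvariants, so the following is only a guess at what such a pairwise check could contain, based on the equals and hashCode contracts of java.lang.Object. The private check helper is hypothetical.

```java
final class ObjectTest {
    private ObjectTest() {
    }

    // Checks invariants that must hold between any pair of non-null objects.
    static void checkInvariants(Object o1, Object o2) {
        // equals is reflexive.
        check(o1.equals(o1), "o1 must equal itself");
        check(o2.equals(o2), "o2 must equal itself");

        // equals is symmetric.
        check(o1.equals(o2) == o2.equals(o1), "equals must be symmetric");

        // Nothing equals null.
        check(!o1.equals(null), "o1 must not equal null");
        check(!o2.equals(null), "o2 must not equal null");

        // hashCode is stable across repeated calls.
        check(o1.hashCode() == o1.hashCode(), "o1.hashCode() must be stable");
        check(o2.hashCode() == o2.hashCode(), "o2.hashCode() must be stable");

        // Equal objects must have equal hash codes.
        if (o1.equals(o2)) {
            check(o1.hashCode() == o2.hashCode(),
                    "equal objects must have equal hash codes");
        }
    }

    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}
```

A ThingTest.checkInvariants(thing1, thing2) method would call this first and then add whatever checks are specific to Thing.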
