Check if two objects are completely equal in Java

I've got a Java class, here's an example:
public class Car {
private int fuelType;
private Date made;
private String name;
.
.
. // and so on
Now let's say I have two car objects and I want to compare if all their variables are equal.
Right now, I've solved this by overriding method equals(Object o) and I check if all the variables match in both objects.
The problem here is that if I have 20 classes, I'll have to override equals(Object o) in every single one of them.
Is there a way to create some sort of universal method that could compare any two objects that I pass to it and let me know whether they match in every single variable or not?

You have a few options for automating equals & hashCode (option #3 BLEW MY MIND!):
1. Your IDE. I would not recommend it for most objects, as the generated methods can slowly drift out of date with the actual class definition. They also look ugly and pollute your codebase with boilerplate code.
2. Apache Commons has a bunch of stuff for making this easier, including a reflective version, so there is no risk of drifting out of date with the class definition. It is better than #1 unless you require a speedy equals/hashCode, but still too much boilerplate for my liking.
3. Project Lombok and annotation processing. Whack an EqualsAndHashCode annotation on ya class and be done with it. I recommend using Project Lombok. It adds a touch of magic to the build (but not much) and so requires a plugin for your IDE to behave nicely, but that is a small price to pay for no boilerplate code. Lombok is an annotation processor that runs at compile time, so you have no runtime performance hit.
4. Using a different language that supports it out of the box, but also targets the JVM. Groovy uses an annotation and Kotlin supports data classes. Unless your existing code can quickly be converted, I would avoid this.
5. Google's Auto has an AutoValue. Like Project Lombok this is an annotation processor, but it has less magic at the expense of a little more boilerplate (thanks to Louis Wasserman)

You can use:
org.apache.commons.lang.builder.CompareToBuilder.reflectionCompare(Object lhs, Object rhs);
It uses reflection to compare the fields and returns 0 when they all compare as equal.
See its javadoc for details.
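To make the reflection idea concrete without pulling in a library, here is a minimal sketch of a "universal" field-by-field comparison. It is not how Apache Commons implements it; the names ReflectiveEquals and fieldsEqual are made up for this illustration. It assumes both objects come from your own classes (reflecting into JDK internals can be refused on newer Java versions) and it compares array fields by reference only.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Objects;

// Hypothetical helper, not part of any library.
final class ReflectiveEquals {
    private ReflectiveEquals() {}

    /** True if both objects have the same class and all instance fields are equal. */
    static boolean fieldsEqual(Object a, Object b) {
        if (a == b) {
            return true;
        }
        if (a == null || b == null || a.getClass() != b.getClass()) {
            return false;
        }
        // Walk up the hierarchy so inherited fields are compared as well.
        for (Class<?> c = a.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                    continue; // only instance state matters
                }
                try {
                    f.setAccessible(true);
                    if (!Objects.equals(f.get(a), f.get(b))) {
                        return false;
                    }
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Cannot read field " + f, e);
                }
            }
        }
        return true;
    }
}

class ReflectiveEqualsDemo {
    // Cut-down version of the Car class from the question.
    static class Car {
        private final int fuelType;
        private final String name;

        Car(int fuelType, String name) {
            this.fuelType = fuelType;
            this.name = name;
        }
    }

    public static void main(String[] args) {
        System.out.println(ReflectiveEquals.fieldsEqual(new Car(1, "Mini"), new Car(1, "Mini"))); // true
        System.out.println(ReflectiveEquals.fieldsEqual(new Car(1, "Mini"), new Car(2, "Mini"))); // false
    }
}
```

Note that a helper like this does not change what Car.equals returns, so collections still will not treat the two cars as equal unless equals delegates to it.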

I'll take the dissenting opinion to the majority (use apache commons with reflection) here: Yes, this is a bit of code you have to write (let your IDE generate it, really), but you only have to do it once, and the number of data classes that need to implement equals/hashCode is generally rather manageable - at least in all of the large projects (250k+ LOC) I worked on.
Sure, if you add a new member to the class you will have to remember to update the equals/hashCode functions, but that's generally easy to notice, at the latest during code reviews.
And honestly, if you use a simple little helper class that's even in Java 7, you can cut down the code that Wana Ant showed immensely. Really all you need is:
@Override
public boolean equals(Object o) {
if (o instanceof Car) { // nb: broken if car is not final - other topic
Car other = (Car) o;
return Objects.equals(fuelType, other.fuelType) &&
Objects.equals(made, other.made) &&
Objects.equals(name, other.name);
}
return false;
}
similar for hashcode:
@Override
public int hashCode() {
return Objects.hash(fuelType, made, name);
}
Not as short as the reflection solution? True, but it's simple, easy to maintain, adapt and read - and performance is orders of magnitude better (which for classes that implement equals and hashcode is often important)
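For reference, the two snippets above can be put together into a complete class roughly like this. It is a sketch under a few assumptions: the class is declared final (which sidesteps the subclassing caveat in the comment), the constructor is invented for the demo, and the primitive fuelType is compared with == rather than being boxed.

```java
import java.util.Date;
import java.util.Objects;

final class Car {
    private final int fuelType;
    private final Date made;   // may be null
    private final String name; // may be null

    Car(int fuelType, Date made, String name) {
        this.fuelType = fuelType;
        this.made = made;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Car)) {
            return false; // also covers o == null
        }
        Car other = (Car) o;
        return fuelType == other.fuelType
                && Objects.equals(made, other.made)   // null-safe
                && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(fuelType, made, name);
    }
}

class CarDemo {
    public static void main(String[] args) {
        Car a = new Car(1, new Date(0L), "Mini");
        Car b = new Car(1, new Date(0L), "Mini");
        Car c = new Car(1, null, "Mini");
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(c));                  // false, and no NullPointerException
    }
}
```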

Typically you can generate equals/hashCode methods with your IDE - all big players in this field are capable of that (Eclipse, IntelliJ IDEA and NetBeans).
Generally you can create some code that will use reflection, but I don't recommend it, as the explicit approach is clearer and more maintainable. Also, reflection won't be as fast as the "standard" way. If you really want to go this way, there are utilities like EqualsBuilder and HashCodeBuilder.
Just for your information, there are JVM-based languages that already support these features, e.g. Kotlin data classes, which can be pretty nicely used in existing Java projects.

I'll just throw in a plug for my favorite solution to this problem: @AutoValue.
This is an open-source project from Google that provides an annotation processor that generates a synthetic class that implements equals and hashCode for you.
Since it's auto-generated code, you don't have to worry about accidentally forgetting a field or messing up the equals or hashCode implementation. But since the code is generated at compile time, there's zero runtime overhead (unlike reflection-based solutions). It's also "API-invisible" -- users of your class can't tell the difference between an @AutoValue type and a type you implemented yourself, and you can change back and forth in the future without breaking callers.
See also this presentation which explains the rationale and does a better job comparing it to other approaches.

Theoretically you could use reflection to create some kind of util, as many people suggest in the comments. Personally I don't recommend it: you will end up with something that only partially works.
Many things in Java rely on equals or hashCode, for example the contains method, which you can find in anything that implements Collection.
Overriding equals (and hashCode) is the recommended solution. In addition, I think any decent IDE will have an option to generate them for you. Hence you can do it quicker than by using reflection.
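A small illustration of why an external comparison util only gets you part of the way: List.contains asks the elements' own equals method, so a class that does not override it is only found by identity. The two nested classes below are hypothetical and exist only for this demo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class ContainsDemo {
    // No equals override: falls back to Object.equals, i.e. identity.
    static final class PlainCar {
        final String name;

        PlainCar(String name) {
            this.name = name;
        }
    }

    // Overrides equals/hashCode, so equal names mean equal cars.
    static final class ValueCar {
        final String name;

        ValueCar(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof ValueCar && Objects.equals(name, ((ValueCar) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    static boolean plainFound() {
        List<PlainCar> cars = new ArrayList<>();
        cars.add(new PlainCar("Mini"));
        return cars.contains(new PlainCar("Mini"));
    }

    static boolean valueFound() {
        List<ValueCar> cars = new ArrayList<>();
        cars.add(new ValueCar("Mini"));
        return cars.contains(new ValueCar("Mini"));
    }

    public static void main(String[] args) {
        System.out.println(plainFound()); // false
        System.out.println(valueFound()); // true
    }
}
```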

That's the way I would do it:
@Override
public boolean equals(Object obj) {
if (obj instanceof Car) {
return internalEquals((Car) obj);
}
return super.equals(obj);
}
protected boolean internalEquals(Car other) {
if(this==other){
return true;
}
if (other != null) {
//suppose fuelType can be Integer.
if (this.getFuelType() !=null) {
if (other.getFuelType() == null) {
return false;
} else if (!this.getFuelType().equals(other.getFuelType())) {
return false;
}
} else if(other.getFuelType()!=null){
return false;
}
if (this.getName() != null) {
if (other.getName() == null) {
return false;
} else if (!this.getName().equals(other.getName())) {
return false;
}
}
else if(other.getName()!=null){
return false;
}
if (this.getDate() != null) {
if (other.getDate() == null) {
return false;
} else if (this.getDate().getTime() != other.getDate().getTime()) {
return false;
}
}
else if(other.getDate()!=null){
return false;
}
return true;
} else {
return false;
}
}
EDIT
Simplified version
public class Utils{
/**
* Compares the two given objects and returns true,
* if they are equal and false, if they are not.
* @param a one of the two objects to compare
* @param b the other one of the two objects to compare
* @return true if the two given objects are equal.
*/
public static boolean areObjectsEqual(Object a, Object b) {
if (a == b){
return true;
}
return (a!=null && a.equals(b));
}
public static boolean areDatesEqual(Date a, Date b){
if(a == b){
return true;
}
if(a==null || b==null){
return false;
}
return a.getTime() == b.getTime();
}
}
@Override
public boolean equals(Object other) {
if(this == other){
return true;
}
if(other == null){
return false;
}
if (other instanceof Car) {
return internalEquals((Car) other);
}
return super.equals(other);
}
protected boolean internalEquals(Car other) {
//suppose fuelType can be Integer.
if (!Utils.areObjectsEqual(this.getFuelType(), other.getFuelType())){
return false;
}
if (!Utils.areObjectsEqual(this.getName(), other.getName())){
return false;
}
if (!Utils.areDatesEqual(this.getDate(), other.getDate())){
return false;
}
return true;
}
}
Also don't forget about hashCode; they go hand in hand.
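As a rough illustration of that last point: hash-based collections locate an element through hashCode first and only then call equals, so the two methods have to be computed from the same fields. The Plate class below is a hypothetical example, not part of the code above.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashContractDemo {
    // Hypothetical value class; equals and hashCode both use the same single field.
    static final class Plate {
        private final String number;

        Plate(String number) {
            this.number = number;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Plate && Objects.equals(number, ((Plate) o).number);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(number);
        }
    }

    static Set<Plate> sample() {
        Set<Plate> seen = new HashSet<>();
        seen.add(new Plate("AB-123"));
        seen.add(new Plate("AB-123")); // equal to the first one, so not stored twice
        return seen;
    }

    public static void main(String[] args) {
        Set<Plate> seen = sample();
        System.out.println(seen.size());                       // 1
        System.out.println(seen.contains(new Plate("AB-123"))); // true
    }
}
```

If hashCode were left at its default, the set would usually look in the wrong bucket and treat the two plates as different even though equals says they match.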

Related

Java avoid using too many if statement or too many validator classes

I am using a lot of if statements for checks. Like:
if(statement 1){
block 1;
}
if(statement 2){
block 2;
}
...//about at least 20 if
if(statement n){
block n;
}
To avoid using too many if-statements, I have tried to use the strategy pattern, which would create a validator class for each if-statement. Like:
public interface Validator<SomeObject>{
public Result validate(SomeObject o);
}
public class SomeValidator implements Validator<SomeObject> {
@Override
public boolean validate(SomeObject o) throws Exception{
if(statement 1){
block 1;
}
}
}
Because I may have at least 20 if-statements, I may need at least 20 validator classes. Is there a better solution for that? Or how can I manage these 20 validator classes?
Edit:
To be more specific, I am writing some code for checking the problem on my schedule. For example:
if(currentDate > mustFinishDate){
warning();
}
if(NotScheduleADateForThisTask){
warning();
}
if(DateFormatNotCorrect){
error();
}
The date checks above may themselves be if-statement blocks.

You can use the Composite pattern to maintain a list of all validators:
class ValidatorComposite<T> implements Validator<T> {
List<Validator<T>> validators = new ArrayList<>();
public void addValidator(Validator<T> add) { validators.add(add); }
public Result validate(T toValidate) {
Result result = Result.OK;
for (Validator<T> v : validators) {
result = v.validate(toValidate);
if (result != Result.OK) break;
}
return result;
}
}
and since Validator only has one method, for Java 8 it's a functional interface, so you don't really need "20 classes" but can create a list on the fly using lambdas.
ValidatorComposite<SomeObject> val = new ValidatorComposite<>();
val.addValidator(so -> condition1 ? block1(so) : Result.OK);
val.addValidator(so -> condition2 ? block2(so) : Result.OK);
and so on.
Your code sample isn't really consistent, because first you declare Validator to return Result and later let the implementation return boolean (and even throw an Exception), so I kind of integrated both by ignoring the exception and using a Result.OK value.
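Put together as something that compiles, the composite plus lambdas could look roughly like the sketch below. Several pieces are assumptions made for the illustration: the Result enum values, the Task class with its two date fields, and the ScheduleChecks factory are all invented here, and mustFinishBy is assumed to be non-null. The two checks mirror the "not scheduled" and "past the must-finish date" warnings from the question.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Assumed result type; the question never shows how Result is defined.
enum Result { OK, WARNING, ERROR }

@FunctionalInterface
interface Validator<T> {
    Result validate(T toValidate);
}

final class ValidatorComposite<T> implements Validator<T> {
    private final List<Validator<T>> validators = new ArrayList<>();

    ValidatorComposite<T> addValidator(Validator<T> v) {
        validators.add(v);
        return this; // returning this allows chaining
    }

    @Override
    public Result validate(T toValidate) {
        for (Validator<T> v : validators) {
            Result result = v.validate(toValidate);
            if (result != Result.OK) {
                return result; // stop at the first failing check
            }
        }
        return Result.OK;
    }
}

// Hypothetical task type standing in for the asker's schedule entries.
final class Task {
    final LocalDate scheduledFor; // null if nothing has been scheduled
    final LocalDate mustFinishBy; // assumed non-null

    Task(LocalDate scheduledFor, LocalDate mustFinishBy) {
        this.scheduledFor = scheduledFor;
        this.mustFinishBy = mustFinishBy;
    }
}

class ScheduleChecks {
    /** Builds the list of checks once; each lambda replaces one validator class. */
    static Validator<Task> forDate(LocalDate today) {
        return new ValidatorComposite<Task>()
                .addValidator(t -> t.scheduledFor == null ? Result.WARNING : Result.OK)
                .addValidator(t -> today.isAfter(t.mustFinishBy) ? Result.WARNING : Result.OK);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2024, 1, 10);
        Validator<Task> checks = forDate(today);
        System.out.println(checks.validate(new Task(today, LocalDate.of(2024, 1, 31)))); // OK
        System.out.println(checks.validate(new Task(null, LocalDate.of(2024, 1, 31))));  // WARNING
        System.out.println(checks.validate(new Task(today, LocalDate.of(2024, 1, 1))));  // WARNING
    }
}
```

Because the composite stops at the first non-OK result, the order in which validators are added decides which problem gets reported.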

These days performance is probably not what you should worry about most, given how powerful computers are; most programmers now try to write readable, clean code.
So I believe that if writing 20 ifs makes your code easier to understand and more flexible, it's not bad to implement it that way.
BTW, you can use switch-case too.
switch (variable){
case 1:{
//block 1
break; // without break, execution falls through to case 2
}
case 2:{
//block2
break;
}
...
}
If your cases are not similar and have different aspects, using the Validator pattern may lead to inflexibility (it depends on the situation).

How to scalably consider nulls in equals()?

To avoid a null pointer exception in my equals(), I have to consider the possibility of one or more nulls.
Is there a scalable way for considering nulls? The following code will become ugly quite fast if I add more type parameters.
public boolean equals(Pair<T, U> p) {
boolean firstIsEqual, secondIsEqual;
if (p.getFirst()==null) {
firstIsEqual = first == null;
}
else {
firstIsEqual = (first!=null) && (first.equals(p.getFirst()));
}
// copy-paste...
if (p.getSecond()==null) {
secondIsEqual = second == null;
}
else {
secondIsEqual = (second!=null) && (second.equals(p.getSecond()));
}
return (firstIsEqual && secondIsEqual);
}

The only solutions that scale are:
to use a library like Lombok that provides means to generate such methods during the compile phase
to use the built-in code generation features of your IDE
Seriously: nobody in the real world writes equals/hashCode methods manually. You use a tool and tell it which fields should be used in the computation. If a field is added, you have to remember to re-generate the methods.
Or you use a JVM language that supports data classes, such as Kotlin or Scala.
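For comparison, this is roughly the shape of what such tools typically generate for a two-field pair, using java.util.Objects for the null handling. It is a sketch, not the exact output of Lombok or any IDE; the constructor is assumed. Note that it takes Object as its parameter: the equals(Pair<T, U> p) in the question overloads equals rather than overriding it, so collections would not call it.

```java
import java.util.Objects;

final class Pair<T, U> {
    private final T first;  // may be null
    private final U second; // may be null

    Pair(T first, U second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof Pair)) {
            return false; // also covers o == null
        }
        Pair<?, ?> p = (Pair<?, ?>) o;
        // One null-safe line per field; adding a field adds one line.
        return Objects.equals(first, p.first)
                && Objects.equals(second, p.second);
    }

    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }
}

class PairDemo {
    public static void main(String[] args) {
        System.out.println(new Pair<>("a", null).equals(new Pair<>("a", null))); // true
        System.out.println(new Pair<>("a", null).equals(new Pair<>("a", 1)));    // false
        System.out.println(new Pair<>(null, null).equals(null));                  // false
    }
}
```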

Effective way to test if object is null

Which is better to use in order to check whether an object is null:
checking the object against null, or setting a flag for it?
By better I mean performance (faster and safer).
public class A
{
Object test;
boolean isTestObjectSet;
public A(Object test)
{
this.test = test;
isTestObjectSet = true;
}
public A()
{
}
public void doSomething()
{
if(test != null)
//do something
//VS
if(isTestObjectSet)
//do something
}
}

Using isTestObjectSet is just making things more complicated than they need to be. Just use test != null: it better conveys your intention and doesn't force you to keep this isTestObjectSet variable in-sync with whether or not test is set. There is absolutely no performance difference between the two variants.

In my opinion, the explicit check (comparing to null) is preferable, in the sense of not cluttering your code with many intermediary boolean variables.
Safer? Both checks are boolean checks, so it is always a true/false comparison.

Cyclomatic Complexity, joining conditions and readability

Consider the following method (in Java - and please just ignore the content):
public boolean equals(Object object) {
if (this == object) {
return true;
}
if (object == null) {
return false;
}
if (getClass() != object.getClass()) {
return false;
}
if (hashCode() != object.hashCode()) {
return false;
}
return true;
}
I have some plugin that calculates: eV(g)=5 and V(g)=5 - that is, it calculates Essential and common CC.
Now, we can write the above method as:
public boolean equals2(Object object) {
if (this == object) {
return true;
}
if (object == null || getClass() != object.getClass()) {
return false;
}
return hashCode() == object.hashCode();
}
and this plugin calculates eV(g)=3 and V(g)=3.
But as I understand CC, the values should be the same! CC is not about counting lines of code, but independent paths. Therefore, joining two ifs in one line does not really reduce CC. In fact, it can only make things less readable.
Am I right?
EDIT
Forgot to share this small convenient table for calculating CC quickly: Start with an initial (default) value of one (1). Add one (1) for each occurrence of each of the following:
if statement
while statement
for statement
case statement
catch statement
&& and || boolean operations
?: ternary operator and ?: Elvis operator.
?. null-check operator
EDIT 2
I proved that my plugin is not working well, since when I inline everything in one line:
public boolean equals(Object object) {
return this == object || object != null && getClass() == object.getClass() && hashCode() == object.hashCode();
}
it returns CC == 1, which is clearly wrong. Anyway, the question remains: is CC reduced
[A] 5 -> 4, or
[B] 4 -> 3
?

Long story short...
Your approach is a good approach to calculate CC, you just need to decide what you really want to do with it, and modify accordingly, if you need so.
For your second example, both CC=3 and CC=5 seem to be good.
The long story...
There are many different ways to calculate CC. You need to decide what is your purpose, and you need to know what are the limitations of your analysis.
The original definition from McCabe is actually the cyclomatic complexity (from graph theory) of the control flow graph. To calculate that one, you need to have a control flow graph, which might require a more precise analysis than your current one.
Static analyzers want to calculate metrics fast, so they do not analyze the control flow, but they calculate a complexity metric that is, say, close to it. As a result, there are several approaches...
For example, you can read a discussion about the CC metric of SonarQube here or another example how SourceMeter calculates McCC here.
What these tools have in common is that they count conditional statements, just like you do.
But these metrics won't always equal the number of independent execution paths; at least they give a good estimate.
Two different ways to calculate CC (McCabe and Myers' extension):
V_1(g) = number of decision nodes + 1
V_2(g) = number of simple_predicates in decision nodes + 1
If your goal is to estimate the number of test cases, V2 is the one for you. But, if you want to have a measure for code comprehension (e.g. you want to identify methods that are hard to maintain and should be simplified in the code), V1 is easier to calculate and enough for you.
In addition, static analyzers measure a number of additional complexity metrics too (e.g. Nesting Level).
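To see how the two counting styles diverge on the three equals variants from the question, here is a deliberately naive token counter. It is only a sketch: it does not build a control flow graph, it is fooled by keywords inside comments or strings, and it leaves out the ternary operator from the quick table. The class and method names are invented. byStatements counts only branching statements (roughly in the spirit of V_1), while byPredicates additionally counts && and || as the quick table does.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; real analyzers parse the code instead.
final class NaiveComplexity {
    private static final Pattern STATEMENTS = Pattern.compile("\\b(if|while|for|case|catch)\\b");
    private static final Pattern OPERATORS = Pattern.compile("&&|\\|\\|");

    private NaiveComplexity() {}

    private static int count(Pattern pattern, String source) {
        Matcher m = pattern.matcher(source);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    /** 1 + number of branching statements; boolean operators are ignored. */
    static int byStatements(String source) {
        return 1 + count(STATEMENTS, source);
    }

    /** 1 + branching statements + && and || operators, as in the quick table. */
    static int byPredicates(String source) {
        return byStatements(source) + count(OPERATORS, source);
    }
}

class ComplexityDemo {
    static final String EQUALS =
            "if (this == object) { return true; } "
          + "if (object == null) { return false; } "
          + "if (getClass() != object.getClass()) { return false; } "
          + "if (hashCode() != object.hashCode()) { return false; } "
          + "return true;";

    static final String EQUALS2 =
            "if (this == object) { return true; } "
          + "if (object == null || getClass() != object.getClass()) { return false; } "
          + "return hashCode() == object.hashCode();";

    static final String INLINED =
            "return this == object || object != null "
          + "&& getClass() == object.getClass() && hashCode() == object.hashCode();";

    public static void main(String[] args) {
        for (String src : new String[] {EQUALS, EQUALS2, INLINED}) {
            System.out.println(NaiveComplexity.byStatements(src)
                    + " / " + NaiveComplexity.byPredicates(src));
        }
        // prints: 5 / 5, then 3 / 4, then 1 / 4
    }
}
```

The statement-only count reproduces the plugin's 5, 3 and 1, which suggests (but does not prove) that the plugin ignores boolean operators; the quick-table count gives 5, 4 and 4.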

Converting this
if (hashCode() != object.hashCode()) {
return false;
}
return true;
to this
return hashCode() == object.hashCode();
obviously reduces CC by one, even by your quick table. There is only one path through the second version.
For the other case, while we can't know exactly how your plugin calculates those figures, it is reasonable to guess that it is treating if (object == null || getClass() != object.getClass()) as "if a non-null object's class matches then ...", which is a single check and thus adds just one to CC. I would consider that a reasonable shortcut since null checks can be rolled up into "real" checks very easily, even within the human brain.
My opinion is that the main aim of a CC-calculating IDE plugin should be to encourage you to make your code more maintainable by others. While there is a bug in the plugin (that inlined single-line conditional is not particularly maintainable), the general idea of rewarding a developer by giving them a better score for more readable code is laudable, even if it is slightly incorrect.
As to your final question: CC is 5 if you strictly consider logical paths; 4 if you consider cases you should consider writing unit tests for; and 3 if you consider how easy it is for someone else to quickly read and understand your code.

In the second method
return hashCode() == object.hashCode(); costs 0, so you win 1. It's considered a calculation and not a logical branch.
But for the first method I don't know why it costs 5; I calculate 4.

As far as style is concerned, I consider the following the most readable:
public boolean equals(Object object) {
return this == object || (object != null && eq(this, object));
}
private static boolean eq(Object x, Object y) {
return x.getClass() == y.getClass()
&& x.hashCode() == y.hashCode(); // safe because we have perfect hashing
}
In practice, it may not be right to exclude subclasses from being equal, and generally one can not assume that equal hash codes imply equal objects ... therefore, I'd rather write something like:
public boolean equals(Object object) {
return this == object || (object instanceof MyType && eq(this, (MyType) object));
}
public static boolean eq(MyType x, MyType y) {
return x.id.equals(y.id);
}
This is shorter, clearer in intent, just as extensible and efficient as your code, and has a lower cyclomatic complexity (logical operators are not commonly considered branches for counting cyclomatic complexity).

Adding additional checks to the instanceOf check in java

I have a situation where I check for the instanceOf some classes for proceeding with my logic.
For eg.
if (obj instanceof X)
{
result = true;
}
But now this is being used in a lot of places in my legacy code.
My problem is now this instanceOf should return true only if some global property variable is true.
I am looking for an alternative solution to replacing all these instanceOf checks as shown below:
if (GLOBALPROPERTY == true)
{
if (obj instanceof X)
{
result = true;
}
}
Can I inject this check inside the class X itself, so that it will return false wherever I check for instanceOf this class?

No, you can't.
The nearest thing I can think of is aspect-oriented programming with AspectJ. But that would be tricky - you'd probably have to switch from using instanceof to using proper polymorphic method calls, first.
But pragmatically, you'll probably just have to do a search and replace through your entire codebase.
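If you do go the search-and-replace route, one way to avoid repeating the nested if everywhere is to replace each instanceof check with a call to a single helper, so the global condition lives in one place. This is only a sketch: TypeChecks, isEnabledInstance and the globalProperty field are hypothetical names, and X here is an empty stand-in for your legacy class.

```java
// Empty stand-in for the legacy class X from the question.
class X {}

// Hypothetical helper; every former "obj instanceof X" would call this instead.
final class TypeChecks {
    // Stand-in for GLOBALPROPERTY; in real code this would read your actual setting.
    static volatile boolean globalProperty = true;

    private TypeChecks() {}

    /** True only if the global switch is on and obj is an instance of type. */
    static boolean isEnabledInstance(Object obj, Class<?> type) {
        return globalProperty && type.isInstance(obj);
    }
}

class TypeChecksDemo {
    public static void main(String[] args) {
        Object obj = new X();

        TypeChecks.globalProperty = true;
        System.out.println(TypeChecks.isEnabledInstance(obj, X.class));      // true
        System.out.println(TypeChecks.isEnabledInstance("text", X.class));   // false

        TypeChecks.globalProperty = false;
        System.out.println(TypeChecks.isEnabledInstance(obj, X.class));      // false
    }
}
```

If the rule changes again later, only the helper needs to be edited rather than every call site.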
