Why does the Collection interface have equals() and hashCode()? - java

Why does the Collection interface declare equals(Object o) and hashCode(), given that any implementation will have those by default (inherited from Object)?

From the Collection JavaDoc:
While
the Collection interface adds no stipulations to the general contract
for the Object.equals, programmers who implement the Collection
interface "directly" (in other words, create a class that is a
Collection but is not a Set or a List) must exercise care if they
choose to override the Object.equals. It is not necessary to do so,
and the simplest course of action is to rely on Object's
implementation, but the implementor may wish to implement a "value
comparison" in place of the default "reference comparison." (The List
and Set interfaces mandate such value comparisons.)
The general contract for the Object.equals method states that equals
must be symmetric (in other words, a.equals(b) if and only if
b.equals(a)). The contracts for List.equals and Set.equals state that
lists are only equal to other lists, and sets to other sets. Thus, a
custom equals method for a collection class that implements neither
the List nor Set interface must return false when this collection is
compared to any list or set. (By the same logic, it is not possible to
write a class that correctly implements both the Set and List
interfaces.)
and
While the Collection interface adds no stipulations to the general contract for the Object.hashCode method, programmers should take note that any class that overrides the Object.equals method must also override the Object.hashCode method in order to satisfy the general contract for the Object.hashCode method. In particular, c1.equals(c2) implies that c1.hashCode()==c2.hashCode().

To answer your specific question: why does it have these methods? They are redeclared simply so that the interface can carry Javadoc telling implementers what these methods should do (e.g. compare values rather than references).
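The contracts quoted above can be observed with standard java.util classes. This is only a minimal sketch: the class and method names (CollectionEqualsDemo and so on) are made up for illustration, and ArrayDeque is used as an example of a class that is a Collection but neither a List nor a Set and that keeps Object's reference comparison.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Set;

class CollectionEqualsDemo {
    static final List<Integer> VALUES = Arrays.asList(1, 2, 3);

    // List.equals mandates a value comparison, whatever the list implementation.
    static boolean listEqualsOtherList() {
        List<Integer> a = new ArrayList<>(VALUES);
        List<Integer> b = new LinkedList<>(VALUES);
        return a.equals(b);
    }

    // A list is never equal to a set (in either direction), even with the same elements.
    static boolean listEqualsSet() {
        List<Integer> list = new ArrayList<>(VALUES);
        Set<Integer> set = new LinkedHashSet<>(VALUES);
        return list.equals(set) || set.equals(list);
    }

    // ArrayDeque does not override equals, so two distinct deques are not equal.
    static boolean dequeEqualsDequeWithSameElements() {
        Collection<Integer> a = new ArrayDeque<>(VALUES);
        Collection<Integer> b = new ArrayDeque<>(VALUES);
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println("list vs list:   " + listEqualsOtherList());
        System.out.println("list vs set:    " + listEqualsSet());
        System.out.println("deque vs deque: " + dequeEqualsDequeWithSameElements());
    }
}
```

Only the first comparison returns true, which is the difference between the "value comparison" that List and Set mandate and the default "reference comparison" that a plain Collection may keep.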

To add to the other great answers: the equals method is declared in the Collection interface in order to specify how equality between two collection instances should work. From the Java 8 documentation:
More generally, implementations of the various Collections Framework
interfaces are free to take advantage of the specified behavior of
underlying Object methods wherever the implementor deems it
appropriate.
So the methods from the Object class are redeclared for no other reason than to make the Javadoc more definite. This is also why those methods are not counted among the abstract methods of an interface.
Moreover, in Java 8, along the same line of reasoning, default methods that override methods from the Object class are not allowed and will generate a compile error. I believe this was done to prevent this type of confusion. So if you try to create a default method called hashCode(), for example, it will not compile.
Here is a more in-depth explanation for this behavior in Java 8 from the Lambda FAQ:
An interface cannot provide a default implementation for any of the
methods of the Object class. This is a consequence of the “class wins”
rule for method resolution: a method found on the superclass chain
always takes precedence over any default methods that appear in any
superinterface. In particular, this means one cannot provide a default
implementation for equals, hashCode, or toString from within an
interface.
This seems odd at first, given that some interfaces actually define
their equals behavior in documentation. The List interface is an
example. So, why not allow this?
One reason is that it would become more difficult to reason about when
a default method is invoked. The current rules are simple: if a class
implements a method, that always wins over a default implementation.
Since all instances of interfaces are subclasses of Object, all
instances of interfaces have non-default implementations of equals,
hashCode, and toString already. Therefore, a default version of these
on an interface is always useless, and it may as well not compile.
Another reason is that providing default implementations of these
methods in an interface is most likely misguided. These methods
perform computations over the object’s state, but the interface, in
general, has no access to state; only the implementing class has
access to this state. Therefore, the class itself should provide the
implementations, and default methods are unlikely to be useful.
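The "class wins" rule from the FAQ can be tried out with a method that Object does not declare. The following is a small sketch; Greeter, Base, Polite and Plain are hypothetical names invented for this example.

```java
class ClassWinsDemo {
    interface Greeter {
        // A default for a name that Object does NOT declare: this compiles.
        default String greet() {
            return "hello from the Greeter default";
        }

        // Uncommenting the next line is rejected by javac, because toString()
        // is a public method of java.lang.Object:
        // default String toString() { return "Greeter"; }
    }

    static class Base {
        public String greet() {
            return "hello from Base";
        }
    }

    // greet() is found on the superclass chain, so the default is never selected.
    static class Polite extends Base implements Greeter { }

    // No class in the hierarchy implements greet(), so the default is used.
    static class Plain implements Greeter { }

    public static void main(String[] args) {
        System.out.println(new Polite().greet()); // hello from Base
        System.out.println(new Plain().greet());  // hello from the Greeter default
    }
}
```

Because every class has Object on its superclass chain, a default equals, hashCode or toString would always be in the position of Greeter.greet() in Polite: present but never chosen.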

Just to add to the great answers above, it makes sense to have the equals or hashCode methods in this scenario:
Collection<Whatever> list1 = getArrayList();
Collection<Whatever> list2 = getAnotherArrayList();
if(list1.equals(list2)){
// do something
}
In the absence of the equals method in the interface, we'll be forced to use concrete types, which is generally not a good practice:
ArrayList<Whatever> list1 = getArrayList();
ArrayList<Whatever> list2 = getAnotherArrayList();
if(list1.equals(list2)){
// do something
}

Related

Java Documentation: What is meaning of 'methods inherited from interface X

I must be missing some basic Java terminology here:
Classes can be extended, therefore their methods can be inherited by their sub-classes.
Interfaces can be implemented. An implementing class will have to implement all of the interface's methods - the interface itself does not implement anything, only declares.
So, how come when I look at the documentation of HashSet (https://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html), I see a list of methods which are inherited from interface java.util.Set ?
I guess you are referring to the statements like this from the generated JavaDoc HTML:
Methods inherited from interface java.util.Set ...
Inheritance in this sense means that the signature of the methods in question is inherited, but not necessarily the implementation. The reason for that is simple: In Java you usually do not look into the implementations of third party code, but only into the interfaces with their signatures and JavaDoc.
So, basically, the signatures of those methods are inherited from interface Set and implemented in the HashSet or AbstractSet. Hence, actually it is implementing the interface Set.
Sidenote: In Java 8, you can have Interfaces implementing methods, but that's a different story.
I think this has more to do with the javadoc than with the language. In Java, all the methods in the interface have to be implemented. So from a language standpoint, there's no real difference between add and addAll. Both are declared in Set; HashSet is a concrete class; therefore it must provide an implementation for both.
The difference really just has to do with whether the author had anything to add to the javadoc in the interface. For add, it's necessary to add javadoc to HashSet, because Set defines add as an optional operation (it could be implemented by throwing an exception), therefore HashSet needs to specify that add actually does something useful. For addAll, however, there's no need to add any documentation in HashSet that wasn't already in Set's javadoc.
So I think that the javadoc page is slightly inaccurate; it really should say that the "javadoc" is inherited from Set, not the methods. (Technically, the methods aren't inherited, because an abstract method from an interface isn't inherited if there's another method with the same signature--see JLS 8.4.8. That applies equally to all methods declared in the interface, whether or not the javadoc says they're "inherited".) However, saying "Documentation inherited from class java.util.Set" might look a little odd to readers. So I'm OK with a slight technical inaccuracy here if it conveys the message adequately. Most readers wouldn't notice the inaccuracy, and it really doesn't matter. In fact, I didn't notice this little flaw in the javadoc until you posted this question--and I'm someone who used to work on a compiler for a different language and spent many hours reading the language standard and delving into the exact definitions of the terms so that I could find out exactly what the standard required.
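To see where a signature is declared and where an implementation actually lives, reflection can be used. This is a rough sketch: the helper declares(...) and the class name DeclaredWhereDemo are invented here, and the only library call relied on is Class.getDeclaredMethod, which looks at the methods a type declares itself and ignores inherited ones.

```java
import java.util.AbstractCollection;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class DeclaredWhereDemo {
    // True if the given type itself declares the method (inherited members are ignored).
    static boolean declares(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            type.getDeclaredMethod(name, parameterTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The signature of addAll comes from the interface...
        System.out.println("Set declares addAll:                "
                + declares(Set.class, "addAll", Collection.class));
        // ...HashSet adds nothing of its own for it...
        System.out.println("HashSet declares addAll:            "
                + declares(HashSet.class, "addAll", Collection.class));
        // ...because the implementation sits in a superclass.
        System.out.println("AbstractCollection declares addAll: "
                + declares(AbstractCollection.class, "addAll", Collection.class));
        // add, by contrast, is implemented (and documented) in HashSet itself.
        System.out.println("HashSet declares add:               "
                + declares(HashSet.class, "add", Object.class));
    }
}
```

This matches the javadoc page: add appears in HashSet's own method list, while addAll is only listed as inherited.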

Override the Object.toString() Method with a default interface method in Java [duplicate]

Default methods are a nice new tool in our Java toolbox. However, I tried to write an interface that defines a default version of the toString method. Java tells me that this is forbidden, since methods declared in java.lang.Object may not be defaulted. Why is this the case?
I know that there is the "base class always wins" rule, so by default (pun ;), any default implementation of an Object method would be overridden by the method from Object anyway. However, I see no reason why there shouldn't be an exception for methods from Object in the spec. Especially for toString it might be very useful to have a default implementation.
So, what is the reason why Java designers decided to not allow default methods overriding methods from Object?
This is yet another of those language design issues that seem "obviously a good idea" until you start digging and realize that it's actually a bad idea.
This mail has a lot on the subject (and on other subjects too.) There were several design forces that converged to bring us to the current design:
The desire to keep the inheritance model simple;
The fact that once you look past the obvious examples (e.g., turning AbstractList into an interface), you realize that inheriting equals/hashCode/toString is strongly tied to single inheritance and state, and interfaces are multiply inherited and stateless;
That it potentially opened the door to some surprising behaviors.
You've already touched on the "keep it simple" goal; the inheritance and conflict-resolution rules are designed to be very simple (classes win over interfaces, derived interfaces win over superinterfaces, and any other conflicts are resolved by the implementing class.) Of course these rules could be tweaked to make an exception, but I think you'll find when you start pulling on that string, that the incremental complexity is not as small as you might think.
Of course, there's some degree of benefit that would justify more complexity, but in this case it's not there. The methods we're talking about here are equals, hashCode, and toString. These methods are all intrinsically about object state, and it is the class that owns the state, not the interface, who is in the best position to determine what equality means for that class (especially as the contract for equality is quite strong; see Effective Java for some surprising consequences); interface writers are just too far removed.
It's easy to pull out the AbstractList example; it would be lovely if we could get rid of AbstractList and put the behavior into the List interface. But once you move beyond this obvious example, there are not many other good examples to be found. At root, AbstractList is designed for single inheritance. But interfaces must be designed for multiple inheritance.
Further, imagine you are writing this class:
class Foo implements com.libraryA.Bar, com.libraryB.Moo {
// Implementation of Foo, that does NOT override equals
}
The Foo writer looks at the supertypes, sees no implementation of equals, and concludes that to get reference equality, all he need do is inherit equals from Object. Then, next week, the library maintainer for Bar "helpfully" adds a default equals implementation. Ooops! Now the semantics of Foo have been broken by an interface in another maintenance domain "helpfully" adding a default for a common method.
Defaults are supposed to be defaults. Adding a default to an interface where there was none (anywhere in the hierarchy) should not affect the semantics of concrete implementing classes. But if defaults could "override" Object methods, that wouldn't be true.
So, while it seems like a harmless feature, it is in fact quite harmful: it adds a lot of complexity for little incremental expressivity, and it makes it far too easy for well-intentioned, harmless-looking changes to separately compiled interfaces to undermine the intended semantics of implementing classes.
It is forbidden to define default methods in interfaces for methods in java.lang.Object, since the default methods would never be "reachable".
Default interface methods can be overridden in classes implementing the interface, and the class implementation of a method has higher precedence than the interface implementation, even if the method is implemented in a superclass. Since all classes inherit from java.lang.Object, the methods in java.lang.Object would have precedence over the default method in the interface and be invoked instead.
Brian Goetz from Oracle provides a few more details on the design decision in this mailing list post.
To give a very pedantic answer, it is only forbidden to define a default method for a public method from java.lang.Object. There are 11 methods to consider, which can be categorized in three ways to answer this question.
Six of the Object methods cannot have default methods because they are final and cannot be overridden at all: getClass(), notify(), notifyAll(), wait(), wait(long), and wait(long, int).
Three of the Object methods cannot have default methods for the reasons given above by Brian Goetz: equals(Object), hashCode(), and toString().
Two of the Object methods can have default methods, though the value of such defaults is questionable at best: clone() and finalize().
public class Main {
    public static void main(String... args) {
        new FOO().clone();
        new FOO().finalize();
    }
    interface ClonerFinalizer {
        // clone() and finalize() are protected in Object, so defaults for them are accepted
        default Object clone() { System.out.println("default clone"); return this; }
        default void finalize() { System.out.println("default finalize"); }
    }
    static class FOO implements ClonerFinalizer {
        @Override
        public Object clone() {
            return ClonerFinalizer.super.clone();
        }
        @Override
        public void finalize() {
            ClonerFinalizer.super.finalize();
        }
    }
}
I do not see into the head of Java language authors, so we may only guess. But I see many reasons and agree with them absolutely in this issue.
The main reason for introducing default methods is to be able to add new methods to interfaces without breaking the backward compatibility of older implementations. The default methods may also be used to provide "convenience" methods without the necessity to define them in each of the implementing classes.
None of these applies to toString and other methods of Object. Simply put, default methods were designed to provide the default behavior where there is no other definition. Not to provide implementations that will "compete" with other existing implementations.
The "base class always wins" rule has its solid reasons, too. It is supposed that classes define real implementations, while interfaces define default implementations, which are somewhat weaker.
Also, introducing ANY exception to the general rules causes unnecessary complexity and raises other questions. Object is (more or less) a class like any other, so why should it behave differently?
All in all, the solution you propose would probably bring more cons than pros.
The reasoning is very simple: Object is the base class for all Java classes. So even if one of Object's methods were defined as a default method in some interface, it would be useless, because Object's method would always be used. That is why, to avoid confusion, we cannot have default methods that override Object's methods.
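If a shared string representation is still wanted, one possible approach (a sketch, not something prescribed by the answers above) is to give the default method a different name and let each class opt in from its own toString(). Describable, describe() and City are hypothetical names used only for this illustration.

```java
class DescribableDemo {
    interface Describable {
        String name();

        // Hypothetical helper; deliberately NOT named toString(), which an
        // interface is not allowed to provide as a default method.
        default String describe() {
            return getClass().getSimpleName() + "[name=" + name() + "]";
        }
    }

    static final class City implements Describable {
        private final String name;

        City(String name) {
            this.name = name;
        }

        @Override
        public String name() {
            return name;
        }

        // The class opts in explicitly, so the class still decides its own toString().
        @Override
        public String toString() {
            return describe();
        }
    }

    public static void main(String[] args) {
        System.out.println(new City("Oslo")); // City[name=Oslo]
    }
}
```

A class that does not want the shared format simply does not delegate, so adding describe() to the interface later cannot silently change the semantics of existing implementations.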

Why does Comparator declare equals?

The Comparator interface has its own equals() method. Any class will get equals() by default through Object class. What is the need to have equals() method inside an interface?
Comparator refines the contract of Object.equals: It has to satisfy the constraints set out by Object.equals and then some.
Additionally, this method can return true only if the specified object is also a comparator and it imposes the same ordering as this comparator. Thus, comp1.equals(comp2) implies that sgn(comp1.compare(o1, o2))==sgn(comp2.compare(o1, o2)) for every object reference o1 and o2.
Declaring an equals inside Comparator allows you to document this in the form of javadoc.
Note that the documentation of the API also serves as the contract, so it's not just cosmetics here. It's explicit constraints that other code and your code can rely on.
In similar situations where you have less established methods, it may also serve to document an intent, i.e., Interface.method should be there regardless of how its super interfaces evolve.
From the Java documentation, the reason why Comparator has its own equals() method:
However, overriding this method may, in some cases, improve performance by allowing programs to determine that two distinct comparators impose the same order.
Read its javadoc. It's there only to explain what equals() must return if you choose to override it in a class implementing Comparator. You might think that no comparator can be equal to any other, but it's not the case. You might think that two comparators are equal if they return the same thing for any arguments, but it's not the case. The javadoc explains that two comparators are equal if they impose the same ordering, whatever the arguments given. The javadoc also says:
Note that it is always safe not to override Object.equals(Object)
Most of the time, you don't override equals() in comparators.
From the docs:
it is always safe not to override Object.equals(Object). However, overriding this method may, in some cases, improve performance by allowing programs to determine that two distinct comparators impose the same order.
Technically, the declaration of the method is redundant (the compiler does not care), but...
Declaring the equals method in this interface makes it part of the contract between caller and different Comparators and allows it to specify/extend its semantics.
It specifies that two Comparators are equal only if they impose the same ordering with their compare() method. This extends the semantics of Object.equals() and must therefore be documented in the interface.
Putting an Object method in an interface declaration allows Javadoc
declaration of the meaning equals is required to have in classes that
implement the interface.
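As an illustration of that contract, here is a minimal sketch of a comparator that does override equals. ByLength is a made-up class; because it has no state, every instance imposes the same ordering, so treating all instances as equal stays within the rule that equals may return true only for comparators imposing the same order.

```java
import java.util.Comparator;

class ComparatorEqualsDemo {
    // Stateless comparator: every instance orders strings by length.
    static final class ByLength implements Comparator<String> {
        @Override
        public int compare(String a, String b) {
            return Integer.compare(a.length(), b.length());
        }

        // Only another ByLength is equal; it necessarily imposes the same ordering.
        @Override
        public boolean equals(Object other) {
            return other instanceof ByLength;
        }

        // Kept consistent with equals: all instances share one hash code.
        @Override
        public int hashCode() {
            return ByLength.class.hashCode();
        }
    }

    public static void main(String[] args) {
        Comparator<String> first = new ByLength();
        Comparator<String> second = new ByLength();
        // Same ordering, but a lambda keeps Object's reference-based equals.
        Comparator<String> lambda = (a, b) -> Integer.compare(a.length(), b.length());

        System.out.println(first.equals(second)); // true
        System.out.println(first.equals(lambda)); // false
        System.out.println(lambda.equals(first)); // false
    }
}
```

The lambda case shows why it is always safe not to override equals: returning false for a comparator with the same ordering is allowed, only returning true for one with a different ordering would break the contract.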
The Comparator interface has its own equals() method
Well, first of all, it should be clear that whenever you implement the Comparator interface, it is your code that decides when objects are equal, less, or greater.
I am quite confused about having equals() inside Comparator. Any class will get equals() by default through the Object class.
The equals() implementation you inherit from the Object class only checks whether the two references point to the same object. It does not compare values. It is up to you to provide in your class (and to document in your interface) the criteria for objects to be equal.
Then what is the need to have an equals() method inside an interface?
Whenever you define when objects are less or greater, you also define when they are equal, so rather than relying on the default Object equals() method you may want to provide your own logic to check equality.
The equals() method in Comparator is declared so that anyone who overrides equals() in a Comparator implementation follows some additional rules and constraints on top of the ones that already apply to equals() from Object. Overriding it is not required.
The additional rule being:
This method must obey the general contract of Object.equals(Object).
Additionally, this method can return true only if the specified object
is also a comparator and it imposes the same ordering as this
comparator. Thus, comp1.equals(comp2) implies that
sgn(comp1.compare(o1, o2))==sgn(comp2.compare(o1, o2)) for every
object reference o1 and o2.

What's the difference between these two java variable declarations?

public class SomeClass {
private HashSet<SomeObject> contents = new HashSet<SomeObject>();
private Set<SomeObject> contents2 = new HashSet<SomeObject>();
}
What's the difference? In the end they are both a HashSet, aren't they? The second one looks just wrong to me, but I have seen it frequently used, accepted and working.
Set is an interface, and HashSet is a class that implements the Set interface.
Declaring the variable as type HashSet means that no other implementation of Set may be used. You may want this if you need specific functionality of HashSet.
If you do not need any specific functionality from HashSet, it is better to declare the variable as type Set. This leaves the exact implementation open to change later. You may find that for the data you are using, a different implementation works better. By using the interface, you can make this change later if needed.
You can see more details here: When should I use an interface in java?
Set is a collection interface that HashSet implements.
The second option is usually the ideal choice as it's more generic.
Since the HashSet class implements the Set interface, it's legal to assign a HashSet to a Set variable. You could not go the other way, however (assign a Set to a more specific HashSet variable), without a cast.
Set is an interface that HashSet implements, so if you do this:
Set<E> mySet = new HashSet<E>();
You will still have access to the functionality of HashSet, but you also have the flexibility to replace the concrete instance with an instance of another Set class in the future, such as LinkedHashSet or TreeSet, or another implementation.
The first method uses a concrete class, allowing you to replace the class with an instance of itself or a subclass, but with less flexibility. For example, TreeSet could not be used if your variable type was HashSet.
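A short sketch of that flexibility: the hypothetical helper addAllTo below is written once against the Set interface and works unchanged with HashSet, TreeSet or LinkedHashSet. Only the constructor call at the use site names a concrete class.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

class SetDeclarationDemo {
    // Accepts any Set implementation because the parameter uses the interface type.
    static Set<String> addAllTo(Set<String> target, String... items) {
        Collections.addAll(target, items);
        return target;
    }

    public static void main(String[] args) {
        Set<String> hashed = addAllTo(new HashSet<>(), "pear", "apple", "pear");
        Set<String> sorted = addAllTo(new TreeSet<>(), "pear", "apple", "pear");
        Set<String> ordered = addAllTo(new LinkedHashSet<>(), "pear", "apple", "pear");

        System.out.println(hashed.size()); // 2 (duplicates dropped, iteration order unspecified)
        System.out.println(sorted);        // [apple, pear] (sorted order)
        System.out.println(ordered);       // [pear, apple] (insertion order)
    }
}
```

Had the parameter been declared as HashSet<String>, the TreeSet call would not compile, which is the loss of flexibility described above.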
This is Item 52 from Joshua Bloch's Effective Java, 2nd Edition.
Refer to Objects by their interfaces
... You should favor the use of interfaces rather than classes to refer to objects. If appropriate interface types exist, then parameters, return values, variables, and fields should all be declared using interface types. The only time you really need to refer to an object's class is when you're creating it with a constructor...
// Usually Good - uses interface as type
List<T> tlist = new Vector<T>();
// Typically Bad - uses concrete class as type!
Vector<T> vec = new Vector<T>();
This practice does carry some caveats - if the implementation you want has special behavior not guaranteed by the generic interface, then you have to document your requirements accordingly.
For example, Vector<T> is synchronized, whereas ArrayList<T> (also an implementer of List<T>) is not, so if you required synchronized containers in your design (or not), you would need to document that.
One thing worth mentioning is that the interface vs. concrete class rule is most important for types exposed in an API, e.g. method parameters or return types. For private fields and local variables it only ensures that you aren't using any methods specific to the concrete implementation (i.e. HashSet), but since they are private, it doesn't really matter.
Another thing is that adding another type reference will slightly increase the size of your compiled class. Most people won't care, but these things add up.
