I have a TreeSet containing wrappers which store a Foo object at a certain position, defined like so:
class Wrapper implements Comparable<Wrapper> {
private final Foo foo;
private final Double position;
...
@Override public boolean equals(Object o) {
...
if(o instanceof Wrapper)
return ((Wrapper) o).getFoo().equals(this.foo);
if(o instanceof Foo)
return o.equals(this.foo);
return false;
}
@Override public int compareTo(Wrapper o) {
return position.compareTo(o.getPosition());
}
}
NavigableSet<Wrapper> fooWrappers = new TreeSet<Wrapper>();
because I want my TreeSet to be ordered by position but searchable by foo. But when I perform these operations:
Foo foo = new Foo(bar);
Wrapper fooWrapper = new Wrapper(foo, 1.0);
fooWrappers.add(fooWrapper);
fooWrapper.equals(new Wrapper(new Foo(bar), 1.0));
fooWrapper.equals(new Foo(bar));
fooWrappers.contains(fooWrapper);
fooWrappers.contains(new Wrapper(foo, 1.0));
fooWrappers.contains(new Wrapper(new Foo(bar), 1.0));
fooWrappers.contains(new Wrapper(foo, 2.0));
fooWrappers.contains(foo);
I get:
true
true
true
true
true
false
Exception in thread "main" java.lang.ClassCastException: org.gridqtl.Marker cannot be cast to java.lang.Comparable
at java.util.TreeMap.getEntry(TreeMap.java:325)
at java.util.TreeMap.containsKey(TreeMap.java:209)
at java.util.TreeSet.contains(TreeSet.java:217)
when I was expecting them all to return true, so it seems like TreeSet.contains is not using my equals method as the API suggests. Is there another method I need to override?
TreeSet is a Set implementation that does indeed use compareTo, as explained in the javadoc - emphasis mine:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
TreeSet is an ordered set.
equals cannot give you ordering information, hence TreeSet has to use something else.
This 'something else' is the Comparable interface, or its cousin, the Comparator interface.
Both interfaces provide information about how to order two objects of a class.
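A minimal sketch of the behaviour described in these answers. The names Item, name and byName are hypothetical stand-ins for the question's Wrapper and Foo, and the side map is only one possible way to get "ordered by position but searchable by foo", not something the answers prescribe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

class TreeSetLookupDemo {
    // Hypothetical stand-in for the question's Wrapper: ordered by position only.
    static class Item implements Comparable<Item> {
        final String name;      // plays the role of Foo
        final double position;

        Item(String name, double position) {
            this.name = name;
            this.position = position;
        }

        @Override
        public int compareTo(Item other) {
            return Double.compare(position, other.position);
        }
    }

    // Builds a set holding one item, "foo" at position 1.0.
    static TreeSet<Item> sample() {
        TreeSet<Item> items = new TreeSet<>();
        items.add(new Item("foo", 1.0));
        return items;
    }

    public static void main(String[] args) {
        TreeSet<Item> items = sample();

        // contains() walks the tree with compareTo, so only position matters.
        System.out.println(items.contains(new Item("other", 1.0))); // true
        System.out.println(items.contains(new Item("foo", 2.0)));   // false

        // One way to search by name as well: keep a side index.
        Map<String, Item> byName = new HashMap<>();
        for (Item item : items) {
            byName.put(item.name, item);
        }
        System.out.println(byName.containsKey("foo"));              // true
    }
}
```

The set stays sorted by position, while lookups by name go through the map, so neither structure has to bend its own notion of equality.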
Related
Q.1) The documentation of AbstractSet says: "This class does not override any of the implementations from the AbstractCollection class." If it does not override or change add(Object o) or any other part of the Collection contract implemented by AbstractCollection, and merely inherits them, as HashSet in turn does,
how do HashSet and other Set objects enforce stipulations such as the no-duplicates check or hash-table-style insertion, which are totally different from how List or other Collection objects add elements?
Q.2) The documentation says that AbstractSet merely adds implementations for equals and hashCode. However, the method details section mentions that these override the equals and hashCode methods of the Object class. Does AbstractSet only inherit these two methods without changing them? If so, what is the point of the AbstractSet class? Please clarify.
Q1: How does HashSet enforce duplicate checks?
If you take a look at the implementation in java.util.HashSet, you'll see the following code:-
private static final Object PRESENT = new Object();
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
What happens is fairly simple; we use a private HashMap instance, which takes our provided value and inserts it as the key of the HashMap. The map's PRESENT value is never actually used or retrieved, but it allows us to use this backing map to verify whether or not the item exists in the Set.
If our provided value does not exist in the map, the call to map.put() places the item in the map and returns null, so add() returns true. Otherwise, the key set of the map remains unchanged, map.put() returns the previously stored PRESENT value, and add() returns false. The HashMap is doing the hard work for the HashSet here.
This is different to the implementation provided by the AbstractCollection class, and hence the need to override.
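A small sketch of the return values involved, using a plain HashMap rather than HashSet's private one. The class name PutReturnDemo, the helper putTwice and the local variable present are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class PutReturnDemo {
    // Inserts the same key twice and reports whether put() returned null each time.
    static boolean[] putTwice() {
        Object present = new Object(); // plays the role of HashSet's PRESENT
        Map<String, Object> map = new HashMap<>();
        boolean first = map.put("x", present) == null;
        boolean second = map.put("x", present) == null;
        return new boolean[] {first, second};
    }

    public static void main(String[] args) {
        boolean[] results = putTwice();
        System.out.println(results[0]); // true: no previous mapping, put() returned null
        System.out.println(results[1]); // false: put() returned the old value

        // HashSet.add() reports the same thing.
        Set<String> set = new HashSet<>();
        System.out.println(set.add("x")); // true
        System.out.println(set.add("x")); // false
        System.out.println(set.size());   // 1
    }
}
```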
Q2: AbstractSet's use of equals() & hashCode()
I think you have slightly misunderstood what AbstractSet is doing here. The purpose of AbstractSet is to provide a collection-safe implementation of equals and hashCode.
Equals checks are performed by verifying that we are comparing two Set objects, that they are of equal size, and that they contain the same items.
public boolean equals(Object o) {
if (o == this)
return true;
if (!(o instanceof Set))
return false;
Collection<?> c = (Collection<?>) o;
if (c.size() != size())
return false;
try {
return containsAll(c);
} catch (ClassCastException unused) {
return false;
} catch (NullPointerException unused) {
return false;
}
}
The hashCode is produced by iterating over the Set instance and summing the hash codes of its non-null elements:
public int hashCode() {
int h = 0;
Iterator<E> i = iterator();
while (i.hasNext()) {
E obj = i.next();
if (obj != null)
h += obj.hashCode();
}
return h;
}
Any class extending from AbstractSet will use this implementation of equals() and hashCode() unless it overrides them explicitly. This implementation takes precedence over the default equals and hashCode methods defined in java.lang.Object.
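To see these inherited methods at work, here is a short sketch comparing a HashSet with a TreeSet. Both extend AbstractSet and neither overrides equals or hashCode; the class and helper names are invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class AbstractSetEqualsDemo {
    static Set<Integer> hashSet() {
        return new HashSet<>(Arrays.asList(1, 2, 3));
    }

    static Set<Integer> treeSet() {
        return new TreeSet<>(Arrays.asList(3, 2, 1));
    }

    public static void main(String[] args) {
        // Same size and same elements, so AbstractSet.equals returns true
        // even though the two classes are different.
        System.out.println(hashSet().equals(treeSet())); // true

        // AbstractSet.hashCode sums the element hash codes: 1 + 2 + 3.
        System.out.println(hashSet().hashCode());        // 6
        System.out.println(treeSet().hashCode());        // 6
    }
}
```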
The documentation you linked is for Java 7, and I was checking the Java 8 code, so what I found below may not be identical in Java 7. Still, you can use the same approach of reading the code whenever the documentation isn't clear to you:
Q1: HashSet overrides the add method of AbstractCollection; you can easily check this by opening the HashSet source in an IDE. A parent not overriding a method doesn't mean its children can't.
Q2: Again, by checking the code we see that AbstractSet defines its own implementations of equals and hashCode. It also overrides the removeAll method of AbstractCollection.
The code shown below does output:
[b]
[a, b]
However I would expect it to print two identical lines in the output.
import java.util.*;
public class Test{
static void test(String... abc) {
Set<String> s = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
s.addAll(Arrays.asList("a", "b"));
s.removeAll(Arrays.asList(abc));
System.out.println(s);
}
public static void main(String[] args) {
test("A");
test("A", "C");
}
}
The spec clearly states that removeAll
"Removes all this collection's elements that are also contained in the
specified collection."
So from my understanding the current behavior is unpredictable. Please help me understand this.
You only read part of the documentation. You missed one important paragraph from the TreeSet javadoc:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
Now, the removeAll implementation comes from AbstractSet and relies on the equals method. In your code "a".equals("A") is false, so the elements are not considered equal, even though you provided a comparator that treats them as equal inside the TreeSet itself. If you try with a wrapper, the problem goes away:
import java.util.*;
import java.lang.*;
class Test
{
static class StringWrapper implements Comparable<StringWrapper>
{
public final String string;
public StringWrapper(String string)
{
this.string = string;
}
@Override public boolean equals(Object o)
{
return o instanceof StringWrapper &&
((StringWrapper)o).string.compareToIgnoreCase(string) == 0;
}
@Override public int compareTo(StringWrapper other) {
return string.compareToIgnoreCase(other.string);
}
@Override public String toString() { return string; }
}
static void test(StringWrapper... abc)
{
Set<StringWrapper> s = new TreeSet<>();
s.addAll(Arrays.asList(new StringWrapper("a"), new StringWrapper("b")));
s.removeAll(Arrays.asList(abc));
System.out.println(s);
}
public static void main(String[] args)
{
test(new StringWrapper("A"));
test(new StringWrapper("A"), new StringWrapper("C"));
}
}
This works because you now provide consistent implementations of equals and compareTo, so there is never a mismatch between how objects are added to the sorted set and how the inherited set behavior treats them.
This is true in general, a sort of rule of three for Java code: if you implement compareTo, equals or hashCode, you should implement all of them to avoid problems with the standard collections (hashCode is less crucial unless you use these objects in a hashed collection). This is stated many times throughout the Java documentation.
This is an inconsistency in the implementation of TreeSet<E>, bordering on a bug. The code ignores the custom comparator when the number of items in the collection that you pass to removeAll is greater than or equal to the number of items in the set.
The inconsistency is caused by a small optimization: if you look at the implementation of removeAll, which is inherited from AbstractSet, the optimization goes as follows:
public boolean removeAll(Collection<?> c) {
boolean modified = false;
if (size() > c.size()) {
for (Iterator<?> i = c.iterator(); i.hasNext(); )
modified |= remove(i.next());
} else {
for (Iterator<?> i = iterator(); i.hasNext(); ) {
if (c.contains(i.next())) {
i.remove();
modified = true;
}
}
}
return modified;
}
You can see that the behavior differs when c has fewer items than this set (top branch) vs. when it has as many or more items (bottom branch).
The top branch uses the comparator associated with this set, while the bottom branch uses equals for the comparison in c.contains(i.next()), all in the same method!
You can demonstrate this behavior by adding a few extra elements to the original tree set:
s.addAll(Arrays.asList("x", "z", "a", "b"));
Now the output for both test cases becomes identical, because remove(i.next()) utilizes the comparator of the set.
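The two branches can also be reproduced by hand, which makes the difference visible without depending on the relative sizes. This is only a sketch: removeViaSet and removeViaContains are hypothetical helpers that mirror the top and bottom branch respectively, not JDK methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class RemoveAllBranchesDemo {
    static Set<String> newSet() {
        Set<String> s = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        s.addAll(Arrays.asList("a", "b"));
        return s;
    }

    // Mirrors the top branch: ask the set to remove each element of c.
    static Set<String> removeViaSet(List<String> c) {
        Set<String> s = newSet();
        for (String e : c) {
            s.remove(e); // TreeSet.remove uses the comparator
        }
        return s;
    }

    // Mirrors the bottom branch: ask c whether it contains each set element.
    static Set<String> removeViaContains(List<String> c) {
        Set<String> s = newSet();
        s.removeIf(c::contains); // List.contains uses equals
        return s;
    }

    public static void main(String[] args) {
        List<String> c = Arrays.asList("A", "C");
        System.out.println(removeViaSet(c));      // [b]
        System.out.println(removeViaContains(c)); // [a, b]
    }
}
```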
The reason is that the comparator String.CASE_INSENSITIVE_ORDER you use is not consistent with equals.
As stated by TreeSet:
Note that the ordering maintained by a set (whether or not an explicit comparator is provided)
must be consistent with equals if it is to correctly implement the Set interface.
Consistency with equals as stated by Comparable:
The natural ordering for a class C is said to be consistent with equals if and only if
e1.compareTo(e2) == 0 has the same boolean value as e1.equals(e2)
for every e1 and e2 of class C.
And as an example for the case-insensitive comparator you use:
String.CASE_INSENSITIVE_ORDER.compare("a", "A") == 0 => true
while
"a".equals("A") => false
It's written in all decent Java courses that if you implement the Comparable interface, you should (in most cases) also override the equals method to match its behavior.
Unfortunately, in my current organization people try to convince me to do exactly the opposite. I am looking for the most convincing code example to show them all the evil that will happen.
I think you can beat them by showing the Comparable javadoc that says:
It is strongly recommended (though not required) that natural
orderings be consistent with equals. This is so because sorted sets
(and sorted maps) without explicit comparators behave "strangely" when
they are used with elements (or keys) whose natural ordering is
inconsistent with equals. In particular, such a sorted set (or sorted
map) violates the general contract for set (or map), which is defined
in terms of the equals method.
For example, if one adds two keys a and b such that (!a.equals(b) &&
a.compareTo(b) == 0) to a sorted set that does not use an explicit
comparator, the second add operation returns false (and the size of
the sorted set does not increase) because a and b are equivalent from
the sorted set's perspective.
So, especially with SortedSet (and SortedMap), if the compareTo method returns 0, the set treats the elements as equal and doesn't add the element a second time even if the equals method returns false, which causes confusion, as specified in the SortedSet javadoc:
Note that the ordering maintained by a sorted set (whether or not an
explicit comparator is provided) must be consistent with equals if the
sorted set is to correctly implement the Set interface. (See the
Comparable interface or Comparator interface for a precise definition
of consistent with equals.) This is so because the Set interface is
defined in terms of the equals operation, but a sorted set performs
all element comparisons using its compareTo (or compare) method, so
two elements that are deemed equal by this method are, from the
standpoint of the sorted set, equal. The behavior of a sorted set is
well-defined even if its ordering is inconsistent with equals; it just
fails to obey the general contract of the Set interface.
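The javadoc example can be tried out directly. In this sketch the class Name is hypothetical: its compareTo ignores case while equals is inherited from Object, so the natural ordering is deliberately inconsistent with equals.

```java
import java.util.TreeSet;

class InconsistentOrderingDemo {
    // Hypothetical class: compareTo ignores case, equals is inherited from Object.
    static class Name implements Comparable<Name> {
        final String value;

        Name(String value) {
            this.value = value;
        }

        @Override
        public int compareTo(Name other) {
            return value.compareToIgnoreCase(other.value);
        }
    }

    public static void main(String[] args) {
        Name a = new Name("a");
        Name b = new Name("A");
        TreeSet<Name> names = new TreeSet<>();

        System.out.println(!a.equals(b) && a.compareTo(b) == 0); // true
        System.out.println(names.add(a)); // true
        System.out.println(names.add(b)); // false: compareTo returned 0
        System.out.println(names.size()); // 1
    }
}
```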
If you don't override the equals method, it inherits its behaviour from the Object class.
This method returns true if and only if the specified object is not null and refers to the same instance.
Suppose the following class:
class VeryStupid implements Comparable<VeryStupid>
{
public int x;
@Override
public int compareTo(VeryStupid o)
{
// Integer.compare avoids the overflow that (x - o.x) can produce
if (o != null)
return Integer.compare(x, o.x);
else
return (1);
}
}
We create 2 instances:
VeryStupid one = new VeryStupid();
VeryStupid two = new VeryStupid();
one.x = 3;
two.x = 3;
The call to one.compareTo(two) returns 0 indicating the instances are equal but the call to one.equals(two) returns false indicating they're not equal.
This is inconsistent.
Consistency of compareTo and equals is not required but strongly recommended.
I'll give it a shot with this example:
private static class Foo implements Comparable<Foo> {
@Override
public boolean equals(Object _other) {
System.out.println("equals");
return super.equals(_other);
}
@Override
public int compareTo(Foo _other) {
System.out.println("compareTo");
return 0;
}
}
public static void main (String[] args) {
Foo a, b;
a = new Foo();
b = new Foo();
a.compareTo(b); // prints 'compareTo', returns 0 => equal
a.equals(b); // just prints 'equals', returns false => not equal
}
You can see that your (maybe very important and complicated) comparison code is ignored when you use the default equals method.
The method int compareTo(T o) tells you whether o is (in some sense) greater or less than this, which is what lets you order a list of T.
When implementing int compareTo(T o), you have to answer these questions:
is o an instance of this class? => true/false;
is o equal to this? => true/false;
is o greater than this? => true/false;
is o less than this? => true/false;
So the equality test is part of it, and the best way to avoid implementing equality twice is to put it in the boolean equals(Object obj) method.
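One possible reading of this advice, as a sketch: equality is defined once in equals, and compareTo reuses it for the "equal" case. The class Score and its field points are invented for the example.

```java
class SharedEqualityDemo {
    // Hypothetical value class ordered by a single int field.
    static final class Score implements Comparable<Score> {
        final int points;

        Score(int points) {
            this.points = points;
        }

        // Equality is defined once, here.
        @Override
        public boolean equals(Object obj) {
            return obj instanceof Score && ((Score) obj).points == points;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(points);
        }

        // compareTo reuses equals for the "equal" case and only decides
        // between "less" and "greater" itself.
        @Override
        public int compareTo(Score other) {
            if (equals(other)) {
                return 0;
            }
            return points < other.points ? -1 : 1;
        }
    }

    public static void main(String[] args) {
        Score a = new Score(3);
        Score b = new Score(3);
        Score c = new Score(5);
        System.out.println(a.equals(b));    // true
        System.out.println(a.compareTo(b)); // 0
        System.out.println(a.compareTo(c)); // -1
        System.out.println(c.compareTo(a)); // 1
    }
}
```

Because compareTo returns 0 exactly when equals returns true, the ordering is consistent with equals by construction.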
Here is my question: can a HashSet contain duplicate elements?
If I read the Set Interface definition, I see:
A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction.
And now we are going to write a simple example:
Define class A:
public class A {
@Override
public boolean equals(Object obj) {
return true;
}
}
Now execute this code:
Set<A> set = new HashSet<A>();
set.add(new A());
set.add(new A());
System.out.println(set.toString());
And this is the result:
[com.maths.graphs.A@b9e9a3, com.maths.graphs.A@18806f7]
Why does a class that implements the Set interface, like HashSet, contain duplicate elements?
Thanks!!
You have broken the equals-hashcode contract.
If you override the equals method you must also override the hashCode() method such that:
Two objects which are equal give the same hash, and preferably unequal
objects are highly likely to give different hashcodes
This is important because many classes (unsurprisingly including HashSet) use the hash code as a quick, efficient early step to eliminate unequal objects. This is what has happened here: the hash codes of the different As differ, because they still use the implementation of hashCode() provided by Object.
If you were to create the class A as follows, it would not allow more than one A in the set:
public class A {
@Override
public boolean equals(Object obj) {
return true;
}
@Override
public int hashCode() {
int hash = 1; //any number since in this case all objects of class A are equal to everything
return hash;
}
}
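For a less degenerate case than "everything is equal", here is a sketch with a hypothetical Key class whose equals and hashCode are both derived from the same field, so equal keys always land in the same bucket.

```java
import java.util.HashSet;
import java.util.Set;

class HashContractDemo {
    // Hypothetical key that overrides both methods consistently.
    static final class Key {
        final String id;

        Key(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Key && ((Key) obj).id.equals(id);
        }

        @Override
        public int hashCode() {
            return id.hashCode(); // equal keys share a hash code
        }
    }

    public static void main(String[] args) {
        Set<Key> keys = new HashSet<>();
        System.out.println(keys.add(new Key("k1")));      // true
        System.out.println(keys.add(new Key("k1")));      // false: same hash, equals() is true
        System.out.println(keys.size());                  // 1
        System.out.println(keys.contains(new Key("k1"))); // true
    }
}
```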
From the javadoc
public int hashCode()
Returns a hash code value for the object. This method is supported for
the benefit of hash tables such as those provided by HashMap.
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently
return the same integer, provided no information used in equals
comparisons on the object is modified. This integer need not remain
consistent from one execution of an application to another execution
of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must
produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on
each of the two objects must produce distinct integer results.
However, the programmer should be aware that producing distinct
integer results for unequal objects may improve the performance of
hash tables.
Most IDEs will warn you if you override equals without also overriding hashCode, and can generate a hashCode method for you.
Notes
Strictly speaking my hashCode() method doesn't completely satisfy the contract. Since A#equals(Object obj) equals anything including objects which are not of type A it is impossible to fully satisfy the contract. Ideally the equals method would be changed to the following as well to cover all bases
@Override
public boolean equals(Object obj) {
if (obj instanceof A){
return true;
}else{
return false;
}
}
Here the HashSet does not see duplicates, because the two add calls insert two distinct new objects, and the hash codes of those two objects are different. Try changing the code to:
Set<A> set = new HashSet<A>();
A a = new A();
set.add(a);
set.add(a);
System.out.println(set.toString());
and you will see that there is only one value in the set.
Or just add the following to your code and check:
@Override
public int hashCode() {
return 31;
}
You have violated the hashCode() contract, i.e. objects that are equal must return the same hash code every time.
I'm trying to understand how the compareTo method is called in this program.
class Student implements Comparable {
String dept, name;
public Student(String dept, String name) {
this.dept = dept;
this.name = name;
}
public String getDepartment() {
return dept;
}
public String getName() {
return name;
}
public String toString() {
return "[dept=" + dept + ",name=" + name + "]";
}
public int compareTo(Object obj) {
Student emp = (Student) obj;
System.out.println("Compare to : " +dept.compareTo(emp.getDepartment()));
int deptComp = dept.compareTo(emp.getDepartment());
return ((deptComp == 0) ? name.compareTo(emp.getName()) : deptComp);
}
public boolean equals(Object obj) {
if (!(obj instanceof Student)) {
return false;
}
Student emp = (Student) obj;
boolean ii = dept.equals(emp.getDepartment()) && name.equals(emp.getName());
System.out.println("Boolean equal :" +ii);
return ii ;
}
public int hashCode() {
int i2 = 31 * dept.hashCode() + name.hashCode();
System.out.println("HashCode :" + i2);
return i2;
}
}
public class CompareClass {
public static void main(String args[]) {
Student st[] = { new Student("Finance", "A"),
new Student("Finance", "B"), new Student("Finance", "C"),
new Student("Engineering", "D"),
new Student("Engineering", "E"),
new Student("Engineering", "F"), new Student("Sales", "G"),
new Student("Sales", "H"), new Student("Support", "I"), };
Set set = new TreeSet(Arrays.asList(st));
System.out.println(Arrays.asList(st));
System.out.println(set);
}
}
Why is Arrays.asList(st) used?
What is the use of equals() and hashCode()?
Why is Arrays.asList(st) used?
Because the TreeSet constructor TreeSet(Collection c) accepts a Collection and not a Student[], you convert the Student[] to a List, which is a Collection, using the method List<T> asList(T... a). Note that an array can be passed directly where a varargs parameter is expected.
What is the use of equals() and hashCode()?
The Object class provides two methods, hashCode() and equals(), to represent the identity of an object.
You are using a TreeSet in your code. As per the documentation:
the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal.
Hence, in your case, implementing Comparable and overriding compareTo() is enough.
Suggested Reading:
Overriding equals and hashCode in Java.
Hashset vs Treeset
What is the difference between compare() and compareTo()?
.equals() is used because you are comparing two Objects.
There is a very good explanation of the Comparable interface in the Oracle documentation: http://docs.oracle.com/javase/6/docs/api/java/lang/Comparable.html
In this code, Arrays.asList(st) is used basically because the author thought it was simpler to instantiate an Array in Java and convert it to a List than it is to create an ArrayList and call .add for each item. It's really not critical to what is going on, though. Another thing that happens on that same line is where the magic is.
Set set = new TreeSet(Arrays.asList(st));
This creates a TreeSet from the list. It is worth taking a brief look at this post: What is the difference between Set and List? In a Set, all elements are unique, so when you create a Set from a List that contains duplicates the Set constructor will throw the extra items away. How does it determine which elements are duplicates? A TreeSet uses the compareTo method of the Comparable interface. Similarly, a List keeps its elements in insertion order, whereas a Set in general makes no such promise, so the implementation can choose to store the items in whatever order is most efficient. A TreeSet keeps them sorted, and its Oracle documentation explains how it compares elements:
Note that the ordering maintained by a set (whether or not an explicit
comparator is provided) must be consistent with equals if it is to
correctly implement the Set interface. (See Comparable or Comparator
for a precise definition of consistent with equals.) This is so
because the Set interface is defined in terms of the equals operation,
but a TreeSet instance performs all element comparisons using its
compareTo (or compare) method, so two elements that are deemed equal
by this method are, from the standpoint of the set, equal. The
behavior of a set is well-defined even if its ordering is inconsistent
with equals; it just fails to obey the general contract of the Set
interface.
Some of the Java API was built around arrays and some of it was built around collections. asList is basically an adapter that lets your array be accessed like a collection.
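A short sketch of what "adapter" means here: the list returned by Arrays.asList is a fixed-size view backed by the array, not a copy. The class name and the canGrow helper are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AsListViewDemo {
    // Returns true if the list accepts add(), false if it rejects it.
    static boolean canGrow(List<String> list) {
        try {
            list.add("extra");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {"A", "B", "C"};
        List<String> view = Arrays.asList(names);

        view.set(0, "Z");             // writes through to the array
        System.out.println(names[0]); // Z

        System.out.println(canGrow(view));                  // false: fixed size
        System.out.println(canGrow(new ArrayList<>(view))); // true: independent copy
    }
}
```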
Some data structures and algorithms operate on what's called the "hash" of a piece of data. This is done largely for performance reasons. In most cases the hash is a single number representing a particular object. You can see how this might be useful for finding an element in a collection quickly or for cheaply ruling out equality.
equals exists of course to test if two objects represent the same thing.
I'm trying to understand how the compareTo method is called in this program.
Because what you use here is a TreeSet, which implements SortedSet, and for which uniqueness is calculated by comparing elements using their natural ordering, and not equality.
Classes implementing Comparable of themselves, or a superclass of themselves, can be compared to one another. For classes which do not, you can supply a Comparator instead.
When you add an element to a SortedSet, the set will first compare it to elements already present in the set, and only add it if no comparison gives 0. See below for a demonstration.
See also Collections.sort().
1. Why is Arrays.asList(st) used?
Because this is Java 1.4 code. From Java 5 on, you could use Arrays.asList(s1, s2, etc.) (i.e., a varargs method).
2. What is the use of equals() and hashCode()?
In this case, none.
Sample program (with generics this time) illustrating the difference between a SortedSet and a HashSet, using BigDecimal:
final BigDecimal one = BigDecimal.ONE;
final BigDecimal oneDotZero = new BigDecimal("1.0");
one.equals(oneDotZero); // false
one.compareTo(oneDotZero); // 0
// HashSet: uses .equals() and .hashCode();
final Set<BigDecimal> hashset = new HashSet<>();
hashset.add(one); hashset.add(oneDotZero);
hashset.size(); // 2
// TreeSet: uses Comparable
final Set<BigDecimal> treeset = new TreeSet<>();
treeset.add(one); treeset.add(oneDotZero);
treeset.size(); // 1
.equals and .hashCode are methods that every class in Java inherits from the Object class.
When creating your own class you would usually override these two default implementations, because the default Object behavior generally does not give the desired result.
They are there for good measure really, but in this program they are not being used.
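To check which methods a TreeSet actually calls, one can count the invocations. This is a sketch with a hypothetical Person class standing in for the question's Student; the counters are static fields added purely for the demonstration.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class CallCountDemo {
    static int compareToCalls, equalsCalls, hashCodeCalls;

    // Hypothetical stand-in for the question's Student, with call counters.
    static class Person implements Comparable<Person> {
        final String name;

        Person(String name) {
            this.name = name;
        }

        @Override
        public int compareTo(Person o) {
            compareToCalls++;
            return name.compareTo(o.name);
        }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Person && ((Person) o).name.equals(name);
        }

        @Override
        public int hashCode() {
            hashCodeCalls++;
            return name.hashCode();
        }
    }

    // Resets the counters and builds a TreeSet from a list with one duplicate name.
    static Set<Person> build() {
        compareToCalls = equalsCalls = hashCodeCalls = 0;
        return new TreeSet<>(Arrays.asList(
                new Person("C"), new Person("A"), new Person("B"), new Person("A")));
    }

    public static void main(String[] args) {
        Set<Person> people = build();
        System.out.println(people.size());      // 3: the second "A" compares as 0
        System.out.println(compareToCalls > 0); // true
        System.out.println(equalsCalls);        // 0
        System.out.println(hashCodeCalls);      // 0
    }
}
```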