Suppose I have this class:
public class Node implements Comparable<Node> {
    public float key;
    public TreeSet<Node> neighbors;

    public Node() {
        // fill neighbors somehow
    }

    @Override
    public int compareTo(Node n) {
        if (this.key == n.key)
            return 0;
        else if (this.key > n.key)
            return 1;
        else
            return -1;
    }
}
So this is a classic node of a graph, where each node is connected to a set of nodes (i.e. its neighbors). I'm using a TreeSet because I often (very often) need to know all the neighbors with a key bigger (or smaller) than a certain value. Now, let's suppose I have this method:
//swap nodes keys
void swapKeys(Node a, Node b){
    float ak = a.key;
    a.key = b.key;
    b.key = ak;
}
Notice that this method changes only the two nodes' keys, nothing more.
Does this "break" the structure, or will everything continue to work fine?
If this breaks the structure, what about this simple solution:
//swap nodes keys
void swapKeys(Node a, Node b){
    a.neighbors.remove(b);
    b.neighbors.remove(a);
    float ak = a.key;
    a.key = b.key;
    b.key = ak;
    a.neighbors.add(b);
    b.neighbors.add(a);
}
From the TreeSet documentation:
Note that the ordering maintained by a set (whether or not an explicit
comparator is provided) must be consistent with equals if it is to
correctly implement the Set interface.
Your Node class's Comparable implementation is not consistent with equals (compareTo can return 0 for two Node instances which are not equal).
This in itself makes your Node class unfit to be elements of a TreeSet.
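To see the problem concretely, here is a small hypothetical sketch (the variable names are made up; it uses the Node class from the question): two distinct Node objects with equal keys are treated as duplicates by the TreeSet, so the second one is silently dropped.
TreeSet<Node> set = new TreeSet<>();
Node n1 = new Node();
Node n2 = new Node();
n1.key = 1.0f;
n2.key = 1.0f;                         // a different object with the same key
set.add(n1);
set.add(n2);                           // compareTo returns 0, so n2 is not added
System.out.println(set.size());        // 1
System.out.println(set.contains(n2));  // true, although n2 itself was never stored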
Even the proposed workaround is not sufficient.
You may be tempted to fix this by implementing equals() (and hashCode()) based upon the value contained in the node. But to no avail, as this would go against a warning in the documentation of the general Set interface:
Note: Great care must be exercised if mutable objects are used as set
elements. The behavior of a set is not specified if the value of an
object is changed in a manner that affects equals comparisons while
the object is an element in the set. A special case of this
prohibition is that it is not permissible for a set to contain itself
as an element.
So adding equals and hashCode is still not sufficient: your instances must also be immutable.
However, the simplest solution seems to be to forego the Comparable interface altogether, not implement equals and hashCode, and simply use a HashSet instead of a TreeSet. In that case you can change the contents of your nodes without consequences to the proper functioning of the set of neighbours.
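A minimal sketch of that suggestion (the method name neighborsAbove is made up for this example, not part of the original post): keep the neighbors in a HashSet with default identity semantics, and answer the "all neighbors with key greater than x" queries with a linear scan.
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class Node {
    public float key;
    public Set<Node> neighbors = new HashSet<>();   // identity-based, no Comparable needed

    // Neighbors with a key strictly greater than the threshold: an O(n) scan instead of
    // TreeSet.tailSet, but unaffected by key mutation.
    public Set<Node> neighborsAbove(float threshold) {
        return neighbors.stream()
                        .filter(n -> n.key > threshold)
                        .collect(Collectors.toSet());
    }
}
With identity-based equality, swapping the keys of two nodes cannot corrupt any neighbour set, because membership never depended on the key in the first place.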
Related
I already looked into different questions, but these usually ask about consistency or ordering, while I am interested in the ordering of two HashSets containing the same elements at the same time.
I want to create a HashSet of HashSets containing integers. Over time I will put HashSets of size 3 in this bigger HashSet and I will want to see if a newly created HashSet is already contained within the bigger HashSet.
Now my question is will it always find duplicates or can the ordering of two HashSets with the same elements be different?
I am conflicted, as they use the same hashCode() function, but does that mean they will always be the same?
HashSet<HashSet<Integer>> test = new HashSet<>();
HashSet<Integer> one = new HashSet<>();
one.add(1);
one.add(2);
one.add(5);
test.add(one);
HashSet<Integer> two = new HashSet<>();
two.add(5);
two.add(1);
two.add(2);
//Some other stuff that runs over time
System.out.println(test.contains(two));
The above code tries to illustrate what I mean. Does this always return true? (Keep in mind I might initialise another HashSet with the same elements and try the contains again.)
Yes, the above always returns true. Sets have no order, and when you test whether two Sets are equal to each other, you are checking that they have the same elements. Order has no meaning.
To elaborate, test.contains(two) will return true if and only if test contains an element having the same hashCode() as two and which is equal to two (according to the equals method).
Two sets s1 and s2 that have the same elements have the same hashCode() and s1.equals(s2) returns true.
This is required by the contract of equals and hashCode of the Set interface:
equals
Compares the specified object with this set for equality. Returns true if the specified object is also a set, the two sets have the same size, and every member of the specified set is contained in this set (or equivalently, every member of this set is contained in the specified set). This definition ensures that the equals method works properly across different implementations of the set interface.
hashCode
Returns the hash code value for this set. The hash code of a set is defined to be the sum of the hash codes of the elements in the set, where the hash code of a null element is defined to be zero. This ensures that s1.equals(s2) implies that s1.hashCode()==s2.hashCode() for any two sets s1 and s2, as required by the general contract of Object.hashCode.
As you can see, one and two don't even have to use the same implementation of the Set interface in order for test.contains(two) to return true. They just have to contain the same elements.
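A small sketch of that point (the sets here are made up for the example): the implementations can even differ, as long as the elements are the same.
Set<Set<Integer>> test = new HashSet<>();
test.add(new HashSet<>(Arrays.asList(1, 2, 5)));

// Different Set implementations, same elements: equals() and hashCode() agree,
// so contains() finds them regardless of iteration order.
Set<Integer> linked = new LinkedHashSet<>(Arrays.asList(5, 1, 2));
Set<Integer> tree = new TreeSet<>(Arrays.asList(2, 5, 1));

System.out.println(test.contains(linked)); // true
System.out.println(test.contains(tree));   // true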
The key property of sets is the uniqueness of elements.
By "default", insertion order doesn't matter at all.
A LinkedHashSet guarantees that when iterating, you always get the elements in the same order (the one used when inserting them). But even then, when comparing such sets, it is still only about their content, not the insertion order.
In other words: no matter which (standard) implementation of the Set interface you are using, you should always see consistent behavior. Of course you are free to implement your own Set and violate that contract, but well, violating contracts leads to violated contracts, aka bugs.
You can look for yourself, this is open source code:
public boolean equals(Object o) {
    if (o == this)
        return true;

    if (!(o instanceof Set))
        return false;
    Collection<?> c = (Collection<?>) o;
    if (c.size() != size())
        return false;
    try {
        return containsAll(c);
    } catch (ClassCastException unused) {
        return false;
    } catch (NullPointerException unused) {
        return false;
    }
}

public int hashCode() {
    int h = 0;
    Iterator<E> i = iterator();
    while (i.hasNext()) {
        E obj = i.next();
        if (obj != null)
            h += obj.hashCode();
    }
    return h;
}
You can easily see that the hash code is the sum of the hash codes of the elements, so it is not affected by any order, and that equals uses containsAll(...), so here too the order doesn't matter.
Write a function:
Input: two sets, one reflecting equality relationships, like {A = B, B = C}, the other reflecting not-equal relationships, like {A != C}
Output: check whether the two sets conflict or not; return a boolean
public boolean isConflict(List<String> equalRelation, List<String> notEqualRelation) {
//......
}
Create a graph from your data. Every variable is a node and every equality is an undirected edge between the corresponding nodes. After that, every connected component represents equality between all of its nodes. For finding contradictions, you simply check whether the two nodes of each not-equal relationship are in the same component. If they are, that is a contradiction.
You can use DFS or BFS to find connected components. But if your data is dynamic, it is better to use a disjoint-set (union-find) data structure, as sketched below.
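A rough sketch of the union-find idea, assuming each relation is given as a string like "A=B" or "A!=C" (the exact input format is not specified in the question, so the parsing here is purely illustrative):
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConflictChecker {
    private final Map<String, String> parent = new HashMap<>();

    private String find(String x) {
        parent.putIfAbsent(x, x);
        String p = parent.get(x);
        if (!p.equals(x)) {
            p = find(p);              // path compression
            parent.put(x, p);
        }
        return p;
    }

    private void union(String a, String b) {
        parent.put(find(a), find(b)); // merge the two components
    }

    // equalRelation entries look like "A=B", notEqualRelation entries like "A!=C" (assumed format)
    public boolean isConflict(List<String> equalRelation, List<String> notEqualRelation) {
        for (String rel : equalRelation) {
            String[] parts = rel.split("=");
            union(parts[0].trim(), parts[1].trim());
        }
        for (String rel : notEqualRelation) {
            String[] parts = rel.split("!=");
            if (find(parts[0].trim()).equals(find(parts[1].trim())))
                return true;          // both sides are in the same component: contradiction
        }
        return false;
    }
}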
You have done half the job:
Keep your set of equality relationships as a Set<Pair<String,String>>, and do the same for the not-equal relationships.
Build a map of equal entities, i.e. a Map from any symbol to a reduced (representative) symbol. For example, A=B gives you, if A < B: A => A (implicit) and B => A; and if you already have A => C, then also B => C. You have to iterate another time to reduce everything. With that, you map every group of equal symbols to a unique one (the least, in my example).
Finally, iterate through your set of not-equals and check every pair (X,Y): if reduced(X) = reduced(Y), that means X = Y, so you have a conflict there.
If you want to keep Pair<>, use this, courtesy of:
A Java collection of value pairs? (tuples?)
public class Pair<L,R> implements java.io.Serializable {
    private final L left;
    private final R right;

    public Pair(L left, R right) {
        this.left = left;
        this.right = right;
    }

    public L getLeft() { return left; }
    public R getRight() { return right; }

    @Override
    public int hashCode() { return left.hashCode() ^ right.hashCode(); }

    @Override
    public boolean equals(Object o) {
        if (o == null) return false;
        if (!(o instanceof Pair)) return false;
        Pair<?,?> pairo = (Pair<?,?>) o;
        return this.left.equals(pairo.getLeft()) &&
               this.right.equals(pairo.getRight());
    }
}
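A rough, fragmentary sketch of the reduction idea from this answer (imports omitted), assuming the relations have already been collected into Set<Pair<String,String>> collections named equalPairs and notEqualPairs; these names and the fixed-point loop are my own illustration, not part of the original answer.
// Build the "reduced symbol" map: every symbol points to the least symbol it is equal to.
Map<String, String> reduced = new HashMap<>();
boolean changed = true;
while (changed) {                       // iterate until nothing changes any more
    changed = false;
    for (Pair<String, String> eq : equalPairs) {
        String a = reduced.getOrDefault(eq.getLeft(), eq.getLeft());
        String b = reduced.getOrDefault(eq.getRight(), eq.getRight());
        String least = a.compareTo(b) <= 0 ? a : b;
        for (String s : new String[] { eq.getLeft(), eq.getRight(), a, b }) {
            if (!least.equals(reduced.getOrDefault(s, s))) {
                reduced.put(s, least);
                changed = true;
            }
        }
    }
}

// Any not-equal pair whose two sides reduce to the same symbol is a conflict.
for (Pair<String, String> ne : notEqualPairs) {
    String x = reduced.getOrDefault(ne.getLeft(), ne.getLeft());
    String y = reduced.getOrDefault(ne.getRight(), ne.getRight());
    if (x.equals(y)) {
        System.out.println("conflict: " + ne.getLeft() + " != " + ne.getRight());
    }
}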
Vogel612 is pretty much right. The basic algorithm is: complete the equality relation, then test each inequality in the list to see whether any of them conflict. However, you would not generally do any completion on the inequalities.
I’ll mention that equivalence is transitive (a = b = c implies a = c) but inequality is not (1 ≠ 1+1 and 2 ≠ 1, but 1+1 = 2). Both are symmetric, though (if a = b, b = a), and equivalence is also reflexive (each element must equal itself).
The exact data structure to use might depend on whether we have a list of all the elements in advance, how many there are, and how sparse the truth table for the relation is likely to be. If the amount of memory we need to store the truth table as a two-dimensional array is reasonable, we could do it that way. Vogel612's approach of storing the table as a collection of rows, each row containing the set of elements equivalent to its index, is a good way of storing a truth table with only a few elements equivalent to every element. (If every row is guaranteed to contain at least one element, itself, this should become an array or ArrayList.)
Another approach is to create an arbitrary total ordering on the element names, such as the order in which we added them to an ArrayList, the hash value of their names, or a lexicographic ordering. This order has nothing to do with the values of those elements or the provided equality relation, but can be used to put every pair into a canonical form. That is, our elements are listed a, b, c, and a < b < c when we compare their names, even if the values are equal by the relation. Keep a set (probably a HashSet in Java) of tuples of elements (a,b) such that 'a' precedes 'b' in this ordering and their contents are equal, letting the facts that a = a, b = b and b = a be implicit. This will be efficient when very few other pairs of elements are equivalent. There are also more complicated data structures for sparse matrix storage that are not in the standard library.
In any case, you always want x ≠ x to fail and to look for the tuples in the same canonical form that you stored them in.
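As a tiny illustration of the canonical-form idea (the helper name is made up), reusing the Pair class quoted in the previous answer:
// Hypothetical helper: put the name that sorts first on the left, so (b, a) and (a, b)
// always map to the same stored tuple.
static Pair<String, String> canonical(String a, String b) {
    return a.compareTo(b) <= 0 ? new Pair<>(a, b) : new Pair<>(b, a);
}

// Usage: x = x is implicit, so never store or query a pair of identical names.
Set<Pair<String, String>> equalities = new HashSet<>();
equalities.add(canonical("b", "a"));                          // stored as (a, b)
System.out.println(equalities.contains(canonical("a", "b"))); // true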
I attempted to implement Fortune's algorithm in Java, and to avoid writing an AVLTree I decided to use a TreeMap with BeachNode keys.
BeachNode has the following code:
public abstract class BeachNode implements Comparable<BeachNode> {
    static int idcounter = 0;
    int id;

    public BeachNode() {
        // Allows for two nodes with the same coordinate
        id = idcounter;
        idcounter++;
    }

    @Override
    public int compareTo(BeachNode s) {
        if (s.getX() < this.getX())
            return -1;
        if (s.getX() > this.getX())
            return 1;
        if (s.id < this.id)
            return 1;
        if (s.id > this.id)
            return -1;
        return 0;
    }

    public abstract double getX();
}
The getX() method of Intersection returns a value dependent on the present height of the sweep line, so it changes partway through execution.
My first question:
(I'm fairly sure the answer is yes, but peace of mind would be nice)
If I ensure that for any two BeachNodes A and B, signum(A.compareTo(B)) remains constant while A and B are in the beach tree, will the TreeMap still function despite the changes underlying the compareTo?
My second question:
How can I ensure that such a contract is obeyed?
Note that in Fortune's algorithm, we track at which sweep line positions two intersections will meet; when the sweep line reaches one of these positions, one of the intersections is removed.
This means two intersections A and B in the beach tree will never cross positions, but during a CircleEvent A.compareTo(B) will return 0, interfering with processing the CircleEvent. The contract will be broken only briefly, during the CircleEvent that would remove one Intersection.
This is my first question on StackOverflow, if it is poorly posed or incomplete please inform me and I will do my best to rectify it.
According to the TreeMap documentation, the tree will be sorted according to the compareTo method, so any changes that are not reflected in the sign of a.compareTo(b) are allowed. However, you also need to implement an equals method with the same semantics as compareTo. This is really easy if you already have a compareTo method:
public boolean equals(Object object) {
    if (!(object instanceof BeachNode)) {
        return false;
    }
    BeachNode other = (BeachNode) object;
    return this.compareTo(other) == 0;
}
And, since you're overriding equals, you should override hashCode. This is also pretty easy, since you are only using a couple fields to define equality.
public int hashCode() {
    int hash = 1;
    hash = (17 * hash) + Double.hashCode(getX());
    hash = (31 * hash) + id;
    return hash;
}
I'm not sure about your second question, but since the id of the two intersections should stay different, wouldn't they not be equal? If I'm wrong, hopefully someone who understands the algorithm can help you work that out.
Does the following code use the hashCode method of my Scooter class:
void using_ArrayList() {
    List<Scooter> coll = new ArrayList<>();

    Scooter s1 = new Scooter();
    s1.setNumber("HR26KC345352344");
    s1.setHorse_power(123.321);
    s1.setYear_of_made(1997);

    Scooter s2 = new Scooter();
    s2.setNumber("HR26KC34535");
    s2.setHorse_power(123.321);
    s2.setYear_of_made(1997);

    Scooter s3 = new Scooter();
    s3.setNumber("HR26KC345352344");
    s3.setHorse_power(123.321);
    s3.setYear_of_made(1997);

    coll.add(s1);
    coll.add(s2);
    coll.add(s3);

    Scooter s = new Scooter();
    s.setNumber("HR26KC345352344");
    System.out.println(coll.contains(s));
}
Scooter Class:
class Scooter {
    private String number;
    private double horse_power;
    private int year_of_made;

    public String getNumber() {
        return number;
    }
    public void setNumber(String number) {
        this.number = number;
    }
    public double getHorse_power() {
        return horse_power;
    }
    public void setHorse_power(double horse_power) {
        this.horse_power = horse_power;
    }
    public int getYear_of_made() {
        return year_of_made;
    }
    public void setYear_of_made(int year_of_made) {
        this.year_of_made = year_of_made;
    }

    public boolean equals(Object o) {
        if ((o instanceof Scooter) && ((Scooter) o).getNumber().equals(this.getNumber())) {
            System.out.println("EQUALS:TRUE"); // OK
            return true;
        } else {
            System.out.println("EQUALS:FALSE"); // OK
            return false;
        }
    }

    public int hashCode() {
        System.out.println("HASHCODE"); // NOT able to reach here...
        return number.length();
    }
}
I am able to reach the equals() method, but not able to reach the hashCode() method. Is the hashCode() method not used by the ArrayList collection? Please tell me, as I am new to Java collections.
Unlike, say, a HashMap, an ArrayList does not need to use the hashCode() method since the order of the elements in an ArrayList is determined by the order in which they were inserted, and not by hashing.
Is the hashCode() method not used by the ArrayList collection?
I assume that you mean the hashCode() methods of the elements of the ArrayList. The only case where the element hashCode() methods are called by the ArrayList object, is when computing the hash code of the ArrayList itself.
You can confirm this by looking at the source code, or reading the javadocs. (The behaviour is specified in the List API and implemented in AbstractList ...)
So, it is expected that you do not see calls to hashCode() in your example, either from List.add or List.contains. In particular, contains iterates the list elements calling equals on each one until a call returns true. The ArrayList implementation does nothing clever to make contains go fast.
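Conceptually, contains behaves roughly like the following sketch (simplified; the real ArrayList delegates to indexOf, but the point is that only equals is consulted):
// Simplified view of what ArrayList.contains(Object) does: a linear scan using equals only.
boolean containsLike(List<?> list, Object target) {
    for (Object element : list) {
        if (target == null ? element == null : target.equals(element)) {
            return true;            // found a match; hashCode() was never consulted
        }
    }
    return false;
}
That is also why only the elements' equals method (here, Scooter.equals) shows up in your output.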
I would also like to point out that the explanation given in this answer is incorrect:
"Unlike, say, a HashMap, an ArrayList does not need to use the hashCode() method since the order of the elements in an ArrayList is determined by the order in which they were inserted, and not by hashing."
This is incorrect in a couple of respects:
The order of an ArrayList is not determined by the order in which elements are added. It is determined by the order and position of the additions (and other operations).
The reason for using hashCode in HashMap is NOT to determine an order. (Indeed, the order of a HashMap is nonsensical ... to a first approximation.) Hashing (via hashCode) is actually used to provide O(1) lookup by key.
LinkedHashMap is a counter-example to the stated reasoning. It preserves insertion order AND uses hashCode.
In fact, the two issues (element order and use of hashing) are orthogonal. The actual reasons that hashCode() isn't used for lookup in ArrayList are more pragmatic:
The addition of a hash table-like data structure would increase the ArrayList's memory overheads by a factor of at least 5 per element, and probably more.
Implementing BOTH O(1) hash-based lookup based on elements values (for contains) AND O(1) lookup by element position (for get) in the same data structure is very complicated.
ArrayList is designed to be memory and time efficient for a subset of use-cases, and hashing would be a hindrance in those use-cases. Either way, ArrayList does not use hashCode().
As is obvious from your code, ArrayList doesn't use hashCode. This method is used in hash-based collections, not in linear collections like ArrayList, where search is performed in O(N).
hashCode is used only in those collections which need to identify unique values, so it is used in Set-like collections.
In an ArrayList, duplicates are allowed, and hence there is no concept of checking the hash code.
I have the following class:
public class Member {
    private int x;
    private long y;
    private double d;

    public Member(int x, long y, double d) {
        this.x = x;
        this.y = y;
        this.d = d;
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + x;
        result = (int) (prime * result + y);
        result = (int) (prime * result + Double.doubleToLongBits(d));
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj instanceof Member) {
            Member other = (Member) obj;
            return other.x == x && other.y == y
                    && Double.compare(d, other.d) == 0;
        }
        return false;
    }

    public static void main(String[] args) {
        Set<Member> test = new HashSet<Member>();
        Member b = new Member(1, 2, 3);
        test.add(b);

        System.out.println(b.hashCode());
        b.x = 0;
        System.out.println(b.hashCode());

        Member first = test.iterator().next();
        System.out.println(test.contains(first));
        System.out.println(b.equals(first));
        System.out.println(test.add(first));
    }
}
It produces the following results:
30814
29853
false
true
true
Because the hashCode depends on the state of the object, the object can no longer be retrieved properly, so the check for containment fails. The HashSet is no longer working properly. A solution would be to make Member immutable, but is that the only solution? Should all classes added to HashSets be immutable? Is there any other way to handle the situation?
Objects in hashsets should either be immutable, or you need to exercise discipline in not changing them after they've been used in a hashset (or hashmap).
In practice I've rarely found this to be a problem - I rarely find myself needing to use complex objects as keys or set elements, and when I do it's usually not a problem just not to mutate them. Of course if you've exposed the references to other code by this time, it can become harder.
Yes. While keeping your class mutable, you can base the hashCode and equals methods on immutable values of the class (perhaps a generated id) to adhere to the hashCode contract defined in the Object class:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Depending on your situation this may be easier or not.
class Member {
    private static long nextId = 0;

    private final long id = Member.nextId++;
    // other members here...

    public int hashCode() { return Long.hashCode(this.id); }

    public boolean equals(Object o) {
        if (this == o) { return true; }
        if (o instanceof Member) { return this.id == ((Member) o).id; }
        return false;
    }
    ...
}
If you need a thread-safe id, you may consider using an AtomicLong instead, but again, it depends on how you are going to use your object.
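For example, a thread-safe variant of the generated id could look roughly like this (just a sketch, not a drop-in replacement):
import java.util.concurrent.atomic.AtomicLong;

class Member {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    // Immutable identity assigned at construction time; safe even if several
    // threads create Member instances concurrently.
    private final long id = NEXT_ID.getAndIncrement();

    @Override
    public int hashCode() { return Long.hashCode(id); }

    @Override
    public boolean equals(Object o) {
        return this == o || (o instanceof Member && this.id == ((Member) o).id);
    }
}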
As already mentioned, one can accept the following three solutions:
1. Use immutable objects; even when your class is mutable, you may use immutable identities in your hashCode implementation and equals checking, e.g. an ID-like value.
2. Similarly to the above, implement add/remove so that they store a clone of the inserted object, not the actual reference. HashSet does not offer a get function (e.g. to allow you to alter the object later on); thus, you are safe that no duplicates will exist.
3. Exercise discipline in not changing objects after they've been used, as @Jon Skeet suggests.
But, if for some reason you really need to modify objects after they have been inserted into a HashSet, you need to find a way of "informing" your collection of the new changes. To achieve this functionality:
4. You can use the Observer design pattern and extend HashSet to implement the Observer interface. Your Member objects must be Observable and update the HashSet on any setter or other method that affects hashCode and/or equals.
Note 1: Extending 3, using 4: we may accept alterations, but only those that do not produce an already existing object (e.g. I updated a user's ID by assigning a new ID, not by setting it to an existing one). Otherwise, you have to consider the scenario where an object is transformed in such a way that it is now equal to another object already in the Set. If you accept this limitation, the 4th suggestion will work fine; otherwise you must be proactive and define a policy for such cases.
Note 2: You have to provide both the previous and the current state of the altered object in your update implementation, because you have to first remove the older element (e.g. use getClone() before setting the new values) and then add the object with its new state. The following snippet is just an example implementation; it needs changes based on your policy for handling a duplicate.
@Override
public void update(Observable newItem, Object oldItem) {
    remove(oldItem);
    if (add(newItem))
        newItem.addObserver(this);
}
I've used similar techniques on projects where I require multiple indices on a class, so I can look up, in O(1), the Sets of objects that share a common identity; imagine it as a MultiKeymap of HashSets (this is really useful, as you can then intersect/union indices and search in an SQL-like way). In such cases I annotate methods (usually setters) that must fireChange-update each of the indices when a significant change occurs, so the indices are always updated with the latest states.
Jon Skeet has listed all alternatives. As for why the keys in a Map or Set must not change:
The contract of a Set implies that at any time, there are no two objects o1 and o2 such that
o1 != o2 && set.contains(o1) && set.contains(o2) && o1.equals(o2)
Why that is required is especially clear for a Map. From the contract of Map.get():
More formally, if this map contains a mapping from a key
k to a value v such that (key==null ? k==null : key.equals(k)), then this method returns v, otherwise it returns null. (There can be at most one such mapping.)
Now, if you modify a key inserted into a map, you might make it equal to some other key already inserted. Moreover, the map can not know that you have done so. So what should the map do if you then do map.get(key), where key is equal to several keys in the map? There is no intuitive way to define what that would mean - chiefly because our intuition for these datatypes is the mathematical ideal of sets and mappings, which don't have to deal with changing keys, since their keys are mathematical objects and hence immutable.
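A tiny sketch of that failure mode with a HashMap, written as if it sat inside the Member class from the question (so the direct field access compiles):
// The key's hashCode changes after insertion, so lookups probe the wrong bucket.
Map<Member, String> map = new HashMap<>();
Member key = new Member(1, 2, 3);
map.put(key, "value");

key.x = 0;                        // mutates a field that hashCode/equals depend on

System.out.println(map.get(key)); // typically null: the entry is stranded under the old hash
System.out.println(map.size());   // 1: it is still in the map, just unreachable by this key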
Theoretically (and more often than not practically too) your class either:
has a natural immutable identity that can be inferred from a subset of its fields, in which case you can use those fields to generate the hashCode from.
has no natural identity, in which case using a Set to store them is unnecessary, you could just as well use a List.
Never change a "hashable" field after putting the object in a hash-based container.
It is as if you (Member) registered your phone number (Member.x) in the yellow pages (the hash-based container) and then changed your number: no one can find you in the yellow pages any more.