I use a class derived from TreeMap with my own comparator as keys in a LinkedHashMap. Working with this construct I found some weird behaviour I could not explain. Maybe one of you can help. I tried to reproduce my issue with simple types. When I create a TreeMap with Integer keys, the natural ordering should suffice and I should not need to pass a comparator to the TreeMap constructor, right?
Here is the MWE:
package treemapputtest;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
public class TreeMapPutTest {
public static void main(String[] args) {
System.out.println("simple:");
simpleTest();
System.out.println("\n\ncomplex:");
complexTest();
}
private static void simpleTest(){
TreeMap<Integer,String> map = new TreeMap<>();
System.out.println("map: " + map.hashCode() + " | " + Integer.toHexString(map.hashCode()));
map.put(1, "a");
System.out.println("map: " + map.hashCode() + " | " + Integer.toHexString(map.hashCode()));
map.put(2, "b");
System.out.println("map: " + map.hashCode() + " | " + Integer.toHexString(map.hashCode()));
}
private static void complexTest(){
TreeMap<Integer,String> internalMap = new TreeMap<>();
internalMap.put(1, "a");
internalMap.put(2, "b");
System.out.println("prior: " + internalMap.hashCode() + " | " + Integer.toHexString(internalMap.hashCode()));
LinkedHashMap<TreeMap<Integer,String>,Double> myMap = new LinkedHashMap<>();
myMap.put(internalMap, 1.0);
doSomethingWithMyInternalMap(myMap.keySet().iterator().next());
System.out.println("after:");
for (Map.Entry<TreeMap<Integer,String>,Double> entry : myMap.entrySet()){
System.out.println(" " + Integer.toHexString(entry.getKey().hashCode()));
}
}
private static void doSomethingWithMyInternalMap(TreeMap<Integer,String> intern){
intern.put(3, "c");
}
}
The output is:
simple:
map: 0 | 0
map: 96 | 60
map: 192 | c0
complex:
prior: 192 | c0
after:
120
So my question is: Why does the result of hashCode() change when I add stuff to the TreeMap? For the TreeMap alone this is not a big deal, but as this creates a "new object"/the reference to the old object is changed, I get errors after updating the TreeMap in the LinkedHashMap.
The Object API says for hashCode():
The general contract of hashCode is: Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified.
Does putting additional stuff in the TreeMap change something in the equals() method of TreeMap? Do I have to somehow override equals() and hashCode()?
I think you have a misunderstanding about hashCode here. Note the final clause of the text you quoted:
The general contract of hashCode is: Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified.
Whenever you add (or remove) data in a map, you're changing the information used in its equals method - an empty map isn't equal to a map with [1->a] in it, and a map with [1->a] isn't equal to a map with [1->a; 2->b].
This has nothing to do with creating a new object, and the reference to the old map does not change. If you call System.identityHashCode(map) instead of map.hashCode() you'll see the object reference does not change no matter how many times you call put on it.
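The difference between the two hash values can be checked directly. The following is a minimal sketch (the class and method names are made up for illustration) that records System.identityHashCode and hashCode before and after a put on the same TreeMap:

```java
import java.util.TreeMap;

class IdentityVsContentHash {
    // Returns {identity hash before, identity hash after,
    //          content hash before, content hash after} around one put.
    static int[] hashesAroundPut() {
        TreeMap<Integer, String> map = new TreeMap<>();
        map.put(1, "a");
        int identityBefore = System.identityHashCode(map);
        int contentBefore = map.hashCode();
        map.put(2, "b"); // mutates the same object, no new map is created
        return new int[] {
            identityBefore, System.identityHashCode(map),
            contentBefore, map.hashCode()
        };
    }

    public static void main(String[] args) {
        int[] h = hashesAroundPut();
        System.out.println("identity hash unchanged: " + (h[0] == h[1]));
        System.out.println("content hash: " + h[2] + " -> " + h[3]);
    }
}
```

The content hash goes from 96 to 192, matching the output in the question, while the identity hash stays the same.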
Does putting additional stuff in the TreeMap change something in the equals() method of TreeMap?
Yes, of course, since the contract of equals, for a Map, is
Compares the specified object with this map for equality. Returns true if the given object is also a map and the two maps represent the same mappings. More formally, two maps m1 and m2 represent the same mappings if m1.entrySet().equals(m2.entrySet())
So, every entry of the map is used to check for equality, and is thus also used to compute the hashCode. Adding an entry therefore changes the hashCode.
You're expecting hashCode to be some sort of immutable identifier for an object. It's not. Not at all.
Does putting additional stuff in the TreeMap change something in the equals() method of TreeMap? Do I have to somehow override equals() and hashCode()?
Why wouldn't it? A new TreeMap is empty. By your reasoning, if its equals() and hashCode() methods didn't adjust to the contents of the TreeMap, all instances of TreeMap (which start out empty) would have the same hash code, because it would be the one computed on creation with no entries, irrespective of what is added afterwards.
The equals() and hashCode() functions for most Collections adjust according to their contents. This way a Set or List of elements could be compared to another instance of a Set or List of elements, and equals() would return true if they contain the same elements (in the same order in case of List). Similarly for Map the internal equals() implementation is making sure to check whatever is considered equivalent for that Map implementation.
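For the errors described in the question, one possible workaround is to take the key out of the outer map before changing it and put it back afterwards. This is only a sketch under that assumption (class and method names are hypothetical); note that re-inserting moves the entry to the end of the LinkedHashMap's iteration order:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class MutatedKeyDemo {
    // Mutating a key in place leaves the entry filed under its old hash code.
    static boolean foundAfterInPlaceMutation() {
        TreeMap<Integer, String> key = new TreeMap<>();
        key.put(1, "a");
        Map<TreeMap<Integer, String>, Double> outer = new LinkedHashMap<>();
        outer.put(key, 1.0);
        key.put(2, "b"); // the key's hashCode changes from 96 to 192
        return outer.containsKey(key);
    }

    // Remove the entry first, then mutate the key, then put it back.
    static boolean foundAfterRemoveMutatePut() {
        TreeMap<Integer, String> key = new TreeMap<>();
        key.put(1, "a");
        Map<TreeMap<Integer, String>, Double> outer = new LinkedHashMap<>();
        outer.put(key, 1.0);
        Double value = outer.remove(key); // works: the hash still matches
        key.put(2, "b");
        outer.put(key, value);            // stored under the new hash code
        return outer.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterInPlaceMutation());  // false
        System.out.println(foundAfterRemoveMutatePut());  // true
    }
}
```

If the keys have to change often, it may be simpler to key the outer map by something immutable instead.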
Related
I need a clarification regarding the use of TreeMap and LinkedList. Do these two structures use compareTo or equals?
In particular, TreeMap keeps the order in the keys, I suppose using the compareTo method of the class defined for the keys. However, when using get, do they use compareTo or equals to see if the key you pass is contained?
I have the same doubt for contains and getIndex inside LinkedList.
TreeMap uses compareTo, and the documentation warns you of problems if compareTo is not consistent with equals (i.e. that a.compareTo(b) == 0 <=> a.equals(b) should be true).
Note that the ordering maintained by a tree map ... must be consistent with equals if this sorted map is to correctly implement the Map interface.
LinkedList uses equals.
The reason why TreeMap has to use an ordering consistent with equals is that the contract of Map defines the behavior in terms of equals. For example, containsKey is defined as:
returns true if and only if this map contains a mapping for a key k such that (key==null ? k==null : key.equals(k))
Let's say you define a class like this:
class Bad implements Comparable<Bad> {
@Override public int compareTo(Bad other) { return 0; }
}
If you were to write:
Bad b1 = new Bad();
Bad b2 = new Bad();
Then:
Map<Bad, String> hm = new HashMap<>();
hm.put(b1, "");
System.out.println(hm.containsKey(b2)); // false
whereas
Map<Bad, String> tm = new TreeMap<>();
tm.put(b1, "");
System.out.println(tm.containsKey(b2)); // true
despite the fact that
System.out.println(tm.keySet().stream().anyMatch(k -> k.equals(b2))); // false
Thus, TreeMap violates the contract of Map, because Bad does not implement Comparable consistently with equals.
The Javadoc of TreeMap and LinkedList answers this:
V java.util.TreeMap.get(Object key)
Returns the value to which the specified key is mapped, or null if this map contains no mapping for the key.
More formally, if this map contains a mapping from a key k to a value v such that key compares equal to k according to the map's ordering, then this method returns v; otherwise it returns null. (There can be at most one such mapping.)
and
boolean java.util.LinkedList.contains(Object o)
Returns true if this list contains the specified element. More formally, returns true if and only if this list contains at least one element e such that (o==null ? e==null : o.equals(e)).
So, for TreeMap the Comparator/Comparable implementation is used to determine equality of keys, while for LinkedList equals is used.
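The difference is easy to observe with a case-insensitive comparator. In this small sketch (the class and method names are illustrative only), the TreeMap finds a key that differs only in case, while the LinkedList does not:

```java
import java.util.LinkedList;
import java.util.List;
import java.util.TreeMap;

class CompareVsEquals {
    // TreeMap lookups go through the comparator.
    static boolean treeMapFinds() {
        TreeMap<String, Integer> tm = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        tm.put("key", 1);
        return tm.containsKey("KEY"); // compare("KEY", "key") == 0
    }

    // LinkedList lookups go through equals.
    static boolean linkedListFinds() {
        List<String> list = new LinkedList<>();
        list.add("key");
        return list.contains("KEY"); // "KEY".equals("key") is false
    }

    public static void main(String[] args) {
        System.out.println(treeMapFinds());    // true
        System.out.println(linkedListFinds()); // false
    }
}
```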
I'm using a HashMap and I haven't been able to get a straight answer on how the get() method works in the case of collisions.
Let's say n > 1 objects get placed in the same key. Are they stored in a LinkedList? Are they overwritten so that only the last object placed in that key exists there anymore? Are they using some other collision method?
If they are placed in a LinkedList, is there a way to retrieve that entire list? If not, is there some other built in map for Java in which I can do this?
For my purposes, separate chaining would be ideal, as if there are collisions, I need to be able to look through the list and get information about all the objects in it. What would be the best way to do this in Java?
Thanks for all your help!
The documentation for HashMap.put() clearly states, "Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced"
If you would like to have a list of objects associated with a key, then store a list as the value.
Note that 'collision' generally refers to the internal working of the HashMap, where two keys have the same hash value, not the use of the same key for two different values.
Are they overwritten so that only the last object placed in that key exists there anymore?
Yes, assuming you're putting multiple values with the same key (according to Object.equals, not Object.hashCode.) That's specified in the Map.put javadoc:
If the map previously contained a mapping for the key, the old value is replaced by the specified value.
If you want to map a key to multiple values, you're probably better off using something like Guava's ListMultimap, ArrayListMultimap in specific, which maps keys to lists of values. (Disclosure: I contribute to Guava.) If you can't tolerate a third-party library, then really you have to have a Map<Key, List<Value>>, though that can get a bit unwieldy.
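Without a third-party library, the Map<Key, List<Value>> variant becomes less unwieldy with Map.computeIfAbsent (Java 8 and later). A minimal sketch, with String keys and Integer values chosen only for illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MultiValueMap {
    static Map<String, List<Integer>> build() {
        Map<String, List<Integer>> map = new HashMap<>();
        // computeIfAbsent creates the list on first use and returns the
        // existing list afterwards, so values accumulate per key.
        map.computeIfAbsent("a", k -> new ArrayList<>()).add(1);
        map.computeIfAbsent("a", k -> new ArrayList<>()).add(2);
        map.computeIfAbsent("b", k -> new ArrayList<>()).add(3);
        return map;
    }

    public static void main(String[] args) {
        System.out.println(build().get("a")); // [1, 2]
        System.out.println(build().get("b")); // [3]
    }
}
```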
Let's say n > 1 objects get placed in the same key. Are they stored in a linked list? Are they overwritten so that only the last object placed in that key exists there anymore? Are they using some other collision method?
There can be only one entry per key, so the last value put replaces the earlier one:
Map<String, Integer> map = new HashMap<String, Integer>();
map.put("a", 1);
map.put("a", 2);// it overrides 1 and puts 2 there
Chaining comes into play when different keys produce the same hash.
See
Java papers hash table working
Cite: "Let's say n > 1 objects get placed in the same key. Are they stored in a linked list? Are they overwritten so that only the last object placed in that key exists there anymore? Are they using some other collision method?"
Yes, if the HashMap already contained something under this key, the new value replaces it.
You can implement your own class to handle that or, more simply, use a HashMap<K, List<V>>, where K is your key type and V is your value type.
Keep in mind that with the latter solution, map.get(K) returns a List (of whichever implementation you chose, e.g. ArrayList), so all the methods of that type are available to you and may fulfil your requirements. For example, if the value type is declared as ArrayList you have size, trimToSize, etc.
Collision resolution in java.util.HashMap is based on separate chaining, not open addressing: entries whose keys hash to the same bucket are linked together in that bucket (since Java 8, a large bucket is converted to a balanced tree). That chain is internal, however, and there is no public API that returns it.
If you need all the objects that belong to one key, put them in a list, store that list as the map value, and update the list in the map.
package hashing;
import java.util.HashMap;
import java.util.Map;
public class MainAnimal {
/**
 * @param args
*/
public static void main(String[] args) {
Animal a1 = new Animal(1);
Animal a2 = new Animal(2);
Map<Animal, String> animalsMap = new HashMap<Animal, String>();
animalsMap.put(a1,"1");
animalsMap.put(a2,"2");
System.out.println(animalsMap.get(a1));
Map<String, Integer> map = new HashMap<String, Integer>();
map.put("a", 1);
map.put("a", 2);// it overrides 1 and puts 2 there
System.out.println(map.get("a"));
}
}
class Animal {
private int index = 0;
Animal(int index){
this.index = index;
}
public boolean equals(Object obj){
if(obj instanceof Animal) {
Animal animal = (Animal) obj;
if(animal.getIndex()==this.getIndex())
return true;
else
return false;
}
return false;
}
public int hashCode() {
return 0;
}
public int getIndex() {
return index;
}
public void setIndex(int index) {
this.index = index;
}
}
The code above shows two different things.
case 1 - two different key instances that resolve to the same hash code
case 2 - the same key used in two put calls
The Animal instances a1 and a2 resolve to the same hash code, but neither overwrites the other. Because equals returns false for them, the map stores both entries in the same bucket.
In the second case, the keys resolve to the same hash code and equals returns true as well, so the second put replaces the first value.
Now if in the animal class I override the equals method this way -
public boolean equals(Object obj){
// if(obj instanceof Animal) {
// Animal animal = (Animal) obj;
// if(animal.getIndex()==this.getIndex())
// return true;
// else
// return false;
// }
// return false;
return true;
}
Now the value is overwritten. The behavior is the same as when using the same instance, since a1 and a2 are in the same bucket and equals returns true as well.
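Both cases can be observed in one compact sketch. The Key class below is a hypothetical stand-in for Animal: every instance hashes to 0, and equality is decided by id.

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeys {
    // Hypothetical key type: constant hash code, equality by id.
    static final class Key {
        final int id;
        Key(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
        @Override public int hashCode() { return 0; }
    }

    static Map<Key, String> build() {
        Map<Key, String> map = new HashMap<>();
        map.put(new Key(1), "one");
        map.put(new Key(2), "two"); // same hash, unequal key: stored alongside
        map.put(new Key(1), "uno"); // same hash, equal key: value replaced
        return map;
    }

    public static void main(String[] args) {
        Map<Key, String> map = build();
        System.out.println(map.size());          // 2
        System.out.println(map.get(new Key(1))); // uno
        System.out.println(map.get(new Key(2))); // two
    }
}
```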
It puzzles me how the following segment can lead to a null value of the Boolean mandatory, although it is not null at the corresponding key in the actual hashtable:
for (List<List<A>> a : hashMap.keySet()) {
Boolean mandatory = hashMap.get(a);
}
A HashMap will return null if the key specified is not bound to a value.
The issue is almost certainly that the lookup of a (a List) against the stored keys is failing.
Let me guess: are you modifying these lists (the key object) after you have called a put? Did you remove all entries in one of the keys? Remember an empty list is equal to all empty ArrayLists. Further remember that List.equals() compares list content (one by one) to test equality.
package sof_6462281;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
/**
* Demonstrate the fact that the Map uses key.equals(k) to
* test for key equality. Further demonstrate that it is a
 * very bad idea to use mutable collections as keys to maps.
*/
public class ListAsKey {
public static void main(String[] args) {
Map<List<A>, Boolean> map = new HashMap<List<A>, Boolean>();
List<A> alist = new ArrayList<A>();
map.put(alist, true);
for (List<A> a : map.keySet()) {
Boolean b = map.get(a);
System.out.format("\t%s(ArrayList#%d) => %s\n",a, a.hashCode(), map.get(a));
}
// you changed your list after the put, didn't you?
alist.add(new A());
for (List<A> a : map.keySet()) {
Boolean b = map.get(a);
System.out.format("\t%s(ArrayList#%d) => %s\n",a, a.hashCode(), map.get(a));
}
alist.clear();
for (List<A> a : map.keySet()) {
Boolean b = map.get(a);
System.out.format("\t%s(ArrayList#%d) => %s\n",a, a.hashCode(), map.get(a));
}
}
public static final class A { /* foo */ }
}
Results:
[](ArrayList#1) => true
[sof_6462281.ListAsKey$A#4b71bbc9](ArrayList#1265744872) => null
[](ArrayList#1) => true
edit: added more ops to above and added console out.
A Boolean can be null because it is an object wrapper around the primitive boolean. I am unsure what you mean by "it is not null at the corresponding key in the actual hashtable". You are iterating over the keys and then getting the value for each key; if a value was inserted as null, you get null back when you retrieve it.
Using a mutable object for a Map key is always a dangerous thing. If you maintain any reference to any of those keys after inserting into the map, then it is very likely that one of those keys will be modified at some point in the future which will invalidate the contents of your map.
A less likely, but possible scenario, even assuming you somehow don't screw up your List<List<>> key is if you have messed up the equals method of class A, then your Lists' equals method will also be messed up, again screwing up your map.
Look at alphazero's nice code example if you need further proof that what you are attempting to do is a bad idea.
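If a list really has to serve as a key, one way to reduce the risk is to store an unmodifiable copy rather than the list you keep working with. This is a sketch under that assumption (names are illustrative), and it only helps if the elements themselves do not change in ways that affect their equals and hashCode:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SnapshotKey {
    static Boolean lookupAfterMutation() {
        List<String> working = new ArrayList<>();
        working.add("x");

        Map<List<String>, Boolean> map = new HashMap<>();
        // The key is an unmodifiable copy, so later changes to 'working'
        // cannot alter the key's contents or its hash code.
        List<String> key = Collections.unmodifiableList(new ArrayList<>(working));
        map.put(key, true);

        working.add("y"); // does not touch the key
        return map.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookupAfterMutation()); // true
    }
}
```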
I have a data structure containing a list of objects, like this:
class A {
private List<Object> list;
}
How to properly define a hash function for the list, assuming each element of the list has a correct hashCode()?
If the actual List implementation is fully conformant to the interface, the provided hashCode implementation should be sufficient:
Returns the hash code value for this list. The hash code of a list is defined to be the result of the following calculation:
hashCode = 1;
Iterator i = list.iterator();
while (i.hasNext()) {
Object obj = i.next();
hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
}
(List documentation)
The List interface requires conforming implementations to provide equals based on the elements of the list. Thus, they had to specify the hashCode algorithm explicitly.
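As a quick check of the documented calculation, the loop can be run by hand and compared with the list's own hashCode. A small sketch with a hypothetical helper manualHash:

```java
import java.util.Arrays;
import java.util.List;

class ListHashByHand {
    // Re-implements the calculation from the List documentation.
    static int manualHash(List<?> list) {
        int hashCode = 1;
        for (Object obj : list) {
            hashCode = 31 * hashCode + (obj == null ? 0 : obj.hashCode());
        }
        return hashCode;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b");
        // "a".hashCode() is 97 and "b".hashCode() is 98:
        // 31 * (31 * 1 + 97) + 98 = 4066
        System.out.println(manualHash(list)); // 4066
        System.out.println(list.hashCode());  // 4066
    }
}
```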
Why do you want to define hashCode for your list, when it already has it implemented (along with equals)?
(Provided it is java.util.List of course - however if not, the link above shows you the exact implementation you can use for your own list type.)
The hash code of a list is defined by the List interface. This can be used as part of your object's hash code, though there are a couple of cases where you might not want to use it - if the elements of your list have an expensive hash code function, or if the list can hold a reference to the object, in which case you would get a stack overflow if the list's algorithm was used. In those cases, just use the length of the list or another hash value.
In the Java library, List implementations (LinkedList, ArrayList) use the default hashCode implementation provided by AbstractList, which is defined as:
int hashCode = 1;
Iterator<E> i = iterator();
while (i.hasNext()) {
E obj = i.next();
hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
}
return hashCode;
Any specific reason why you wouldn't just do:
Arrays.hashCode(<list converted to array>);
Something like:
Arrays.hashCode(myList.toArray()); // no cast: toArray() returns Object[]; same value as myList.hashCode()
Maybe the question should have been "how to compute a hashcode of an object containing a list".
class A {
private List<Object> list;
@Override
public int hashCode() {
return list.hashCode();
}
// ... don't forget to implement custom equals ...
}
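A matching equals could look like the following sketch. The class is named ListHolder here only to keep the example self-contained, and the defensive copy in the constructor is an added assumption, not part of the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListHolder {
    private final List<Object> list;

    ListHolder(List<Object> list) {
        this.list = new ArrayList<>(list); // defensive copy (assumption)
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ListHolder)) return false;
        // Delegates to List.equals: same elements in the same order.
        return list.equals(((ListHolder) o).list);
    }

    @Override
    public int hashCode() {
        return list.hashCode(); // consistent with equals above
    }

    public static void main(String[] args) {
        ListHolder a = new ListHolder(Arrays.<Object>asList("a", 1));
        ListHolder b = new ListHolder(Arrays.<Object>asList("a", 1));
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```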
public final Comparator<String> ID_IGN_CASE_COMP = new Comparator<String>() {
public int compare(String s1, String s2) {
return s1.compareToIgnoreCase(s2);
}
};
private Map< String, Animal > _animals = new TreeMap< String, Animal >(ID_IGN_CASE_COMP);
My problem is how to use get(id) while ignoring the given comparator. I want the map to be ordered case-insensitively, but I want lookups by key to be case-sensitive.
I think the answer is easy. Implement your own comparator that does a case insensitive sort but does NOT return 0 for "A" and "a"... sort them too.
The issue is that your comparator returns 0 for the compare( "A", "a" ) case which means it is the same key as far as the map is concerned.
Use a comparator like:
public final Comparator<String> ID_IGN_CASE_COMP = new Comparator<String>() {
public int compare(String s1, String s2) {
int result = s1.compareToIgnoreCase(s2);
if( result == 0 )
result = s1.compareTo(s2);
return result;
}
};
Then all keys will go in regardless of case and "a" and "A" will still be sorted together.
In other words, get("a") will give you a different value from get("A")... and they will both show up in keySet() iterators. They will just be sorted together.
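On Java 8 and later the same tie-breaking comparator can be composed from existing pieces. A sketch of that variant, using String.CASE_INSENSITIVE_ORDER in place of compareToIgnoreCase (the class and constant names are made up):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CaseTieBreak {
    // Case-insensitive order first; exact (case-sensitive) order breaks ties,
    // so "a" and "A" are distinct keys that sort next to each other.
    static final Comparator<String> IGNORE_CASE_THEN_EXACT =
            String.CASE_INSENSITIVE_ORDER.thenComparing(Comparator.<String>naturalOrder());

    static Map<String, Integer> build() {
        Map<String, Integer> map = new TreeMap<>(IGNORE_CASE_THEN_EXACT);
        map.put("b", 1);
        map.put("a", 2);
        map.put("B", 3);
        map.put("A", 4);
        return map;
    }

    static List<String> orderedKeys() {
        return new ArrayList<>(build().keySet());
    }

    public static void main(String[] args) {
        System.out.println(orderedKeys());    // [A, a, B, b]
        System.out.println(build().get("a")); // 2
        System.out.println(build().get("A")); // 4
    }
}
```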
In a TreeMap, adding two keys a and b (in that order) such that compare(a, b) returns 0 means the second put overwrites the first: the value stored for b replaces the value stored for a, and the map keeps a single entry.
In your case, this means that there will never be any use for case insensitive get(id).
quoting http://java.sun.com/javase/6/docs/api/java/util/TreeMap.html
Note that the ordering maintained by a sorted map (whether or not an explicit comparator is provided) must be consistent with equals if this sorted map is to correctly implement the Map interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Map interface is defined in terms of the equals operation, but a map performs all key comparisons using its compareTo (or compare) method, so two keys that are deemed equal by this method are, from the standpoint of the sorted map, equal. The behavior of a sorted map is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Map interface.
This is probably not what you want.
If the map is comparably small and you don't need to fetch the sorted entries very many times, a solution is to use a HashMap (or a TreeMap without explicitly setting the comparator), and sort the entries case-insensitively when you need them ordered.
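The sort-on-demand approach could be sketched as follows. The helper name keysIgnoringCase is hypothetical, and the example deliberately uses keys that differ in more than case, since the relative order of "a" and "A" would otherwise depend on the map's iteration order:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SortOnDemand {
    // Lookups stay case-sensitive (plain HashMap); ordering is applied
    // only when a sorted view of the keys is actually needed.
    static List<String> keysIgnoringCase(Map<String, ?> map) {
        List<String> keys = new ArrayList<>(map.keySet());
        keys.sort(String.CASE_INSENSITIVE_ORDER);
        return keys;
    }

    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("b", 1);
        map.put("A", 2);
        map.put("c", 3);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        System.out.println(keysIgnoringCase(map)); // [A, b, c]
        System.out.println(map.get("A"));          // 2
        System.out.println(map.get("a"));          // null
    }
}
```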
You'll have to use two separate TreeMaps for that, with the same contents but different comparators.
maybe it'll do the job:
new Comparator<String>(){
public int compare(String s1, String s2)
{
String s1n = s1.toLowerCase();
String s2n = s2.toLowerCase();
if(s1n.equals(s2n))
{
return s1.compareTo(s2);
}
return s1n.compareTo(s2n);
}
};
You need a multimap: each entry of this multimap holds a case-insensitive key and, as its value, another map keyed by the original keys.
There are many freely usable implementations of multimaps, such as Apache Commons Collections, Google Collections, etc.
In addition to all the other answers and agreeing, that it is impossible to have a single TreeMap structure with different comparators:
From your question I understand that you have two requirements: the data model shall be case-sensitive (you want the case-sensitive values when you use get()), and the presenter shall be case-insensitive (you want a case-insensitive ordering; presentation is just an assumption).
Let's assume we populate the Map with the mappings (aa,obj1), (aA,obj2), (Aa,obj3), (AA,obj4). The iterator will provide the values in the order (obj4, obj3, obj2, obj1)(*). Now which order do you expect if the map was ordered case-insensitively? All four keys would be equal and the order undefined. Or are you looking for a solution that would return the collection {obj1, obj2, obj3, obj4} for the key 'AA'? But that's a different approach.
SO encourages the community to be honest: therefore my advice at this point is to look at your requirement again :)
(*) not tested, assumed that 'A' < 'a' = true.
Use floorEntry and then higherEntry in a loop to find the entries case-insensitively; stop when you find the exact key match.