By default, hashCode and equals seem to work fine.
I have used objects with hash tables like HashMap without overriding these methods, and it worked. For example:
import java.util.HashMap;
import java.util.Map;

public class Main {
    public static void main(String[] args) throws Exception {
        Map<Object, String> map = new HashMap<>();
        Object key = new Main();
        map.put(key, "2");
        Object key2 = new Main();
        map.put(key2, "3");
        System.out.println(map.get(key));
        System.out.println(map.get(key2));
    }
}
This code works fine. By default, hashCode returns a value based on the object's memory address, and equals checks whether two references point to the same object. So what is the problem with using the default implementation of these methods?
Note this example from an old pdf I have:
This code
import java.util.HashSet;
import java.util.Set;

public class Name {
    private String first, last;

    public Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    public boolean equals(Object o) {
        if (!(o instanceof Name)) return false;
        Name n = (Name) o;
        return n.first.equals(first) && n.last.equals(last);
    }

    public static void main(String[] args) {
        Set s = new HashSet();
        s.add(new Name("Donald", "Duck"));
        System.out.println(s.contains(new Name("Donald", "Duck")));
    }
}
...will not always give the same result because, as stated in the pdf,
Donald is in the set, but the set can’t find him. The Name class
violates the hashCode contract
Because, in this case, the object is composed of two strings, the hash code should also be computed from both of them.
To fix this code, we should add a hashCode method:
public int hashCode() {
return 31 * first.hashCode() + last.hashCode();
}
The question in the pdf ends by saying that we should
override hashCode when overriding equals
In your example, whenever you want to retrieve something from your HashMap, you need to have key and key2, because their equals() is the same as object identity. This makes the HashMap completely useless: you cannot retrieve anything from it without holding these two keys, and passing the keys around doesn't make sense, because you could just as well pass the values around; it would be equally awkward.
Now try to imagine some use case where a HashMap actually makes sense. For example, suppose that you get String-valued requests from the outside and want to return, say, IP addresses. The keys that come from the outside obviously cannot be the same objects as the keys you used to set up your map. Therefore you need some method that compares requests from the outside to the keys you used during the initialization phase. This is exactly what equals is good for: it defines an equivalence relation on objects that are not identical in the sense of being represented by the same bits in physical memory. hashCode is a coarser version of equals, which is necessary to retrieve values from HashMaps quickly.
Your example is not very useful, as it would be simpler to use plain variables: the only way to look up a value in the map is to hold the original key, in which case you may as well just hold the value and not have a Map in the first place.
If instead you want to be able to create a new key which is considered equivalent to a key used previously, you have to provide how equivalence is determined.
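Here is a minimal sketch of that idea; the Request class and its field name are made up for illustration, not taken from the question:

import java.util.HashMap;
import java.util.Map;

// Hypothetical key class for the request/IP-address example above:
// equality is based on the request string, not on object identity.
final class Request {
    private final String name;

    Request(String name) { this.name = name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Request && ((Request) o).name.equals(name);
    }

    @Override
    public int hashCode() { return name.hashCode(); }
}

class Lookup {
    public static void main(String[] args) {
        Map<Request, String> addresses = new HashMap<>();
        addresses.put(new Request("printer"), "10.0.0.5");
        // A different instance, but an equal key, still finds the value.
        System.out.println(addresses.get(new Request("printer"))); // 10.0.0.5
    }
}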
Because most objects are never asked for their identity hash code, the JVM does not keep, for most objects, any information that would be sufficient to establish a permanent identity. Instead, Java uses two bits in the object header to distinguish three states:
The identity hashcode for the object has never been queried.
The identity hashcode has been queried, but the object has not been moved by the GC since then.
The identity hashcode has been queried, and the object has been moved since then.
For objects in the first state, asking for the identity hash code will change the object to the second state and process it as a second-state object.
For objects in the second state, including those which had moments before been in the first state, the identity hash code will be formed from the address.
When an object in the second state is moved by the GC, the GC will allocate an extra 32 bits to the object, which will be used to hold a hash-code derived from its original address. The object will then be assigned to the third state.
Subsequent requests for the hash code from a state-3 object will use the value that was stored when it was moved.
At times when the system knows that no objects within a certain address range are in state 2, it may change the formula used to compute hash codes from addresses in that range.
Although at any given time there may only be one object at any given address, it is entirely possible that an object might be asked for its identity hash code and later moved, and that another object might be placed at either the same address as the first one or an address that would hash to the same value (the system might change the formula used to compute hash values to avoid duplication, but it would be unable to eliminate it).
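One practical consequence you can check yourself: once queried, the identity hash code stays stable for the rest of the object's lifetime, even if the GC moves the object (System.gc() is only a hint, but the comparison prints true either way):

public class IdentityHashDemo {
    public static void main(String[] args) {
        Object o = new Object();
        int before = System.identityHashCode(o);
        System.gc(); // may move the object, but the stored hash survives
        int after = System.identityHashCode(o);
        System.out.println(before == after); // true
    }
}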
Related
I have a method that checks whether two objects are equal (by reference).
public <T> boolean isUnique(T uniqueIdOfFirstObject, T uniqueIdOfSecondObject) {
    return (uniqueIdOfFirstObject == uniqueIdOfSecondObject);
}
(Use case) Assuming that I don't have any control over creation of the object.
I have a method
void currentNodeExistOrAddToHashSet(T newObject, HashSet<T> objectHash) {
    // will it be 100% precise? Assuming both objects have the same field values.
    if (!objectHash.contains(newObject)) {
        objectHash.add(newObject);
    }
}
or I could do something like this
void currentNodeExistOrAddToHashSet(Object newObject, HashSet<Integer> objectHash) {
    // as per my knowledge, there might be collisions for different objects.
    int uniqueId = System.identityHashCode(newObject);
    if (!objectHash.contains(uniqueId)) {
        objectHash.add(uniqueId);
    }
}
Is it possible to get a 100% collision-proof ID in Java, i.e. different objects having different IDs and the same object having the same ID, irrespective of the content of the object?
Since you put them into a HashSet, which uses hashCode/equals, and hashCode is only 32 bits long, there is a hard limit; collisions will happen. Especially since a HashSet actually only looks at the last n bits of the hash before growing itself and then using one more bit, and so on. You can read a lot more about this here, for example.
The real question is different: why do you want a collision-free structure in the first place? If you define a fairly well-distributed hashCode and a decent equals, these things should not matter to you at all. If you worry about the performance of a search, it is O(1) on average for a HashSet.
You could define hashCode and equality based on a UUID, say via UUID#randomUUID, but this still bounds your hashCode to the same 32 bits, so collisions could still happen.
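If equality based on a random UUID per instance is enough for your use case, a sketch might look like this (the class name is mine; note the hashCode is still only a 32-bit value, as said above, so it can collide even though the UUID itself will not):

import java.util.UUID;

class UniquelyIdentified {
    private final UUID id = UUID.randomUUID();

    @Override
    public boolean equals(Object o) {
        return o instanceof UniquelyIdentified && ((UniquelyIdentified) o).id.equals(id);
    }

    @Override
    public int hashCode() {
        return id.hashCode(); // still 32 bits, so collisions remain possible
    }
}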
From what I understand, when two objects are put in a HashMap that have the same hashcode, they are put in a LinkedList (I think) of objects with the same hash code. I am wondering if there is a way to either extend HashMap or manipulate the existing methods to return a list or array of objects that share a hash code instead of going into equals to see if they are the same object.
The reasoning is that I'm trying to optimize a part of the code that, currently, is just a while loop that finds the first object with that hashcode and stores/removes it. This would be a lot faster if I could just return the full list in one go.
Here's the bit of code I'd like to replace:
while (WorkingMap.containsKey(toSearch)) {
    Occurences++;
    Possibles.add(WorkingMap.get(toSearch));
    WorkingMap.remove(toSearch);
}
The keys are Chunk objects and the values are Strings. Here are the hashCode() and equals() functions for the Chunk class:
/**
 * Returns a string representation of the ArrayList of words,
 * thereby storing chunks with the same words but with different
 * locations and next words in the same hash bucket, triggering the
 * use of equals() when searching and adding.
 */
public int hashCode() {
    return (Words.toString()).hashCode();
}
@Override
/**
 * Result depends on the value of location. A location of -1 is obviously
 * not valid and therefore indicates that we are searching for a match rather
 * than adding to the map. This allows multiples of keys with matching hashcodes
 * to be considered unequal when adding to the hashmap but equal when searching
 * it, which is integral to the MakeMap() and GetOptions() methods of the
 * RandomTextGenerator class.
 */
public boolean equals(Object obj) {
    Chunk tempChunk = (Chunk) obj;
    if (LocationInText == -1 && Words.size() == tempChunk.GetText().size()) {
        for (int i = 0; i < Words.size(); i++) {
            if (!Words.get(i).equals(tempChunk.GetText().get(i))) {
                return false;
            }
        }
        return true;
    } else {
        if (tempChunk.GetLocation() == LocationInText) {
            return true;
        }
        return false;
    }
}
Thanks!
HashMap does not expose any way to do this, but I think you're misunderstanding how HashMap works in the first place.
The first thing you need to know is that if every single object had exactly the same hash code, HashMap would still work. It would never "mix up" keys. If you call get(key), it will only return the value associated with key.
The reason this works is that HashMap only uses hashCode as a first grouping, but then it checks the object you passed to get against the keys stored in the map using the .equals method.
There is no way, from the outside, to tell that HashMap uses linked lists. (In fact, in more recent versions of Java, it doesn't always use linked lists.) The implementation doesn't provide any way to look at hash codes, to find out how hash codes are grouped, or anything along those lines.
while (WorkingMap.containsKey(toSearch)) {
    Occurences++;
    Possibles.add(WorkingMap.get(toSearch));
    WorkingMap.remove(toSearch);
}
This code does not "find the first object with that hashcode and store/remove it." It finds the one and only object equal to toSearch according to .equals, stores and removes it. (There can only be one such object in a Map.)
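To see that a shared hash code alone never mixes keys up, here is a small demonstration using two strings that happen to have the same hashCode:

import java.util.HashMap;
import java.util.Map;

class SameHashDemo {
    public static void main(String[] args) {
        // "Aa" and "BB" are a well-known pair of strings with equal hash codes.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true

        Map<String, String> map = new HashMap<>();
        map.put("Aa", "first");
        map.put("BB", "second");
        // Both keys land in the same bucket, but equals() keeps them apart.
        System.out.println(map.get("Aa")); // first
        System.out.println(map.get("BB")); // second
    }
}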
Your while loop isn't really looping. It makes at most one pass if WorkingMap is a plain Java HashMap: .get(key) returns the single value currently stored under 'key' (the last one saved for that key), so if it matches toSearch, the loop body runs once.
I'm not sure about the many open questions here, but if this is what you need and the rest of your code can work with it:
What type is Possibles? An ArrayList?
// this should do the same as your while loop
if (WorkingMap.containsKey(toSearch)) {
    Possibles.add(WorkingMap.get(toSearch));
    WorkingMap.remove(toSearch);
}
// further: expand your Possibles to get the LinkedList you want to have.
public class possibilities {
    // List<LinkedList<String>> container = new ArrayList<LinkedList<String>>();
    public Map<Chunk, LinkedList<String>> container2 = new HashMap<Chunk, LinkedList<String>>();

    public void put(Chunk key, String value) {
        if (!this.container2.containsKey(key)) {
            this.container2.put(key, new LinkedList<String>());
        }
        this.container2.get(key).add(value);
    }
}
// this works with the updated Possibles
if (WorkingMap.containsKey(toSearch)) {
    Possibles.put(toSearch, WorkingMap.get(toSearch));
    WorkingMap.remove(toSearch);
}
//---
However, yes, it can work like that, but the keys should not be complex objects.
Note: those LinkedLists take memory, and how big are the chunks? Check the memory usage.
Possibles.container2.keySet(); // to get the keys
Good luck
Sail
From what I understand, when two objects are put in a HashMap that have the same hashcode, they are put in a LinkedList (I think) of objects with the same hash code.
Yes, but it's more complicated than that. It often needs to put objects in linked lists even when they have differing hash codes, since it only uses some bits of the hash codes to choose which bucket to store objects in; the number of bits it uses depends on the current size of the internal hash table, which approximately depends on the number of things in the map. And when a bucket needs to contain multiple objects it will also try to use binary trees like a TreeMap if possible (if objects are mutually Comparable), rather than linked lists.
Anyway.....
I am wondering if there is a way to either extend HashMap or manipulate the existing methods to return a list or array of objects that share a hash code instead of going into equals to see if they are the same object.
No.
A HashMap compares keys for equality according to the equals method. Equality according to the equals method is the only valid way to set, replace, or retrieve values associated with a particular key.
Yes, it also uses hashCode as a way to arrange objects in a structure that allows for far faster location of potentially equal objects. Still, the contract for matching keys is defined in terms of equals, not hashCode.
Note that it is perfectly legal for every hashCode method to be implemented as return 0; and the map will still work just as correctly (but very slowly). So any idea that involves getting a list of objects sharing a hash code is either impossible or pointless or both.
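For illustration, here is a sketch of a key type whose hashCode is always 0; a HashSet of these still behaves correctly, it just degrades to a linear scan within the single bucket:

import java.util.HashSet;
import java.util.Set;

final class ZeroHashKey {
    private final String value;

    ZeroHashKey(String value) { this.value = value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof ZeroHashKey && ((ZeroHashKey) o).value.equals(value);
    }

    @Override
    public int hashCode() { return 0; } // legal, but every key shares one bucket

    public static void main(String[] args) {
        Set<ZeroHashKey> set = new HashSet<>();
        set.add(new ZeroHashKey("a"));
        set.add(new ZeroHashKey("b"));
        set.add(new ZeroHashKey("a")); // duplicate, not added
        System.out.println(set.size());                         // 2
        System.out.println(set.contains(new ZeroHashKey("b"))); // true
    }
}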
I'm not 100% sure what you're doing in your equals method with the LocationInText variable, but it looks dangerous, as it violates the contract of the equals method. It is required that the equals method be symmetric, transitive, and consistent:
Symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
Transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
Consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
And the hashCode method is required to always agree with equals about equal objects:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
The LocationInText variable is playing havoc with those rules, and may well break things. If not today, then some day. Get rid of it!
Here's the bit of code I'd like to replace:
while (WorkingMap.containsKey(toSearch)) {
    Occurences++;
    Possibles.add(WorkingMap.get(toSearch));
    WorkingMap.remove(toSearch);
}
Something that jumps out at me is that you only need to do the key lookup once, instead of doing it three times, since Map.remove returns the removed value or null if the key is not present:
for (;;) {
    String s = WorkingMap.remove(toSearch);
    if (s == null) break;
    Occurences++;
    Possibles.add(s);
}
Either way, the loop is still faulty, since it is supposed to be impossible for a map to contain more than one key equal to toSearch. I can't overstate that the LocationInText variable as you're using it is not a good idea.
I agree with the other commenters it looks like you're looking for a map-of-list structure. Some Java libraries like Guava offer a Multimap for this, but you can do it manually pretty easily. I think the declaration you want is:
Map<Chunk,List<String>> map = new HashMap<>();
To add a new chunk-string pair to the map, do:
void add(Chunk chunk, String string) {
    map.computeIfAbsent(chunk, k -> new ArrayList<>()).add(string);
}
That method puts a new ArrayList in the map if the chunk is new, or fetches the existing ArrayList if there is one for that chunk. Then it adds the string to the list that it fetched or created.
To retrieve the list of all strings for a particular chunk value is as simple as map.get(chunkToSearch), which you can add to your Possibles list as Possibles.addAll(map.get(chunkToSearch));.
Other potential optimizations I'd point out:
In your Chunk.hashCode method, consider caching the hash code instead of recomputing it every time the method is called (a sketch of this is shown after the equals example below). If Chunk is mutable (which is not a good idea for a map key, but vaguely allowed so long as you're careful), then recompute the hash code only after the Chunk's value has changed. Also, if Words is a List, which it seems to be, it would likely be faster to use its hash code than to convert it to a string and use the string's hash code, but I'm not sure.
In your Chunk.equals method, you can return true immediately if the instances are the same (which they often will be). Also, if GetText returns a copy of the data, then don't call it; you can access the private Words list of the other Chunk since you are in the same class, and finally, you can just defer to the List.equals method:
@Override
public boolean equals(Object o) {
    return (this == o) || (o instanceof Chunk && this.Words.equals(((Chunk) o).Words));
}
Simple! Fast!
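Here is a sketch of the hash-caching idea from the first optimization above, assuming Words is a List<String>; the rest of the Chunk class (constructor, equals and so on) is omitted and would stay as in the question:

import java.util.List;

class Chunk {
    private final List<String> Words;   // field name as in the question
    private int cachedHash;
    private boolean hashComputed;

    Chunk(List<String> words) { this.Words = words; }

    @Override
    public int hashCode() {
        if (!hashComputed) {
            // List.hashCode already mixes the elements' hash codes,
            // so there is no need to build a String first.
            cachedHash = Words.hashCode();
            hashComputed = true;
        }
        return cachedHash;
    }
}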
Given that I have some class with various fields in it:
class MyClass {
    private String s;
    private MySecondClass c;
    private Collection<someInterface> coll;
    // ...

    @Override public int hashCode() {
        // ????
    }
}
and I have various objects of that class which I'd like to store in a HashMap. For that, I need the hashCode() of MyClass.
I'll have to go into all fields and respective parent classes recursively to make sure they all implement hashCode() properly, because otherwise hashCode() of MyClass might not take into consideration some values. Is this right?
What do I do with that Collection? Can I always rely on its hashCode() method? Will it take into consideration all child values that might exist in my someInterface object?
I OPENED A SECOND QUESTION regarding the actual problem of uniquely IDing an object here: How do I generate an (almost) unique hash ID for objects?
Clarification:
Is there anything more or less unique in your class? The String s? Then only use that as the hash code.
The hashCode() of two MyClass objects should definitely differ if any of the values in the coll of one of the objects is changed. hashCode should only return the same value if all fields of two objects store the same values, recursively. Basically, there is some time-consuming calculation going on for a MyClass object. I want to save this time if the calculation has already been done with the exact same values some time ago. For this purpose, I'd like to look up in a HashMap whether the result is already available.
Would you be using MyClass in a HashMap as the key or as the value? If the key, you have to override both equals() and hashCode()
Thus, I'm using the hashCode OF MyClass as the key in a HashMap. The value (calculation result) will be something different, like an Integer (simplified).
What do you think equality should mean for multiple collections? Should it depend on element ordering? Should it only depend on the absolute elements that are present?
Wouldn't that depend on the kind of Collection that is stored in coll? Though I guess ordering is not really important, no.
The response you get from this site is gorgeous. Thank you all
@AlexWien that depends on whether that collection's items are part of the class's definition of equivalence or not.
Yes, yes they are.
I'll have to go into all fields and respective parent classes recursively to make sure they all implement hashCode() properly, because otherwise hashCode() of MyClass might not take into consideration some values. Is this right?
That's correct. It's not as onerous as it sounds because the rule of thumb is that you only need to override hashCode() if you override equals(). You don't have to worry about classes that use the default equals(); the default hashCode() will suffice for them.
Also, for your class, you only need to hash the fields that you compare in your equals() method. If one of those fields is a unique identifier, for instance, you could get away with just checking that field in equals() and hashing it in hashCode().
All of this is predicated upon you also overriding equals(). If you haven't overridden that, don't bother with hashCode() either.
What do I do with that Collection? Can I always rely on its hashCode() method? Will it take into consideration all child values that might exist in my someInterface object?
Yes, you can rely on any collection type in the Java standard library to implement hashCode() correctly. And yes, any List or Set will take into account its contents (it will mix together the items' hash codes).
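For example, a quick check you can run yourself: two separately built lists with the same contents report the same hash code, because List.hashCode is defined in terms of the elements:

import java.util.Arrays;
import java.util.List;

class ListHashDemo {
    public static void main(String[] args) {
        List<String> a = Arrays.asList("x", "y");
        List<String> b = Arrays.asList("x", "y");
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}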
So you want to do a calculation on the contents of your object that will give you a unique key you'll be able to check in a HashMap whether the "heavy" calculation that you don't want to do twice has already been done for a given deep combination of fields.
Using hashCode alone:
I believe hashCode is not the appropriate thing to use in the scenario you are describing.
hashCode should always be used in association with equals(). It's part of its contract, and it's an important part, because hashCode() returns an integer, and although one may try to make hashCode() as well-distributed as possible, it is not going to be unique for every possible object of the same class, except for very specific cases (It's easy for Integer, Byte and Character, for example...).
If you want to see for yourself, try generating strings of up to 4 letters (lower and upper case), and see how many of them have identical hash codes.
HashMap therefore uses both the hashCode() and equals() method when it looks for things in the hash table. There will be elements that have the same hashCode() and you can only tell if it's the same element or not by testing all of them using equals() against your class.
Using hashCode and equals together
In this approach, you use the object itself as the key in the hash map, and give it an appropriate equals method.
To implement the equals method, you need to go deeply into all your fields. All of their classes must have an equals() that matches what you think of as equal for the sake of your big calculation. Special care needs to be taken when your objects implement an interface. If the calculation is based on calls to that interface, and different objects that implement the interface return the same value in those calls, then they should implement equals in a way that reflects that.
And their hashCode is supposed to match the equals - when the values are equal, the hashCode must be equal.
You then build your equals and hashCode based on all those items. You may use Objects.equals(Object, Object) and Objects.hash(Object...) to save yourself a lot of boilerplate code.
But is this a good approach?
While you can cache the result of hashCode() in the object and re-use it without calculation as long as you don't mutate it, you can't do that for equals. This means that calculation of equals is going to be lengthy.
So depending on how many times the equals() method is going to be called for each object, this is going to be exacerbated.
If, for example, you are going to have 30 objects in the hashMap, but 300,000 objects are going to come along and be compared to them only to realize that they are equal to them, you'll be making 300,000 heavy comparisons.
If you're only going to have very few instances in which an object is going to have the same hashCode or fall in the same bucket in the HashMap, requiring comparison, then going the equals() way may work well.
If you decide to go this way, you'll need to remember:
If the object is a key in a HashMap, it should not be mutated as long as it's there. If you need to mutate it, you may need to make a deep copy of it and keep the copy in the hash map. Deep copying again requires consideration of all the objects and interfaces inside to see if they are copyable at all.
Creating a unique key for each object
Back to your original idea, we have established that hashCode is not a good candidate for a key in a hash map. A better candidate for that would be a hash function such as md5 or sha1 (or more advanced hashes, like sha256, but you don't need cryptographic strength in your case), where collisions are a lot rarer than a mere int. You could take all the values in your class, transform them into a byte array, hash it with such a hash function, and take its hexadecimal string value as your map key.
Naturally, this is not a trivial calculation. So you need to think if it's really saving you much time over the calculation you are trying to avoid. It is probably going to be faster than repeatedly calling equals() to compare objects, as you do it only once per instance, with the values it had at the time of the "big calculation".
For a given instance, you could cache the result and not calculate it again unless you mutate the object. Or you could just calculate it again only just before doing the "big calculation".
However, you'll need the "cooperation" of all the objects you have inside your class. That is, they will all need to be reasonably convertible into a byte array in such a way that two equivalent objects produce the same bytes (including the same issue with the interface objects that I mentioned above).
You should also beware of situations in which you have, for example, two strings "AB" and "CD" which will give you the same result as "A" and "BCD", and then you'll end up with the same hash for two different objects.
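Here is a rough sketch of that approach (the class and method names are made up for illustration); writing each field's length before its bytes avoids the "AB"+"CD" versus "A"+"BCD" ambiguity just mentioned:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestKey {
    static String keyFor(String... fields) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (String f : fields) {
                byte[] bytes = f.getBytes(StandardCharsets.UTF_8);
                // Prefix each field with its length so that field boundaries
                // cannot shift: ("AB","CD") and ("A","BCD") hash differently.
                md.update(Integer.toString(bytes.length).getBytes(StandardCharsets.UTF_8));
                md.update((byte) ':');
                md.update(bytes);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("SHA-256 should always be available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(keyFor("AB", "CD").equals(keyFor("A", "BCD"))); // false
    }
}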
For future readers.
Yes, equals and hashCode go hand in hand.
Below shows a typical implementation using a helper library, but it really shows the "hand in hand" nature. And the helper library from apache keeps things simpler IMHO:
@Override
public boolean equals(Object o) {
    if (this == o) {
        return true;
    }
    if (o == null || getClass() != o.getClass()) {
        return false;
    }
    MyCustomObject castInput = (MyCustomObject) o;
    boolean returnValue = new org.apache.commons.lang3.builder.EqualsBuilder()
            .append(this.getPropertyOne(), castInput.getPropertyOne())
            .append(this.getPropertyTwo(), castInput.getPropertyTwo())
            .append(this.getPropertyThree(), castInput.getPropertyThree())
            .append(this.getPropertyN(), castInput.getPropertyN())
            .isEquals();
    return returnValue;
}

@Override
public int hashCode() {
    return new org.apache.commons.lang3.builder.HashCodeBuilder(17, 37)
            .append(this.getPropertyOne())
            .append(this.getPropertyTwo())
            .append(this.getPropertyThree())
            .append(this.getPropertyN())
            .toHashCode();
}
17, 37: these are values you can pick yourself.
From your clarifications:
You want to store MyClass in a HashMap as the key.
This means the hashCode() is not allowed to change after adding the object.
So if your collections may change after object instantiation, they should not be part of the hashCode().
From http://docs.oracle.com/javase/8/docs/api/java/util/Map.html
Note: great care must be exercised if mutable objects are used as map
keys. The behavior of a map is not specified if the value of an object
is changed in a manner that affects equals comparisons while the
object is a key in the map.
For 20-100 objects it is not worth taking the risk of an inconsistent hashCode() or equals() implementation.
There is no need to override hashCode() and equals() in your case.
If you don't override them, Java uses the unique object identity for equals and hashCode() (and that works, especially because you stated that you don't need an equals() that considers the values of the object's fields).
When using the default implementation, you are on the safe side.
Making an error, such as using a custom hashCode() as the key in the HashMap when the hash code changes after insertion (because you used the hashCode() of the collections as part of your object's hash code), may result in an extremely hard-to-find bug.
If you need to find out whether the heavy calculation is finished, I would not abuse equals(). Just write your own method objectStateValue() and call hashCode() on the collection, too. This then does not interfere with the object's hashCode and equals().
public int objectStateValue() {
    // TODO make sure the fields are not null
    return 31 * s.hashCode() + coll.hashCode();
}
Another, simpler possibility: the code that does the time-consuming calculation can increase a calculationCounter by one as soon as the calculation is ready. You then just check whether or not the counter has changed. This is much cheaper and simpler.
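A minimal sketch of that counter idea (all names here are illustrative):

class HeavyCalculation {
    private volatile int calculationCounter = 0;

    void run() {
        // ... the time-consuming work ...
        calculationCounter++; // signal that a new result is ready
    }

    // Callers remember the last counter value they saw and compare it
    // with this one instead of deep-comparing the object's state.
    int counter() {
        return calculationCounter;
    }
}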
This may not be the real world scenario but just curious to know what happens, below is the code.
I am creating a set of object of class UsingSet.
According to the hashing concept in Java, when I first add an object which contains "a", it will create a bucket for hash code 97 and put the object inside it.
Again, when it encounters another object with "a", it will call the overridden hashCode method in the class UsingSet and get hash code 97; so what happens next?
As I have not overridden the equals method, the default implementation will return false. So where will the object with value "a" be kept: in the same bucket where the previous object with hash code 97 was kept, or will it create a new bucket?
Does anybody know how it will be stored internally?
/* package whatever; // don't place package name! */

import java.util.*;
import java.lang.*;
import java.io.*;

class UsingSet {
    String value;

    public UsingSet(String value) {
        this.value = value;
    }

    public String toString() {
        return value;
    }

    public int hashCode() {
        int hash = value.hashCode();
        System.out.println("hashcode called" + hash);
        return hash;
    }

    public static void main(String args[]) {
        java.util.Set s = new java.util.HashSet();
        s.add(new UsingSet("A"));
        s.add(new UsingSet("b"));
        s.add(new UsingSet("a"));
        s.add(new UsingSet("b"));
        s.add(new UsingSet("a"));
        s.add(new Integer(1));
        s.add(new Integer(1));
        System.out.println("s = " + s);
    }
}
output is:
hashcode called65
hashcode called98
hashcode called97
hashcode called98
hashcode called97
s = [1, b, b, A, a, a]
HashCode & Equals methods
Only override hashCode, use the default equals:
Only references to the same object will compare as equal. In other words, the objects you expected to be equal will not be equal when calling the equals method.
Only override equals, use the default hashCode: there might be duplicates in the HashMap or HashSet. We write the equals method and expect {"abc", "ABC"} to be equal. However, when used in a HashMap, they might land in different buckets, so the contains() method will not detect them.
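A small demonstration of the second case (the class name and the equalsIgnoreCase comparison are just for illustration):

import java.util.HashSet;
import java.util.Set;

final class EqualsOnly {
    private final String value;

    EqualsOnly(String value) { this.value = value; }

    @Override
    public boolean equals(Object o) {
        return o instanceof EqualsOnly && ((EqualsOnly) o).value.equalsIgnoreCase(value);
    }
    // hashCode deliberately NOT overridden: the default identity hash is used.

    public static void main(String[] args) {
        Set<EqualsOnly> set = new HashSet<>();
        set.add(new EqualsOnly("abc"));
        set.add(new EqualsOnly("ABC"));
        // Almost certainly 2: the "equal" objects land in different buckets.
        System.out.println(set.size());
    }
}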
James Large's answer is incorrect, or rather misleading (and partly incorrect as well). I will explain.
If two objects are equal according to their equals() method, they must also have the same hash code.
If two objects have the same hash code, they do NOT have to be equal too.
Here is the actual wording from the java.lang.Object documentation:
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.
It is true that if two objects don't have the same hash, then they are not equal. However, hashing is not a way to check equality, so it is wildly incorrect to say that it is a faster way to check equality.
It is also wildly incorrect to say the hashCode function is an efficient way to do anything. This is all up to the implementation, but the hashCode of a String becomes expensive as the String gets large: it performs a calculation based on each char of the String, so if you are using large Strings as keys, this becomes very inefficient; more so if you have a large number of buckets.
In a Map (a HashSet uses a HashMap internally), there are buckets, and each bucket holds a linked list. Java uses the hashCode() function to find out which bucket an entry belongs in (it actually modifies the hash, depending on how many buckets exist). Since two objects may share the same hash, it then iterates through the linked list sequentially, checking the equals() method to see if the object is a duplicate. Per the java.util.Set documentation:
A collection that contains no duplicate elements.
So, if its hashCode() leads it to a bucket that already contains an Object for which .equals() evaluates to true, then the new element is not added to the Set (for a Map.put, the existing key is kept and only its value is replaced). You can look here for more information:
How does a Java HashMap handle different objects with the same hash code?
Generally speaking though, it is good practice that if you override the hashCode function, you also override the equals function (if I'm not mistaken, it breaks the contract if you choose not to).
Put simply, you can think of the hashCode and equals methods as a 2D search,
where the hash code selects the row and the list of objects forms the column.
Consider the following class structure.
public class obj {
    int id;
    String name;

    public obj(String name, int id) {
        this.id = id;
        this.name = name;
    }
}
Now if you create the objects like this:
obj obj1 = new obj("Hassu", 1);
obj obj2 = new obj("Hoor", 2);
obj obj3 = new obj("Heniel", 3);
obj obj4 = new obj("Hameed", 4);
obj obj5 = new obj("Hassu", 1);
and you place these objects in a map like this:
HashMap hMap = new HashMap();
1. hMap.put(obj1, "value1");
2. hMap.put(obj2, "value2");
3. hMap.put(obj3, "value3");
4. hMap.put(obj4, "value4");
5. hMap.put(obj5, "value5");
Now, if you have not overridden hashCode and equals, then after putting all the objects up to line 5, obj5 ends up in its own row: with the default hashCode, each object gets a different hash code, so the row (bucket) will be different.
So in memory at runtime it will be stored like this:
| hashcode | Objects |
|----------|---------|
| 000562   | obj1    |
| 000552   | obj2    |
| 000588   | obj3    |
| 000546   | obj4    |
| 000501   | obj5    |
Now if you create an object with the same content, like:
obj obj6 = new obj("hassu", 1);
and you search for this key in the map, like
if (hMap.containsKey(obj6))
or
hMap.get(obj6);
then even though a key (obj1) with the same content is available, you will get false and null respectively.
Now, if you override only the equals method and perform the same search, you will still get null, because the hash code for obj6 is different, and under that hash code no key will be found.
Now, if you override only the hashCode method, you will get the same bucket (hash code row), but the contents can't be compared, so the reference-based check inherited from Object is used instead.
So here, if you search for the key with hMap.get(obj6), you will get the correct hash code (000562), but since the references of obj1 and obj6 are different, you will still get null.
A Set will behave differently. Without both methods overridden, uniqueness is not achieved, because uniqueness relies on both the hashCode and equals methods. With both overridden, the output would be s = [A, a, b, 1] instead of the earlier one. Apart from that, remove and contains won't work correctly either.
Without looking at your code...
The whole point of hash codes is to speed up the process of testing two objects for equality. It can be costly to test whether two large, complex objects are equal, but it is trivially easy to compare their hash codes, and hash codes can be pre-computed.
The rule is: If two objects don't have the same hash code, that means they are not equal. No need to do the expensive equality test.
So, the answer to the question in your title: If you define an equals() method that says object A is equal to object B, and you define a hashCode() method that says object A is not equal to object B (i.e., it says they have different hash codes), and then you hand those two objects to some library that cares whether they are equal or not (e.g., if you put them in a hash table), then the behavior of the library is going to be undefined (i.e., probably wrong).
Added information: Wow! I really missed seeing the forest for the trees here---thinking about the purpose of hashCode() without putting it in the context of HashMap. If m is a Map with N entries, and k is a key; what is the purpose of calling m.get(k)? The purpose, obviously, is to search the map for an entry whose key is equal to k.
What if hash codes and hash maps had not been invented? Well the best you could do, assuming that the keys have a natural, total order, is to search a TreeMap, comparing the given key for equality with O(log(N)) other keys. In the worst case, where the keys have no order, you would have to compare the given key for equality with every key in the map until you either find a match or tested them all. In other words, the complexity of m.get(k) would be O(N).
When m is a HashMap, the complexity of m.get(k) is O(1), whether the keys can be ordered or not.
So, I messed up by saying that the point of hash codes was to speed up the process of testing two objects for equality. It's really about testing an object for equality with a whole collection of other objects. That's where comparing hash codes doesn't just help a little; It helps by orders of magnitude...
...If the k.hashCode() and k.equals(o) methods obey the rule: j.hashCode()!=k.hashCode() implies !j.equals(k).
I'm profiling some old Java code, and it appears that my caching of values using a static HashMap and an access method does not work.
Caching code (a bit abstracted):
static HashMap<Key, Value> cache = new HashMap<Key, Value>();

public static Value getValue(Key key) {
    System.out.println("cache size=" + cache.size());
    if (cache.containsKey(key)) {
        System.out.println("cache hit");
        return cache.get(key);
    } else {
        System.out.println("no cache hit");
        Value value = calcValue();
        cache.put(key, value);
        return value;
    }
}
Profiling code:
for (int i = 0; i < 100; i++) {
    getValue(new Key());
}
Result output:
cache size=0
no cache hit
(..)
cache size=99
no cache hit
It looked like a standard error in Key's hashing code or equals code.
However:
new Key().hashcode == new Key().hashcode // TRUE
new Key().equals(new Key()) // TRUE
What's especially weird is that cache.put(key, value) just adds another value to the hashmap, instead of replacing the current one.
So, I don't really get what's going on here. Am I doing something wrong?
Edit:
OK, I see that in the real code the Key gets used in other methods and changes, which is therefore reflected in the hashCode of the object in the HashMap. Could that be the cause of this behaviour, that it goes missing?
On a proper @Override of equals/hashCode
I'm not convinced that you @Override (you are using the annotation, right?) hashCode/equals properly. If you didn't use @Override, you may have defined int hashcode(), or boolean equals(Key), neither of which would do what is required.
On key mutation
If you are mutating the keys of the map, then yes, trouble will ensue. From the documentation:
Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.
Here's an example:
Map<List<Integer>, String> map = new HashMap<List<Integer>, String>();
List<Integer> theOneKey = new ArrayList<Integer>();
map.put(theOneKey, "theOneValue");
System.out.println(map.containsKey(theOneKey)); // prints "true"

theOneKey.add(42);
System.out.println(map.containsKey(theOneKey)); // prints "false"
By the way, prefer interfaces to implementation classes in type declarations. Here's a quote from Effective Java 2nd Edition, Item 52: Refer to objects by their interfaces:
[...] you should favor the use of interfaces rather than classes to refer to objects. If appropriate interface types exist, then parameters, return values, variables, and fields should all be declared using interface types.
In this case, if at all possible, you should declare cache as simply a Map instead of a HashMap.
I'd recommend double and triple checking the equals and hashCode methods. Note that it's hashCode, not hashcode.
Looking at the (abstracted) code, everything seems to be in order. It may be that the actual code is not like your redacted version, and that this is more a reflection of how you expect the code to work and not what is happening in practice!
If you can post the code, please do that. In the meantime, here are some pointers to try:
After adding a Key, use exactly the same Key instance again, and verify that it produces a cache hit.
In your test, verify the hashcodes are equal, and that the objects are equal.
Is the Map implementation really a HashMap? WeakHashMap will behave in the way you describe once the keys are no longer reachable.
I'm not sure what your Key class is, but (abstractly, similarly to you) what I'd do for a simple check is:
Key k1 = new Key();
Key k2 = new Key();
System.out.println("k1 hash:" + k1.hashCode());
System.out.println("k2 hash:" + k2.hashCode());
System.out.println("ks equal:" + k1.equals(k2));
getValue(k1);
getValue(k2);
If this code shows the anomaly (same hash code, equal keys, yet no cache hit), then there's cause to worry (or, better, to debug your Key class ;-). The way you're testing, with new Keys all the time, might produce keys that don't necessarily behave the same way.