Related
In Java, obj.hashCode() returns some value. What is the use of this hash code in programming?
hashCode() is used for bucketing in Hash implementations like HashMap, HashTable, HashSet, etc.
The value received from hashCode() is used as the bucket number for storing elements of the set/map. This bucket number is the address of the element inside the set/map.
When you do contains() it will take the hash code of the element, then look for the bucket where hash code points to. If more than 1 element is found in the same bucket (multiple objects can have the same hash code), then it uses the equals() method to evaluate if the objects are equal, and then decide if contains() is true or false, or decide if element could be added in the set or not.
From the Javadoc:
Returns a hash code value for the object. This method is supported for the benefit of hashtables such as those provided by java.util.Hashtable.
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java programming language.)
hashCode() is a function that takes an object and outputs a numeric value. The hashcode for an object is always the same if the object doesn't change.
Functions like HashMap, HashTable, HashSet, etc. that need to store objects will use a hashCode modulo the size of their internal array to choose in what "memory position" (i.e. array position) to store the object.
There are some cases where collisions may occur (two objects end up with the same hashcode), and that, of course, needs to be solved carefully.
The value returned by hashCode() is the object's hash code, which is the object's memory address in hexadecimal.
By definition, if two objects are equal, their hash code must also be equal. If you override the equals() method, you change the way two objects are equated and Object's implementation of hashCode() is no longer valid. Therefore, if you override the equals() method, you must also override the hashCode() method as well.
This answer is from the java SE 8 official tutorial documentation
A hashcode is a number generated from any object.
This is what allows objects to be stored/retrieved quickly in a Hashtable.
Imagine the following simple example:
On the table in front of you. you have nine boxes, each marked with a number 1 to 9. You also have a pile of wildly different objects to store in these boxes, but once they are in there you need to be able to find them as quickly as possible.
What you need is a way of instantly deciding which box you have put each object in. It works like an index. you decide to find the cabbage so you look up which box the cabbage is in, then go straight to that box to get it.
Now imagine that you don't want to bother with the index, you want to be able to find out immediately from the object which box it lives in.
In the example, let's use a really simple way of doing this - the number of letters in the name of the object. So the cabbage goes in box 7, the pea goes in box 3, the rocket in box 6, the banjo in box 5 and so on.
What about the rhinoceros, though? It has 10 characters, so we'll change our algorithm a little and "wrap around" so that 10-letter objects go in box 1, 11 letters in box 2 and so on. That should cover any object.
Sometimes a box will have more than one object in it, but if you are looking for a rocket, it's still much quicker to compare a peanut and a rocket, than to check a whole pile of cabbages, peas, banjos, and rhinoceroses.
That's a hash code. A way of getting a number from an object so it can be stored in a Hashtable. In Java, a hash code can be any integer, and each object type is responsible for generating its own. Lookup the "hashCode" method of Object.
Source - here
Although hashcode does nothing with your business logic, we have to take care of it in most cases. Because when your object is put into a hash based container(HashSet, HashMap...), the container puts/gets the element's hashcode.
hashCode() is a unique code which is generated by the JVM for every object creation.
We use hashCode() to perform some operation on hashing related algorithm like Hashtable, Hashmap etc..
The advantages of hashCode() make searching operation easy because when we search for an object that has unique code, it helps to find out that object.
But we can't say hashCode() is the address of an object. It is a unique code generated by JVM for every object.
That is why nowadays hashing algorithm is the most popular search algorithm.
One of the uses of hashCode() is building a Catching mechanism.
Look at this example:
class Point
{
public int x, y;
public Point(int x, int y)
{
this.x = x;
this.y = y;
}
#Override
public boolean equals(Object o)
{
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Point point = (Point) o;
if (x != point.x) return false;
return y == point.y;
}
#Override
public int hashCode()
{
int result = x;
result = 31 * result + y;
return result;
}
class Line
{
public Point start, end;
public Line(Point start, Point end)
{
this.start = start;
this.end = end;
}
#Override
public boolean equals(Object o)
{
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Line line = (Line) o;
if (!start.equals(line.start)) return false;
return end.equals(line.end);
}
#Override
public int hashCode()
{
int result = start.hashCode();
result = 31 * result + end.hashCode();
return result;
}
}
class LineToPointAdapter implements Iterable<Point>
{
private static int count = 0;
private static Map<Integer, List<Point>> cache = new HashMap<>();
private int hash;
public LineToPointAdapter(Line line)
{
hash = line.hashCode();
if (cache.get(hash) != null) return; // we already have it
System.out.println(
String.format("%d: Generating points for line [%d,%d]-[%d,%d] (no caching)",
++count, line.start.x, line.start.y, line.end.x, line.end.y));
}
Suppose I have this class:
public class Node implements Comparable<Node>{
public float key;
public TreeSet<Node> neighbors;
public Node{
//fill neighbors somehow
}
#Override
public int compareTo(Node n) {
if(this.key == n.key)
return 0;
else if(this.key > n.key)
return 1;
else
return -1;
}
}
So this is a classic node of a graph, where each node is connected to a set of nodes (i.e. its neighbors). I'm using TreeSet because I often (very often) to know all the neighbors with their key bigger (smaller) than a certain value. Now, let's suppose I have this method:
//swap nodes keys
void swapKeys(Node a, Node b){
float ak = a.key;
a.key = b.key;
b.key = ak;
}
Notice that this method changes only the two nodes keys, nothing more.
Do this "break" the structure, or everything will continue to work fine?
If this breaks the structure, what about this simple solution:
//swap nodes keys
void swapKeys(Node a, Node b){
a.remove(b);
b.remove(a);
float ak = a.key;
a.key = b.key;
b.key = ak;
a.add(b);
b.add(a);
}
From the TreeSet documentation :
Note that the ordering maintained by a set (whether or not an explicit
comparator is provided) must be consistent with equals if it is to
correctly implement the Set interface.
Your Node class' Comparable implementation is not consistent with equals. (compareTo can return 0 for two Node instances wich are not equal).
This in itself makes your Node class unfit to be elements of a TreeSet.
Even the proposed workaround is not sufficient.
You may be tempted to fix this by implementing equals() (and hashCode()) to be based upon the value contained in the node. But to no avail, as this would go against a warning on the documention of the general Set interface :
Note: Great care must be exercised if mutable objects are used as set
elements. The behavior of a set is not specified if the value of an
object is changed in a manner that affects equals comparisons while
the object is an element in the set. A special case of this
prohibition is that it is not permissible for a set to contain itself
as an element.
So adding equals and hashCode is still not sufficient : your instances must also be immutable.
However the simplest solution, seems to be to forego the Comparable interface altogether, to not implement equals and hashCode, and to simply use a HashSet instead of a TreeSet. In that case you can change the contents of your nodes without consequences to the proper functioning of the set of neighbours.
Have the following class:
public class Member {
private int x;
private long y;
private double d;
public Member(int x, long y, double d) {
this.x = x;
this.y = y;
this.d = d;
}
#Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + x;
result = (int) (prime * result + y);
result = (int) (prime * result + Double.doubleToLongBits(d));
return result;
}
#Override
public boolean equals(Object obj) {
if (this == obj) {
return true;
}
if (obj instanceof Member) {
Member other = (Member) obj;
return other.x == x && other.y == y
&& Double.compare(d, other.d) == 0;
}
return false;
}
public static void main(String[] args) {
Set<Member> test = new HashSet<Member>();
Member b = new Member(1, 2, 3);
test.add(b);
System.out.println(b.hashCode());
b.x = 0;
System.out.println(b.hashCode());
Member first = test.iterator().next();
System.out.println(test.contains(first));
System.out.println(b.equals(first));
System.out.println(test.add(first));
}
}
It produces the following results:
30814
29853
false
true
true
Because the hashCode depends of the state of the object it can no longer by retrieved properly, so the check for containment fails. The HashSet in no longer working properly. A solution would be to make Member immutable, but is that the only solution? Should all classes added to HashSets be immutable? Is there any other way to handle the situation?
Regards.
Objects in hashsets should either be immutable, or you need to exercise discipline in not changing them after they've been used in a hashset (or hashmap).
In practice I've rarely found this to be a problem - I rarely find myself needing to use complex objects as keys or set elements, and when I do it's usually not a problem just not to mutate them. Of course if you've exposed the references to other code by this time, it can become harder.
Yes. While maintaining your class mutable, you can compute the hashCode and the equals methods based on immutable values of the class ( perhaps a generated id ) to adhere to the hashCode contract defined in Object class:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Depending on your situation this may be easier or not.
class Member {
private static long id = 0;
private long id = Member.id++;
// other members here...
public int hashCode() { return this.id; }
public boolean equals( Object o ) {
if( this == o ) { return true; }
if( o instanceOf Member ) { return this.id == ((Member)o).id; }
return false;
}
...
}
If you need a thread safe attribute, you may consider use: AtomicLong instead, but again, it depends on how are you going to use your object.
As already mentioned, one can accept the following three solutions:
Use immutable objects; even when your class is mutable, you may use immutable identities on your hashcode implementation and equals checking, eg an ID-like value.
Similarly to the above, implement add/remove to get a clone of the inserted object, not the actual reference. HashSet does not offer a get function (eg to allow you alter the object later on); thus, you are safe there won't exist duplicates.
Exercise discipline in not changing them after they've been used, as #Jon Skeet suggests
But, if for some reason you really need to modify objects after being inserted to a HashSet, you need to find a way of "informing" your Collection with the new changes. To achieve this functionality:
You can use the Observer design pattern, and extend HashSet to implement the Observer interface. Your Member objects must be Observable and update the HashSet on any setter or other method that affects hashcode and/or equals.
Note 1: Extending 3, using 4: we may accept alterations, but those that do not create an already existing object (eg I updated a user's ID, by assigning a new ID, not setting it to an existing one). Otherwise, you have to consider the scenario where an object is transformed in such a way that is now equal to another object already existing in the Set. If you accept this limitation, 4th suggestion will work fine, else you must be proactive and define a policy for such cases.
Note 2: You have to provide both previous and current states of the altered object on your update implementation, because you have to initially remove the older element (eg use getClone() before setting new values), then add the object with the new state. The following snippet is just an example implementation, it needs changes based on your policy of adding a duplicate.
#Override
public void update(Observable newItem, Object oldItem) {
remove(oldItem);
if (add(newItem))
newItem.addObserver(this);
}
I've used similar techniques on projects, where I require multiple indices on a class, so I can look up with O(1) for Sets of objects that share a common identity; imagine it as a MultiKeymap of HashSets (this is really useful, as you can then intersect/union indices and work similarly to SQL-like searching). In such cases I annotate methods (usually setters) that must fireChange-update each of the indices when a significant change occurs, so indices are always updated with the latest states.
Jon Skeet has listed all alternatives. As for why the keys in a Map or Set must not change:
The contract of a Set implies that at any time, there are no two objects o1 and o2 such that
o1 != o2 && set.contains(o1) && set.contains(o2) && o1.equals(o2)
Why that is required is especially clear for a Map. From the contract of Map.get():
More formally, if this map contains a mapping from a key
k to a value v such that (key==null ? k==null : key.equals(k)), then this method returns v, otherwise it returns null. (There can be at most one such mapping.)
Now, if you modify a key inserted into a map, you might make it equal to some other key already inserted. Moreover, the map can not know that you have done so. So what should the map do if you then do map.get(key), where key is equal to several keys in the map? There is no intuitive way to define what that would mean - chiefly because our intuition for these datatypes is the mathematical ideal of sets and mappings, which don't have to deal with changing keys, since their keys are mathematical objects and hence immutable.
Theoretically (and more often than not practically too) your class either:
has a natural immutable identity that can be inferred from a subset of its fields, in which case you can use those fields to generate the hashCode from.
has no natural identity, in which case using a Set to store them is unnecessary, you could just as well use a List.
Never change 'hashable field" after putting in hash based container.
As if you (Member) registered your phone number (Member.x) in yellow page(hash based container), but you changed your number, then no one can find you in the yellow page any more.
In Java, obj.hashCode() returns some value. What is the use of this hash code in programming?
hashCode() is used for bucketing in Hash implementations like HashMap, HashTable, HashSet, etc.
The value received from hashCode() is used as the bucket number for storing elements of the set/map. This bucket number is the address of the element inside the set/map.
When you do contains() it will take the hash code of the element, then look for the bucket where hash code points to. If more than 1 element is found in the same bucket (multiple objects can have the same hash code), then it uses the equals() method to evaluate if the objects are equal, and then decide if contains() is true or false, or decide if element could be added in the set or not.
From the Javadoc:
Returns a hash code value for the object. This method is supported for the benefit of hashtables such as those provided by java.util.Hashtable.
The general contract of hashCode is:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java programming language.)
hashCode() is a function that takes an object and outputs a numeric value. The hashcode for an object is always the same if the object doesn't change.
Functions like HashMap, HashTable, HashSet, etc. that need to store objects will use a hashCode modulo the size of their internal array to choose in what "memory position" (i.e. array position) to store the object.
There are some cases where collisions may occur (two objects end up with the same hashcode), and that, of course, needs to be solved carefully.
The value returned by hashCode() is the object's hash code, which is the object's memory address in hexadecimal.
By definition, if two objects are equal, their hash code must also be equal. If you override the equals() method, you change the way two objects are equated and Object's implementation of hashCode() is no longer valid. Therefore, if you override the equals() method, you must also override the hashCode() method as well.
This answer is from the java SE 8 official tutorial documentation
A hashcode is a number generated from any object.
This is what allows objects to be stored/retrieved quickly in a Hashtable.
Imagine the following simple example:
On the table in front of you. you have nine boxes, each marked with a number 1 to 9. You also have a pile of wildly different objects to store in these boxes, but once they are in there you need to be able to find them as quickly as possible.
What you need is a way of instantly deciding which box you have put each object in. It works like an index. you decide to find the cabbage so you look up which box the cabbage is in, then go straight to that box to get it.
Now imagine that you don't want to bother with the index, you want to be able to find out immediately from the object which box it lives in.
In the example, let's use a really simple way of doing this - the number of letters in the name of the object. So the cabbage goes in box 7, the pea goes in box 3, the rocket in box 6, the banjo in box 5 and so on.
What about the rhinoceros, though? It has 10 characters, so we'll change our algorithm a little and "wrap around" so that 10-letter objects go in box 1, 11 letters in box 2 and so on. That should cover any object.
Sometimes a box will have more than one object in it, but if you are looking for a rocket, it's still much quicker to compare a peanut and a rocket, than to check a whole pile of cabbages, peas, banjos, and rhinoceroses.
That's a hash code. A way of getting a number from an object so it can be stored in a Hashtable. In Java, a hash code can be any integer, and each object type is responsible for generating its own. Lookup the "hashCode" method of Object.
Source - here
Although hashcode does nothing with your business logic, we have to take care of it in most cases. Because when your object is put into a hash based container(HashSet, HashMap...), the container puts/gets the element's hashcode.
hashCode() is a unique code which is generated by the JVM for every object creation.
We use hashCode() to perform some operation on hashing related algorithm like Hashtable, Hashmap etc..
The advantages of hashCode() make searching operation easy because when we search for an object that has unique code, it helps to find out that object.
But we can't say hashCode() is the address of an object. It is a unique code generated by JVM for every object.
That is why nowadays hashing algorithm is the most popular search algorithm.
One of the uses of hashCode() is building a Catching mechanism.
Look at this example:
class Point
{
public int x, y;
public Point(int x, int y)
{
this.x = x;
this.y = y;
}
#Override
public boolean equals(Object o)
{
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Point point = (Point) o;
if (x != point.x) return false;
return y == point.y;
}
#Override
public int hashCode()
{
int result = x;
result = 31 * result + y;
return result;
}
class Line
{
public Point start, end;
public Line(Point start, Point end)
{
this.start = start;
this.end = end;
}
#Override
public boolean equals(Object o)
{
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Line line = (Line) o;
if (!start.equals(line.start)) return false;
return end.equals(line.end);
}
#Override
public int hashCode()
{
int result = start.hashCode();
result = 31 * result + end.hashCode();
return result;
}
}
class LineToPointAdapter implements Iterable<Point>
{
private static int count = 0;
private static Map<Integer, List<Point>> cache = new HashMap<>();
private int hash;
public LineToPointAdapter(Line line)
{
hash = line.hashCode();
if (cache.get(hash) != null) return; // we already have it
System.out.println(
String.format("%d: Generating points for line [%d,%d]-[%d,%d] (no caching)",
++count, line.start.x, line.start.y, line.end.x, line.end.y));
}
I have a bunch of objects of a class Puzzle. I have overridden equals() and hashCode(). When it comes time to present the solutions to the user, I'd like to filter out all the Puzzles that are "similar" (by the standard I have defined), so the user only sees one of each.
Similarity is transitive.
Example:
Result of computations:
A (similar to A)
B (similar to C)
C
D
In this case, only A or D and B or C would be presented to the user - but not two similar Puzzles. Two similar puzzles are equally valid. It is only important that they are not both shown to the user.
To accomplish this, I wanted to use an ADT that prohibits duplicates. However, I don't want to change the equals() and hashCode() methods to return a value about similarity instead. Is there some Equalator, like Comparator, that I can use in this case? Or is there another way I should be doing this?
The class I'm working on is a Puzzle that maintains a grid of letters. (Like Scrabble.) If a Puzzle contains the same words, but is in a different orientation, it is considered to be similar. So the following to puzzle:
(2, 2): A
(2, 1): C
(2, 0): T
Would be similar to:
(1, 2): A
(1, 1): C
(1, 0): T
Okay you have a way of measuring similarity between objects. That means they form a Metric Space.
The question is, is your space also a Euclidean space like normal three dimensional space, or integers or something like that? If it is, then you could use a binary space partition in however many dimensions you've got.
(The question is, basically: is there a homomorphism between your objects and an n-dimensional real number vector? If so, then you can use techniques for measuring closeness of points in n-dimensional space.)
Now, if it's not a euclidean space then you've got a bigger problem. An example of a non-euclidean space that programers might be most familiar with would be the Levenshtein Distance between to strings.
If your problem is similar to seeing how similar a string is to a list of already existing strings then I don't know of any algorithms that would do that without O(n2) time. Maybe there are some out there.
But another important question is: how much time do you have? How many objects? If you have time or if your data set is small enough that an O(n2) algorithm is practical, then you just have to iterate through your list of objects to see if it's below a certain threshold. If so, reject it.
Just overload AbstractCollection and replace the Add function. Use an ArrayList or whatever. Your code would look kind of like this
class SimilarityRejector<T> extends AbstractCollection<T>{
ArrayList<T> base;
double threshold;
public SimilarityRejector(double threshold){
base = new ArrayList<T>();
this.threshold = threshold;
}
public void add(T t){
boolean failed = false;
for(T compare : base){
if(similarityComparison(t,compare) < threshold) faled = true;
}
if(!failed) base.add(t);
}
public Iterator<T> iterator() {
return base.iterator();
}
public int size() {
return base.size();
}
}
etc. Obviously T would need to be a subclass of some class that you can perform a comparison on. If you have a euclidean metric, then you can use a space partition, rather then going through every other item.
I'd use a wrapper class that overrides equals and hashCode accordingly.
private static class Wrapper {
public static final Puzzle puzzle;
public Wrapper(Puzzle puzzle) {
this.puzzle = puzzle;
}
#Override
public boolean equals(Object object) {
// ...
}
#Override
public int hashCode() {
// ...
}
}
and then you wrap all your puzzles, put them in a map, and get them out again…
public Collection<Collection<Puzzle>> method(Collection<Puzzles> puzzles) {
Map<Wrapper,<Collection<Puzzle>> map = new HashMap<Wrapper,<Collection<Puzzle>>();
for (Puzzle each: puzzles) {
Wrapper wrapper = new Wrapper(each);
Collection<Puzzle> coll = map.get(wrapper);
if (coll == null) map.put(wrapper, coll = new ArrayList<Puzzle>());
coll.add(puzzle);
}
return map.values();
}
Create a TreeSet using your Comparator
Adds all elements into the set
All duplicates are stripped out
Normally "similarity" is not a transitive relationship. So the first step would be to think of this in terms of equivalence rather than similarity. Equivalence is reflexive, symmetric and transitive.
Easy approach here is to define a puzzle wrapper whose equals() and hashCode() methods are implemented according to the equivalence relation in question.
Once you have that, drop the wrapped objects into a java.util.Set and that filters out duplicates.
IMHO, most elegant way was described by Gili (TreeSet with custom Comparator).
But if you like to make it by yourself, seems this easiest and clearest solution:
/**
* Distinct input list values (cuts duplications)
* #param items items to process
* #param comparator comparator to recognize equal items
* #return new collection with unique values
*/
public static <T> Collection<T> distinctItems(List<T> items, Comparator<T> comparator) {
List<T> result = new ArrayList<>();
for (int i = 0; i < items.size(); i++) {
T item = items.get(i);
boolean exists = false;
for (int j = 0; j < result.size(); j++) {
if (comparator.compare(result.get(j), item) == 0) {
exists = true;
break;
}
}
if (!exists) {
result.add(item);
}
}
return result;
}