Getting index of a key in a Hashtable in Java - java

I am writing an implementation of a Dictionary class. I am currently writing the add method
(public V add (K key, V value))
Algorithm:
    if the table is too full
        rehash
    grab the index based on the key
    probe with the index and key to resolve collisions
    if the table at the index is null, or has been removed
        increment the number of entries
        increment the number of locations used
        set the table at that index to a new tableentry
    else
        grab the value currently at that index
        set the value at the index to the new value
        return the old value
I can't figure out how to grab the index based on the key provided. I also don't know how to refer to a specific index in a Hashtable.
Thanks

I'm assuming that you are re-implementing a Dictionary/Hashtable from scratch for whatever reason (if you're not, and you are using a java.util.Hashtable, then the algorithm you're citing is completely out of place).
The classic approach when implementing an array-backed hashtable is to calculate the hash of the key, and then store the value at the array index given by that hash modulo the array size (after accounting for collisions in whatever way you choose).
So, "grab the index based on the key" would be index = hash(key) % backing_array_length;. Note that in Java hashCode() can be negative, and % then gives a negative result, so use Math.floorMod or clear the sign bit first.
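As a rough illustration of the algorithm in the question, here is a minimal sketch of an array-backed dictionary with linear probing. The names ProbingDictionary, TableEntry, getHashIndex and probe are made up for this example, and removal (and therefore the "has been removed" case) is left out to keep it short.

```java
final class ProbingDictionary<K, V> {
    // Hypothetical entry type: one key/value pair per array slot.
    private static final class TableEntry<K, V> {
        final K key;
        V value;
        TableEntry(K key, V value) { this.key = key; this.value = value; }
    }

    private TableEntry<K, V>[] table = newTable(8);
    private int numberOfEntries;

    @SuppressWarnings("unchecked")
    private static <K, V> TableEntry<K, V>[] newTable(int capacity) {
        return (TableEntry<K, V>[]) new TableEntry[capacity];
    }

    public V add(K key, V value) {
        // Keep the table at most half full so probing always finds a free slot.
        if (2 * (numberOfEntries + 1) > table.length) {
            rehash();
        }
        int index = probe(getHashIndex(key), key);
        if (table[index] == null) {
            table[index] = new TableEntry<>(key, value);
            numberOfEntries++;
            return null;
        }
        V oldValue = table[index].value;
        table[index].value = value;
        return oldValue;
    }

    public V getValue(K key) {
        TableEntry<K, V> entry = table[probe(getHashIndex(key), key)];
        return entry == null ? null : entry.value;
    }

    // "Grab the index based on the key": floorMod keeps the result non-negative.
    private int getHashIndex(K key) {
        return Math.floorMod(key.hashCode(), table.length);
    }

    // Linear probing: walk forward until the key or an empty slot is found.
    private int probe(int index, K key) {
        while (table[index] != null && !table[index].key.equals(key)) {
            index = (index + 1) % table.length;
        }
        return index;
    }

    // Double the array and re-insert every entry at its new index.
    private void rehash() {
        TableEntry<K, V>[] oldTable = table;
        table = newTable(oldTable.length * 2);
        numberOfEntries = 0;
        for (TableEntry<K, V> entry : oldTable) {
            if (entry != null) {
                add(entry.key, entry.value);
            }
        }
    }
}
```

"Referring to a specific index" is then just ordinary array access, table[index], on the backing array.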

Related

Hashmap with two key object having same value but different bucket index

Suppose a HashMap has two keys with the same value, e.g.:
HashMap map = new HashMap();
map.put("a", "abc");
map.put("a", "xyz");
Here I put two keys with the value "a", and suppose the bucket index is 1 for the first and 9 for the second.
So my question is: if the bucket index comes out different for the two after applying the hashing algorithm, how does the map avoid inserting a duplicate key, given that the key is already present and a HashMap cannot have duplicate keys?
Please share your view on this.
There won't be any such thing as "second bucket index".
I suggest you add something like System.out.println(map.toString()) in order to see what that second put() has done to your map.
EDIT:
In the method put(key, value), the "bucket index" is computed as a function of the key, not of the value (so "a" and "a" give the same bucket index). This function is deterministic, so feeding it the same key ("a" in your case) yields the same hashCode() and therefore the same bucket index.
In Java, if two objects have the same hash, their equality is determined by the equals() method. If the objects are found to be equal, the old one is simply replaced by the new one.
If instead the objects are not equal, they are chained in a linked list (or a balanced tree) and the map contains both objects, because they are different.
So, back to your question: "if bucket index for both is coming different after applying hashing algorithm" - this is impossible for equal objects. Equal objects must have the same hash code.
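A small sketch of what the two put() calls from the question do; the class and method names (DuplicateKeyDemo, build) are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class DuplicateKeyDemo {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put("a", "abc");
        // Same key, so same hashCode() and same bucket: the value is replaced.
        map.put("a", "xyz");
        return map;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints {a=xyz}
    }
}
```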
To make @Erwin's answer more clear, here's the source code of HashMap from the JDK:
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
Digging deeper, you will find that the bucket index is calculated from the key's hash code.
To put it simply: putting a duplicate key with different values into the same HashMap results in just one entry, because the second put overwrites the value of that entry.
If your question is how to create a hashmap that can handle more than one value for the same key, what you need is a Map<K, List<V>> (for example with ArrayList values), so that a new value is added to the list every time the key is the same.
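A minimal sketch of such a map of lists, using Map.computeIfAbsent (available since Java 8). The class name MultiValueDemo and the String type parameters are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MultiValueDemo {
    static Map<String, List<String>> build() {
        Map<String, List<String>> map = new HashMap<>();
        // Create the list on first use of the key, then append to it.
        map.computeIfAbsent("a", k -> new ArrayList<>()).add("abc");
        map.computeIfAbsent("a", k -> new ArrayList<>()).add("xyz");
        return map;
    }
}
```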

Iterating over a key set of a LinkedHashMap

is it possible to retrieve the next (next, as in the next key value which has been inserted) key value of a LinkedHashMap?
E.g. the current key value is 2 and the next key value is 4. I want to use the value of the next key without setting my iterator (or whatsoever) one index further. Apparently using one iterator doesn't seem to do the job. Another idea would be to cast the set returned by myHashMap.keySet() to some other implementing class but I am not sure if it is possible to retrieve the next element.
Any ideas?
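Assume there is a LinkedHashMap<K,V> named linkedHashMap. You can define a custom method getNextKey which takes an index as a parameter and returns the key at the next index, if it is valid.
Code snippet (a sketch; it copies the key set on every call and needs java.util.List and java.util.ArrayList):
public K getNextKey(int index) {
    // copy the keys into a list; the key set preserves insertion order
    List<K> keys = new ArrayList<K>(linkedHashMap.keySet());
    // return null if there is no key after the given index
    if (index < -1 || index + 1 >= keys.size()) return null;
    return keys.get(index + 1);
}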
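If you know the current key rather than its index, a single pass with an iterator also works without copying. This is only a sketch; NextKeyDemo and nextKeyAfter are hypothetical names, and the lookup is O(n) per call.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Objects;

final class NextKeyDemo {
    // Returns the key inserted right after 'current', or null if there is none.
    static <K> K nextKeyAfter(LinkedHashMap<K, ?> map, K current) {
        Iterator<K> it = map.keySet().iterator();
        while (it.hasNext()) {
            if (Objects.equals(it.next(), current)) {
                return it.hasNext() ? it.next() : null;
            }
        }
        return null;
    }
}
```

The iterator used here is local to the method, so an iterator you are already walking elsewhere is not advanced.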

Mapping int to int (in Java)

In Java.
How can I map a set of numbers(integers for example) to another set of numbers?
All the numbers are positive and all the numbers are unique in their own set.
The first set of numbers can have any value, the second set of numbers represent indexes of an array, and so the goal is to be able to access the numbers in the second set through the numbers in the first set. This is a one to one association.
Speed is crucial as the method will have to be called many times each second.
Edit: I tried it with SE hashmap implementation, but found it to be slow for my purposes.
There's an article devoted to this problem (with a solution): Implementing a world fastest Java int-to-int hash map
The code can be found in the related GitHub repository. (The best results are in the class IntIntMap4a.java.)
Citation from the article:
Summary
If you want to optimize your hash map for speed, you have to do as much as you can of the following list:
Use underlying array(s) with capacity equal to a power of 2 - it will allow you to use cheap & instead of expensive % for array index
Do not store the state in the separate array - use dedicated fields for free/removed keys and values.
Interleave keys and values in the one array - it will allow you to load a value into memory for free.
Implement a strategy to get rid of 'removed' cells - you can sacrifice some of remove performance in favor of more frequent get/put.
Scramble the keys while calculating the initial cell index - this is required to deal with the case of consecutive keys.
Yes, I know how to use citation formatting. But it looks awful and doesn't handle bullet lists well.
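To make the cited points more concrete, here is a minimal sketch that applies some of them: a power-of-two capacity with & instead of %, keys and values interleaved in one array, a dedicated field for the "free" key, and key scrambling. It is not the article's IntIntMap4a; the class name, the NO_VALUE constant and the scramble function are assumptions, and resizing and removal are omitted.

```java
final class IntIntMapSketch {
    private static final int FREE_KEY = 0;   // marks an empty cell
    static final int NO_VALUE = -1;          // assumed "not found" result

    private final int[] data;      // key at 2*cell, value at 2*cell + 1
    private final int mask;        // capacity - 1, valid because capacity is 2^n
    private int size;
    private boolean hasFreeKey;    // key 0 cannot live in the array, so it
    private int freeKeyValue;      // gets dedicated fields instead

    IntIntMapSketch(int capacity) {
        if (capacity < 2 || capacity > (1 << 29) || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        data = new int[capacity * 2];
        mask = capacity - 1;
    }

    // Spreads consecutive keys across the table (multiplicative scrambling).
    private static int scramble(int key) {
        int h = key * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    void put(int key, int value) {
        if (key == FREE_KEY) {
            hasFreeKey = true;
            freeKeyValue = value;
            return;
        }
        int cell = scramble(key) & mask;   // cheap & instead of %
        while (data[2 * cell] != FREE_KEY && data[2 * cell] != key) {
            cell = (cell + 1) & mask;      // linear probing
        }
        if (data[2 * cell] == FREE_KEY) {
            // Always keep one cell free so that probing terminates.
            if (size == mask) {
                throw new IllegalStateException("full; a real map would resize");
            }
            size++;
            data[2 * cell] = key;
        }
        data[2 * cell + 1] = value;
    }

    int get(int key) {
        if (key == FREE_KEY) {
            return hasFreeKey ? freeKeyValue : NO_VALUE;
        }
        int cell = scramble(key) & mask;
        while (data[2 * cell] != FREE_KEY) {
            if (data[2 * cell] == key) {
                return data[2 * cell + 1];
            }
            cell = (cell + 1) & mask;
        }
        return NO_VALUE;
    }
}
```

Because all numbers in the question are positive, -1 is a safe "not found" marker here; with arbitrary int values a different convention would be needed.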
The structure you are looking for is called an associative array. In computer science, an associative array, map, symbol table, or dictionary is an abstract data type composed of a collection of (key, value) pairs, such that each possible key appears just once in the collection.
In Java in particular, as already mentioned, this is easily done with a HashMap:
HashMap<Integer, Integer> cache = new HashMap<Integer, Integer>();
You can insert elements with the put method:
cache.put(21, 42);
and you can retrieve a value with get:
Integer key = 21;
Integer value = cache.get(key);
System.out.println("Key: " + key + " value: " + value);
which prints:
Key: 21 value: 42
If you want to iterate through the data, you can use an iterator over the key set:
Iterator<Integer> it = cache.keySet().iterator();
while (it.hasNext()) {
    Integer k = it.next();
    System.out.println("key: " + k + " value: " + cache.get(k));
}
Sounds like HashMap<Integer,Integer> is what you're looking for.
If you are willing to use an external library, you can use Apache's IntToIntMap, which is a part of Apache Lucene.
It implements a pretty efficient int-to-int map that uses primitives, so these operations do not suffer the boxing overhead.
If you have a limit for the size of the first list, you can just use a large array. Suppose you know the first list only has numbers 0-99: then you can use an int[100] and use the first number as the array index.
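A minimal sketch of that array approach, assuming the first set is limited to 0-99 as in the example above. ArrayLookupDemo, MAX_KEY and the -1 "unmapped" marker are made-up names and conventions.

```java
import java.util.Arrays;

final class ArrayLookupDemo {
    static final int MAX_KEY = 99;                        // assumed upper bound
    private final int[] indexByKey = new int[MAX_KEY + 1];

    ArrayLookupDemo() {
        Arrays.fill(indexByKey, -1);                      // -1 means "no mapping"
    }

    void put(int key, int index) {
        indexByKey[key] = index;
    }

    int get(int key) {
        return indexByKey[key];                           // plain array access, no hashing
    }
}
```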
Your requirements can be satisfied by the Map interface. As an example, see HashMap<K,V>.
See Map and HashMap

Collision resolution in Java HashMap

Java's HashMap uses the put method to insert a K/V pair.
Let's say I have used put, and now a HashMap<Integer, Integer> has one entry with key 10 and value 17.
If I insert (10, 20) into this HashMap, it simply replaces the previous entry with this one, due to the collision caused by the same key 10.
If the key collides, HashMap replaces the old K/V pair with the new K/V pair.
So my question is: when does HashMap use the chaining collision resolution technique?
Why did it not form a linked list with key 10 and values 17 and 20?
When you insert the pair (10, 17) and then (10, 20), there is technically no collision involved. You are just replacing the old value with the new value for a given key 10 (since in both cases, 10 is equal to 10 and also the hash code for 10 is always 10).
Collision happens when multiple keys hash to the same bucket. In that case, you need to make sure that you can distinguish between those keys. Chaining collision resolution is one of those techniques which is used for this.
As an example, let's suppose that two strings "abra ka dabra" and "wave my wand" yield hash codes 100 and 200 respectively. Assuming the total array size is 10, both of them end up in the same bucket (100 % 10 and 200 % 10). Chaining ensures that whenever you do map.get( "abra ka dabra" );, you end up with the correct value associated with the key. In the case of hash map in Java, this is done by using the equals method.
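For a real pair of colliding keys, the strings "Aa" and "BB" both have the hash code 2112 in Java. The sketch below (CollisionDemo and build are made-up names) shows that HashMap keeps both entries and still returns the right value for each key.

```java
import java.util.HashMap;
import java.util.Map;

final class CollisionDemo {
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        // Same hash code as "Aa" but not equal, so it is stored as a second entry.
        map.put("BB", 2);
        return map;
    }
}
```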
In a HashMap the key is an object that provides hashCode() and equals(Object) methods.
When you insert a new entry into the Map, it checks whether the hashCode is already known. Then it iterates through all the keys with this hash code and tests them for equality with .equals(). If an equal key is found, the new value replaces the old one. If not, it creates a new entry in the map.
Usually, when talking about maps, you speak of a collision when two objects have the same hashCode but are different. Such entries are stored internally in a list.
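The lookup-then-replace-or-append logic described above can be sketched as follows. This is a simplified illustration with a fixed number of buckets, not HashMap's actual code; ChainedMapSketch and Node are hypothetical names.

```java
final class ChainedMapSketch<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        final Node<K, V> next;
        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] buckets;
    private int size;

    @SuppressWarnings("unchecked")
    ChainedMapSketch(int bucketCount) {
        buckets = (Node<K, V>[]) new Node[bucketCount];
    }

    V put(K key, V value) {
        int i = Math.floorMod(key.hashCode(), buckets.length);
        // Same key already in this bucket: replace the value, no new entry.
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        // Different key in the same bucket (a collision): chain a new node.
        buckets[i] = new Node<>(key, value, buckets[i]);
        size++;
        return null;
    }

    V get(K key) {
        int i = Math.floorMod(key.hashCode(), buckets.length);
        for (Node<K, V> n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }
}
```

With 10 buckets, the keys 10 and 20 land in the same bucket and are chained, while putting key 10 twice only replaces the value.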
It could have formed a linked list, indeed. It's just that the Map contract requires it to replace the entry:
V put(K key, V value)
Associates the specified value with the specified key in this map
(optional operation). If the map previously contained a mapping for
the key, the old value is replaced by the specified value. (A map m is
said to contain a mapping for a key k if and only if m.containsKey(k)
would return true.)
http://docs.oracle.com/javase/6/docs/api/java/util/Map.html
For a map to store lists of values, it'd need to be a Multimap. Here's Google's: http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Multimap.html
A collection similar to a Map, but which may associate multiple values
with a single key. If you call put(K, V) twice, with the same key but
different values, the multimap contains mappings from the key to both
values.
Edit: Collision resolution
That's a bit different. A collision happens when two different keys happen to have the same hash code, or two keys with different hash codes happen to map into the same bucket in the underlying array.
Consider HashMap's source (bits and pieces removed):
public V put(K key, V value) {
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    // i is the index where we want to insert the new element
    addEntry(hash, key, value, i);
    return null;
}
void addEntry(int hash, K key, V value, int bucketIndex) {
    // take the entry that's already in that bucket
    Entry<K,V> e = table[bucketIndex];
    // and create a new one that points to the old one = linked list
    table[bucketIndex] = new Entry<>(hash, key, value, e);
}
For those who are curious how the Entry class in HashMap comes to behave like a list, it turns out that HashMap defines its own static Entry class which implements Map.Entry. You can see for yourself by viewing the source code:
GrepCode for HashMap
First of all, you have got the concept of hashing a little wrong, and it has been rectified by @Sanjay.
And yes, Java does implement a collision resolution technique. When two keys are hashed to the same bucket (the internal array is finite in size, and at some point the hashCode() method will return the same hash value for two different keys), a linked list is formed at that bucket location, where each item is stored as a Map.Entry object that contains a key-value pair. Accessing an object via a key will at worst require O(n) time if the entry is present in such a list. The key you pass is compared with each key in the list using the equals() method.
Since Java 8, however, a bucket's linked list is replaced with a balanced tree once it grows large enough (in OpenJDK, more than 8 entries in a table of at least 64 buckets), which brings the worst case down to O(log n).
Your case is not talking about collision resolution, it is simply replacement of older value with a new value for the same key because Java's HashMap can't contain duplicates (i.e., multiple values) for the same key.
In your example, the value 17 will be simply replaced with 20 for the same key 10 inside the HashMap.
If you are trying to put a different/new value for the same key, it is not the concept of collision resolution, rather it is simply replacing the old value with a new value for the same key. It is how HashMap has been designed and you can have a look at the below API (emphasis is mine) taken from here.
public V put(K key, V value)
Associates the specified value with the
specified key in this map. If the map previously contained a mapping
for the key, the old value is replaced.
On the other hand, collision resolution techniques come into play only when multiple distinct keys end up in the same bucket location (for example because they have the same hash code) where an entry is already stored. HashMap handles collision resolution by chaining, i.e., it stores the entries in a linked list (or, since Java 8, in a balanced tree, depending on the number of entries).
Chaining is used when multiple keys end up with the same hash code, or with hash codes that land in the same bucket.
When the same key is put with a different value, the old value is replaced with the new value.
From Java 8 onwards, the linked list in a bucket is converted to a balanced binary tree in the worst-case scenario (many entries in one bucket).
A collision happens when 2 distinct keys generate the same hashCode() value.
The more collisions there are, the worse the HashMap performs.
Objects which are equal according to the equals method must return the same hashCode value.
When two objects return the same hash code, they are placed in the same bucket.
There is a difference between collision and duplication.
Collision means the hash code and the bucket are the same but the keys are not equal. Duplication means the same hash code and the same bucket, and in addition the equals method reports the keys as equal.
On a collision, the new entry is added to the bucket alongside the existing one; on a duplicate, the new value replaces the old one.
It isn't defined to do so. In order to achieve this functionality, you need to create a map that maps keys to lists of values:
Map<Foo, List<Bar>> myMap;
Or, you could use the Multimap from google collections / guava libraries
There is no collision in your example. You use the same key, so the old value gets replaced with the new one. Now, if you used two different keys that map to the same hash code, then you'd have a collision. Even in that case, HashMap keeps the two entries separate, one per key; it never merges values into a list for you. If you want several values chained under one key, you have to do it yourself, e.g. by using a list as a value.

problem getting value from HashMap

Hi all, I'm using a HashMap to hold my objects with a string key. When I put the first object with a key there is no problem, but when I put my second object it gets added and yet I can't find it under its key. Somehow it ends up somewhere called "next". I took a screenshot from debug mode (Eclipse), below.
Although size shows 2, I can't see my second item in the HashMap itself, only in the other entry's next node.
Note that my keys have the form "name.tag"; name and tag together are never the same for two items, but tag alone can be the same. Does HashMap do anything special with the dot when evaluating keys? I hope I have explained this clearly.
Thanks in advance
Edit:
Here is a piece of code I use to create my hashmap
private HashMap<String, ParameterItem> parseParametersNode(DataModel parent, Element element) {
    NodeList parameterChilds = element.getChildNodes(); // get element parameters
    HashMap<String, ParameterItem> parameterItems = new HashMap<String, ParameterItem>();
    for (int i = 0; i < parameterChilds.getLength(); i++) {
        if (parameterChilds.item(i).getNodeType() == Node.ELEMENT_NODE) {
            Element el = (Element) parameterChilds.item(i);
            NamedNodeMap atts = el.getAttributes();
            ParameterItem item = new ParameterItem();
            for (int j = 0; j < atts.getLength(); j++) {
                Attr attribute = (Attr) atts.item(j);
                String attributeValue = attribute.getValue();
                String attributeName = attribute.getName();
                item.setParsedProperty(attributeName, attributeValue);
            } /*check attributes later*/
            //finish loop and insert paramitem to params
            String key = "key" + i;
            if (item.getTag() != null && item.getName() != null)
                key = item.getName() + "." + item.getTag();
            parameterItems.put(key, item);
            // testParam=item;
            // parameterItems.put(key, testParam);
        }
    }
    return parameterItems;
}
There is not really a problem here: you have a hash collision. That is, both of your keys have been placed in the same hash bucket. It appears you have only four buckets (odd, since HashMap's default initial capacity is 16), so the chance of that with random data is 25 percent. Your size incremented just fine. The next field is the internal implementation's way of pointing to the next element in the same bucket, and get(key) will still find the second item there. If the total number of entries grows past the capacity times the load factor (0.75 by default), Java will internally rehash into more buckets.
I do not see why you need a HashMap here since you are numbering your keys consecutively (you could use an ArrayList), but maybe this is just starter code and your real use case is different.
You have the code:
String key = "key" + i;
but right after this you assign key again rather than appending to it:
if (item.getTag() != null && item.getName() != null)
    key = item.getName() + "." + item.getTag();
Should this be key += item.getName() + "." + item.getTag(); ?
