I am looking at the source code of HashMap in Java 7, and I see that the put method checks whether an entry with the given key is already present; if it is, it replaces the old value with the new value.
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
So, basically, this means that there will always be only one entry per key. I have verified this by debugging as well, but if I am wrong then please correct me.
Now, since there is only one entry for a given key, why does the get method have a FOR loop, since it could have simply returned the value directly?
for (Entry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
I feel the above loop is unnecessary. Please help me understand if I am wrong.
table[indexFor(hash, table.length)] is the bucket of the HashMap that may contain the key we are looking for (if it is present in the Map).
However, each bucket may contain multiple entries (either different keys having the same hashCode(), or different keys with different hashCode() that still got mapped to the same bucket), so you must iterate over these entries until you find the key you are looking for.
Since the expected number of entries in each bucket should be very small, this loop is still executed in expected O(1) time.
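To make the collision case concrete, here is a small self-contained sketch (the class name and values are made up for illustration; java.util.HashMap and Map imports assumed) where two distinct keys share a hash code and therefore land in the same bucket, forcing exactly this loop:
// Two distinct keys that deliberately share a hash code, so they land in the same bucket.
final class CollidingKey {
    private final String name;
    CollidingKey(String name) { this.name = name; }
    @Override public int hashCode() { return 42; } // every instance collides
    @Override public boolean equals(Object o) {
        return o instanceof CollidingKey && ((CollidingKey) o).name.equals(name);
    }
}

Map<CollidingKey, String> map = new HashMap<>();
map.put(new CollidingKey("a"), "first");
map.put(new CollidingKey("b"), "second");
// Both entries share one bucket; get() must walk the chain and use equals()
// to pick the right entry, which is exactly what the for loop in the source does.
System.out.println(map.get(new CollidingKey("b"))); // prints "second"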
If you look at the internal working of the get method of HashMap:
public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}
First, it gets the hash code of the key object, which is passed, and
finds the bucket location.
If the matching entry is found in that bucket, it returns the value (e.value)
If no match is found, it returns null.
Sometimes hash code collisions occur; to resolve such a collision, HashMap uses equals() and stores the colliding entries in a linked list within the same bucket.
Let's take an example:
Fetch the data for key "vaibhav":
map.get(new Key("vaibhav"));
Steps:
Calculate the hash code of the key {"vaibhav"}. It will be generated as 118.
Calculate the index by using the index method; it will be 6.
Go to index 6 of the array and compare the first element's key with the given key. If they are equal, return the value; otherwise check the next element if it exists.
In our case it is not found as the first element, and the next reference of the node object is not null.
If next of the node is null, return null.
If next of the node is not null, traverse to the second element and repeat step 3 until the key is found or next is null.
For this retrieval process for loop will be used.
For the record, in Java 8 this loop is also present (sort of, since there are TreeNodes as well):
if ((e = first.next) != null) {
if (first instanceof TreeNode)
return ((TreeNode<K,V>)first).getTreeNode(hash, key);
do {
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
} while ((e = e.next) != null);
}
Basically (in the case where the bin is not a tree), it iterates over the entire bin until it finds the entry we are looking for.
Looking at this implementation, you can see why providing a good hash matters: it keeps entries from all ending up in the same bucket, which would otherwise make the search take longer.
I think @Eran has already answered your query well, and @Prashant has also made a good attempt along with the other people who have answered, so let me explain it using an example so that the concept becomes very clear.
Concepts
Basically what @Eran is trying to say is that in a given bucket (basically at a given index of the array) it is possible that there is more than one entry (nothing but an Entry object), and this is possible when 2 or more keys give different hashes but the same index/bucket location.
Now, in order to put the entry in the hashmap, this is what happens at a high level (read carefully because I have gone the extra mile to explain some good things which are otherwise not part of your question):
Get the hash: what happens here is that first a hash is calculated for the given key (notice that this is not hashCode itself; a hash is calculated from the hashCode, so as to mitigate the risk of a poorly written hash function).
Get the index: this is basically the index of the array, or in other words the bucket. The reason this index is calculated instead of using the hash directly is that the hash could be larger than the size of the backing array, so this index calculation step ensures that the index is always less than the array's length (see the sketch after this list).
And when two keys give different hashes but the same index, both will go into the same bucket, and that is why the FOR loop is important.
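For reference, in the Java 7 source the index step is just a bit mask over the table length (which is always a power of two), which is why the result always stays inside the array:
// Java 7 HashMap (paraphrased): returns the bucket index for a given hash.
// Because length is a power of two, (length - 1) acts as a bit mask that
// keeps the result in the range [0, length - 1].
static int indexFor(int h, int length) {
    return h & (length - 1);
}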
Example
Below is a simple example I have created to demonstrate the concept to you:
public class Person {
private int id;
Person(int _id){
id = _id;
}
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
@Override
public int hashCode() {
return id;
}
}
Test class:
import java.util.Map;
public class HashMapHashingTest {
public static void main(String[] args) {
Person p1 = new Person(129);
Person p2 = new Person(133);
Map<Person, String> hashMap = new MyHashMap<>(2);
hashMap.put(p1, "p1");
hashMap.put(p2, "p2");
System.out.println(hashMap);
}
}
Debug screenshot (please click and zoom, because it looks small):
Notice that in the above example both Person objects give different hash values (136 and 140 respectively) but the same index of 0, so both objects go into the same bucket. In the screenshot, you can see that both objects are at index 0 and that the next field is also populated, pointing to the second object.
Update:
Another easy way to see more than one key going into the same bucket is to create a class and override its hashCode method to always return the same int value. All objects of that class would then map to the same index/bucket location, but since you have not overridden the equals method, they would not be considered equal and hence would form a list at that index/bucket location.
As a further twist, suppose you override the equals method as well so that all objects compare equal; then only one entry will be present at that index/bucket location, because all the objects are considered the same key.
While the other answers explain what is going on, the OP's comments on those answers lead me to think a different angle of explanation is required.
Simplified example
Let's say you are going to toss 10 strings into a hash map: "A", "B", "C", "Hi", "Bye", "Yo", "Yo-yo", "Z", "1", "2"
You are using HashMap as your hash map instead of making your own hash map (good choice). Some of the stuff below will not use HashMap implementation directly but will approach it from a more theoretical and abstract point of view.
HashMap does not magically know that you are going to add 10 strings to it, nor does it know what strings you will be putting into it later. It has to provide places to put whatever you might give to it... for all it knows you are going to put 100,000 strings in it - perhaps every word in the dictionary.
Let's say that, because of the constructor argument you chose when making your new HashMap(n) that your hash map has 20 buckets. We'll call them bucket[0] through bucket[19].
map.put("A", value); Let's say that the hash value for "A" is 5. The hash map can now do bucket[5] = new Entry("A", value);
map.put("B", value); Assume hash("B") = 3. So, bucket[3] = new Entry("B", value);
map.put("C"), value); - hash("C") = 19 - bucket[19] = new Entry("C", value);
map.put("Hi", value); Now here's where it gets interesting. Let's say that your hash function is such that hash("Hi") = 3. So now hash map wants to do bucket[3] = new Entry("Hi", value); We have a problem! bucket[3] is where we put the key "B", and "Hi" is definitely a different key than "B"... but they have the same hash value. We have a collision!
Because of this possibility, the HashMap is not actually implemented this way. A hash map needs to have buckets that can hold more than 1 entry in them. NOTE: I did not say more than 1 entry with the same key, as we cannot have that, but it needs to have buckets that can hold more than 1 entry of different keys. We need a bucket that can hold both "B" and "Hi".
So let's not do bucket[n] = new Entry(key, value);, but instead let's have bucket be of type Bucket[] instead of Entry[]. So now we do bucket[n].add( new Entry(key, value) );
So let's change to...
bucket[3].add("B", value);
and
bucket[3].add("Hi", value);
As you can see, we now have the entries for "B" and "Hi" in the same bucket. Now, when we want to get them back out, we need to loop through everything in the bucket, for example, with a for loop.
So the looping is present because of the collisions. Not collisions of key, but collisions of hash(key).
Why do we use such a crazy data structure?
You might be asking at this point, "Wait, WHAT!?! Why would we do such a weird thing like that??? Why are we using such a contrived and convoluted data structure???" The answer to that question would be...
A hash map works like this because of the properties that such a peculiar setup provides to us due to the way the math works out. If you use a good hash function which minimizes conflicts, and if you size your HashMap to have more buckets than the number of entries that you guess will be in it, then you have an optimized hash map which will be the fastest data structure for inserts and queries of complex data.
Your HashMap may be too small
Since you say you are often seeing this for-loop being iterated over with multiple elements in your debugging, that means that your HashMap might be too small. If you have a reasonable guess as to how many things you might put into it, try to set the size to be larger than that. Notice in my example above that I was inserting 10 strings but had a hash map with 20 buckets. With a good hash function, this will yield very few collisions.
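As a small illustration (the numbers are assumptions, not taken from your code), giving the map some headroom up front looks like this:
// Expecting about 10 entries, so ask for ~20 buckets up front; HashMap will still
// resize itself automatically once it exceeds its load factor (0.75 by default).
Map<String, String> map = new HashMap<>(20);
map.put("A", "value");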
Note: the above example is a simplification of the matter and does take some shortcuts for brevity. A full explanation is even slightly more complicated, but everything you need to know to answer the question as asked is here.
Hash tables have buckets because object hashes do not have to be unique. If the hashes of two objects are equal, the objects are probably equal. If the hashes are different, the objects are definitely different.
Therefore, objects with the same hash are grouped into buckets. The for loop is used to iterate over the objects contained in such a bucket.
In fact, this means that the algorithmic complexity of finding an object in such a hash table is not constant (although very close to it), but something between logarithmic and linear.
I would like to put it in simple words: the put method has a FOR loop to iterate over the list of keys that fall into the same hashCode bucket.
What happens when you put a key-value pair into the HashMap:
So for every key you pass to the HashMap, it will calculate the hashCode for it.
Many keys can fall into the same hashCode bucket. Now HashMap will check whether the same key is already present in that bucket.
In Java 7, HashMap maintains all the keys of the same bucket in a list. So before inserting the key it will traverse through the list to check if the same key is present or not. That's why there is a FOR loop.
So in the average case its time complexity is O(1), and in the worst case it is O(N).
I have a String[] dataValues as below:
ONE:9
TWO:23
THREE:14
FOUR:132
ONE:255
TWO:727
FIVE:3
THREE:196
FOUR:1843
ONE:330
TWO:336
THREE:190
FOUR:3664
I want to total the values of ONE, TWO, THREE, FOUR, FIVE.
So I created a HashMap for the same:
Map<String, Integer> totals = new HashMap<String, Integer>();
for(String dataValue : dataValues){
String[] keyVal = dataValue.split(":");
totals.put(keyVal[0], totals.get(keyVal[0]).intValue() + Integer.parseInt(keyVal[1]));
}
But the above code will obviously throw the exception below if the key does not already exist in the map:
Exception in thread "main" java.lang.NullPointerException
What is the best way to get the totals in my usecase above?
You can just get the value for the given key and check whether it is null:
for(String dataValue : dataValues){
String[] keyVal = dataValue.split(":");
Integer i = totals.get(keyVal[0]);
if(i == null) {
totals.put(keyVal[0], Integer.parseInt(keyVal[1]));
} else {
totals.put(keyVal[0], i + Integer.parseInt(keyVal[1]));
}
}
What is the best way to get the totals in my usecase above?
With Java 8 you can use the merge function
for(String dataValue : dataValues){
String[] keyVal = dataValue.split(":");
totals.merge(keyVal[0], Integer.parseInt(keyVal[1]), Integer::sum);
}
What does this function do? Let's cite the doc:
If the specified key is not already associated with a value or is
associated with null, associates it with the given non-null value.
Otherwise, replaces the associated value with the results of the given
remapping function, or removes if the result is null
So, as you can see, if there is no value associated with the key, you just map it to the int value of keyVal[1]. If there is already one, you need to provide a function that decides what to do with both values (the one already mapped and the one you want to map).
In your case you want to sum them, so the function looks like (a, b) -> a + b, which can be replaced by the method reference Integer::sum because it is a function that takes two ints and returns an int, so it's a valid candidate (and it has the semantics you need, of course).
But wait, we can actually do better! This is where the Stream API and the Collectors class come in handy.
Get a Stream<String> from the file, split each line into an array, group each array by its first element (the key), map its second element (the values) to integer and sum them:
import static java.util.stream.Collectors.*;
...
Map<String, Integer> map = Files.lines(Paths.get("file"))
        .map(s -> s.split(":"))
        .collect(groupingBy(arr -> arr[0], summingInt(arr -> Integer.parseInt(arr[1]))));
and another way would be to use the toMap collector.
.collect(toMap(arr -> arr[0], arr -> Integer.parseInt(arr[1]), Integer::sum));
From the same Stream<String[]>, you collect the results into a Map<String, Integer> in which the key is arr[0] and the values are the int values held by arr[1]. If you have the same keys, you merge the values by summing them.
Both give the same result. I like the first one because the name of the collector makes the intent clear that you are grouping elements, but it's up to you to choose.
Maybe a bit difficult to understand at first, but it's very powerful once you grasp the concept of these (downstream) collectors.
Hope it helps! :)
Since Java 8, instead of map.get you can use map.getOrDefault, which returns a default value of your choice when the key is absent, like:
totals.getOrDefault(keyVal[0], 0).intValue()
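Applied to the loop from the question, that would look roughly like this (a sketch, assuming Java 8+):
Map<String, Integer> totals = new HashMap<String, Integer>();
for (String dataValue : dataValues) {
    String[] keyVal = dataValue.split(":");
    // 0 is used when the key has not been seen yet, so no null check is needed
    totals.put(keyVal[0], totals.getOrDefault(keyVal[0], 0) + Integer.parseInt(keyVal[1]));
}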
Here is an elegant (edit: pre Java 8) solution:
String str = keyVal[0];
int num = Integer.parseInt(keyVal[1]);
Integer storedVal = hashMap.get(str);
hashMap.put(str, storedVal == null ? num : storedVal + num);
Check to see that the key exists. If it does not, create it with your held int.
If the key does exist, retrieve the value and do math, storing the sum.
This works because if a key already exists, a 'put' will override the value.
Suppose we have a HashMap<String, Integer> in Java.
How do I update (increment) the integer-value of the string-key for each existence of the string I find?
One could remove and reenter the pair, but overhead would be a concern.
Another way would be to just put the new pair and the old one would be replaced.
In the latter case, what happens if there is a hashcode collision with a new key I am trying to insert? The correct behavior for a hashtable would be to assign a different place for it, or make a list out of it in the current bucket.
map.put(key, map.get(key) + 1);
should be fine. It will update the value for the existing mapping. Note that this uses auto-boxing. With map.get(key) we get the value for the corresponding key; then you can update it as required. Here I am incrementing the value by 1.
Java 8 way:
You can use computeIfPresent method and supply it a mapping function, which will be called to compute a new value based on existing one.
For example,
Map<String, Integer> words = new HashMap<>();
words.put("hello", 3);
words.put("world", 4);
words.computeIfPresent("hello", (k, v) -> v + 1);
System.out.println(words.get("hello"));
Alternatively, you could use the merge method, where 1 is the default value and the function increments the existing value by 1:
words.merge("hello", 1, Integer::sum);
In addition, there are a bunch of other useful methods, such as putIfAbsent, getOrDefault, forEach, etc.
The simplified Java 8 way:
map.put(key, map.getOrDefault(key, 0) + 1);
This uses the method of HashMap that retrieves the value for a key, but if the key can't be retrieved it returns the specified default value (in this case a '0').
This is supported within core Java: HashMap<K,V> getOrDefault(Object key, V defaultValue)
hashmap.put(key, hashmap.get(key) + 1);
The method put will replace the value of an existing key and will create it if doesn't exist.
Replace Integer by AtomicInteger and call one of the incrementAndGet/getAndIncrement methods on it.
An alternative is to wrap an int in your own MutableInteger class which has an increment() method; you then only have a thread-safety concern left to solve.
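A minimal sketch of that idea (the class name MutableInteger is assumed here, it is not a JDK or library type; computeIfAbsent needs Java 8+):
class MutableInteger {
    private int value;
    void increment() { value++; }
    int get() { return value; }
}

Map<String, MutableInteger> counts = new HashMap<>();
// create the counter on first sight of the key, then mutate it in place
counts.computeIfAbsent("hello", k -> new MutableInteger()).increment();
counts.computeIfAbsent("hello", k -> new MutableInteger()).increment();
System.out.println(counts.get("hello").get()); // 2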
One line solution:
map.put(key, map.containsKey(key) ? map.get(key) + 1 : 1);
@Matthew's solution is the simplest and will perform well enough in most cases.
If you need high performance, AtomicInteger is a better solution, as per @BalusC.
However, a faster solution (provided thread safety is not an issue) is to use TObjectIntHashMap, which provides an increment(key) method and uses primitives and fewer objects than creating AtomicIntegers. e.g.
TObjectIntHashMap<String> map = new TObjectIntHashMap<String>()
map.increment("aaa");
You can increment like below but you need to check for existence so that a NullPointerException is not thrown
if (!map.containsKey(key)) {
    map.put(key, 1);
} else {
    map.put(key, map.get(key) + 1);
}
Does the hash exist (with 0 as the value) or is it "put" to the map on the first increment? If it is "put" on the first increment, the code should look like:
if (hashmap.containsKey(key)) {
hashmap.put(key, hashmap.get(key)+1);
} else {
hashmap.put(key,1);
}
It may be a little late, but here are my two cents.
If you are using Java 8 then you can make use of computeIfPresent method. If the value for the specified key is present and non-null then it attempts to compute a new mapping given the key and its current mapped value.
final Map<String,Integer> map1 = new HashMap<>();
map1.put("A",0);
map1.put("B",0);
map1.computeIfPresent("B",(k,v)->v+1); //[A=0, B=1]
We can also make use of another method putIfAbsent to put a key. If the specified key is not already associated with a value (or is mapped to null) then this method associates it with the given value and returns null, else returns the current value.
In case the map is shared across threads then we can make use of ConcurrentHashMap and AtomicInteger. From the doc:
An AtomicInteger is an int value that may be updated atomically. An
AtomicInteger is used in applications such as atomically incremented
counters, and cannot be used as a replacement for an Integer. However,
this class does extend Number to allow uniform access by tools and
utilities that deal with numerically-based classes.
We can use them as shown:
final Map<String,AtomicInteger> map2 = new ConcurrentHashMap<>();
map2.putIfAbsent("A",new AtomicInteger(0));
map2.putIfAbsent("B",new AtomicInteger(0)); //[A=0, B=0]
map2.get("B").incrementAndGet(); //[A=0, B=1]
One point to observe is we are invoking get to get the value for key B and then invoking incrementAndGet() on its value which is of course AtomicInteger. We can optimize it as the method putIfAbsent returns the value for the key if already present:
map2.putIfAbsent("B",new AtomicInteger(0)).incrementAndGet();//[A=0, B=2]
On a side note, if we plan to use AtomicLong then, as per the documentation, under high contention the expected throughput of LongAdder is significantly higher, at the expense of higher space consumption. Also check this question.
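A hedged sketch of the LongAdder variant (java.util.concurrent.atomic.LongAdder, Java 8+), following the same pattern as the AtomicInteger example above; map3 is just an illustrative name:
final ConcurrentMap<String, LongAdder> map3 = new ConcurrentHashMap<>();
map3.computeIfAbsent("B", k -> new LongAdder()).increment(); // [B=1]
map3.computeIfAbsent("B", k -> new LongAdder()).increment(); // [B=2]
System.out.println(map3.get("B").sum()); // 2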
The cleaner solution without NullPointerException is:
map.replace(key, map.get(key) + 1);
Since I can't comment on a few answers due to low reputation, I will post a solution which I applied.
for (String key : someArray) {
    if (hashMap.containsKey(key)) {              // check whether this key already exists
        hashMap.put(key, hashMap.get(key) + 1);  // increment the value of an already existing key
    } else {
        hashMap.put(key, value);                 // make a new entry in the hashmap
    }
}
Integer i = map.get(key);
if (i == null)
    i = aValue;
map.put(key, i + 1);
or
Integer i = map.get(key);
map.put(key, i == null ? newValue : i + 1);
Integer is an immutable wrapper around the primitive int type (http://cs.fit.edu/~ryan/java/language/java-data.html), so you need to take it out, process it, then put it back. If you have a value of a mutable type, you only need to take it out and process it; there is no need to put it back into the hashmap.
Use a for loop to increment the index:
for (int i =0; i<5; i++){
HashMap<String, Integer> map = new HashMap<String, Integer>();
map.put("beer", 100);
int beer = map.get("beer")+i;
System.out.println("beer " + beer);
System.out ....
}
There are misleading answers to this question here that imply the Hashtable put method will replace the existing value if the key exists; this is not true for Hashtable but rather for HashMap. See the Javadoc for HashMap http://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html#put%28K,%20V%29
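A quick way to see the HashMap behaviour for yourself: put returns the previous value (or null if the key was absent) and stores the new one.
Map<String, Integer> m = new HashMap<>();
System.out.println(m.put("a", 1)); // null  (no previous mapping)
System.out.println(m.put("a", 2)); // 1     (old value returned, new one stored)
System.out.println(m.get("a"));    // 2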
Use the Java 8 built-in function computeIfPresent.
Example:
public class ExampleToUpdateMapValue {
public static void main(String[] args) {
Map<String,String> bookAuthors = new TreeMap<>();
bookAuthors.put("Genesis","Moses");
bookAuthors.put("Joshua","Joshua");
bookAuthors.put("Judges","Samuel");
System.out.println("---------------------Before----------------------");
bookAuthors.entrySet().stream().forEach(System.out::println);
// To update the existing value using Java 8
bookAuthors.computeIfPresent("Judges", (k,v) -> v = "Samuel/Nathan/Gad");
System.out.println("---------------------After----------------------");
bookAuthors.entrySet().stream().forEach(System.out::println);
}
}
Try:
HashMap<String, Double> hm = new HashMap<String, Double>();
NOTE:
String->give the new value; //THIS IS THE KEY
else
Double->pass new value; //THIS IS THE VALUE
You can change either the key or the value in your hashmap, but you can't change both at the same time.
I hope this question is not considered too basic for this forum, but we'll see. I'm wondering how to refactor some code for better performance that is getting run a bunch of times.
Say I'm creating a word frequency list, using a Map (probably a HashMap), where each key is a String with the word that's being counted and the value is an Integer that's incremented each time a token of the word is found.
In Perl, incrementing such a value would be trivially easy:
$map{$word}++;
But in Java, it's much more complicated. Here the way I'm currently doing it:
int count = map.containsKey(word) ? map.get(word) : 0;
map.put(word, count + 1);
Which of course relies on the autoboxing feature in the newer Java versions. I wonder if you can suggest a more efficient way of incrementing such a value. Are there even good performance reasons for eschewing the Collections framework and using something else instead?
Update: I've done a test of several of the answers. See below.
Now there is a shorter way with Java 8 using Map::merge.
myMap.merge(key, 1, Integer::sum)
or
myMap.merge(key, 1L, Long::sum)
for longs respectively.
What it does:
if the key does not exist, put 1 as the value
otherwise add 1 to the value linked to the key
More information here.
Some test results
I've gotten a lot of good answers to this question--thanks folks--so I decided to run some tests and figure out which method is actually fastest. The five methods I tested are these:
the "ContainsKey" method that I presented in the question
the "TestForNull" method suggested by Aleksandar Dimitrov
the "AtomicLong" method suggested by Hank Gay
the "Trove" method suggested by jrudolph
the "MutableInt" method suggested by phax.myopenid.com
Method
Here's what I did...
created five classes that were identical except for the differences shown below. Each class had to perform an operation typical of the scenario I presented: opening a 10MB file and reading it in, then performing a frequency count of all the word tokens in the file. Since this took an average of only 3 seconds, I had it perform the frequency count (not the I/O) 10 times.
timed the loop of 10 iterations but not the I/O operation and recorded the total time taken (in clock seconds) essentially using Ian Darwin's method in the Java Cookbook.
performed all five tests in series, and then did this another three times.
averaged the four results for each method.
Results
I'll present the results first and the code below for those who are interested.
The ContainsKey method was, as expected, the slowest, so I'll give the speed of each method in comparison to the speed of that method.
ContainsKey: 30.654 seconds (baseline)
AtomicLong: 29.780 seconds (1.03 times as fast)
TestForNull: 28.804 seconds (1.06 times as fast)
Trove: 26.313 seconds (1.16 times as fast)
MutableInt: 25.747 seconds (1.19 times as fast)
Conclusions
It would appear that only the MutableInt method and the Trove method are significantly faster, in that only they give a performance boost of more than 10%. However, if threading is an issue, AtomicLong might be more attractive than the others (I'm not really sure). I also ran TestForNull with final variables, but the difference was negligible.
Note that I haven't profiled memory usage in the different scenarios. I'd be happy to hear from anybody who has good insights into how the MutableInt and Trove methods would be likely to affect memory usage.
Personally, I find the MutableInt method the most attractive, since it doesn't require loading any third-party classes. So unless I discover problems with it, that's the way I'm most likely to go.
The code
Here is the crucial code from each method.
ContainsKey
import java.util.HashMap;
import java.util.Map;
...
Map<String, Integer> freq = new HashMap<String, Integer>();
...
int count = freq.containsKey(word) ? freq.get(word) : 0;
freq.put(word, count + 1);
TestForNull
import java.util.HashMap;
import java.util.Map;
...
Map<String, Integer> freq = new HashMap<String, Integer>();
...
Integer count = freq.get(word);
if (count == null) {
freq.put(word, 1);
}
else {
freq.put(word, count + 1);
}
AtomicLong
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;
...
final ConcurrentMap<String, AtomicLong> map =
new ConcurrentHashMap<String, AtomicLong>();
...
map.putIfAbsent(word, new AtomicLong(0));
map.get(word).incrementAndGet();
Trove
import gnu.trove.TObjectIntHashMap;
...
TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();
...
freq.adjustOrPutValue(word, 1, 1);
MutableInt
import java.util.HashMap;
import java.util.Map;
...
class MutableInt {
int value = 1; // note that we start at 1 since we're counting
public void increment () { ++value; }
public int get () { return value; }
}
...
Map<String, MutableInt> freq = new HashMap<String, MutableInt>();
...
MutableInt count = freq.get(word);
if (count == null) {
freq.put(word, new MutableInt());
}
else {
count.increment();
}
A little research in 2016: https://github.com/leventov/java-word-count, benchmark source code
Best results per method (smaller is better):
time, ms
kolobokeCompile 18.8
koloboke 19.8
trove 20.8
fastutil 22.7
mutableInt 24.3
atomicInteger 25.3
eclipse 26.9
hashMap 28.0
hppc 33.6
hppcRt 36.5
Time/space results: (chart not included here; see the linked repository).
Map<String, Integer> map = new HashMap<>();
String key = "a random key";
int count = map.getOrDefault(key, 0); // ensure count will be one of 0,1,2,3,...
map.put(key, count + 1);
And that's how you increment a value with simple code.
Benefit:
No need to add a new class or use another concept of mutable int
Not relying on any library
Easy to understand what's going on exactly (Not too much abstraction)
Downside:
The hash map will be searched twice for get() and put(). So it will not be the most performant code.
Theoretically, once you call get(), you already know where to put(), so you should not have to search again. But searching a hash map usually takes so little time that you can more or less ignore this performance issue.
But if you are very serious about the issue, or you are a perfectionist, another way is to use the merge method. This is (probably) more efficient than the previous code snippet, as you will (theoretically) search the map only once (though this code is not obvious at first sight, it's short and performant):
map.merge(key, 1, (a,b) -> a+b);
Suggestion: most of the time you should care about code readability more than a small performance gain. If the first code snippet is easier for you to understand, use it. But if you can understand the second one fine, then you can also go for it!
As a follow-up to my own comment: Trove looks like the way to go. If, for whatever reason, you wanted to stick with the standard JDK, ConcurrentMap and AtomicLong can make the code a tiny bit nicer, though YMMV.
final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
map.putIfAbsent("foo", new AtomicLong(0));
map.get("foo").incrementAndGet();
will leave 1 as the value in the map for foo. Realistically, increased friendliness to threading is all that this approach has to recommend it.
Google Guava is your friend...
...at least in some cases. They have this nice AtomicLongMap. Especially nice because you are dealing with long as value in your map.
E.g.
AtomicLongMap<String> map = AtomicLongMap.create();
[...]
map.getAndIncrement(word);
It is also possible to add more than 1 to the value:
map.getAndAdd(word, 112L);
It's always a good idea to look at the Google Collections Library for this kind of thing. In this case a Multiset will do the trick:
Multiset bag = Multisets.newHashMultiset();
String word = "foo";
bag.add(word);
bag.add(word);
System.out.println(bag.count(word)); // Prints 2
There are Map-like methods for iterating over keys/entries, etc. Internally the implementation currently uses a HashMap<E, AtomicInteger>, so you will not incur boxing costs.
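For reference, with current Guava versions the same idea is usually written with the static factory method (a sketch, assuming Guava is on the classpath):
import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;
...
Multiset<String> bag = HashMultiset.create();
bag.add("foo");
bag.add("foo");
System.out.println(bag.count("foo")); // 2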
You should be aware of the fact that your original attempt
int count = map.containsKey(word) ? map.get(word) : 0;
contains two potentially expensive operations on a map, namely containsKey and get. The former performs an operation potentially pretty similar to the latter, so you're doing the same work twice!
If you look at the API for Map, get operations usually return null when the map does not contain the requested element.
Note that this will make a solution like
map.put( key, map.get(key) + 1 );
dangerous, since it might yield NullPointerExceptions. You should check for a null first.
Also note, and this is very important, that HashMaps can contain nulls by definition. So not every returned null says "there is no such element". In this respect, containsKey behaves differently from get in actually telling you whether there is such an element. Refer to the API for details.
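A small sketch of that distinction:
Map<String, Integer> m = new HashMap<>();
m.put("x", null);
System.out.println(m.get("x"));          // null (a stored null)
System.out.println(m.get("y"));          // null as well (no such key)
System.out.println(m.containsKey("x"));  // true
System.out.println(m.containsKey("y"));  // false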
For your case, however, you might not want to distinguish between a stored null and "noSuchElement". If you don't want to permit nulls you might prefer a Hashtable. Using a wrapper library as was already proposed in other answers might be a better solution to manual treatment, depending on the complexity of your application.
To complete the answer (and I forgot to put that in at first, thanks to the edit function!), the best way of doing it natively is to get the value into a final variable, check for null, and put it back in incremented by 1. The variable should be final because it's immutable anyway. The compiler might not need this hint, but it's clearer that way.
final HashMap map = generateRandomHashMap();
final Object key = fetchSomeKey();
final Integer i = map.get(key);
if (i != null) {
    map.put(key, i + 1);
} else {
    // do something
}
If you do not want to rely on autoboxing, you should say something like map.put(key, new Integer(1 + i.intValue())); instead.
Another way would be creating a mutable integer:
class MutableInt {
int value = 0;
public void inc () { ++value; }
public int get () { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt> ();
MutableInt value = map.get (key);
if (value == null) {
value = new MutableInt ();
map.put (key, value);
} else {
value.inc ();
}
of course this implies creating an additional object but the overhead in comparison to creating an Integer (even with Integer.valueOf) should not be so much.
You can make use of the computeIfAbsent method in the Map interface, provided in Java 8.
final Map<String,AtomicLong> map = new ConcurrentHashMap<>();
map.computeIfAbsent("A", k->new AtomicLong(0)).incrementAndGet();
map.computeIfAbsent("B", k->new AtomicLong(0)).incrementAndGet();
map.computeIfAbsent("A", k->new AtomicLong(0)).incrementAndGet(); //[A=2, B=1]
The method computeIfAbsent checks whether the specified key is already associated with a value. If there is no associated value, it attempts to compute one using the given mapping function. In any case it returns the current (existing or computed) value associated with the specified key, or null if the computed value is null.
On a side note, if you have a situation where multiple threads update a common sum, you can have a look at the LongAdder class. Under high contention, the expected throughput of this class is significantly higher than AtomicLong, at the expense of higher space consumption.
Quite simple, just use the built-in function in Map.java as follows:
map.put(key, map.getOrDefault(key, 0) + 1);
Memory churn may be an issue here, since every boxing of an int larger than or equal to 128 (or smaller than -128) causes an object allocation (see Integer.valueOf(int)). Although the garbage collector deals with short-lived objects very efficiently, performance will suffer to some degree.
If you know that the number of increments made will largely outnumber the number of keys (=words in this case), consider using an int holder instead. Phax already presented code for this. Here it is again, with two changes (holder class made static and initial value set to 1):
static class MutableInt {
int value = 1;
void inc() { ++value; }
int get() { return value; }
}
...
Map<String,MutableInt> map = new HashMap<String,MutableInt>();
MutableInt value = map.get(key);
if (value == null) {
value = new MutableInt();
map.put(key, value);
} else {
value.inc();
}
If you need extreme performance, look for a Map implementation which is directly tailored towards primitive value types. jrudolph mentioned GNU Trove.
By the way, a good search term for this subject is "histogram".
I suggest using Java 8's Map::compute(). It handles the case where the key doesn't exist, too.
map.compute(num, (k, v) -> (v == null) ? 1 : v + 1);
Instead of calling containsKey() it is faster just to call map.get and check if the returned value is null or not.
Integer count = map.get(word);
if(count == null){
count = 0;
}
map.put(word, count + 1);
Are you sure that this is a bottleneck? Have you done any performance analysis?
Try using the NetBeans profiler (its free and built into NB 6.1) to look at hotspots.
Finally, a JVM upgrade (say from 1.5->1.6) is often a cheap performance booster. Even an upgrade in build number can provide good performance boosts. If you are running on Windows and this is a server class application, use -server on the command line to use the Server Hotspot JVM. On Linux and Solaris machines this is autodetected.
There are a couple of approaches:
Use a Bag algorithm like the sets contained in Google Collections.
Create a mutable container which you can use in the Map:
class My {
    String word;
    int count;
    My(String word) { this.word = word; }
}
And use put("word", new My("word")); Then you can check if it exists and increment when adding.
Avoid rolling your own solution using lists, because if you get inner-loop searching and sorting, your performance will stink. The first HashMap solution is actually quite fast, but a proper implementation like that found in Google Collections is probably better.
Counting words using Google Collections, looks something like this:
HashMultiset s = new HashMultiset();
s.add("word");
s.add("word");
System.out.println(""+s.count("word") );
Using the HashMultiset is quite elegant, because a bag algorithm is just what you need when counting words.
A variation on the MutableInt approach that might be even faster, if a bit of a hack, is to use a single-element int array:
Map<String,int[]> map = new HashMap<String,int[]>();
...
int[] value = map.get(key);
if (value == null)
map.put(key, new int[]{1} );
else
++value[0];
It would be interesting if you could rerun your performance tests with this variation. It might be the fastest.
Edit: The above pattern worked fine for me, but eventually I changed to use Trove's collections to reduce memory size in some very large maps I was creating -- and as a bonus it was also faster.
One really nice feature is that the TObjectIntHashMap class has a single adjustOrPutValue call that, depending on whether there is already a value at that key, will either put an initial value or increment the existing value. This is perfect for incrementing:
TObjectIntHashMap<String> map = new TObjectIntHashMap<String>();
...
map.adjustOrPutValue(key, 1, 1);
Google Collections HashMultiset:
- quite elegant to use
- but consumes CPU and memory
Best would be to have a method like : Entry<K,V> getOrPut(K);
(elegant, and low cost)
Such a method will compute hash and index only once,
and then we could do what we want with the entry
(either replace or update the value).
More elegant:
- take a HashSet<Entry>
- extend it so that get(K) puts a new Entry if needed
- Entry could be your own object.
--> (new MyHashSet()).get(k).increment();
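On a modern JDK, a rough approximation of this getOrPut idea (one logical lookup, then mutate in place) can be sketched with computeIfAbsent and a one-element array as a mutable counter; the names here are only for illustration:
Map<String, int[]> counts = new HashMap<>();
counts.computeIfAbsent("word", k -> new int[1])[0]++; // creates the counter on first use
counts.computeIfAbsent("word", k -> new int[1])[0]++;
System.out.println(counts.get("word")[0]); // 2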
"put" need "get" (to ensure no duplicate key).
So directly do a "put",
and if there was a previous value, then do an addition:
Map<String, MutableInt> map = new HashMap<String, MutableInt>();
MutableInt newValue = new MutableInt(1); // default + inc
MutableInt oldValue = map.put(key, newValue);
if (oldValue != null) {
    newValue.add(oldValue); // old + inc
}
If the count starts at 0, then add 1 (or any other value):
Map<String, MutableInt> map = new HashMap<String, MutableInt>();
MutableInt newValue = new MutableInt(0); // default
MutableInt oldValue = map.put(key, newValue);
if (oldValue != null) {
    newValue.setValue(oldValue.intValue() + 1); // old + inc
}
Notice: this code is not thread safe. Use it to build the map and then read it, not to update it concurrently.
Optimization: in a loop, keep the old value around to become the new value of the next iteration.
Map<String, MutableInt> map = new HashMap<String, MutableInt>();
final int defaultValue = 0;
final int inc = 1;
MutableInt oldValue = new MutableInt(defaultValue);
while (true) {
    MutableInt newValue = oldValue;
    oldValue = map.put(key, newValue); // insert or...
    if (oldValue != null) {
        newValue.setValue(oldValue.intValue() + inc); // ...update
        oldValue.setValue(defaultValue); // reuse
    } else {
        oldValue = new MutableInt(defaultValue); // renew
    }
}
The various primitive wrappers, e.g., Integer are immutable so there's really not a more concise way to do what you're asking unless you can do it with something like AtomicLong. I can give that a go in a minute and update. BTW, Hashtable is a part of the Collections Framework.
I'd use Apache Collections Lazy Map (to initialize values to 0) and use MutableIntegers from Apache Lang as values in that map.
The biggest cost is having to search the map twice in your method. In mine you have to do it just once. Just get the value (it will be initialized if absent) and increment it.
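A hedged sketch of that approach, assuming commons-collections4 and commons-lang3 are on the classpath (my guess at the classes the answer refers to; older versions used LazyMap.decorate instead):
import org.apache.commons.collections4.Factory;
import org.apache.commons.collections4.map.LazyMap;
import org.apache.commons.lang3.mutable.MutableInt;
...
Map<String, MutableInt> freq = LazyMap.lazyMap(
        new HashMap<String, MutableInt>(),
        new Factory<MutableInt>() {
            @Override public MutableInt create() { return new MutableInt(); } // missing keys start at 0
        });
...
freq.get(word).increment(); // the value is created on demand, then mutated in place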
The Functional Java library's TreeMap datastructure has an update method in the latest trunk head:
public TreeMap<K, V> update(final K k, final F<V, V> f)
Example usage:
import static fj.data.TreeMap.empty;
import static fj.function.Integers.add;
import static fj.pre.Ord.stringOrd;
import fj.data.TreeMap;
public class TreeMap_Update {
    public static void main(String[] a) {
        TreeMap<String, Integer> map = empty(stringOrd);
        map = map.set("foo", 1);
        map = map.update("foo", add.f(1));
        System.out.println(map.get("foo").some());
    }
}
This program prints "2".
I don't know how efficient it is, but the code below works as well. You need to define a BiFunction at the beginning. Plus, you can do more than just increment with this method.
public static Map<String, Integer> strInt = new HashMap<String, Integer>();
public static void main(String[] args) {
BiFunction<Integer, Integer, Integer> bi = (x,y) -> {
if(x == null)
return y;
return x+y;
};
strInt.put("abc", 0);
strInt.merge("abc", 1, bi);
strInt.merge("abc", 1, bi);
strInt.merge("abc", 1, bi);
strInt.merge("abcd", 1, bi);
System.out.println(strInt.get("abc"));
System.out.println(strInt.get("abcd"));
}
output is
3
1
If you're using Eclipse Collections, you can use a HashBag. It will be the most efficient approach in terms of memory usage and it will also perform well in terms of execution speed.
HashBag is backed by a MutableObjectIntMap which stores primitive ints instead of Counter objects. This reduces memory overhead and improves execution speed.
HashBag provides the API you'd need since it's a Collection that also allows you to query for the number of occurrences of an item.
Here's an example from the Eclipse Collections Kata.
MutableBag<String> bag =
HashBag.newBagWith("one", "two", "two", "three", "three", "three");
Assert.assertEquals(3, bag.occurrencesOf("three"));
bag.add("one");
Assert.assertEquals(2, bag.occurrencesOf("one"));
bag.addOccurrences("one", 4);
Assert.assertEquals(6, bag.occurrencesOf("one"));
Note: I am a committer for Eclipse Collections.
Counting using streams and getOrDefault:
String s = "abcdeff";
s.chars().mapToObj(c -> (char) c)
.forEach(c -> {
int count = countMap.getOrDefault(c, 0) + 1;
countMap.put(c, count);
});
Since a lot of people search Java topics for Groovy answers, here's how you can do it in Groovy:
def map = new HashMap<String, Integer>()
map.put("key1", 3)
map.merge("key1", 1) {a, b -> a + b}
map.merge("key2", 1) {a, b -> a + b}
Hope I'm understanding your question correctly, I'm coming to Java from Python so I can empathize with your struggle.
if you have
map.put(key, 1)
you would do
map.put(key, map.get(key) + 1)
Hope this helps!
The simple and easy way in java 8 is the following:
final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();
map.computeIfAbsent("foo", key -> new AtomicLong(0)).incrementAndGet();