Java's HashMap has a size() method,
which reflects how many elements are stored in the HashMap.
I would like to know the actual allocated size of the HashMap.
I tried different methods but can't find the correct one.
I set the initial capacity to 16:
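import java.util.HashMap;
import java.util.UUID;

HashMap<UUID, Object> hm = new HashMap<>(16);
for (int i = 0; i < 100; ++i) {
    System.out.println(hm.size()); // number of entries, not the allocated capacity
    UUID uuid = UUID.randomUUID();
    hm.put(uuid, null);
}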
When I add values, this size can increase. How can I check the size that is actually allocated?
what is the actual size of the Hash Map
I'm assuming you are asking about the capacity. The capacity is the length of the array holding the buckets of the HashMap. The initial capacity is 16 by default.
The capacity method is not public, but you can calculate the current capacity based on the current size, the initial capacity and the load factor.
If you use the defaults (for example, when you create the HashMap with the parameter-less constructor), the initial capacity is 16, and the default load factor is 0.75. This means the capacity will be doubled to 32 once the size exceeds 16 * 0.75 == 12 (that is, when the 13th entry is added). It will be doubled to 64 once the size exceeds 32 * 0.75 == 24.
If you pass different initial capacity and/or load factor to the constructor, the calculation will be affected accordingly.
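The calculation described above can be sketched in code. This is only an estimate under stated assumptions: estimateCapacity is a hypothetical helper (not part of the HashMap API), entries are assumed to be added one at a time, the requested capacity is assumed to be rounded up to a power of two, and the table is assumed to double whenever the size exceeds capacity * loadFactor. Corner cases such as early resizes caused by heavily colliding buckets are ignored.

```java
class CapacityEstimate {
    // Upper bound on the table length used by HashMap (2^30).
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two that is >= n.
    static int roundUpToPowerOfTwo(int n) {
        int capacity = 1;
        while (capacity < n && capacity < MAXIMUM_CAPACITY) {
            capacity <<= 1;
        }
        return capacity;
    }

    // Hypothetical helper: estimates the bucket-array length after `size`
    // entries have been added one by one (assumption, not a JDK method).
    static int estimateCapacity(int initialCapacity, float loadFactor, int size) {
        int capacity = roundUpToPowerOfTwo(initialCapacity);
        // Double while the size exceeds the threshold capacity * loadFactor.
        while (size > (int) (capacity * loadFactor) && capacity < MAXIMUM_CAPACITY) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(estimateCapacity(16, 0.75f, 12));  // 16
        System.out.println(estimateCapacity(16, 0.75f, 13));  // 32
        System.out.println(estimateCapacity(16, 0.75f, 100)); // 256
    }
}
```

For the loop in the question (initial capacity 16, 100 entries), this estimate gives 256 buckets.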
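You can use reflection to check the actual allocated size (the number of buckets) of the map. Note that on recent JDKs (16 and later) access to JDK internals is restricted by default, so this may require running with --add-opens java.base/java.util=ALL-UNNAMED.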
HashMap<String, Integer> m = new HashMap<>();
m.put("Abhi", 101);
m.put("John", 102);
System.out.println(m.size()); // This will print 2
Field tableField = HashMap.class.getDeclaredField("table");
tableField.setAccessible(true);
Object[] table = (Object[]) tableField.get(m);
System.out.println(table.length); // This will print 16
This is the Vector that I have created:
Vector<Integer> v = new Vector<>(3, 2); // initial capacity 3, capacity increment 2
System.out.println("v.capacity: " + v.capacity());
v.addElement(1);
v.addElement(2);
v.addElement(3);
v.addElement(4);
System.out.println("v.size " + v.size());
System.out.println("v.capacity: " + v.capacity());
This prints:
v.capacity: 3
v.size 4
v.capacity: 5
If I change the declaration to
Vector<Integer> v = new Vector<>(7, 2);
it prints
v.capacity: 7
v.size 4
v.capacity: 7
So I want to know how this second capacity is determined. Why didn't the second run give me 9 as the capacity?
In your first case, you declared a Vector with initial capacity of 3, and capacity increment of 2. Then, you proceeded to add 4 objects to that vector, surpassing its initial capacity of 3. The vector's new capacity is now its old capacity + capacity increment.
With your second case, you declared a Vector with initial capacity of 7 and capacity increment of 2. Then, adding 4 objects does not surpass its initial capacity of 7. Therefore, there is no need to increment its capacity yet, so it remains as 7. If you add 8 objects to your second vector you will see its new capacity will be 9.
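The three cases discussed above can be checked with a small program. This is a minimal sketch; capacityAfter is a hypothetical helper introduced here for illustration, and it adds the elements one at a time as in the question.

```java
import java.util.Vector;

class VectorCapacityDemo {
    // Hypothetical helper: builds a Vector with the given initial capacity and
    // capacity increment, adds `elements` items one by one, and reports capacity().
    static int capacityAfter(int initialCapacity, int capacityIncrement, int elements) {
        Vector<Integer> v = new Vector<>(initialCapacity, capacityIncrement);
        for (int i = 0; i < elements; i++) {
            v.add(i);
        }
        return v.capacity();
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(3, 2, 4)); // 5: the 4th element exceeds capacity 3
        System.out.println(capacityAfter(7, 2, 4)); // 7: no growth needed
        System.out.println(capacityAfter(7, 2, 8)); // 9: the 8th element exceeds capacity 7
    }
}
```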
Please look at the Vector documentation: https://docs.oracle.com/javase/7/docs/api/java/util/Vector.html
Hi, I was trying to find the load factor of ArrayList and Vector but was not able to find it. I know the load factor of HashMap and other Map implementations is 0.75. Can anyone help me find out how to check the load factor of Vector and ArrayList?
ArrayList:
  Initial capacity: 10
  Load factor: 1 (it grows when the list is full)
  Growth: new capacity = current capacity + current capacity / 2
Vector:
  Initial capacity: 10
  Load factor: 1 (it grows when the list is full)
  Growth:
    current capacity * 2 (if capacityIncrement is not defined)
    current capacity + capacityIncrement (if capacityIncrement is defined during vector initialization)
I assume you would like to know how ArrayList and Vector increase their size.
For ArrayList, every time you add an element, it checks whether the backing array needs to be enlarged. If so, its capacity generally grows as follows:
newCapacity = oldCapacity + (oldCapacity >> 1);
In some special cases, for example when many elements are added at once, the result will be different. Please refer to the grow(int minCapacity) method in the java.util.ArrayList source code.
For Vector, the capacity generally grows as follows:
newCapacity = oldCapacity + ((capacityIncrement > 0) ?
capacityIncrement : oldCapacity);
For the special cases, please refer to grow(int minCapacity) in java.util.Vector.
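To make the two formulas above concrete, here is a minimal sketch that only evaluates the arithmetic. The method names are hypothetical, and the special cases handled by grow(int minCapacity) are deliberately left out.

```java
class GrowthFormulas {
    // General ArrayList rule quoted above: grow by half of the old capacity.
    static int nextArrayListCapacity(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    // General Vector rule quoted above: add capacityIncrement if it is
    // positive, otherwise double the old capacity.
    static int nextVectorCapacity(int oldCapacity, int capacityIncrement) {
        return oldCapacity + ((capacityIncrement > 0) ? capacityIncrement : oldCapacity);
    }

    public static void main(String[] args) {
        System.out.println(nextArrayListCapacity(10)); // 15
        System.out.println(nextArrayListCapacity(15)); // 22
        System.out.println(nextVectorCapacity(10, 0)); // 20
        System.out.println(nextVectorCapacity(3, 2));  // 5
    }
}
```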
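ArrayList<Integer> al = new ArrayList<>();
for (int i = 0; i <= 10; i++) {
    al.add(i + 1);
}
The default capacity is 10.
In the above example we add 11 elements, so the ArrayList has to grow. In Java 6 and earlier the new capacity was computed as
int newCapacity = (oldCapacity * 3) / 2 + 1;
(10 * 3) / 2 + 1 = 16
From Java 7 onwards the formula is oldCapacity + (oldCapacity >> 1), which gives 10 + 5 = 15.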
I have a few questions about the Java HashMap class. It is my understanding that
transient Entry[] table;
the table array is going to hold the data based on the value of hashCode(). I need to know when this array gets initialized. Is the array length based on the capacity we define during the HashMap's initialization or the default capacity of 16 if it is not defined when calling the constructor?
How is the hashcode scaled to the array index? For example, if the hashcode has a huge value, how is it scaled to an array index like 10 or 20?
I have read that when the threshold value is reached, rehashing will occur. For example, in the default case, when 16 is the capacity and 0.75 is the load factor, then the threshold value is 16*0.75=12. Once the 12 items are added rehashing will occur and capacity will increase. Does this mean that the table array size gets increased?
Since your post has many questions, I'm going to enumerate them as part of my answer. Also, please note that I'm going off HashMap's source code for Java 1.8 b132 for my answers.
Q: I need to know when this array gets initialized.
A: The table array only gets initialized when data is first entered into the map (e.g. a put() method call). It does not happen as part of the instantiation of the map, itself, unless the copy constructor is called, or the map is being deserialized into an object.
Q: Is the array length based on the capacity we define during the HashMap's initialization or the default capacity of 16 if it is not defined when calling the constructor?
A: Correct, the table array's length is based on the initial capacity you pass to the constructor (rounded up to the next power of two if necessary). When the initial capacity is not specified and the default constructor is called, the default capacity is used.
Q: How is the hashcode scaled to the array index?
A: For the actual code that does this, see the implementation of the putVal() method. Basically, the code takes the (potentially very large) hash value and performs a bitwise AND with the last element index of the table. Because the table length is always a power of two, this maps the hash onto a valid index in the table array. For example, if the hash value is 333 (101001101 in base 2) and the table array size is 32 (100000), the last element index would be 31 (11111). Thus the index chosen would be 11111 & 101001101 == 01101 == 13.
Q: I have read that when the threshold value is reached, rehashing will occur. ... Does this mean that the table array size gets increased?
A: More or less, yes. When the threshold is exceeded, the table is resized. Note that by resizing, the existing table array isn't modified. Rather, a new table array is created with twice the capacity of the first table array. For details, see the implementation of the resize() method.
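The index computation described in the answer about scaling the hashcode can be sketched as follows. The method names are hypothetical. The spread step mirrors what the Java 8 source does before masking (XOR-ing the upper 16 bits into the lower 16), and is shown here as an illustration rather than a copy of the JDK code.

```java
class BucketIndexDemo {
    // Mixes the high bits of the hash code into the low bits, so that keys
    // differing only in their upper bits do not all land in the same bucket.
    static int spread(int hashCode) {
        return hashCode ^ (hashCode >>> 16);
    }

    // Masks the hash with (tableLength - 1); tableLength must be a power of two.
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(333, 32)); // 13, as in the worked example
        System.out.println(indexFor(-7, 16));  // 9: negative hashes still give a valid index
        System.out.println(indexFor(spread("some key".hashCode()), 16));
    }
}
```

For a power-of-two table length, the mask gives the same result as Math.floorMod(hash, tableLength), but with a single AND instruction.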
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    // Find a power of 2 >= initialCapacity
    int capacity = 1;
    while (capacity < initialCapacity)
        capacity <<= 1;
    this.loadFactor = loadFactor;
    threshold = (int)(capacity * loadFactor);
    table = new Entry[capacity];
    init();
}
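The code block above (from an older, pre-Java 8 HashMap implementation that allocates the table eagerly in the constructor) shows how and when the table is created.
When rehashing occurs, the existing table array is not enlarged, because an array's length is fixed once it is created. Instead, a new array with the updated size is created every time: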
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}
For example, if I allocate a HashMap of size 100 and 100 buckets get created, wouldn't the HashMap perform poorly? (Modular hashing would not distribute the keys evenly across all 100 buckets.)
How does Java manage this problem? Does it randomly choose a prime near 100 as the HashMap size?
As you can see in the sources, it chooses the next power of two greater than or equal to the requested capacity (line 197 ff):
// Find a power of 2 >= initialCapacity
int capacity = 1;
while (capacity < initialCapacity)
    capacity <<= 1;
this.loadFactor = loadFactor;
threshold = (int)(capacity * loadFactor);
table = new Entry[capacity];
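Applied to the question's example, the loop above turns a requested capacity of 100 into 128 buckets. The following is a minimal standalone sketch of that rounding; the class and method names are made up for illustration, and the threshold line assumes the default load factor of 0.75.

```java
class TableSizeDemo {
    // Same loop as in the quoted constructor: smallest power of two >= initialCapacity.
    static int tableSizeFor(int initialCapacity) {
        int capacity = 1;
        while (capacity < initialCapacity) {
            capacity <<= 1;
        }
        return capacity;
    }

    public static void main(String[] args) {
        int capacity = tableSizeFor(100);
        int threshold = (int) (capacity * 0.75f); // assumes the default load factor
        System.out.println(capacity);  // 128
        System.out.println(threshold); // 96
    }
}
```

So no prime is chosen; a map created with new HashMap(100) starts with 128 buckets and, with the default load factor, is resized once it holds more than 96 entries.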
Just a few minutes back I answered a question asking about the "Maximum possible size of HashMap in Java". As I have always read, HashMap is a growable data structure. Its size is only limited by the JVM memory size. Hence I thought that there is no hard limit to its size and answered accordingly. (The same is applicable to HashSet as well.)
But someone corrected me, saying that since the size() method of HashMap returns an int, there is a limit on its size. A perfectly correct point. I tried to test it on my local machine but failed: I need more than 8GB of memory to insert more than 2,147,483,647 integers into the HashMap, which I don't have.
My questions were:
What happens when we try to insert 2,147,483,647 + 1 elements into the HashMap/HashSet?
Is there an error thrown?
If yes, which error? If not, what happens to the HashMap/HashSet, its already existing elements and the new element?
If someone is blessed with access to a machine with say 16GB memory, you can try it out practically. :)
The underlying capacity of the array has to be a power of 2 (which is limited to 2^30). When this size is reached, the load factor is effectively ignored and the array stops growing.
At this point the rate of collisions increases.
Given that hashCode() only has 32 bits, it wouldn't make sense to grow much bigger than this in any case.
/**
 * Rehashes the contents of this map into a new array with a
 * larger capacity. This method is called automatically when the
 * number of keys in this map reaches its threshold.
 *
 * If current capacity is MAXIMUM_CAPACITY, this method does not
 * resize the map, but sets threshold to Integer.MAX_VALUE.
 * This has the effect of preventing future calls.
 *
 * @param newCapacity the new capacity, MUST be a power of two;
 *        must be greater than current capacity unless current
 *        capacity is MAXIMUM_CAPACITY (in which case value
 *        is irrelevant).
 */
void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable);
    table = newTable;
    threshold = (int)(newCapacity * loadFactor);
}
When the size exceeds Integer.MAX_VALUE it overflows.
void addEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
    if (size++ >= threshold)
        resize(2 * table.length);
}
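The overflow mentioned above is ordinary int wrap-around. A minimal sketch, using a plain int counter as a stand-in for the map's size field (it does not allocate a real map, and nextSize is a hypothetical helper):

```java
class SizeOverflowDemo {
    // Largest table length the quoted code allows (2^30).
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Stand-in for `size++` on an int field: plain int arithmetic, no overflow check.
    static int nextSize(int size) {
        return size + 1;
    }

    public static void main(String[] args) {
        System.out.println(MAXIMUM_CAPACITY);            // 1073741824
        System.out.println(nextSize(Integer.MAX_VALUE)); // -2147483648
    }
}
```

In other words, no error is raised by the increment itself; the counter silently wraps to a negative value, so a size() backed by such a field would no longer reflect the real number of entries.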