I use a HashMap to store a Q-table for an implementation of a reinforcement learning algorithm. My HashMap needs to store 15,000,000 entries. When I ran my algorithm, I saw that the memory used by the process was over 1,000,000K. Based on my calculation, I expected it to use no more than 530,000K. I tried to write a minimal example and got the same high memory usage:
public static void main(String[] args) {
    HashMap<Integer, Integer> map = new HashMap<>(16_000_000, 1);
    for (int i = 0; i < 15_000_000; i++) {
        map.put(i, i);
    }
}
My memory calculation:
Each entry is 32 bytes
Capacity is 15,000,000
A HashMap instance uses: 32 * SIZE + 4 * CAPACITY
memory = (15,000,000 * 32 + 15,000,000 * 4) / 1024 = 527,343.75K
Where am I wrong in my memory calculation?
Well, in the best case, we assume a word size of 32 bits / 4 bytes (with CompressedOops and CompressedClassPointers). Then, a map entry consists of two words of JVM overhead (mark word and klass pointer), plus key, value, hash code and next pointer, making six words total, in other words, 24 bytes. So having 15,000,000 entry instances will consume 360 MB.
Additionally, there’s the array holding the entries. The HashMap uses capacities that are a power of two, so for 15,000,000 entries, the array size is at least 16,777,216, consuming 64 MiB.
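For reference, the smallest power of two at or above a requested capacity can be computed like this (a minimal sketch that mirrors the effect of HashMap's internal tableSizeFor; not code from the original post):

// Smallest power of two >= cap, valid for 1 < cap <= 2^30
static int nextPowerOfTwo(int cap) {
    return 1 << (32 - Integer.numberOfLeadingZeros(cap - 1));
}
// nextPowerOfTwo(15_000_000) == 16_777_216, and 16_777_216 references * 4 bytes ≈ 64 MiB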
Then, you have 30,000,000 Integer instances. The problem is that map.put(i, i) performs two boxing operations and while the JVM is encouraged to reuse objects when boxing, it is not required to do so and reusing won’t happen in your simple program that is likely to complete before the optimizer ever interferes.
To be precise, the first 128 Integer instances are reused, because for values in the -128 … +127 range, sharing is mandatory. However, the implementation does this by initializing the entire cache on first use, so while the first 128 iterations don't create two instances each, the cache itself consists of 256 instances, which is twice that number, so we end up with 30,000,000 Integer instances total again. An Integer instance consists of at least the two JVM-specific words and the actual int value, which would make 12 bytes, but due to the default alignment, the actually consumed memory will be 16 bytes, divisible by eight.
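As a quick illustration of that cache behavior (my own aside, not from the question):

Integer a = 127, b = 127; // boxed via Integer.valueOf -> same cached instance
Integer c = 128, d = 128; // outside the mandatory cache range -> two distinct objects
System.out.println(a == b); // true
System.out.println(c == d); // false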
So the 30,000,000 created Integer instances consume 480 MB.
This makes a total of 360 MB + 64 MiB + 480 MB, which is more than 900 MB, making a heap size of 1 GB entirely plausible.
But that’s what profiling tools are for. After running your program, I got
Note that this tool only reports the used size of the objects, i.e. the 12 bytes for an Integer object without considering the padding that you will notice when looking at the total memory allocated by the JVM.
I had more or less the same requirement as you, so I decided to share my thoughts here.
1) There is a great tool for that: jol.
2) Arrays are objects too, and every object in Java has two additional headers: the mark word and the klass pointer, usually 8 and 4 bytes in size respectively (this can be tweaked via compressed pointers, but I'm not going to go into details).
3) It is important to note the map's load factor here (because it influences when the internal array is resized). Here is an example:
HashMap<Integer, Integer> map = new HashMap<>(16, 1);
for (int i = 0; i < 13; ++i) {
    map.put(i, i);
}
System.out.println(GraphLayout.parseInstance(map).toFootprint());

HashMap<Integer, Integer> map2 = new HashMap<>(16);
for (int i = 0; i < 13; ++i) {
    map2.put(i, i);
}
System.out.println(GraphLayout.parseInstance(map2).toFootprint());
The output is different (only the relevant lines):
1 80 80 [Ljava.util.HashMap$Node; // first case
1 144 144 [Ljava.util.HashMap$Node; // second case
See how the size is bigger in the second case because the backing array is twice as big (32 slots). You can only put 12 entries into an array of size 16, because the default load factor is 0.75: 16 * 0.75 = 12.
Why 144? The math here is easy: an array is an object, so it has a 12-byte header (8 + 4), plus 4 bytes for the array length, plus 32 * 4 = 128 bytes for the references, resulting in 144 bytes total (already a multiple of 8, so no padding is needed).
4) Entries are stored inside either a Node or a TreeNode inside the map (a Node is 32 bytes and a TreeNode is 56 bytes). As you use ONLY Integers, you will have Nodes only, because there are no hash collisions here; and even if there were, a bin is only converted to TreeNodes above a certain threshold. We can easily prove that there will be Nodes only:
public static void main(String[] args) {
    Map<Integer, List<Integer>> map = IntStream.range(0, 15_000_000).boxed()
            .collect(Collectors.groupingBy(WillThereBeTreeNodes::hash)); // WillThereBeTreeNodes - current class name
    System.out.println(map.size());
}

private static int hash(Integer key) {
    int h = 0;
    return (h = key.hashCode()) ^ h >>> 16;
}
The result of this will be 15_000_000: there was no merging, thus no hash collisions.
5) When you create Integer objects, there is a pool for them (ranging from -128 to 127 - this can be tweaked as well, but let's not, for simplicity).
6) An Integer is an object, thus it has a 12-byte header and 4 bytes for the actual int value.
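If you want to verify point 6 yourself, JOL can print the layout of Integer directly (a sketch; it needs the org.openjdk.jol:jol-core dependency, and the exact numbers depend on your JVM and flags):

System.out.println(ClassLayout.parseClass(Integer.class).toPrintable());
// On a 64-bit HotSpot JVM with compressed oops this shows a 12-byte header
// and the 4-byte int value at offset 12, i.e. a 16-byte instance.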
With this in mind, let's look at the output for 15_000_000 entries (since you are using a load factor of 1, there is no need to request an internal capacity of 16_000_000). It will take a lot of time, so be patient. I also ran it with
-Xmx12G and -Xms12G
HashMap<Integer, Integer> map = new HashMap<>(15_000_000, 1);
for (int i = 0; i < 15_000_000; ++i) {
    map.put(i, i);
}
System.out.println(GraphLayout.parseInstance(map).toFootprint());
Here is what jol said:
java.util.HashMap#9629756d footprint:
COUNT AVG SUM DESCRIPTION
1 67108880 67108880 [Ljava.util.HashMap$Node;
29999872 16 479997952 java.lang.Integer
1 48 48 java.util.HashMap
15000000 32 480000000 java.util.HashMap$Node
44999874 1027106880 (total)
Let's start from the bottom.
The total size of the HashMap footprint is 1,027,106,880 bytes, or roughly 1,027 MB.
A Node instance is the wrapper where each entry resides. It has a size of 32 bytes; there are 15 million entries, hence the line:
15000000 32 480000000 java.util.HashMap$Node
Why 32 bytes? It stores the hash code (4 bytes), key reference (4 bytes), value reference (4 bytes) and next Node reference (4 bytes), plus the 12-byte header and 4 bytes of padding, resulting in 32 bytes total.
1 48 48 java.util.HashMap
A single HashMap instance - 48 bytes for its internals.
If you really want to know why 48 bytes:
System.out.println(ClassLayout.parseClass(HashMap.class).toPrintable());
java.util.HashMap object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 12 (object header) N/A
12 4 Set AbstractMap.keySet N/A
16 4 Collection AbstractMap.values N/A
20 4 int HashMap.size N/A
24 4 int HashMap.modCount N/A
28 4 int HashMap.threshold N/A
32 4 float HashMap.loadFactor N/A
36 4 Node[] HashMap.table N/A
40 4 Set HashMap.entrySet N/A
44 4 (loss due to the next object alignment)
Instance size: 48 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
Next the Integer instances:
29999872 16 479997952 java.lang.Integer
30 million Integer objects, minus 128: for the values 0 to 127, the key and the value box to the same cached instance, so each of those entries contributes one Integer instead of two.
1 67108880 67108880 [Ljava.util.HashMap$Node;
We have 15_000_000 entries, but the internal array of a HashMap has a power-of-two size, so that is 16,777,216 references of 4 bytes each.
16_777_216 * 4 = 67_108_864, plus the 12-byte header and 4 bytes for the array length = 67,108,880 bytes.
Related
I need to know when a map in Java enlarges. For this I need a formula to calculate a good initial capacity.
In my project I need a large map which contains large objects. Therefore, I would like to prevent a resizing of the map by specifying a suitable initial capacity. By means of reflection I have looked at the behavior of maps.
package com.company;

import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public class Main {

    public static void main(String[] args) {
        Map m = new HashMap();
        int lastCapacity = 0, currentCapacity = 0;
        for (int i = 1; i <= 100_000; i++) {
            m.put(i, i);
            currentCapacity = getHashMapCapacity(m);
            if (currentCapacity > lastCapacity) {
                System.out.println(lastCapacity + " --> " + currentCapacity + " at " + i + " entries.");
                lastCapacity = currentCapacity;
            }
        }
    }

    public static int getHashMapCapacity(Map m) {
        int size = 0;
        Field tableField = null;
        try {
            tableField = HashMap.class.getDeclaredField("table");
            tableField.setAccessible(true);
            Object[] table = (Object[]) tableField.get(m);
            size = table == null ? 0 : table.length;
        } catch (NoSuchFieldException e) {
            e.printStackTrace();
        } catch (IllegalAccessException e) {
            e.printStackTrace();
        }
        return size;
    }
}
The output was:
0 --> 16 at 1 entries.
16 --> 32 at 13 entries.
32 --> 64 at 25 entries.
64 --> 128 at 49 entries.
128 --> 256 at 97 entries.
256 --> 512 at 193 entries.
512 --> 1024 at 385 entries.
1024 --> 2048 at 769 entries.
2048 --> 4096 at 1537 entries.
4096 --> 8192 at 3073 entries.
8192 --> 16384 at 6145 entries.
16384 --> 32768 at 12289 entries.
32768 --> 65536 at 24577 entries.
65536 --> 131072 at 49153 entries.
131072 --> 262144 at 98305 entries.
Can I assume that a map always behaves that way? Are there any differences between Java 7 and Java 8?
The easiest way to check out this sort of behaviour is to look at the openjdk source. It's all freely available and relatively easy to read.
In this case, checking HashMap, you will see there are some extensive implementation notes that explains how sizing works, what load factor is used as a threshold (which is driving the behaviour you are seeing), and even how the decision is made whether to use trees for the bin. Read through that and come back if it's not clear.
The code is pretty well optimised, with expansion being a very cheap operation. I suggest using a profiler to get some evidence that the performance issues are related to expansion before you do any tweaking.
As per the documentation:
The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html
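Putting the quoted rule into code, a map sized like this will never rehash for the expected number of entries (a sketch; expectedEntries is just an illustrative name):

int expectedEntries = 100_000;
float loadFactor = 0.75f;
// choose a capacity such that expectedEntries <= capacity * loadFactor, so no rehash ever occurs
Map<Integer, Integer> m = new HashMap<>((int) Math.ceil(expectedEntries / loadFactor), loadFactor);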
In looking at some profiling results, I noticed that using streams within a tight loop (used instead of another nested loop) incurred a significant memory overhead of objects of types java.util.stream.ReferencePipeline and java.util.ArrayList$ArrayListSpliterator. I converted the offending streams to foreach loops, and the memory consumption decreased significantly.
I know that streams make no promises about performing any better than ordinary loops, but I was under the impression that the difference would be negligible. In this case it seemed like it was a 40% increase.
Here is the test class I wrote to isolate the problem. I monitored memory consumption and object allocation with JFR:
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Random;
import java.util.function.Predicate;

public class StreamMemoryTest {

    private static boolean blackHole = false;

    public static List<Integer> getRandListOfSize(int size) {
        ArrayList<Integer> randList = new ArrayList<>(size);
        Random rnGen = new Random();
        for (int i = 0; i < size; i++) {
            randList.add(rnGen.nextInt(100));
        }
        return randList;
    }

    public static boolean getIndexOfNothingManualImpl(List<Integer> nums, Predicate<Integer> predicate) {
        for (Integer num : nums) {
            // Impossible condition
            if (predicate.test(num)) {
                return true;
            }
        }
        return false;
    }

    public static boolean getIndexOfNothingStreamImpl(List<Integer> nums, Predicate<Integer> predicate) {
        Optional<Integer> first = nums.stream().filter(predicate).findFirst();
        return first.isPresent();
    }

    public static void consume(boolean value) {
        blackHole = blackHole && value;
    }

    public static boolean result() {
        return blackHole;
    }

    public static void main(String[] args) {
        // 100 million trials
        int numTrials = 100000000;
        System.out.println("Beginning test");
        for (int i = 0; i < numTrials; i++) {
            List<Integer> randomNums = StreamMemoryTest.getRandListOfSize(100);
            consume(StreamMemoryTest.getIndexOfNothingStreamImpl(randomNums, x -> x < 0));
            // or ...
            // consume(StreamMemoryTest.getIndexOfNothingManualImpl(randomNums, x -> x < 0));
            if (randomNums == null) {
                break;
            }
        }
        System.out.print(StreamMemoryTest.result());
    }
}
Stream implementation:
Memory Allocated for TLABs 64.62 GB
Class Average Object Size(bytes) Total Object Size(bytes) TLABs Average TLAB Size(bytes) Total TLAB Size(bytes) Pressure(%)
java.lang.Object[] 415.974 6,226,712 14,969 2,999,696.432 44,902,455,888 64.711
java.util.stream.ReferencePipeline$2 64 131,264 2,051 2,902,510.795 5,953,049,640 8.579
java.util.stream.ReferencePipeline$Head 56 72,744 1,299 3,070,768.043 3,988,927,688 5.749
java.util.stream.ReferencePipeline$2$1 24 25,128 1,047 3,195,726.449 3,345,925,592 4.822
java.util.Random 32 30,976 968 3,041,212.372 2,943,893,576 4.243
java.util.ArrayList 24 24,576 1,024 2,720,615.594 2,785,910,368 4.015
java.util.stream.FindOps$FindSink$OfRef 24 18,864 786 3,369,412.295 2,648,358,064 3.817
java.util.ArrayList$ArrayListSpliterator 32 14,720 460 3,080,696.209 1,417,120,256 2.042
Manual implementation:
Memory Allocated for TLABs 46.06 GB
Class Average Object Size(bytes) Total Object Size(bytes) TLABs Average TLAB Size(bytes) Total TLAB Size(bytes) Pressure(%)
java.lang.Object[] 415.961 4,190,392 10,074 4,042,267.769 40,721,805,504 82.33
java.util.Random 32 32,064 1,002 4,367,131.521 4,375,865,784 8.847
java.util.ArrayList 24 14,976 624 3,530,601.038 2,203,095,048 4.454
Has anyone else encountered issues with the stream objects themselves consuming memory? / Is this a known issue?
Using the Stream API you do indeed allocate more memory, though your experimental setup is somewhat questionable. I've never used JFR, but my findings using JOL are quite similar to yours.
Note that you measure not only the heap allocated while querying the ArrayList, but also while creating and populating it. The allocations during the creation and population of a single ArrayList should look like this (64-bit, compressed OOPs, via JOL):
COUNT AVG SUM DESCRIPTION
1 416 416 [Ljava.lang.Object;
1 24 24 java.util.ArrayList
1 32 32 java.util.Random
1 24 24 java.util.concurrent.atomic.AtomicLong
4 496 (total)
So most of the allocated memory is the Object[] array used inside ArrayList to store the data. The AtomicLong is part of the Random class implementation. If you perform this 100_000_000 times, then you should have at least 496 * 10^8 / 2^30 = 46.2 GB allocated in both tests. Nevertheless, this part can be skipped, as it should be identical for both tests.
Another interesting thing here is inlining. JIT is smart enough to inline the whole getIndexOfNothingManualImpl (via java -XX:+UnlockDiagnosticVMOptions -XX:+PrintCompilation -XX:+PrintInlining StreamMemoryTest):
StreamMemoryTest::main # 13 (59 bytes)
...
# 30 StreamMemoryTest::getIndexOfNothingManualImpl (43 bytes) inline (hot)
# 1 java.util.ArrayList::iterator (10 bytes) inline (hot)
\-> TypeProfile (2132/2132 counts) = java/util/ArrayList
# 6 java.util.ArrayList$Itr::<init> (6 bytes) inline (hot)
# 2 java.util.ArrayList$Itr::<init> (26 bytes) inline (hot)
# 6 java.lang.Object::<init> (1 bytes) inline (hot)
# 8 java.util.ArrayList$Itr::hasNext (20 bytes) inline (hot)
\-> TypeProfile (215332/215332 counts) = java/util/ArrayList$Itr
# 8 java.util.ArrayList::access$100 (5 bytes) accessor
# 17 java.util.ArrayList$Itr::next (66 bytes) inline (hot)
# 1 java.util.ArrayList$Itr::checkForComodification (23 bytes) inline (hot)
# 14 java.util.ArrayList::access$100 (5 bytes) accessor
# 28 StreamMemoryTest$$Lambda$1/791452441::test (8 bytes) inline (hot)
\-> TypeProfile (213200/213200 counts) = StreamMemoryTest$$Lambda$1
# 4 StreamMemoryTest::lambda$main$0 (13 bytes) inline (hot)
# 1 java.lang.Integer::intValue (5 bytes) accessor
# 8 java.util.ArrayList$Itr::hasNext (20 bytes) inline (hot)
# 8 java.util.ArrayList::access$100 (5 bytes) accessor
# 33 StreamMemoryTest::consume (19 bytes) inline (hot)
Disassembly actually shows that no iterator allocation is performed after warm-up: because escape analysis successfully tells the JIT that the iterator object does not escape, it is simply scalarized. Were the iterator actually allocated, it would take an additional 32 bytes:
COUNT AVG SUM DESCRIPTION
1 32 32 java.util.ArrayList$Itr
1 32 (total)
Note that the JIT could also remove the iteration altogether. Your blackhole is false by default, so doing blackhole = blackhole && value does not change it regardless of value, and the calculation of value could be eliminated entirely, as it has no side effects. I'm not sure whether it actually did this (reading disassembly is quite hard for me), but it's possible.
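If you want to rule that out, a consume() whose result actually feeds the final output is a safer black hole (my own sketch, not from the question):

private static int sink;

public static void consume(boolean value) {
    sink += value ? 1 : 0; // the sum is printed at the end, so the computation stays observable
}

public static int result() {
    return sink;
}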
However, while getIndexOfNothingStreamImpl also seems to inline everything inside, escape analysis fails because there are too many interdependent objects inside the stream API, so actual allocations occur. Thus it really adds five additional objects per call (the table is composed manually from JOL outputs):
COUNT AVG SUM DESCRIPTION
1 32 32 java.util.ArrayList$ArrayListSpliterator
1 24 24 java.util.stream.FindOps$FindSink$OfRef
1 64 64 java.util.stream.ReferencePipeline$2
1 24 24 java.util.stream.ReferencePipeline$2$1
1 56 56 java.util.stream.ReferencePipeline$Head
5 200 (total)
So every invocation of this particular stream actually allocates 200 additional bytes. As you perform 100_000_000 iterations, the Stream version should in total allocate 10^8 * 200 / 2^30 = 18.62 GB more than the manual version, which is close to your result. I think the AtomicLong inside Random is scalarized as well, but both the Iterator and the AtomicLong are present during the warm-up iterations (until the JIT actually creates the most optimized version). This would explain the minor discrepancies in the numbers.
This additional 200-byte allocation does not depend on the stream size, but it does depend on the number of intermediate stream operations (in particular, every additional filter step would add 64 + 24 = 88 bytes more). However, note that these objects are usually short-lived, allocated quickly, and can be collected by a minor GC. In most real-life applications you probably don't have to worry about this.
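For instance, a variant of the test pipeline with a second filter would allocate two filter stages per call, roughly 88 bytes more than the single-filter version (purely illustrative, reusing nums from the test class above):

Optional<Integer> first = nums.stream()
        .filter(x -> x < 0)   // first ReferencePipeline$2 + its sink
        .filter(x -> x > 100) // second filter stage: roughly 64 + 24 bytes more per call
        .findFirst();
return first.isPresent();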
It's not only more memory due to the infrastructure that is needed to build the Stream API; it might also be slower (at least for such small inputs).
There is a presentation from one of the developers at Oracle (it is in Russian, but that is not the point) that shows a trivial example (not much more complicated than yours) where the speed of execution is 30% worse with Streams vs loops. He says that's pretty normal.
One thing I've noticed that not a lot of people realize is that using Streams (lambdas and method references, to be more precise) will also create (potentially) a lot of classes that you will not know about.
Try running your example with:
-Djdk.internal.lambda.dumpProxyClasses=/Some/Path/Of/Yours
and see how many additional classes are created by your code and by the code that Streams need (generated via ASM).
I have a file of ~50 million strings that I need to add to a symbol table of some sort on startup, then search several times with reasonable speed.
I tried using a DLB trie since lookup would be relatively fast, as all strings are < 10 characters, but while populating the DLB I would get either a "GC overhead limit exceeded" or an "out of memory: heap space" error. The same errors occurred with HashMap. This is for an assignment that will be compiled and run by a grader, so I would rather not just allocate more heap space. Is there a different data structure that would use less memory while still having reasonable lookup time?
If you expect low prefix sharing, then a trie may not be your best option.
Since you only load the lookup table once, at startup, and your goal is low memory footprint with "reasonable speed" for lookup, your best option is likely a sorted array and binary search for lookup.
First, you load the data into an array. Since you likely don't know the size up front, you load into an ArrayList. You then extract the final array from the list.
Assuming you load 50 million 10 character strings, memory will be:
10 character string:
String: 12 byte header + 4 byte 'hash' + 4 byte 'value' ref = 24 bytes (aligned)
char[]: 12 byte header + 4 byte 'length' + 10 * 2 byte 'char' = 40 bytes (aligned)
Total: 24 + 40 = 64 bytes
Array of 50 million 10 character strings:
String[]: 12 byte header + 4 byte 'length' + 50,000,000 * 4 byte 'String' ref = 200,000,016 bytes
Values: 50,000,000 * 64 bytes = 3,200,000,000 bytes
Total: 200,000,016 + 3,200,000,000 = 3,400,000,016 bytes = 3.2 GB
You will need another copy of the String[] when you convert the ArrayList<String> to String[]. The Arrays.sort() operation may need 50% array size (~100,000,000 bytes) for temporary storage, but if ArrayList is released for GC before you sort, that space can be reused.
So, total requirement is ~3.5 GB, just for the symbol table.
Now, if space is truly at a premium, you can squeeze that. As you can see, the String itself adds an overhead of 24 bytes, out of the 64 bytes. You can make the symbol table use char[] directly.
Also, if your strings are all US-ASCII or ISO-8859-1, you can convert the char[] to a byte[], saving half the bytes.
Combined, that reduces the value size from 64 bytes to 32 bytes, and the total symbol table size from 3.2 GB to 1.8 GB, or roughly 2 GB during loading.
UPDATE
Assuming the input list of strings is already sorted, below is an example of how to do this. As an MCVE, it just uses a small static array as input, but you can easily read the strings from a file instead.
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class Test {
    public static void main(String[] args) {
        String[] wordsFromFile = { "appear", "attack", "cellar", "copper",
                                   "erratic", "grotesque", "guitar", "guttural",
                                   "kittens", "mean", "suit", "trick" };
        List<byte[]> wordList = new ArrayList<>();
        for (String word : wordsFromFile) // Simulating read from file
            wordList.add(word.getBytes(StandardCharsets.US_ASCII));
        byte[][] symbolTable = wordList.toArray(new byte[wordList.size()][]);
        test(symbolTable, "abc");
        test(symbolTable, "attack");
        test(symbolTable, "car");
        test(symbolTable, "kittens");
        test(symbolTable, "xyz");
    }
    private static void test(byte[][] symbolTable, String word) {
        int idx = Arrays.binarySearch(symbolTable,
                                      word.getBytes(StandardCharsets.US_ASCII),
                                      Test::compare);
        if (idx < 0)
            System.out.println("Not found: " + word);
        else
            System.out.println("Found    : " + word);
    }
    private static int compare(byte[] w1, byte[] w2) {
        for (int i = 0, cmp; i < w1.length && i < w2.length; i++)
            if ((cmp = Byte.compare(w1[i], w2[i])) != 0)
                return cmp;
        return Integer.compare(w1.length, w2.length);
    }
}
Output
Not found: abc
Found : attack
Not found: car
Found : kittens
Not found: xyz
Use a single char array to store all strings (sorted), and an array of integers for the offsets. String n consists of the chars from offset[n - 1] (inclusive) to offset[n] (exclusive), treating offset[-1] as zero.
Memory usage will be 1 GB (50M * 10 * 2) for the char array and 200 MB (50M * 4) for the offset array. Very compact, even with two-byte chars.
You will have to build this array by merging smaller sorted string arrays in order not to exceed your heap space. But once you have it, it should be reasonably fast.
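A minimal sketch of that layout, assuming the strings are already sorted when you build it (the class and method names are my own, purely illustrative):

import java.util.Arrays;
import java.util.List;

public class PackedStringTable {
    private final char[] chars;   // all strings concatenated in sorted order
    private final int[] offsets;  // offsets[n] = end of string n (exclusive); its start is offsets[n - 1], or 0 for n == 0

    PackedStringTable(List<String> sortedWords) {
        offsets = new int[sortedWords.size()];
        int total = 0, i = 0;
        for (String w : sortedWords) { total += w.length(); offsets[i++] = total; }
        chars = new char[total];
        int pos = 0;
        for (String w : sortedWords) { w.getChars(0, w.length(), chars, pos); pos += w.length(); }
    }

    boolean contains(String word) {
        int lo = 0, hi = offsets.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = compareAt(mid, word);
            if (cmp < 0) lo = mid + 1;
            else if (cmp > 0) hi = mid - 1;
            else return true;
        }
        return false;
    }

    // Compare stored string number n against the query word, character by character
    private int compareAt(int n, String word) {
        int start = n == 0 ? 0 : offsets[n - 1], end = offsets[n];
        for (int i = 0; start + i < end && i < word.length(); i++) {
            int cmp = Character.compare(chars[start + i], word.charAt(i));
            if (cmp != 0) return cmp;
        }
        return Integer.compare(end - start, word.length());
    }

    public static void main(String[] args) {
        PackedStringTable t = new PackedStringTable(Arrays.asList("appear", "attack", "cellar", "copper"));
        System.out.println(t.contains("attack")); // true
        System.out.println(t.contains("car"));    // false
    }
}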
Alternatively, you could try a memory-optimized trie implementation such as https://github.com/rklaehn/radixtree . It uses not just prefix sharing, but also structural sharing for common suffixes, so unless your strings are completely random, it should be quite compact. See the space usage benchmark. But it is Scala, not Java.
For example, declaring a new int[n][n] in Java will result in n array references, each referring to an array of n elements.
If I declare a new int[n][], how much memory will this take? I suspect it is just n references to null, but I want to confirm this.
In Java we have the following approximate sizes:
int = 4 bytes
int[] = 4N + 24 bytes
int[][] ~ 4MN bytes
Array = 24 bytes of overhead + memory for each array entry
So your array new int[n][] is a one-dimensional array of n references, all initially null. It typically takes 4N + 24 bytes (24 bytes for the array object itself + 4 bytes for each reference).
By the way, this is JVM dependent, and maybe a more accurate answer is ~4N bytes plus the header information.
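You can confirm the "n references to null" part directly (a trivial check, not a measurement):

int[][] rows = new int[1000][];
System.out.println(rows.length); // 1000
System.out.println(rows[0]);     // null - the inner int[] arrays are not allocated yet
rows[0] = new int[1000];         // inner arrays can be allocated lazily, one at a time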
I have an ArrayList which is going to hold only two elements, and I want to specify an initial capacity of TWO, since the default initial capacity is ten.
List<Integer> values = new ArrayList<>(2);
Will I get any performance/memory benefit out of it?
Any discussion around this would be appreciated...
You will not get any performance benefit out of it, except for a very small reduction in memory usage.
If you're sure that the size is exactly two elements and it will never change, and to obtain a bit of a performance boost, simply use an array of primitive types (unless there's a really good reason to prefer Integer, an int is a better option):
int[] values = new int[2];
UPDATE
If you need to store mixed types, then use an Object[]. It's still a better alternative than using an ArrayList, if the size is fixed to two elements:
Object[] values = new Object[2];
Check out this post. EDIT: Some lists will resize after they are filled past a certain percentage (load factor), but this doesn't seem to be the case with ArrayLists.
Sorry for the mistake haha. Got hashtables and dynamic arrays a bit confused.
If you really want to know how ArrayLists work under the hood, check out the ArrayList source code. I think the ensureCapacity() method is what determines whether the backing array needs to be resized:
/**
 * Increases the capacity of this <tt>ArrayList</tt> instance, if
 * necessary, to ensure that it can hold at least the number of elements
 * specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
public void ensureCapacity(int minCapacity) {
    modCount++;
    int oldCapacity = elementData.length;
    if (minCapacity > oldCapacity) {
        Object oldData[] = elementData;
        int newCapacity = (oldCapacity * 3)/2 + 1;
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
}
And the new size happens to be:
int newCapacity = (oldCapacity * 3)/2 + 1;
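Starting from the default capacity of 10, that formula gives the following growth sequence (just the arithmetic applied a few times; note that newer JDKs grow by oldCapacity + (oldCapacity >> 1) instead):

int capacity = 10;
for (int i = 0; i < 5; i++) {
    System.out.println(capacity); // 10, 16, 25, 38, 58
    capacity = (capacity * 3) / 2 + 1;
}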
Hope that helps!