Question about java.util.ArrayList realization [duplicate] - java

The usual constructor of ArrayList is:
ArrayList<?> list = new ArrayList<>();
But there is also an overloaded constructor with a parameter for its initial capacity:
ArrayList<?> list = new ArrayList<>(20);
Why is it useful to create an ArrayList with an initial capacity when we can append to it as we please?

If you know in advance what the size of the ArrayList is going to be, it is more efficient to specify the initial capacity. If you don't do this, the internal array will have to be repeatedly reallocated as the list grows.
The larger the final list, the more time you save by avoiding the reallocations.
That said, even without pre-allocation, inserting n elements at the back of an ArrayList is guaranteed to take total O(n) time. In other words, appending an element is an amortized constant-time operation. This is achieved by having each reallocation increase the size of the array exponentially, typically by a factor of 1.5. With this approach, the total number of operations can be shown to be O(n).
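The amortized bound can be checked with a small simulation. The sketch below is not ArrayList's real code: `AmortizedGrowth` and `copiesForAppends` are hypothetical names, and the growth rule `(oldCapacity * 3) / 2 + 1` is an assumption taken from older JDK sources. It only counts how many element copies n appends would cause.

```java
/** Hypothetical simulation; assumes the growth rule (oldCapacity * 3) / 2 + 1. */
final class AmortizedGrowth {

    /** Returns how many element copies resizing causes while appending n elements. */
    static long copiesForAppends(int n, int initialCapacity) {
        int capacity = initialCapacity;
        long copies = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copies += size;                    // every stored element moves to the new array
                capacity = (capacity * 3) / 2 + 1; // assumed growth rule
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("default capacity 10: " + copiesForAppends(n, 10) + " copies");
        System.out.println("pre-sized to n:      " + copiesForAppends(n, n) + " copies");
    }
}
```

Under these assumptions the number of copies stays below 3n however large n gets, which is the O(n) total in concrete form; pre-sizing brings it down to zero.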

ArrayList is a dynamically resizing array: it is implemented as an array with an initial (default) fixed capacity. When that array fills up, it is replaced by a larger one (double the size in the simplest scheme) and every existing element is copied across. This operation is costly, so you want as few of them as possible.
So, if you know your upper bound is 20 items, creating the list with an initial capacity of 20 is better than starting from a default of, say, 15, resizing it to 15*2 = 30, and then using only 20 slots while wasting the cycles on the expansion.
P.S. - As AmitG says, the expansion factor is implementation specific (in this case (oldCapacity * 3)/2 + 1)

The default capacity of ArrayList is 10.
/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this(10);
}
So if you are going to add 100 or more records, you will pay the overhead of repeated memory reallocation.
ArrayList<?> list = new ArrayList<>();
// same as new ArrayList<>(10);
So if you have any idea how many elements will be stored in the ArrayList, it's better to create it with that capacity instead of starting with 10 and then growing it repeatedly.
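A minimal sketch of that advice, assuming the element count is known before the loop starts. The class and method names (`PresizedList`, `squares`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class PresizedList {

    /** Builds the squares of 0..count-1; the final size is known up front. */
    static List<Integer> squares(int count) {
        // count is the exact final size, so the backing array is allocated once.
        List<Integer> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add(i * i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(squares(5)); // [0, 1, 4, 9, 16]
    }
}
```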

I actually wrote a blog post on the topic 2 months ago. The article is for C#'s List<T> but Java's ArrayList has a very similar implementation. Since ArrayList is implemented using a dynamic array, it increases in size on demand. So the reason for the capacity constructor is for optimisation purposes.
When one of these resize operations occurs, the ArrayList copies the contents of the array into a new, larger array (about 1.5 times the old capacity in Java, twice the old capacity in C#'s List<T>). This operation runs in O(n) time.
Example
Here is an example of how the ArrayList would increase in size:
10
16
25
38
58
... 19 more resizes ...
198578
297868
446803
670205
1005308
So the list starts with a capacity of 10; when the 11th item is added, the capacity is increased by 50% + 1 to 16. On the 17th item the ArrayList is increased again to 25, and so on. Now consider the case where the desired capacity is already known to be 1000000. Creating the ArrayList without the size constructor will call ArrayList.add 1000000 times, each call taking O(1) normally or O(n) on a resize.
1000000 + 16 + 25 + ... + 670205 + 1005308 = 4015851 operations
Compare this with using the capacity constructor and then calling ArrayList.add, which then never has to resize and so always runs in O(1).
1000000 + 1000000 = 2000000 operations
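The capacity sequence and the first operation count above can be reproduced with a short program. This is a sketch under the same assumptions as the example (start at 10, grow to `(old * 3) / 2 + 1`, charge one operation per add plus one per slot of each newly allocated array); the names `GrowthSequence`, `capacitiesUntil` and `operationsWithoutPresizing` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class GrowthSequence {

    /** Capacities visited, starting at 10, until target elements fit. */
    static List<Integer> capacitiesUntil(int target) {
        List<Integer> capacities = new ArrayList<>();
        int capacity = 10;
        capacities.add(capacity);
        while (capacity < target) {
            capacity = (capacity * 3) / 2 + 1; // assumed growth rule: 50% + 1
            capacities.add(capacity);
        }
        return capacities;
    }

    /** One operation per add, plus one per slot of every array allocated by a resize. */
    static long operationsWithoutPresizing(int adds) {
        long operations = adds;
        List<Integer> capacities = capacitiesUntil(adds);
        for (int i = 1; i < capacities.size(); i++) { // skip the initial array of 10
            operations += capacities.get(i);
        }
        return operations;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesUntil(1_000_000));
        System.out.println(operationsWithoutPresizing(1_000_000)); // 4015851
    }
}
```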
Java vs C#
Java is as above, starting at 10 and increasing the capacity by 50% + 1 at each resize. C# starts at 4 and increases much more aggressively, doubling at each resize. The 1000000 adds example from above uses 3097084 operations in C#.
References
My blog post on C#'s List<T>
Java's ArrayList source code

Setting the initial size of an ArrayList, e.g. to ArrayList<>(100), reduces the number of times the re-allocation of internal memory has to occur.
Example:
ArrayList<Integer> example = new ArrayList<>(3);
example.add(1); // size() == 1
example.add(2); // size() == 2
example.add(2); // size() == 3, the initial capacity is now used up
example.add(3); // size() == 4, the backing array has been expanded so that the fourth element fits
As the example shows, an ArrayList expands when it needs to. What it doesn't show is by how much: the capacity typically grows by a constant factor such as 1.5 or 2 (the exact policy depends on your implementation). The following is quoted from Oracle:
"Each ArrayList instance has a capacity. The capacity is the size of
the array used to store the elements in the list. It is always at
least as large as the list size. As elements are added to an
ArrayList, its capacity grows automatically. The details of the growth
policy are not specified beyond the fact that adding an element has
constant amortized time cost."
Obviously, if you have no idea how many elements you will be holding, setting the capacity probably isn't a good idea. However, if you do have a specific range in mind, setting an initial capacity avoids repeated reallocation and keeps unused slots to a minimum.

An ArrayList can contain many values, and when doing large initial insertions you can tell it to allocate larger storage to begin with, so that it does not waste CPU cycles allocating more space as further items arrive. Allocating the space at the beginning is therefore more efficient.

This is to avoid the cost of a reallocation for every single object.
int newCapacity = (oldCapacity * 3)/2 + 1;
Internally a new Object[] is created, and the JVM has to do work to create it whenever the list needs more room. Without a growth formula like the one above (or any other algorithm you can think of), every call to arraylist.add() would have to create a new Object[] just one slot larger, which is pointless: we would lose time growing the array by 1 for each and every object added. So it is better to grow the Object[] with the following formula.
(The Java standard library (JSL) uses the forecasting formula given below to grow the ArrayList dynamically instead of growing it by 1 every time, because each growth step costs the JVM effort.)
int newCapacity = (oldCapacity * 3)/2 + 1;

I think each ArrayList is created with an initial capacity of 10. So if you create an ArrayList without passing a capacity to the constructor, it will still be created with that default value.

I'd say it's an optimization. An ArrayList created without an initial capacity will have ~10 empty slots and will expand as you add elements.
To have a list whose capacity is exactly the number of items it holds, you need to call trimToSize().

In my experience with ArrayList, giving an initial capacity is a nice way to avoid reallocation costs. But it comes with a caveat. All the suggestions above say that one should provide an initial capacity only when a rough estimate of the number of elements is known. If we give an initial capacity without any such estimate, the memory that is reserved but unused is wasted, as it may never be needed once the list has been filled to the required number of elements. What I am saying is that we can be pragmatic when allocating capacity at the beginning, and then find a smart way of determining the required minimal capacity at runtime. ArrayList provides a method called ensureCapacity(int minCapacity) for this. But then, one has to find that smart way...
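One way this could look in practice, as a sketch only: reserve room with `ensureCapacity` once the number of incoming elements becomes known at runtime, and release unused slack with `trimToSize` when the list is finished. Both are real `ArrayList` methods; the helper name `appendAll` and the class `CapacityHints` are hypothetical.

```java
import java.util.ArrayList;

final class CapacityHints {

    /** Appends extra values, reserving room first and releasing unused slack afterwards. */
    static ArrayList<String> appendAll(ArrayList<String> target, String[] extra) {
        // Grow at most once, to at least the size the list is about to reach.
        target.ensureCapacity(target.size() + extra.length);
        for (String value : extra) {
            target.add(value);
        }
        // Shrink the backing array to the current size; sensible only if the list is now final.
        target.trimToSize();
        return target;
    }

    public static void main(String[] args) {
        ArrayList<String> names = new ArrayList<>();
        names.add("a");
        System.out.println(appendAll(names, new String[] {"b", "c"})); // [a, b, c]
    }
}
```

Note that both methods are declared on `ArrayList`, not on the `List` interface, so the variable has to be typed as `ArrayList` to call them.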

I have tested ArrayList with and without initialCapacity and I got a surprising result.
When I set LOOP_NUMBER to 100,000 or less, the result is that setting initialCapacity is more efficient:
list1Stop-list1Start = 14
list2Stop-list2Start = 10
But when I set LOOP_NUMBER to 1,000,000 the result changes to:
list1Stop-list1Start = 40
list2Stop-list2Start = 66
In the end, I couldn't figure out why this happens.
Sample code:
import java.util.ArrayList;
import java.util.List;
public class CapacityTest {
    public static final int LOOP_NUMBER = 100000;
    public static void main(String[] args) {
        // list1: default capacity, grows on demand
        long list1Start = System.currentTimeMillis();
        List<Integer> list1 = new ArrayList<>();
        for (int i = 0; i < LOOP_NUMBER; i++) {
            list1.add(i);
        }
        long list1Stop = System.currentTimeMillis();
        System.out.println("list1Stop-list1Start = " + (list1Stop - list1Start));
        // list2: capacity reserved up front
        long list2Start = System.currentTimeMillis();
        List<Integer> list2 = new ArrayList<>(LOOP_NUMBER);
        for (int i = 0; i < LOOP_NUMBER; i++) {
            list2.add(i);
        }
        long list2Stop = System.currentTimeMillis();
        System.out.println("list2Stop-list2Start = " + (list2Stop - list2Start));
    }
}
I tested on Windows 8.1 with jdk1.7.0_80.
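One plausible explanation (an assumption, nothing measured here confirms it) is that a single timed run with `System.currentTimeMillis()` mostly measures JIT warm-up and garbage collection pauses rather than the resizing itself. A slightly more careful sketch repeats each measurement after a warm-up phase and uses `System.nanoTime()`; the names `AppendTiming`, `timeAppends` and `bestOf` are hypothetical, and a harness such as JMH would still be more trustworthy.

```java
import java.util.ArrayList;
import java.util.List;

final class AppendTiming {

    /** Returns the nanoseconds needed to append n Integers, with or without pre-sizing. */
    static long timeAppends(int n, boolean presize) {
        long start = System.nanoTime();
        List<Integer> list = presize ? new ArrayList<Integer>(n) : new ArrayList<Integer>();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        long elapsed = System.nanoTime() - start;
        if (list.size() != n) { // use the list so the loop cannot be optimised away
            throw new IllegalStateException("unexpected size");
        }
        return elapsed;
    }

    /** Best of several timed runs, taken after an equal number of discarded warm-up runs. */
    static long bestOf(int runs, int n, boolean presize) {
        for (int i = 0; i < runs; i++) {
            timeAppends(n, presize); // warm-up, result discarded
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            best = Math.min(best, timeAppends(n, presize));
        }
        return best;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("default capacity: " + bestOf(10, n, false) + " ns");
        System.out.println("pre-sized:        " + bestOf(10, n, true) + " ns");
    }
}
```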

Related

Why we can't add element into a List to nth index before adding element to (n-1)th index even if initial capacity is provided

Suppose I declare an array of int with size 10. I can then assign an element at index 4, and the code runs without any exception.
int[] ar = new int[10];
ar[4] = 8;
System.out.println(Arrays.toString(ar)); //works fine
That's because when I give the size as 10, that much memory is allocated for the array, with the default value of its type stored at each index.
But a List does not behave the same way. Say I declare a list with an initial capacity of 10 and try to add an element at index 4; it gives
java.lang.IndexOutOfBoundsException: Index: 4, Size: 0
List<Integer> list = new ArrayList<Integer>(10);
list.add(4, 8); //exception
Of course the size of the list is still 0 even when an initial capacity is given. Why is it not like an array? Is no memory allocated for the 10 elements of the list?
I was wondering whether there is any way to fill a List with default values once a capacity is given, just like an array.
This is what the JavaDoc for ArrayList says about add(int index, E element):
throws IndexOutOfBoundsException - if the index is out of range (index < 0 || index > size())
The size is the number of elements currently stored, not the current capacity.
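The difference can be demonstrated directly. In this sketch (class and method names are made up) a fresh list with capacity 10 rejects an insert at index 4 because its size is still 0, while an insert at index 0 is accepted.

```java
import java.util.ArrayList;
import java.util.List;

final class SizeVersusCapacity {

    /** Returns true if add(index, value) on a fresh list with the given capacity is rejected. */
    static boolean insertIsRejected(int initialCapacity, int index) {
        List<Integer> list = new ArrayList<>(initialCapacity);
        try {
            list.add(index, 8);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true; // index > size(), and size() is still 0
        }
    }

    public static void main(String[] args) {
        System.out.println(insertIsRejected(10, 4)); // true
        System.out.println(insertIsRejected(10, 0)); // false
    }
}
```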
The fact that your car has the "capacity" to drive at 100 mph doesn't mean that you can magically get from 0 to 90 mph within 1 second ;-)
In other words: size and capacity aren't the same. Capacity merely means: "this is the size the list can grow to before the underlying array needs to grow".
By now it should be clear that the initial capacity passed to the constructor is just a bit of memory management for the initial internal array, without any semantic meaning.
When the actual size() overflows the array, the array is reallocated.
There is no such thing as a bulk allocation with initial elements. However there is:
List<Integer> list = Collections.nCopies(10, Integer.valueOf(0));
Note that the list returned by Collections.nCopies is immutable. The Stream API introduced in Java 8 also offers ways to generate lists dynamically.
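Both ideas can be combined into a pre-filled list that behaves like the array in the question. This is a sketch with hypothetical helper names (`filled`, `range`); it copies the immutable result of `Collections.nCopies` into an `ArrayList` so that `set` works.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class PrefilledLists {

    /** A mutable list holding count copies of fill. */
    static List<Integer> filled(int count, int fill) {
        return new ArrayList<Integer>(Collections.nCopies(count, fill));
    }

    /** A list generated by a stream: here the values 0..count-1. */
    static List<Integer> range(int count) {
        return IntStream.range(0, count).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> list = filled(10, 0);
        list.set(4, 8); // allowed now: index 4 is below size() == 10
        System.out.println(list);     // [0, 0, 0, 0, 8, 0, 0, 0, 0, 0]
        System.out.println(range(3)); // [0, 1, 2]
    }
}
```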
You could do:
public <T> void add(List<T> list, int i, T obj) {
    while (list.size() < i) {
        list.add(null);
    }
    list.add(i, obj);
}
But it is immediately evident that you'll introduce nulls, which is unsafe and ugly, requiring null checks.
As others have mentioned in their answers, in List<Integer> list = new ArrayList<Integer>(10), the 10 specifies the initial capacity.
Specifying the initial capacity is only an optional thing. You exercise that option only when you use the particular constructor that accepts an initial capacity as an argument. When you use other constructors, you don't have control over the initial capacity.
You specify n as the initial capacity if you want the first n additions to the list to be as efficient as possible -- otherwise, there is a possibility that the addition of each individual item to the list results in some costly internal re-sizing and re-copying into the re-sized internal area.
The above does not answer the question of why you are not allowed to add an item at position 8, when there is no item at position 7.
As some have answered, that's because the API doc says so.
That's one way to answer it. But why does the API doc say so? Why are things designed so?
Things are designed so, because:
Adding an item at position 8, when there is no item at position 7, results in a gap (before position 8).
As a programmer, you will then have to keep track of what an item's position is, among all possible positions (the full capacity). Currently, as a programmer, you only keep track of what the item's position is, among all added items. Now, wouldn't that be a programming nightmare?

In Java 8, why is the default capacity of ArrayList now zero?

As I recall, before Java 8, the default capacity of ArrayList was 10.
Surprisingly, the comment on the default (void) constructor still says: Constructs an empty list with an initial capacity of ten.
From ArrayList.java:
/**
 * Shared empty array instance used for default sized empty instances. We
 * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
 * first element is added.
 */
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
...
/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
Technically, it's 10, not zero, if you allow for lazy initialisation of the backing array. See:
public boolean add(E e) {
    ensureCapacityInternal(size + 1);
    elementData[size++] = e;
    return true;
}
private void ensureCapacityInternal(int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    ensureExplicitCapacity(minCapacity);
}
where
/**
 * Default initial capacity.
 */
private static final int DEFAULT_CAPACITY = 10;
What you're referring to is just the zero-sized initial array object that is shared among all initially empty ArrayList objects. I.e. the capacity of 10 is guaranteed lazily, an optimisation that is present also in Java 7.
Admittedly, the constructor contract is not entirely accurate. Perhaps this is the source of confusion here.
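The lazy scheme can be illustrated with a stripped-down list. This is a simplified, hypothetical sketch (`LazyList` is not the JDK class and omits bounds checks, removal and iteration); it only shows the shared empty array being replaced by a ten-slot array on the first add.

```java
import java.util.Arrays;

/** Simplified, hypothetical illustration of lazy allocation; not the JDK's ArrayList. */
final class LazyList<E> {
    private static final int DEFAULT_CAPACITY = 10;
    private static final Object[] SHARED_EMPTY = {}; // shared by all empty instances

    private Object[] elementData = SHARED_EMPTY;
    private int size;

    void add(E e) {
        if (size == elementData.length) {
            grow();
        }
        elementData[size++] = e;
    }

    private void grow() {
        int newCapacity = (elementData == SHARED_EMPTY)
                ? DEFAULT_CAPACITY                                // first add inflates to 10
                : elementData.length + (elementData.length >> 1); // later: grow by about 50%
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    int size() { return size; }

    int capacity() { return elementData.length; }

    public static void main(String[] args) {
        LazyList<String> list = new LazyList<>();
        System.out.println(list.size() + " / " + list.capacity()); // 0 / 0
        list.add("first");
        System.out.println(list.size() + " / " + list.capacity()); // 1 / 10
    }
}
```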
Background
Here's an E-Mail by Mike Duigou
I have posted an updated version of the empty ArrayList and HashMap patch.
http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/
This revised implementation introduces no new fields to either class. For ArrayList the lazy allocation of the backing array occurs only if the list is created at default size. According to our performance analysis team, approximately 85% of ArrayList instances are created at default size so this optimization will be valid for an overwhelming majority of cases.
For HashMap, creative use is made of the threshold field to track the requested initial size until the bucket array is needed. On the read side the empty map case is tested with isEmpty(). On the write size a comparison of (table == EMPTY_TABLE) is used to detect the need to inflate the bucket array. In readObject there's a little more work to try to choose an efficient initial capacity.
From: http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-April/015585.html
In Java 8 the default capacity of an ArrayList is 0 until we add at least one object to it (you can call it lazy initialization).
Now the question is: why was this change made in Java 8?
The answer is to save memory. Millions of ArrayList objects are created in real-world Java applications. A default size of 10 objects means that we allocate 10 pointers (40 or 80 bytes) for the underlying array at creation and fill them with nulls.
An empty array (filled with nulls) still occupies a lot of memory.
Lazy initialization postpones this memory consumption until the moment you actually use the array list.
Please see the code below for help.
ArrayList<String> a1 = new ArrayList<>(); // Size: 0, Capacity: 0
ArrayList<String> a2 = new ArrayList<>(5); // Size: 0, Capacity: 5
ArrayList<String> a3 = new ArrayList<>(new ArrayList<String>(5)); // Size: 0, Capacity: 0
a1.add( "shailesh" ); // Size: 1, Capacity: 10
// Requires: import java.lang.reflect.Field; import java.util.ArrayList;
public static void main( String[] args )
    throws Exception
{
    ArrayList<String> al = new ArrayList<>();
    getCapacity( al );
    al.add( "shailesh" );
    getCapacity( al );
}
// Reads the length of the private backing array via reflection. On recent JDKs (16 and later)
// this needs the JVM option --add-opens java.base/java.util=ALL-UNNAMED.
static void getCapacity( ArrayList<?> l )
    throws Exception
{
    Field dataField = ArrayList.class.getDeclaredField( "elementData" );
    dataField.setAccessible( true );
    System.out.format( "Size: %2d, Capacity: %2d%n", l.size(), ( (Object[]) dataField.get( l ) ).length );
}
Output:
Size: 0, Capacity: 0
Size: 1, Capacity: 10
The article "Default capacity of ArrayList in Java 8" explains it in detail.
If the very first operation that is done with an ArrayList is to pass addAll a collection which has more than ten elements, then any effort put into creating an initial ten-element array to hold the ArrayList's contents would be thrown out the window. Whenever something is added to an ArrayList it's necessary to test whether the size of the resulting list will exceed the size of the backing store; allowing the initial backing store to have size zero rather than ten will cause this test to fail one extra time in the lifetime of a list whose first operation is an "add" which would require creating the initial ten-item array, but that cost is less than the cost of creating a ten-item array that never ends up getting used.
That having been said, it might have been possible to improve performance further in some contexts if there were an overload of "addAll" which specified how many items (if any) would likely be added to the list after the present one, and which could use that to influence its allocation behavior. In some cases code which adds the last few items to a list will have a pretty good idea that the list is never going to need any space beyond that. There are many situations where a list will get populated once and never modified after that. If, at the point where code knows that the ultimate size of a list will be 170 elements, the list has 150 elements and a backing store of size 160, growing the backing store to size 320 will be unhelpful, and leaving it at size 320 or trimming it to 170 will be less efficient than simply having the next allocation grow it to 170.
The question is 'why?'.
Memory profiling inspections (for example https://www.yourkit.com/docs/java/help/inspections_mem.jsp#sparse_arrays) show that empty arrays (filled with nulls) occupy tons of memory.
A default size of 10 objects means that we allocate 10 pointers (40 or 80 bytes) for the underlying array at creation and fill them with nulls. Real Java applications create millions of array lists.
The modification postpones this memory consumption until the moment you actually use the array list.
After reading the question above I went through the ArrayList documentation for Java 8 and found that the default size is still 10.
The default ArrayList size in Java 8 is still 10. The only change made in Java 8 is that if a coder adds fewer than 10 elements, the remaining blank places in the ArrayList are not set to null. I say so because I have run into this situation myself, and Eclipse made me look into this Java 8 change.
You can verify this change by looking at the screenshot below. In it you can see that the ArrayList size is given as 10 in Object[10], but only 7 elements are displayed; the remaining null-valued elements are not shown. In Java 7 the same screenshot differs in just one respect: the null-valued elements are displayed too, so the coder needs to write code to handle null values when iterating over the complete array list, while in Java 8 this burden is lifted from the developer.
Screen shot link.

Adding value to list initialize list size

I have the code below:
Hashtable<Integer, List<Model>> map = new Hashtable<Integer, List<Model>>();
for (int i = 0; i < arraylistAssignment.size(); i++) {
    List<Model> temp = null;
    for (int j = 0; j < arraylistModel.size(); j++) {
        if (arraylistAssignment.get(i).getId() == arraylistModel.get(j).getId()) {
            if (temp == null)
                temp = new ArrayList<Model>(); // DEBUG POINT 1
            temp.add(arraylistModel.get(j));
        } // DEBUG POINT 2, AFTER THE add CALL ABOVE
    }
    map.put(arraylistAssignment.get(i).getId(), temp);
}
At debug point 1, right after I have initialized the temp variable, the debugger shows the object's size as 0.
But as soon as I add an element with temp.add, the size is 1 while the debugger shows 12 objects, 11 of which are null. I could not understand the reason for the null values here. Can anyone please explain? Am I initializing it wrong?
An ArrayList is a dynamic array, which means that it grows as elements are added. But it doesn't change its capacity "one by one". The capacity grows by a "reasonable" amount each time, so that the resize operation is not repeated every time you add an element, because that would be inefficient.
The reason for null values is because that's how ArrayLists work on the inside. They start off with a blank array inside, and as you add things they resize themselves as they see fit. The reason the array is larger than the number of objects you put in is because it'd be highly inefficient to resize the array every time you added something, so instead the ArrayList implementers just made the inner array start off at a certain size and approximately double in size every time it needs to be resized. They track how many elements you put in by tracking a separate size variable.
So in other words, you're initializing things just fine. Don't worry about the internals of the ArrayList -- if you look at the internal size variable, you'll see that it is 1, just as you expect.
ArrayList is a data structure in the Collections framework that is built on top of arrays, i.e. it is implemented with the help of an array. Since an array needs a defined size, the backing array is created with an initial capacity (10 by default).
Now you might wonder how this can be dynamic. When the size reaches the capacity, the list creates a new, larger array, copies the old contents over and discards the previous array. I would recommend taking a look at the implementation.
To the user the list looks dynamic, but in the debugger you see the unused slots of the backing array as nulls. However many slots the debugger shows, for the public API your newly added item is still the first element.
Check here for complete implementation of ArrayList: link
From the Java documentation:
http://docs.oracle.com/javase/7/docs/api/java/util/ArrayList.html
Each ArrayList instance has a capacity. The capacity is the size of the array used to store the elements in the list. It is always at least as large as the list size. As elements are added to an ArrayList, its capacity grows automatically. The details of the growth policy are not specified beyond the fact that adding an element has constant amortized time cost.
Default capacity is somehow 12 in your case, even though it should be 10.

Why start an ArrayList with an initial capacity?

The usual constructor of ArrayList is:
ArrayList<?> list = new ArrayList<>();
But there is also an overloaded constructor with a parameter for its initial capacity:
ArrayList<?> list = new ArrayList<>(20);
Why is it useful to create an ArrayList with an initial capacity when we can append to it as we please?
If you know in advance what the size of the ArrayList is going to be, it is more efficient to specify the initial capacity. If you don't do this, the internal array will have to be repeatedly reallocated as the list grows.
The larger the final list, the more time you save by avoiding the reallocations.
That said, even without pre-allocation, inserting n elements at the back of an ArrayList is guaranteed to take total O(n) time. In other words, appending an element is an amortized constant-time operation. This is achieved by having each reallocation increase the size of the array exponentially, typically by a factor of 1.5. With this approach, the total number of operations can be shown to be O(n).
ArrayList is a dynamically resizing array data structure, which means it is implemented as an array with an initial (default) fixed size. When this fills up, the array is extended to a larger one, for example one of double the size. This operation is costly, so you want as few of them as possible.
So, if you know your upper bound is 20 items, then creating the array with an initial length of 20 is better than using a default of, say, 15, then resizing it to 15*2 = 30 and using only 20 while wasting the cycles for the expansion.
P.S. - As AmitG says, the expansion factor is implementation specific (in this case (oldCapacity * 3)/2 + 1)
The default capacity of ArrayList is 10.
/**
 * Constructs an empty list with an initial capacity of ten.
 */
public ArrayList() {
    this(10);
}
So if you are going to add 100 or more records, you can see the overhead of memory reallocation.
ArrayList<?> list = new ArrayList<>();
// same as new ArrayList<>(10);
So if you have any idea how many elements will be stored in the ArrayList, it's better to create the ArrayList with that capacity instead of starting with 10 and then increasing it repeatedly.
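As a small illustration of that advice, here is a sketch in which the final size is known up front because the result has exactly one element per input element. The class name PresizedCopy and the helper squares are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class PresizedCopy {
    // Hypothetical helper: the result has one element per input element,
    // so the required capacity is known before the first add().
    static List<Integer> squares(List<Integer> input) {
        List<Integer> result = new ArrayList<>(input.size());
        for (int value : input) {
            result.add(value * value); // never triggers a resize
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>();
        input.add(1);
        input.add(2);
        input.add(3);
        System.out.println(squares(input)); // prints [1, 4, 9]
    }
}
```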
I actually wrote a blog post on the topic 2 months ago. The article is for C#'s List<T> but Java's ArrayList has a very similar implementation. Since ArrayList is implemented using a dynamic array, it increases in size on demand. So the reason for the capacity constructor is for optimisation purposes.
When one of these resizing operations occurs, the ArrayList copies the contents of the array into a new, larger array (twice the capacity of the old one in C#; about 50% larger in Java, as the example below shows). This operation runs in O(n) time.
Example
Here is an example of how the ArrayList would increase in size:
10
16
25
38
58
... 17 resizes ...
198578
297868
446803
670205
1005308
So the list starts with a capacity of 10; when the 11th item is added, it is increased by 50% + 1 to 16. On the 17th item the ArrayList is increased again to 25, and so on. Now consider the example where we're creating a list whose desired capacity is already known to be 1000000. Creating the ArrayList without the size constructor will call ArrayList.add 1000000 times, each of which takes O(1) normally or O(n) on a resize:
1000000 + 16 + 25 + ... + 670205 + 1005308 = 4015851 operations
Compare this with using the capacity constructor and then calling ArrayList.add, which is then guaranteed to run in O(1):
1000000 + 1000000 = 2000000 operations
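The two totals above can be reproduced with a short simulation. This is only a sketch of the counting model used in this answer (one operation per add, plus one operation per slot of each newly allocated array), with the growth rule (oldCapacity * 3) / 2 + 1 quoted in this thread; it does not touch a real ArrayList, and the class and method names are invented for the example.

```java
class GrowthCost {
    // Growth rule quoted in this thread for the Java implementation discussed here.
    static int grow(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    // Sum of the capacities of all arrays allocated by resizes while
    // appending `adds` elements to a list that starts at `initialCapacity`.
    static long resizeCost(int initialCapacity, int adds) {
        long cost = 0;
        int capacity = initialCapacity;
        while (capacity < adds) {
            capacity = grow(capacity);
            cost += capacity;
        }
        return cost;
    }

    public static void main(String[] args) {
        int adds = 1000000;
        // Default capacity of 10: the adds plus every resize.
        System.out.println(adds + resizeCost(10, adds));            // prints 4015851
        // Presized: one up-front allocation of `adds` slots, then no resizes.
        System.out.println((long) adds + adds + resizeCost(adds, adds)); // prints 2000000
    }
}
```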
Java vs C#
Java is as above, starting at 10 and increasing by 50% + 1 at each resize. C# starts at 4 and increases much more aggressively, doubling at each resize. The 1000000 adds example from above uses 3097084 operations in C#.
References
My blog post on C#'s List<T>
Java's ArrayList source code
Setting the initial size of an ArrayList, e.g. to ArrayList<>(100), reduces the number of times the re-allocation of internal memory has to occur.
Example:
ArrayList<Integer> example = new ArrayList<>(3);
example.add(1); // size() == 1
example.add(2); // size() == 2
example.add(2); // size() == 3, example has been 'filled'
example.add(3); // size() == 4, example has been 'expanded' so that the fourth element can be added.
As you can see in the above example, an ArrayList can be expanded when needed. What this doesn't show you is that the capacity of the ArrayList usually grows by a multiplicative factor (often described as doubling, although the new capacity depends on your implementation). The following is quoted from Oracle:
"Each ArrayList instance has a capacity. The capacity is the size of
the array used to store the elements in the list. It is always at
least as large as the list size. As elements are added to an
ArrayList, its capacity grows automatically. The details of the growth
policy are not specified beyond the fact that adding an element has
constant amortized time cost."
Obviously, if you have no idea how many elements you will be holding, setting the capacity probably isn't a good idea. However, if you do have a specific range in mind, setting an initial capacity will increase memory efficiency.
An ArrayList can contain many values, and when doing large initial insertions you can tell the ArrayList to allocate larger storage to begin with, so as not to waste CPU cycles when it tries to allocate more space for the next item. Allocating some space at the beginning is therefore more efficient.
This is to avoid the effort of a possible reallocation for every single object.
int newCapacity = (oldCapacity * 3)/2 + 1;
Internally, a new Object[] is created. The JVM needs effort to create a new Object[] when you add an element to the ArrayList. Without the code above (or some similar algorithm) for reallocation, every call to arraylist.add() would have to create a new Object[], which is pointless: we would lose time growing the array by 1 for each and every object added. So it is better to grow the Object[] with the following formula.
(The JSL uses the forecasting formula given below to grow the ArrayList dynamically instead of growing it by 1 every time, because growing takes effort by the JVM.)
int newCapacity = (oldCapacity * 3)/2 + 1;
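To make the difference concrete, the sketch below counts how many new arrays would have to be allocated while adding elements, once with a grow-by-1 policy and once with the formula above. It only simulates capacities with plain ints; the class and method names are hypothetical and no real ArrayList is involved.

```java
class ReallocationCount {
    // Counts how many times a new backing array would be needed while
    // appending `adds` elements, starting from `initialCapacity`.
    static int reallocations(int initialCapacity, int adds, boolean growByOne) {
        int capacity = initialCapacity;
        int count = 0;
        for (int size = 0; size < adds; size++) {
            if (size == capacity) { // array is full, a bigger one is needed
                capacity = growByOne ? capacity + 1 : (capacity * 3) / 2 + 1;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(reallocations(10, 1000, true));  // prints 990
        System.out.println(reallocations(10, 1000, false)); // prints 11
    }
}
```

With 1000 adds, growing by 1 reallocates on almost every add, while the formula needs only a handful of reallocations.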
I think each ArrayList is created with an initial capacity of 10. So if you create an ArrayList without setting the capacity in the constructor, it will be created with that default value.
I'd say it's an optimization. An ArrayList without an initial capacity will have ~10 empty slots and will expand as you add elements.
To have a list whose capacity is exactly the number of items, you need to call trimToSize().
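A minimal sketch of trimToSize(), assuming a list that was deliberately over-allocated. Note that trimToSize() is declared on ArrayList, not on the List interface, and that the capacity itself is not observable through the public API, so the effect only shows up in memory use. The class name and the capacity of 100 are arbitrary choices for the example.

```java
import java.util.ArrayList;

class TrimExample {
    static ArrayList<String> build() {
        // Over-allocate on purpose: room for 100 elements, only 3 used.
        ArrayList<String> names = new ArrayList<>(100);
        names.add("a");
        names.add("b");
        names.add("c");
        // Shrinks the backing array to the current size; contents are unchanged.
        names.trimToSize();
        return names;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints [a, b, c]
    }
}
```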
In my experience with ArrayList, giving an initial capacity is a nice way to avoid reallocation costs, but it comes with a caveat. All the suggestions above say that one should provide an initial capacity only when a rough estimate of the number of elements is known. If we pick an initial capacity without any idea of the final size, the memory that is reserved but unused is wasted, as it may never be required once the list is filled to the required number of elements. What I am saying is that we can be pragmatic when allocating capacity at the beginning, and then find a smart way of learning the required minimal capacity at runtime. ArrayList provides a method called ensureCapacity(int minCapacity). But then, one has to find a smart way...
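One way this could look in practice, as a sketch only: start with the default capacity and call ensureCapacity once the size of an incoming batch becomes known at runtime. The helper appendBatch and the class name are invented for illustration; ensureCapacity(int) itself is a real ArrayList method.

```java
import java.util.ArrayList;

class EnsureCapacityExample {
    // Hypothetical helper: the batch size is only known at runtime, so the
    // capacity is raised once, right before the bulk of the adds.
    static ArrayList<Integer> appendBatch(ArrayList<Integer> target, int[] batch) {
        target.ensureCapacity(target.size() + batch.length);
        for (int value : batch) {
            target.add(value); // no further resizes for this batch
        }
        return target;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<>(); // default capacity
        appendBatch(list, new int[] {1, 2, 3, 4});
        System.out.println(list); // prints [1, 2, 3, 4]
    }
}
```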
I have tested ArrayList with and without initialCapacity and I got a surprising result.
When I set LOOP_NUMBER to 100,000 or less, the result is that setting initialCapacity is more efficient:
list1Stop-list1Start = 14
list2Stop-list2Start = 10
But when I set LOOP_NUMBER to 1,000,000 the result changes to:
list1Stop-list1Start = 40
list2Stop-list2Start = 66
In the end, I couldn't figure out why this happens.
Sample code:
import java.util.ArrayList;
import java.util.List;

// Wrapper class added so the snippet compiles; the name is arbitrary.
public class InitialCapacityTest {
    public static final int LOOP_NUMBER = 100000;

    public static void main(String[] args) {
        // list1: default capacity, grows on demand
        long list1Start = System.currentTimeMillis();
        List<Integer> list1 = new ArrayList<>();
        for (int i = 0; i < LOOP_NUMBER; i++) {
            list1.add(i);
        }
        long list1Stop = System.currentTimeMillis();
        System.out.println("list1Stop-list1Start = " + (list1Stop - list1Start));

        // list2: capacity set up front to the final size
        long list2Start = System.currentTimeMillis();
        List<Integer> list2 = new ArrayList<>(LOOP_NUMBER);
        for (int i = 0; i < LOOP_NUMBER; i++) {
            list2.add(i);
        }
        long list2Stop = System.currentTimeMillis();
        System.out.println("list2Stop-list2Start = " + (list2Stop - list2Start));
    }
}
I tested on Windows 8.1 with jdk1.7.0_80.
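A single timed pass like the one above is easy to distort. One possible factor, which is an assumption and not something confirmed in this thread, is that JIT compilation and garbage collection affect whichever loop happens to run at the wrong moment. A slightly fairer sketch repeats both measurements several times in the same JVM and uses System.nanoTime(); the class name, method names and the round count of 10 are arbitrary choices, and a proper harness would still be more reliable than this.

```java
import java.util.ArrayList;
import java.util.List;

class FillTiming {
    // Fills a list created with the given initial capacity with n Integers.
    static List<Integer> fill(int initialCapacity, int n) {
        List<Integer> list = new ArrayList<>(initialCapacity);
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Returns the elapsed time of one fill in nanoseconds.
    static long timeNanos(int initialCapacity, int n) {
        long start = System.nanoTime();
        List<Integer> list = fill(initialCapacity, n);
        long elapsed = System.nanoTime() - start;
        if (list.size() != n) { // also keeps the list reachable until after timing
            throw new IllegalStateException("unexpected size");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1000000;
        // Early rounds act as warm-up; compare the later rounds only.
        for (int round = 0; round < 10; round++) {
            long withDefault = timeNanos(10, n);
            long presized = timeNanos(n, n);
            System.out.println("round " + round
                    + ": default = " + withDefault / 1000000 + " ms"
                    + ", presized = " + presized / 1000000 + " ms");
        }
    }
}
```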
