What is the difference between the following declarations:
List list1 = new ArrayList();
List list2 = new ArrayList(10);
By default it allocates a capacity of 10. But is there any difference?
Can I add an 11th element to list2 by list2.add("something")?
Here is the source code of the constructor used in the first example:
public ArrayList() {
this(10);
}
So there is no difference. Since the default initial capacity is 10, the list is initialised with capacity 10 whether you pass 10 or not.
Can I add 11th element in the list2 by list2.add("something")?
Of course. The initial capacity is not the final capacity. As you keep adding elements beyond the first 10, the capacity of the list keeps increasing.
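A minimal sketch of that point, using a hypothetical helper class (GrowBeyondCapacity and sizeAfterAdds are illustrative names, not part of the JDK): a list created with an initial capacity of 10 accepts an 11th element without complaint.

```java
import java.util.ArrayList;
import java.util.List;

class GrowBeyondCapacity {
    // Adds `count` strings to a list created with the given initial capacity
    // and returns the resulting size.
    static int sizeAfterAdds(int initialCapacity, int count) {
        List<String> list = new ArrayList<>(initialCapacity);
        for (int i = 0; i < count; i++) {
            list.add("something " + i);
        }
        return list.size();
    }

    public static void main(String[] args) {
        // The 11th add succeeds even though the initial capacity was 10.
        System.out.println(sizeAfterAdds(10, 11));
    }
}
```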
If you want to have a fixed size container, use Arrays.asList (or, for primitive arrays, the asList methods in Guava) and also consider java.util.Collections.unmodifiableList()
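As a rough illustration of the two options (the class and method names here are made up for the example): a list returned by Arrays.asList lets you replace elements but not change the size, while Collections.unmodifiableList rejects every modification.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FixedSizeViews {
    // Returns true if add() is rejected on a list backed by a fixed-size array.
    static boolean addRejected() {
        List<String> fixed = Arrays.asList("a", "b", "c");
        fixed.set(0, "z");      // replacing an element is allowed
        try {
            fixed.add("d");     // changing the size is not
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Returns true if set() is rejected on an unmodifiable wrapper.
    static boolean setRejected() {
        List<String> readOnly =
                Collections.unmodifiableList(Arrays.asList("a", "b"));
        try {
            readOnly.set(0, "z");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }
}
```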
Worth reading about this change in Java 8 : In Java 8, why is the default capacity of ArrayList now zero?
In short, providing an initial capacity won't really change anything in terms of size.
You can always add elements to a list. However, the underlying array used by the ArrayList is initialized with either the default capacity of 10 or the capacity you specify when constructing the ArrayList. This means that if you add, e.g., the 11th element, the array has to grow, which is done by copying the contents of the array to a new, bigger array instance. This of course takes time depending on the size of the list/array. So if you already know that your list will hold thousands of elements, it is faster to initialize the list with that approximate size.
ArrayLists in Java are auto-growable, and will resize themselves if they need to in order to add additional elements. The size parameter in the constructor is just used for the initial size of the internal array, and is a sort of optimization for when you know exactly what you're going to use the array for.
Specifying this initial capacity is often a premature optimization, but if you really need an ArrayList of 10 elements, you should specify it explicitly, not assume that the default size is 10. Although this really used to be the default behavior (up to JDK 7, IIRC), you should not rely on it - JDK 8 (checked with java-1.8.0-openjdk-1.8.0.101-1.b14.fc24.x86_64 I have installed) creates empty ArrayLists by default.
The other answers have explained really well, but just to keep things relevant, in JDK 1.7.0_95:
/**
* Constructs a new {@code ArrayList} instance with zero initial capacity.
*/
public ArrayList() {
array = EmptyArray.OBJECT;
}
/**
* Constructs a new instance of {@code ArrayList} with the specified
* initial capacity.
*
* @param capacity
* the initial capacity of this {@code ArrayList}.
*/
public ArrayList(int capacity) {
if (capacity < 0) {
throw new IllegalArgumentException("capacity < 0: " + capacity);
}
array = (capacity == 0 ? EmptyArray.OBJECT : new Object[capacity]);
}
As the comment mentions, the constructor accepting no arguments initializes an ArrayList with zero initial capacity.
And even more interesting here is a variable (with a comment) that lends a lot of information on its own:
/**
* The minimum amount by which the capacity of an ArrayList will increase.
* This tuning parameter controls a time-space tradeoff. This value (12)
* gives empirically good results and is arguably consistent with the
* RI's specified default initial capacity of 10: instead of 10, we start
* with 0 (sans allocation) and jump to 12.
*/
private static final int MIN_CAPACITY_INCREMENT = 12;
You just picked the perfect example. Both do the same thing, as new ArrayList() calls this(10) ;) Internally, though, this only creates the backing array with a length of 10. The ArrayList#size method, on the other hand, just returns a size variable, which changes only when elements are added or removed. This variable is also what triggers IndexOutOfBoundsException, so you won't be able to insert at an arbitrary index just because it lies within the capacity.
If you check the code of ArrayList, for example, you'll notice that the indexed method ArrayList#add(int, E) performs a range check. The range check only cares about the size variable and not the actual length of the array holding the data for the List.
Because of this, you still won't be able to insert data at index 5, for example. The internal length of the data array at this point will be 10, but as you haven't added anything to your List, the size variable will still be 0 and you'll get an IndexOutOfBoundsException when you try to do so.
Just try calling list.size() after initializing the List with any capacity, and you'll notice the returned size is 0.
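A small sketch of both observations, with hypothetical helper names (SizeIsNotCapacity, sizeOfFreshList, insertAtIndexFails): the size of a freshly constructed list is 0 regardless of the capacity argument, and an indexed add beyond the current size is rejected.

```java
import java.util.ArrayList;
import java.util.List;

class SizeIsNotCapacity {
    // The size of a new list is 0 no matter what capacity was requested.
    static int sizeOfFreshList(int initialCapacity) {
        return new ArrayList<String>(initialCapacity).size();
    }

    // Returns true if inserting at `index` into a fresh list throws.
    static boolean insertAtIndexFails(int initialCapacity, int index) {
        List<String> list = new ArrayList<>(initialCapacity);
        try {
            list.add(index, "x");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

Inserting at index 0 works on an empty list because the rule is index <= size(), not index < capacity.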
The initialization of ArrayList has been optimized since JDK 1.7 update 40, and there is a good explanation of the two different behaviours in the article java-optimization-empty-arraylist-and-Hashmap-cost-less-memory-jdk-17040-update.
So before Java 1.7u40 there is no difference, but from that version on there is quite a substantial difference.
This difference is about performance optimization and doesn't change the contract of List.add(E e) and ArrayList(int initialCapacity).
Related
The usual constructor of ArrayList is:
ArrayList<?> list = new ArrayList<>();
But there is also an overloaded constructor with a parameter for its initial capacity:
ArrayList<?> list = new ArrayList<>(20);
Why is it useful to create an ArrayList with an initial capacity when we can append to it as we please?
If you know in advance what the size of the ArrayList is going to be, it is more efficient to specify the initial capacity. If you don't do this, the internal array will have to be repeatedly reallocated as the list grows.
The larger the final list, the more time you save by avoiding the reallocations.
That said, even without pre-allocation, inserting n elements at the back of an ArrayList is guaranteed to take total O(n) time. In other words, appending an element is an amortized constant-time operation. This is achieved by having each reallocation increase the size of the array exponentially, typically by a factor of 1.5. With this approach, the total number of operations can be shown to be O(n).
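The amortized bound can be checked with a small simulation. This is only a model, not the JDK code: it assumes a starting capacity of 10 and growth to old + old/2 whenever the array is full, and it counts how many elements get copied in total. GrowthSimulation and copiesForAppends are names invented for this sketch.

```java
class GrowthSimulation {
    // Counts element copies caused by resizing while appending n elements,
    // assuming capacity starts at 10 and grows by a factor of about 1.5.
    static long copiesForAppends(int n) {
        int capacity = 10;
        long copied = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                copied += size;                       // copy all current elements
                capacity = capacity + (capacity >> 1); // grow by about 50%
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println(copiesForAppends(n) + " copies for " + n + " appends");
    }
}
```

Under these assumptions the copy count stays below 3n, because the resize costs form a geometric series dominated by the last resize.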
Because ArrayList is a dynamically resizing array data structure, which means it is implemented as an array with an initial (default) fixed size. When this gets filled up, the array is replaced by a larger one and the contents are copied over. This operation is costly, so you want as few of them as possible.
So, if you know your upper bound is 20 items, then creating the array with initial length of 20 is better than using a default of, say, 15 and then resize it to 15*2 = 30 and use only 20 while wasting the cycles for the expansion.
P.S. - As AmitG says, the expansion factor is implementation specific (in this case (oldCapacity * 3)/2 + 1)
The default capacity of ArrayList is 10.
/**
* Constructs an empty list with an initial capacity of ten.
*/
public ArrayList() {
this(10);
}
So if you are going to add 100 or more records, you can see the overhead of memory reallocation.
ArrayList<?> list = new ArrayList<>();
// same as new ArrayList<>(10);
So if you have any idea how many elements will be stored in the ArrayList, it's better to create the ArrayList with that capacity instead of starting with 10 and then growing it repeatedly.
I actually wrote a blog post on the topic 2 months ago. The article is for C#'s List<T> but Java's ArrayList has a very similar implementation. Since ArrayList is implemented using a dynamic array, it increases in size on demand. So the reason for the capacity constructor is for optimisation purposes.
When one of these resizing operations occurs, the ArrayList copies the contents of the array into a new, larger array (in the Java version discussed here, 50% + 1 larger than the old one). This operation runs in O(n) time.
Example
Here is an example of how the ArrayList would increase in size:
10
16
25
38
58
... 19 resizes ...
198578
297868
446803
670205
1005308
So the list starts with a capacity of 10; when the 11th item is added it is increased by 50% + 1 to 16. On the 17th item the ArrayList is increased again to 25, and so on. Now consider the example where we're creating a list whose desired capacity is already known to be 1000000. Creating the ArrayList without the size constructor will call ArrayList.add 1000000 times, each of which takes O(1) normally or O(n) on resize.
1000000 + 16 + 25 + ... + 670205 + 1005308 = 4015851 operations
Compare this with using the capacity constructor and then calling ArrayList.add, which then never needs to resize and runs in O(1) every time.
1000000 + 1000000 = 2000000 operations
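The arithmetic above can be reproduced with a short sketch. It assumes the growth rule (oldCapacity * 3) / 2 + 1 and a starting capacity of 10, as in the example, and charges each resize with the new capacity, which is how the sum above is built. ResizeArithmetic and its methods are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

class ResizeArithmetic {
    // Lists the capacities visited, starting at 10, until `target` is covered.
    static List<Integer> capacitiesUpTo(int target) {
        List<Integer> capacities = new ArrayList<>();
        int capacity = 10;
        capacities.add(capacity);
        while (capacity < target) {
            capacity = (capacity * 3) / 2 + 1; // grow by 50% + 1
            capacities.add(capacity);
        }
        return capacities;
    }

    // One operation per add, plus the new capacity for every resize.
    static long totalOperations(int adds) {
        long total = adds;
        List<Integer> capacities = capacitiesUpTo(adds);
        for (int i = 1; i < capacities.size(); i++) {
            total += capacities.get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesUpTo(1_000_000));
        System.out.println(totalOperations(1_000_000));
    }
}
```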
Java vs C#
Java is as above, starting at 10 and increasing each resize at 50% + 1. C# starts at 4 and increases much more aggressively, doubling at each resize. The 1000000 adds example from above for C# uses 3097084 operations.
References
My blog post on C#'s List<T>
Java's ArrayList source code
Setting the initial size of an ArrayList, e.g. to ArrayList<>(100), reduces the number of times the re-allocation of internal memory has to occur.
Example:
ArrayList<Integer> example = new ArrayList<>(3);
example.add(1); // size() == 1
example.add(2); // size() == 2
example.add(2); // size() == 3, example has been 'filled'
example.add(3); // size() == 4, example has been 'expanded' so that the fourth element can be added.
As you see in the above example, an ArrayList can be expanded when needed. What this doesn't show you is that the capacity of the ArrayList usually grows by a constant factor such as 1.5 or 2 (although note that the new capacity depends on your implementation). The following is quoted from Oracle:
"Each ArrayList instance has a capacity. The capacity is the size of
the array used to store the elements in the list. It is always at
least as large as the list size. As elements are added to an
ArrayList, its capacity grows automatically. The details of the growth
policy are not specified beyond the fact that adding an element has
constant amortized time cost."
Obviously, if you have no idea as to what kind of range you will be holding, setting the size probably won't be a good idea - however, if you do have a specific range in mind, setting an initial capacity will increase memory efficiency.
An ArrayList can contain many values, and when doing large initial insertions you can tell the ArrayList to allocate larger storage to begin with, so as not to waste CPU cycles when it has to allocate more space for the next item. Allocating some space at the beginning is therefore more efficient.
This is to avoid possible efforts for reallocation for every single object.
int newCapacity = (oldCapacity * 3)/2 + 1;
Internally, a new Object[] is created. The JVM needs effort to create a new Object[] when you add an element to the ArrayList. Without the above code (or whatever algorithm you prefer) for reallocation, every call to arraylist.add() would have to create a new Object[], which is pointless: we would be losing time increasing the size by 1 for each and every object added. So it is better to increase the size of the Object[] with the following formula.
(JSL has used the forecasting formula given below for dynamically growing the ArrayList instead of growing by 1 every time, because growing takes effort by the JVM.)
int newCapacity = (oldCapacity * 3)/2 + 1;
I think each ArrayList is created with an init capacity value of "10". So anyway, if you create an ArrayList without setting capacity within constructor it will be created with a default value.
I'd say it's an optimization. An ArrayList without an initial capacity will have ~10 empty slots and will expand when you add beyond them.
To shrink the capacity to exactly the number of items you have, call trimToSize().
In my experience with ArrayList, giving an initial capacity is a nice way to avoid reallocation costs. But it comes with a caveat. All the suggestions above say that one should provide an initial capacity only when a rough estimate of the number of elements is known. If we give an initial capacity without any idea of the final size, the memory reserved but unused is wasted, as it may never be needed once the list holds the required number of elements. What I am saying is that we can be pragmatic at the beginning when allocating capacity, and then find a smart way of determining the required minimal capacity at runtime. ArrayList provides a method called ensureCapacity(int minCapacity) for this. But then, one has to find a smart way...
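One possible shape of that approach, as a sketch only: start with a default list, call ensureCapacity once the expected count becomes known at runtime, and call trimToSize when the list is complete. The class CapacityHints and the method load are hypothetical; ensureCapacity and trimToSize are the real ArrayList methods.

```java
import java.util.ArrayList;
import java.util.List;

class CapacityHints {
    // Fills a list with `expected` integers, reserving the space in one step.
    static List<Integer> load(int expected) {
        ArrayList<Integer> list = new ArrayList<>();
        list.ensureCapacity(expected); // reserve room once the count is known
        for (int i = 0; i < expected; i++) {
            list.add(i);
        }
        list.trimToSize();             // release any unused slots
        return list;
    }
}
```

Note that ensureCapacity and trimToSize are declared on ArrayList, not on the List interface, so the variable must have the concrete type to call them.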
I have tested ArrayList with and without initialCapacity and I got a surprising result.
When I set LOOP_NUMBER to 100,000 or less, the result is that setting initialCapacity is more efficient.
list1Stop-list1Start = 14
list2Stop-list2Start = 10
But when I set LOOP_NUMBER to 1,000,000 the result changes to:
list1Stop-list1Start = 40
list2Stop-list2Start = 66
In the end, I couldn't figure out how this works.
Sample code:
public static final int LOOP_NUMBER = 100000;
public static void main(String[] args) {
long list1Start = System.currentTimeMillis();
List<Integer> list1 = new ArrayList<>();
for (int i = 0; i < LOOP_NUMBER; i++) {
list1.add(i);
}
long list1Stop = System.currentTimeMillis();
System.out.println("list1Stop-list1Start = " + String.valueOf(list1Stop - list1Start));
long list2Start = System.currentTimeMillis();
List<Integer> list2 = new ArrayList<>(LOOP_NUMBER);
for (int i = 0; i < LOOP_NUMBER; i++) {
list2.add(i);
}
long list2Stop = System.currentTimeMillis();
System.out.println("list2Stop-list2Start = " + String.valueOf(list2Stop - list2Start));
}
I tested this on Windows 8.1 with jdk1.7.0_80.
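A single timed pass with System.currentTimeMillis can easily be dominated by JIT compilation, garbage collection and Integer boxing rather than by array resizing, so numbers like the ones above may not reflect the cost of reallocation at all. The following is a rough sketch, not a rigorous benchmark, that at least repeats the measurement after some warm-up rounds. AppendTiming and timeAppends are names invented for the example, and a proper harness such as JMH would be the more reliable tool.

```java
import java.util.ArrayList;
import java.util.List;

class AppendTiming {
    // Times n appends in nanoseconds, with or without a pre-sized list.
    static long timeAppends(int n, boolean presize) {
        long start = System.nanoTime();
        List<Integer> list = presize
                ? new ArrayList<Integer>(n)
                : new ArrayList<Integer>();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        long elapsed = System.nanoTime() - start;
        if (list.size() != n) {          // also keeps the list observably used
            throw new IllegalStateException("unexpected size");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        for (int round = 0; round < 5; round++) {   // warm-up, results ignored
            timeAppends(n, false);
            timeAppends(n, true);
        }
        for (int round = 0; round < 5; round++) {   // measured rounds
            System.out.println("default:  " + timeAppends(n, false) / 1_000_000.0 + " ms");
            System.out.println("presized: " + timeAppends(n, true) / 1_000_000.0 + " ms");
        }
    }
}
```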
Suppose I declare an int array of size 10. I can assign an element at index 4 and run the code without any exception.
int[] ar = new int[10];
ar[4] = 8;
System.out.println(Arrays.toString(ar)); //works fine
That's because when I give the size as 10, that much memory is allocated for the array, with the default value of its type stored at each index.
But the case is not the same for List. Say I declare a list with an initial capacity of 10 and try to add an element at index 4; it gives
java.lang.IndexOutOfBoundsException: Index: 4, Size: 0
List<Integer> list = new ArrayList<Integer>(10);
list.add(4, 8); //exception
Of course the size of the list is 0 even if an initial capacity is given. Why is it not like an array? Is no memory allocated for 10 elements for the list?
I was wondering whether there is any way to fill a List with default values once a capacity is given, just like an array.
This is what the JavaDoc for ArrayList says about add(int index, E element):
throws IndexOutOfBoundsException - if the index is out of range (index < 0 || index > size())
The size is the number of elements currently stored, not the current capacity.
The fact that your car has the "capacity" to drive at 100 mph doesn't mean that you can magically get from 0 to 90 mph within 1 second ;-)
In other words: the answer is that size and capacity aren't the same. Capacity merely means: "that is the size this list can grow to before the underlying array needs to grow".
By now it should be clear that the initial capacity passed to the constructor is just a bit of memory management for the initial internal array, without any semantic meaning.
When the actual size() overflows the array, the array is reallocated.
There is no such thing as a bulk allocation with initial elements. However, there is the following (note that the returned list is immutable):
List<Integer> list = Collections.nCopies(10, Integer.valueOf(0));
And the Stream API (new in Java 8) offers ways to generate lists dynamically.
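A brief sketch of both ideas, with hypothetical names (PrefilledLists, zeros, generated). Since Collections.nCopies returns an immutable list, it is copied into an ArrayList here so that set(4, 8) works afterwards, which is the array-like behaviour asked about in the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PrefilledLists {
    // A mutable list of n zeros, built from the immutable nCopies view.
    static List<Integer> zeros(int n) {
        return new ArrayList<>(Collections.nCopies(n, 0));
    }

    // The same result produced with a stream.
    static List<Integer> generated(int n) {
        return Stream.generate(() -> 0)
                .limit(n)
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        List<Integer> list = zeros(10);
        list.set(4, 8);               // works: index 4 exists, size() is 10
        System.out.println(list);
    }
}
```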
You could do:
public <T> void add(List<T> list, int i, T obj) {
while (list.size() < i) {
list.add(null);
}
list.add(i, obj);
}
But it is immediately evident that you'll introduce nulls, which is unsafe and ugly and requires null checks.
As others have mentioned in their answers, in List<Integer> list = new ArrayList<Integer>(10), the 10 specifies the initial capacity.
Specifying the initial capacity is only an optional thing. You exercise that option only when you use the particular constructor that accepts an initial capacity as an argument. When you use other constructors, you don't have control over the initial capacity.
You specify n as the initial capacity if you want the first n additions to the list to be as efficient as possible -- otherwise, there is a possibility that the addition of each individual item to the list results in some costly internal re-sizing and re-copying into the re-sized internal area.
The above does not answer the question of why you are not allowed to add an item at position 8, when there is no item at position 7.
As some have answered, that's because the API doc says so.
That's one way to answer it. But why does the API doc say so? Why are things designed so?
Things are designed so, because:
Adding an item at position 8, when there is no item at position 7, results in a gap (before position 8).
As a programmer, you will then have to keep track of what an item's position is, among all possible positions (the full capacity). Currently, as a programmer, you only keep track of what the item's position is, among all added items. Now, wouldn't that be a programming nightmare?
As I recall, before Java 8, the default capacity of ArrayList was 10.
Surprisingly, the comment on the default (void) constructor still says: Constructs an empty list with an initial capacity of ten.
From ArrayList.java:
/**
* Shared empty array instance used for default sized empty instances. We
* distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
* first element is added.
*/
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
...
/**
* Constructs an empty list with an initial capacity of ten.
*/
public ArrayList() {
this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
Technically, it's 10, not zero, if you allow for lazy initialisation of the backing array. See:
public boolean add(E e) {
ensureCapacityInternal(size + 1);
elementData[size++] = e;
return true;
}
private void ensureCapacityInternal(int minCapacity) {
if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
}
ensureExplicitCapacity(minCapacity);
}
where
/**
* Default initial capacity.
*/
private static final int DEFAULT_CAPACITY = 10;
What you're referring to is just the zero-sized initial array object that is shared among all initially empty ArrayList objects. I.e. the capacity of 10 is guaranteed lazily, an optimisation that is present also in Java 7.
Admittedly, the constructor contract is not entirely accurate. Perhaps this is the source of confusion here.
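The lazy-allocation idea can be illustrated with a stripped-down model. This is not the JDK source: LazyIntList is a hypothetical class that stores ints, shares one empty array among fresh instances, and only allocates the default ten slots on the first add, then grows by about 50% when full.

```java
import java.util.Arrays;

class LazyIntList {
    private static final int DEFAULT_CAPACITY = 10;
    private static final int[] EMPTY = {};   // shared by all fresh instances

    private int[] data = EMPTY;
    private int size;

    void add(int value) {
        if (size == data.length) {
            // First add inflates to the default capacity; later adds grow by ~50%.
            int newCapacity = (data == EMPTY)
                    ? DEFAULT_CAPACITY
                    : data.length + (data.length >> 1);
            data = Arrays.copyOf(data, newCapacity);
        }
        data[size++] = value;
    }

    int capacity() {
        return data.length;
    }

    int size() {
        return size;
    }
}
```

A fresh instance reports a capacity of 0, and the capacity becomes 10 only after the first add, which matches the behaviour described above.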
Background
Here's an E-Mail by Mike Duigou
I have posted an updated version of the empty ArrayList and HashMap patch.
http://cr.openjdk.java.net/~mduigou/JDK-7143928/1/webrev/
This revised implementation introduces no new fields to either class. For ArrayList the lazy allocation of the backing array occurs only if the list is created at default size. According to our performance analysis team, approximately 85% of ArrayList instances are created at default size so this optimization will be valid for an overwhelming majority of cases.
For HashMap, creative use is made of the threshold field to track the requested initial size until the bucket array is needed. On the read side the empty map case is tested with isEmpty(). On the write size a comparison of (table == EMPTY_TABLE) is used to detect the need to inflate the bucket array. In readObject there's a little more work to try to choose an efficient initial capacity.
From: http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-April/015585.html
In Java 8 the default capacity of an ArrayList is 0 until we add at least one object to it (you can call it lazy initialization).
Now the question is: why was this change made in Java 8?
The answer is to reduce memory consumption. Millions of ArrayList objects are created in real-world Java applications. A default size of 10 objects means that we allocate 10 pointers (40 or 80 bytes) for the underlying array at creation and fill them with nulls.
An empty array (filled with nulls) occupies a lot of memory.
Lazy initialization postpones this memory consumption until the moment you actually use the ArrayList.
Please see the code below for help.
ArrayList<String> a = new ArrayList<>(); //Size: 0, Capacity: 0
ArrayList<String> b = new ArrayList<>(5); //Size: 0, Capacity: 5
ArrayList<String> c = new ArrayList<>(new ArrayList<String>(5)); //Size: 0, Capacity: 0
a.add( "shailesh" ); //Size: 1, Capacity: 10
public static void main( String[] args )
throws Exception
{
ArrayList al = new ArrayList();
getCapacity( al );
al.add( "shailesh" );
getCapacity( al );
}
static void getCapacity( ArrayList<?> l )
throws Exception
{
Field dataField = ArrayList.class.getDeclaredField( "elementData" );
dataField.setAccessible( true );
System.out.format( "Size: %2d, Capacity: %2d%n", l.size(), ( (Object[]) dataField.get( l ) ).length );
}
Output:
Size: 0, Capacity: 0
Size: 1, Capacity: 10
The article "Default capacity of ArrayList in Java 8" explains it in detail.
If the very first operation that is done with an ArrayList is to pass addAll a collection which has more than ten elements, then any effort put into creating an initial ten-element array to hold the ArrayList's contents would be thrown out the window. Whenever something is added to an ArrayList it's necessary to test whether the size of the resulting list will exceed the size of the backing store; allowing the initial backing store to have size zero rather than ten will cause this test to fail one extra time in the lifetime of a list whose first operation is an "add" which would require creating the initial ten-item array, but that cost is less than the cost of creating a ten-item array that never ends up getting used.
That having been said, it might have been possible to improve performance further in some contexts if there were an overload of "addAll" which specified how many items (if any) would likely be added to the list after the present one, and which could use that to influence its allocation behavior. In some cases code which adds the last few items to a list will have a pretty good idea that the list is never going to need any space beyond that. There are many situations where a list will get populated once and never modified after that. If, at the point where code knows that the ultimate size of a list will be 170 elements, it has 150 elements and a backing store of size 160, growing the backing store to size 320 will be unhelpful, and leaving it at size 320 or trimming it to 170 will be less efficient than simply having the next allocation grow it to 170.
The question is 'why?'.
Memory profiling inspections (for example, https://www.yourkit.com/docs/java/help/inspections_mem.jsp#sparse_arrays) show that empty (filled with nulls) arrays occupy tons of memory.
Default size of 10 objects means that we allocate 10 pointers (40 or 80 bytes) for underlying array at creation and fill them in with nulls. Real java applications create millions of array lists.
The introduced modification postpones this memory consumption until the moment you actually use the array list.
After the above question I went through the ArrayList documentation for Java 8. I found that the default size is still 10.
The ArrayList default size in Java 8 is still 10. The only change made in Java 8 is that if a coder adds fewer than 10 elements, the remaining blank places in the ArrayList are not set to null. I say so because I have gone through this situation myself, and Eclipse made me look into this change in Java 8.
You can verify this change by looking at the screenshot below. In it you can see that the ArrayList size is specified as 10 in Object[10], but only 7 elements are displayed. The remaining null-valued elements are not displayed here. In Java 7 the screenshot is the same with just a single change: the null-valued elements are also displayed, so the coder needs to write code for handling null values when iterating over the complete array list, while in Java 8 this burden is removed from the coder/developer.
Screen shot link.
The usual constructor of ArrayList is:
ArrayList<?> list = new ArrayList<>();
But there is also an overloaded constructor with a parameter for its initial capacity:
ArrayList<?> list = new ArrayList<>(20);
Why is it useful to create an ArrayList with an initial capacity when we can append to it as we please?
If you know in advance what the size of the ArrayList is going to be, it is more efficient to specify the initial capacity. If you don't do this, the internal array will have to be repeatedly reallocated as the list grows.
The larger the final list, the more time you save by avoiding the reallocations.
That said, even without pre-allocation, inserting n elements at the back of an ArrayList is guaranteed to take total O(n) time. In other words, appending an element is an amortized constant-time operation. This is achieved by having each reallocation increase the size of the array exponentially, typically by a factor of 1.5. With this approach, the total number of operations can be shown to be O(n).
Because ArrayList is a dynamically resizing array data structure, which means it is implemented as an array with an initial (default) fixed size. When this gets filled up, the array will be extended to a double sized one. This operation is costly, so you want as few as possible.
So, if you know your upper bound is 20 items, then creating the array with initial length of 20 is better than using a default of, say, 15 and then resize it to 15*2 = 30 and use only 20 while wasting the cycles for the expansion.
P.S. - As AmitG says, the expansion factor is implementation specific (in this case (oldCapacity * 3)/2 + 1)
Default size of Arraylist is 10.
/**
* Constructs an empty list with an initial capacity of ten.
*/
public ArrayList() {
this(10);
}
So if you are going to add 100 or more records, you can see the overhead of memory reallocation.
ArrayList<?> list = new ArrayList<>();
// same as new ArrayList<>(10);
So if you have any idea about the number of elements which will be stored in Arraylist its better to create Arraylist with that size instead of starting with 10 and then going on increasing it.
I actually wrote a blog post on the topic 2 months ago. The article is for C#'s List<T> but Java's ArrayList has a very similar implementation. Since ArrayList is implemented using a dynamic array, it increases in size on demand. So the reason for the capacity constructor is for optimisation purposes.
When one of these resizings operation occurs, the ArrayList copies the contents of the array into a new array that is twice the capacity of the old one. This operation runs in O(n) time.
Example
Here is an example of how the ArrayList would increase in size:
10
16
25
38
58
... 17 resizes ...
198578
297868
446803
670205
1005308
So the list starts with a capacity of 10, when the 11th item is added it is increase by 50% + 1 to 16. On the 17th item the ArrayList is increased again to 25 and so on. Now consider the example where we're creating a list where the desired capacity is already known as 1000000. Creating the ArrayList without the size constructor will call ArrayList.add 1000000 times which takes O(1) normally or O(n) on resize.
1000000 + 16 + 25 + ... + 670205 + 1005308 = 4015851 operations
Compare this using the constructor and then calling ArrayList.add which is guaranteed to run in O(1).
1000000 + 1000000 = 2000000 operations
Java vs C#
Java is as above, starting at 10 and increasing each resize at 50% + 1. C# starts at 4 and increases much more aggressively, doubling at each resize. The 1000000 adds example from above for C# uses 3097084 operations.
References
My blog post on C#'s List<T>
Java's ArrayList source code
Setting the initial size of an ArrayList, e.g. to ArrayList<>(100), reduces the number of times the re-allocation of internal memory has to occur.
Example:
ArrayList example = new ArrayList<Integer>(3);
example.add(1); // size() == 1
example.add(2); // size() == 2,
example.add(2); // size() == 3, example has been 'filled'
example.add(3); // size() == 4, example has been 'expanded' so that the fourth element can be added.
As you see in the above example - an ArrayList can be expanded if needed to be. What this doesn't show you is that the size of the Arraylist usually doubles (although note that the new size depends on your implementation). The following is quoted from Oracle:
"Each ArrayList instance has a capacity. The capacity is the size of
the array used to store the elements in the list. It is always at
least as large as the list size. As elements are added to an
ArrayList, its capacity grows automatically. The details of the growth
policy are not specified beyond the fact that adding an element has
constant amortized time cost."
Obviously, if you have no idea as to what kind of range you will be holding, setting the size probably won't be a good idea - however, if you do have a specific range in mind, setting an initial capacity will increase memory efficiency.
ArrayList can contain many values and when doing large initial insertions you can tell ArrayList to allocate a larger storage to begin with as to not waste CPU cycles when it tries to allocate more space for the next item. Thus to allocate some space at the beginning is more effiecient.
This is to avoid possible efforts for reallocation for every single object.
int newCapacity = (oldCapacity * 3)/2 + 1;
internally new Object[] is created. JVM needs effort to create new Object[] when you add element in the arraylist. If you don't have above code(any algo you think) for reallocation then every time when you invoke arraylist.add() then new Object[] has to be created which is pointless and we are loosing time for increasing size by 1 for each and every objects to be added. So it is better to increase size of Object[] with following formula.
(JSL has used forcasting formula given below for dynamically growing arraylist instead of growing by 1 every time. Because to grow it takes effort by JVM)
int newCapacity = (oldCapacity * 3)/2 + 1;
I think each ArrayList is created with an init capacity value of "10". So anyway, if you create an ArrayList without setting capacity within constructor it will be created with a default value.
I'd say its an optimization. ArrayList without initial capacity will have ~10 empty rows and will expand when you are doing an add.
To have a list with exactly the number of items you need to call trimToSize()
As per my experience with ArrayList, giving an initial capacity is a nice way to avoid reallocation costs. But it bears a caveat. All suggestions mentioned above say that one should provide initial capacity only when a rough estimate of the number of elements is known. But when we try to give an initial capacity without any idea, the amount of memory reserved and unused will be a waste as it may never be required once the list is filled to required number of elements. What i am saying is, we can be pragmatic at the beginning while allocating capacity, and then find a smart way of knowing required minimal capacity at runtime. ArrayList provides a method called ensureCapacity(int minCapacity). But then, one has find a smart way...
I have tested ArrayList with and without initialCapacity and I got suprising result
When I set LOOP_NUMBER to 100,000 or less the result is that setting initialCapacity is efficient.
list1Sttop-list1Start = 14
list2Sttop-list2Start = 10
But when I set LOOP_NUMBER to 1,000,000 the result changes to:
list1Stop-list1Start = 40
list2Stop-list2Start = 66
Finally, I couldn't figure out how does it works?!
Sample code:
import java.util.ArrayList;
import java.util.List;

public class InitialCapacityTest {
    public static final int LOOP_NUMBER = 100000;

    public static void main(String[] args) {
        // list1: default constructor, the backing array grows on demand
        long list1Start = System.currentTimeMillis();
        List<Integer> list1 = new ArrayList<>();
        for (int i = 0; i < LOOP_NUMBER; i++) {
            list1.add(i);
        }
        long list1Stop = System.currentTimeMillis();
        System.out.println("list1Stop-list1Start = " + (list1Stop - list1Start));

        // list2: capacity for all elements reserved up front
        long list2Start = System.currentTimeMillis();
        List<Integer> list2 = new ArrayList<>(LOOP_NUMBER);
        for (int i = 0; i < LOOP_NUMBER; i++) {
            list2.add(i);
        }
        long list2Stop = System.currentTimeMillis();
        System.out.println("list2Stop-list2Start = " + (list2Stop - list2Start));
    }
}
Tested on Windows 8.1 with jdk1.7.0_80.
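One possible explanation, and this is an assumption rather than something verified on that machine, is that a single cold run like the one above mostly measures JIT compilation, Integer boxing and garbage collection rather than array reallocation. A slightly more careful measurement could look like the sketch below. The class WarmedUpTiming and the helper timeFill are hypothetical names; for serious numbers a benchmark harness would be the better tool.

```java
import java.util.ArrayList;
import java.util.List;

public class WarmedUpTiming {
    // Hypothetical helper: fills a fresh list with n integers and returns the elapsed nanoseconds.
    // presize == true reserves capacity for all n elements up front.
    static long timeFill(boolean presize, int n) {
        long start = System.nanoTime();
        List<Integer> list = presize ? new ArrayList<Integer>(n) : new ArrayList<Integer>();
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        long elapsed = System.nanoTime() - start;
        if (list.size() != n) {
            throw new IllegalStateException("list was not filled completely");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Warm-up rounds: give the JIT a chance to compile the loop before anything is recorded.
        for (int round = 0; round < 10; round++) {
            timeFill(false, n);
            timeFill(true, n);
        }
        // Measured rounds: keep the best time of several runs to dampen GC pauses.
        long bestDefault = Long.MAX_VALUE;
        long bestPresized = Long.MAX_VALUE;
        for (int round = 0; round < 10; round++) {
            bestDefault = Math.min(bestDefault, timeFill(false, n));
            bestPresized = Math.min(bestPresized, timeFill(true, n));
        }
        System.out.println("default capacity:  " + bestDefault / 1_000_000 + " ms");
        System.out.println("presized capacity: " + bestPresized / 1_000_000 + " ms");
    }
}
```

Alternating the two variants and taking the minimum is only a rough way to reduce noise; the absolute numbers will still vary between JVMs and machines.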
The usual constructor of ArrayList is:
ArrayList<?> list = new ArrayList<>();
But there is also an overloaded constructor with a parameter for its initial capacity:
ArrayList<?> list = new ArrayList<>(20);
Why is it useful to create an ArrayList with an initial capacity when we can append to it as we please?
If you know in advance what the size of the ArrayList is going to be, it is more efficient to specify the initial capacity. If you don't do this, the internal array will have to be repeatedly reallocated as the list grows.
The larger the final list, the more time you save by avoiding the reallocations.
That said, even without pre-allocation, inserting n elements at the back of an ArrayList is guaranteed to take total O(n) time. In other words, appending an element is an amortized constant-time operation. This is achieved by having each reallocation increase the size of the array exponentially, typically by a factor of 1.5. With this approach, the total number of operations can be shown to be O(n).
Because ArrayList is a dynamically resizing array data structure, which means it is implemented as an array with an initial (default) fixed capacity. When this fills up, the contents are copied into a larger array (twice the size in the textbook scheme). This operation is costly, so you want as few of them as possible.
So, if you know your upper bound is 20 items, then creating the array with an initial length of 20 is better than using a default of, say, 15, then resizing it to 15*2 = 30 and using only 20 of those slots while wasting cycles on the expansion.
P.S. - As AmitG says, the expansion factor is implementation specific (in this case (oldCapacity * 3)/2 + 1)
The default capacity of ArrayList is 10:
/**
* Constructs an empty list with an initial capacity of ten.
*/
public ArrayList() {
this(10);
}
So if you are going to add 100 or more records, you can see the overhead of memory reallocation.
ArrayList<?> list = new ArrayList<>();
// same as new ArrayList<>(10);
So if you have any idea about the number of elements that will be stored in the ArrayList, it's better to create the ArrayList with that capacity instead of starting with 10 and then increasing it repeatedly.
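A common case where the number of elements is known up front is building one list from another. The following is only a sketch: the class PresizedCopy and the method normalize are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class PresizedCopy {
    // Hypothetical helper: the result holds exactly one entry per input record,
    // so the final size is known before the first add.
    static List<String> normalize(List<String> records) {
        List<String> result = new ArrayList<>(records.size());
        for (String record : records) {
            result.add(record.trim().toLowerCase());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            records.add("  Record-" + i + " ");
        }
        System.out.println(normalize(records).get(0)); // prints "record-0"
    }
}
```

Here the backing array of result is allocated once with room for every record, so no reallocation happens while the loop runs.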
I actually wrote a blog post on the topic 2 months ago. The article is for C#'s List<T> but Java's ArrayList has a very similar implementation. Since ArrayList is implemented using a dynamic array, it increases in size on demand. So the reason for the capacity constructor is for optimisation purposes.
When one of these resize operations occurs, the list copies the contents of the array into a new, larger array (twice the capacity of the old one in C#'s List<T>, about 1.5 times in Java's ArrayList). This operation runs in O(n) time.
Example
Here is an example of how the ArrayList would increase in size:
10
16
25
38
58
... 19 resizes ...
198578
297868
446803
670205
1005308
So the list starts with a capacity of 10; when the 11th item is added, the capacity is increased by 50% + 1 to 16. On the 17th item the capacity is increased again to 25, and so on. Now consider an example where we're creating a list whose desired capacity is already known to be 1000000. Creating the ArrayList without the capacity constructor means calling ArrayList.add 1000000 times, where each call normally takes O(1) but takes O(n) when it triggers a resize.
1000000 + 16 + 25 + ... + 670205 + 1005308 = 4015851 operations
Compare this with using the capacity constructor and then calling ArrayList.add, which is then guaranteed to run in O(1) every time.
1000000 + 1000000 = 2000000 operations
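The first total above can be reproduced with a few lines of code. This is a sketch that counts "operations" the same way the example does (one per add, plus the new capacity at each resize) and assumes the (oldCapacity * 3)/2 + 1 growth rule of older JDKs; the class and method names are made up.

```java
public class GrowthCost {
    // Growth rule of older JDK ArrayList versions: 50% + 1.
    static int nextCapacity(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    // One operation per add, plus the size of the newly allocated array at each resize.
    static long operationsFor(int elements, int initialCapacity) {
        long operations = elements;
        int capacity = initialCapacity;
        while (capacity < elements) {
            capacity = nextCapacity(capacity);
            operations += capacity;
        }
        return operations;
    }

    // Number of resizes needed until the capacity can hold all elements.
    static int resizesFor(int elements, int initialCapacity) {
        int resizes = 0;
        int capacity = initialCapacity;
        while (capacity < elements) {
            capacity = nextCapacity(capacity);
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println(operationsFor(1_000_000, 10)); // prints 4015851
        System.out.println(resizesFor(1_000_000, 10));    // prints 28
    }
}
```

The count is a rough cost model, not a timing: it ignores the constant factors of allocation and copying, but it shows how the resize work adds up.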
Java vs C#
Java is as above, starting at 10 and growing by 50% + 1 at each resize. C# starts at 4 and grows much more aggressively, doubling at each resize. The 1000000 adds example from above uses 3097084 operations in C#.
References
My blog post on C#'s List<T>
Java's ArrayList source code
Setting the initial size of an ArrayList, e.g. to ArrayList<>(100), reduces the number of times the re-allocation of internal memory has to occur.
Example:
ArrayList<Integer> example = new ArrayList<>(3); // initial capacity of 3, size() == 0
example.add(1); // size() == 1
example.add(2); // size() == 2
example.add(2); // size() == 3, example has been 'filled'
example.add(3); // size() == 4, example has been 'expanded' so that the fourth element can be added
As you can see in the above example, an ArrayList is expanded when it needs to be. What this doesn't show you is that the capacity of the ArrayList grows by a multiplicative factor each time, often described as doubling (although note that the new capacity depends on your implementation). The following is quoted from Oracle:
"Each ArrayList instance has a capacity. The capacity is the size of
the array used to store the elements in the list. It is always at
least as large as the list size. As elements are added to an
ArrayList, its capacity grows automatically. The details of the growth
policy are not specified beyond the fact that adding an element has
constant amortized time cost."
Obviously, if you have no idea as to what kind of range you will be holding, setting the size probably won't be a good idea - however, if you do have a specific range in mind, setting an initial capacity will increase memory efficiency.
An ArrayList can contain many values, and when doing large initial insertions you can tell the ArrayList to allocate larger storage to begin with, so that it does not waste CPU cycles allocating more space along the way. Allocating that space at the beginning is therefore more efficient.
The point is to avoid a reallocation for every single object added.
int newCapacity = (oldCapacity * 3)/2 + 1;
Internally, a new Object[] is created. The JVM has to do work to create a new Object[] when you add an element to the ArrayList. Without the code above (or any other growth algorithm you can think of), every call to arraylist.add() would have to create a new Object[], which is pointless: we would lose time growing the array by 1 for each and every object added. So it is better to grow the Object[] using the following formula.
(The JSL uses the forecasting formula given below to grow the ArrayList dynamically instead of growing it by 1 every time, because growing costs the JVM effort.)
int newCapacity = (oldCapacity * 3)/2 + 1;
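To see how much the formula saves compared with growing by 1, the two strategies can be simulated by counting how many elements get copied during resizes. This is a sketch with made-up names; it models only the copying, not the real ArrayList internals, and assumes a starting capacity of 1 for the grow-by-one case and 10 for the formula.

```java
public class GrowByOneVsFormula {
    // Elements copied while appending n items if a full array grows by exactly one slot.
    static long copiesGrowingByOne(int n) {
        long copies = 0;
        int capacity = 1;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {   // array is full: copy everything into a new array
                copies += size;
                capacity += 1;
            }
        }
        return copies;
    }

    // Elements copied while appending n items with newCapacity = (oldCapacity * 3)/2 + 1.
    static long copiesGrowingByFormula(int n) {
        long copies = 0;
        int capacity = 10;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {   // array is full: copy everything into a larger array
                copies += size;
                capacity = (capacity * 3) / 2 + 1;
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        System.out.println(copiesGrowingByOne(1000));     // prints 499500
        System.out.println(copiesGrowingByFormula(1000)); // prints 2000
    }
}
```

Growing by one slot copies a number of elements that is quadratic in n, while the formula keeps the total copying proportional to n.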