in java, if we store 4 elements in array and array list, is it same at low level in memory OR does it make any difference in order to store elements ?
It's going to be different. An ArrayList is a wrapper around the array with several other helper functions. The memory footprint is going to be a little bit bigger for ArrayList, and it will resize itself as necessary.
It differs, ArrayList internally manages its own array for storage and has own attributes also
An array is pretty different from a List, even if the list happens to be an ArrayList and internally uses just an array as well.
For the array there exists just the array (for this discussion I'll ignore the memory the elements will occupy). Thus you have one object in the heap, the array itself.
For the ArrayList, there exists the ArrayList instance and that instance internally has an array. So there are two objects on the heap. Also, while you have exact control over the size of an array you create, the array held by the ArrayList can have any size that is >= number of elements <= Integer.MAX_VALUE.
Coincidentily, ArrayList uses an elements index directly as array index internally, so the order of elements is the same as in a plain array. But thats an implementation detail, and you normally don't care how a List organizes its data internally (after all the purpose of Lists is to abstract the messy details away).
Related
I'm trying to make a mutable list (list in a non programming sense) of values. The list is to be a predecided length n.
My first choice was to use an array initialized to n. But since an array isn't mutable, the only choice I have if I want to add another value is to copy the array into an array of bigger size. I decided to try using an ArrayList instead.
Then the trade off is that I can't set the ArrayList to an initial size n. However, I can add values to the ArrayList, since it's mutable. So I'm wondering whether it would be more efficient to create an array then copy it into a bigger array when needed, or to make an ArrayList and 'initialize' all the n values with a for loop.
What would be better to use?
Have a look at the implementation of ArrayList is does exactly what you suggest. It simply creates a bigger array whenever needed and copies all the entries form the small array.
This is quite efficient as long as the copy task must only be performed rarely and on not too big arrays. This is why you should always try to create an ArrayList with a size parameter that is about the size you expect that the list will grow to (or a bit bigger). This way you can minimize (or maybe avoid) the replacement of the internal array by a bigger one and all the effort it takes (the array copy).
P.s.
If your data is of an elementar data type you might want to avoid the overhead (memory and speed) of boxing into wrapper types. In this case ArrayList is not suitable for you and you might either implement a similar mechanism for the elementar data type you need it for or have a look at the www. I bed there are plenty implementations like this around there.
P.p.s
If you are unsure about the performance of different approaches, it is always a good idea to implement both and profile the performance for test data that are as close to your real data as possible.
What if instead of an ArrayList to manage data, I had to use arrays?
How would I have to manage the growth of this, considering the number of elements can exceed the initial size of the array?
You will need to have some sort of heuristics to resize the array. E.g. One of the ideas is to allocate a new array with double the size of the existing array when a new element cannot be added because the array is full. Before you discard the old array, you will need to copy the data from the old array to the new array. Look at Arrays.copyOf to achieve both allocating new array and copying data over.
Yet another alternative is to use a linked list of arrays. Grow the linked list by one every time the existing array is full. This is generally more efficient at runtime because you can resize the array without incurring copying penalty. Of course, it makes the code a tad bit complicated.
You would need to implement your own ArrayList to do that.
Take a look at what javadoc says about ArrayList:
Resizable-array implementation of the List interface.
What determines whether one should be used over the other?
I used to think that the deciding factor is whether you know the size of the things you want to store but I think there might be more to it than that.
Some more differences:
First and Major difference between Array and ArrayList in Java is that Array is a fixed length data structure while ArrayList is a variable length Collection class. You can not change length of Array once created in Java but ArrayList re-size itself when gets full depending upon capacity and load factor. Since ArrayList is internally backed by Array in Java, any resize operation in ArrayList will slow down performance as it involves creating new Array and copying content from old array to new array.
Another difference between Array and ArrayList in Java is that you can not use Generics along with Array, as Array instance knows about what kind of type it can hold and throws ArrayStoreException, if you try to store type which is not convertible into type of Array. ArrayList allows you to use Generics to ensure type-safety.
One more major difference between ArrayList and Array is that, you can not store primitives in ArrayList, it can only contain Objects. While Array can contain both primitives and Objects in Java. Though Autoboxing of Java 5 may give you an impression of storing primitives in ArrayList, it actually automatically converts primitives to Object.
Java provides add() method to insert element into ArrayList and you can simply use assignment operator to store element into Array e.g. In order to store Object to specified position.
One more difference on Array vs ArrayList is that you can create instance of ArrayList without specifying size, Java will create Array List with default size but its mandatory to provide size of Array while creating either directly or indirectly by initializing Array while creating it. By the way you can also initialize ArrayList while creating it.
Use array when you know the exact size of the collection and you don't expect to add/remove elements.
Use List (ArrayList) when you don't know the exact size of the collection and you expect to alter it at some point.
If you're using Java8, there is the Stream API, which helps to significantly reduce the boilerplate code when working with collections. This is another plus for ArrayList (and all Collections and Maps).
More info:
Arrays vs ArrayList in performance
Unless speed is critical (really critical, like every microsecond counts), use ArrayList whenever possible. It's so much easier to use, and that's usually the most important thing to consider.
Generally, I use ArrayList, not arrays, because they offer a lot of several methods that are very usefull. I think you can use array if performance is very important, in very special cases.
Array is fixed, ArrayList is growable.If the number of elements is fixed, use an array
Also one of the great benefits of collection implementations is they give you a lot of flexibility. So depending on your need, you can have a List behave as an ArrayList or as a LinkedList and so on. Also if you look at the Collection API, you'd see you have methods for almost everything you'd ever need to do.
Could you please let me know Performance wise why Array is better than Collection?
It is not. It will actually depend on the use you make of your container.
Some algorithms may run in O(n) on an array and in O(1) on another collection type (which implements the Collection interface).
Think about removal of an item for instance. In that case, the array, even if a native type, would perform slower than the linked list and its method calls (which could be inlined anyway on some VMs): it runs in O(n) VS O(1) for a linked list
Think about searching an element. It runs in 0(n) for an array VS O(log n) for a tree.
Some Collection implementations use an array to store their elements (ArrayList I think) so in that case performance will not be significantly different.
You should spend time on optimizing your algorithm (and make use of the various collection types available) instead of worrying of the pros/cons of an array VS Collection.
Many collections are wrappers for arrays. This includes ArrayList, HashMap/Set, StringBuilder. For optimised code, the performance difference of the operations is minimal except when you come to operations which are better suited to that data structure e.g. lookup of a Map is much faster than the lookup in an array.
Using generics for collections which are basically primitives can be slower, not because the collection is slower but the extra object creation and cache usage (as the memory needed can be higher) This difference is usually too small to matter but if you are concerned about this you can use the Trove4J libraries which are wrappers for arrays of primitives instead of arrays of Objects.
Where collections are slower is when you use operations which they are not suitable for e.g. random access of a LinkedList, but sensible coding can avoid these situations.
Basically, because arrays are primitive data structures in Java. Accesses to them can be translated directly into native memory-access instructions rather than method calls.
That said, though, it's not entirely obvious that arrays will strictly outperform collections in all circumstances. If your code references collection variables where the runtime type can be monomorphically known at JIT-time, Hotspot will be able to inline the access methods, and where they are simple, can be just as fast since there's basically no overhead anyway.
Many of the collections' access methods are intrinsically more complex than array referencing, however. There is, for instance, no way that a HashMap will be as efficient as a simple array lookup, no matter how much Hotspot optimizes it.
You cannot compare the two. ArrayList is an implementation, Collection is an interface. There might be different implementations for the Collection interface.
In practice the implementation is chosen which as the simple access to your data. Usually ArrayList if you need to loop through all elements. Hashtable if you need access by key.
Performance should be considered only after measurements are made. Then it is easy to change the implementation because the collection framework has common interfaces like the Collection interface.
The question is which one to use and when?
An array is basically a fixed size collection of elements. The bad point about an array is that it is not resizable. But its constant size provides efficiency if you are clear with your element size. So arrays are better to use when you know the number of elements available with you.
Collection
ArrayList is another collection where the number of elements is resizable. So if you are not sure about the number of elements in the collection use an ArrayList. But there are certain facts to be considered while using ArrayLists.
ArrayLists is not synchronized. So if there are multiple threads
accessing and modifying the list, then synchronization might be
required to be handled externally.
ArrayList is internally implemented as an array. So whenever a new
element is added an array of n+1 elements is created and then all the
n elements are copied from the old array to the new array and then
the new element is inserted in the new array.
Adding n elements requires on time.
The isEmpty, size, iterator, set, get and listIterator operations
require the same amount of time, independently of element you access.
Only Objects can be added to an ArrayList
Permits null elements
If you need to add a large number of elements to an ArrayList, you can use the ensureCapacity(int minCapacity) operation to ensure that the ArrayList has that required capacity. This will ensure that the Array is copied only once when all the elements are added and increase the performance of addition of elements to an ArrayList. Also inserting an element in the middle of say 1000 elements would require you to move 500 elements up or down and then add the element in the middle.
The benefit of using ArrayList is that accessing random elements is cheap and is not affected by the number of elemets in the ArrayList. But addition of elements to the head of tail or in the middle is costly.
Vector is similar to ArrayList with the difference that it is synchronized. It offers some other benefits like it has an initial capacity and an incremental capacity. So if your vector has a capacity of 10 and incremental capacity of 10, then when you are adding the 11th element a new Vector would be created with 20 elements and the 11 elements would be copied to the new Vector. So addition of 12th to 20th elements would not require creation of new vector.
By default, when a vector needs to grow the size of its internal data structure to hold more elements, the size of internal data structure is doubled, whereas for ArrayList the size is increased by only 50%. So ArrayList is more conservative in terms of space.
LinkedList is much more flexible and lets you insert, add and remove elements from both sides of your collection - it can be used as queue and even double-ended queue! Internally a LinkedList does not use arrays. LinkedList is a sequence of nodes, which are double linked. Each node contains header, where actually objects are stored, and two links or pointers to next or previous node. A LinkedList looks like a chain, consisting of people who hold each other's hand. You can insert people or node into that chain or remove. Linked lists permit node insert/remove operation at any point in the list in constant time.
So inserting elements in linked list (whether at head or at tail or in the middle) is not expensive. Also when you retrieve elements from the head it is cheap. But when you want to randomly access the elements of the linked list or access the elements at the tail of the list then the operations are heavy. Cause, for accessing the n+1 th element, you will need to parse through the first n elements to reach the n+1th element.
Also linked list is not synchronized. So multiple threads modifying and reading the list would need to be synchronized externally.
So the choice of which class to use for creating lists depends on the requirements. ArrayList or Vector( if you need synchronization ) could be used when you need to add elements at the end of the list and access elements randomly - more access operations than add operations. Whereas a LinkedList should be used when you need to do a lot of add/delete (elements) operations from the head or the middle of the list and your access operations are comparatively less.
What are the differences between the two data structures ArrayList and Vector, and where should you use each of them?
Differences
Vectors are synchronized, ArrayLists
are not.
Data Growth Methods
Use ArrayLists if there is no specific requirement to use Vectors.
Synchronization
If multiple threads access an ArrayList concurrently then we must externally synchronize the block of code which modifies the list either structurally or simply modifies an element. Structural modification means addition or deletion of element(s) from the list. Setting the value of an existing element is not a structural modification.
Collections.synchronizedList is normally used at the time of creation of the list to avoid any accidental unsynchronized access to the list.
Data growth
Internally, both the ArrayList and Vector hold onto their contents using an Array. When an element is inserted into an ArrayList or a Vector, the object will need to expand its internal array if it runs out of room. A Vector defaults to doubling the size of its array, while the ArrayList increases its array size by 50 percent.
As the documentation says, a Vector and an ArrayList are almost equivalent. The difference is that access to a Vector is synchronized, whereas access to an ArrayList is not. What this means is that only one thread can call methods on a Vector at a time, and there's a slight overhead in acquiring the lock; if you use an ArrayList, this isn't the case. Generally, you'll want to use an ArrayList; in the single-threaded case it's a better choice, and in the multi-threaded case, you get better control over locking. Want to allow concurrent reads? Fine. Want to perform one synchronization for a batch of ten writes? Also fine. It does require a little more care on your end, but it's likely what you want. Also note that if you have an ArrayList, you can use the Collections.synchronizedList function to create a synchronized list, thus getting you the equivalent of a Vector.
Vector is a broken class that is not threadsafe, despite it being "synchronized" and is only used by students and other inexperienced programmers.
ArrayList is the go-to List implementation used by professionals and experienced programmers.
Professionals wanting a threadsafe List implementation use a CopyOnWriteArrayList.
ArrayList is newer and 20-30% faster.
If you don't need something explitly apparent in Vector, use ArrayList
There are 2 major differentiation's between Vector and ArrayList.
Vector is synchronized by default, and ArrayList is not.
Note : you can make ArrayList also synchronized by passing arraylist object to Collections.synchronizedList() method.
Synchronized means : it can be used with multiple threads with out any side effect.
ArrayLists grow by 50% of the previous size when space is not sufficient for new element, where as Vector will grow by 100% of the previous size when there is no space for new incoming element.
Other than this, there are some practical differences between them, in terms of programming effort:
To get the element at a particular location from Vector we use elementAt(int index) function. This function name is very lengthy.
In place of this in ArrayList we have get(int index) which is very
easy to remember and to use.
Similarly to replace an existing element with a new element in Vector we use setElementAt() method, which is again very lengthy and may irritate the programmer to use repeatedly. In place of this ArrayList has add(int index, object) method which is easy to use and remember.
Like this they have more programmer friendly and easy to use function names in ArrayList.
When to use which one?
Try to avoid using Vectors completely. ArrayLists can do everything what a Vector can do. More over ArrayLists are by default not synchronized. If you want, you can synchronize it when ever you need by using Collections util class.
ArrayList has easy to remember and use function names.
Note : even though arraylist grows by 100%, you can avoid this by ensurecapacity() method to make sure that you are allocating sufficient memory at the initial stages itself.
Hope it helps.
ArrayList and Vector both implements List interface and maintains insertion order.But there are many differences between ArrayList and Vector classes...
ArrayList -
ArrayList is not synchronized.
ArrayList increments 50% of current array size if number of element exceeds from its capacity.
ArrayList is not a legacy class, it is introduced in JDK 1.2.
ArrayList is fast because it is non-synchronized.
ArrayList uses Iterator interface to traverse the elements.
Vector -
Vector is synchronized.
Vector increments 100% means doubles the array size if total number of element exceeds than its capacity.
Vector is a legacy class.
Vector is slow because it is synchronized i.e. in multithreading environment, it will hold the other threads in runnable or non-runnable state until current thread releases the lock of object.
Vector uses Enumeration interface to traverse the elements. But it can use Iterator also.
See Also : https://www.javatpoint.com/difference-between-arraylist-and-vector
Basically both ArrayList and Vector both uses internal Object Array.
ArrayList: The ArrayList class extends AbstractList and implements the List interface and RandomAccess (marker interface). ArrayList supports dynamic arrays that can grow as needed. It gives us first iteration over elements.
ArrayList uses internal Object Array; they are created with an default initial size of 10. When this size is exceeded, the collection is automatically increases to half of the default size that is 15.
Vector: Vector is similar to ArrayList but the differences are, it is synchronized and its default initial size is 10 and when the size exceeds its size increases to double of the original size that means the new size will be 20. Vector is the only class other than ArrayList to implement RandomAccess. Vector is having four constructors out of that one takes two parameters Vector(int initialCapacity, int capacityIncrement) capacityIncrement is the amount by which the capacity is increased when the vector overflows, so it have more control over the load factor.
Some other differences are: