Java - List or Array? [duplicate] - java

I know Lists make things much easier in Java than working with fixed-size arrays (lists allow you to add/remove elements at will and they automagically resize, etc).
I've read some stuff suggesting that arrays should be avoided in Java whenever possible since they are not flexible at all (and can sometimes impose weird limitations, such as when you don't know what size an array needs to be).
Is it "good practice" to stop using arrays altogether and use only Lists instead? I'm sure the List type consumes more memory than an array and thus has higher overhead, but is this significant? Most Lists would be GC'ed at runtime if they are left lying around anyway, so maybe it isn't as big a deal as I'm thinking?

I don't like dogma. Know the rules; know when to break the rules.
"Never" is too strong, especially when it comes to software.
Arrays and Lists are both potential targets for the GC, so that's a wash.
Yes, you have to know the size of an array before you start. For the cases when you do, there's nothing wrong with it.
It's easy to go back and forth as needed using the java.util.Collections and java.util.Arrays classes.
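As a rough sketch of that round trip, assuming String elements (the class and method names below are made up for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
class ArrayListConversion {
    // Copy an array into an independent, resizable list.
    static List<String> toResizableList(String[] array) {
        // Arrays.asList returns a fixed-size view backed by the array;
        // wrapping it in a new ArrayList gives a growable copy.
        return new ArrayList<>(Arrays.asList(array));
    }

    // Copy a list back into a new array.
    static String[] toArray(List<String> list) {
        return list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] names = {"a", "b"};
        List<String> list = toResizableList(names);
        list.add("c"); // the original array is untouched
        System.out.println(Arrays.toString(toArray(list))); // prints [a, b, c]
    }
}
```

Note that Arrays.asList on its own gives a list you cannot add to or remove from, which is why the sketch copies it into an ArrayList.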

I think a good rule of thumb would be to use Lists unless you NEED an Array (for memory/performance reasons). Otherwise Lists are typically easier to maintain and thus less likely to cause future bugs.
Lists provide more flexibility/functionality in terms of auto-expansion, so unless you are either pressed for memory (and can't afford the overhead that Lists create) or do not mind maintaining the Array size as it expands/shrinks, I would recommend Lists.
Try not to micromanage the code too much, and instead focus on more discernible and readable components.

It depends on the list. A LinkedList probably takes up space only as it's needed (though each node carries its own overhead), while an ArrayList typically grows its capacity by a sizable step whenever that capacity is reached. Internally, an ArrayList is implemented using an array, but it's an array that's usually larger than what you currently need. However, since it stores references, not objects, the memory overhead is negligible in most cases and the convenience is worth it, I believe.

I would have to say I follow this approach of using the Collections framework where I might otherwise have used an array. The collections offer you many benefits and convenience over arrays but yes there is likely to be some performance hit.
It is better to write code that is easy to understand and hard to break. Arrays require you to put in a lot of checking code to ensure you don't access parts of the array you shouldn't, or put too many things in it, etc. Given that performance is not a problem the majority of the time, it shouldn't be a concern.

Related

Java HashSet performance

I understand that HashSet is based on HashMap, since they are pretty similar. It makes the code more flexible and minimizes implementation effort. However, one reference variable in the HashSet's Entry seems unnecessary to me if the class forbids null elements, so the whole Entry serves no purpose. Despite this, an Entry takes 24 bytes of memory per element, whereas a single array holding the set's elements would take only 4 bytes per element, if my figures are correct (aside from the array's header).
If my argument is correct, do the advantages really outweigh this performance hit?
(If I am wrong, I would learn from it as well.)
Though this question is primarily opinion-based, I'll summarize a few points on the topic:
HashSet appeared in Java 1.2 many years ago. It's hard to guess now the exact reasons for design decisions made at that time, but clearly Java wasn't used for high-load applications; performance played a smaller role than simplicity.
You are right that HashSet is suboptimal in its memory consumption. The problem is known, the bug JDK-6624565 is registered, and discussions on core-libs-dev are held from time to time. But is this a blocker for many real-world applications? Probably not.
For those uncommon applications where HashSet memory usage is unacceptable, there are already good alternatives, like Trove's THashSet.
Note that open addressing algorithms have their disadvantages, e.g. significant performance degradation with load factors close to 1, and difficulties with element removal. See the related answer.

When to ArrayList

Very much a beginner question here, but hopefully a pertinent one.
I've been attempting to teach myself Java by way of coding a crappy little roguelike.
Since I discovered the collections framework, I've found that I'm using arraylists absolutely everywhere - so much so in fact that I find myself worrying I’m being woefully inefficient by using them in places where a regular array would suffice.
Thus my question is this: Under what circumstances should I favour using an arrayList over a regular array (or vice-versa) and why? Is there some kind of simple rule of thumb to help me pick which I should be using for any given task?
I refute that this duplicates Array or List in Java. Which is faster? - my question asks in which situation one is more methodologically sound than the other, and not which is generally quicker for any given task.
As said in Effective Java, one should prefer Lists to arrays.
One of the major differences is that arrays are covariant in their element type and thus need careful handling. Also, their type is reified and they do not mix well with generics.
On the other hand, arrays can hold primitives while generic collections can't: they hold Objects. So you might prefer arrays in performance-critical parts of your code to avoid boxing and unboxing primitives.
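To make the covariance point concrete, here is a minimal sketch (CovarianceDemo and arrayStoreFails are hypothetical names): a String[] can be assigned to an Object[] variable, and storing the wrong type through it compiles but fails at run time, whereas the equivalent assignment with generic lists is rejected by the compiler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration only.
class CovarianceDemo {
    // Returns true if storing an Integer through an Object[] alias
    // of a String[] is rejected at run time.
    static boolean arrayStoreFails() {
        Object[] objects = new String[1]; // compiles: arrays are covariant
        try {
            objects[0] = Integer.valueOf(42); // wrong element type, caught only at run time
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(arrayStoreFails()); // prints true

        List<String> strings = new ArrayList<>();
        // List<Object> objects = strings; // does not compile: generics are invariant
        strings.add("ok");
    }
}
```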
If you know that your collection will always have a fixed length, use an array.
If your collection is variable in length, i.e. it could hold 1, 5 or 100 values, use an ArrayList.
Example:
An application asks the user a series of questions, and the user can try to get the answer right as many times as they like.
You create an array of possible answers to a question. You know there will only ever be 5 possible answers for each question, so you use an array of length 5 to store them.
You also decide to store all the answers the user submits. They could submit any number of answers, so you store these in an ArrayList: the user could give 1 or 100 answers before getting the question correct, and a fixed-length array wouldn't do the job.
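A small sketch of that example might look like this. The QuizQuestion class, its fields and its methods are invented here for illustration; only the "5 possible answers" and "any number of submissions" come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class modelling one quiz question.
class QuizQuestion {
    // Always exactly five possible answers: a fixed-length array fits.
    private final String[] possibleAnswers;
    private final int correctIndex;
    // Unknown number of attempts: a growable list fits.
    private final List<String> submittedAnswers = new ArrayList<>();

    QuizQuestion(String[] possibleAnswers, int correctIndex) {
        if (possibleAnswers.length != 5) {
            throw new IllegalArgumentException("expected exactly 5 possible answers");
        }
        this.possibleAnswers = possibleAnswers.clone();
        this.correctIndex = correctIndex;
    }

    // Records the attempt and reports whether it was correct.
    boolean submit(String answer) {
        submittedAnswers.add(answer);
        return possibleAnswers[correctIndex].equals(answer);
    }

    int attempts() {
        return submittedAnswers.size();
    }

    public static void main(String[] args) {
        QuizQuestion q = new QuizQuestion(new String[]{"1", "2", "3", "4", "5"}, 2);
        q.submit("1");
        q.submit("3");
        System.out.println(q.attempts()); // prints 2
    }
}
```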
Hope that helps

Why exactly are Java arrays not expansible? [duplicate]

I am starting to learn Java because my computer science school's assignments require it, and I'm liking the language. However, I've seen that Java arrays are not expansible: you must declare their length before using them, and it can't be changed afterwards.
I'd like to know exactly why that is. Why did the Java language designers choose to make arrays this way? I guess it's for performance reasons, but I'm not sure.
Thanks everyone in advance.
I'd like to know exactly why that is. Why did the Java language designers choose to make arrays this way? I guess it's for performance reasons, but I'm not sure.
They designed primitives and arrays to be as simple and low-level as possible. They don't do anything special, and arrays don't use object-oriented design at all, i.e. they only have a few useful methods, none specific to arrays.
The idea was that you would write higher level collections such as Lists using these low level constructs.
Java arrays are almost as simple as C arrays. A C array is just an allocated memory region of n*m bytes, where n is the number of elements in the array and m is the number of bytes needed to store a single element.
The only things Java added here are length and the methods inherited from Object, such as toString(). Any other features could hurt array performance. Collections provide those features very well. Moreover, collections are written in Java itself, which makes them portable.
Why did the Java language designers choose to make arrays this way?
Arrays are one of the basic data structures provided by the language. If arrays were also expandable, they would become similar to ArrayList.
So I guess there are two reasons:
To make Java similar to previous languages on basic constructs.
To remove duplication.
Arrays occupy consecutive memory locations, and there is no guarantee that the locations following the end of the array are free to be added to it.
That is why many people use LinkedList or ArrayList.
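As an illustration, "growing" an array in practice means allocating a new, longer array and copying the old elements across. The sketch below uses Arrays.copyOf for that; the GrowArray class and its append method are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only.
class GrowArray {
    // Returns a new, longer array holding the old elements plus one more.
    // The original array keeps its length; it cannot be resized in place.
    static int[] append(int[] array, int value) {
        int[] grown = Arrays.copyOf(array, array.length + 1); // new allocation plus copy
        grown[array.length] = value;
        return grown;
    }

    public static void main(String[] args) {
        int[] original = {1, 2};
        int[] grown = append(original, 3);
        System.out.println(original.length);        // prints 2
        System.out.println(Arrays.toString(grown)); // prints [1, 2, 3]
    }
}
```

Doing this on every insertion copies the whole array each time, which is why resizable lists grow their backing storage in larger steps.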
This question is answered in:
Why aren't arrays expandable?

Initializing ArrayList with a new operator in Java?

What is the best practice for initializing an ArrayList in Java?
If I initialize an ArrayList using the new operator, then the ArrayList will by default have memory allocated for 10 buckets. Which is a performance hit.
I don't know, maybe I am wrong, but it seems to me that I should create an ArrayList by specifying the size, if I am sure about the size!
Which is a performance hit.
I wouldn't worry about the "performance hit". Object creation in Java is very fast. The performance difference is unlikely to be measurable by you.
By all means use a size if you know it. If you don't, there's nothing to be done about it anyway.
The kind of thinking that you're doing here is called "premature optimization". Donald Knuth says it's the root of all evil.
A better approach is to make your code work before you make it fast. Optimize with data in hand that tells you where your code is slow. Don't guess - you're likely to be wrong. You'll find that you rarely know where the bottlenecks are.
If you know how many elements you will add, initialize the ArrayList with the correct capacity. If you don't, don't worry about it. The performance difference is probably insignificant.
This is the best advice I can give you:
Don't worry about it. Yes, you have several options for creating an ArrayList, but using new with the default constructor provided by the library isn't a BAD choice; otherwise it would be stupid to make it the default choice for everyone without clarifying what's better.
If it turns out that this is a problem, you'll quickly discover it when you profile. That's the proper place to find problems, when you profile your application for performance/memory problems. When you first write the code, you don't worry about this stuff -- that's premature optimization -- you just worry about writing good, clean code, with good design.
If your design is good, you should be able to fix this problem in no time, with little impact on the rest of the system. Effective Java 2nd Edition, Item 52: Refer to objects by their interfaces. You may even be able to switch to a LinkedList, or any other kind of List out there, if that turns out to be a better data structure. Design for this kind of flexibility.
Finally, Effective Java 2nd Edition, Item 1: Consider static factory methods instead of constructors. You may even be able to combine this with Item 5: Avoid creating unnecessary objects, if in fact no new instances are actually needed (e.g. Integer.valueOf doesn't always create a new instance).
Related questions
Java Generics Syntax - in-depth about type inferring static factory methods (also in Guava)
On ArrayList micromanagement
Here are some specific tips if you need to micromanage an ArrayList:
You can use ArrayList(int initialCapacity) to set the initial capacity of a list. The list will automatically grow beyond this capacity if needed.
When you're about to populate/add to an ArrayList, and you know what the total number of elements will be, you can use ensureCapacity(int minCapacity) (or the constructor above directly) to reduce the number of intermediate growth steps. Each add will run in amortized constant time regardless of whether or not you do this (as guaranteed in the API), so this can only reduce the cost by a constant factor.
You can call trimToSize() to minimize the storage usage.
This kind of micromanagement is generally unnecessary, but should you decide (justified by conclusive profiling results) that it's worth the hassle, you may choose to do so.
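For reference, the three calls mentioned above could be used roughly like this. It is only a sketch: CapacityDemo and its methods are hypothetical, and the sizes are arbitrary. Note that ensureCapacity and trimToSize are declared on ArrayList, not on the List interface, so the variable has to be typed as ArrayList.

```java
import java.util.ArrayList;

// Hypothetical demo class for illustration only.
class CapacityDemo {
    // Pre-size the list when the element count is known up front.
    static ArrayList<Integer> build(int expectedSize) {
        ArrayList<Integer> list = new ArrayList<>(expectedSize); // initial capacity
        for (int i = 0; i < expectedSize; i++) {
            list.add(i);
        }
        return list;
    }

    // Reserve room once before a bulk append, then release any slack.
    static void appendMany(ArrayList<Integer> list, int extra) {
        list.ensureCapacity(list.size() + extra); // at most one growth step
        for (int i = 0; i < extra; i++) {
            list.add(i);
        }
        list.trimToSize(); // shrink the backing array to the current size
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = build(100);
        appendMany(list, 50);
        System.out.println(list.size()); // prints 150
    }
}
```

None of these calls changes what the list contains; they only affect how the backing array is allocated.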
See also
Collections.singletonList - Returns an immutable list containing only the specified object.
If you already know the size of your ArrayList (approximately), you should use the constructor with a capacity. But most of the time developers don't really know what will be in the List, so the default capacity of 10 should be sufficient for most cases.
10 buckets is an approximation and isn't a performance hit unless you already know that your ArrayList will contain tons of elements. In that case, the need to resize the array repeatedly will be the performance hit.
You don't need to specify the initial size of an ArrayList. You can always add/remove any element from it easily.
If this is a performance matter, please keep the following in mind:
Initialization of an ArrayList is very fast. Don't worry about it.
Adding/removing elements from an ArrayList is also very fast. Don't worry about it.
If you find your code runs too slowly, the first thing to blame is your algorithm, no offense. Machine specs, OS and language do play a part too, but their contribution is insignificant compared to that of your algorithm.
If you don't know the size of the ArrayList, then you're probably better off using a LinkedList, since the LinkedList.add() operation runs in constant time.
However as most people here have said you should not worry about speed before you do some kind of profiling.
You can use this old, but good (in my opinion) article for reference.
http://chaoticjava.com/posts/linkedlist-vs-arraylist/
Since ArrayList is backed by an underlying array, an initial size has to be chosen for that array.
If you really care, you can call trimToSize() once you have constructed and populated the object. The javadoc states that the capacity will be at least as large as the list size. As previously stated, it's unlikely you will find that the memory allocated to an ArrayList is a performance bottleneck, and if it were, I would recommend you use an array instead.

Why not always use ArrayLists in Java, instead of plain ol' arrays?

Quick question here: why not ALWAYS use ArrayLists in Java? They apparently have the same access speed as arrays, in addition to extra useful functionality. I understand the limitation that they cannot hold primitives, but this is easily mitigated by the use of wrappers.
Plenty of projects do just use ArrayList or HashMap or whatever to handle all their collection needs. However, let me put one caveat on that. Whenever you are creating classes and using them throughout your code, if possible refer to the interfaces they implement rather than the concrete classes you are using to implement them.
For example, rather than this:
ArrayList insuranceClaims = new ArrayList();
do this:
List insuranceClaims = new ArrayList();
or even:
Collection insuranceClaims = new ArrayList();
If the rest of your code only knows it by the interface it implements (List or Collection) then swapping it out for another implementation becomes much easier down the road if you find you need a different one. I saw this happen just a month ago when I needed to swap out a regular HashMap for an implementation that would return the items to me in the same order I put them in when it came time to iterate over all of them. Fortunately just such a thing was available in the Jakarta Commons Collections and I just swapped out A for B with only a one line code change because both implemented Map.
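A sketch of that kind of swap is below. It uses java.util.LinkedHashMap as a stand-in for the insertion-ordered map (the answer above used a Jakarta Commons Collections class instead), and the ClaimRegistry class and its methods are invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class for illustration only.
class ClaimRegistry {
    // Only this line names the implementation. Changing it from
    // new HashMap<>() to new LinkedHashMap<>() is the whole swap,
    // because the rest of the class only depends on Map.
    private final Map<String, Integer> claims = new LinkedHashMap<>();

    void add(String id, int amount) {
        claims.put(id, amount);
    }

    // With LinkedHashMap, iteration follows insertion order.
    List<String> idsInIterationOrder() {
        return new ArrayList<>(claims.keySet());
    }

    public static void main(String[] args) {
        ClaimRegistry registry = new ClaimRegistry();
        registry.add("z", 1);
        registry.add("a", 2);
        registry.add("m", 3);
        System.out.println(registry.idsInIterationOrder()); // prints [z, a, m]
    }
}
```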
If you need a collection of primitives, then an array may well be the best tool for the job. Boxing is a comparatively expensive operation. For a collection (not including maps) of primitives that will be used as primitives, I almost always use an array to avoid repeated boxing and unboxing.
I rarely worry about the performance difference between an array and an ArrayList, however. If a List will provide better, cleaner, more maintainable code, then I will always use a List (or Collection or Set, etc, as appropriate, but your question was about ArrayList) unless there is some compelling reason not to. Performance is rarely that compelling reason.
Using Collections almost always results in better code, in part because arrays don't play nice with generics, as Johannes Weiß already pointed out in a comment, but also because of so many other reasons:
Collections have a very rich API and a large variety of implementations that can (in most cases) be trivially swapped in and out for each other
A Collection can be trivially converted to an array, if occasional use of an array version is useful
Many Collections grow more gracefully than an array grows, which can be a performance concern
Collections work very well with generics, arrays fairly badly
As TofuBeer pointed out, array covariance is strange and can act in unexpected ways that no object will act in. Collections handle covariance in expected ways.
Arrays need to be manually sized to their task, and if an array is not full you need to keep track of that yourself. If an array needs to be resized, you have to do that yourself.
All of this together, I rarely use arrays and only a little more often use an ArrayList. However, I do use Lists very often (or just Collection or Set). My most frequent use of arrays is when the item being stored is a primitive and will be inserted and accessed and used as a primitive. If boxing and unboxing ever become so fast that they become a trivial consideration, I may revisit this decision, but it is more convenient to work with something, to store it, in the form in which it is always referenced. (That is, 'int' instead of 'Integer'.)
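To show where the boxing happens, here is a minimal sketch with hypothetical names (BoxingDemo, sumPrimitive, sumBoxed). It makes no claim about how large the cost is in practice; that would need measuring.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration only.
class BoxingDemo {
    // int[] stores the values directly; no wrapper objects are involved.
    static long sumPrimitive(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // List<Integer> stores references to Integer objects;
    // each element is unboxed when it is added to the total.
    static long sumBoxed(List<Integer> values) {
        long total = 0;
        for (Integer v : values) {
            total += v; // auto-unboxing
        }
        return total;
    }

    public static void main(String[] args) {
        int[] primitives = {1, 2, 3};
        List<Integer> boxed = new ArrayList<>();
        for (int v : primitives) {
            boxed.add(v); // auto-boxing: int becomes Integer
        }
        System.out.println(sumPrimitive(primitives)); // prints 6
        System.out.println(sumBoxed(boxed));          // prints 6
    }
}
```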
This is a case of premature unoptimization :-). You should never do something because you think it will be better/faster/make you happier.
ArrayList has extra overhead. If you have no need of the extra features of ArrayList, then it is wasteful to use one.
Also, for some of the things you can do with a List there is the Arrays class, which means the claim that ArrayList provides more functionality than arrays is less true. Now, using those might be slower than using an ArrayList, but it would have to be profiled to be sure.
You should never try to make something faster without being sure that it is slow to begin with... which would imply that you should go ahead and use ArrayLists until you find out that they are a problem and slow the program down. However, there should be common sense involved too: ArrayList has overhead, and the overhead will be small but cumulative. It will not be easy to spot in a profiler, as all it is is a little overhead here and a little overhead there. So common sense would say that unless you need the features of ArrayList you should not make use of it, unless you want to die by a thousand cuts (performance-wise).
For internal code, if you find that you do need to change from arrays to ArrayList, the change is pretty straightforward in most cases ([i] becomes get(i); that will be 99% of the changes).
If you are using the for-each loop (for( value : items) { }) then there is no code to change for that either.
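A side-by-side sketch of that migration, with made-up names (MigrationDemo, first, totalLength): the indexed access changes from [i] to get(i), while the for-each loop body stays exactly the same.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class for illustration only.
class MigrationDemo {
    // Indexed access on an array uses [i] ...
    static String first(String[] items) {
        return items[0];
    }

    // ... and becomes get(i) on a List.
    static String first(List<String> items) {
        return items.get(0);
    }

    // The for-each loop is identical for arrays ...
    static int totalLength(String[] items) {
        int total = 0;
        for (String value : items) {
            total += value.length();
        }
        return total;
    }

    // ... and for Lists; only the parameter type differs.
    static int totalLength(List<String> items) {
        int total = 0;
        for (String value : items) {
            total += value.length();
        }
        return total;
    }

    public static void main(String[] args) {
        String[] array = {"ab", "cde"};
        List<String> list = Arrays.asList(array);
        System.out.println(first(array) + " " + first(list));             // prints ab ab
        System.out.println(totalLength(array) + " " + totalLength(list)); // prints 5 5
    }
}
```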
Also, going with what you said:
1) Equal access speed: that depends on your environment. For instance, the Android VM doesn't inline methods (it is just a straight interpreter as far as I know), so access there will be much slower. There are other operations on an ArrayList that can cause slowdowns, depending on what you are doing, regardless of the VM (and which could be faster with a straight array; again, you would have to profile or examine the source to be sure).
2) Wrappers increase the amount of memory being used.
You should not worry about speed/memory before you profile something, on the other hand you shouldn't choose what you know to be a slower option unless you have a good reason to.
Performance should not be your primary concern.
Use List interface where possible, choose concrete implementation based on actual requirements (ArrayList for random access, LinkedList for structural modifications, ...).
You should be concerned about performance.
Use arrays, System.arraycopy, java.util.Arrays and other low-level stuff to squeeze out every last drop of performance.
Well don't always blindly use something that is not right for the job. Always start off using Lists, choose ArrayList as your implementation. This is a more OO approach. If you don't know that you specifically need an array, you'll find that not tying yourself to a particular implementation of List will be much better for you in the long run. Get it working first, optimize later.
