What collection should I use [duplicate] - java

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Rule of thumb for choosing an implementation of a Java Collection?
My situation is this:
I have a collection of objects that I need to hold and every now and then iterate through
The size of the collection is dynamic
The iteration should access each element
The collection does not need to be sorted
Creating or updating the collection has no time constraints but I'd like to iterate through the collection as fast as possible.
What would be the best Collection to use (or would you perhaps suggest using an array?)

You can use a List collection probaly ArrayList.
It depends on the parameters like whether it is an ordered collection, whether you want to maintain the insertion order, whether you want to maintain uniqueness etc
List vs Set
Set: Unique, unordered collection
List: ordered collection, allows duplicate elements
ArrayList vs LinkedList

In general : If you don't have particular constraints, an ArrayList is you best bet. Unless you have extremely tight performance control, don't go for a blank array, you have too much chance for errors (and a big chance of not outperforming the ArrayList implementation).
In your case the fast iteration requirements means ArrayList is a good choice.

Related

How to decide which collection should be used from List Set? [duplicate]

This question already has answers here:
When to use LinkedList over ArrayList in Java?
(33 answers)
Which Java Collection should I use?
(6 answers)
Closed 6 years ago.
I want to know when to use Set and List.On which basis it should be decided.For example, when we are dealing with order application, at that what should I use?
List any = new ArrayList<>();
or
Set any = new HashSet();
or LinkedList.
It all depends upon your current requirements
for example, consider some important points about
If you want to access the elements in the same way you're inserting them, then you should use List because List is an ordered collection of elements. You can access them using get(int index) method, whereas no such method is available for Set. The order in which they will be stored is not guaranteed.
If your elements contains duplicates, then use List because Set doesn't allow duplicates whereas if your elements are unique, then you can use Set.
As far as LinkedList and ArrayList are considered:
LinkedList are slow because they allow only sequential access. But they're good if your elements size is regularly changing, whereas if the size of your elements is fixed, then you should use ArrayList because they allow fast random read access, so you can grab any element in constant time.
However, ArrayList are not good when you require large delete operations because adding or removing from anywhere but the end requires shifting all the latter elements over.
ArrayList are not considered good when you have to insert anything in middle because if you want to insert a new element in the middle (and keep all the elements in the same order) then you're going to have to shift everything after that spot where element was inserted, whereas such operation in LinkedList requires only change of some references.
Take a look at the features of several data structures and according to your requirements, you can decide where you should use which data structure.
also see:
- When to use LinkedList over ArrayList?
- What is the difference between Set and List?
- What Java Collection should I use?
- Insertion in the middle of ArrayList vs LinkedList

Java HashSet vs Array Performance

I have a collection of objects that are guaranteed to be distinct (in particular, indexed by a unique integer ID). I also know exactly how many of them there are (and the number won't change), and was wondering whether Array would have a notable performance advantage over HashSet for storing/retrieving said elements.
On paper, Array guarantees constant time insertion (since I know the size ahead of time) and retrieval, but the code for HashSet looks much cleaner and adds some flexibility, so I'm wondering if I'm losing anything performance-wise using it, at least, theoretically.
Depends on your data;
HashSet gives you an O(1) contains() method but doesn't preserve order.
ArrayList contains() is O(n) but you can control the order of the entries.
Array if you need to insert anything in between, worst case can be O(n), since you will have to move the data down and make room for the insertion. In Set, you can directly use SortedSet which too has O(n) too but with flexible operations.
I believe Set is more flexible.
The choice greatly depends on what do you want to do with it.
If it is what mentioned in your question:
I have a collection of objects that are guaranteed to be distinct (in particular, indexed by a unique integer ID). I also know exactly how many of them there are
If this is what you need to do, the you need neither of them. There is a size() method in Collection for which you can get the size of it, which mean how many of them there are in the collection.
If what you mean for "collection of object" is not really a collection, and you need to choose a type of collection to store your objects for further processing, then you need to know, for different kind of collections, there are different capabilities and characteristic.
First, I believe to have a fair comparison, you should consider using ArrayList instead Array, for which you don't need to deal with the reallocation.
Then it become the choice of ArrayList vs HashSet, which is quite straight-forward:
Do you need a List or Set? They are for different purpose: Lists provide you indexed access, and iteration is in order of index. While Sets are mainly for you to keep a distinct set of data, and given its nature, you won't have indexed access.
After you made your decision of List or Set to use, then it is a choice of List/Set implementation, normally for Lists, you choose from ArrayList and LinkedList, while for Sets, you choose between HashSet and TreeSet.
All the choice depends on what you would want to do with that collection of data. They performs differently on different action.
For example, an indexed access in ArrayList is O(1), in HashSet (though not meaningful) is O(n), (just for your interest, in LinkedList is O(n), in TreeSet is O(nlogn) )
For adding new element, both ArrayList and HashSet is O(1) operation. Inserting in the middle is O(n) for ArrayList, while it doesn't make sense in HashSet. Both will suffer from reallocation, and both of them need O(n) for the reallocation (HashSet is normally slower in reallocation, because it involve calculation of hash for each element again).
To find if certain element exists in the collection, ArrayList is O(n) and HashSet is O(1).
There are still lots of operations you can do, so it is quite meaningless to discuss for performance without knowing what you want to do.
theoretically, and as SCJP6 Study guide says :D
arrays are faster than collections, and as said, most of the collections depend mainly on arrays (Maps are not considered Collection, but they are included in the Collections framework)
if you guarantee that the size of your elements wont change, why get stuck in Objects built on Objects (Collections built on Arrays) while you can use the root objects directly (arrays)
It looks like you will want an HashMap that maps id's to counts. Particularly,
HashMap<Integer,Integer> counts=new HashMap<Integer,Integer>();
counts.put(uniqueID,counts.get(uniqueID)+1);
This way, you get amortized O(1) adds, contains and retrievals. Essentially, an array with unique id's associated with each object IS a HashMap. By using the HashMap, you get the added bonus of not having to manage the size of the array, not having to map the keys to an array index yourself AND constant access time.

java - what is the difference between a list and an arraylist [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
What is a List vs. an ArrayList?
Ive used both of them, but im just wondering what are the pros and cons between them? What are the major differences? And which one is better to use?
Thanks.
List is an interface implemented by ArrayList class. Another well-known implementation of the List is LinkedList.
ArrayList provides constant-time random access, while LinkedList provides constant time for non-sequential access. When you declare a variable that will hold an ArrayList, consider accessing it through an interface, like this:
List<ElementType> myList = new ArrayList<ElementType>();
This will let you swap in a different implementation without disturbing the rest of your code.
List merely describes the contract of what it means to be a list. As such, it is not a concrete implementation but merely an interface. A list can be implemented in a number of ways.
In particular, you have ArrayList, which internally keeps a dynamic array for storing all the elements in order. You also have LinkedList, which stores elements as a doubly linked list i.e. a sequence of nodes which keep references to the previous and next nodes.
Vector is another List, much like an ArrayList in that its implementation is based on a dynamic array; it's, however, a relic of the older versions of Java and is guaranteed to be thread-safe by being wholly synchronized. In practice, new Vector<T>() is more-or-less equivalent to Collections.synchronizedList(new ArrayList<T>()).
The reason for having a List is that a list can come implemented in a number of ways. That being said, often you want to have some sort of generic behavior that can be applicable to all Lists... see polymorphism.
A List is an interface, and an ArrayList is an implementation of that interface. An ArrayList is a List, and so are LinkedLists, Stacks, Vectors, etc.
the other posters already answered the "what" part of your question. Some considerations to think about when choosing between them.
An ArrayList uses an array behind the scenes. So accessing by index can be done in constant time. Adding can also be done in constant time, if the array has been allocated with enough space. However, when the space runs out, ArrayList will allocate a larger array and copy the old array values into the new one.
A LinkedList uses nodes that are chained together. Accessing by an index can potentially require walking the entire list (linear time). Inserting only requires creating a new node and adding it at the end (which could be constant time if a tail pointer is maintained).
So "which one is better" can depend on how you are using it. Truthfully, I've never measured performance differences between the two, but it's just something to consider.

Most Lightweight Java Collection

If I am going to create a Java Collection, and only want to fill it with elements, and then iterate through it (without knowing the necessary size beforehand), i.e. all I need is Collection<E>.add(E) and Collection<E>.iterator(), which concrete class should I choose? Is there any advantage to using a Set rather than a List, for example? Which one would have the least overhead?
which concrete class should I choose?
I would probably just go with an ArrayList or a LinkedList. Both support the add and iterator methods, and neighter of them have any considerable overhead.
Is there any advantage to using a Set rather than a List, for example?
No, I wouldn't say so. (Unless you rely on the order of the elements, in which case you must use a List, or want to disallow duplicates, in which case you should use a Set.)
(I don't see how any Set implementation could beat a list implementation for add / iterator methods, so I'd probably go with a List even if I don't care about order.)
Which one would have the least overhead?
Sounds like micro benchmarking here, but if I'd be forced to guess, I'd say ArrayList (or perhaps LinkedList in coner cases where ArrayLists need to reallocate memory often :-)
Do not go with a Set. Sets and Lists differ according to their purpose, that you should always consider when choosing the right Collection
a List is there for maintaining elements in the order you added them; and if you insert the same element twice it will be kept twice
a Set is there for holding one specific element exactly once (uniqueness); order is only relevant for specific implementations (like TreeSet), but still elements that are 'the same' would not be added twice
Set is only meaningful if you want to sort your objects and to make sure no duplicate element is 'registered'. Else, an ArrayList is just fine.
However, if you want to add elements while iterating too, an ArrayBlockingQueue is better.
Here are some key points which can help you to choose your collection according to your requirement -
List(ArrayList or LinkedList)
Allowed duplicate values.
Insertion order preserved.
Set
Not allowed duplicate values.
Insertion order is not preserved.
So according to your requirement List seems to be a suitable choice.
Now Between ArrayList and LinkedList -
ArrayList is a random access list. Use if your frequent operation is the retrieval of elements.
LinkedList is the best option if you want to add or remove elements from the list.

Complexity of processing a collection's values

I need to store a growing large number of objects in a collection. While performing actions of each object of the collection, I regularly need to check whether an object is already stored. If an object is not stored yet I will add it to the end of the collection. I process each object iteratively while doing the checks.
Objects already processed should not be removed from the collection because I do not want put them back to processing when I stumble upon them again.
As a result I do not know what collection may fit best. HashSet has a constant time "contains" method but a List has faster methods to iterate over its elements, right ?
What would be the wiser choice ? Would it be relevant to keep two different structures at a time containing the same nodes, a HashSet for the checks and a LinkedList for the processing ?
As a result I do not know what collection may fit best. HashSet has a constant time "contains" method but a List has faster methods to iterate over its elements, right ?
How about a LinkedHashSet?
Hash table and linked list implementation of the Set interface, with predictable iteration order. This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order)
1) Use ArrayList, not LinkedList. LinkedLists consume a lot of memory, and it's slower on iteration than ArrayList.
2) I'd suggest to use two data structures. E.g. for the sake of you being unable to add to a collection wile iterating through it (ConcurrentModificationException)
Well, it seems you are interested in two views on your collection.
A queue like view, adding things to the end and inspecting them at the front.
A contains check
All those operations are well supported in different kinds of heaps, e.g. java.util.PriorityQueue

Categories