Which implementation is less "heavy": PriorityQueue or a sorted LinkedList (using a Comparator)?
I want to have all the items sorted. Insertions will be very frequent, and occasionally I will have to traverse the whole list to perform some operations.
A LinkedList is the worst choice. Either use an ArrayList (or, more generally, a RandomAccess implementor), or PriorityQueue. If you do use a list, sort it only before iterating over its contents, not after every insert.
One thing to note is that the PriorityQueue iterator does not provide the elements in order; you'll actually have to remove the elements (empty the queue) to iterate over its elements in order.
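To make the point about iteration order concrete, here is a minimal sketch (the class and method names are made up for illustration). It drains a PriorityQueue with poll(), which is the only way to see its elements in priority order; printing or iterating the queue directly exposes the internal heap layout instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class DrainInOrder {
    // Removes every element from the queue and returns them in priority order.
    // The queue is empty afterwards.
    static <T> List<T> drain(PriorityQueue<T> queue) {
        List<T> ordered = new ArrayList<>(queue.size());
        while (!queue.isEmpty()) {
            ordered.add(queue.poll());
        }
        return ordered;
    }

    public static void main(String[] args) {
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        queue.add(5);
        queue.add(1);
        queue.add(4);
        queue.add(2);
        queue.add(3);
        // toString() uses the iterator, so this shows heap order, not sorted order.
        System.out.println("iteration order: " + queue);
        System.out.println("poll order:      " + drain(queue));
    }
}
```

If the queue has to survive the traversal, drain a copy (new PriorityQueue<>(original)) rather than the original.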
You should implement both and then do performance testing on actual data to see which works best in your specific circumstances.
I have made a small benchmark on this issue. If you want your list to be sorted after all insertions have finished, then there is almost no difference between PriorityQueue and LinkedList (LinkedList is a bit better, 5 to 10 percent quicker on my machine); however, if you use ArrayList you will get sorting almost 2 times quicker than with PriorityQueue.
In my benchmark for the lists I measured the time from the beginning of filling the list with values until the end of sorting. For PriorityQueue I measured from the beginning of filling until the end of polling all elements (because elements only come out of a PriorityQueue in order as they are removed, as mentioned in erickson's answer).
Adding an object to the priority queue is O(log n), and the same goes for each poll. If you are doing inserts frequently on very large queues then this could impact performance. Appending to the end of an ArrayList is amortized constant time, so on the whole all those inserts will go faster on the ArrayList than on the priority queue.
If you need to grab ALL the elements in sorted order, Collections.sort will do it in about O(n log n) time total, whereas each poll from the priority queue is O(log n), so grabbing all n things from the queue is again O(n log n).
The use case where the priority queue wins is if you need to know what the biggest value in the queue is at any given time. To do that with the ArrayList you have to sort (or at least scan) the whole list each time you want to know the biggest. The priority queue always knows which element is at its head; note that java.util.PriorityQueue keeps the least element at the head, so supply a reversed comparator if the head should be the biggest.
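As a rough illustration of that use case, the sketch below builds a PriorityQueue with a reversed comparator so that peek() returns the largest element seen so far without removing anything. The class and method names are hypothetical.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class LargestAtHead {
    // A queue ordered by reverse natural order, so the head is the largest element.
    static PriorityQueue<Integer> newMaxQueue() {
        return new PriorityQueue<>(Comparator.<Integer>reverseOrder());
    }

    public static void main(String[] args) {
        PriorityQueue<Integer> queue = newMaxQueue();
        queue.add(7);
        queue.add(42);
        queue.add(19);
        // peek() looks at the head in constant time and leaves the queue unchanged.
        System.out.println("largest so far: " + queue.peek());
    }
}
```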
If you use a LinkedList, you would need to re-sort the items each time you added one, and since inserts are frequent, I wouldn't use a LinkedList. So in this case, I would use a PriorityQueue. If you will only be adding unique elements to the list, I recommend using a SortedSet (one implementation is the TreeSet).
There is a fundamental difference between the two data structures and they are not as easily interchangeable as you might think.
According to the PriorityQueue documentation:
The Iterator provided in method iterator() is not guaranteed to traverse the elements of the priority queue in any particular order.
Use an ArrayList and call Collections.sort() on it only before iterating the list.
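One way to sketch this "sort only before iterating" idea is a small wrapper that remembers whether anything was added since the last sort. LazySortedList is a made-up name, not a JDK class, and this is only one possible shape for it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class LazySortedList<T> {
    private final List<T> items = new ArrayList<>();
    private final Comparator<? super T> order;
    private boolean dirty = false; // true when items were added since the last sort

    LazySortedList(Comparator<? super T> order) {
        this.order = order;
    }

    // Cheap append; no sorting happens here.
    void add(T item) {
        items.add(item);
        dirty = true;
    }

    // Sorts only if needed, then exposes a read-only view for iteration.
    List<T> sortedView() {
        if (dirty) {
            items.sort(order);
            dirty = false;
        }
        return Collections.unmodifiableList(items);
    }
}
```

With frequent inserts and only occasional traversals, the sorting cost is paid once per traversal instead of once per insert.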
The issue with PriorityQueue is that you have to empty the queue to get the elements in order. If that is what you want then it is a fine choice. Otherwise you could use an ArrayList that you sort only when you need the sorted result or, if the items are distinct (relative to the comparator), a TreeSet. Both TreeSet and ArrayList are not very 'heavy' in terms of space; which is faster depends on the use case.
Do you need it sorted at all times? If that's the case, you might want to go with something like a tree-set (or other SortedSet with a fast lookup).
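For the "sorted at all times" case, a TreeSet keeps its elements ordered after every insert, but it silently drops elements that compare as equal under its comparator. The sketch below uses String.CASE_INSENSITIVE_ORDER purely as an example comparator; the class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class AlwaysSorted {
    // Returns the input sorted case-insensitively, keeping only the first
    // of any group of strings that the comparator treats as equal.
    static List<String> sortedDistinct(List<String> input) {
        TreeSet<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.addAll(input);
        return new ArrayList<>(set); // iteration order of a TreeSet is its sort order
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("pear");
        input.add("Apple");
        input.add("apple");
        input.add("fig");
        System.out.println(sortedDistinct(input));
    }
}
```

If duplicates must be kept, this is not a drop-in replacement for a sorted list.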
If you only need it sorted occasionally, go with a linked list and sort it when you need access. Let it be unsorted when you don't need access.
java.util.PriorityQueue is "An unbounded priority queue based on a priority heap". The heap data structure makes much more sense than a linked list.
I can see two options, which one is better depends on whether you need to be able to have duplicate items.
If you don't need to maintain duplicate items in your list, I would use a SortedSet (probably a TreeSet).
If you need to maintain duplicate items, I would go with a LinkedList and insert new items into the list in the correct order.
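Inserting into a LinkedList "in the correct order" could look roughly like the following sketch, which walks the list with a ListIterator until it finds the first larger element. The helper name is hypothetical. Keep in mind that the walk makes each insert O(n), even though the link update itself is constant time.

```java
import java.util.Comparator;
import java.util.LinkedList;
import java.util.ListIterator;

class SortedInsert {
    // Inserts item before the first element that is greater than it,
    // so an already-sorted list stays sorted and duplicates are kept.
    static <T> void insertSorted(LinkedList<T> list, T item, Comparator<? super T> order) {
        ListIterator<T> it = list.listIterator();
        while (it.hasNext()) {
            if (order.compare(it.next(), item) > 0) {
                it.previous(); // step back so the new item lands before the larger one
                break;
            }
        }
        it.add(item); // at the end of the list if no larger element was found
    }
}
```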
The PriorityQueue doesn't really fit unless you want to remove the items whenever you do operations.
Going along with the others, make sure you use profiling to make sure you're picking out the correct solution for your particular problem.
IMHO: we don't need PriorityQueue if we have LinkedList. I can sort a queue with LinkedList faster than with PriorityQueue, e.g.
Queue<String> list = new PriorityQueue<String>();
list.add("1");
list.add("3");
list.add("2");
while (!list.isEmpty()) {
    String s = list.poll();
    System.out.println(s);
}
I believe this code takes too long, because each time I add an element it sorts the elements. But if I use the following code:
// LinkedList is both a Queue and a List, so no cast is needed to sort it.
LinkedList<String> list = new LinkedList<String>();
list.add("1");
list.add("3");
list.add("2");
Collections.sort(list);
while (!list.isEmpty()) {
    String s = list.poll();
    System.out.println(s);
}
I think this code sorts only once and will be faster than using PriorityQueue. So I can sort my LinkedList once, before using it, and in any case it will work faster. And even if it takes the same time, I don't really need PriorityQueue; we really don't need this class.
Related
I appeared for an interview where the interviewer asked me about ArrayList, LinkedList and Vector. His question was:
ArrayList, LinkedList, and Vector are all implementations of the List interface. Which of them is most efficient for adding and removing elements from the list? And I was supposed to answer including any other alternatives I may be aware of.
I answered him, but he did not seem very impressed by my answer.
Can someone tell me more about this?
Thank you
LinkedList is implemented as a doubly linked list. Its performance on add and remove is better than ArrayList, but worse on the get and set methods. You will have to traverse the list up to a certain point in those cases. So, definitely not LinkedList.
ArrayList is implemented as a resizable array. As more elements are added to an ArrayList, its capacity is increased dynamically. Its elements can be accessed directly by using the get and set methods, since ArrayList is essentially an array.
Vector is similar to ArrayList, but it is synchronised.
ArrayList is a better choice if your program does not need the list itself to be synchronised, for example because it is single-threaded or handles thread safety elsewhere. Vector and ArrayList require more space as more elements are added. By default, Vector doubles its array size each time it grows, while ArrayList grows by 50% of its size each time.
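The growth behaviour can be observed for Vector, because it exposes capacity(); ArrayList has no such method, so its 50% growth cannot be checked the same way. The sketch below (class and method names are made up) fills a Vector just past its initial capacity and reports the new capacity, assuming the default capacity increment of zero.

```java
import java.util.Vector;

class VectorGrowth {
    // Adds one element more than fits and returns the resulting capacity.
    // Assumes initialCapacity > 0 and the default capacityIncrement (0),
    // in which case Vector doubles its backing array when it has to grow.
    static int capacityAfterOverflow(int initialCapacity) {
        Vector<Integer> v = new Vector<>(initialCapacity);
        for (int i = 0; i <= initialCapacity; i++) {
            v.add(i);
        }
        return v.capacity();
    }

    public static void main(String[] args) {
        System.out.println("capacity after overflowing 4 slots: " + capacityAfterOverflow(4));
    }
}
```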
LinkedList, however, also implements Queue interface which adds more methods than ArrayList and Vector, such as offer(), peek(), poll(), etc.
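A short sketch of those queue methods on a LinkedList, with a hypothetical helper that polls until nothing is left. It assumes the queue holds no null elements, since poll() also uses null to signal an empty queue.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

class QueueMethods {
    // Removes elements head-first until poll() reports an empty queue.
    static <T> List<T> pollAll(Queue<T> queue) {
        List<T> out = new ArrayList<>();
        T item;
        while ((item = queue.poll()) != null) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        Queue<String> queue = new LinkedList<>();
        queue.offer("first");  // adds at the tail
        queue.offer("second");
        System.out.println(queue.peek());   // looks at the head without removing it
        System.out.println(pollAll(queue)); // removes everything in FIFO order
    }
}
```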
A lot is dependent on what kind of requirement you are working on. A decision can be taken depending upon needs.
LinkedList is best suited for adding/removing items, the reason being that you just change the links between the items without manipulating other unrelated items to accomplish the current operation. This makes insertion and removal at an already-located position comparatively faster than in array-backed containers.
Cheers!
Choose LinkedList if you have a lot of data to add to and remove from the list, but be careful: if you need to get elements from your list by index, this is not the right data structure.
List<T> list = new LinkedList<T>();
LinkedList is implemented using a doubly linked list, so its performance is good when adding elements to and removing elements from the list.
Performance of ArrayList vs. LinkedList :
The time complexity for LinkedList is as follows (remove() is O(1) only once the position has been reached, e.g. through an iterator):
get() : O(n)
add() : O(1)
remove() : O(1)
It looks like I can use neither an ArrayList nor a Set:
Set<> - I can avoid duplicates using a set, but no shuffle option // Collections.shuffle(List<?> list)
ArrayList<> - I can use shuffle to randomise the list, but duplicates are allowed.
I could use a Set and convert this into an ArrayList (or the other way around) to avoid the duplicates. Alternatively, loop through the set to randomise the items. But I am looking for something more efficient.
You can maintain two separate collections, an ArrayList and a HashSet, and reject insertion of any item which is present in the HashSet.
If you are concerned with encapsulation, wrap the two collections in a meta-object that implements List, and carefully document that insertions of duplicate elements will be rejected, even if the general contract of List doesn't prescribe so.
Talking about the cost of this solution, I believe that in terms of time the cost would be absolutely negligible if compared to a plain ArrayList: most operations on HashSets cost amortized O(1), namely lookup and insertion. On the other hand, your memory usage will be twice (or more, depending on the HashSet load factor).
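A stripped-down sketch of the two-collection idea follows. UniqueList is a made-up name, and for brevity it does not implement the full List interface as suggested above; it only shows the duplicate check, the shuffle and a read-only view.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

class UniqueList<T> {
    private final List<T> items = new ArrayList<>(); // keeps order, can be shuffled
    private final Set<T> seen = new HashSet<>();     // fast duplicate check

    // Returns false and changes nothing if the item is already present.
    boolean add(T item) {
        if (!seen.add(item)) {
            return false;
        }
        items.add(item);
        return true;
    }

    void shuffle(Random random) {
        Collections.shuffle(items, random);
    }

    List<T> asList() {
        return Collections.unmodifiableList(items);
    }
}
```

Both collections hold references to the same objects, so the extra memory is for the HashSet's own bookkeeping rather than for copies of the elements.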
As far as I know sets aren't ordered, so you obviously cannot shuffle items of sets. For removing duplicates from a list I found this: How do I remove repeated elements from ArrayList?.
With the least amount of code and most elegance you can do something like:
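public void testFoo() {
    Set<Integer> s = new TreeSet<Integer>();
    s.add(2);
    s.add(1);
    s.add(3);
    // Copy the set into a list we keep a reference to, then shuffle that list.
    List<Integer> shuffled = new ArrayList<Integer>(s);
    Collections.shuffle(shuffled);
}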
But this is not very efficient. You could use an array and a hash function to put the elements in the desired spot in the array, checking whether they are already there before putting them in. This will work in O(n) time, so it's very good, but it needs a little more code and some attention to the hash function.
You can use a Map to avoid duplicates and then Map.entrySet() and shuffle the ArrayList
You could actually use an "ordered set", e.g. TreeSet. In order to get a random order, don't insert the actual item but a wrapper with some random weight and use a corresponding comparator. Re-shuffling however would require to update all wrapper weights.
I have a situation where I need a data structure that I can add strings to. This data structure is very large.
The specific qualities I need it to have are:
get(index)
delete a certain number of the entries that were added first when the limit is exceeded (LIFO)
I've tried using an ArrayList, but the delete operation is O(n), and for a LinkedList the traversal or get() operation will be O(n).
What other options do I have?
A circular buffer, one that's implemented with an array under the hood.
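A minimal sketch of such an array-backed circular buffer for strings is shown below. RingBuffer and its method names are hypothetical; the design assumes a fixed capacity and that the oldest entry is overwritten once the buffer is full. get(index) is constant time, and dropping the k oldest entries costs O(k) regardless of the buffer size.

```java
class RingBuffer {
    private final String[] slots;
    private int head = 0; // array index of the oldest entry
    private int size = 0; // number of entries currently stored

    RingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        slots = new String[capacity];
    }

    // Appends at the logical end; when full, the oldest entry is overwritten.
    void add(String s) {
        if (size == slots.length) {
            slots[head] = s;
            head = (head + 1) % slots.length;
        } else {
            slots[(head + size) % slots.length] = s;
            size++;
        }
    }

    // Index 0 is the oldest entry, size() - 1 the newest.
    String get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return slots[(head + index) % slots.length];
    }

    // Drops up to n of the oldest entries without shifting the rest.
    void removeOldest(int n) {
        int count = Math.min(n, size);
        for (int i = 0; i < count; i++) {
            slots[head] = null; // release the reference for the garbage collector
            head = (head + 1) % slots.length;
        }
        size -= count;
    }

    int size() {
        return size;
    }
}
```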
LinkedHashSet might be of interest. It is effectively a HashSet but it also maintains a LinkedList to allow a predictable iteration order - and therefore can also be used as a FIFO queue, with the nice added benefit that it can't contain duplicate entries.
Because it is a HashSet too, searches (as opposed to scans) can be O(1) if they can match on equals()
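Used as a FIFO queue, a LinkedHashSet could be handled roughly like this; pollOldest is a hypothetical helper, not a JDK method. Note that this gives fast contains() and removal of the oldest entry, but no get(index).

```java
import java.util.Iterator;
import java.util.LinkedHashSet;

class FifoSet {
    // Removes and returns the element that was inserted first,
    // or null when the set is empty.
    static <T> T pollOldest(LinkedHashSet<T> set) {
        Iterator<T> it = set.iterator();
        if (!it.hasNext()) {
            return null;
        }
        T oldest = it.next(); // iteration starts at the earliest insertion
        it.remove();
        return oldest;
    }
}
```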
You can have a look at this question and this too.
I need to store a large, growing number of objects in a collection. While performing actions on each object of the collection, I regularly need to check whether an object is already stored. If an object is not stored yet I will add it to the end of the collection. I process each object iteratively while doing the checks.
Objects already processed should not be removed from the collection, because I do not want to put them back into processing when I stumble upon them again.
As a result I do not know what collection may fit best. HashSet has a constant time "contains" method but a List has faster methods to iterate over its elements, right ?
What would be the wiser choice ? Would it be relevant to keep two different structures at a time containing the same nodes, a HashSet for the checks and a LinkedList for the processing ?
How about a LinkedHashSet?
Hash table and linked list implementation of the Set interface, with predictable iteration order. This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order)
1) Use ArrayList, not LinkedList. LinkedLists consume a lot of memory, and they are slower to iterate than ArrayLists.
2) I'd suggest using two data structures, e.g. because you cannot add to a collection while iterating through it (ConcurrentModificationException).
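The two-structure approach might be sketched as follows: an ArrayList holds the processing order, a HashSet answers "already stored?", and an index-based loop lets the list grow while it is being walked. The class name and the discover function (which stands in for whatever produces new objects while processing one) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class Worklist {
    // Processes start and everything reachable through discover exactly once,
    // returning the objects in the order they were processed.
    static <T> List<T> processAll(T start, Function<T, List<T>> discover) {
        List<T> order = new ArrayList<>();
        Set<T> seen = new HashSet<>();
        order.add(start);
        seen.add(start);
        // Index-based loop: appending during a for-each loop over the list
        // would throw ConcurrentModificationException.
        for (int i = 0; i < order.size(); i++) {
            T current = order.get(i);
            for (T next : discover.apply(current)) {
                if (seen.add(next)) { // add() returns false for objects already stored
                    order.add(next);
                }
            }
        }
        return order;
    }
}
```

Processed objects stay in both collections, so stumbling upon them again never re-queues them.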
Well, it seems you are interested in two views on your collection.
A queue like view, adding things to the end and inspecting them at the front.
A contains check
All those operations are supported by different kinds of heaps, e.g. java.util.PriorityQueue, although note that PriorityQueue.contains() runs in linear time.
I have 100,000 objects in the list. I want to remove a few elements from the list based on a condition. Can anyone tell me what the best approach is in terms of memory and performance?
Same question for adding objects also based on condition.
Thanks in Advance
Raju
Your container is not just a List. List is an interface that can be implemented by, for example ArrayList and LinkedList. The performance will depend on which of these underlying classes is actually instantiated for the object you are polymorphically referring to as List.
ArrayList can access elements in the middle of the list quickly, but if you delete one of them you need to shift a whole bunch of elements. LinkedList is the opposite in this respect, requiring iteration for access, but deletion is just a matter of reassigning pointers.
Your performance depends on the implementation of List, and the best choice of implementation depends on how you will be using the List and which operations are most frequent.
If you're going to be iterating a list and applying tests to each element, then a LinkedList will be most efficient in terms of CPU time, because you don't have to shift any elements in the list. It will, however consume more memory than an ArrayList, because each list element is actually held in an entry.
However, it might not matter. 100,000 is a small number, and if you aren't removing a lot of elements the cost to shift an ArrayList will be low. And if you are removing a lot of elements, it's probably better to restructure as a copy-with filter.
However, the only real way to know is to write the code and benchmark it.
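For reference, the two approaches being compared could be written like this with plain JDK classes (the class and method names are made up). The first removes in place through the iterator, which is the safe way to delete during traversal; the second is the copy-with-filter variant that leaves the original list untouched.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

class ConditionalRemoval {
    // In-place removal; on an ArrayList every removal shifts the tail of the array.
    static <T> void removeInPlace(List<T> list, Predicate<? super T> unwanted) {
        for (Iterator<T> it = list.iterator(); it.hasNext(); ) {
            if (unwanted.test(it.next())) {
                it.remove();
            }
        }
    }

    // Copy-with-filter: one pass, no shifting, at the cost of a second list.
    static <T> List<T> copyKeeping(List<T> list, Predicate<? super T> wanted) {
        List<T> kept = new ArrayList<>();
        for (T item : list) {
            if (wanted.test(item)) {
                kept.add(item);
            }
        }
        return kept;
    }
}
```

Which one is faster for a given list size and removal ratio is exactly the kind of thing to measure rather than guess.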
Collections2.filter (from Guava) produces a filtered collection based on a predicate.
List<Number> myNumbers = Arrays.asList(Integer.valueOf(1), Double.valueOf(1e6));
Collection<Number> bigNumbers = Collections2.filter(
    myNumbers,
    new Predicate<Number>() {
        public boolean apply(Number n) {
            return n.doubleValue() >= 100d;
        }
    });
Note that some operations like size() are not efficient with this scheme. If you tend to follow Josh Bloch's advice and prefer isEmpty() and iterators to unnecessary size() checks, then this shouldn't bite you in practice.
LinkedList could be a good choice.
LinkedList adds and removes elements more efficiently than ArrayList, and there is no need to call a method such as ArrayList.trimToSize() to release unused memory. But LinkedList is a doubly linked list; each element is wrapped in an Entry, which needs extra memory.