Efficient way to get/remove first element from the list? - java

I want to retrieve and remove the first element from a List. As I see it, I have two options:
First Approach:
LinkedList<String> servers = new LinkedList<String>();
....
String firstServerName = servers.removeFirst();
Second Approach:
ArrayList<String> servers = new ArrayList<String>();
....
String firstServerName = servers.remove(0);
I have a lot of elements in my list.
Is there any reason to prefer one over the other?
And what is the difference between the two? Are they technically the same thing in terms of performance? What is the complexity involved here if we have a lot of elements?
What is the most efficient way to do this?

If the comparison for "remove first" is between the ArrayList and the LinkedList classes, the LinkedList wins clearly.
Removing the first element of a linked list costs O(1), while doing so for an array-backed list (ArrayList) costs O(n), since all remaining elements must be shifted.
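A crude way to see the difference (not a rigorous benchmark; JIT warm-up and GC can easily skew a single run, and the list size below is arbitrary):
import java.util.ArrayList;
import java.util.LinkedList;

public class RemoveFirstDemo {
    public static void main(String[] args) {
        int n = 50_000; // arbitrary size, just for illustration
        LinkedList<String> linked = new LinkedList<>();
        ArrayList<String> array = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            linked.add("server" + i);
            array.add("server" + i);
        }

        long t1 = System.nanoTime();
        while (!linked.isEmpty()) linked.removeFirst(); // O(1) per removal
        long t2 = System.nanoTime();
        while (!array.isEmpty()) array.remove(0);       // O(n) per removal: shifts the rest
        long t3 = System.nanoTime();

        System.out.println("LinkedList.removeFirst(): " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("ArrayList.remove(0):      " + (t3 - t2) / 1_000_000 + " ms");
    }
}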

You should use LinkedList.
Background:
In practical terms, LinkedList#removeFirst is more efficient since it operates on a doubly-linked list, and removing the first element basically consists of unlinking it from the head of the list and making the next node the new first one (complexity O(1)):
private E unlinkFirst(Node<E> f) {
    // assert f == first && f != null;
    final E element = f.item;
    final Node<E> next = f.next;
    f.item = null;
    f.next = null; // help GC
    first = next;
    if (next == null)
        last = null;
    else
        next.prev = null;
    size--;
    modCount++;
    return element;
}
ArrayList#remove operates on an internal array, which requires shifting all subsequent elements one position to the left by copying the subarray (complexity O(n)):
public E remove(int index) {
    rangeCheck(index);
    modCount++;
    E oldValue = elementData(index);
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
    return oldValue;
}
Additional answer:
On the other hand, a LinkedList#get operation may have to traverse up to half of the list to reach the element at a given index in the worst case, since it has to walk node by node from the nearer end. ArrayList#get directly accesses the element at the specified index, since it operates on an array.
My rule of thumb for efficiency here would be:
Use LinkedList if you do a lot of add/remove operations compared with access operations (e.g. get);
Use ArrayList if you do a lot of access operations compared with add/remove.

Make sure you understand the difference between LinkedList and ArrayList. ArrayList is implemented on top of an array.
LinkedList takes constant time to remove the first element.
ArrayList takes linear time to remove the first element, because all the remaining elements have to be shifted (you can confirm this in the implementation).
Also, I think LinkedList can be more efficient in terms of space here: because ArrayList does not (and should not) shrink its backing array every time an element is removed, it takes up more space than it actually needs.

I think that what you need is an ArrayDeque (an unfairly overlooked class in java.util). Its removeFirst method performs in O(1) as for LinkedList, while it generally shows the better space and time characteristics of ArrayList. It’s implemented as a circular queue in an array.
You should very rarely use LinkedList. I did once in my 17 years as a Java programmer and regretted it in retrospect.
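A minimal sketch of the ArrayDeque approach (the server names are made up for illustration):
import java.util.ArrayDeque;
import java.util.Deque;

public class DequeDemo {
    public static void main(String[] args) {
        Deque<String> servers = new ArrayDeque<>();
        servers.add("alpha");
        servers.add("beta");

        // removeFirst() removes and returns the head in O(1);
        // it throws NoSuchElementException if the deque is empty.
        String firstServerName = servers.removeFirst();

        // pollFirst() does the same but returns null on an empty deque.
        String next = servers.pollFirst();

        System.out.println(firstServerName + ", " + next); // alpha, beta
    }
}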

List.subList(int fromIndex, int toIndex)
Returns a view of the portion of this list between the specified fromIndex, inclusive, and toIndex, exclusive.
Good to use with ArrayList, where actually removing the first element has complexity O(n):
final String firstServerName = servers.get(0);
servers = servers.subList(1, servers.size()); // O(1): returns a view, nothing is copied or removed from the backing list

Removing the first element of an ArrayList is O(n). For a linked list it is O(1), so I would go with that.
Check the ArrayList documentation
The size, isEmpty, get, set, iterator, and listIterator operations run
in constant time. The add operation runs in amortized constant time,
that is, adding n elements requires O(n) time. All of the other
operations run in linear time (roughly speaking). The constant factor
is low compared to that for the LinkedList implementation.
This guy actually got the OpenJDK source link.

Using a linked list is by far faster here.
LinkedList
It just relinks the nodes, so the first one disappears in constant time.
ArrayList
With an ArrayList, all remaining elements have to be moved back one spot to keep the underlying array contiguous.

As others have rightly pointed out, LinkedList is faster than ArrayList for removal of the first element from anything other than a very short list.
However, to make your choice between them you need to consider the complete mix of operations. For example, if your workload does millions of indexed accesses to a hundred element list for each first element removal, ArrayList will be better overall.

Third Approach:
Use the java.util.Queue interface; LinkedList is an implementation of it.
The Queue interface exposes the E poll() method, which retrieves and removes the head of the queue.
In terms of performance, poll() is comparable to removeFirst(): under the hood it does the same head-unlinking work (the difference is that poll() returns null on an empty list, while removeFirst() throws NoSuchElementException).
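A minimal sketch of that approach (the list contents are made up for illustration):
import java.util.LinkedList;
import java.util.Queue;

public class PollDemo {
    public static void main(String[] args) {
        Queue<String> servers = new LinkedList<>();
        servers.add("alpha");
        servers.add("beta");

        // poll() removes and returns the head in O(1),
        // or returns null if the queue is empty (removeFirst() would throw instead).
        String firstServerName = servers.poll();
        System.out.println(firstServerName); // alpha
    }
}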

Related

Why is get(index) slower than remove(index) for LinkedList in Java?

I have a LinkedList with 100000 String items. When I do an index-based operation such as get, remove, or add, the mechanism seems the same:
first it walks the list to reach the node at that index, then it does the actual manipulation.
With get it only returns the node's item, whereas remove does even more work.
So why does the get operation take a lot more time than the remove operation?
for (int index = 99999; index >= 0; index--) {
    links.get(index);
}
get time in nanoseconds: 15083052805
for (int index = 99999; index >= 0; index--) {
    links.remove(index);
}
del time in nanoseconds: 2310625
LinkedList's functions:
public E get(int index) {
    checkElementIndex(index);
    return node(index).item;
}

public E remove(int index) {
    checkElementIndex(index);
    return unlink(node(index));
}
Getting the element at index N is very costly, since to actually get to the element the list nodes need to be traversed until the element is reached.
Now when you remove the elements in your loop, realize that you're always removing the last element of the list (your index corresponds exactly to list.size()-1 at all times in that loop).
The node(index) method actually optimizes the access by searching from the front or the back, depending on whether index is closer to 0 or to list.size(). In your remove case it always searches from the back and hits on the first try.
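For reference, a sketch of what that node(int) helper looks like (simplified from the OpenJDK source; the field names match the unlinkFirst listing quoted earlier):
// Walks from whichever end of the list is closer to the requested index.
Node<E> node(int index) {
    if (index < (size >> 1)) {              // index in the first half: start at the head
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {                                // index in the second half: start at the tail
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}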
Lesson to be learned: LinkedList is not a suitable list type to access by random index.
It also seems that you are not reading System.currentTimeMillis() (getting the current time) right before entering the first loop.

Best way to remove one ArrayList's elements from another ArrayList

What is the best-performing way in Java (7, 8) to remove the integer elements of one ArrayList from another? All the elements are unique in both the first and the second list.
At the moment I know the API method removeAll and use it this way:
tempList.removeAll(tempList2);
The problem appears when I operate on ArrayLists with more than 10000 elements. For example, when I remove 65000 elements, the delay is about 2 seconds. But I need to operate on even larger lists with more than 1000000 elements.
What is the strategy for this issue?
Maybe something with new Stream API should solve it?
tl;dr:
Keep it simple. Use
list.removeAll(new HashSet<T>(listOfElementsToRemove));
instead.
As Eran already mentioned in his answer: The low performance stems from the fact that the pseudocode of a generic removeAll implementation is
public boolean removeAll(Collection<?> c) {
    // Pseudocode: a real implementation walks an Iterator and calls
    // iterator.remove() instead of this.remove(e).
    for (each element e of this) {
        if (c.contains(e)) {
            this.remove(e);
        }
    }
}
So the contains call that is done on the list of elements to remove will cause the O(n*k) performance (where n is the number of elements to remove, and k is the number of elements in the list that the method is called on).
Naively, one could imagine that the this.remove(e) call on a List might also have O(k), and this implementation would also have quadratic complexity. But this is not the case: You mentioned that the lists are specifically ArrayList instances. And the ArrayList#removeAll method is implemented to delegate to a method called batchRemove that directly operates on the underlying array, and does not remove the elements individually.
So all you have to do is to make sure that the lookup in the collection that contains the elements to remove is fast - preferably O(1). This can be achieved by putting these elements into a Set. In the end, it can just be written as
list.removeAll(new HashSet<T>(listOfElementsToRemove));
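A small self-contained example of that call (the list contents are made up for illustration):
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class RemoveAllDemo {
    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        List<Integer> toRemove = Arrays.asList(2, 4, 6);

        // Wrapping the removal list in a HashSet makes each contains() check O(1) on average,
        // so the whole removeAll is roughly linear in the size of 'list'.
        list.removeAll(new HashSet<>(toRemove));

        System.out.println(list); // [1, 3, 5]
    }
}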
Side notes:
The answer by Eran has IMHO two major drawbacks: first of all, it requires sorting the lists, which is O(n*log n) and simply not necessary. But more importantly (and obviously): sorting will likely change the order of the elements! What if that is simply not desired?
Remotely related: there are some other subtleties involved in removeAll implementations. For example, the HashSet removeAll method is surprisingly slow in some cases. Although this also boils down to O(n*n) when the elements to be removed are stored in a list, the exact behavior may indeed be surprising in this particular case.
Well, since removeAll checks for each element of tempList whether it appears in tempList2, the running time is proportional to the size of the first list multiplied by the size of the second list, which means O(N^2) unless one of the two lists is very small and can be considered as "constant size".
If, on the other hand, you pre-sort the lists, and then iterate over both lists with a single iteration (similar to the merge step in merge sort), the sorting will take O(NlogN) and the iteration O(N), giving you a total running time of O(NlogN). Here N is the size of the larger of the two lists.
If you can replace the lists by a sorted structure (perhaps a TreeSet, since you said the elements are unique), you can implement removeAll in linear time, since you won't have to do any sorting.
I haven't tested it, but something like this can work (assuming both tempList and tempList2 are sorted):
Iterator<Integer> iter1 = tempList.iterator();
Iterator<Integer> iter2 = tempList2.iterator();

Integer current = null;
Integer current2 = null;
boolean advance = true;
while (iter1.hasNext() && iter2.hasNext()) {
    if (advance) {
        current = iter1.next();
        advance = false;
    }
    if (current2 == null || current > current2) {
        current2 = iter2.next();
    }
    if (current <= current2) {
        advance = true;
        if (current.equals(current2)) { // use equals(): == would compare Integer references
            iter1.remove();
        }
    }
}
I suspect removing from an ArrayList is a performance hit because the list must be compacted (the remaining elements shifted) every time an element is removed. It may be faster to do this:
Create a Set of the elements to be removed;
Create a new result ArrayList, call it R; you can give it enough capacity at construction;
Iterate through the original list and add each element to R only if it is not found in the Set.
This should be O(N), assuming that building the Set and each lookup in it are constant time. A sketch follows.
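A minimal sketch of that approach, assuming Integer elements as in the question (the list contents are made up):
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FilterDemo {
    public static void main(String[] args) {
        List<Integer> tempList = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        List<Integer> tempList2 = Arrays.asList(2, 4, 6);

        Set<Integer> toRemove = new HashSet<>(tempList2);    // O(1) average lookups
        List<Integer> r = new ArrayList<>(tempList.size());  // result list R
        for (Integer e : tempList) {
            if (!toRemove.contains(e)) {                      // keep only the survivors
                r.add(e);
            }
        }
        System.out.println(r); // [1, 3, 5]
    }
}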

Extract first k elements from a Set efficiently

Problem
I'm writing a simple Java program in which I have a TreeSet containing Comparable elements (a class I've written myself). At a certain point I need to take only the first k elements from it.
What I've done
Currently I've found two different solutions to my problem:
Use a simple method I wrote myself that copies the first k elements from the initial TreeSet;
Use Google Guava's greatestOf method.
For the second option you need to call the method this way:
Ordering.natural().greatestOf(mySet, 80)
But I think this kind of invocation is wasteful, because the elements are already sorted. Am I wrong?
Question
What is a correct and, at the same time, efficient way to obtain a Collection containing the first k elements of a TreeSet?
Additional information
Java version: >= 7
You could use Guava's Iterables#limit:
ImmutableList.copyOf(Iterables.limit(yourSet, 7))
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/Iterables.html#limit(java.lang.Iterable, int)
I would suggest using a TreeSet<YourComparableClass>; it seems to be the solution you are looking for.
A TreeSet can give you an iterator, and you can simply iterate k times, storing the objects the iterator returns: the elements come back in order.
Moreover, a TreeSet keeps your elements sorted at all times: whenever you add or remove elements, the structure remains ordered.
Here is a possible example:
public static ArrayList<YourComparableClass> getFirstK(TreeSet<YourComparableClass> set, int k) {
    Iterator<YourComparableClass> iterator = set.iterator();
    ArrayList<YourComparableClass> result = new ArrayList<>(k); // to store the first k items
    for (int i = 0; i < k; i++) {
        result.add(iterator.next()); // the iterator returns items in ascending order
    }
    // You should also check iterator.hasNext() if you are not sure that k <= set.size().
    return result;
}
The descendingIterator() method of java.util.TreeSet yields elements from greatest to least, so you can just step it however many times, inserting the elements into a collection. The running time is O(log n + k) where k is the number of elements returned, which is surely fast enough.
If you're using a HashSet, on the other hand, then the elements in fact are not sorted, so you need to use the linear-time selection method that you indicated.
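A minimal sketch of the descendingIterator approach described above, using a TreeSet<String> just for illustration:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

public class DescendingDemo {
    public static void main(String[] args) {
        TreeSet<String> set = new TreeSet<>(Arrays.asList("apple", "banana", "cherry", "date"));
        int k = 2;

        // Step the descending iterator k times to collect the k greatest elements
        // (use set.iterator() instead if you want the k smallest).
        List<String> firstK = new ArrayList<>(k);
        Iterator<String> it = set.descendingIterator();
        for (int i = 0; i < k && it.hasNext(); i++) {
            firstK.add(it.next());
        }
        System.out.println(firstK); // [date, cherry]
    }
}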

How to merge two big Lists in one sorted List in Java?

I had an interview today, and they gave me:
List A has:
f
google
gfk
fat
...
List B has:
hgt
google
koko
fat
ffta
...
They asked me to merge these two lists into one sorted List C.
What I said:
I added List B to List A, then created a Set from List A, then created a List from the Set. The interviewer told me the lists are big and this method would not perform well; he said it would be n log(n).
What would be a better approach to this problem?
Well, your method requires roughly 3N additional space (the concatenated List, the Set and the result List), which is its main inefficiency.
I would sort ListA and ListB with whatever sorting algorithm you choose (QuickSort is in-place, requiring O(1) extra space; Java's default object sort is a MergeSort variant, which typically requires O(N) additional space), then use a MergeSort-like merge step: compare the "current" index of ListA with the current index of ListB, append whichever element should come first to ListC, and advance that list's "current" index. It is still N log N, but you avoid multiple rounds of converting from collection to collection; this strategy uses only O(N) additional space (for ListC; along the way you'll need about N/2 extra space if you MergeSort the source lists).
IMO the lower bound for an algorithm to do what the interviewer wanted would be O(NlogN). While the best solution would have less additional space and be more efficient within that growth model, you simply can't sort two unsorted lists of strings in less than NlogN time.
EDIT: Java's not my forte (I'm a SeeSharper by trade), but the code would probably look something like:
Collections.sort(listA);
Collections.sort(listB);

ListIterator<String> aIter = listA.listIterator();
ListIterator<String> bIter = listB.listIterator();
List<String> listC = new ArrayList<String>();

while (aIter.hasNext() || bIter.hasNext()) {
    if (!bIter.hasNext()) {
        listC.add(aIter.next());
    } else if (!aIter.hasNext()) {
        listC.add(bIter.next());
    } else {
        // Kinda smells from a C# background to mix the List and its Iterator,
        // but peeking ahead like this avoids "backtracking" the iterators when their value isn't selected.
        String a = listA.get(aIter.nextIndex());
        String b = listB.get(bIter.nextIndex());
        if (a.equals(b)) {               // use equals(), not ==, to compare Strings
            listC.add(aIter.next());
            listC.add(bIter.next());
        } else if (a.compareTo(b) < 0) {
            listC.add(aIter.next());
        } else {
            listC.add(bIter.next());
        }
    }
}

What is a data structure that has O(1) for append, prepend, and retrieve element at any location?

I'm looking for Java solution but any general answer is also OK.
Vector/ArrayList is O(1) for append and retrieve, but O(n) for prepend.
LinkedList (in Java implemented as doubly-linked-list) is O(1) for append and prepend, but O(n) for retrieval.
Deque (ArrayDeque) is O(1) for everything above, but cannot retrieve an element at an arbitrary index.
In my mind, a data structure that satisfies the requirements above would hold two growable lists internally (one for prepends and one for appends) and store an offset to determine where to get the element from during retrieval.
You're looking for a double-ended queue. This is implemented the way you want in the C++ STL (std::deque), where you can index into it, but not in Java, as you noted. You could conceivably roll your own from standard components by using two arrays and storing where "zero" is. This could be wasteful of memory if you end up moving a long way from zero, but if you get too far you can rebase and let the deque crawl into a new array.
A more elegant solution that doesn't really require so much fanciness in managing two arrays is to impose a circular array onto a pre-allocated array. This would require implementing push_front, push_back, and the resizing of the array behind it, but the conditions for resizing and such would be much cleaner.
A deque (double-ended queue) may be implemented to provide all these operations in O(1) time, although not all implementations do. I've never used Java's ArrayDeque, so I thought you were joking about it not supporting random access, but you're absolutely right — as a "pure" deque, it only allows for easy access at the ends. I can see why, but that sure is annoying...
To me, the ideal way to implement an exceedingly fast deque is to use a circular buffer, especially since you are only interested in adding removing at the front and back. I'm not immediately aware of one in Java, but I've written one in Objective-C as part of an open-source framework. You're welcome to use the code, either as-is or as a pattern for implementing your own.
Here is a WebSVN portal to the code and the related documentation. The real meat is in the CHAbstractCircularBufferCollection.m file — look for the appendObject: and prependObject: methods. There is even a custom enumerator ("iterator" in Java) defined as well. The essential circular buffer logic is fairly trivial, and is captured in these 3 centralized #define macros:
#define transformIndex(index) ((headIndex + index) % arrayCapacity)
#define incrementIndex(index) (index = (index + 1) % arrayCapacity)
#define decrementIndex(index) (index = ((index) ? index : arrayCapacity) - 1)
As you can see in the objectAtIndex: method, all you do to access the Nth element in a deque is array[transformIndex(N)]. Note that I make tailIndex always point to one slot beyond the last stored element, so if headIndex == tailIndex, the array is full, or empty if the size is 0.
Hope that helps. My apologies for posting non-Java code, but the question author did say general answers were acceptable.
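For readers who want the same idea in Java, here is a minimal sketch of a fixed-capacity circular buffer (the class and method names are made up; growth/resizing and bounds checks are omitted for brevity):
// Illustrative only: a fixed-capacity circular buffer with O(1) append, prepend and get.
public class CircularBufferDemo<E> {
    private final Object[] elements;
    private int head = 0;   // index of the first (logical) element
    private int size = 0;

    public CircularBufferDemo(int capacity) {
        elements = new Object[capacity];
    }

    public void append(E e) {                       // push_back
        elements[(head + size) % elements.length] = e;
        size++;
    }

    public void prepend(E e) {                      // push_front
        head = (head - 1 + elements.length) % elements.length;
        elements[head] = e;
        size++;
    }

    @SuppressWarnings("unchecked")
    public E get(int index) {                       // the array[transformIndex(N)] idea
        return (E) elements[(head + index) % elements.length];
    }
}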
If you treat append to a Vector/ArrayList as O(1) - which it really isn't, but might be close enough in practice -
(EDIT - to clarify - append may be amortized constant time, that is - on average, the addition would be O(1), but might be quite a bit worse on spikes. Depending on context and the exact constants involved, this behavior can be deadly).
(This isn't Java, but some made-up language...).
One vector that will be called "Forward".
A second vector that will be called "Backwards".
When asked to append -
Forward.Append().
When asked to prepend -
Backwards.Append().
When asked to query -
if (Index < Backwards.Size())
{
    return Backwards[Backwards.Size() - Index - 1]
}
else
{
    return Forward[Index - Backwards.Size()]
}
(and also check for the index being out of bounds).
Your idea might work. If those are the only operations you need to support, then two Vectors are all you need (call them Head and Tail). To prepend, you append to head, and to append, you append to tail. To access an element, if the index is less than head.Length, then return head[head.Length-1-index], otherwise return tail[index-head.Length]. All of these operations are clearly O(1).
Here is a data structure that supports O(1) append, prepend, first, last and size. We can easily add other methods from AbstractList<A> such as delete and update
import java.util.ArrayList;

public class FastAppendArrayList<A> {
    private ArrayList<A> appends = new ArrayList<A>();
    private ArrayList<A> prepends = new ArrayList<A>();

    public void append(A element) {
        appends.add(element);
    }

    public void prepend(A element) {
        prepends.add(element);
    }

    public A get(int index) {
        // The first prepends.size() logical indexes map to 'prepends' in reverse order,
        // the remaining ones map to 'appends' in insertion order.
        return index < prepends.size()
                ? prepends.get(prepends.size() - 1 - index)
                : appends.get(index - prepends.size());
    }

    public int size() {
        return prepends.size() + appends.size();
    }

    public A first() {
        return prepends.isEmpty() ? appends.get(0) : prepends.get(prepends.size() - 1);
    }

    public A last() {
        return appends.isEmpty() ? prepends.get(0) : appends.get(appends.size() - 1);
    }
}
What you want is a double-ended queue (deque) like the STL has, since Java's ArrayDeque lacks get() for some reason. There were some good suggestions and links to implementations here:
Java equivalent of std::deque?
