I am confused about the search complexity of LinkedList in Java. I have read that the time complexity to search for an element in a LinkedList is O(n).
Say, for example:
LinkedList<String> link=new LinkedList<String>();
link.add("A");
link.add("B");
link.add("C");
System.out.println(link.get(1));
Now, judging from the get(index) method here, it looks like searching for an element should take O(1) time. But I have read that it takes O(n).
Can anybody help me get a clear understanding of this?
Access in a linked list implementation, like java.util.LinkedList, is O(n). To get an element from the list, there is a loop that follows links from one element to the next. In the worst case, in a list of n elements, n iterations of the loop are executed.
Contrast that with an array-based list, like java.util.ArrayList. Given an index, one random-access operation is performed to retrieve the data. That's O(1).
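For illustration, here is a minimal sketch of the same get(index) call on both implementations (the class name GetCostDemo and the sample data are just placeholders):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

public class GetCostDemo {
    public static void main(String[] args) {
        List<String> linked = new LinkedList<>(Arrays.asList("A", "B", "C"));
        List<String> array = new ArrayList<>(Arrays.asList("A", "B", "C"));

        // LinkedList.get(i): walks the chain node by node from the nearer end -> O(n)
        System.out.println(linked.get(1));
        // ArrayList.get(i): one index calculation into the backing array -> O(1)
        System.out.println(array.get(1));
    }
}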
A linked list is, as the name suggests, a list of items that are linked together, for example by pointers. To search a linked list, you iterate over each item in the list. The most time this can take is T(n), where n is the length of your list.
Big-O notation describes an upper bound, i.e. the worst-case scenario.
If you are searching for the first item in the list, the search finishes almost immediately, in T(1). If you search for the n-th item, it takes T(n) time, which keeps the search within O(n).
A visual representation of a linked list:
[1] -> [2] -> [3] -> [4] -> ... [n-1] -> [n]
An example of what a get() method might look like (assuming a singly linked Node class with a getNext() accessor):
Node get(int i)
{
    Node current = head;              // start at the first node
    while (i > 0)
    {
        current = current.getNext();  // follow the link to the next node
        i--;
    }
    return current;                   // after i hops, current is the i-th node
}
As you see, it iterates over each node within the list.
Big-O ("Ordo") notation describes an upper bound. The lookup is O(n) because, if the list has n entries and you want the last one, the search has to walk through all n items.
On the other hand, lookup for an ArrayList is O(1), because the lookup time is close to constant, regardless of the size of the list and the index you are looking for.
Related
Which would be faster when using a LinkedList: inserting each element directly at its sorted position, or adding them all and sorting afterwards? I haven't studied sorting and searching yet. I was thinking adding them directly in a sorted manner would be faster, as manipulating the nodes and pointers afterwards intuitively seems to be very expensive. Thanks.
In general, keeping a linked list sorted involves several steps and can be expensive.
i) In the usual case, if you want to add a value to an already sorted linked list, you go through the following steps (a sketch in Java follows this list):
1. If the linked list is empty, make the new node the head and return it.
2. If the value of the node to be inserted is smaller than the value of the head node, insert the node at the start and make it the new head.
3. Otherwise, in a loop, find the appropriate node after which the input node is to be inserted: start from the head and keep moving until you reach a node GN whose value is greater than the input node's value. The node just before GN is the appropriate node.
4. Insert the node after the appropriate node found in step 3.
The time complexity in this case is O(n) for each element.
But don't forget that the linked list must already be sorted before you add further elements in this manner; if it is not, you pay the extra cost of sorting it first.
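Here is a minimal sketch of that insertion, assuming a hand-rolled singly linked Node class (not java.util.LinkedList); the names are placeholders:

class Node {
    int value;
    Node next;
    Node(int value) { this.value = value; }
}

// Insert newNode into the sorted list starting at head; returns the (possibly new) head.
static Node sortedInsert(Node head, Node newNode) {
    if (head == null || newNode.value < head.value) {
        newNode.next = head;          // steps 1 and 2: the new node becomes the head
        return newNode;
    }
    Node current = head;
    while (current.next != null && current.next.value <= newNode.value) {
        current = current.next;       // step 3: walk until the next node's value is greater
    }
    newNode.next = current.next;      // step 4: splice the new node in after `current`
    current.next = newNode;
    return head;
}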
ii) But if you want to add the elements to the end of the linked list and then sort them, the situation depends on the sorting algorithm. For example, you can use merge sort, which has O(n log n) time complexity, or even insertion sort, with O(n^2) time complexity.
Note: how to sort is a separate issue.
As you have already noticed, it is not easy to say which method is faster; it depends on the conditions of the problem, such as the number of elements in the linked list, the number of elements to be added, and whether the cost of the initial sorting is taken into account.
You have presented two options:
Keep the linked list sorted by inserting each new node at its sorted location.
Insert each new node at the end (or start) of the linked list and, when all nodes have been added, sort the linked list.
The worst-case time complexity for the first option occurs when the list has to be traversed completely each time to find the position where a new node is to be inserted. In that case the time complexity is O(1+2+3+...+n) = O(n²).
The worst-case time complexity for the second option is O(n) for inserting n elements and then O(n log n) for sorting the linked list with a good sorting algorithm. So in total the sorting algorithm determines the overall worst-case time complexity, i.e. O(n log n).
So the second option has the better time complexity.
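For example, a rough sketch of the second option with java.util.LinkedList (the input collection values and the method name are assumptions):

import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

static LinkedList<Integer> buildSortedList(List<Integer> values) {
    LinkedList<Integer> list = new LinkedList<>();
    for (int value : values) {
        list.add(value);          // appending is O(1): LinkedList keeps a tail pointer
    }
    Collections.sort(list);       // dumps to an array, sorts it in O(n log n), copies back
    return list;
}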
Merge sort has a complexity of O(n log n) and is well suited to linked lists.
If your data has a limited range, you can use radix sort and achieve O(kn) complexity, where k is log(range size).
In practice it is better to insert elements in a vector (dynamic array), then sort the array, and finally turn the array into a list. This will almost certainly give better running times.
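A rough sketch of that approach, with the same assumed input collection values as above:

import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;

static LinkedList<Integer> viaArrayBuffer(List<Integer> values) {
    List<Integer> buffer = new ArrayList<>(values);   // contiguous storage: cheap to fill, cache-friendly to sort
    Collections.sort(buffer);                         // O(n log n) on the array-backed list
    return new LinkedList<>(buffer);                  // finally turn the sorted data into a linked list
}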
1st Part :-
I was reading in the book "Data Structures and Algorithms Made Easy in Java" that the time complexity for deleting the last element from a LinkedList and an ArrayList is O(n). But LinkedList is internally implemented as a doubly linked list, so the time complexity should be O(1), and similarly for ArrayList, as it is internally backed by an array, it should be O(1).
2nd Part :-
It also says that insertion of an element at the end of a LinkedList has a time complexity of O(n), but the LinkedList maintains pointers both to the end and to the front. So is this statement correct? Moreover, it says that the time complexity to insert an element at the end of an ArrayList is O(1) if the array is not full and O(n) if the array is full. Why O(n) if the array is full?
Thanks for answering the 1st part. Can anyone please also explain the 2nd part? Thanks :)
It depends on what methods you're calling.
A glance at the implementation shows that if you're calling LinkedList.removeLast(), that's O(1). The LinkedList maintains pointers both to the first and last node in the list. So it doesn't have to traverse the list to get to the last node.
Calling LinkedList.remove(index) with the index of the last element is also O(1), because it traverses the list from the closest end. [Noted by user #andreas in comment below.]
But if you're calling LinkedList.remove(Object), then there's an O(n) search for the first matching node.
Similarly, for ArrayList, if you're calling ArrayList.remove(index) with the index of the last element, then that's O(1). For all other indices, there's a System.arraycopy() call that can be O(n) -- but that's skipped entirely for the last element.
But if you call ArrayList.remove(Object), then again there's an O(n) search for the first matching node.
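A small demo of the calls discussed above (class name and sample data are placeholders):

import java.util.Arrays;
import java.util.LinkedList;

public class RemoveCostDemo {
    public static void main(String[] args) {
        LinkedList<String> list = new LinkedList<>(Arrays.asList("A", "B", "C", "D"));
        list.removeLast();              // O(1): follows the tail pointer
        list.remove(list.size() - 1);   // O(1): traversal starts from the nearer (tail) end
        list.remove("A");               // O(n): linear scan for the first matching element
        System.out.println(list);       // prints [B]
    }
}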
I was curious regarding a specific issue regarding unsorted linked lists. Let's say we have an unsorted linked list based on an array implementation. Would it be important or advantageous to maintain the current order of elements when removing an element from the center of the list? That hole would have to be filled, so let's say we take the last element in the list and insert it into that hole. Is the time complexity of shifting all elements over greater than moving that single element?
You can remove an item from a linked list without leaving a hole.
A linked list is not represented as an array of contiguous elements. Instead, it's a chain of elements with links. You can remove an element merely by linking its adjacent elements to each other, in a constant-time operation.
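A minimal sketch of that constant-time unlinking, assuming a hand-rolled doubly linked node class and that we already hold a reference to an interior node (the head and tail cases would need the usual null checks):

class DNode {
    int value;
    DNode prev, next;
}

// Unlink `node` from its list in O(1): no hole is left and nothing has to shift.
static void unlink(DNode node) {
    node.prev.next = node.next;   // the previous node now skips over `node`
    node.next.prev = node.prev;   // and the following node points back past it
}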
Now, if you had an array-based list, you could choose to implement deletion of an element by shifting the last element into position. This would give you O(1) deletion instead of O(n) deletion. However, you would want to document this behavior.
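A sketch of that idea for an array-backed list (the helper name is hypothetical); note that element order is not preserved:

import java.util.List;

static <T> void removeUnordered(List<T> list, int index) {
    int last = list.size() - 1;
    list.set(index, list.get(last));  // move the last element into the "hole"
    list.remove(last);                // removing the last element of an ArrayList is O(1)
}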
Is the time complexity of shifting all elements over greater than moving that single element?
Yes, for an array-based list. Shifting all the subsequent elements is O(n), and moving a single element is O(1).
java.util.List
If your list were an implementation of java.util.List, note that Java Lists are defined to be ordered collections, and the List.remove(int index) method is defined to shift the remaining elements.
Yes, with an array implementation it has a larger time complexity: up to n/2 moves (if the element was in the middle of the array) to shift all entries over, whereas moving one element takes constant time.
Since you are using an array, the answer is yes, because you have to make multiple assignments.
If you had used nodes instead, it would be better in terms of complexity.
I need to compare about 60,000 elements with a list of 935,000 elements, and if they match I need to perform a calculation.
I have already implemented everything needed, but the process takes about 40 minutes. There is a unique 7-digit number in both lists. The 935,000-element and the 60,000-element files are unsorted. Is it more efficient to sort the big list (and with which sort?) before I try to find the elements? Keep in mind that I have to do this calculation only once a month, so I don't need to repeat the process every day.
Basically which is faster:
unsorted linear search
sort list first and then search with another algorithm
Try it out.
You've got Collections.sort() which will do the heavy lifting for you, and Collections.binarySearch() which will allow you to find the elements in the sorted list.
When you search the unsorted list, you have to look through half the elements on average before you find the one you're looking for. When you do that 60,000 times on a list of 935,000 elements, that works out to about
935,000 * 1/2 * 60,000 = 28,050,000,000 operations
If you sort the list first (using mergesort) it will take about n * log(n) operations. Then you can use binary search to find elements in log(n) lookups for each of the 60,000 elements in your shorter list. That's about
935,000 * log(935,000) + log(935,000) * 60,000 = 19,735,434 operations
It should be a lot faster if you sort the list first, then use a search algorithm that takes advantage of the sorted list.
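A rough sketch, assuming both lists hold the 7-digit ids as Long values (bigList with ~935,000 entries, smallList with ~60,000; the names are placeholders):

import java.util.Collections;
import java.util.List;

static void processMatches(List<Long> bigList, List<Long> smallList) {
    Collections.sort(bigList);                            // done once, O(n log n)
    for (Long id : smallList) {                           // ~60,000 lookups
        int pos = Collections.binarySearch(bigList, id);  // O(log n) per lookup
        if (pos >= 0) {
            // match found -> perform the monthly calculation for this id
        }
    }
}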
What would work quite well is to sort both lists and then iterate over both at the same time.
Use collections.sort() to sort the lists.
You keep an index into each sorted list and basically walk straight through both. You start with the first element of the short list and compare it to the first element of the long list. If you reach an element in the long list with a higher 7-digit number than the current number in the short list, you increment the index into the short list. This way there is no need to check elements twice.
But actually, since you want to find the intersection of two lists, you might be better off just using longList.retainAll(shortList) to get the intersection directly. Note that on plain lists this performs a contains() scan of shortList for every element, so pass the short list as a HashSet to keep each membership check close to constant time (see the sketch below). Then you can perform whatever you want on both lists without having to search for anything yourself.
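A sketch of that, with the HashSet wrapper applied (variable and method names are assumptions):

import java.util.HashSet;
import java.util.List;

static void intersectInPlace(List<Long> longList, List<Long> shortList) {
    // Wrapping the short list in a HashSet makes each contains() check O(1) on average,
    // so the whole intersection costs roughly O(n) instead of O(n * m).
    longList.retainAll(new HashSet<>(shortList));
}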
You can sort both lists and compare them element by element, incrementing the first or second index (i and j in the example below) as needed:
List<Comparable> first = ....
List<Comparable> second = ...
Collections.sort(first);
Collections.sort(second);
int i = 0;
int j = 0;
while (i < first.size() && j < second.size()) {
    int cmp = first.get(i).compareTo(second.get(j));
    if (cmp == 0) {
        // Action for equal elements
    }
    if (cmp > 0) {
        j++;   // element in the second list is smaller: advance j
    } else {
        i++;   // element in the first list is smaller (or equal): advance i
    }
}
The complexity of this code is O(n log n), where n is the size of the larger list.
First of all, my basic understanding of a singly linked list has been that every node only points to the next subsequent node, so my problem might stem from the fact that my definition of such list is incorrect.
Given that setup, getting to node n requires iterating through the n-1 nodes that come before it, so search and access are O(n). Now, apparently node insertion and deletion take O(1), but unless they are talking about inserting at the head, in reality it is O(n) + O(1) to insert an item between nodes n and n+1.
Now, indexing the list would also have O(n) complexity, yet apparently building such indexes is frowned upon, and I cannot understand why. Couldn't we build an index for a singly linked list, which would allow us true O(1) insertion and deletion without having to perform an O(n) iteration over the list to get to a specific node? It wouldn't even need to index every node: we could have it point to subindexes, i.e. for a list of 1000 items, the first index would point to 10 different indexes for items 1-100, 101-200, etc., and those indexes would point to smaller indexes that go by 10. This way, getting to node 543 could potentially take only 3 + (index traversal) iterations, instead of 543 as it would in a typical singly linked list.
I guess, what I am asking is why such indexing should typically be avoided?
You are describing a skip-list.
A skip list has search, insert and delete time complexity of O(log N), because for these "smaller subindexes" you describe, you end up with a logarithmic number of levels (what happens if your list has 100 elements? How many of these levels do you need? And how many for 1,000,000 elements? And for 10^30?).
Note that a skip list is usually kept sorted, but you can maintain it unsorted (which is, in effect, sorted by index) if you wish.
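For what it's worth, the JDK already ships a skip-list-based sorted set and map (ConcurrentSkipListSet / ConcurrentSkipListMap), so you can try the idea without building the index structure yourself; a tiny demo with placeholder values:

import java.util.concurrent.ConcurrentSkipListSet;

public class SkipListDemo {
    public static void main(String[] args) {
        ConcurrentSkipListSet<Integer> set = new ConcurrentSkipListSet<>();
        set.add(42);
        set.add(543);
        System.out.println(set.contains(543)); // true, found in expected O(log n) time
    }
}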
With a singly-linked list, even if you already have a direct reference to the node to delete, the complexity is not O(1). This is because you have to update the prior node's next-node reference, which requires you to iterate through the list -- resulting in O(N) complexity. To get O(1) complexity for deletion, you'd need a doubly-linked list.
There is already a collections class that combines a HashMap with a doubly-linked list: the LinkedHashMap.
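A tiny demo of that combination (class name and sample entries are placeholders):

import java.util.LinkedHashMap;
import java.util.Map;

public class LinkedHashMapDemo {
    public static void main(String[] args) {
        Map<Integer, String> map = new LinkedHashMap<>();
        map.put(1, "A");
        map.put(2, "B");
        map.put(3, "C");
        map.remove(2);                     // expected O(1): hash lookup, then unlink a doubly linked node
        System.out.println(map.keySet());  // prints [1, 3]: insertion order of the remaining keys is kept
    }
}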