I wrote code for an enqueue method in a singly linked list and I'm wondering if anyone can tell me what the Big O is for this code. At first I assumed it was O(n) because of the loop. However, the loop will always iterate a specific number of times depending on how many items are in the list. This makes me believe it's actually O(1). Am I wrong?
public Node<T> enqueue(T data) {
    Node<T> toQueue = new Node<>(data);
    if (this.head == null) {
        this.head = toQueue;
        return toQueue;
    }
    Node<T> lastNode = this.head;
    while (lastNode.next != null) {
        lastNode = lastNode.next;
    }
    lastNode.next = toQueue;
    return toQueue;
}
Let's start from the following excerpt from the question:
However, the loop will always iterate a specific number of times depending on how many items are in the list.
This is a correct statement.
Please note the dependency on the input size:
iterate a specific number of times depending on how many items are in the list
Therefore, the algorithm has linear time complexity: O(n).
Linear time complexity
A slightly reformatted excerpt from the article Time complexity: Linear time - Wikipedia:
An algorithm is said to take linear time, or O(n) time, if its time complexity is O(n). Informally, this means that the running time increases at most linearly with the size of the input. More precisely, this means that there is a constant c such that the running time is at most cn for every input of size n. For example, a procedure that adds up all elements of a list requires time proportional to the length of the list, if the adding time is constant, or, at least, bounded by a constant.
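To make that last example concrete, here is a minimal sketch (my code, not from the article) of such a summing procedure; it performs one constant-time addition per element, so a list of size n costs O(n) in total:

static int sum(java.util.List<Integer> values) {
    int total = 0;
    for (int v : values) { // exactly one constant-time step per element: O(n) overall
        total += v;
    }
    return total;
}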
A slightly reformatted excerpt from the article Big O notation: Orders of common functions - Wikipedia; let's refer to the corresponding row:
Notation: O(n)
Name: linear
Example: Finding an item in an unsorted list or in an unsorted array; adding two n-bit integers by ripple carry
However, the loop will always iterate a specific number of times depending on how many items are in the list. This makes me believe it's actually O(1).
Am I wrong?
Your reasoning is wrong.
A complexity analysis of an algorithm needs to take account of all possible inputs.
While the number of elements in a single given list can be assumed to be constant while you are looping, the number of elements in any list is not a constant. If you consider all possible lists that could be inputs to the algorithm, the length of the list is a variable whose value can get arbitrarily large [1].
If we call that variable N then it is clear that the complexity class for your algorithm is O(N). (I won't go into the details because I think you already understand them.)
The only way that your reasoning could be semi-correct would be if you could categorically state that the input list length was less than some constant L. The complexity class then collapses to O(1). However, even this reasoning is dubious [2], since the algorithm as written does not check that constraint. It has no control over the list length!
On the other hand, if you rewrote the algorithm as this:
public static final int L = 42;

public Node<T> enqueue(T data) {
    Node<T> toQueue = new Node<>(data);
    if (this.head == null) {
        this.head = toQueue;
        return toQueue;
    }
    Node<T> lastNode = this.head;
    int count = 0;
    while (lastNode.next != null) {
        lastNode = lastNode.next;
        if (count++ > L) {
            throw new IllegalArgumentException("list too long");
        }
    }
    lastNode.next = toQueue;
    return toQueue;
}
then we can legitimately say that the method is O(1). It will either give a result or throw an exception within a constant time.
[1] I am ignoring the fact that there are practical limits on how long a simple linked list like this can be in a Java program. If the list is too large, it won't fit in the heap. And there are limits on how large you could make the heap.
[2] A more mathematically sound way to describe the scenario is that your algorithm is O(N), but your use of the algorithm is O(1) because the calling code (not shown) enforces a bound on the list length.
I am trying to understand the time complexity while using backtracking. The problem is:
Given a set of unique integers, return all possible subsets.
E.g. input [1,2,3] would return [[],[1],[2],[1,2],[3],[1,3],[2,3],[1,2,3]].
I am solving it using backtracking, like this:
private List<List<Integer>> result = new ArrayList<>();

public List<List<Integer>> getSubsets(int[] nums) {
    for (int length = 1; length <= nums.length; length++) { // O(n)
        backtrack(nums, 0, new ArrayList<>(), length);
    }
    result.add(new ArrayList<>());
    return result;
}

private void backtrack(int[] nums, int index, List<Integer> listSoFar, int length) {
    if (length == 0) {
        result.add(listSoFar);
        return;
    }
    for (int i = index; i < nums.length; i++) { // O(n)
        List<Integer> temp = new ArrayList<>();
        temp.addAll(listSoFar); // O(2^n)
        temp.add(nums[i]);
        backtrack(nums, i + 1, temp, length - 1);
    }
}
The code works fine, but I am having trouble understanding the time/space complexity.
What I am thinking is: the recursive method is called n times, and each call may generate up to 2^n sublists. So both time and space will be O(n * 2^n). Is that right? If not, can anyone elaborate?
Note that I saw some answers here, like this one, but I was unable to understand them. When recursion comes into the picture, I find it a bit hard to wrap my head around.
You're exactly right about the space complexity. The total space of the final output is O(n*2^n), and this dominates the total space used by the program. The analysis of the time complexity is slightly off, though. Optimally, the time complexity would in this case be the same as the space complexity, but there are a couple of inefficiencies here (one of which is that you're not actually backtracking), such that the time complexity is actually O(n^2*2^n) at best.
It can definitely be useful to analyze a recursive algorithm's time complexity in terms of how many times the recursive method is called, times how much work each call does. But be careful about saying backtrack is only called n times: it is called n times at the top level, but that ignores all the subsequent recursive calls. Also, each top-level call backtrack(nums, 0, new ArrayList<>(), length); is responsible for generating all subsets of size length, of which there are (n Choose length). That is, no single top-level call will ever produce 2^n subsets; rather, it's the sum of (n Choose length) over lengths from 0 to n that is 2^n: C(n,0) + C(n,1) + ... + C(n,n) = 2^n.
Knowing that across all recursive calls, you generate 2^n subsets, you might then want to ask how much work is done in generating each subset in order to determine the overall complexity. Optimally, this would be O(n), because each subset varies in length from 0 to n, with the average length being n/2, so the overall algorithm might be O(n/2*2^n) = O(n*2^n), but you can't just assume the subsets are generated optimally and that no significant extra work is done.
In your case, you're building subsets through the listSoFar variable until it reaches the appropriate length, at which point it is appended to the result. However, listSoFar gets copied to a temp list in O(n) time for each of its O(n) elements, so the complexity of generating each subset is O(n^2), which brings the overall complexity to O(n^2*2^n). Also, some listSoFar subsets are created which never figure into the final output (you never check that there are enough numbers remaining in nums to fill listSoFar out to the desired length before recursing), so you end up doing unnecessary work building subsets and making recursive calls which will never reach the base case to get appended to result, which might also worsen the asymptotic complexity. You can address the first of these inefficiencies with back-tracking, and the second with a simple break statement. I wrote these changes into a JavaScript program, leaving most of the logic the same but re-naming/re-organizing a little bit:
function getSubsets(nums) {
  let subsets = [];
  for (let length = 0; length <= nums.length; length++) {
    // refactored "backtrack" function:
    genSubsetsByLength(length); // O(length * (n Choose length))
  }
  return subsets;

  function genSubsetsByLength(length, i = 0, partialSubset = []) {
    if (length === 0) {
      subsets.push(partialSubset.slice()); // O(n): copy partial and push to result
      return;
    }
    while (i < nums.length) {
      if (nums.length - i < length) break; // don't build partial results that can't finish
      partialSubset.push(nums[i]); // O(1)
      genSubsetsByLength(length - 1, ++i, partialSubset);
      partialSubset.pop(); // O(1): this is the back-tracking part
    }
  }
}
for (let subset of getSubsets([1, 2, 3])) console.log(`[`, ...subset, ']');
The key difference is using back-tracking to avoid making copies of the partial subset every time you add a new element to it, such that each is built in O(length) = O(n) time rather than O(n^2) time, because there is now only O(1) work done per element added. Popping off the last element added to the partial result after each recursive call allows you to re-use the same array across recursive calls, thus avoiding the O(n) overhead of making temp copies for each call. This, along with the fact that only subsets which appear in the final output are built, allows you to analyze the total time complexity in terms of the total number of elements across all subsets in the output: O(n*2^n).
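If it helps to see the same fix in the question's language, here is my rough Java translation of the back-tracked version (a sketch, not the answerer's code; the name genSubsetsByLength is carried over from the JavaScript above):

import java.util.ArrayList;
import java.util.List;

static List<List<Integer>> getSubsets(int[] nums) {
    List<List<Integer>> subsets = new ArrayList<>();
    for (int length = 0; length <= nums.length; length++) {
        genSubsetsByLength(nums, length, 0, new ArrayList<>(), subsets);
    }
    return subsets;
}

static void genSubsetsByLength(int[] nums, int length, int i,
                               List<Integer> partial, List<List<Integer>> subsets) {
    if (length == 0) {
        subsets.add(new ArrayList<>(partial)); // O(n): one copy per finished subset
        return;
    }
    for (; i < nums.length; i++) {
        if (nums.length - i < length) break; // prune partials that can never be completed
        partial.add(nums[i]);                // O(1)
        genSubsetsByLength(nums, length - 1, i + 1, partial, subsets);
        partial.remove(partial.size() - 1);  // O(1): the back-tracking step
    }
}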
Your code does not work efficiently.
As in the first solution in the link, you only need to decide, for each number, whether it is included or not (like generating combinations).
That means you don't need the loops in both getSubsets and backtrack; the backtrack function can walk the nums array using its index parameter:
private List<List<Integer>> result = new ArrayList<>();

public List<List<Integer>> getSubsets(int[] nums) {
    backtrack(nums, 0, new ArrayList<>());
    return result;
}

// This function's time complexity is O(2^n), because it searches all
// cases where each number is either included or not.
private void backtrack(int[] nums, int index, List<Integer> listSoFar) {
    if (index == nums.length) {
        result.add(listSoFar);
        return;
    }
    // exclude nums[index] from the subset
    backtrack(nums, index + 1, new ArrayList<>(listSoFar));
    // include nums[index] in the subset
    listSoFar.add(nums[index]);
    backtrack(nums, index + 1, listSoFar);
}
I am trying to calculate the runtime of a function I wrote in Java that calculates the sum of all the right children in a binary tree.
I used recursion in the function, and I don't really understand how to calculate the runtime of recursion, let alone in a binary tree (I just started studying the subject).
This is the code I wrote:
public int sumOfRightChildren() {
    return sumOfRightChildren(this.root);
}

private int sumOfRightChildren(Node root) {
    if (root == null) // O(1)
        return 0; // O(1)
    int sum = 0; // O(1)
    if (root.right != null) // O(1)
        sum += root.right.data; // O(1)
    sum += sumOfRightChildren(root.right); // worst case O(n)?
    if (root.left != null) {
        sum += sumOfRightChildren(root.left); // worst case O(n)?
    }
    return sum;
}
I tried writing down the runtimes I think it takes, but I don't think I am doing it right.
If someone can help guide me I'd be very thankful.
I'm trying to calculate T(n).
Since you visit every node exactly once, it is easy to see that the runtime cost is T(n) = n * K, where n is the number of nodes in the binary tree and K is the constant cost per call.
If you want to explicitly consider the cost of certain operations you may not be able to calculate it exactly (without having an input example). For example, calculating the number of times sum+=... is executed is not possible because it depends on the particular tree.
In this case the worst case is a full binary tree; if n = 1, 2, ... is its depth, then:
the complexity is O(2^n) (no matter the operations, since all of them take O(1), as you have posted);
the cost of sum += root.right.data; is T(n) = 2^n - 1 (executed once per internal node);
the cost of sum += ... overall is T(n) = 3 * (2^n - 1) (twice for every internal node and once more for each node);
...
(NOTE: the exact final expressions may vary, since your if (root.left != null) check is not useful; it is preferable to leave that condition to the if (root == null) base case.)
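One way to make the "visit every node exactly once" argument precise is with a recurrence over subtree sizes (my sketch, not part of the original answer):

T(0) = c_0,    T(n) = T(n_L) + T(n_R) + c,    where n_L + n_R = n - 1

Unrolling it, each of the n nodes contributes the constant c exactly once and each of the n + 1 null links contributes c_0 once, so T(n) = n*c + (n + 1)*c_0 = O(n), regardless of the tree's shape.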
OK, I think I understood.
The worst case is that it has to check all the nodes in the tree, so the answer is O(n).
This is a common interview question.
You have a stream of numbers coming in (let's say more than a million). The numbers are in the range [0, 999].
Implement a class which supports three methods in O(1):
* insert(int i);
* getMean();
* getMedian();
This is my code.
public class FindAverage {
    private int[] store;
    private long size;
    private long total;
    private int highestIndex;
    private int lowestIndex;

    public FindAverage() {
        store = new int[1000];
        size = 0;
        total = 0;
        highestIndex = Integer.MIN_VALUE;
        lowestIndex = Integer.MAX_VALUE;
    }

    public void insert(int item) throws OutOfRangeException {
        if (item < 0 || item > 999) {
            throw new OutOfRangeException();
        }
        store[item]++;
        size++;
        total += item;
        highestIndex = Integer.max(highestIndex, item);
        lowestIndex = Integer.min(lowestIndex, item);
    }

    public float getMean() {
        return (float) total / size;
    }

    public float getMedian() {
        // ??? can't think of a way to do this in O(1)
    }
}
I can't seem to think of a way to get the median in O(1) time.
Any help appreciated.
You have already done all the heavy lifting, by building the store counters. Together with the size value, it's easy enough.
You simply start iterating the store, summing up the counts until you reach half of size. That is your median value, if size is odd. For even size, you'll grab the two surrounding values and get their average.
Performance is O(1000/2) on average, which means O(1), since it doesn't depend on n, i.e. performance is unchanged even if n reaches into the billions.
Remember, O(1) doesn't mean instant, or even fast. As Wikipedia says it:
An algorithm is said to be constant time (also written as O(1) time) if the value of T(n) is bounded by a value that does not depend on the size of the input.
In your case, that bound is 1000.
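A minimal sketch of that getMedian() (my code, reusing the store and size fields from the question; when size is even it averages the two middle values):

public float getMedian() {
    if (size == 0) throw new IllegalStateException("no elements inserted");
    long mid = size / 2;   // 0-based index of the upper middle element
    long seen = 0;         // how many inserted values the scan has covered so far
    int lower = -1;        // value at index size/2 - 1, needed when size is even
    for (int value = 0; value < store.length; value++) {
        seen += store[value];
        if (size % 2 == 0 && lower < 0 && seen >= mid) {
            lower = value; // first of the two middle elements
        }
        if (seen > mid) {
            // 'value' is the element at index size/2 of the sorted data
            return size % 2 == 0 ? (lower + value) / 2.0f : value;
        }
    }
    throw new IllegalStateException("unreachable: counters out of sync");
}

The loop is bounded by the 1000 possible values, not by the number of inserts, which is exactly the sense in which this counts as O(1).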
The possible values that you can read are quite limited - just 1000. So you can think of implementing something like a counting sort - each time a number is input you increase the counter for that value.
To implement the median in constant time, you will need two numbers: the median index (i.e. the value of the median) and the number of values you've read that are on the left (or right) of the median. I will just stop here, hoping you will be able to figure out how to continue on your own.
EDIT (as pointed out in the comments): you already have the array with the sorted elements (store) and you know the number of elements to the left of the median (size/2). You only need to glue the logic together. I would like to point out that if you use linear additional memory, you won't need to iterate over the whole array on each insert.
For the general case, where the range of elements is unlimited, such a data structure does not exist for any comparison-based algorithm, as it would allow O(n) sorting.
Proof: Assume such a DS exists; call it D.
Let A be the input array to be sorted. (Assume A.size() is even for simplicity; that can be relaxed pretty easily by adding a garbage element and discarding it later.)
sort(A):
    ds = new D()
    for each x in A:
        ds.add(x)
    m1 = min(A) - 1
    m2 = max(A) + 1
    for (i = 0; i < A.size(); i++):
        ds.add(m1)
    # at this point, ds.median() is the smallest element in A
    for (i = 0; i < A.size(); i++):
        yield ds.median()
        # each two insertions advance the median by 1
        ds.add(m2)
        ds.add(m2)
Claim 1: This algorithm runs in O(n).
Proof: Since add() and median() are constant-time operations, each iteration costs O(1), and the number of iterations is linear, so the total complexity is linear.
Claim 2: The output is sorted(A).
Proof (guidelines): After inserting m1 n times, the median is the smallest element in A. Each two insertions after that advance the median by one item, and since the median advances in sorted order, the total output is sorted(A).
Since the above algorithm sorts in O(n), which is not possible in the comparison model, such a DS does not exist.
QED.
This is problem 9.4 from Cracking the Coding Interview, 5th edition.
The Problem: Write a method to return all the subsets of a set.
Here is my solution in Java (I tested it, it works!):
public static List<Set<Integer>> subsets(Set<Integer> s) {
    Queue<Integer> copyToProtectData = new LinkedList<Integer>();
    for (int member : s) {
        copyToProtectData.add(member);
    }
    List<Set<Integer>> subsets = new ArrayList<Set<Integer>>();
    generateSubsets(copyToProtectData, subsets, new HashSet<Integer>());
    return subsets;
}

private static void generateSubsets(Queue<Integer> s,
        List<Set<Integer>> subsets, Set<Integer> hashSet) {
    if (s.isEmpty()) {
        subsets.add(hashSet);
    } else {
        int member = s.remove();
        Set<Integer> copy = new HashSet<Integer>();
        for (int i : hashSet) {
            copy.add(i);
        }
        hashSet.add(member);
        Queue<Integer> queueCopy = new LinkedList<Integer>();
        for (int i : s) {
            queueCopy.add(i);
        }
        generateSubsets(s, subsets, hashSet);
        generateSubsets(queueCopy, subsets, copy);
    }
}
I looked at the solutions for this problem, and the author said that this algorithm runs in O(2^n) time complexity and O(2^n) space complexity. I agree with her that it runs in O(2^n) time: to solve this problem you have to consider the fact that for any element there are two possibilities, it can either be in the set or not. And because you have n elements, the problem has 2^n possibilities, so it is solved in O(2^n) time.
However, I believe that I have a compelling argument that my algorithm runs in O(n) space. I know that space complexity is "the total space taken by an algorithm with respect to the input size", and that it is related to the depth of the recursive calls (I remember this from some YouTube video I watched).
An example I have is generating [1,2,3] as a subset of [1,2,3]. Here is the sequence of recursive calls that generates that set:
generateSubsets([], subsets, [1,2,3])
generateSubsets([3],subsets,[1,2])
generateSubsets([2,3],subsets,[1])
generateSubsets([1,2,3],subsets,[])
This shows that the greatest depth of a recursive call with respect to the original set size n is n itself. Each of these recursive calls will have its own stack frame. From this, I concluded that the space complexity is O(n). Does anyone see any flaws in my proof?
You need to take into account all memory that is allocated by your algorithm (or rather, the greatest amount of allocated memory that is "in use" at any time), not only on the stack but also on the heap. Each of the generated subsets is stored in the subsets list, which will eventually contain 2^n sets, each of size somewhere between 0 and n (with most of the sets containing around n/2 elements), so the space complexity is actually O(n * 2^n).
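As a quick sanity check of that bound (my arithmetic, not from the answer): the total number of elements across all 2^n subsets is the sum of k * C(n, k) for k from 0 to n, which equals n * 2^(n-1), i.e. on the order of n * 2^n. For example, with n = 3 the eight subsets hold 0 + 1 + 1 + 1 + 2 + 2 + 2 + 3 = 12 = 3 * 2^2 elements in total.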
/*
 * Returns true if this and other are rankings of the same
 * set of strings; otherwise, returns false. Throws a
 * NullPointerException if other is null. Must run in O(n)
 * time, where n is the number of elements in this (or other).
 */
public boolean sameNames(Ranking other)
{
    ArrayList<String> str1 = new ArrayList<String>();
    ArrayList<String> str2 = new ArrayList<String>();
    for (int i = 0; i < this.getNumItems(); i++) {
        str1.add(this.getStringOfRank(i));
    }
    for (int i = 0; i < other.getNumItems(); i++) {
        str2.add(other.getStringOfRank(i));
    }
    Collections.sort(str1);
    Collections.sort(str2);
    if (str1.size() == str2.size())
        return str1.containsAll(str2);
    else
        return false;
}
OK, so in the code above, using str1.containsAll(str2) destroys my O(n) time complexity, as I believe it is O(n^2) in this case. My question is: how can I compare the contents of two arrays/ArrayLists without it being O(n^2)? All I can think of is a nested for loop, which of course is O(n^2).
/*
 * Returns the rank of name. Throws an IllegalArgumentException
 * if name is not present in the ranking. Must run in O(log n)
 * time, where n = this.getNumItems().
 */
public int getRankOfString(String name)
{
    Cities[] nameTest = new Cities[city.length];
    int min = 0;
    int max = city.length - 1; // inclusive upper bound for the binary search
    System.arraycopy(city, 0, nameTest, 0, city.length);
    Arrays.sort(nameTest, Cities.BY_NAME);
    while (max >= min) {
        int mid = (min + max) / 2;
        if (nameTest[mid].getName().equals(name))
            return nameTest[mid].getRank();
        else if (nameTest[mid].getName().compareTo(name) < 0)
            min = mid + 1;
        else
            max = mid - 1;
    }
    throw new IllegalArgumentException();
}
And this one has to be O(log n). So I used a binary search; however, it only works on sorted arrays, so I have to call Arrays.sort(), BUT I can't mess with the order of the actual array, so I have to copy it first using System.arraycopy(). This is most likely O(n + n log n + log n), which is not O(log n). I don't know what other way I can search for something; O(log n) seems to call for binary search, but that forces me to sort the array first, which just adds time...
P.S. I am not allowed to use Maps or Sets... :(
Any help would be awesome.
Sorry, a Ranking object contains an array of city names and an array of rankings (just ints) for each city, both of which can be queried. sameNames() simply tests whether two Ranking objects contain the same cities. getRankOfString() takes a name, checks whether that name is in the ranking, and if it is, returns its corresponding rank. Hope that cleared it up.
And yeah, cannot use hash anything. We are basically limited to messing around with arrays and arrayLists and stuff.
Let's count the occurrences of each string. It's a bit similar to counting sort.
Create a hash table t with hashing function f(), where keys are strings and values are integers (initially 0).
Iterate through the first list of strings; for each string do t[f(string)]++.
Iterate through the second list of strings; for each string do t[f(string)]++.
Iterate through the non-zero values in t; if all are even, return true. Otherwise, return false.
Linear time complexity.
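A sketch of that idea in Java (note: it uses a HashMap, which the question's constraints forbid, so it only illustrates the counting approach; I also use +1/-1 deltas instead of parity checks so that duplicates within one list cannot produce a false positive):

import java.util.HashMap;
import java.util.List;
import java.util.Map;

static boolean sameNames(List<String> a, List<String> b) {
    if (a.size() != b.size()) return false;
    Map<String, Integer> counts = new HashMap<>();
    for (String s : a) counts.merge(s, 1, Integer::sum);   // +1 per occurrence in a
    for (String s : b) counts.merge(s, -1, Integer::sum);  // -1 per occurrence in b
    // Every count nets out to zero iff the two multisets of names are identical.
    for (int c : counts.values()) {
        if (c != 0) return false;
    }
    return true;
}

This runs in O(n) expected time, matching the linear bound above.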
The first method has complexity at least O(n^2), given by 2*O(n*f(n)) + 2*O(n log n) + O(n^2). The O(n log n) terms come from the Collections.sort() calls, which also "destroy your O(n)" complexity, as you put it.
Since both array lists are already sorted and of equal length when you reach the containsAll call, that call is equivalent to some sort of equals check (the first element in one list should be equal to the first element in the second one, etc.). You can easily compare the two lists manually (I can't think of any built-in function that does this).
Hence, the overall complexity of the first piece of code can be reduced to O(n log n), if you can keep the complexity of getStringOfRank() under O(log n) (but that function is not shown in your post).
The second function (which isn't related to the first piece of code) has complexity O(n log n), as shown by your computations. If you copy and then sort the city array on every call anyway, the binary search is pointless. Don't copy, don't sort; just compare against each city in the array in turn, putting the entire complexity of this function at O(n). Alternatively, keep a sorted copy of the city array and use binary search on that.
Either way, creating a copy of an array and sorting that copy on every function call is highly inefficient. If you want to call this function inside a loop, like you used getStringOfRank() above, construct the sorted copy before the loop and pass it as an argument:
private int getRankOfString(String name, Cities[] sortedCities) {
    // only the binary search code is needed here
}
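For instance, the call site could look something like this (my sketch; "Chicago" is only a placeholder name):

// One-time preparation, O(n log n), instead of copying and sorting on every call:
Cities[] sortedCities = new Cities[city.length];
System.arraycopy(city, 0, sortedCities, 0, city.length);
Arrays.sort(sortedCities, Cities.BY_NAME);

// Each subsequent lookup is then a plain O(log n) binary search:
int rank = getRankOfString("Chicago", sortedCities);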
Off-topic:
Based on the second function, you have something like Cities[] city declared somewhere in your code. If it were to follow conventions, it should be more like City[] cities (class names are singular; the array name is the one that should use the plural).
So the first one just needs to check whether the two have the exact same names and nothing else?
How about this:
public static boolean compare(List<String> l1, List<String> l2) {
    if (l1.size() != l2.size()) return false;
    long hash1 = 0;
    long hash2 = 0;
    for (int i = 0; i < l1.size(); i++) {
        hash1 += l1.get(i).hashCode();
        hash2 += l2.get(i).hashCode();
    }
    return hash1 == hash2;
}
In theory, you could get a hash collision I suppose.
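Indeed: in Java, "Aa" and "BB" famously have the same hashCode() (2112), so compare(List.of("Aa"), List.of("BB")) would incorrectly return true. Summed hash codes can only tell you for sure when two lists differ, never confirm that they match.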