I am trying to solve the popular interview question: find the k-th smallest number in an array of distinct integers. I read some solutions and found that a heap data structure suits this problem very well.
So I tried to implement a solution using the PriorityQueue class of the Collections framework, assuming it is functionally identical to a heap.
Here is the code that I've tried:
public static int getKthMinimum(int[] input, int k) {
    PriorityQueue<Integer> heap = new PriorityQueue<Integer>();

    // Total cost of building the heap - O(n) or O(n log n)?
    for (int num : input) {
        heap.add(num); // Add to heap - O(log n)
    }

    // Total cost of removing the k-1 smallest elements - O(k log n) < O(n)?
    while (k > 1) {
        heap.poll(); // Remove min - O(log n)
        k--;
    }

    return heap.peek(); // Fetch root, now the k-th smallest - O(1)
}
Based on the docs, the poll & add methods take O(log n) time and peek takes constant time.
1. What will be the time complexity of the while loop? (I think O(k log n)).
2. Should O(k log n) be considered higher than O(n) for the purpose of this question? Is there a threshold where it switches?
3. What will be the total time complexity of this code? Will it be O(n)?
4. If not already O(n), is there a way to solve it in O(n), while using the PriorityQueue class?
1. What will be the time complexity of the while loop? (I think O(k log n)).
O(k log n) is correct.
2. Should O(k log n) be considered higher than O(n) for the purpose of this question? Is there a threshold where it switches?
You cannot assume that. k can be anywhere from 0 to n−1, which means that k log n can be anywhere from 0 to n log n.
3. What will be the total time complexity of this code? Will it be O(n)?
O(n log n), because that's the cost of building the heap.
It's possible to build a heap in O(n) time, but your code doesn't do that; if it did, your total complexity would be O(n + k log n) or, equivalently, O(MAX(n, k log n)).
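For illustration, a minimal sketch of that O(n) build. It rests on an implementation detail of OpenJDK (an assumption, not something the question states): the PriorityQueue(Collection) constructor heapifies a plain, unsorted collection bottom-up in O(n) instead of adding elements one at a time.

import java.util.Arrays;
import java.util.PriorityQueue;
import java.util.stream.Collectors;

public class HeapBuildSketch {
    // Sketch, not the original code: builds the heap in O(n) via the
    // Collection constructor (bottom-up heapify in OpenJDK, an assumed
    // implementation detail), then pays O(k log n) for the polls,
    // giving O(n + k log n) in total.
    public static int getKthMinimum(int[] input, int k) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(
                Arrays.stream(input).boxed().collect(Collectors.toList()));
        while (k > 1) {
            heap.poll();
            k--;
        }
        return heap.peek();
    }
}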
4. If not already O(n), is there a way to solve it in O(n), while using the PriorityQueue class?
No. There exist selection algorithms in worst-case O(n) time, but they're a bit complicated, and they don't use PriorityQueue.
The fastest PriorityQueue-based solutions would require O(MAX(n, n log MIN(k, n−k))) time. The key is to keep the heap small: keep only the k smallest elements seen so far in a max-heap (a PriorityQueue with a reversed comparator) while iterating, or keep the n−k+1 largest elements in a default min-heap, if k is large enough for that to be worthwhile.
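Here is a hedged sketch of the first variant (the method name mirrors the question's; the rest is mine, not from the answer): a max-heap capped at size k, so every heap operation costs O(log k) and the total is O(n log k).

import java.util.Collections;
import java.util.PriorityQueue;

public class KthSmallest {
    // Sketch: max-heap capped at size k. Assumes 1 <= k <= input.length
    // and distinct integers, as in the question.
    public static int getKthMinimum(int[] input, int k) {
        PriorityQueue<Integer> maxHeap =
                new PriorityQueue<>(k, Collections.reverseOrder());
        for (int num : input) {
            if (maxHeap.size() < k) {
                maxHeap.add(num);              // grow the heap to size k
            } else if (num < maxHeap.peek()) {
                maxHeap.poll();                // evict the current largest
                maxHeap.add(num);              // so only the k smallest remain
            }
        }
        return maxHeap.peek(); // root of the max-heap = k-th smallest
    }
}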
Related
What would be the time complexity of partitioning an array into two and finding the minimum element overall?
Is it O(n) or O(log n)?
The complexity of dividing an (unsorted) array into two sorted partitions is O(N log N).
Once you have two sorted partitions, it is O(1) to find the smallest element in either ... and hence both partitions.
(The smallest element of a sorted partition is the first one.)
Time Complexity for Partitioned Array
Suppose an array A is already divided into two sorted partitions P1 and P2, where P1 occupies the indexes 0 <= i < k of A and P2 the indexes k <= i < n, with k an arbitrary index in the range 0 <= k < n.
Then you know that the smallest element of each partition is its first. Accessing each partition's first element has a time complexity of O(1), and comparing the two retrieved values is again O(1).
So, the overall complexity of finding the minimum value in an array divided into two sorted partitions is O(1).
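As a minimal sketch (the method name and signature are mine, not from the answer):

// Sketch: a[0..k-1] and a[k..n-1] are each sorted in ascending order,
// with 0 < k < a.length. Each partition's minimum is its first element,
// so the overall minimum is a single comparison: O(1).
static int minOfPartitionedArray(int[] a, int k) {
    return Math.min(a[0], a[k]);
}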
Time Complexity for Array to Partition
If, instead, the given array A must first be divided into two sorted partitions (because that is a requirement) before finding its minimum element, then you need to split the array at an arbitrary index k, sort both partitions with an efficient comparison sort, which has complexity O(n log n), and then apply the same logic as above to find the minimum element.
For any given k with 0 <= k < n, we have to apply the sorting algorithm twice, once per partition. However, since the sum of two complexities of the same order is still of that order, for k = n/2 we get O(n/2 log(n/2)) + O(n/2 log(n/2)) = O(n log n). More generally, O(k log k) + O((n−k) log(n−k)) with 0 <= k < n is still O(n log n) as n → ∞, because constant factors are dropped. Finally, adding the O(1) cost of finding the minimum element among the two partitions still gives O(n log n).
In conclusion, the overall complexity of dividing an array A into two partitions P1 and P2 and finding the minimum element overall is O(n log n).
In Java, if I sort using Arrays.sort() (O(n log n)) and then use a for loop, O(n), in the code, what will be the new complexity? Is it O(n^2 log n) or O(n log n)?
Answer: O(n log n)
Non-nested complexities can simply be added, i.e. O(n) + O(n log n).
For large n, n log n is significantly greater than n. Therefore, O(n log n) is the answer.
Read this: https://en.wikipedia.org/wiki/Big_O_notation
Note:
If the complexities are nested, then they are multiplied. For example:
inside a loop of order n, you perform a sort of order n log n;
then the complexity will be O(n * n log n), i.e. O(n² log n).
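For concreteness, a small sketch of both cases (the method and array names are placeholders): sequential costs add, nested costs multiply.

import java.util.Arrays;

public class ComplexityDemo {
    // Sequential: O(n log n) + O(n) = O(n log n).
    static long sortThenScan(int[] a) {
        Arrays.sort(a);             // O(n log n)
        long sum = 0;
        for (int x : a) sum += x;   // O(n), runs after the sort, not inside it
        return sum;
    }

    // Nested: for n rows of length n, the O(n log n) sort runs n times,
    // giving O(n^2 log n).
    static void sortEachRow(int[][] rows) {
        for (int[] row : rows) {
            Arrays.sort(row);
        }
    }
}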
If you perform the for loop afterwards, you have an O(n log n) operation followed by an O(n) one. Since O(n) is negligible compared to O(n log n), your overall complexity would be O(n log n).
Why is the run-time complexity of this logic O(N)? The number of iterations is only half here. Please explain!
for (int i = 0; i < validData.length / 2; i++) {
    // swap element i with its mirror from the end of the array
    int temp = validData[i];
    validData[i] = validData[validData.length - i - 1];
    validData[validData.length - i - 1] = temp;
}
Big O notation is about orders of magnitude and how the complexity relates to the number of elements: O(1/2 * n) == O(n).
Time complexity falls into one of the common complexity categories:
constant (O(1))
logarithmic (O(log(N)))
linear (O(N))
quadratic (O(N^2))
cubic (O(N^3))
etc...
O(N) is the general approximation of the complexity O(N/2) in your case, because the 1/2 is regarded as a constant (especially considering high values of N).
Therefore the final complexity is said to be linear: it depends only on the value of N, and the final execution time grows linearly with N.
We are interested in how the algorithm scales as the input size grows, and in theory constants do not affect its behavior at a very large scale.
Therefore, even though you are only going through half of the array, O(1/2 * n) in your case, in theory it is still O(n).
Good day!
I have a question regarding the time complexity of a binary search tree insertion method. I read some of the answers regarding this, but some were different from each other. Is the time complexity of a binary search tree insertion O(log n) in the average case and O(n) in the worst case? Or is it O(n log n) in the average case and O(n^2) in the worst case? When does it become O(n log n) in the average case and O(n^2) in the worst case?
In the average case, one insert operation is O(log n), since it consists of a test (constant time) and a recursive call into one subtree (about half of the remaining nodes to visit), making the problem smaller in constant time. Thus, for n insert operations, the average case is O(n log n). The key is that the operation requires time proportional to the height of the tree: on average one insert is O(log n), but in the worst case the height is O(n).
So if you're doing n operations, the average case is O(n log n) and the worst case is O(n^2).
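To make the height argument concrete, here is a minimal BST insert sketch (the class and field names are mine): each call does constant work and then recurses one level down, so one insert costs O(height), i.e. O(log n) on average and O(n) for a degenerate, list-shaped tree.

class Node {
    int key;
    Node left, right;
    Node(int key) { this.key = key; }
}

class BST {
    Node root;

    void insert(int key) {
        root = insert(root, key);
    }

    // Constant work per level plus one recursive call down the tree:
    // O(height) per insert -> O(log n) on average, O(n) in the worst case.
    private Node insert(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.key) node.left = insert(node.left, key);
        else if (key > node.key) node.right = insert(node.right, key);
        return node; // duplicate keys are ignored
    }
}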
It's only executing the loop about n/3 times, so I guess it's still technically linear? However, I don't really understand why it wouldn't be O(log n), because many times code with an O(log n) running time ends up checking around n/3 of the options. Does O(log n) always divide the options by 2 every time?
int a = 0;
for (int i = 0; i < n; i = i + 3)
    a = a + i;
Your code has complexity O(n): O(n/3) = (1/3) * O(n) = O(n).
In time-complexity analysis, constant factors do not matter. You could do 1,000,000 operations per loop iteration and it would still be O(n). Because the constant 1/3 doesn't matter, it's still O(n). (And note that for n = 1,000,000, n/3 is much bigger than log n.)
From the Wikipedia entry on Big-O notation:
Let k be a constant. Then:
O(kg) = O(g) if k is nonzero.
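A small sketch of the distinction (the printed counts are what the loops actually execute): a loop whose index grows by addition runs a constant fraction of n times, which is linear, while a loop whose index grows by multiplication runs about log n times.

public class LoopGrowth {
    public static void main(String[] args) {
        int n = 1_000_000;

        int additive = 0;
        for (int i = 0; i < n; i += 3) additive++;       // ~n/3 steps: O(n)

        int multiplicative = 0;
        for (int i = 1; i < n; i *= 2) multiplicative++; // ~log2(n) steps: O(log n)

        System.out.println(additive);       // 333334
        System.out.println(multiplicative); // 20
    }
}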
It is of order n, O(n), and not O(log n).
That is because the run time increases linearly with the increase in n.
For more information, take a look at this graph and hopefully you will understand why it is not log n:
https://www.cs.auckland.ac.nz/software/AlgAnim/fig/log_graph.gif
The running time is O(n) (in the unit-cost measure).