Complexity after Arrays.sort and for loop - java

In Java, if I sort using Arrays.sort() (O(n log n)) and then run a for loop (O(n)) in the same code, what is the overall complexity? Is it O(n² log n) or O(n log n)?

Answer: O(n log n)
Complexities of non-nested steps are simply added, i.e. O(n) + O(n log n).
For large n, n log n dominates n, so the sum is O(n log n).
Read this: https://en.wikipedia.org/wiki/Big_O_notation
Note:
If the steps are nested, the complexities are multiplied. For example:
Inside a loop of order n, you do a sort of order n log n.
Then the complexity is O(n * n log n), i.e. O(n² log n).

If you perform the for loop after the sort, you have an O(n log n) operation followed by an O(n) one. Since O(n) is negligible compared to O(n log n), your overall complexity is O(n log n).
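To make the "sort, then loop" case concrete, here is a minimal sketch in Java. The class and method names (SortThenScan, minAdjacentGap) are made up for illustration and are not from the question. It sorts once and then makes a single linear pass, so the total work is O(n log n) + O(n) = O(n log n).

```java
import java.util.Arrays;

class SortThenScan {
    // Hypothetical example: smallest difference between any two values.
    // Assumes at least two values and that differences fit in an int.
    static int minAdjacentGap(int[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        int[] sorted = values.clone();
        Arrays.sort(sorted);                          // O(n log n)
        int best = Integer.MAX_VALUE;
        for (int i = 1; i < sorted.length; i++) {     // O(n), runs after the sort, not inside it
            best = Math.min(best, sorted[i] - sorted[i - 1]);
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(minAdjacentGap(new int[] {10, 3, 7, 1}));
    }
}
```

If the Arrays.sort call were moved inside the for loop, the two costs would multiply instead of add, giving O(n² log n).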

Related

Time complexity of k-th smallest number using PriorityQueue in Java

I am trying to solve the popular interview question Find the k-th smallest number in an array of distinct integers. I read some solutions and found that a heap data structure suits this problem very well.
So, I have tried to implement a solution using the PriorityQueue class of the Collections framework, assuming that it is functionally identical to a heap.
Here is the code that I've tried:
public static int getKthMinimum(int[] input, int k){
    PriorityQueue<Integer> heap = new PriorityQueue<Integer>();
    // Total cost of building the heap - O(n) or O(n log n) ?
    for(int num : input){
        heap.add(num); // Add to heap - O(log n)
    }
    // Total cost of removing k smallest elements - O(k log n) < O(n) ?
    while(k > 0){
        heap.poll(); // Remove min - O(log n)
        k--;
    }
    return heap.peek(); // Fetch root - O(1)
}
Based on the docs, the poll & add methods take O(log n) time and peek takes constant time.
What will be the time complexity of the while loop? (I think O(k log n)).
Should O(k log n) be considered higher than O(n) for the purpose of this question? Is there a threshold where it switches?
What will be the total time complexity of this code? Will it be O(n)?
If not already O(n), is there a way to solve it in O(n), while using the PriorityQueue class?
1. What will be the time complexity of the while loop? (I think O(k log n)).
O(k log n) is correct.
2. Should O(k log n) be considered higher than O(n) for the purpose of this question? Is there a threshold where it switches?
You cannot assume that. k can be anywhere from 0 to n−1, which means that k log n can be anywhere from 0 to n log n.
3. What will be the total time complexity of this code? Will it be O(n)?
O(n log n), because that's the cost of building the heap.
It's possible to build a heap in O(n) time, but your code doesn't do that; if it did, your total complexity would be O(n + k log n) or, equivalently, O(MAX(n, k log n)).
4. If not already O(n), is there a way to solve it in O(n), while using the PriorityQueue class?
No. There exist selection algorithms in worst-case O(n) time, but they're a bit complicated, and they don't use PriorityQueue.
The fastest PriorityQueue-based solutions would require O(MAX(n, n log MIN(k, n−k))) time. (The key is to keep only the k smallest elements in the heap while iterating — or the n−k largest elements and use a max-heap, if k is large enough for that to be worthwhile.)
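The "keep only the k smallest elements in the heap" idea can be sketched as follows. This is one possible reading of that answer, not the answerer's own code: the class name KthSmallest is invented, and k is assumed to count from 1 (k = 1 means the minimum), which differs from the indexing in the question's code. A max-heap capped at k elements makes each insertion cost O(log k), for O(n log k) overall.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class KthSmallest {
    // Assumption: k is 1-based and 1 <= k <= input.length.
    static int kthSmallest(int[] input, int k) {
        if (k < 1 || k > input.length) {
            throw new IllegalArgumentException("k must be between 1 and input.length");
        }
        // Max-heap holding the k smallest elements seen so far.
        PriorityQueue<Integer> maxHeap = new PriorityQueue<>(k, Comparator.reverseOrder());
        for (int num : input) {
            if (maxHeap.size() < k) {
                maxHeap.add(num);                 // O(log k)
            } else if (num < maxHeap.peek()) {
                maxHeap.poll();                   // evict the largest of the k kept so far
                maxHeap.add(num);
            }
        }
        return maxHeap.peek();                    // largest of the k smallest
    }

    public static void main(String[] args) {
        System.out.println(kthSmallest(new int[] {7, 2, 9, 4, 1}, 2)); // prints 2
    }
}
```

For k larger than n/2, the mirrored version (a min-heap of the n−k+1 largest elements) would keep the heap smaller, which is where the MIN(k, n−k) in the bound comes from.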

Should I use TreeSet or HashSet?

I have a large number of strings and need to print the unique strings in sorted order.
TreeSet stores them in sorted order, but insertion time is O(log n) for each insertion. HashSet takes O(1) time to add, but then I will have to get a list from the set and sort it using Collections.sort(), which takes O(n log n) (I assume there is no memory overhead here, since only the references to the Strings will be copied into the new collection, i.e. the List). Is it fair to say that either choice is the same overall, since in the end the total time will be the same?
That depends on how closely you look. Yes, the asymptotic time complexity is O(n log n) in either case, but the constant factors differ. So it's not as if one method can be 100 times faster than the other, but it's certainly possible that one method is twice as fast as the other.
For most parts of a program, a factor of 2 is totally irrelevant, but if your program actually spends a significant part of its running time in this algorithm, it would be a good idea to implement both approaches, and measure their performance.
Measuring is the way to go, but if you're talking purely theoretically and ignoring the cost of reading the elements back after sorting, then consider the following for number of strings = x:
HashSet:
x * O(1) add operations + 1 O(n log n) (where n is x) sort operation = approximately O(n + n log n) (ok, that's a gross oversimplification, but..)
TreeSet:
x * O(log n) (where n increases from 1 to x) + O(0) sort operation = approximately O(n log (n/2)) (also a gross oversimplification, but..)
And continuing in the oversimplification vein, O(n + n log n) > O(n log (n/2)). Maybe TreeSet is the way to go?
If you distinguish the total number of strings (n) and number of unique strings (m), you get more detailed results for both approaches:
Hash set + sort: O(n) + O(m log m)
TreeSet: O(n log m)
So if n is much bigger than m, using a hash set and sorting the result should be slightly better.
You should take into account which methods will be executed more frequently and base your decision on that.
Apart from HashSet and TreeSet you can use LinkedHashSet, which keeps elements in insertion order rather than sorted order, so it only helps if the strings already arrive sorted. If you want to learn more about their differences in performance, I suggest you read 6 Differences between TreeSet HashSet and LinkedHashSet in Java
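The two approaches compared in the answers above can be sketched side by side. The class and method names here (SortedUnique, viaTreeSet, viaHashSetThenSort) are illustrative only. Both return the same sorted list of unique strings, so a benchmark on your own data is what would settle which is faster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

class SortedUnique {
    // TreeSet: every insertion keeps the set sorted, O(n log m) for m unique strings.
    static List<String> viaTreeSet(List<String> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }

    // HashSet: O(n) expected time to deduplicate, then O(m log m) to sort the unique strings.
    static List<String> viaHashSetThenSort(List<String> input) {
        List<String> unique = new ArrayList<>(new HashSet<>(input));
        Collections.sort(unique);
        return unique;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("pear", "apple", "pear", "fig", "apple");
        System.out.println(viaTreeSet(input));
        System.out.println(viaHashSetThenSort(input));
    }
}
```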

Deciding a Big-O notation for an algorithm

I have questions about my assignment.
I need to decide what the Big-O characterization is for the following algorithm:
I'm guessing the answer for Question 1 is O(n) and for Question 2 is O(log n), but I'm kind of confused about how to state the reason. Are my answers correct? And could you explain why the characterization is like that?
Question 1 : O(n) because it increments by constant (1).
first loop O(n) second loop also O(n)
total O(n) + O(n) = O(n)
Question 2 : O(lg n), it's binary search.
It's O(lg n) because the problem halves every time.
If the array has size n at first, the next step has n/2, then n/4, ..., down to 1.
n/2^i = 1 => n = 2^i => i = log(n).
Yes, your answers are right. The first one is pretty simple: two separate for loops, so effectively it's O(n).
The second one is trickier. You divide the input size by 2 at each step, which leads to a time complexity of O(log n).
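The halving argument (n/2^i = 1, so i = log n) can be checked with a small counter. The assignment's own algorithms are not shown above, so this is a hypothetical stand-in with an invented name (HalvingSteps), not the code the question refers to.

```java
class HalvingSteps {
    // Counts how many times n can be halved (integer division) before reaching 1.
    // Assumes n >= 1. The result is floor(log2(n)).
    static int halvingSteps(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        int steps = 0;
        while (n > 1) {
            n /= 2;      // the problem size halves on every iteration
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(halvingSteps(1024)); // prints 10
    }
}
```

Doubling the input adds only one more step, which is what O(log n) growth looks like in practice.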

Why don't we consider stack frame sizes when calculating the space complexity of recursive procedures?

Consider the case of Merge Sort on an int array containing n elements: we need an additional array of size n in order to perform merges. We discard the additional array in the end, though. So the space complexity of Merge Sort comes out to be O(n).
But if you look at the recursive mergeSort procedure, on every recursive call mergeSort(something) one stack frame is added to the stack. And it does take some space, right?
public static void mergeSort(int[] a,int low,int high)
{
    if(low<high)
    {
        int mid=(low+high)/2;
        mergeSort(a,low,mid);
        mergeSort(a,mid+1,high);
        merge(a,mid,low,high);
    }
}
My questions are:
Why don't we take the size of stack frames into consideration while calculating Merge Sort complexity?
Is it because the stack contains only a few integer variables and one reference, which don't take much memory?
What if my recursive function creates a new local array (let's say int a[] = new int[n];)? Will it then be considered in calculating space complexity?
The space consumed by the stack should absolutely be taken into consideration, but some may disagree here (I believe some algorithms even make complexity claims ignoring this - there's an unanswered related question about radix sort floating around here somewhere).
Since we split the array in half at each recursive call, the size of the stack will be O(log n).
So, if we take it into consideration, the total space will be O(n + log n), which is just O(n) (because, in big-O notation, we can discard asymptotically smaller terms), so it doesn't change the complexity.
And for creating a local array, a similar argument applies. If you create a local array at each step, you end up with O(n + n/2 + n/4 + n/8 + ...) = O(2n) = O(n) (because, in big-O notation, we can discard constant factors), so that doesn't change the complexity either.
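The O(log n) stack depth can be observed directly by instrumenting the recursion. The sketch below is an assumption-laden variant of the question's mergeSort: the extra depth parameter, the shared tmp buffer, the merge implementation, and the class name MergeSortDepth are all additions for illustration, since the original merge method is not shown.

```java
class MergeSortDepth {
    static int maxDepth;   // deepest recursion level reached (instrumentation only)

    static void mergeSort(int[] a, int[] tmp, int low, int high, int depth) {
        maxDepth = Math.max(maxDepth, depth);
        if (low < high) {
            int mid = low + (high - low) / 2;
            mergeSort(a, tmp, low, mid, depth + 1);
            mergeSort(a, tmp, mid + 1, high, depth + 1);
            merge(a, tmp, low, mid, high);
        }
    }

    // Merges the sorted ranges [low, mid] and [mid + 1, high] using one shared O(n) buffer.
    static void merge(int[] a, int[] tmp, int low, int mid, int high) {
        System.arraycopy(a, low, tmp, low, high - low + 1);
        int i = low, j = mid + 1;
        for (int k = low; k <= high; k++) {
            if (i > mid) a[k] = tmp[j++];
            else if (j > high) a[k] = tmp[i++];
            else if (tmp[j] < tmp[i]) a[k] = tmp[j++];
            else a[k] = tmp[i++];
        }
    }

    // Sorts a in place and returns the maximum number of nested mergeSort calls.
    static int sortAndReportDepth(int[] a) {
        maxDepth = 0;
        if (a.length > 0) {
            mergeSort(a, new int[a.length], 0, a.length - 1, 1);
        }
        return maxDepth;
    }

    public static void main(String[] args) {
        System.out.println(sortAndReportDepth(new int[] {8, 3, 5, 1, 7, 2, 6, 4})); // prints 4
    }
}
```

For 8 elements the deepest chain is 4 frames (log2(8) + 1), while the buffer holds 8 ints, so the stack term stays well below the O(n) term as n grows.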
Because you are not calculating the space complexity when you do that. That is called determining: you run tests and try to conclude what the space complexity is by looking at the results. This is not a mathematical approach.
And yes, you are right with statement 2.

Time Complexity of my program

I want to know the exact time complexity of my algorithm in this method. I think it is O(n log n), as it uses Arrays.sort:
public static int largestElement(int[] num) throws NullPointerException // O(1)
{
    int a=num.length; // O(1)
    Arrays.sort(num); // O(1)? yes
    if(num.length<1) // O(1)
        return (Integer) null;
    else
        return num[a-1]; // O(1)
}
You seem to grossly contradict yourself in your post. You are correct in that the method is O(nlogn), but the following is incorrect:
Arrays.sort(num); // O(1)? yes
If you were right, the method would be O(1)! After all, a bunch of O(1) processes in sequence is still O(1). In reality, Arrays.sort() is O(nlogn), which determines the overall complexity of your method.
Finding the largest element in an array or collection can always be O(n), though, since we can simply iterate through each element and keep track of the maximum.
"You are only as fast as your slowest runner" --Fact
So the significant run-time operation here is your sort. Arrays.sort(num) on an int[] uses a Dual-Pivot Quicksort, which the JDK documents as offering O(n lg n) performance (where lg n is log base 2 of n) on typical inputs. Big-O describes an upper bound on growth, so the lower-order steps do not change it. Stepping through the array, if you did it, would take O(n).
So, we have O(n lg n) + O(n) + O(1) + ...
which is at most a constant multiple of n lg n, say O(2n lg n). But coefficients are dropped in asymptotic notation.
So your runtime is O(n lg n), as stated above.
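Indeed, it is O(n log n). For primitive arrays such as int[], Arrays.sort() uses a Dual-Pivot Quicksort (the merge-sort-based algorithm applies to object arrays). Using this method may not be the best way to find a max, though. You can just loop through your array, comparing the elements instead.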
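The "just loop through the array" suggestion might look like the sketch below. It is not the asker's method rewritten line for line: the class name LargestElement and the choice to throw IllegalArgumentException on a null or empty array are assumptions made here. One pass gives O(n) time and leaves the input array unsorted.

```java
class LargestElement {
    // Linear scan: O(n) time, O(1) extra space, does not reorder the array.
    static int largestElement(int[] num) {
        if (num == null || num.length == 0) {
            throw new IllegalArgumentException("array must be non-empty");
        }
        int max = num[0];
        for (int i = 1; i < num.length; i++) {
            if (num[i] > max) {
                max = num[i];
            }
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(largestElement(new int[] {3, 9, -2, 7})); // prints 9
    }
}
```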
