Time Complexity of my program - java

I want to know the exact time complexity of my algorithm in this method. I think it is O(n log n), as it uses Arrays.sort():
public static int largestElement(int[] num) throws NullPointerException // O(1)
{
    int a = num.length;    // O(1)
    Arrays.sort(num);      // O(1)? yes
    if (num.length < 1)    // O(1)
        return (Integer) null; // note: unboxing null throws NullPointerException here
    else
        return num[a - 1]; // O(1)
}

You seem to grossly contradict yourself in your post. You are correct in that the method is O(n log n), but the following is incorrect:
Arrays.sort(num); // O(1)? yes
If you were right, the method would be O(1)! After all, a bunch of O(1) processes in sequence is still O(1). In reality, Arrays.sort() is O(n log n), which determines the overall complexity of your method.
Finding the largest element in an array or collection can always be O(n), though, since we can simply iterate through each element and keep track of the maximum.
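A sketch of that O(n) approach (a minimal alternative, not the OP's code):

public static int largestElement(int[] num) {
    if (num == null || num.length == 0) {
        throw new IllegalArgumentException("array must be non-empty");
    }
    int max = num[0];                      // running maximum
    for (int i = 1; i < num.length; i++) {
        if (num[i] > max) {
            max = num[i];                  // one comparison per element: O(n) total
        }
    }
    return max;
}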

"You are only as fast as your slowest runner" --Fact
So the significant run-time operations here are the sorting and the stepping through the array. Arrays.sort(num) is an efficient comparison-based sort, so we can take it to be O(n lg(n)) (where lg(n) is log base 2 of n); note that big-O notation gives an upper bound on the running time. Stepping through an array takes O(n).
So we have O(n lg(n)) + O(n) + O(1) + ...
which reduces to O(n lg(n)): lower-order terms and constant coefficients are negligible in asymptotic notation.
So your runtime is O(n lg(n)), as stated above.

Indeed, it is O(n log n). For an int[], Arrays.sort() uses a dual-pivot quicksort (for object arrays it uses TimSort, a merge-sort variant). Using this method may not be the best way to find a max, though: you can just loop through your array, comparing the elements instead, which is O(n).

What's the time complexity of sorting a list of objects with two properties?

Suppose I have a class:
public class Interval {
    int start;
    int end;
    Interval() { start = 0; end = 0; }
    Interval(int s, int e) { start = s; end = e; }
}
I would like to sort a list of intervals with Collections.sort() like this:
Collections.sort(intervals, new Comparator<Interval>() {
    @Override
    public int compare(Interval o1, Interval o2) {
        if (o1.start == o2.start) {
            return o1.end - o2.end;     // note: subtraction can overflow; Integer.compare is safer
        } else {
            return o1.start - o2.start;
        }
    }
});
I know that sorting an array with the built-in sorting function takes O(nlogn) time, and the question is if I am sorting a list of objects with two properties, what is the time complexity of sorting this list? Thanks!!
@PaulMcKenzie's brief answer in the comments is on the right track, but the full answer to your question is more subtle.
Many people do what you've done and confuse time with other measures of efficiency. What's correct in nearly all cases when someone says a "sort is O(n log n)" is that the number of comparisons is O(n log n).
I'm not trying to be pedantic. Sloppy analysis can make big problems in practice. You can't claim that any sort runs in O(n log n) time without a raft of additional statements about the data and the machine where the algorithm is running. Research papers usually do this by giving a standard machine model used for their analysis. The model states the time required for low level operations - memory access, arithmetic, and comparisons, for example.
In your case, each object comparison requires a constant number (2) of value comparisons. So long as value comparison itself is constant time -- true in practice for fixed-width integers -- O(n log n) is an accurate way to express run time.
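For illustration, here is a sketch of an equivalent comparator built from the JDK's comparator combinators (class and data values are illustrative); comparingInt also avoids the overflow risk of the subtraction idiom in the question:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class IntervalSortDemo {
    public static void main(String[] args) {
        List<Interval> intervals = new ArrayList<>();
        intervals.add(new Interval(3, 5));
        intervals.add(new Interval(1, 4));
        intervals.add(new Interval(1, 2));

        // Each compare() call performs at most two int comparisons (O(1) work),
        // so the sort as a whole performs O(n log n) constant-time comparisons.
        intervals.sort(Comparator.comparingInt((Interval i) -> i.start)
                                 .thenComparingInt(i -> i.end));

        for (Interval i : intervals) {
            System.out.println(i.start + ".." + i.end);
        }
    }
}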
However, something as simple as string sorting changes this picture. String comparison itself has a variable cost. It depends on string length! So sorting strings with a "good" sorting algorithm is O(nk log n), where k is the length of strings.
Ditto if you're sorting variable-length numbers (java BigIntegers for example).
Sorting is also sensitive to copy costs. Even if you can compare objects in constant time, sort time will depend on how big they are. Algorithms differ in how many times objects need to be moved in memory. Some accept more comparisons in order to do less copying. An implementation detail: sorting pointers vs. objects can change asymptotic run time - a space for time trade.
But even this has complications. After you've sorted pointers, touching the sorted elements in order hops around memory in arbitrary order. This can cause terrible memory hierarchy (cache) performance. Analysis that incorporates memory characteristics is a big topic in itself.
Big-O notation neglects the least significant contributing factors:
for example, if your complexity is n + 1, the n is kept and the 1 neglected.
So the answer is the same: O(n log n).
Your comparator just adds a constant amount of work per comparison, which doesn't change the bound.
You should read the Collections.sort() documentation (linked here):
this algorithm guarantees n log(n) performance.
Note: a Comparator doesn't change the complexity, as long as its compare method runs in constant time (no loops inside it).
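A sketch illustrating that note (array contents are illustrative): the comparator's own cost multiplies into the total. Comparing fixed-width ints is O(1) per call, so the sort stays O(n log n); comparing strings of length k costs O(k) per call, making that sort O(k * n log n) overall:

import java.util.Arrays;

public class ComparatorCostDemo {
    public static void main(String[] args) {
        // O(1) per comparison: total sort cost O(n log n)
        Integer[] nums = {3, 1, 2};
        Arrays.sort(nums, Integer::compare);

        // O(k) per comparison, where k is the string length: O(k * n log n) total
        String[] words = {"banana", "apple", "cherry"};
        Arrays.sort(words, String::compareTo);

        System.out.println(Arrays.toString(nums));
        System.out.println(Arrays.toString(words));
    }
}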

Should I use TreeSet or HashSet?

I have a large number of strings, and I need to print the unique strings in sorted order.
TreeSet stores them in sorted order, but insertion time is O(log n) for each insertion. HashSet takes O(1) per add, but then I will have to get a list from the set and sort it using Collections.sort(), which takes O(n log n) (I assume there is no memory overhead here, since only the references to the Strings will be copied into the new collection, i.e. the List). Is it fair to say that overall either choice is the same, since in the end the total time will be the same?
That depends on how close you look. Yes, the asymptotic time complexity is O(n log n) in either case, but the constant factors differ. So it's not as if one method can be 100 times faster than the other, but it's certainly possible that one method is twice as fast as the other.
For most parts of a program, a factor of 2 is totally irrelevant, but if your program actually spends a significant part of its running time in this algorithm, it would be a good idea to implement both approaches, and measure their performance.
Measuring is the way to go, but if you're talking purely theoretically and ignoring reads after sorting, then consider, for a number of strings x:
HashSet:
x O(1) add operations + one O(n log n) sort (where n is x) = approximately O(n + n log n) (OK, that's a gross oversimplification, but...)
TreeSet:
x O(log n) insertions (where n grows from 1 to x) + no sort operation = approximately O(n log(n/2)) (also a gross oversimplification, but...)
And continuing in the oversimplification vein, O(n + n log n) > O(n log(n/2)). Maybe TreeSet is the way to go?
If you distinguish the total number of strings (n) and number of unique strings (m), you get more detailed results for both approaches:
Hash set + sort: O(n) + O(m log m)
TreeSet: O(n log m)
So if n is much bigger than m, using a hash set and sorting the result should be slightly better.
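A sketch of both approaches for comparison (input data is illustrative); each prints the unique strings in sorted order:

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class UniqueSortedDemo {
    public static void main(String[] args) {
        List<String> input = List.of("pear", "apple", "pear", "banana", "apple");

        // Approach 1: HashSet for O(n) deduplication, then O(m log m) sort of m unique strings
        Set<String> unique = new HashSet<>(input);
        List<String> sorted = new ArrayList<>(unique);
        Collections.sort(sorted);
        System.out.println(sorted);        // [apple, banana, pear]

        // Approach 2: TreeSet deduplicates and sorts on insertion, O(n log m) total
        Set<String> treeSet = new TreeSet<>(input);
        System.out.println(treeSet);       // iterates in sorted order: [apple, banana, pear]
    }
}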
You should take into account which methods will be executed more frequently and base your decision on that.
Apart from HashSet and TreeSet, there is also LinkedHashSet, which offers HashSet-like performance while preserving insertion order; note that it does not keep elements sorted, so it doesn't solve this problem by itself. If you want to learn more about their differences in performance, I suggest you read 6 Differences between TreeSet HashSet and LinkedHashSet in Java.

Time Complexity of a Recursion Function

Suppose I'm using the following code to reverse print a linked list:
public void reverse() {
    if (head != null) {      // guard against an empty list
        reverse(head);
    }
}

private void reverse(Node h) {
    if (h.next == null) {    // last node: start printing here
        System.out.print(h.data + " ");
        return;
    }
    reverse(h.next);                // recurse to the end first...
    System.out.print(h.data + " "); // ...then print on the way back
}
The linked list is printed out in the opposite order, but I don't know how efficient it is. How would I determine the time complexity of this function? Is there a more efficient way to do this?
Calculating time complexity of recursive algorithms in general is hard. However, there are plenty of resources available. I would start at this stackoverflow question Time complexity of a recursive algorithm.
As for the time complexity of this function, it is O(n), because you call reverse n times (once per node). There is no more efficient way to reverse, or even print, a list: the problem itself requires you to at least look at every element, which by definition is an O(n) operation.
Suppose your list has n elements. Each call to reverse(Node) reduces the length of the list by a single element. The efficiency is therefore O(n), which is clearly optimal: you can't reverse a list without considering all the elements.
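The recursive version is O(n) in time, as the answers above say; note, though, that it also uses O(n) call-stack space, so a very long list can overflow the stack. A sketch of an equivalent iterative version using an explicit stack (java.util.ArrayDeque; the Node class below is a minimal stand-in for the question's):

import java.util.ArrayDeque;
import java.util.Deque;

public class ReversePrint {
    // Minimal Node, mirroring the one the question assumes (data and next fields)
    static class Node {
        int data;
        Node next;
        Node(int data, Node next) { this.data = data; this.next = next; }
    }

    // Same O(n) time as the recursive version, but the stack is explicit,
    // so recursion depth is no longer a concern.
    static void reversePrint(Node head) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Node n = head; n != null; n = n.next) {
            stack.push(n.data);                  // first pass: O(n) pushes
        }
        while (!stack.isEmpty()) {
            System.out.print(stack.pop() + " "); // pops come out in reverse order
        }
    }

    public static void main(String[] args) {
        reversePrint(new Node(1, new Node(2, new Node(3, null)))); // prints: 3 2 1
    }
}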
You can use a recursion tree or just expand T(n); both are essentially the same method. What you are doing is expanding the recurrence by noting down what the function does each time it is called.
For example, each time your function is called, it does some constant-time work d (printing the data) and then recurses on a list that is one node shorter.
So, expanding it, you get:
T(n) = d + T(n-1)   {one recursive call done, one fewer node to go}
     = d + d + T(n-2)
     = ...
     = n*d + T(0)
and it stops when the list runs out. So your function is called once per node of the list. Hence the complexity: O(n).
Check out this: Time complexity of a recursive algorithm

How efficient is Arrays.sort(...) for a 2-million-element array that is already sorted

I need to execute another sort on an array of 2 million elements using the Arrays.sort(..) method. In order not to keep another dirty flag, I was wondering how costly this method call is for an already-sorted array.
Any thoughts?
It depends on the content of your array. Sorting primitives uses a dual-pivot quicksort, as per the docs. This has an average-case complexity of O(n log n), though the worst case is O(n^2).
Sorting Objects uses TimSort (docs), a modified Merge Sort. According to the docs, TimSort for a nearly-sorted (or, presumably, sorted) array takes approximately n comparisons.
It would be far cheaper still for you to keep a dirty flag rather than suffer the O(n) compares.
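For illustration, a minimal dirty-flag sketch (a hypothetical wrapper class, not from the question):

import java.util.Arrays;

public class SortedView {
    private final int[] data;
    private boolean dirty;

    SortedView(int[] data) {
        this.data = data;
        this.dirty = true; // initial order unknown, so sort on first access
    }

    void set(int index, int value) {
        data[index] = value;
        dirty = true;      // remember that the order may now be broken
    }

    int[] sorted() {
        if (dirty) {       // sort only when something actually changed
            Arrays.sort(data);
            dirty = false;
        }
        return data;       // already sorted: no work at all, not even O(n) compares
    }

    public static void main(String[] args) {
        SortedView view = new SortedView(new int[]{3, 1, 2});
        System.out.println(Arrays.toString(view.sorted())); // sorts: [1, 2, 3]
        System.out.println(Arrays.toString(view.sorted())); // no re-sort needed
    }
}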
For primitive arrays:
Best case performance: O(n log n)
Worst case performance: O(n^2)
For more detailed information read: http://en.wikipedia.org/wiki/Quicksort
For Object[] arrays:
Worst case performance: O(n log n)
Best case performance: O(n)
For more detailed information read: http://en.wikipedia.org/wiki/Timsort
Arrays.sort() on an Object[] uses TimSort, a merge-sort variant built on a divide-and-conquer approach, with O(n log n) typical cost; in the best case, where you already have a sorted array, it can be O(n).
Per this thread, for a primitive array the sorting runtime will be O(n log n) whether the array is already sorted or not, since Arrays.sort() uses a modified version of quicksort.
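If you want to see the cost on your actual data, here is a quick timing sketch (non-rigorous; a proper benchmark would use a harness such as JMH to account for JIT warm-up):

import java.util.Arrays;

public class SortedSortCost {
    public static void main(String[] args) {
        int[] data = new int[2_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;                 // the array is already sorted
        }
        long start = System.nanoTime();
        Arrays.sort(data);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Sorting an already-sorted 2M int[] took " + elapsedMs + " ms");
    }
}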

Why don't we consider stack frame sizes while calculating the space complexity of recursive procedures?

Consider the case of merge sort on an int array containing n elements: we need an additional array of size n in order to perform the merges (we discard the additional array at the end, though). So the space complexity of merge sort comes out to be O(n).
But if you look at the recursive mergeSort procedure, every recursive call mergeSort(something) adds one stack frame to the stack. And that does take some space, right?
public static void mergeSort(int[] a, int low, int high)
{
    if (low < high)
    {
        int mid = low + (high - low) / 2; // avoids the int overflow risk of (low + high) / 2
        mergeSort(a, low, mid);
        mergeSort(a, mid + 1, high);
        merge(a, mid, low, high);
    }
}
My questions are:
1. Why don't we take the size of stack frames into consideration when calculating merge sort's space complexity?
2. Is it because the stack contains only a few integer variables and one reference, which don't take much memory?
3. What if my recursive function creates a new local array (let's say int[] a = new int[n])? Will that be considered when calculating space complexity?
The space consumed by the stack should absolutely be taken into consideration, but some may disagree here (I believe some algorithms even make complexity claims ignoring this - there's an unanswered related question about radix sort floating around here somewhere).
Since we split the array in half at each recursive call, the size of the stack will be O(log n).
So, if we take it into consideration, the total space will be O(n + log n), which is just O(n) (because, in big-O notation, we can discard asymptotically smaller terms), so it doesn't change the complexity.
And for creating a local array, a similar argument applies. If you create a local array at each step, you end up with O(n + n/2 + n/4 + n/8 + ...) = O(2n) = O(n) (because, in big-O notation, we can discard constant factors), so that doesn't change the complexity either.
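A small sketch (a hypothetical depth counter, not part of the question's code) that makes the O(log n) stack growth visible by tracking the maximum recursion depth:

public class MergeSortDepth {
    private static int depth = 0;
    private static int maxDepth = 0;

    // Same recursion structure as mergeSort, with the merge step omitted:
    // we only want to observe how deep the call stack gets.
    static void recurse(int low, int high) {
        if (low >= high) return;
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        int mid = low + (high - low) / 2;
        recurse(low, mid);
        recurse(mid + 1, high);
        depth--;
    }

    public static void main(String[] args) {
        recurse(0, 1_000_000 - 1);
        System.out.println("Max recursion depth: " + maxDepth); // roughly 20 ≈ log2(1,000,000)
    }
}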
Because you are not calculating the space complexity when you do that; you are determining it empirically: running tests and trying to conclude what the space complexity is by looking at the results. That is not a mathematical approach.
And yes, you are right with question 2: the frames here hold only a few integer variables and one reference, so each frame takes O(1) space.
