Big Oh for (n log n) [closed] - java

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I am currently studying basic algorithms and Big-O notation. I was wondering if anyone could show me what O(n log n) code in Java would look like, or direct me to any SO page where an example exists.
Since I am just a beginner, I can only imagine the code before I write it. So, theoretically (at least), it should contain one for loop that runs something like n times. Then for the log n we can use a while loop, so the outer loop is executed n times and the while loop is executed log base 2 of n times. At least that is how I am imagining it in my head, but seeing the code would clear things up.

int n = 100;
for (int i = 0; i < n; i++)          // this loop is executed n times, so O(n)
{
    for (int j = n; j > 0; j /= 2)   // this loop is executed O(log n) times
    {
        // constant-time work would go here
    }
}
Explanation:
The outer for loop should be clear; it is executed n times. Now to the inner loop. In the inner loop, you take n and always divide it by 2. So, you ask yourself: How many times can I divide n by 2?
It turns out that this is O(log n). In fact, the base of the log is 2, but in Big-O notation we drop the base, since changing it only multiplies the log by a constant factor, which we are not interested in.
So, you are executing a loop n times, and within that loop, you are executing another loop log(n) times. So, you have O(n) * O(log n) = O(n log n).

A very popular O(n log n) algorithm is merge sort. See http://en.wikipedia.org/wiki/Merge_sort for the algorithm and pseudocode. The log n part of the algorithm is achieved by breaking the problem down into smaller subproblems; the height of the recursion tree is log n.
A lot of sorting algorithms have a running time of O(n log n). Refer to http://en.wikipedia.org/wiki/Sorting_algorithm for more examples.

Algorithms with a O(.) time complexity involving log n's typically involve some form of divide and conquer.
For example, in MergeSort the list is halved, each part is individually merge-sorted, and then the two halves are merged together. Each sublist is halved again at the next level of recursion.
Whenever you have work being halved or reduced in size by some fixed factor, you'll usually end up with a log n component of the O(.).
In terms of code, take a look at the algorithm for MergeSort. The important feature, of typical implementations, is that it is recursive (note that TopDownSplitMerge calls itself twice in the code given on Wikipedia).
All good standard comparison-based sorting algorithms have O(n log n) time complexity, and it is not possible to do better in the worst case; see Comparison Sort.
To see what this looks like in Java code, just search! Here's one example.
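For illustration, a minimal top-down merge sort sketch in Java (my own, not the example linked above) might look like this:

import java.util.Arrays;

public class MergeSortSketch {
    // Top-down merge sort: split in half (log n levels of recursion),
    // merge O(n) elements per level, giving O(n log n) overall.
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) return a;
        int mid = a.length / 2;
        int[] left  = mergeSort(Arrays.copyOfRange(a, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        return merge(left, right);
    }

    static int[] merge(int[] left, int[] right) {
        int[] out = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length)
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        while (i < left.length)  out[k++] = left[i++];
        while (j < right.length) out[k++] = right[j++];
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(mergeSort(new int[]{5, 1, 4, 2, 3})));
        // prints [1, 2, 3, 4, 5]
    }
}

Each level of recursion does O(n) merging work and there are O(log n) levels, which is where the O(n log n) comes from.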

http://en.wikipedia.org/wiki/Heapsort
A simple example is just like you described: execute, n times, some operation that takes log(n) time.
Balanced binary trees have log(n) height, so some tree algorithms will have such complexity, for example the sketch below.
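A sketch of that idea (my own illustration): inserting n keys into a balanced tree such as java.util.TreeMap costs O(log n) per insertion, so the whole loop is O(n log n).

import java.util.TreeMap;

public class TreeInsertExample {
    public static void main(String[] args) {
        int n = 100_000;
        TreeMap<Integer, Integer> tree = new TreeMap<>();   // red-black tree: O(log n) per put
        for (int i = 0; i < n; i++) {                        // n insertions
            tree.put(i, i * i);
        }
        System.out.println(tree.size());                     // 100000; total work is O(n log n)
    }
}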

Related

Calculating time complexity by just seeing the algorithm code

I have learned the code for all the common sorting algorithms and understood how they work. However, as part of this, one should also be able to find the time and space complexity. I have seen people derive the complexity just by looking at the loops. Can someone guide me towards the best practice for achieving this? The given example code is for "Shell sort". What strategy should be used to understand and calculate the complexity from the code itself, something like the step-count method? I need to understand how to do asymptotic analysis from the code itself.
int i, n = a.length, diff = n / 2, interchange, temp;
while (diff > 0) {
    interchange = 0;
    for (i = 0; i < n - diff; i++) {
        if (a[i] > a[i + diff]) {        // inverted pair found: swap the two elements
            temp = a[i];
            a[i] = a[i + diff];
            a[i + diff] = temp;
            interchange = 1;
        }
    }
    if (interchange == 0) {              // no swaps in this pass: halve the gap
        diff = diff / 2;
    }
}
Since the absolute lower bound on the worst case of a comparison-sorting algorithm is Ω(n log n), evidently one can't do any better than that. That lower bound holds here as well; the detailed analysis below shows that this particular algorithm is in fact O(n^2) in the worst case.
Worst-case time complexity:
1. Inner loop
Let's first start analyzing the inner loop:
for (i = 0; i < n - diff; i++) {
    if (a[i] > a[i + diff]) {
        temp = a[i];
        a[i] = a[i + diff];
        a[i + diff] = temp;
        interchange = 1;
    }
}
Since we don't know much (anything) about the structure of a at this level, it is definitely possible that the condition holds, and thus a swap occurs. A conservative analysis thus says that interchange can be either 0 or 1 at the end of the loop. We know, however, that if it is 1, we will execute the loop a second time with the same diff value.
As you comment yourself, the loop will be executed O(n-diff) times. Since all instructions inside the loop take constant time, the time complexity of the loop itself is O(n-diff) as well.
Now the question is how many times can interchange be 1 before it turns to 0. The maximum bound is that an item that was placed at the absolute right is the minimal element, and thus will keep "swapping" until it reaches the start of the list. So the inner loop itself is repeated at most: O(n/diff) times. As a result the computational effort of the loop is worst-case:
O((n-diff) * n/diff) = O(n^2/diff - n)
2. Outer loop with different diff
The outer loop relies on the value of diff. It starts with a value of n/2. As long as interchange equals 1 at the end of a pass (something we cannot prove will not be the case), another pass is performed with the same diff; once a pass ends with interchange equal to 0, diff is set to diff/2. This is repeated until diff reaches 0, so diff takes all powers of 2 up to n/2:
1 2 4 8 ... n/2
Now we can make an analysis by summing:

 log2(n)
 -------
   \
   /     O(n^2/2^i - n)  =  O(n^2)
 -------
  i = 0

where i represents log2(diff) of a given iteration. If we work this out, we get O(n^2) worst-case time complexity.
Note (on the lower bound of worst-case comparison sorting): One can prove that no comparison-sort algorithm exists with a worst-case time complexity better than O(n log n).
This is because for a list with n items, there are n! possible orderings. For each ordering, there is a different way one needs to reorganize the list.
Since a comparison can, at best, split the set of possible orderings into two equal parts, it will require at least log2(n!) comparisons to find out which ordering we are dealing with. The complexity of log2(n!) can be estimated using the Stirling approximation:
integral from 1 to n of log(x) dx  =  n log n - n  =  O(n log n)
Best-case time complexity: in the best case, the list is evidently already ordered. In that case the inner loop never executes the if-then part, so interchange is never set to 1, and diff is therefore halved after every single pass of the for loop. The outer loop is still repeated O(log n) times, each pass costing O(n), thus the best-case time complexity is O(n log n).
Look at the loops and try to figure out how many times they execute. Start from the innermost ones.
In the given example (not the easiest one to begin with), the for loop (innermost) is executed for i in the range [0, n-diff), i.e. it is executed exactly n-diff times.
What is done inside that loop doesn't really matter as long as it takes "constant time", i.e. there is a finite number of atomic operations.
Now the outer loop is executed as long as diff>0. This behavior is complex because an iteration can decrease diff or not (it is decreased when no inverted pair was found).
Now you can say that diff will be decreased log(n) times (because it is halved until 0), and between every decrease the inner loop is run "a certain number of times".
An exercised eye will also recognize interleaved passes of bubblesort and conclude that this number of times will not exceed the number of elements involved, i.e. n-diff, but that's about all that can be said "at a glance".
A complete analysis of the algorithm is a horrible mess, as the array gets progressively better and better sorted, which will influence the number of inner-loop passes.
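If the closed-form analysis gets too messy, one pragmatic complement is to instrument the code and watch how the work grows as n doubles. A rough sketch of that idea (the counter, class name, and test harness are my own additions around the loop from the question):

import java.util.Random;

public class ShellSortCount {
    // Same gap-based loop structure as in the question, with a counter
    // added to record how many comparisons are performed in total.
    static long countedSort(int[] a) {
        long comparisons = 0;
        int n = a.length, diff = n / 2, interchange, temp;
        while (diff > 0) {
            interchange = 0;
            for (int i = 0; i < n - diff; i++) {
                comparisons++;                      // one comparison per inner iteration
                if (a[i] > a[i + diff]) {
                    temp = a[i];
                    a[i] = a[i + diff];
                    a[i + diff] = temp;
                    interchange = 1;
                }
            }
            if (interchange == 0) {
                diff = diff / 2;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int n = 1_000; n <= 32_000; n *= 2) {
            int[] a = rnd.ints(n, 0, n).toArray();  // random input of size n
            System.out.println(n + " elements -> " + countedSort(a) + " comparisons");
        }
    }
}

If doubling n roughly quadruples the count, that points at O(n^2); if it only slightly more than doubles, that points at O(n log n). This is only a sanity check, not a substitute for the analysis above.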

What is the time complexity for this algorithm?

public static void Comp(int n)
{
    int count = 0;
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < n; j++)
        {
            for (int k = 1; k < n; k *= 2)   // k doubles each time, so about log2(n) iterations
            {
                count++;
            }
        }
    }
    System.out.println(count);
}
Does anyone know what the time complexity is, and what the Big O is?
Please can you explain this to me, step by step?
Whoever gave you this problem is almost certainly looking for the answer n^2 log(n), for reasons explained by others.
However the question doesn't really make any sense. If n > 2^30, k will overflow, making the inner loop infinite.
Even if we treat this problem as being completely theoretical, and assume n, k and count aren't Java ints, but some theoretical integer type, the answer n^2 log n assumes that the operations ++ and *= have constant time complexity, no matter how many bits are needed to represent the integers. This assumption isn't really valid.
Update
It has been pointed out to me in the comments below that, based on the way the hardware works, it is reasonable to assume that ++, *=2 and < all have constant time complexity, no matter how many bits are required. This invalidates the third paragraph of my answer.
In theory this is O(n^2 * log(n)).
Each of the two outer loops is O(n), and the inner one is O(log(n)), because log base 2 of n is the number of times you have to divide n by 2 to get to 1.
Also, this is a tight bound, i.e. the code is also Θ(n^2 * log(n)).
The time complexity is O(n^2 log n). Why? Each for-loop is a function of n, and you multiply by n for each for-loop, except for the inner loop, which grows as log n. Why? Because on each iteration k is multiplied by 2. Think of merge sort or binary search trees.
details
for the first two loops: a summation of 1 for i from 0 to n-1, which is n, so the first two loops give n * n = n^2 = O(n^2)
for the k loop, k grows as 1, 2, 4, 8, 16, 32, ..., so after m iterations 2^m = n. Take the log of both sides and you get m = log n
Again, not clear? Recall the geometric series sum: the sum of a^i for i = m to k is (a^m - a^(k+1)) / (1 - a). So if we set m = 0 and a = 2, we get (1 - 2^(k+1)) / (-1), i.e. roughly 2^(k+1). Why is a = 2? Because that is the a value for which the series yields 2, 4, 8, 16, ..., 2^k.
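As a quick sanity check on the derivation (this harness is my own addition, not part of the question), you can compare the printed count with n^2 * ceil(log2 n) for a few powers of two:

public class CompCheck {
    // Same triple loop as in the question, but returning the count so it
    // can be compared against the formula n^2 * ceil(log2 n).
    static long comp(int n) {
        long count = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int k = 1; k < n; k *= 2)
                    count++;
        return count;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 1024; n *= 2) {
            long log2 = 32 - Integer.numberOfLeadingZeros(n - 1);   // ceil(log2 n) for n >= 2
            System.out.println("n=" + n + "  count=" + comp(n)
                    + "  n^2*ceil(log2 n)=" + (long) n * n * log2);
        }
    }
}

For powers of two the two numbers coincide exactly, which matches the O(n^2 log n) analysis above.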

Counting sort and usage in Java applications [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I did an algorithms refresher and I (re)read about a sorting algorithm that runs in linear time, namely Counting Sort.
To be honest I had forgotten about it.
I understand the structure and logic, and the fact that it runs in linear time is a very attractive quality.
But I have the following question:
As I understand it, the concrete implementation of the algorithm relies on 2 things:
1) The range of input numbers is small (otherwise the intermediate array will be huge and with many gaps).
2) We actually know the range of numbers.
Taking it that these 2 assumptions are correct (please correct me otherwise), I was wondering what the best application domain for this algorithm is.
I mean specifically in Java, is an implementation like the following Java/Counting sort sufficient:
public static void countingSort(int[] a, int low, int high)
{
    int[] counts = new int[high - low + 1]; // this will hold all possible values, from low to high
    for (int x : a)
        counts[x - low]++; // - low so the lowest possible value is always 0

    int current = 0;
    for (int i = 0; i < counts.length; i++)
    {
        Arrays.fill(a, current, current + counts[i], i + low); // fills counts[i] elements of value i + low starting at current
        current += counts[i]; // leap forward by counts[i] steps
    }
}
or is it not a trivial matter to come up with high and low?
Is there a specific application in Java that counting sort is best suited for?
I assume there are subtleties like these, otherwise why would anyone bother with all the O(n log n) algorithms?
Algorithms are not about the language, so this is language-agnostic. As you have said - use counting sort when the domain is small. If you have only three numbers - 1, 2, 3 it is far better to sort them with counting sort, than a quicksort, heapsort or whatever which are O(nlogn). If you have a specific question, feel free to ask.
It is incorrect to say counting sort is O(n) in the general case; the "small range of elements" condition is not a recommendation but a key assumption.
Counting sort assumes that each of the elements is an integer in the range 1 to k, for some integer k. When k = O(n), the counting sort runs in O(n) time.
In the general situation, the range of keys k is independent of the number of elements n and can be arbitrarily large. For example, the following array:
{1, 1000000000000000000000000000000000000000000000000000000000000000000000000000}
As for the exact break-even point of k and n where counting sort outperforms a traditional sort, it is hugely dependent on the implementation and is best found via benchmarking (trying both out).
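Regarding high and low: if the range is not known up front, deriving it is usually just one extra O(n) scan over the array, which does not change the overall complexity. A hedged sketch (the wrapper, class name, and demo are my own additions around the method from the question):

import java.util.Arrays;

public class CountingSortDemo {

    // The countingSort from the question, unchanged in spirit.
    public static void countingSort(int[] a, int low, int high) {
        int[] counts = new int[high - low + 1];
        for (int x : a)
            counts[x - low]++;
        int current = 0;
        for (int i = 0; i < counts.length; i++) {
            Arrays.fill(a, current, current + counts[i], i + low);
            current += counts[i];
        }
    }

    // Convenience wrapper: derives low/high with one extra O(n) scan,
    // so the caller does not need to know the range in advance.
    public static void countingSort(int[] a) {
        if (a.length == 0) return;
        int low = a[0], high = a[0];
        for (int x : a) {
            if (x < low)  low = x;
            if (x > high) high = x;
        }
        // Only sensible when (high - low) is not huge, otherwise the
        // counts array dominates both memory and time.
        countingSort(a, low, high);
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 9, 3, 1, 7};
        countingSort(a);
        System.out.println(Arrays.toString(a));   // prints [1, 3, 3, 5, 7, 9]
    }
}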

Time Complexity of Fibonacci Algorithm [duplicate]

This question already has answers here:
Computational complexity of Fibonacci Sequence
(12 answers)
Closed 6 years ago.
So, I've got a recursive method in Java for getting the n-th Fibonacci number. The only question I have is: what's the time complexity? I think it's O(2^n), but I may be mistaken. (I know that iterative is way better, but it's an exercise.)
public int fibonacciRecursive(int n)
{
    if (n == 1 || n == 2) return 1;
    else return fibonacciRecursive(n - 2) + fibonacciRecursive(n - 1);
}
Your recursive code has exponential runtime, but I don't think the base is 2; it is probably the golden ratio (about 1.62). Of course O(1.62^n) is automatically O(2^n) too.
The runtime can be calculated recursively:
t(1)=1
t(2)=1
t(n)=t(n-1)+t(n-2)+1
This is very similar to the recursive definition of the Fibonacci numbers themselves. The +1 in the recursive equation is probably irrelevant for large n, so I believe that it grows approximately as fast as the Fibonacci numbers, and those grow exponentially with the golden ratio as base.
You can speed it up using memoization, i.e. caching already calculated results. Then it has O(n) runtime just like the iterative version.
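A minimal memoized sketch of the method above (my own illustration, using a HashMap as the cache) could look like this:

import java.util.HashMap;
import java.util.Map;

public class Fib {
    private final Map<Integer, Long> cache = new HashMap<>();

    // Same recursion as in the question, but each result is cached,
    // so every value of n is computed only once: O(n) time overall.
    public long fibonacciMemo(int n) {
        if (n == 1 || n == 2) return 1L;
        Long cached = cache.get(n);
        if (cached != null) return cached;
        long result = fibonacciMemo(n - 2) + fibonacciMemo(n - 1);
        cache.put(n, result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(new Fib().fibonacciMemo(50));   // 12586269025
    }
}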
Your iterative code has a runtime of O(n)
You have a simple loop with O(n) steps and constant time for each iteration.
You can use this
to calculate Fn in O(log n)
Each function call does exactly one addition, or returns 1. The base cases only return the value one, so the total number of additions is fib(n)-1. The total number of function calls is therefore 2*fib(n)-1, so the time complexity is Θ(fib(N)) = Θ(phi^N), which is bounded by O(2^N).
O(2^n)? I see only O(n) here.
I wonder why you'd continue to calculate and re-calculate these? Wouldn't caching the ones you have be a good idea, as long as the memory requirements didn't become too odious?
Since they aren't changing, I'd generate a table and do lookups if speed mattered to me.
It's easy to see (and to prove by induction) that the total number of calls to fibonacciRecursive is exactly equal to the final value returned. That is indeed exponential in the input number.

BigO running time on some methods

Ok, these are all pretty simple methods, and there are a few of them, so I didn't want to create multiple questions when they are all about the same thing. Big-O is my weakness. I just can't figure out how they come up with these answers. Is there any way you can give me some insight into your thinking for analyzing the running times of some of these methods? How do you break it down? How should I think when I see something like these? (Specifically the second one, I don't get how that's O(1).)
function f1:
loop 3 times
loop n times
Therefore O(3*n) which is effectively O(n).
function f2:
loop 50 times
O(50) is effectively O(1).
We know it will loop 50 times because it keeps subtracting n / 50 until the result reaches 0, and for that to happen it must iterate 50 times (n - (n / 50) * 50 = 0).
function f3:
loop n times
loop n times
Therefore O(n^2).
function f4:
recurse n times
You know this because worst case is that n = high - low + 1. Disregard the +1.
That means that n = high - low.
To terminate,
arr[hi] * arr[low] > 10
Assume that this doesn't occur until low is incremented to the highest it can go (high).
This means n = high - 0 and we must recurse up to n times.
function 5:
loops ceil(log_2(n)) times
We know this because of the m/=2.
For example, let n = 10. log_2(10) ≈ 3.3, the ceiling of which is 4.
10 / 2 = 5
5 / 2 = 2.5
2.5 / 2 = 1.25
1.25 / 2 = 0.625
In total, there are 4 iterations.
You get an n^2 analysis when performing a loop within a loop, such as the third method.
However, the first method doesn't get an n^2 timing analysis, because the first loop is defined as running only three times. This makes the timing for the first one 3n, but we don't care about constant factors in Big-O.
The second one introduces an interesting case: despite the fact that you have a single loop, the timing analysis is still O(1), because the number of iterations does not depend on n. If you were to chart the time it takes to perform this method, it would not grow with n; for larger numbers this becomes obvious.
For the fourth method, you have an O(n) timing because your recursive function call is passing lo + 1. This is similar to using a for loop and incrementing with lo++/++lo.
The last one has an O(log n) timing because you're dividing your variable by two. Just remember that anything that reminds you of a binary search will have a log n timing.
There is also another trick to timing analysis. Say you had a loop within a loop, and within each of the two loops you were reading lines from a file or popping elements off a stack. This would actually only be an O(n) method, because a file only has a certain number of lines you can read, and a stack only has a certain number of elements you can pop off.
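A contrived sketch of my own (not code from the question) showing that shape: the loops are nested, but both drain the same stack, so the loop body can run at most n times in total.

import java.util.ArrayDeque;
import java.util.Deque;

public class SharedStackExample {
    // Looks like a loop within a loop, but every inner iteration pops from
    // the SAME stack, so across the whole method the body runs at most n
    // times in total: O(n), not O(n^2).
    static int drain(Deque<Integer> stack) {
        int sum = 0;
        while (!stack.isEmpty()) {                                // "outer loop"
            for (int i = 0; i < 10 && !stack.isEmpty(); i++) {    // "inner loop"
                sum += stack.pop();                               // each element is popped exactly once
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int i = 1; i <= 100; i++) stack.push(i);
        System.out.println(drain(stack));                         // prints 5050
    }
}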
The general idea of big-O notation is this: it gives a rough answer to the question "If you're given a set of N items, and you have to perform some operation repeatedly on these items, how many times will you need to perform this operation?" I say a rough answer, because it (most of the time) doesn't give a precise answer of "5*N+35", but just "N". It's like a ballpark. You don't really care about the precise answer, you just want to know how bad it will get when N gets large. So answers like O(N), O(N*N), O(logN) and O(N!) are typical, because they each represent sort of a "class" of answers, which you can compare to each other. An algorithm with O(N) will perform way better than an algorithm with O(N*N) when N gets large enough, it doesn't matter how lengthy the operation is itself.
So I break it down thus: First identify what the N will be. In the examples above it's pretty obvious - it's the size of the input array, because that determines how many times we will loop. Sometimes it's not so obvious, and sometimes you have multiple input data, so instead of just N you also get M and other letters (and then the answer is something like O(N*M*M)).
Then, when I have my N figured out, I try to identify the loop which depends on N. Actually, these two things often get identified together, as they are pretty much tied together.
And, lastly of course, I have to figure out how many iterations the program will make depending on N. And to make it easier, I don't really try to count them, just try to recognize the typical answers - O(1), O(N), O(N*N), O(logN), O(N!) or perhaps some other power of N. The O(N!) is actually pretty rare, because it's so inefficient, that implementing it would be pointless.
If you get an answer of something like N*N+N+1, then just discard the smaller ones, because, again, when N gets large, the others don't matter anymore. And ignore if the operation is repeated some fixed number of times. O(5*N) is the same as O(N), because it's the ballpark we're looking for.
Added: As asked in the comments, here are the analysis of the first two methods:
The first one is easy. There are only two loops, the inner one is O(N), and the outer one just repeats that 3 times. So it's still O(N). (Remember - O(3N) = O(N)).
The second one is tricky. I'm not really sure about it. After looking at it for a while I understood why it loops at most only 50 times. Since this is not dependent on N at all, it counts as O(1). However, if you were to pass it, say, an array of only 10 items, all positive, it would go into an infinite loop. That's O(∞), I guess. So which one is it? I don't know...
I don't think there's a formal way of determining the big-O number for an algorithm. It's like the halting problem. In fact, come to think of it, if you could universally determine the big-O for a piece of code, you could also determine if it ever halts or not, thus contradicting the halting problem. But that's just my musings.
Typically I just go by... dunno, sort of a "gut feeling". Once you "get" what the Big-O represents, it becomes pretty intuitive. But for complicated algorithms it's not always possible to determine. Take Quicksort for example. On average it's O(N*logN), but depending on the data it can degrade to O(N*N). The questions you'll get on the test though should have clear answers.
The second one is 50 because big O is a function of the length of the input. That is if the input size changes from 1 million to 1 billion, the runtime should increase by 1000 if the function is O(N) and 1 million if it's O(n^2). However the second function runs in time 50 regardless of the input length, so it's O(1). Technically it would be O(50) but constants don't matter for big O.
