time complexity : why O(nlogn)? - java

I have a document that says the average case time-complexity for the given code is O(nlog2n)
Random r = new Random();
int k = 1 + r.nextInt(n);
for (int i = 0; i < n ; i += k);
I have computed the best and worst cases as:
Best case, k = n leading to time complexity of O(1).
Worst case, k = 1 leading to time complexity of O(n).
How can the average case be O(nlog2n), which is higher than the worst case? Am I missing something?
Edit: The document could be prone to mistakes, so in that case what would be the average time-complexity of the above code, and why?

For a given value of k, the for loop runs about n/k times. (I'm ignoring rounding, which makes the analysis a bit more complicated but doesn't change the result.)
Averaging over all n equally likely values of k gives: (n/1 + n/2 + n/3 + ... + n/n) / n = 1 + 1/2 + 1/3 + ... + 1/n. That's the n'th harmonic number, which grows like ln(n).
Thus the average runtime complexity of this code is O(log n), or equivalently O(log_2 n), since logarithms of different bases differ only by a constant factor.
Perhaps your document had an additional outer loop that ran this code n times?
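As a rough check of the averaging argument, the sketch below computes the exact average iteration count over all k from 1 to n, including the rounding that the answer ignores. The class and method names are made up for illustration; the loop being counted is the one from the question.

```java
// Sketch: exact average number of iterations when k is uniform on 1..n.
final class AverageIterations {
    // Iterations of: for (int i = 0; i < n; i += k);  which is ceil(n / k).
    static long iterations(int n, int k) {
        return (n + (long) k - 1) / k;
    }

    // Average over all n equally likely values of k.
    static double average(int n) {
        long total = 0;
        for (int k = 1; k <= n; k++) {
            total += iterations(n, k);
        }
        return (double) total / n;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 1_000, 100_000}) {
            System.out.printf("n=%d average=%.3f ln(n)=%.3f%n",
                    n, average(n), Math.log(n));
        }
    }
}
```

The printed average should stay within a small constant of ln(n) as n grows, which is what O(log n) predicts.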

Related

Time complexity of an algorithm where the input is known?

Learning about algorithms and I am slightly puzzled when it comes to calculating Time Complexity. To my understanding, if the output of an algorithm does not depend on the input size, it takes constant time i.e. O(1). Whereas when it does depend on the input, it is known as linear time i.e. O(n).
However, how does the time complexity work out when we know the size of the input?
For example, I have the following code which prints out all the prime numbers between 1 and 100. In this scenario, I know the size of the input (100) so how would that translate to the Time Complexity?
public void findPrime(){
    for(int i = 2; i <=100; i++){
        boolean isPrime = true;
        for(int j = 2; j < i; j++){
            int x = i % j;
            if(x == 0)
                isPrime = false;
        }
        if (isPrime)
            System.out.println(i);
    }
}
In this case, would the complexity still be O(1) because the time is constant? Or would it be O(n) n being the i condition which affects the number of iterations for both for loops?
Am I also right in saying that the condition of i affects the algorithm the most in terms of run time? Greater the i, the longer the algorithm runs for?
Would appreciate any help.
The output is not dynamic and always the same (like the input), which is by definition a constant. The cost of computing it is therefore constant as well: it's always the same. If the upper bound were not fixed, the complexity wouldn't be constant.
To introduce a dynamic upper bound, we need to change the code and check out the complexities of the lines:
public void findPrime(int n){
    for(int i = 2; i <= n; i++){ // sum from 2 to n
        boolean isPrime = true; // 1
        for(int j = 2; j < i; j++){ // sum from 2 to i - 1
            int x = i % j; // 1
            if(x == 0) // 1
                isPrime = false; // 1
        }
        if (isPrime) // 1
            System.out.println(i); // 1, see below
    }
}
Strictly speaking, printing i is not constant, because the number gets longer as i grows. For simplicity, we treat printing to System.out as constant.
Now when we know the complexities of the lines, we translate that into an equation and simplify it.
As the result is a polynomial, due to the properties of O notation, we can see that this function is O(n^2).
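The equation itself did not survive in this copy. A possible reconstruction, assuming the per-line costs annotated in the code comments above (1 for the assignment, 3 per inner iteration, 2 for the final check and print), is:

```latex
\[
T(n) = \sum_{i=2}^{n}\Bigl(1 + \sum_{j=2}^{i-1} 3 + 2\Bigr)
     = \sum_{i=2}^{n}\bigl(3 + 3(i-2)\bigr)
     = 3\sum_{i=2}^{n}(i-1)
     = \frac{3n(n-1)}{2}
     = \frac{3}{2}n^{2} - \frac{3}{2}n
\]
```

The leading term is quadratic, so T(n) is O(n^2) under these assumptions.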
As the other answers have shown, you can also say it's O(n^2) just by looking at it. You need mathematical proofs only for more difficult cases (and to be sure).
If an algorithm's running time depends on the input size, it's not necessarily O(n^2). It may be cubic O(n^3), logarithmic O(log2(n)), etc.
When an algorithm doesn't depend on the input size, i.e. it performs a fixed number of operations that doesn't grow when the input grows, it is said to have constant time complexity, which in asymptotic notation is O(1).
Usually, we want to measure the worst-case complexity of an algorithm, because that is what matters for sufficiently large inputs (for small inputs it mostly doesn't make any difference). The worst case is the case in which every possible iteration executes.
Now, pay attention to your double for loop. If you have the static range [2, 100] in your code then, of course, it will always hit 2 as the first prime number, and every execution will have constant time complexity O(1). But usually we want to find prime numbers in some dynamically given range, and in that case, in the worst case, both loops may iterate over the entire range, and as the range grows the number of iterations, hence operations, grows.
So, with a dynamic upper bound n, your code's worst-case time complexity is O(n^2).
"Whereas when it does depend on the input, it is known as linear time i.e. O(n)."
That's not true. When it depends on the input size, it is simply not constant.
Its complexity is then described by some function f(n) of the input size n.
Examples of such functions are:
f(n) = log(n) - logarithmic
f(n) = n - linear
f(n) = n*n - quadratic
...and so on
f(n) could also be exponential, for example f(n) = 2^n, which describes an algorithm whose running time grows very fast.
Time complexity depends on which algorithm you use. You can calculate the time complexity of an algorithm using the following simple rules:
Primitive expression: 1
N primitive expressions: N
If you have 2 separate code blocks, where the 1st has time complexity A and the 2nd has time complexity B, the total time complexity is A + B.
If you loop over a code block N times and the block has time complexity M, the total time complexity is N*M.
If you use a recursive function, you can often calculate the time complexity using the Master theorem: https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms)
Big O notation (https://en.wikipedia.org/wiki/Big_O_notation) is a mathematical notation that describes an upper bound of a function. Time complexity is usually a function of the input size, so we can use big O notation to describe a bound on the time complexity. Some simple rules:
constant = O(constant) = O(1)
n = O(n)
n^2 = O(n^2)
...
a*f(n) = O(f(n)) where a is a constant.
O(f(n) + g(n)) = O(max(f(n), g(n)))
...

What is the time complexity of an iteration through all possible sequences of an array

An algorithm that goes through all possible sequences of indexes inside an array.
The time complexity of a single loop is linear, O(n), and of two nested loops quadratic, O(n^2). But what if another loop is nested inside and goes through all indexes between these two indexes? Does the time complexity rise to cubic, O(n^3)? When N becomes very large, it doesn't seem that there are enough iterations to consider the complexity cubic, yet it seems too big to be quadratic, O(n^2).
Here is the algorithm considering N = array length
for(int i=0; i < N; i++)
{
    for(int j=i; j < N; j++)
    {
        for(int start=i; start <= j; start++)
        {
            //statement
        }
    }
}
Here is a simple visual of the iterations when N=7(which goes on until i=7):
And so on..
Should we consider the time complexity here quadratic, cubic or as a different size complexity?
For the basic
for (int i = 0; i < N; i++) {
    for (int j = i; j < N; j++) {
        // something
    }
}
we execute something n * (n+1) / 2 times => O(n^2). As to why: it is the simplified form of
sum (sum 1 from y=x to n) from x=1 to n.
For your new case we have a similar formula:
sum (sum (sum 1 from z=x to y) from y=x to n) from x=1 to n. The result is n * (n + 1) * (n + 2) / 6 => O(n^3) => the time complexity is cubic.
The 1 in both formulas is where you enter the cost of something. This is in particular where you extend the formula further.
Note that all the indices may be off by one, I did not pay particular attention to < vs <=, etc.
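One way to double-check the cubic formula is to count the innermost iterations of the loop from the question directly and compare with n * (n + 1) * (n + 2) / 6. The class and method names below are illustrative only.

```java
// Sketch: brute-force count of the innermost statement vs. the closed form.
final class TripleLoopCount {
    // Runs the three nested loops from the question and counts iterations.
    static long count(int n) {
        long executions = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                for (int start = i; start <= j; start++) {
                    executions++; // stands in for "//statement"
                }
            }
        }
        return executions;
    }

    // Closed form from the answer: n * (n + 1) * (n + 2) / 6.
    static long closedForm(int n) {
        return (long) n * (n + 1) * (n + 2) / 6;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 7; n++) {
            System.out.println(n + ": " + count(n) + " vs " + closedForm(n));
        }
    }
}
```

For these exact loop bounds the two counts agree, so no off-by-one correction is needed here.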
Short answer, O(choose(N+k, N)) which is the same as O(choose(N+k, k)).
Here is the long answer for how to get there.
You have the basic question version correct. With k nested loops, your complexity is going to be O(N^k) as N goes to infinity. However as k and N both vary, the behavior is more complex.
Let's consider the opposite extreme. Suppose that N is fixed, and k varies.
If N is 0, your time is constant because the outermost loop fails on the first iteration. If N = 1 then your time is O(k) because you go through all of the levels of nesting with only one choice every time. If N = 2 then something more interesting happens: you go through the nesting over and over again and it takes time O(k^2). And in general, with fixed N the time is O(k^N), where one factor of k is due to the time taken to traverse the nesting, and O(k^(N-1)) is taken by where your sequence advances. This is an unexpected symmetry!
Now what happens if k and N are both big? What is the time complexity of that? Well here is something to give you intuition.
Can we describe all of the times that we arrive at the innermost loop? Yes!
Consider k+N-1 slots, with k of them being "entered one more loop" and N-1 of them being "we advanced the index by 1". I assert the following:
These correspond 1-1 to the sequence of decisions by which we reached the innermost loop. As can be seen by looking at which indexes are bigger than others, and by how much.
The "entered one more loop" entries at the end is work needed to get to the innermost loop for this iteration that did not lead to any other loop iterations.
If 1 < N we actually need one more than that in unique work to get to the end.
Now this looks like a mess, but there is a trick that simplifies it quite unexpectedly.
The trick is this. Suppose that we took one of those patterns and inserted one extra "we advanced the index by 1" somewhere in that final stretch of "entered one more loop" entries at the end. How many ways are there to do that? The answer is that we can insert that last entry in between any two spots in that last stretch, including beginning and end, and there is one more way to do that than there are entries. In other words, the number of ways to do that matches how much unique work there was getting to this iteration!
And what that means is that the total work is proportional to O(choose(N+k, N)) which is also O(choose(N+k, k)).
It is worth knowing that from the normal approximation to the binomial formula, if N = k then this turns out to be O(2^(N+k)/sqrt(N+k)) which indeed grows faster than polynomial. If you need a more general or precise approximation, you can use Stirling's approximation for the factorials in choose(N+k, N) = (N+k)! / ( N! k! ).
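The binomial bound can be sanity-checked on small inputs. The sketch below assumes a specific model that the answer only describes in words: k nested loops over 0..N-1, where each inner loop starts at the index of the loop enclosing it, and one unit of work is charged for every loop-body entry at every level. Under that assumption the total comes out as choose(N+k, k) - 1. All names are hypothetical.

```java
// Sketch: total loop-body entries for k nested "start at the outer index" loops.
final class NestedLoopWork {
    // depth = number of loops still to enter, from = first index of this loop.
    static long work(int n, int depth, int from) {
        if (depth == 0) {
            return 0;
        }
        long total = 0;
        for (int i = from; i < n; i++) {
            total += 1 + work(n, depth - 1, i); // this entry plus everything nested
        }
        return total;
    }

    // Binomial coefficient choose(a, b), exact for small arguments.
    static long choose(int a, int b) {
        long result = 1;
        for (int i = 1; i <= b; i++) {
            result = result * (a - b + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        int n = 4, k = 5;
        System.out.println(work(n, k, 0) + " vs " + (choose(n + k, k) - 1));
    }
}
```

The recursion is exponential in cost, so it is only meant for small N and k.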

What is the complexity of empty for loop?

I was wondering if the complexity of a empty for loop like below is still O(n^2)
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
    }
}
update : changed height and width variable to n
If it doesn't get optimized out by the compiler, the complexity will still be O(n^2) (or actually O(N*M) for the original height and width bounds): even though the loop bodies are empty, the condition checks and the increments of both counters are still operations that have to be performed.
The complexity of any for loop that runs from 1 .. n is O(n), even if it does not do anything inside it. So in your case it is always going to be O(n^2) irrespective of what you are doing inside the loops.
In your example, i and j both run up to n, so each loop individually depends on the value of n, which gives the nested for loops a complexity of O(n^2).
Pay attention, you can do something else than i++, e.g. fun(i).
Based off of my understanding of time-complexity of an algorithm, we assume that there are one or more fundamental operations. Re-writing the code using a while loop and expanding for logic :
int i = 0;
while (i < n)
{
    int j = 0; // reset for every outer iteration, as the for loop does
    while (j < n)
    {
        ; // nop or no-operation
        j = j + 1; // let jInc be alias for j + 1
    }
    i = i + 1; // let iInc be alias for i + 1
}
Now if your objective is to perform a 'nop' n^2 times, then the time complexity is O(0) where 'nop' is the fundamental operation. However, if the objective is to iterate 2 counters ('i' and 'j') from 0 to n -1 or count n^2 times then the fundamental operations can be addition (j + 1 and i + 1), comparison (i < n and j < n) or assignment (i = iInc and j = jInc) i.e. O(n^2).
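To make the "fundamental operations" concrete, here is a minimal sketch that counts the comparisons and increments an empty nested loop performs when nothing is optimized away. The class name and the choice of what to count are assumptions made for illustration.

```java
// Sketch: count loop bookkeeping for two nested empty loops over 0..n-1.
final class EmptyLoopOps {
    // Returns {comparisons, increments}.
    static long[] count(int n) {
        long comparisons = 0;
        long increments = 0;
        int i = 0;
        while (true) {
            comparisons++;              // the check i < n
            if (!(i < n)) break;
            int j = 0;
            while (true) {
                comparisons++;          // the check j < n
                if (!(j < n)) break;
                j = j + 1;
                increments++;
            }
            i = i + 1;
            increments++;
        }
        return new long[] {comparisons, increments};
    }

    public static void main(String[] args) {
        long[] ops = count(1000);
        System.out.println("comparisons=" + ops[0] + " increments=" + ops[1]);
    }
}
```

Both counts grow as n^2 plus lower-order terms: (n + 1)^2 comparisons and n * (n + 1) increments.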
Big O is just an approximation for the count of steps in an algorithm.
We could write formulas for the exact count of steps, but they are complex and make the actual complexity hard to see.
1) 0.000 000 001*n^2 - 1 000 000 000 = O(n^2)
2) 1 000 000 000*n = O(n)
Despite its larger Big O, the first case is smaller, e.g. for n = 0..1 000 000.
Moreover, Big O doesn't take into account how fast a particular step is.
So, your loop is a case where an O(n^2) algorithm could take less time than an O(1) one.
The nested loop performs constant work O(1) n times, so n * O(1) = O(n) * O(1) = O(n).
The external loop performs the above-mentioned O(n) work n times, so n * O(n) = O(n) * O(n) = O(n^2).
In general:
f(n) ∈ O(f(n))
cf(n) ∈ O(f(n)) if c is constant
f(n)g(n) ∈ O(f(n)g(n))
It depends on the compiler.
Theoretically, it's O(n), where n is the total number of loop iterations, even if there's no task inside the loop.
But some compilers optimize the empty loop away and don't iterate at all. In that situation, the complexity is O(1).
For the nested loop above, the total number of iterations is n^2, so without such optimization it's O(n^2).

Complexity of Bubble Sort

I have seen in a lot of places that the complexity of bubble sort is O(n^2).
But how can that be, since the inner loop only runs n-i times?
for (int i = 0; i < toSort.length -1; i++) {
    for (int j = 0; j < toSort.length - 1 - i; j++) {
        if(toSort[j] > toSort[j+1]){
            int swap = toSort[j+1];
            toSort[j + 1] = toSort[j];
            toSort[j] = swap;
        }
    }
}
And what is the "average" value of n-i? n/2.
So it runs in O(n*n/2), which is considered O(n^2) because constant factors are dropped.
There are different types of time complexity - you are using big O notation, which is an upper bound: all cases of this function take at most this order of time.
As n approaches infinity, this is basically n^2 time complexity in the worst-case scenario. Time complexity is not an exact art but more of a ballpark for what sort of speed you can expect from this class of algorithm, so you are trying to be too exact.
For example, the time complexity is stated as n^2 even though the exact count is closer to n*(n-1)/2, because constant factors, lower-order terms, and any processing overhead are ignored.
Since the outer loop runs n-1 times and for each iteration i the inner loop runs (n-1-i) times, the total number of comparisons is
(n-1) + (n-2) + ... + 1 = n*(n-1)/2 = O(n^2).
It's O(n^2), because it is bounded by length * length.
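The comparison count can also be measured directly. This sketch wraps the bubble sort from the question in a hypothetical method that sorts in place and returns how many comparisons it made; for an array of length n the result is n*(n-1)/2 regardless of the input order.

```java
// Sketch: the question's bubble sort, instrumented to count comparisons.
final class BubbleSortCount {
    static long sortAndCount(int[] toSort) {
        long comparisons = 0;
        for (int i = 0; i < toSort.length - 1; i++) {
            for (int j = 0; j < toSort.length - 1 - i; j++) {
                comparisons++;
                if (toSort[j] > toSort[j + 1]) {
                    int swap = toSort[j + 1];
                    toSort[j + 1] = toSort[j];
                    toSort[j] = swap;
                }
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] data = {5, 4, 3, 2, 1};
        System.out.println("comparisons: " + sortAndCount(data));
    }
}
```

Doubling the array length roughly quadruples the count, which is the practical meaning of O(n^2) here.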

Explain Time Complexity?

How does one find the time complexity of a given algorithm notated both in N and Big-O? For example,
//One iteration of the parameter - n is the basic variable
void setUpperTriangular (int intMatrix[0,…,n-1][0,…,n-1]) {
    for (int i=1; i<n; i++) { //Time Complexity {1 + (n+1) + n} = {2n + 2}
        for (int j=0; j<i; j++) { //Time Complexity {1 + (n+1) + n} = {2n + 2}
            intMatrix[i][j] = 0; //Time complexity {n}
        }
    } //Combining both, it would be {2n + 2} * {2n + 2} = 4n^2 + 4n + 4 TC
} //O(n^2)
Is the Time Complexity for this O(n^2) and 4n^2 + 4n + 4? If not, how did you get to your answer?
Also, I have a question about a two-param matrix with time complexity.
//Two iterations in the parameter, n^2 is the basic variable
void division (double dividend [0,…,n-1], double divisor [0,…,n-1])
{
    for (int i=0; i<n; i++) { //TC {1 + (n^2 + 1) + n^2} = {2n^2 + 2}
        if (divisor[i] != 0) { //TC n^2
            for (int j=0; j<n; j++) { //TC {1 + (n^2 + 1) + n^2} = {2n^2 + 2}
                dividend[j] = dividend[j] / divisor[i]; //TC n^2
            }
        }
    } //Combining all, it would be {2n^2 + 2} + n^2(2n^2 + 2) = 2n^3 + 4n^2 + 2 TC
} //O(n^3)
Would this one be O(N^3) and 2n^3 + 4n^2 + 2? Again, if not, can somebody please explain why?
Both are O(N^2). You are processing N^2 items in the worst case.
The second example might be just O(N) in the best case (if the second argument is all zeros).
I am not sure how you get the other polynomials. Usually the exact complexity is of no importance (namely when working with higher-level language).
What you're looking for in big O time complexity is the approximate number of times an instruction is executed. So, in the first function, you have the executable statement:
intMatrix[i][j] = 0;
Since the executable statement takes the same amount of time every time, it is O(1). So, for the first function, you can cut it down to look like this and work back from the executable statement:
i: execute n-1 times { // Total: n*(n-1)/2 assignments
    j: execute i times {
        intMatrix[i][j] = 0; // Time complexity = 1
    }
}
Working back, the i loop executes n-1 times (i = 1, ..., n-1) and the j loop executes i times for each i. For example, if n = 5, the assignment executes 1+2+3+4 = 10 times. This is an arithmetic series with sum n(n-1)/2. The time complexity of the function is therefore n(n-1)/2 = n^2/2 - n/2 = O(n^2).
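A quick way to confirm the count is to run the first function and tally the assignments. With i running from 1 to n-1 and j from 0 to i-1, the assignment executes 1 + 2 + ... + (n-1) = n(n-1)/2 times, for example 10 times when n = 5. The sketch below uses a plain square int[][] in place of the question's pseudo-signature and returns the tally; both choices are assumptions for illustration.

```java
// Sketch: zero the entries below the diagonal and count the assignments.
final class UpperTriangular {
    static long setUpperTriangular(int[][] intMatrix) {
        int n = intMatrix.length; // assumes a square n x n matrix
        long assignments = 0;
        for (int i = 1; i < n; i++) {
            for (int j = 0; j < i; j++) {
                intMatrix[i][j] = 0;
                assignments++;
            }
        }
        return assignments;
    }

    public static void main(String[] args) {
        for (int n : new int[] {5, 10, 20}) {
            System.out.println("n=" + n + " assignments="
                    + setUpperTriangular(new int[n][n]));
        }
    }
}
```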
For the second function, you're looking at something similar. Your executable statement is:
dividend[j] = dividend[j] / divisor[i];
Now, with this statement it's a little more complicated, as you can see from wikipedia, complexity of schoolbook long division is O(n^2). However, the dividend and divisor DO NOT use your variable n, so they're not dependent on it. Let's call the dividend and divisor, aka the actual contents of the matrix "m". So the time complexity of the executable statement is O(m^2). Moving on to simplify the second function:
i: execute n times { // Time complexity = n*(n*m^2) = O(n^2*m^2)
    if statement: check ONCE per i { // Time complexity = 1
        j: execute n times { // Time complexity = n*m^2
            dividend[j] = dividend[j] / divisor[i]; // Time complexity = m^2
        }
    }
}
Working backwards, you can see that the inner statement takes O(m^2), and since the if statement takes the same amount of time every time, its time complexity is O(1). Your final answer is then O(n^2*m^2). Since division takes so little time on modern processors, it is usually treated as O(1), so what your professor is probably looking for is O(n^2) for the second function.
Big O notation describes the relationship between a change in data size (n) and the magnitude of time / space required for a given algorithm to process it.
In your case you have two loops. For each value of n (the outer loop), you process n items (in the inner loop). Thus you have O(n^2) or "quadratic" time complexity.
So for small n the difference is negligible, but for larger n the algorithm quickly grinds to a halt.
Skipping zeros in the divisor, as in algorithm 2, does not change the worst-case time complexity, because checking whether a number equals 0 is O(1), far less than O(n^2). If every divisor is zero, the inner loop never runs and only the O(n) outer loop remains, so the best case of your second algorithm is O(n).
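The best-case and worst-case behaviour of the second function can be seen by counting divisions. This is a minimal sketch assuming both arrays have the same length n; the return value and class name are additions for illustration and are not part of the original function.

```java
// Sketch: the question's division routine, instrumented to count divisions.
final class DivisionCount {
    static long division(double[] dividend, double[] divisor) {
        int n = divisor.length; // assumes dividend.length == divisor.length
        long divisions = 0;
        for (int i = 0; i < n; i++) {
            if (divisor[i] != 0) {          // O(1) check, done n times
                for (int j = 0; j < n; j++) {
                    dividend[j] = dividend[j] / divisor[i];
                    divisions++;
                }
            }
        }
        return divisions;
    }

    public static void main(String[] args) {
        int n = 1000;
        double[] ones = new double[n];
        java.util.Arrays.fill(ones, 1.0);
        System.out.println("all zeros: " + division(new double[n], new double[n]));
        System.out.println("no zeros:  " + division(new double[n], ones));
    }
}
```

With an all-zero divisor array only the n checks run; with no zeros the inner statement runs n^2 times.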
