Big O Notation/Time Complexity issue - java

Would a recursive method that calls itself n times (not inside a for loop) and contains only if/else statements be considered O(N) or O(1)? Thanks!

It would be O(N).
The general approach to doing complexity analysis is to look for all significant constant-time executable statements or expressions (like comparisons, arithmetic operations, method calls, assignments) and work out an algebraic formula giving the number of times that they occur. Then reduce that formula to the equivalent big O complexity class.
In your case, method calls are significant.
When you have some experience at doing this, you will be able to leave out statements that "obviously" don't contribute to the overall complexity. But to start with, it is a good exercise to count everything.
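For instance, a minimal Java sketch of the situation described in the question (the method name and body are made up for illustration) might look like this:

// A method that calls itself n times and contains only if/else logic.
static int countDown(int n) {
    if (n <= 0) {                       // one comparison per call
        return 0;                       // base case
    } else {
        return 1 + countDown(n - 1);    // one subtraction, one addition, one recursive call
    }
}

Each call does a constant amount of work and there are roughly n calls in total, so the overall cost is proportional to n, i.e. O(N).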

Here is an example of factorial code in Python. Recursion behaves much like looping: we call the factorial function again and again until the base-case condition is satisfied, just as a loop runs until its condition fails. So it is O(N).
def factorial(n):
    if n < 1:  # base case
        return 1
    else:
        returnNumber = n * factorial(n - 1)  # recursive call
        print(str(n) + '! = ' + str(returnNumber))
        return returnNumber

When you calculate time complexity, there are 3 types-
1) Best case
2) Worst case
3) Average case
Since you wrote O(n), you are asking about the worst-case time complexity, which is indeed O(n): in the worst case we account for the branch in which the recursion happens.
If you are looking at the best case, you can take it as O(1), since depending on the if/else condition the method might not recurse at all.

Related

Big Oh-Notation Q

I've got a big O notation of O(n^2 + n + n); I just want to make sure this reduces to O(n^2), correct? Basically I have a for-loop that goes through all of n, with another for-loop that does the same inside the first one, and then two separate for-loops that also go through all of n.
Generally, in Big-O notation we only consider the fastest growing term of N.
Although, if we are comparing two very similarly performing algorithms, the other terms may still matter.
For example:
If one algorithm performs N operations and another performs 2N operations, both are O(N), but the latter will still perform worse than the former.
Hence in your case your algorithm's Big O notation is O(N^2).
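As a sketch of the shape the question describes (the code below is illustrative, not the asker's actual program):

// Nested loops: roughly n * n iterations, contributing the n^2 term
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        // constant-time work
    }
}
// Two separate loops over n, contributing the n + n terms
for (int i = 0; i < n; i++) {
    // constant-time work
}
for (int j = 0; j < n; j++) {
    // constant-time work
}
// Total work is n^2 + n + n; for large n the n^2 term dominates, so this is O(n^2).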

Java Big O notation in algorithm

I am confused by this big O notation problem. This code does not look like O(n), but the inner for loop runs once per word in the line, which is basically no more than 20 times.
So, if we say the length of line.split() is a constant c, can we say O(c*n) = O(n)?
while (this.scannerMainText.hasNextLine()) {
    String line = this.scannerMainText.nextLine();
    for (String word : line.split("[.!,\" ]+")) {
        // some statements
    }
}
Yes, a constant number of repetitions has a negligible impact on the growth rate, therefore it is usually not mentioned in the Big-O notation.
For example, if an O(n) loop is run 20 times, you might expect the notation to be O(20n), but because the constant factor does not change the growth rate it is dropped, so O(20n) = O(n). The same goes for O(20n²) = O(n²), etc.
Yes.
Time complexity is based on how the number of loop iterations grows with the input; the actual number/cost of steps within each iteration (c, as you call it) is constant and is not considered for Big-O notation.
Added
I'm not familiar enough with the edge cases of the theory to say that if all of the lines are equal in length you can reduce the runtime of the inner loop to a constant. In general, this algorithm would be considered O(mn).

Calculating time complexity by just seeing the algorithm code

I have learned the code for all of the common sorting algorithms and understood how they work. As part of this, one should also be able to find the time and space complexity. I have seen people just looking at the loops and deriving the complexity. Can someone guide me towards the best practice for achieving this, something like the step-count method? The example code given is for shell sort. What strategy should be used to understand and calculate the complexity from the code itself? I need to understand how to do asymptotic analysis starting from the code. Please help!
int i, n = a.length, diff = n / 2, interchange, temp;
while (diff > 0) {
    interchange = 0;
    for (i = 0; i < n - diff; i++) {
        if (a[i] > a[i + diff]) {
            temp = a[i];
            a[i] = a[i + diff];
            a[i + diff] = temp;
            interchange = 1;
        }
    }
    if (interchange == 0) {
        diff = diff / 2;
    }
}
Since the absolute lower bound on the worst case of a comparison-based sorting algorithm is Ω(n log n) (see the note below), evidently one can't do any better than that. The worst case of this particular code, derived next, turns out to be O(n^2).
Worst-case time complexity:
1. Inner loop
Let's first start analyzing the inner loop:
for (i = 0; i < n - diff; i++) {
    if (a[i] > a[i + diff]) {
        temp = a[i];
        a[i] = a[i + diff];
        a[i + diff] = temp;
        interchange = 1;
    }
}
Since we don't know much (anything) about the structure of a at this level, it is definitely possible that the condition holds, and thus a swap occurs. A conservative analysis therefore says that interchange can be 0 or 1 at the end of the loop. We do know, however, that if interchange ends up being 1, the outer loop will run this inner loop a second time with the same diff value.
As you comment yourself, the loop is executed O(n-diff) times, and since all instructions inside the loop take constant time, the time complexity of a single pass of the loop is O(n-diff) as well.
Now the question is how many times interchange can be 1 before it turns to 0. The worst case is that the minimal element starts at the absolute right of the array and keeps "swapping" its way down until it reaches the start of the list. So the inner loop itself is repeated at most O(n/diff) times. As a result the computational effort of the loop is, in the worst case:
O((n/diff)*(n-diff)) = O(n^2/diff - n)
2. Outer loop with different diff
The outer loop relies on the value of diff, which starts out at n/2. As long as interchange equals 1 at the end of an iteration (something we cannot prove will not be the case), the loop repeats with the same diff; once interchange is 0, a new iteration is performed with diff set to diff/2. This is repeated until diff reaches 0. This means diff will take all powers of 2 up to n/2:
1 2 4 8 ... n/2
Now we can make an analysis by summing:
log2 n
------
\
/ O(n^2/2^i-n) = O(n^2)
------
i = 0
where i represents *log2(diff) of a given iteration. If we work this out, we get O(n2) worst case time complexity.
Note (on the lower bound of worst-case comparison sorts): one can prove that no comparison sort algorithm exists with a worst-case time complexity better than O(n log n).
This is because for a list with n items, there are n! possible orderings. For each ordering, there is a different way one needs to reorganize the list.
Since a comparison can, at best, split the set of possible orderings into two equal parts, it will require at least log2(n!) comparisons to find out which ordering we are dealing with. The value of log2(n!) can be estimated using the Stirling approximation:

    integral from 1 to n of log(x) dx = n log n - n = O(n log n)
Best-case time complexity: in the best case, the list is already ordered. In that case the inner loop never executes the if-then part, so interchange is never set to 1, and diff is halved after a single pass of the for loop. The outer loop is therefore repeated only O(log n) times, with each pass costing O(n), so the best-case time complexity is O(n log n).
Look at the loops and try to figure out how many times they execute. Start from the innermost ones.
In the given example (not the easiest one to begin with), the for loop (the innermost one) is executed for i in the range [0, n-diff), i.e. it is executed exactly n-diff times.
What is done inside that loop doesn't really matter as long as it takes "constant time", i.e. there is a finite number of atomic operations.
Now the outer loop is executed as long as diff>0. This behavior is complex because an iteration can decrease diff or not (it is decreased when no inverted pair was found).
Now you can say that diff will be decreased log(n) times (because it is halved until 0), and between every decrease the inner loop is run "a certain number of times".
A trained eye will also recognize interleaved passes of bubble sort and conclude that this number of times will not exceed the number of elements involved, i.e. n-diff, but that's about all that can be said at a glance.
A complete analysis of the algorithm is a horrible mess, as the array gets progressively better and better sorted, which influences the number of inner loops.
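If you want to try the step-count method mentioned in the question empirically, one option is to instrument the code with a counter. A sketch (the counter is an addition for illustration, not part of the original code):

// Count the comparisons performed by the shell-sort-like code above.
static long countSteps(int[] a) {
    long steps = 0;
    int n = a.length, diff = n / 2, interchange, temp;
    while (diff > 0) {
        interchange = 0;
        for (int i = 0; i < n - diff; i++) {
            steps++;                      // one comparison per inner iteration
            if (a[i] > a[i + diff]) {
                temp = a[i];
                a[i] = a[i + diff];
                a[i + diff] = temp;
                interchange = 1;
            }
        }
        if (interchange == 0) {
            diff = diff / 2;
        }
    }
    return steps;
}

Printing countSteps for growing n on random versus already-sorted arrays roughly exhibits the O(n^2) worst case and the O(n log n) best case discussed above.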

Is the time complexity of an algorithm calculated only based on the number of times a loop executes?

I have a big doubt about how time complexity is calculated. Is it based on the number of times a loop executes? My question stems from the situation below.
I have a class A, which has a String attribute.
class A {
    String name;
}
Now, I have a list of class A instances. This list has different names in it. I need to check whether the name "Pavan" exists in any of the objects in the list.
Scenario 1:
Here the for loop executes listA.size() times, which can be said to be O(n).
public boolean checkName(List<A> listA, String inputName){
    for(A a : listA){
        if(a.name.equals(inputName)){
            return true;
        }
    }
    return false;
}
Scenario 2:
Here the for loop executes listA.size()/2 + 1 times.
public boolean checkName(List<A> listA, String inputName){
    int length = listA.size()/2;
    length = length%2==0 ? length : length + 1;
    for(int i=0; i < length; i++){
        if(listA.get(i).name.equals(inputName) || listA.get(listA.size() - i - 1).name.equals(inputName)){
            return true;
        }
    }
    return false;
}
I minimized the number of times for loop executes, but I increased the complexity of the logic.
Can we say this is O(n/2)? If so, can you please explain me?
First note that in Big-O notation there is no such thing as O(n/2), since 1/2 is a constant factor which is ignored in this notation. The complexity remains O(n), so by modifying your code you haven't changed anything regarding complexity.
In general estimating the number of times a loop is executed with respect to input size and the operation that actually is associated with a cost in time is the way to get to the complexity class of the algorithm.
The operation that is producing cost in your method is String.equals, which, by looking at its implementation, produces cost by comparing characters.
In your example the input size is not strictly equal to the size of the list. It also depends on how large the strings contained in that list are and how large the inputName is.
So let's say the largest string in the list is m1 characters and the inputName is m2 characters in length. So for your original checkName method the complexity would be O(n*min(m1,m2)) because of String.equals comparing at most all characters of a string.
For most applications the term min(m1,m2) doesn't matter as either one of the compared strings is stored in a fixed size database column for example and therefore this expression is a constant, which is, as said above, ignored.
No. In a big O expression, all constant values are ignored.
We only care about how the cost grows with n, e.g. O(n^2) or O(log n).
Time and space complexity are calculated based on the number of operations executed and the number of units of memory used, respectively.
Regarding time complexity: all the operations are taken into account and counted. Because it's hard to compare, say, O(2*n^2+5*n+3) with O(3*n^2-3*n+1), equivalence classes are used. That means that for very large values of n, the two previous examples have roughly similar values (more exactly: they have a similar rate of growth). Therefore, you reduce the expression to its most basic form and say that both examples are in the same equivalence class as O(n^2). Similarly, O(n) and O(n/2) are in the same class, and therefore both are O(n).
Because of this, you can ignore most constant-time operations (such as .size() or .length() on collections, assignments, etc.) as they don't really count in the end. You are therefore left with loop operations and sometimes complex computations (which, somewhere lower on the stack, use loops themselves).
To get a better understanding of the three classes of complexity, try reading articles on the subject, such as: http://discrete.gr/complexity/
Time complexity is a measure of the theoretical time it will take for an operation to be executed.
While normally any improvement in the required time is significant, in time complexity we are interested only in the order of magnitude. That means:
If an operation for N objects requires N time intervals, then it has complexity O(N).
If an operation for N objects requires N/2, its complexity is still O(N).
The apparent paradox is explained by considering the operation for large N: the /2 factor then makes no essential difference compared to the N part. If one complexity is O(N^2), then an O(N) term is negligible for large N; that's why we are only interested in the order of magnitude.
In other words any constant is thrown away when calculating complexity.
As for the question:
Is it calculated based on the number of times a loop executes?
Well, it depends on what the loop contains. If only basic operations are executed inside the loop, then yes. But, to give an example, if you have a loop inside which an eigenanalysis is executed on each run, and that analysis by itself has complexity O(N^3), you cannot say that your overall complexity is simply O(N).
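As a sketch of that last point (the helper name below is made up for illustration): nested costs multiply, so a loop whose body is itself expensive is not O(N).

// Hypothetical: each iteration calls a routine that is O(n^3) on its own
// (e.g. an eigenanalysis), so the loop as a whole is O(n) * O(n^3) = O(n^4).
for (int i = 0; i < n; i++) {
    eigenAnalysis(matrix);   // assumed to be O(n^3); not a real library call
}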
The complexity of an algorithm is measured by how its processing time or space requirement responds to the input size. I think you are missing the fact that the notations used to express complexity are asymptotic notations.
As per your question, you have reduced the loop execution count, but not the linear relation with the input size.

BigO running time on some methods

OK, these are all pretty simple methods, and there are a few of them, so I didn't want to create multiple questions when they are all about the same thing. Big O is my weakness; I just can't figure out how they come up with these answers. Is there any way you can give me some insight into your thinking for analyzing the running times of some of these methods? How do you break it down? How should I think when I see something like these? (Specifically the second one, I don't get how that's O(1).)
function f1:
loop 3 times
loop n times
Therefore O(3*n) which is effectively O(n).
function f2:
loop 50 times
O(50) is effectively O(1).
We know it will loop 50 times because the loop keeps subtracting n / 50 until the counter reaches 0; for that to happen it must iterate 50 times (n - (n / 50)*50 = 0).
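The code for f2 isn't reproduced here, but a loop with roughly the following shape (a hypothetical reconstruction, not the actual assignment code) behaves the way this analysis describes:

// The step size n / 50 is computed from n itself, so the number of iterations
// stays at about 50 no matter how large n is, which makes the loop O(1).
for (int i = n; i > 0; i -= n / 50) {
    // constant-time work
}
// Caveat: for n < 50, integer division makes n / 50 == 0 and the loop never
// terminates, which matches the infinite-loop concern raised in a later answer.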
function f3:
loop n times
loop n times
Therefore O(n^2).
function f4:
recurse n times
You know this because worst case is that n = high - low + 1. Disregard the +1.
That means that n = high - low.
To terminate,
arr[hi] * arr[low] > 10
Assume that this doesn't occur until low is incremented to the highest it can go (high).
This means n = high - 0 and we must recurse up to n times.
function f5:
loops ceil(log_2(n)) times
We know this because of the m/=2.
For example, let n=10. log_2(10) = 3.3, the ceiling of which is 4.
10 / 2 = 5
5 / 2 = 2.5
2.5 / 2 = 1.25
1.25 / 2 = 0.625
In total, there are 4 iterations.
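The code for f5 isn't shown either, but the halving pattern it relies on looks roughly like this (a sketch under that assumption, not the original method):

// m is divided by 2 on every iteration, so it takes about ceil(log2(n))
// iterations for m to drop below 1.
double m = n;
int iterations = 0;
while (m >= 1) {
    m /= 2;        // for n = 10: 10 -> 5 -> 2.5 -> 1.25 -> 0.625, i.e. 4 iterations
    iterations++;
}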
You get an n^2 analysis when performing a loop within a loop, such as the third method.
However, the first method doesn't get an n^2 timing analysis because the outer loop is defined as running only three times. This makes the timing for the first one 3n, but we don't care about constant factors in Big-O.
The second one introduces an interesting case where, despite the fact that you have a single loop, the timing analysis is still O(1). This is because the time it takes to perform this method does not grow with n; if you were to chart the timing, that becomes obvious for larger numbers.
For the fourth method, you have O(n) timing because your recursive function call is passing lo + 1. This is similar to using a for loop and incrementing with lo++/++lo.
The last one has O(log n) timing because you're dividing your variable by two. Just remember that anything that reminds you of a binary search will have a log n timing.
There is also another trick to timing analysis. Say you had a loop within a loop, and within each of the two loops you were reading lines from a file or popping elements off a stack. This would actually only be an O(n) method, because a file only has a certain number of lines you can read, and a stack only has a certain number of elements you can pop off.
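A sketch of that last trick (hypothetical code, using java.util.Deque): the loops are nested, but each element can be popped at most once, so the total work is bounded by the input size rather than by n^2.

static int countPops(java.util.Deque<Integer> stack, int n) {
    int pops = 0;
    for (int i = 0; i < n; i++) {                       // outer loop: n iterations
        while (!stack.isEmpty() && stack.peek() <= i) { // inner loop pops from the shared stack
            stack.pop();                                // each element is popped at most once,
            pops++;                                     // so total pops across all iterations <= stack size
        }
    }
    return pops;
}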
The general idea of big-O notation is this: it gives a rough answer to the question "If you're given a set of N items, and you have to perform some operation repeatedly on these items, how many times will you need to perform this operation?" I say a rough answer, because it (most of the time) doesn't give a precise answer of "5*N+35", but just "N". It's like a ballpark. You don't really care about the precise answer, you just want to know how bad it will get when N gets large. So answers like O(N), O(N*N), O(logN) and O(N!) are typical, because they each represent sort of a "class" of answers, which you can compare to each other. An algorithm with O(N) will perform way better than an algorithm with O(N*N) when N gets large enough, it doesn't matter how lengthy the operation is itself.
So I break it down thus: First identify what the N will be. In the examples above it's pretty obvious - it's the size of the input array, because that determines how many times we will loop. Sometimes it's not so obvious, and sometimes you have multiple input data, so instead of just N you also get M and other letters (and then the answer is something like O(N*M*M)).
Then, when I have my N figured out, I try to identify the loop which depends on N. Actually, these two things often get identified together, as they are pretty much tied together.
And, lastly of course, I have to figure out how many iterations the program will make depending on N. And to make it easier, I don't really try to count them, just try to recognize the typical answers - O(1), O(N), O(N*N), O(logN), O(N!) or perhaps some other power of N. The O(N!) is actually pretty rare, because it's so inefficient, that implementing it would be pointless.
If you get an answer of something like N*N+N+1, then just discard the smaller ones, because, again, when N gets large, the others don't matter anymore. And ignore if the operation is repeated some fixed number of times. O(5*N) is the same as O(N), because it's the ballpark we're looking for.
Added: As asked in the comments, here are the analysis of the first two methods:
The first one is easy. There are only two loops, the inner one is O(N), and the outer one just repeats that 3 times. So it's still O(N). (Remember - O(3N) = O(N)).
The second one is tricky. I'm not really sure about it. After looking at it for a while I understood why it loops at most only 50 times. Since this is not dependent on N at all, it counts as O(1). However, if you were to pass it, say, an array of only 10 items, all positive, it would go into an infinite loop. That's O(∞), I guess. So which one is it? I don't know...
I don't think there's a formal way of determining the big-O number for an algorithm. It's like the halting problem. In fact, come to think of it, if you could universally determine the big-O for a piece of code, you could also determine if it ever halts or not, thus contradicting the halting problem. But that's just my musings.
Typically I just go by... dunno, sort of a "gut feeling". Once you "get" what the Big-O represents, it becomes pretty intuitive. But for complicated algorithms it's not always possible to determine. Take Quicksort for example. On average it's O(N*logN), but depending on the data it can degrade to O(N*N). The questions you'll get on the test though should have clear answers.
The second one is O(1) because big O is a function of the length of the input. That is, if the input size changes from 1 million to 1 billion, the runtime should increase by a factor of 1000 if the function is O(n), and by a factor of 1 million if it's O(n^2). However, the second function runs in roughly 50 steps regardless of the input length, so it's O(1). Technically it would be O(50), but constants don't matter for big O.
