I am new to programming and trying to learn by exploring. I was looking for a way to find the most frequently repeated integer in an array and return the sum of its occurrences, with the best possible space complexity. For example, given [1, 2, 3, 3] the result should be 6 (3 occurs twice), using as little space as possible, say O(n).
I came up with a solution but I am not sure about its complexity. I need some help understanding whether the code below has the lowest possible complexity or whether it could be better (it definitely could!). Sorry if I made any mistakes, and thanks in advance.
public static int maxDuplicateSumSpaceBased(int[] a)
{
int maxRepCount = 1, tempCount;
int maxRepNum = a[0];
int temp = 0;
for (int i = 0; i < (a.length - 1); i++)
{
temp = a[i];
tempCount = 0;
for (int j = 0; j < a.length; j++) // start at 0 so an occurrence at index 0 is counted too
{
if (temp == a[j])
tempCount++;
}
if (tempCount > maxRepCount)
{
maxRepNum = temp;
maxRepCount = tempCount;
}
}
return maxRepNum * maxRepCount;
}
Actually, the space taken by the input is usually not counted in O notation, so your program has a space complexity of O(6) = O(c) = O(1), where c is a constant: you always use 6 variables. If the amount of space used depended on the input the situation would be different, but that is not your case, because regardless of the length of your input you always use 6 variables.
If you want to count the input as occupied space (this is sometimes done), your space complexity would be O(6 + n) = O(n), where n is the length of the input.
It's impossible to do better, as you can easily prove:
You can't occupy less memory than the input itself, since you must store all of it. And since the input is the only thing that is not constant, the maximum space used is the space needed to store the input, which is n.
The space complexity [1] of your solution is O(1). You can't get better than that.
The time complexity of your solution is O(N^2). You can improve on that in a couple of ways:
If you can modify a, then you can sort it { time O(NlogN), space O(1) } then find / count the most frequent value { O(N), O(1) }. Overall complexity is { O(NlogN), O(1) }.
If you cannot modify a, then copy it { O(N) / O(N) } and then proceed as above. Overall complexity is { O(NlogN), O(N) }.
If the range of the numbers (M) is less than the number of numbers, then you can use a bucket sort. Overall complexity is { O(N), O(M) }.
You can get better time complexity overall using a HashMap. The overall complexity of that will be { O(N) on average, O(N)} ... with significantly larger constants of proportionality. (Unfortunately, the worst case time complexity will be O(NlogN) or O(N^2) depending on the hash map implementation. It occurs when all of the keys collide. That is impossible for Integer keys and HashMap, but possible for Long keys.)
1 - I am referring to space in addition to the space occupied by the input array. Obviously, the space used for the input array cannot be optimized. It is a given.
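The sort-then-count and HashMap approaches listed in this answer could be sketched roughly as follows. This is only an illustration: the class and method names are made up, the sorting variant works on a copy (so it is the { O(NlogN), O(N) } case), and returning 0 for an empty array is an assumption. When several values tie for the highest count, the two methods may pick different values.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class MaxRepeatSum {
    // Sort a copy, then scan runs of equal values.
    // O(N log N) time, O(N) extra space for the copy.
    static int bySorting(int[] a) {
        if (a.length == 0) return 0; // assumption: empty input sums to 0
        int[] s = a.clone();
        Arrays.sort(s);
        int bestNum = s[0], bestCount = 1, runCount = 1;
        for (int i = 1; i < s.length; i++) {
            // extend the current run, or start a new one
            runCount = (s[i] == s[i - 1]) ? runCount + 1 : 1;
            if (runCount > bestCount) {
                bestCount = runCount;
                bestNum = s[i];
            }
        }
        return bestNum * bestCount;
    }

    // Count occurrences in a HashMap.
    // O(N) time on average, O(N) extra space.
    static int byHashMap(int[] a) {
        Map<Integer, Integer> counts = new HashMap<>();
        int bestNum = 0, bestCount = 0;
        for (int v : a) {
            int c = counts.merge(v, 1, Integer::sum); // new count of v
            if (c > bestCount) {
                bestCount = c;
                bestNum = v;
            }
        }
        return bestNum * bestCount;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 3};
        System.out.println(bySorting(a)); // 6
        System.out.println(byHashMap(a)); // 6
    }
}
```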
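I understand your problem. Here is a possible solution for the case where there are n integers and every integer lies in the range [0, n-1] (the code uses each value, modulo n, as an index into the array). Finding the most repeated number then takes O(n) time.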
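public static int maxDuplicateSumSpaceBased(int[] a)
{
int k = a.length;
// Assumes every value is in the range [0, k-1]. This pass modifies a in place.
for (int i = 0; i < k; i++)
{
a[a[i] % k] += k; // each occurrence of a value v adds k to a[v]
}
// The index holding the largest value is the most repeated number.
int maxRepNum = 0, temp = a[0];
for (int j = 1; j < k; j++)
{
if (temp < a[j])
{
temp = a[j];
maxRepNum = j;
}
}
return maxRepNum;
}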
Multiplying that number by its count (which is a[number] / k after the first loop) then gives the sum. The whole thing takes O(n) time and O(1) extra space.
Given a function that takes in an array a, compares a[0] and a[1], and swaps them if a[0] > a[1]. The function then keeps comparing the current element with the next one and swaps them if the current one is bigger. This way you are left with the biggest element at the end of your array. How would I go about deriving a formula for the average number of swaps it would take? I understand why Hn is what it is for other sorting algorithms, but I am having a hard time understanding how you "calculate" or work your way to the formula for the given function.
public static int maxB(int[] a) {
if(a.length < 1)
throw new NoSuchElementException("empty array");
for(int i = 1; i < a.length; i++) {
if(a[i-1] > a[i]) {
int temp = a[i-1];
a[i-1] = a[i];
a[i] = temp;
}
}
return a[a.length - 1];
}
This is the code in question that I have written, and I am not asking for coding help, formatting advice, etc. I know it is "bad" and primitive, but I just wanted to use it as an example of how to find formulas for the average case of a given algorithm, and this is one of the few where I don't understand how to do it. I appreciate the help.
There is no hard and fast rule to find the performance of an algorithm.
But for this one, let's define an inversion as a pair (x, y) with x < y but a[y] < a[x]. Show that every swap of two adjacent out-of-order elements reduces the number of inversions by exactly 1. Also, if the array is sorted, there are no inversions. Therefore the number of such swaps you need to sort the array is the same as the number of inversions.
Your question therefore becomes, "On average, how many inversions are there?" And the answer is that there are n*(n-1)/2 pairs, and half of them will be inversions on average, for an average of n*(n-1)/4 = O(n^2) inversions.
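The claim that the number of swaps equals the number of inversions can be checked on small inputs. The sketch below is only an illustration with made-up names: it counts inversions by brute force and compares that with the number of swaps made when the adjacent-swap pass is repeated until the array is sorted.

```java
class InversionCount {
    // Counts pairs (x, y) with x < y and a[y] < a[x] by brute force, O(n^2).
    static long inversions(int[] a) {
        long count = 0;
        for (int x = 0; x < a.length; x++) {
            for (int y = x + 1; y < a.length; y++) {
                if (a[y] < a[x]) count++;
            }
        }
        return count;
    }

    // Repeats the adjacent-swap pass on a copy until no swap happens
    // and returns the total number of swaps performed.
    static long swapsToSort(int[] a) {
        int[] b = a.clone();
        long swaps = 0;
        boolean swapped = true;
        while (swapped) {
            swapped = false;
            for (int i = 1; i < b.length; i++) {
                if (b[i - 1] > b[i]) {
                    int t = b[i - 1];
                    b[i - 1] = b[i];
                    b[i] = t;
                    swaps++;
                    swapped = true;
                }
            }
        }
        return swaps;
    }

    public static void main(String[] args) {
        int[] a = {3, 1, 4, 1, 5, 9, 2, 6};
        System.out.println(inversions(a) == swapsToSort(a)); // true
        // A reversed array of length n has n*(n-1)/2 inversions.
        System.out.println(inversions(new int[]{4, 3, 2, 1})); // 6
    }
}
```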
Working on the following problem:
Given a string s, find the length of the longest substring without repeating characters.
I'm using this brute force solution:
public class Solution {
public int lengthOfLongestSubstring(String s) {
int n = s.length();
int res = 0;
for (int i = 0; i < n; i++) {
for (int j = i; j < n; j++) {
if (checkRepetition(s, i, j)) {
res = Math.max(res, j - i + 1);
}
}
}
return res;
}
private boolean checkRepetition(String s, int start, int end) {
int[] chars = new int[128];
for (int i = start; i <= end; i++) {
char c = s.charAt(i);
chars[c]++;
if (chars[c] > 1) {
return false;
}
}
return true;
}
}
The big-O derivation is given as follows:
O( sum_{i=0}^{n-1} ( sum_{j=i+1}^{n} (j - i) ) ) = O(n^3)
I understand that three nested iterations would result in a time complexity O(n^3).
I only see two sigma operators being used on the start of the formula, could someone enlighten me on where the third iteration comes to play in the beginning of the formula?
The first sum from i=0 to n-1 corresponds to the outer for loop of lengthOfLongestSubstring, which you can see iterates from i=0 to n-1.
The second sum from j = i+1 to n corresponds to the second for loop (you could be starting j at i+1 rather than i as there's no need to check length 0 sub-strings).
Generally, we would expect this particular double for loop structure to produce O(n^2) algorithms and a third for loop (from k=j+1 to n) to lead to O(n^3) ones. However, this general rule (k for loops iterating through all k-tuples of indices producing O(n^k) algorithms) is only the case when the work done inside the innermost for loop is constant. This is because having k for loops structured in this way produces O(n^k) total iterations, but you need to multiply the total number of iterations by the work done in each iteration to get the overall complexity.
From this idea, we can see that the reason lengthOfLongestSubstring is O(n^3) is that the work done inside the body of the second for loop is not constant, but rather is O(n). checkRepetition(s, i, j) iterates from i to j, taking j-i time (hence the expression inside the second term of the sum). O(j-i) time is O(n) time in the worst case because i could be as low as 0, j as high as n, and of course O(n-0) = O(n) (it's not too hard to show that checkRepetition is O(n) in the average case as well).
As mentioned by a commenter, having a linear operation inside the body of your second for loop has the same practical effect in terms of complexity as having a third for loop, which would probably be easier to see as being O(n^3) (you could even imagine the function definition for checkRepetition, including its for loop, being pasted into lengthOfLongestSubstring in place to see the same result). But the basic idea is that doing O(n) work for each of the O(n^2) iterations of the 2 for loops means the total complexity is O(n) * O(n^2) = O(n^3).
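To make the "iterations times work per iteration" idea concrete, here is a small sketch that adds up an upper bound on the characters visited. It assumes that checkRepetition(s, i, j) scans the whole range i..j (that is, j - i + 1 characters, with no early return), and the class and method names are made up. The total matches the cubic closed form n(n+1)(n+2)/6.

```java
class SubstringWorkCount {
    // Upper bound on characters visited by the brute-force solution:
    // for every pair (i, j) the check scans at most j - i + 1 characters.
    static long worstCaseCharVisits(int n) {
        long visits = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                visits += j - i + 1;
            }
        }
        return visits;
    }

    // Closed form of the same double sum, which grows like n^3 / 6.
    static long closedForm(int n) {
        return (long) n * (n + 1) * (n + 2) / 6;
    }

    public static void main(String[] args) {
        for (int n : new int[]{10, 100, 1000}) {
            System.out.println(n + ": " + worstCaseCharVisits(n)
                    + " == " + closedForm(n));
        }
    }
}
```

Multiplying n by 10 multiplies the total by roughly 1000, which is what O(n^3) predicts.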
I have a quick question about Complexity. I have this code in Java:
pairs is a HashMap that maps each Integer key to its frequency in a Collection<Integer>. So:
pairs = new HashMap<Integer, Integer>() // number -> frequency of that number
Then I want to find the matching pairs (a, b) that satisfy a + b == targetSum.
for (int i = 0; i < pairs.getCapacity(); i++) { // Complexity : O(n)
if (pairs.containsKey(targetSum - i) && targetSum - i == i) {
for (int j = 1; j < pairs.get(targetSum - i); j++) {
collection.add(new MatchingPair(targetSum - i, i));
}
}
}
I know that the complexity of the first for loop is O(n), but the second for loop only iterates a small number of times (the frequency of the number, minus 1). Do we still count it as O(n), so that this whole portion of code is O(n^2)? If so, does anyone have an alternative that makes it O(n)?
It's O(n) if 'pairs.getCapacity()' or 'pairs.get(targetSum - i)' is a constant you know beforehand. Otherwise, two loops, one nested in the other, are generally O(n^2).
You can consider that in the worst case your complexity is O(n^2).
I have a very general question about calculating time complexity (Big O notation). When people say that the worst-case time complexity of QuickSort is O(n^2) (picking the first element of the array as the pivot every time, with an inversely sorted array), which operation do they account for to get O(n^2)? Do people count the comparisons made by the if/else statements? Or do they only count the total number of swaps it makes? In general, how do you know which "steps" to count when calculating Big O notation?
I know this is a very basic question, but I've read almost all the articles on Google and still haven't figured it out.
Worst cases of Quick Sort
The worst case of Quick Sort occurs when the array is inversely sorted, already sorted, or has all elements equal.
Understand Big-Oh
Having said that, let us first understand what Big-Oh of something means.
When we have only an asymptotic upper bound, we use O-notation. For a given function g(n), we denote by O(g(n)) the set of functions
O(g(n)) = { f(n) : there exist positive constants c and n0
such that 0 <= f(n) <= c*g(n) for all n >= n0 }
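As a small illustration of this definition, take f(n) = 3n + 5 and g(n) = n (both chosen here only as an example). The constants c = 4 and n0 = 5 work as witnesses, because 3n + 5 <= 4n exactly when n >= 5. The sketch below checks the inequality over a finite range; that is a sanity check, not a proof, and the names are made up.

```java
class BigOWitness {
    // Example functions, chosen only for illustration.
    static long f(long n) { return 3 * n + 5; }
    static long g(long n) { return n; }

    // Checks 0 <= f(n) <= c * g(n) for every n in [n0, upTo].
    static boolean holds(long c, long n0, long upTo) {
        for (long n = n0; n <= upTo; n++) {
            if (f(n) < 0 || f(n) > c * g(n)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holds(4, 5, 100_000)); // true: c = 4, n0 = 5 are valid witnesses
        System.out.println(holds(4, 1, 100_000)); // false: fails at n = 1, since 8 > 4
        System.out.println(holds(3, 1, 100_000)); // false: 3n + 5 > 3n for every n
    }
}
```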
How do we calculate Big-Oh?
Big-Oh basically means how program's complexity increases with the input size.
Here is the code:
import java.util.*;
class QuickSort
{
static int partition(int A[],int p,int r)
{
int x = A[r];
int i=p-1;
for(int j=p;j<=r-1;j++)
{
if(A[j]<=x)
{
i++;
int t = A[i];
A[i] = A[j];
A[j] = t;
}
}
int temp = A[i+1];
A[i+1] = A[r];
A[r] = temp;
return i+1;
}
static void quickSort(int A[],int p,int r)
{
if(p<r)
{
int q = partition(A,p,r);
quickSort(A,p,q-1);
quickSort(A,q+1,r);
}
}
public static void main(String[] args) {
int A[] = {5,9,2,7,6,3,8,4,1,0};
quickSort(A,0,9);
Arrays.stream(A).forEach(System.out::println);
}
}
Take into consideration the following statements:
Block 1:
int x = A[r];
int i=p-1;
Block 2:
if(A[j]<=x)
{
i++;
int t = A[i];
A[i] = A[j];
A[j] = t;
}
Block 3:
int temp = A[i+1];
A[i+1] = A[r];
A[r] = temp;
return i+1;
Block 4:
if(p<r)
{
int q = partition(A,p,r);
quickSort(A,p,q-1);
quickSort(A,q+1,r);
}
Assume each statement takes a constant time c. Let's calculate how much time each block takes.
The first block takes 2c time.
The second block takes 5c time.
The third block takes 4c time.
We write this as O(1), which means the cost stays the same even when the size of the input varies. 2c, 5c and 4c are all O(1).
But when we add the loop over the second block
for(int j=p;j<=r-1;j++)
{
if(A[j]<=x)
{
i++;
int t = A[i];
A[i] = A[j];
A[j] = t;
}
}
It runs n times (assuming r-p is equal to n, the size of the subarray being partitioned), i.e., n * O(1) = O(n). So a single call to partition is always O(n) in the size of its subarray.
What varies is how many levels of recursion the last block, the quickSort method, produces. In the worst case each partition splits off only the pivot, so the recursion is n levels deep and the total is O(n^2). When the splits are reasonably balanced, as they are on average, the recursion is only about log n levels deep and each level does O(n) work in total. Hence the entire complexity is either O(n^2) or O(n log n).
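One way to see the two cases is to count the comparisons made inside partition. The sketch below reuses the same last-element-pivot scheme as the code above and adds a counter; the helper names (countFor, sorted, shuffled) and the seed are made up for the illustration. For an already sorted array of n elements the count is exactly n(n-1)/2, while a shuffled array needs far fewer.

```java
import java.util.Random;

class QuickSortCount {
    static long comparisons; // number of A[j] <= x tests performed

    static int partition(int[] A, int p, int r) {
        int x = A[r]; // pivot is the last element
        int i = p - 1;
        for (int j = p; j <= r - 1; j++) {
            comparisons++;
            if (A[j] <= x) {
                i++;
                int t = A[i]; A[i] = A[j]; A[j] = t;
            }
        }
        int temp = A[i + 1]; A[i + 1] = A[r]; A[r] = temp;
        return i + 1;
    }

    static void quickSort(int[] A, int p, int r) {
        if (p < r) {
            int q = partition(A, p, r);
            quickSort(A, p, q - 1);
            quickSort(A, q + 1, r);
        }
    }

    // Sorts a copy of the input and returns how many comparisons it took.
    static long countFor(int[] input) {
        int[] copy = input.clone();
        comparisons = 0;
        quickSort(copy, 0, copy.length - 1);
        return comparisons;
    }

    static int[] sorted(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        return a;
    }

    // Fisher-Yates shuffle of 0..n-1 with a fixed seed for repeatability.
    static int[] shuffled(int n, long seed) {
        int[] a = sorted(n);
        Random rnd = new Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int k = rnd.nextInt(i + 1);
            int t = a[i]; a[i] = a[k]; a[k] = t;
        }
        return a;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println("sorted:   " + countFor(sorted(n))); // 499500 = n*(n-1)/2
        System.out.println("shuffled: " + countFor(shuffled(n, 42L)));
    }
}
```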
It is counted mainly in terms of the size (n) that can grow, so for quicksorting an array it is the size of the array. How many times do you need to access each element of the array? If you only need to access each element once, then it is O(n), and so on.
Temp variables / local variables whose number grows as n grows will be counted.
Other variables that do not grow significantly when n grows can be counted as a constant: O(n) + c = O(n)
Just to add to what others have said, I agree with those who said you count everything, but if I recall correctly from my algorithm classes in college, the swapping overhead is usually minimal compared with the comparison times and in some cases is 0 (if the list in question is already sorted).
For example, the formula for a linear search is
T = K * N / 2.
where T is the total time; K is some constant defining the total computation time; and N is the number of elements in the list.
On average, the number of comparisons is N/2.
But we can rewrite this as follows:
T = (K/2) * N
or, redefining K,
T = K * N.
This makes it clear that the time is directly proportional to the size of N, which is what we really care about. As N increases significantly, it becomes the only thing that really matters.
A binary search, on the other hand, grows logarithmically: O(log N).
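The N/2 figure for linear search can be checked with a quick experiment. The sketch below, with made-up names, searches for every element of a list of distinct values once and averages the number of comparisons. The exact average for a successful search is (N + 1) / 2, which is about N/2, and doubling N roughly doubles it.

```java
class LinearSearchAverage {
    // Number of comparisons a linear search makes before it finds target
    // (or a.length comparisons if target is absent).
    static int comparisons(int[] a, int target) {
        int count = 0;
        for (int v : a) {
            count++;
            if (v == target) break;
        }
        return count;
    }

    // Average over searching for each of the n distinct elements once.
    static double averageComparisons(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        long total = 0;
        for (int t = 0; t < n; t++) total += comparisons(a, t);
        return (double) total / n;
    }

    public static void main(String[] args) {
        System.out.println(averageComparisons(1000)); // 500.5
        System.out.println(averageComparisons(2000)); // 1000.5
    }
}
```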
The classical Two Sum problem is described in LeetCode.
I know how to solve it with a hash table, which results in O(n) extra space. Now I want to solve it with O(1) space, so I'll first sort the array and then use two pointers to find the two integers, as shown in the (incorrect) code below.
public int[] twoSum(int[] numbers, int target) {
java.util.Arrays.sort(numbers);
int start = 0, end = numbers.length - 1;
while(start < end) {
if(numbers[start] + numbers[end] < target) {
start++;
}
else if(numbers[start] + numbers[end] > target) {
end--;
}
else {
int[] result = {start + 1, end + 1};
return result;
}
}
return null;
}
This code is incorrect: I'm returning the indices after sorting. So how will I keep track of the original indices of the selected integers? Or are there other O(1) space solutions? Thank you.
If you are only worried about space complexity, and not the time complexity, then you don't need to sort. That way, the whole issue of keeping track of original indices goes away.
int[] twoSum(int[] numbers, int target) {
for (int i = 0; i < numbers.length-1; i++) {
for (int j = i+1; j < numbers.length; j++) {
if (numbers[i] + numbers[j] == target)
return new int[]{i+1, j+1};
}
}
return null;
}
If you want to return all such pairs, not just the first one, then just continue with the iterations instead of returning immediately (of course, the return type will have to change to a list or 2-d array or ... ).
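A version that collects every matching pair might look like the sketch below. The class and method names are made up, and the 1-based indices follow the code above. Note that the returned list itself grows with the number of matches, so only the working space besides the output stays O(1).

```java
import java.util.ArrayList;
import java.util.List;

class TwoSumAllPairs {
    // Returns every pair of 1-based indices (i, j), i < j, whose values
    // add up to target. O(n^2) time, no extra space besides the result.
    static List<int[]> allPairs(int[] numbers, int target) {
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < numbers.length - 1; i++) {
            for (int j = i + 1; j < numbers.length; j++) {
                if (numbers[i] + numbers[j] == target) {
                    result.add(new int[]{i + 1, j + 1});
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] numbers = {2, 7, 11, 15, -2};
        for (int[] pair : allPairs(numbers, 9)) {
            System.out.println(pair[0] + ", " + pair[1]); // prints "1, 2" then "3, 5"
        }
    }
}
```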
There are limits to what can and can't be achieved. Some parameters depend on each other, and time and space complexity are two such parameters when it comes to algorithms.
If you want to optimize your problem for space, it will increase the time complexity in most cases, except in some special circumstances.
In this problem, if you don't want to increase the space complexity and want to preserve the original indices, the only way to do it is to not sort the array, take every combination of two numbers from the array, and check whether their sum is your target. This means the code becomes something similar to the code below.
int i = 0;
while(i < n)
{
int j = 0; // restart the inner scan for every i
while(j < n)
{
if(i!=j && arr[i]+arr[j]==target)
{
int[] result = {i, j};
return result;
}
j++;
}
i++;
}
As you can see this obviously is an O(n^2) algorithm. Even in the program you have written the sorting will be something like O(nlogn).
So, the bottom line is if you want to reduce space complexity, it increases time complexity.