Consider the character string generated by the following rule:
F[0] = "A"
F[1] = "B"
...
F[n] = F[n-1] + F[n-2] with n > 1
Given two positive integers n and k, count the number of characters 'B' in the first k positions of the string F[n].
I came up with this idea and got time limit exceeded error:
import java.util.Scanner;

public class Solution {
    // F[i] holds the length of the string F[i]
    public static long[] F = new long[50];
    public static Scanner input = new Scanner(System.in);

    public static long count(int n, long k) {
        if (n == 0 || k == 0) return 0;
        else if (n == 1) return 1;
        else {
            if (k > F[n - 1]) return count(n - 1, F[n - 1]) + count(n - 2, k - F[n - 1]);
            else return count(n - 1, k);
        }
    }

    public static void main(String[] args) {
        F[0] = 1; F[1] = 1;
        for (int i = 2; i < 46; i++) F[i] = F[i - 2] + F[i - 1];
        int T = input.nextInt();
        while (T-- > 0) {
            int n = input.nextInt();
            long k = input.nextLong();
            System.out.println(count(n, k));
        }
    }
}
Can someone help me to improve time complexity? Seems my solution has O(n^2) time complexity.
Test case for this question:
Input:
4
0 1
1 1
3 2
7 7
Output:
0
1
1
4
There seems to be a pattern related to Fibonacci numbers:
A
B
AB            1 + 1 (A count + B count)
BAB           1 + 2
ABBAB         2 + 3
BABABBAB      3 + 5
ABBABBABABBAB 5 + 8
      ^ k = 7
BABABBAB      3 + 5
 ^ k = 2 (result = 3)
BAB           1 + 2
 ^ k = 2 (result = 4)
AB            1 + 1
^ k = 1 = A (result = 4)
Let g(l, r, k) represent the count of Bs in the first k positions of Fib[n] = l + r. Then:
g(l, r, k):
    if (1, 1) == (l, r):
        return 1 if k == 2 else 0
    if (1, 2) == (l, r):
        return 1 if k < 3 else 2
    ll, rl = getFibSummands(l)
    lr, rr = getFibSummands(r)
    if k > l:
        return rl + g(lr, rr, k - l)
    return g(ll, rl, k)
The answer above may have misinterpreted the order of concatenation, which should possibly start with BA; in that case, l and r need to be reversed.
A
B
BA            1 + 1 (B count + A count)
BAB           2 + 1
BABBA         3 + 2
BABBABAB      5 + 3
BABBABABBABBA 8 + 5
      ^ k = 7
BABBABAB
      ^ k = 7
BAB
 ^ k = 2 (result = 3)
BA
 ^ k = 2
A
^ k = 1 (result = 4)
g(l, r, k):
    if (1, 1) == (l, r):
        # the string is "BA", so any non-empty prefix contains exactly one B
        return 1 if k >= 1 else 0
    if (2, 1) == (l, r):
        return 1 if k < 3 else 2
    ll, rl = getFibSummands(l)
    lr, rr = getFibSummands(r)
    if k > l:
        return ll + g(lr, rr, k - l)
    return g(ll, rl, k)
For clarity, define Fib(n) to be the Fibonacci sequence where Fib(0) = Fib(1) = 1, and Fib(n+2) = Fib(n+1) + Fib(n).
Note that F[n] has Fib(n) characters and, for n >= 1, Fib(n-1) of them are B's.
Let C(n, k) be the number of B's in the first k characters of F[n].
The base cases are:
C(0, k) = 0
C(n, k) = 0 if k <= 0
C(1, 1) = 1
C(2, k) = 1 if k >= 1
In general, for n > 2:
C(n, k) = C(n-1, k) if k <= Fib(n-1)
= Fib(n-2) + C(n-1, k - Fib(n-1)) otherwise
This follows from F[n] = F[n-1] + F[n-2]: position k lies in either the first part or the second. The first part has Fib(n-1) characters, of which Fib(n-2) are B's. In the second case the remaining k - Fib(n-1) characters come from F[n-2]; because F[n-2] is a prefix of F[n-1] for n > 2, they can be counted with C(n-1, ...) just as well as with C(n-2, ...).
If you precompute the Fibonacci numbers from 0 to n, then you can compute C(n, k) in O(n) arithmetic operations.
You tagged Java, but here's a Python solution including test cases:
def C(n, k):
    # F[0] = "A" has no B's, and an empty prefix has none either.
    if n == 0 or k <= 0: return 0
    total = 0
    fib = [1, 1]
    # Only build Fibonacci numbers up to n, or until they exceed k.
    while len(fib) <= n and fib[-1] <= k:
        fib.append(fib[-1] + fib[-2])
    n = min(n, len(fib) - 1)
    while True:
        if n <= 2:
            return total + 1
        elif k <= fib[n-1]:
            n -= 1
        else:
            # Skip the whole first part F[n-1], which has fib[n-2] B's.
            total += fib[n-2]
            k -= fib[n-1]
            n -= 1

tcs = [
    ([0, 1], 0),
    ([1, 1], 1),
    ([3, 2], 1),
    ([7, 7], 4)
]

for (n, k), want in tcs:
    got = C(n, k)
    if got != want:
        print('C(%d, %d) = %d, want %d' % (n, k, got, want))
This includes an optimization which reduces n so that initially k > Fib(n-1). This makes the code O(min(n, log(k))) arithmetic operations.
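As a sanity check on the recurrence, the following sketch compares a compact version of it against a brute-force count on the explicitly built string for small n. The names `build_f`, `count_b_brute` and `count_b` are illustrative and not taken from the answers above; the sketch assumes F[n] = F[n-1] + F[n-2] and treats any k larger than the string as "the whole string".

```python
def build_f(n):
    """Build F[n] explicitly; only feasible for small n."""
    prev, cur = "A", "B"                 # F[0], F[1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev      # F[i] = F[i-1] + F[i-2]
    return cur


def count_b_brute(n, k):
    """Reference answer: slice the real string and count."""
    return build_f(n)[:k].count("B")


def count_b(n, k):
    """Count 'B' in the first k characters of F[n] with O(n) arithmetic steps."""
    if n == 0 or k <= 0:
        return 0
    fib = [1, 1]                         # fib[i] == len(F[i])
    while len(fib) <= n:
        fib.append(fib[-1] + fib[-2])
    k = min(k, fib[n])                   # assumption: clamp oversized k
    total = 0
    while n > 2:
        if k <= fib[n - 1]:
            n -= 1                       # prefix lies inside F[n-1]
        else:
            total += fib[n - 2]          # all B's of F[n-1]
            k -= fib[n - 1]
            n -= 2                       # the rest lies inside F[n-2]
    return total + 1                     # "B" and "BA" both contribute one B


if __name__ == "__main__":
    for n, k in [(0, 1), (1, 1), (3, 2), (7, 7)]:
        print(n, k, count_b(n, k))
```

Here the second branch steps to n - 2 directly, which is the literal reading of F[n] = F[n-1] + F[n-2]; stepping to n - 1 gives the same result for n > 2.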
I was going through a simple program that takes a number and counts the runs of consecutive numbers whose sum equals the given number.
For example:
if the input is 15, then the consecutive numbers that sum up to 15 are:
1,2,3,4,5
4,5,6
7,8
So the answer is 3 as we have 3 possibilities here.
When I was looking for a solution I found out below answer:
static long process(long input) {
    long count = 0;
    for (long j = 2; j < input/ 2; j++) {
        long temp = (j * (j + 1)) / 2;
        if (temp > input) {
            break;
        }
        if ((input- temp) % j == 0) {
            count++;
        }
    }
    return count;
}
I am not able to understand how this solves the requirement, because the program uses a formula that I do not understand properly. These are my doubts:
The for loop starts from 2; what is the reason for this?
long temp = (j * (j + 1)) / 2; What does this logic indicate? How is it helpful for solving the problem?
if ((input - temp) % j == 0) What does this indicate?
Please help me understand this solution.
I will try to explain this as simply as possible.
If the input is 15, then the consecutive numbers that sum up to 15 are:
{1,2,3,4,5} -> 5 numbers
{4,5,6} -> 3 numbers
{7,8} -> 2 numbers
A run of n consecutive positive numbers sums to at least the sum of the first n natural numbers, n*(n+1)/2, so that sum must not exceed the input.
So for the number 15, there can never be a combination of 6 consecutive numbers summing to 15, as the sum of the first 6 numbers is 21, which is greater than 15.
Calculate temp: this is (j*(j+1))/2.
Take an example. Let input = 15. Let j = 2.
temp = 2*3/2 = 3; #Meaning 1+2 = 3
For a 2-number pair, let the 2 terms be 'a+1' and 'a+2' (because we know that the numbers are consecutive).
Now, according to the question, the terms must add up to the number.
This means 2a+3 = 15;
And if (15-3) is divisible by 2, 'a' can be found: a=6 -> a+1=7 and a+2=8
Similarly, for 3 terms a+1, a+2 and a+3:
a + 1 + a + 2 + a + 3 = 15
3a + 6 = 15
(15-6) must be divisible by 3.
Finally, for 5 consecutive numbers a+1,a+2,a+3,a+4,a+5 , we have
5a + 15 = 15;
(15-15) must be divisible by 5.
So the count is incremented for j = 2, 3 and 5 when the input is 15.
If the loop were to start from 1, then we would also count the 1-number set {15}, which is not wanted.
To summarize:
1) The for loop starts from 2, what is the reason for this?
We are not interested in 1-number sets here.
2) long temp = (j * (j + 1)) / 2; What does this logic indicate? How is this helpful to solving the problem?
This is the sum of the first j natural numbers, as explained above using a+1 and a+2 as 2 consecutive numbers.
3) if ((input - temp) % j == 0) Also what does this indicate?
This checks that the input minus the sum of the first j natural numbers is divisible by j.
We need to find all pairs of a and n such that, for a given b, the following holds:
a + (a + 1) + (a + 2) + ... (a + (n - 1)) = b
The left side is an arithmetic progression and can be written as:
(a + (n - 1) / 2) * n = b (*)
To find the upper limit for n, note that a is a positive integer, so a >= 1 and therefore:
(1 + (n - 1) / 2) * n = n(n + 1) / 2 <= b
n(n + 1) <= 2b
n^2 + n + 1/4 <= 2b + 1/4
(n + 1/2)^2 <= 2b + 1/4
n <= sqrt(2b + 1/4) - 1/2
Now we can rewrite (*) to get formula for a:
a = b / n - (n - 1) / 2
Example for b = 15 and n = 3:
15 / 3 - (3 - 1) / 2 = 4 => 4 + 5 + 6 = 15
And now the code:
import java.util.stream.IntStream; // needed for the output formatting below

double b = 15;
for (double n = 2; n <= Math.ceil(Math.sqrt(2 * b + .25) - .5); n++) {
    double candidate = b / n - (n - 1) / 2;
    // candidate is a valid first term only if it is a whole number
    if (candidate == (int) candidate) {
        System.out.println("" + candidate
                + IntStream.range(1, (int) n).mapToObj(i -> " + " + (candidate + i)).reduce((s1, s2) -> s1 + s2).get()
                + " = " + b);
    }
}
The result is:
7.0 + 8.0 = 15.0
4.0 + 5.0 + 6.0 = 15.0
1.0 + 2.0 + 3.0 + 4.0 + 5.0 = 15.0
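The same derivation can also be carried out in integer arithmetic only, which avoids comparing floating-point values. The sketch below is an illustration in Python rather than a port of the code above; the function name `consecutive_runs` is made up for this example. It rewrites a = b / n - (n - 1) / 2 as 2b - n(n - 1) = 2na and tests divisibility.

```python
def consecutive_runs(b):
    """Return every run of at least 2 consecutive positive integers summing to b."""
    runs = []
    n = 2
    # n terms sum to at least 1 + 2 + ... + n = n(n + 1)/2, so stop once that exceeds b.
    while n * (n + 1) // 2 <= b:
        numerator = 2 * b - n * (n - 1)      # equals 2 * n * a
        if numerator % (2 * n) == 0:
            a = numerator // (2 * n)         # first term of the run
            runs.append(list(range(a, a + n)))
        n += 1
    return runs


if __name__ == "__main__":
    for run in consecutive_runs(15):
        print(" + ".join(map(str, run)), "= 15")
```

The loop condition plays the role of the bound n <= sqrt(2b + 1/4) - 1/2 without needing a square root.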
We are looking for series of consecutive numbers that sum up to the given number.
There can be at most one series of any given length, so we are really looking for the values that could be the length of such a series.
The variable 'j' is the length being tested. It starts from 2 because the series must be at least 2 long.
The variable 'temp' is the sum of the arithmetic progression from 1 to 'j'.
If there is a valid series, let X be its first element. Then 'input' = j*(X-1) + temp.
(So once temp > input, we are finished.)
The last condition checks whether this equation has an integer solution. If it does, the counter is increased, because there is a series of j elements that is a solution.
Actually the solution is wrong, because it won't find the solution when input = 3 (the loop terminates immediately). The loop should be:
for(long j=2;;j++)
The other condition terminates the loop sooner anyway.
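To see the effect of the loop bound concretely, here is a small Python sketch that ports the loop with the `input / 2` bound and a variant that relies only on the `temp > input` check. The function names `count_original` and `count_fixed` are made up for this illustration, and `//` is used to mirror Java's integer division for positive values.

```python
def count_original(value):
    """Port of the questioned loop, including its j < value / 2 bound."""
    count = 0
    j = 2
    while j < value // 2:
        temp = j * (j + 1) // 2          # 1 + 2 + ... + j
        if temp > value:
            break
        if (value - temp) % j == 0:
            count += 1
        j += 1
    return count


def count_fixed(value):
    """Same loop, terminated only by the temp > value check."""
    count = 0
    j = 2
    while True:
        temp = j * (j + 1) // 2
        if temp > value:
            break
        if (value - temp) % j == 0:
            count += 1
        j += 1
    return count


if __name__ == "__main__":
    for value in (3, 5, 6, 15):
        print(value, count_original(value), count_fixed(value))
```

For small inputs such as 3, 5 and 6 the two versions disagree, because the `value // 2` bound stops the loop before a valid length is tested; for 15 they agree.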
NB: the loop starts from 2 because for j = 1, (1*(1+1))/2 == 1 describes a single number, which is not a run of consecutive numbers and so does not contribute to the count.
Let k = 21;
the loop condition j < k/2 then bounds j by 10;
temp = (j*(j+1))/2 => 3 when j = 2, 6 when j = 3, and so on (it is the sum of the first j natural numbers)
temp > k => breaks the loop, because there is no need to keep iterating once that sum exceeds k
((k-temp)%j) == 0 => true when the input minus the sum of the first j natural numbers is divisible by j; if so, the count is incremented, which gives the total number of such runs.
public static long process(long input) {
    long count = 0, rest_of_sum;
    for (long length = 2; length < input / 2; length++) {
        long partial_sum = (length * (length + 1)) / 2;
        if (partial_sum > input) {
            break;
        }
        rest_of_sum = input - partial_sum;
        if (rest_of_sum % length == 0)
            count++;
    }
    return count;
}
input - the given input number; here it is 15
length - the length of the run of consecutive numbers; at least 2 and, in this code, at most input/2
partial_sum - the sum of the numbers from 1 to length (which is a*(a+1)/2 for the numbers 1 to a); treat this as a partial sequence
rest_of_sum - the balance left over from input
If rest_of_sum is a multiple of length, we can add (rest_of_sum/length) to every term of the partial sequence.
Let's call (rest_of_sum/length) k.
This means we can build a sequence that sums up to our input number,
starting with (k+1), (k+2), ... (k+length)
This can be validated:
(k+1) + (k+2) + ... (k+length)
can be rewritten as k + k + ... (length times) + (1+2+3..length)
which reduces to k*length + partial_sum
which equals input (since rest_of_sum = k*length)
So the idea is to increment count every time we find a length that satisfies this condition.
If you put this tweak in, it may fix the code. I have not tested it extensively. It's an odd one, but it puts the code through extra iterations to fix the early miscalculations. Even 1/20000 would work, because in Java both 1/2 and 1/20000 are integer divisions that evaluate to 0, so the condition is effectively j < input; what matters is that the bound is no longer input/2:
for (long j = 2; j < input+ (1/2); j++) {
In essence, you only need to know one formula:
The sum of the numbers m..n (m to n, where n > m in the code)
This is ((n-m+1)*(n+m))/2
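As a quick check of this formula, the following Python sketch compares it with a direct summation. The function name `sum_m_to_n` mirrors the Java helper used later in this answer, but the Python version is only an illustration.

```python
def sum_m_to_n(m, n):
    """Sum of the integers m, m+1, ..., n via (number of terms) * (first + last) / 2."""
    return (n - m + 1) * (n + m) // 2


if __name__ == "__main__":
    print(sum_m_to_n(4, 6))   # 4 + 5 + 6
    print(sum_m_to_n(1, 5))   # 1 + 2 + 3 + 4 + 5
```

The integer division is exact because either the number of terms or the sum of the first and last term is even.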
As I have commented already the code in the original question was bugged.
See here.
Try feeding it 3. That has 1 occurrence, the consecutive numbers 1,2. It yields 0.
Or 5. That has 2,3, so it should yield 1 too, but gives 0.
Or 6. That has 1,2,3, so it should yield 1 too, but gives 0.
In your original code, temp or (j * (j + 1)) / 2 represented the sum of the numbers 1 to j.
1 2 3 4 5
5 4 3 2 1
=======
6 6 6 6 6 => (5 x 6) /2 => 30/2 => 15
As I have shown in the code below - use System.out.println(); to spew out debugging info.
If you want to perfect it, make sure the upper limits of m and n are half of i and half of i+1 respectively, rounding down if odd, e.g. i=15 -> m=7 & n=8.
The code:
class Playground {
    private static class CountRes {
        String ranges;
        long count;

        CountRes(String ranges, long count) {
            this.ranges = ranges;
            this.count = count;
        }

        String getRanges() {
            return this.ranges;
        }

        long getCount() {
            return this.count;
        }
    }

    static long sumMtoN(long m, long n) {
        return ((n-m+1)* (n+m))/2;
    }

    static Playground.CountRes countConsecutiveSums(long i, boolean d) {
        long count = 0;
        StringBuilder res = new StringBuilder("[");
        for (long m = 1; m< 10; m++) {
            for (long n = m+1; n<=10; n++) {
                long r = Playground.sumMtoN(m,n);
                if (d) {
                    System.out.println(String.format("%d..%d %d",m,n, r));
                }
                if (i == r) {
                    count++;
                    StringBuilder s = new StringBuilder(String.format("[%d..%d], ",m,n));
                    res.append(s);
                }
            }
        }
        if (res.length() > 2) {
            res = new StringBuilder(res.substring(0,res.length()-2));
        }
        res.append("]");
        return new CountRes(res.toString(), count);
    }

    public static void main(String[ ] args) {
        Playground.CountRes o = countConsecutiveSums(3, true);
        for (long i=3; i<=15; i++) {
            o = Playground.countConsecutiveSums(i,false);
            System.out.println(String.format("i: %d Count: %d Instances: %s", i, o.getCount(), o.getRanges()));
        }
    }
}
The output:
1..2 3
1..3 6
1..4 10
1..5 15
1..6 21
1..7 28
1..8 36
1..9 45
1..10 55
2..3 5
2..4 9
2..5 14
2..6 20
2..7 27
2..8 35
2..9 44
2..10 54
3..4 7
3..5 12
3..6 18
3..7 25
3..8 33
3..9 42
3..10 52
4..5 9
4..6 15
4..7 22
4..8 30
4..9 39
4..10 49
5..6 11
5..7 18
5..8 26
5..9 35
5..10 45
6..7 13
6..8 21
6..9 30
6..10 40
7..8 15
7..9 24
7..10 34
8..9 17
8..10 27
9..10 19
i: 3 Count: 1 Instances: [[1..2]]
i: 4 Count: 0 Instances: []
i: 5 Count: 1 Instances: [[2..3]]
i: 6 Count: 1 Instances: [[1..3]]
i: 7 Count: 1 Instances: [[3..4]]
i: 8 Count: 0 Instances: []
i: 9 Count: 2 Instances: [[2..4], [4..5]]
i: 10 Count: 1 Instances: [[1..4]]
i: 11 Count: 1 Instances: [[5..6]]
i: 12 Count: 1 Instances: [[3..5]]
i: 13 Count: 1 Instances: [[6..7]]
i: 14 Count: 1 Instances: [[2..5]]
i: 15 Count: 3 Instances: [[1..5], [4..6], [7..8]]
I am trying to learn about the time complexity of algorithms. My professor has pushed it beyond Big O and wants us to be able to derive a mathematical function from an algorithm. I am having a hard time conceptualizing how this is done and am looking for help. In my class notes, a selection sort algorithm was provided (shown in the code below). The notes asked the following question: "Derive a function f(n) that corresponds to the total number of times that minIndex or any position of nums is modified in the worst case." The notes tell me the answer is f(n) = 1/2n^2 + 5/2n + 3. I was wondering if anyone could explain how this comes about.
My professor told us to count the operations in the inner loop and work our way out. I believe that in the worst case the if statement in the inner loop always executes, which means the loop runs n-i-1 times; I get this value by taking n (the bound the for loop has to stay below) and subtracting the starting value (i+1). Then I look at the outer loop and see that it goes from i until it reaches n-1, so it could be written as (n-1)-i or, just like the inner loop, n-i-1. Looking further, there are three modifications in the outer loop, so we get (n-i-1)+3 (could I write it as n-i+2?).
The number of modifications in the worst case for the inner loop:
n-i-1
The number of modifications in the worst case for the outer loop:
(n-i-1)+3
Now I would like to know how to get from these two counts to f(n) = 1/2n^2 + 5/2n + 3.
public static void selectionSort(int[] nums) {
    int n = nums.length;
    int minIndex;
    for(int i = 0; i < n-1; i++) {
        //find the index of the min number.
        minIndex = i;
        for(int j = i+1; j < n; j++) {
            if(nums[j] < nums[minIndex]) {
                minIndex = j;
            }
            int temp = nums[i];
            nums[i] = nums[minIndex];
            nums[minIndex] = temp;
        }
    }
}
How many times does outer loop run?
n - 1 times.
How many times does inner loop run, for each iteration of outer loop?
From n - 1 times down to 1 time, as outer loop progresses, so on average:
((n - 1) + 1) / 2 = n / 2 times.
So, how many times does inner loop run in total?
(n - 1) * (n / 2) = n^2 / 2 - n / 2 times.
How many times is minIndex modified?
Once per outer loop + once per inner loop:
(n - 1) + (n^2 / 2 - n / 2) = n^2 / 2 + n / 2 - 1 times.
How many times is a position of nums modified?
Twice per inner loop:
2 * (n^2 / 2 - n / 2) = n^2 - n times.
What is total number of modifications?
(n^2 / 2 + n / 2 - 1) + (n^2 - n) = (3*n^2 - n) / 2 - 1 times.
Or 1½n² - ½n - 1
That's not the same answer as the one you said your notes have, so let's check it.
First, we add debug printing, i.e. we print every modification together with a running modification number.
public static void selectionSort(int[] nums) {
    int mod = 0;
    int n = nums.length;
    int minIndex;
    for(int i = 0; i < n-1; i++) {
        //find the index of the min number.
        minIndex = i; System.out.printf("%2d: minIndex = %d%n", ++mod, i);
        for(int j = i+1; j < n; j++) {
            if(nums[j] < nums[minIndex]) {
                minIndex = j; System.out.printf("%2d: minIndex = %d%n", ++mod, j);
            }
            int temp = nums[i];
            nums[i] = nums[minIndex]; System.out.printf("%2d: nums[%d] = %d%n", ++mod, i, nums[minIndex]);
            nums[minIndex] = temp; System.out.printf("%2d: nums[%d] = %d%n", ++mod, minIndex, temp);
        }
    }
}
The worst case is sorting an array of descending numbers, so let's try 3 numbers:
int[] nums = { 3, 2, 1 };
selectionSort(nums);
System.out.println(Arrays.toString(nums));
We'd expect (3*n^2 - n) / 2 - 1 = (3*3^2 - 3) / 2 - 1 = 24 / 2 - 1 = 11 modifications.
1: minIndex = 0
2: minIndex = 1
3: nums[0] = 2
4: nums[1] = 3
5: minIndex = 2
6: nums[0] = 1
7: nums[2] = 2
8: minIndex = 1
9: minIndex = 2
10: nums[1] = 2
11: nums[2] = 3
[1, 2, 3]
Yup, 11 modifications.
Let's try 9:
int[] nums = { 9, 8, 7, 6, 5, 4, 3, 2, 1 };
(3*n^2 - n) / 2 - 1 = (3*9^2 - 9) / 2 - 1 = 234 / 2 - 1 = 116 modifications.
1: minIndex = 0
2: minIndex = 1
3: nums[0] = 8
4: nums[1] = 9
5: minIndex = 2
6: nums[0] = 7
. . .
111: nums[6] = 7
112: nums[8] = 8
113: minIndex = 7
114: minIndex = 8
115: nums[7] = 8
116: nums[8] = 9
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Yup, 116 modifications.
Formula verified by empirical evidence:
f(n) = (3*n^2 - n) / 2 - 1
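The same empirical check can be automated for many array sizes. The sketch below is a Python port of the posted selection sort (with the swap inside the inner loop, exactly as in the question) that counts modifications instead of printing them; `count_modifications` and `predicted` are names made up for this illustration.

```python
def count_modifications(values):
    """Run the posted selection sort and count writes to min_index and to nums."""
    nums = list(values)
    mods = 0
    n = len(nums)
    for i in range(n - 1):
        min_index = i
        mods += 1                                  # minIndex = i
        for j in range(i + 1, n):
            if nums[j] < nums[min_index]:
                min_index = j
                mods += 1                          # minIndex = j
            # The swap sits inside the inner loop, as in the question's code.
            nums[i], nums[min_index] = nums[min_index], nums[i]
            mods += 2                              # two positions of nums written
    return mods


def predicted(n):
    """f(n) = (3*n^2 - n) / 2 - 1 from the derivation above."""
    return (3 * n * n - n) // 2 - 1


if __name__ == "__main__":
    for n in (3, 9, 20):
        descending = range(n, 0, -1)
        print(n, count_modifications(descending), predicted(n))
```

Descending input is used because, for this version of the code, the comparison then succeeds on every inner iteration.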
I am currently trying to find the largest number of consecutive odd integers that add up to a target number.
My current code to find 3 consecutive integers looks like this:
public class consecutiveOdd {
    public static void main(String[] args){
        int target = 160701;
        boolean found = false;
        for(int i = 1; i < target; i++){
            if(i + (i+2) + (i+4) == target){
                System.out.print(i + " + " + (i+2) + " + " + (i+4));
                found = true;
            }
        }
        if(!found){
            System.out.println("Sorry none");
        }
    }
}
I am thinking there will need to be a while loop building iterations of (i+2) increments but am having trouble with developing a correct algorithm. Any help or tips will be much appreciated!
Best,
Otterman
Let's say that the answer is equal to k (k > 0). Then for some odd i we can write: i + (i + 2) + (i + 4) + ... + (i + 2k - 2) = target. This is the sum of an arithmetic progression, so the well-known formula applies: k * i + k * (k - 1) = target. Solving for i gives:
i = target/k - k + 1.
Based on this formula, I would suggest the following algorithm:
Iterate over the values of k.
If target/k - k + 1 is a positive odd integer, update the answer.
A simple implementation:
int answer = -1;
for (int k = 1;; k++) {
    int i = target / k - k + 1;
    if (i <= 0) {
        break;
    }
    // Check if calculated i, can be the start of 'odd' sequence.
    if (target % k == 0 && i % 2 == 1) {
        answer = k;
    }
}
The running time of this algorithm is O(sqrt(target)).
Looking at the pattern:
For 1 summand, i = target
For 2 summands, the equation is 2*i + 2 = target, so i = (target - 2) / 2
For 3 summands, the equation is 3*i + 6 = target, so i = (target - 6) / 3
For 4 summands, the equation is 4*i + 12 = target, so i = (target - 12) / 4
etc. Clearly i must be an odd integer in all cases.
You could work out the general expression for n summands, and simplify it to show you an algorithm, but you might be able to see an algorithm already...
Applying @rossum's suggestion (writing the odd start value as i = 2m + 1):
For 1 summand, 2m + 1 = target
For 2 summands, 2m + 1 = (target - 2) / 2, so m = (target - 4) / 4
For 3 summands, 2m + 1 = (target - 6) / 3, so m = (target - 9) / 6
For 4 summands, 2m + 1 = (target - 12) / 4, so m = (target - 16) / 8
The sum of a sequence of n odd integers, can be calculated as the average value (midpoint m) multiplied by the number of values (n), so:
sum = 5 + 7 + 9 = m * n = 7 * 3 = 21
sum = 5 + 7 + 9 + 11 = m * n = 8 * 4 = 32
If n is odd then m will be odd, and if n is even then m will be even.
The first and last numbers of the sequence can be calculated as:
first = m - n + 1 = 8 - 4 + 1 = 5
last = m + n - 1 = 8 + 4 - 1 = 11
Other interesting formulas:
m = sum / n
m = (first + last) / 2
last = first + (n - 1) * 2 = first + 2 * n - 2
m = (first + first + 2 * n - 2) / 2 = first + n - 1
The longest sequence would have to start with the lowest possible first value, meaning 1, so we get:
sum = m * n = (first + n - 1) * n = n * n
Which means that the longest sequence of any given sum can at most be sqrt(sum) long.
So starting at sqrt(sum), and searching down until we find a valid n:
/**
 * Returns length of sequence, or 0 if no sequence can be found
 */
private static int findLongestConsecutiveOddIntegers(int sum) {
    for (int n = (int)Math.sqrt(sum); n > 1; n--) {
        if (sum % n == 0) { // m must be an integer
            int m = sum / n;
            if ((n & 1) == (m & 1)) // If n is odd, m must be odd. If n is even, m must be even.
                return n;
        }
    }
    return 0;
}
Result:
n = findLongestConsecutiveOddIntegers(160701) = 391
m = sum / n = 160701 / 391 = 411
first = m - n + 1 = 411 - 391 + 1 = 21
last = m + n - 1 = 411 + 391 - 1 = 801
Since sqrt(160701) = 400.875..., the result was found in 10 iterations (400 to 391, inclusive).
Conclusion:
Largest Amount of Consecutive Odd Integers to Equal 160701: 391
21 + 23 + 25 + ... + 799 + 801 = 160701
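The result can be double-checked by actually building the sequence. The following Python sketch ports the search above and returns the run itself; the name `longest_odd_run` is made up for this illustration, and `math.isqrt` (Python 3.8+) is used in place of `(int)Math.sqrt(sum)`.

```python
import math


def longest_odd_run(total):
    """Return the longest run (length > 1) of consecutive odd integers summing to total."""
    for n in range(math.isqrt(total), 1, -1):
        # The midpoint m = total / n must be an integer with the same parity as n.
        if total % n == 0 and (total // n) % 2 == n % 2:
            first = total // n - n + 1           # first = m - n + 1
            return list(range(first, first + 2 * n, 2))
    return []                                    # no run of length > 1 exists


if __name__ == "__main__":
    run = longest_odd_run(160701)
    print(len(run), run[0], run[-1], sum(run))
```

Summing the returned list confirms both the length and the end points computed by hand above.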
I'm having a great deal of trouble trying to figure this question out, and the root of that trouble is creating an algorithm of O(n) complexity. Here's the question I'm struggling with:
An Array A of length n contains integers from the range [0, .., n - 1]. However, it only contains n - 1 distinct numbers. So one of the numbers is missing and another number is duplicated. Write a Java method that takes A as an input argument and returns the missing number; the method should run in O(n).
For example, when A = [0, 2, 1, 2, 4], oddOneOut() should return 3; when A = [3, 0, 0, 4, 2, 1], oddOneOut() should return 5.
Obviously this is an easy problem to solve with an O(n^2) algorithm (and most likely with an O(n) one; I'm just not seeing it!). I've attempted to solve it using all manner of methods, but to no avail. I'm attempting to solve it in Java, but if you're more comfortable solving it in Python, that would be fine as well.
Thank you in advance...
Suppose the number missing is x and the duplicate is y. If you add all numbers, the sum will be:
(n - 1) * n / 2 - x + y
From the above, you can find (x - y).....(1)
Similarly, sum the squares of the numbers. The sum will then be:
(n - 1) * n * (2 * n - 1) / 6 - x^2 + y^2
From the above you get (x^2 - y^2)....(2)
(2) / (1) gives (x + y).....(3)
(1) + (3) gives 2 * x and you can thereby find x and y.
Note that this solution uses O(1) extra storage and has O(n) time complexity. The other solutions use O(n) extra storage unnecessarily.
Code in mixed C/C++ for some more clarity:
#include <stdio.h>

int findDup(int *arr, int n, int& dup, int& missing)
{
    int sum = 0;
    int squares = 0;

    for (int i = 0; i < n; i++) {
        sum += arr[i];
        squares += arr[i] * arr[i];
    }

    sum = (n - 1) * n / 2 - sum; // x - y
    squares = (n - 1) * n * (2 * (n - 1) + 1) / 6 - squares; // x^2 - y^2

    if (sum == 0) {
        // no duplicates
        missing = dup = 0;
        return -1;
    }
    missing = (squares / sum + sum) / 2; // ((x^2 - y^2) / (x - y) + (x - y)) / 2 = ((x + y) + (x - y)) / 2 = x
    dup = missing - sum; // x - (x - y) = y
    return 0;
}

int main(int argc, char *argv[])
{
    int dup = 0;
    int missing = 0;

    int a[] = {0, 2, 1, 2, 4};
    findDup(a, sizeof(a) / sizeof(int), dup, missing);
    printf("dup = [%d], missing = [%d]\n", dup, missing);

    int b[] = {3, 0, 0, 4, 2, 1};
    findDup(b, sizeof(b) / sizeof(int), dup, missing);
    printf("dup = [%d], missing = [%d]\n", dup, missing);

    return 0;
}
Output:
dup = [2], missing = [3]
dup = [0], missing = [5]
Some Python code (written for Python 3; n is taken from the argument rather than from a global variable):
def finddup(lst):
    sum = 0
    sumsq = 0
    missing = 0
    dup = 0
    for item in lst:
        sum = sum + item
        sumsq = sumsq + item * item
    n = len(lst)
    sum = (n - 1) * n // 2 - sum  # x - y
    sumsq = (n - 1) * n * (2 * (n - 1) + 1) // 6 - sumsq  # x^2 - y^2
    if sum == 0:
        return [-1, missing, dup]
    missing = ((sumsq // sum) + sum) // 2
    dup = missing - sum
    return [0, missing, dup]

found, missing, dup = finddup([0, 2, 1, 2, 4])
if found != -1:
    print("dup = " + str(dup) + " missing = " + str(missing))
print(finddup([3, 0, 0, 4, 2, 1]))
Outputs:
dup = 2 missing = 3
[0, 5, 0]
Iterate over the array twice: That is still O(n). Create a temporary array of booleans (or a Java BitSet) to hold which numbers you got. Second time you do the loop, check if there is a hole in the array of booleans.
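A minimal Python sketch of this two-pass idea follows. The function name `odd_one_out_flags` is made up for this illustration, and the code assumes the input really contains values in [0, n - 1] with exactly one value missing.

```python
def odd_one_out_flags(a):
    """Return the missing number using a temporary array of booleans."""
    seen = [False] * len(a)
    for value in a:                 # first pass: mark every value that occurs
        seen[value] = True
    for number, present in enumerate(seen):
        if not present:             # second pass: the hole is the missing number
            return number
    return -1                       # assumption violated: nothing is missing


if __name__ == "__main__":
    print(odd_one_out_flags([0, 2, 1, 2, 4]))
    print(odd_one_out_flags([3, 0, 0, 4, 2, 1]))
```

Both passes are linear, at the cost of O(n) extra storage for the flags.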
Use a hash set and take a single pass to detect which number is the duplicate. During the same iteration, track the cumulative sum of all the numbers.
Now calculate the expected total if all the numbers were distinct: n * (n - 1) / 2. Subtract the total you found. You will be left with the "missing" number minus the duplicate. Add the duplicate back to get your answer.
Since hash table access is constant time and we're using a single pass, this is O(n). (Note that a single pass isn't strictly necessary: Martijn is correct in noting that a fixed number of passes is still linear complexity.)
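The hash-set approach described here might look like the following Python sketch. The function name `odd_one_out_sum` is made up for this illustration, and the code assumes exactly one value is duplicated; without a duplicate, the final addition would fail.

```python
def odd_one_out_sum(a):
    """Return the missing number using a set for the duplicate and a running sum."""
    seen = set()
    duplicate = None
    total = 0
    for value in a:                       # single pass over the array
        if value in seen:
            duplicate = value             # second occurrence of some value
        seen.add(value)
        total += value
    n = len(a)
    expected = n * (n - 1) // 2           # 0 + 1 + ... + (n - 1)
    # expected - total equals missing - duplicate, so add the duplicate back.
    return expected - total + duplicate


if __name__ == "__main__":
    print(odd_one_out_sum([0, 2, 1, 2, 4]))
    print(odd_one_out_sum([3, 0, 0, 4, 2, 1]))
```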
This might be of interest, although I'm not certain under what conditions (if any) it performs best. The idea is that we're going to move each element into its correct place in the array (0 to index 0, etc), until it becomes clear what is missing and what is extra.
def findmissing(data):
    upto = 0
    gap = -1
    while upto < len(data):
        #print(data, gap)
        if data[upto] == upto:
            # value already sits at its own index
            upto += 1
            continue
        idx = data[upto]
        if idx is None:
            upto += 1
            continue
        # move the value at upto to its own index
        data[upto], data[idx] = data[idx], data[upto]
        if data[upto] == data[idx]:
            print('found dupe, it is', data[upto])
            data[upto] = None
            gap = upto
            upto += 1
        elif data[upto] is None:
            gap = upto
    return gap

if __name__ == '__main__':
    data = list(range(1000))
    import random
    missing = random.choice(data)
    print(missing)
    data[missing] = data[0]
    data[0] = random.choice(data[1:])
    random.shuffle(data)
    print('gap is', findmissing(data))
It's O(n) because every step either increments upto or moves a value into its "correct" place in the array, and each of those things can only happen n times.