Performance when searching for missing numbers in array - java

Given is a list containing all but 2 numbers between 1-20 (randomly ordered).
I need to find those 2 numbers.
This is the (working) program I came up with:
public static void main(String[] args) {
int[] x= {1,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19};
ArrayList al= new ArrayList();
Map map= new HashMap();
for(int i=0;i<x.length;i++)
{
map.put(x[i], x[i]);
}
for(int i=1;i<=20;i++)
{
if(map.get(i)==null)
al.add(i);
}
for(int i=0;i<al.size();i++)
{
System.out.println(al.get(i));
}
}
I would like to know whether the program is good from a performance point of view (memory use and Big-O time complexity).

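You don't need a map. An additional boolean array is enough; give it size 21 so that the indices 1 to 20 can be used directly.
boolean[] arr = new boolean[21]; // index 0 stays unused
for (int i = 0; i < input.length; i++)
    arr[input[i]] = true;
for (int i = 1; i <= 20; i++)
    if (!arr[i]) {
        // number `i` is missing
    }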
Now I will present a straightforward math solution.
First sum all numbers in the array. For example, suppose the array holds 5, 1, 4 out of the numbers 1, 2, 3, 4, 5, so 2 and 3 are missing. We can find them easily with math.
5 + 1 + 4 = 10
1 + 2 + 3 + 4 + 5 = 15
So we know x + y = 15 - 10 = 5
Now we will get a second equation:
1 * 4 * 5 = 20
1 * 2 * 3 * 4 * 5 = 120
=> x * y = 120 / 20 = 6
So:
x + y = 5
x * y = 6
=> x = 2, y = 3 or x = 3, y = 2 which is the same.
So x = 2, y = 3
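The derivation above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class and method names are made up for this example, the input is assumed to contain every number from 1 to n except exactly two, and n is assumed to be at most 20 so that n! still fits in a long.

```java
import java.util.Arrays;

final class TwoMissingBySumAndProduct {
    /** Returns the two numbers missing from 1..n, smaller first. Assumes n <= 20. */
    static int[] findMissing(int[] present, int n) {
        long sumAll = (long) n * (n + 1) / 2;   // 1 + 2 + ... + n
        long productAll = 1;                    // n!, fits in a long only up to n = 20
        for (int i = 2; i <= n; i++) {
            productAll *= i;
        }
        long sum = 0;
        long product = 1;
        for (int v : present) {
            sum += v;
            product *= v;
        }
        long s = sumAll - sum;          // x + y
        long p = productAll / product;  // x * y
        // x and y are the roots of t^2 - s*t + p = 0; the discriminant is (x - y)^2
        long root = Math.round(Math.sqrt((double) (s * s - 4 * p)));
        return new int[] {(int) ((s - root) / 2), (int) ((s + root) / 2)};
    }

    public static void main(String[] args) {
        int[] x = {1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19};
        System.out.println(Arrays.toString(findMissing(x, 20))); // prints [2, 20]
    }
}
```

For the small example in the text, findMissing(new int[] {5, 1, 4}, 5) yields 2 and 3.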

Another option would be to use a BitSet where each bit is set for the corresponding number in the array.
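A minimal sketch of the BitSet idea, assuming the values lie in 1..n; the class and method names are invented for the example. java.util.BitSet provides set(int) and nextClearBit(int), which is all that is needed here.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class MissingWithBitSet {
    /** Returns all numbers from 1..n that do not occur in present, in ascending order. */
    static List<Integer> findMissing(int[] present, int n) {
        BitSet seen = new BitSet(n + 1);
        for (int v : present) {
            seen.set(v); // mark each number that is present
        }
        List<Integer> missing = new ArrayList<>();
        // nextClearBit(from) returns the first index >= from whose bit is not set
        for (int i = seen.nextClearBit(1); i <= n; i = seen.nextClearBit(i + 1)) {
            missing.add(i);
        }
        return missing;
    }
}
```

Compared with a boolean[] this stores one bit per number rather than one array element per number, which only matters for ranges much larger than 1-20.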

Your code runs in O(n): each map.put is O(1) on average and is executed once per element. The for loop below it also runs in O(n) in the worst case, so in total your whole function runs in O(n).
You can optimise your code further. For example, you are using additional memory. To avoid that, you need to come up with 2 equations from which the missing numbers x and y can be deduced.
Eqn1:
Summation(1 till 20) = n*(n+1)/2
Add all numbers in array and store in temp.
x+y = n*(n+1)/2 - temp
Eqn2:
Multiply(1 till 20) = n!
Multiply all numbers in array and store in temp.
x*y = n! / temp
Solve the equations to get x and y.
So this runs in O(n) with only O(1) extra memory. Note that 20! still fits in a 64-bit long, but the product overflows for larger n.
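Because the factorial grows so quickly, here is a hedged variant that is not part of the answer above: it swaps the product equation for a sum-of-squares equation, which stays well within a long for much larger n. It uses x*y = ((x + y)^2 - (x^2 + y^2)) / 2. The class and method names are assumptions made for this sketch.

```java
final class TwoMissingBySumOfSquares {
    /** Returns the two numbers missing from 1..n, smaller first. */
    static int[] findMissing(int[] present, int n) {
        long s = (long) n * (n + 1) / 2;                 // 1 + 2 + ... + n
        long q = (long) n * (n + 1) * (2L * n + 1) / 6;  // 1^2 + 2^2 + ... + n^2
        for (int v : present) {
            s -= v;             // ends up as x + y
            q -= (long) v * v;  // ends up as x^2 + y^2
        }
        long xy = (s * s - q) / 2;
        // s^2 - 4xy equals (x - y)^2, so its square root is the gap between x and y
        long root = Math.round(Math.sqrt((double) (s * s - 4 * xy)));
        return new int[] {(int) ((s - root) / 2), (int) ((s + root) / 2)};
    }
}
```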

It should not be anything worse than linear time O(n): one run to populate a flag array, and a second run to check the flag array.

Related

My code returns an error when trying to split a vector

I have a problem with my Java code that I can't solve. I have a Vector with 3,529 .txt files and I want to split it into 6 other vectors. I'm using the code below to split the vector. What am I doing wrong?
int cut = filesToProcess.size() / 6;
List<List<File>> arrays = new ArrayList<>();
for (int i = 0; i < filesToProcess.size(); i = i + cut) {
arrays.add(filesToProcess.subList(i, i + cut)); //line 35
}
arrays.add(filesToProcess.subList(i, i + cut));
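In your for statement the condition is i < filesToProcess.size(), so on the last iteration i + cut can be larger than the size of the list and subList goes out of range.
One way to solve this is to change the condition of the for statement to:
i + cut <= filesToProcess.size()
Edit: as a passer-by pointed out, integer division discards the remainder, so the list cannot be cut perfectly: this loop leaves out the last few elements when the size is not a multiple of 6.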
There are two issues with your approach:
3529 is not evenly divisible by 6, so with integer division (cut = 588) you'll end up with 7 vectors instead of 6. You need to round your cut up with int cut = (int) Math.ceil((double) filesToProcess.size() / 6);
When the division is uneven, your last vector won't have the same number of items as the others, so you will always get an IndexOutOfBoundsException. This is clearly stated in your exception: "toIndex: 4116", where .subList() is trying to get items up to index 4116. To fix this, you need to check the "to" part of .subList() to make sure it's not too large.
int cut = (int) Math.ceil((double) filesToProcess.size() / 6);
List<List<File>> arrays = new ArrayList<>();
for (int i = 0; i < filesToProcess.size(); i = i + cut) {
arrays.add(filesToProcess.subList(i, Math.min(i + cut, filesToProcess.size())));
}
With this, you should get the following split:
Vector = 1 => size = 589
Vector = 2 => size = 589
Vector = 3 => size = 589
Vector = 4 => size = 589
Vector = 5 => size = 589
Vector = 6 => size = 584
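As a self-contained sketch of the fix above, the splitting can be wrapped in a small generic helper. The names ListSplitter and split are assumptions made for this example, and Integer elements stand in for the File objects of the question.

```java
import java.util.ArrayList;
import java.util.List;

final class ListSplitter {
    /** Splits items into consecutive chunks of ceil(size / parts) elements; the last chunk may be shorter. */
    static <T> List<List<T>> split(List<T> items, int parts) {
        int cut = (items.size() + parts - 1) / parts; // integer ceiling of size / parts
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < items.size(); i += cut) {
            // clamp the upper bound so subList never runs past the end of the list
            chunks.add(items.subList(i, Math.min(i + cut, items.size())));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> files = new ArrayList<>();
        for (int i = 0; i < 3529; i++) {
            files.add(i);
        }
        for (List<Integer> chunk : split(files, 6)) {
            System.out.println(chunk.size()); // prints 589 five times, then 584
        }
    }
}
```

Two caveats: subList returns views backed by the original list, so copy each chunk if the original will be modified later, and rounding the chunk size up can yield fewer than the requested number of chunks for small lists (7 items in 6 parts gives 4 chunks of at most 2).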

Meaning of the formula how to find lost element in array?

The task is to find the lost element in the array. I understand the logic of the solution, but I don't understand how this formula works.
Here is the solution
int[] array = new int[]{4,1,2,3,5,8,6};
int size = array.length;
int result = (size + 1) * (size + 2)/2;
for (int i : array){
result -= i;
}
But why do we add 1 to the total size and multiply it by (total size + 2) / 2? All the resources I found just use that formula, but nobody explains how it works.
The sum of the integers 1 through n is equal to n(n+1)/2.
e.g. for 1,2,3,4,5: 5*6/2 = 15.
But this is just a quick way to add up the numbers from 1 to n. Here is what is really going on.
The formula computes the sum of 1 to n as if all of the numbers were present. Subtracting each number in the array from that sum leaves the missing number as the remainder.
The formula for an arithmetic series of integers from k to n, where adjacent elements differ by 1, is
S[k,n] = (n-k+1)(n+k)/2
Example: k = 5, n = 10
S[k,n] = 5 + 6 + 7 + 8 + 9 + 10
S[k,n] = 10 + 9 + 8 + 7 + 6 + 5
Adding the two rows column by column gives six pairs that each sum to 15:
2S[k,n] = (10-5+1)*(10+5) = 6 * 15 = 90
S[k,n] = 90 / 2 = 45
For any single number missing from the sequence, by subtracting the others from the sum of 45, the remainder will be the missing number.
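The general k-to-n formula can be turned into code roughly like this. It is a sketch that assumes exactly one number from k..n is absent from the array; the class and method names are illustrative.

```java
final class MissingInRange {
    /** Returns the single number from k..n that does not occur in present. */
    static int findMissing(int[] present, int k, int n) {
        // S[k,n] = (n - k + 1)(n + k) / 2, the sum if every number were present
        long remaining = (long) (n - k + 1) * (n + k) / 2;
        for (int v : present) {
            remaining -= v; // whatever is left at the end is the missing number
        }
        return (int) remaining;
    }
}
```

With k = 5 and n = 10, findMissing(new int[] {5, 6, 7, 9, 10}, 5, 10) starts from 45, subtracts 37 and returns 8. With k = 1 and n = size + 1 it reduces to the (size + 1) * (size + 2) / 2 code from the question.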
Let's say you currently have n elements in your array. You know that one element is missing, which means that the actual size of your array should be n + 1.
Now, you just need to calculate the sum 1 + 2 + ... + n + (n+1).
A handy formula for computing the sum of all integers from 1 up to k is given by k(k+1)/2.
By just replacing k with n+1, you get the formula (n+1)(n+2)/2.
It's simple mathematics.
Sum of first n natural numbers = n*(n+1)/2.
Number of elements in array = size of array.
So, in this case n = size + 1
So, after finding the sum, we are subtracting all the numbers from array individually and we are left with the missing number.
Broken sequence vs full sequence
But why we add 1 to total size and multiply it to total size + 2 /2 ?
The number of elements stored in your array is one less than the maximal number, as the sequence is missing one element.
Check your example:
4, 1, 2, 3, 5, 8, 6
The sequence is supposed to go from 1 to 8, but the number of elements (size) is 7, not 8, because the 7 is missing from the sequence.
Another example:
1, 2, 3, 5, 6, 7
This sequence is missing the 4. The full sequence would have a length of 7 but the above array would have a length of 6 only, one less.
You have to account for that and counter it.
Sum formula
The sum of all natural numbers from 1 up to n, that is 1 + 2 + 3 + ... + n, can be computed directly by
n * (n + 1) / 2
See the very first paragraph in Wikipedia#Summation.
But n is supposed to be 8 (length of the full sequence) in your example, not 7 (length of the broken sequence). So you have to replace every n in the formula with n + 1, which gives
(n + 1) * (n + 2) / 2
I guess this would be similar to Missing Number of LeetCode (268):
Java
class Solution {
public static int missingNumber(int[] nums) {
int missing = nums.length;
for (int index = 0; index < nums.length; index++)
missing += index - nums[index];
return missing;
}
}
C++ using Bit Manipulation
class Solution {
public:
int missingNumber(vector<int> &nums) {
int missing = nums.size();
int index = 0;
for (int num : nums) {
missing = missing ^ num ^ index;
index++;
}
return missing;
}
};
Python I
class Solution:
def missingNumber(self, nums):
return (len(nums) * (-~len(nums))) // 2 - sum(nums)
Python II
class Solution:
def missingNumber(self, nums):
return (len(nums) * ((-~len(nums))) >> 1) - sum(nums)
Reference to how it works:
The methods have been explained in the following links:
Missing Number Discussion
Missing Number Solution

How would I loop over the permutations of N numbers with a given range, preferably without recursion?

I have N numbers, and a range, over which I have to permute the numbers.
For example, if I had 3 numbers and a range of 1-2, I would loop over 1 1 1, 1 1 2, 1 2 1, etc.
Preferably, but not necessarily, how could I do this without recursion?
As for general ideas: nested loops don't allow for an arbitrary number of numbers, and recursion is undesirable due to the high number of calls (3 numbers over 1-10 would mean over 1,000 calls to the section of code using those numbers).
One way to do this is to loop with one iteration per permutation, and use the loop variable to calculate the values that the permutation is made of. The size of the range can be used as a modulo argument to "chop off" a value (digit) that will be one of the values (digits) in the result. If you then divide the loop variable (well, a copy of it) by the range size, you can repeat the operation to extract the next value, and so on.
Obviously this will only work if the number of results does not exceed the capacity of the int type, or whatever type you use for the loop variable.
So here is how that looks:
int [][] getResults(int numPositions, int low, int high) {
int numValues = high - low + 1;
int numResults = (int) Math.pow(numValues, numPositions);
int results[][] = new int [numResults][numPositions];
for (int i = 0; i < numResults; i++) {
int result[] = results[i];
int n = i;
for (int j = numPositions-1; j >= 0; j--) {
result[j] = low + n % numValues;
n /= numValues;
}
}
return results;
}
The example you gave in the question would be generated with this call:
int results[][] = getResults(3, 1, 2);
The results are then:
1 1 1
1 1 2
1 2 1
1 2 2
2 1 1
2 1 2
2 2 1
2 2 2
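If holding all results in memory at once is a concern, the same modulo-and-divide idea can decode one tuple at a time from its index. This is a sketch built on the approach above rather than part of the original answer; the class and method names are assumptions, and a long index is used to postpone overflow.

```java
import java.util.Arrays;

final class TupleDecoder {
    /** Decodes the index-th tuple (0-based) of numPositions values, each taken from low..high. */
    static int[] decode(long index, int numPositions, int low, int high) {
        int numValues = high - low + 1;
        int[] tuple = new int[numPositions];
        for (int j = numPositions - 1; j >= 0; j--) {
            tuple[j] = low + (int) (index % numValues); // chop off the last "digit"
            index /= numValues;                         // shift to the next "digit"
        }
        return tuple;
    }

    public static void main(String[] args) {
        long total = 8; // 2 values over 3 positions: 2 * 2 * 2
        for (long i = 0; i < total; i++) {
            // first line printed is [1, 1, 1], last line is [2, 2, 2]
            System.out.println(Arrays.toString(decode(i, 3, 1, 2)));
        }
    }
}
```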

Why does my code fail the hidden input test cases?

This is the problem to be solved:
John is assigned a new task today. He is given an array A containing N integers. His task is to update all elements of array to some minimum value x , that is, A[i] = x; 1 <= i <= N; such that sum of this new array is strictly greater than the sum of the initial array.
Note that x should be as minimum as possible such that the sum of the new array is greater than the sum of the initial array.
Input Format:
First line of input consists of an integer N denoting the number of elements in the array A.
Second line consists of N space separated integers denoting the array elements.
Output Format:
The only line of output consists of the value of x.
Sample Input:
5
1 2 3 4 5
Sample Output:
4
Explanation:
Initial sum of array= 1 + 2 + 3 + 4 + 5 = 15
When we update all elements to 4, sum of array = 4 + 4 + 4 + 4 + 4 = 20 which is greater than 15.
Note that if we had updated the array elements to 3, sum = 15 which is not greater than 15. So, 4 is the minimum value to which array elements need to be updated.
Here is my code. How can I improve it, or what is the problem with it?
import java.util.Scanner;
public class Test2 {
public static void main(String []args){
Scanner s=new Scanner(System.in);
int check=0, sum=0, biggest=0;
int size=s.nextInt();
if(size>=1 && size<=100000) {
int[] arr=new int[size];
for(int i=0; i<size; i++){
int temp=s.nextInt();
if(temp>=1 && temp<=1000) {
arr[i] = temp;
biggest=biggest > temp ? biggest:temp;
sum=sum+temp;
}
else break;
}
for(int i=1; i<biggest; i++){
check=(size*i)>sum ? i:0;
}
System.out.print(check);
}
else System.err.print("Invalid input size");
}
}
Issue:
for(int i=1; i<biggest; i++){
check=(size*i)>sum ? i:0;
}
There are 2 problems with this, which is why it doesn't work:
(size*i)>sum ? i - The problem statement asks for the minimum x whose total is greater than the sum of the array elements. Your code assigns i to check on every iteration without stopping at the first (smallest) i that satisfies the condition.
check=(size*i)>sum ? i:0 - Whenever the condition is not satisfied, check is set to 0, so any value stored earlier is lost.
Here is how I would go about this.
Approach 1
Sum all elements like you did.
Now, take the average of the elements using integer division, sum / size of the array. Let's say we store it in a variable average.
Print average + 1 as your answer, as that is the smallest value whose total is greater than the sum of the array itself.
Time Complexity: O(n), where n is size of the array.
Space Complexity: O(1)
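Approach 1 might look like this in Java. This is a minimal sketch with invented names; it uses a long for the sum and Math.floorDiv so that the result is also right if the sum were negative, which the problem constraints in the question's code (elements from 1 to 1000) do not require.

```java
final class MinimumUniformValue {
    /** Smallest x such that x * a.length is strictly greater than the sum of a. */
    static long smallestX(int[] a) {
        long sum = 0;
        for (int v : a) {
            sum += v;
        }
        // floor(sum / n) * n <= sum, and (floor(sum / n) + 1) * n > sum
        return Math.floorDiv(sum, (long) a.length) + 1;
    }

    public static void main(String[] args) {
        System.out.println(smallestX(new int[] {1, 2, 3, 4, 5})); // prints 4
    }
}
```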
Approach 2
Sum all elements like you did.
Calculate min and max for the array and store it in variables, say mini and maxi.
Now, do a binary search between mini and maxi and keep checking the minimum sum > sum criteria.
In this process, you will have variables like low, mid and high.
low = mini,high = maxi
while low <= high:
mid = low + (high - low) / 2
If mid * size <= sum,
low = mid + 1
else
high = mid - 1
Now, print low as your answer.
Let range = maxi - mini.
Time Complexity: O(n) + O(log(range)) = O(n) asymptotically, where n is size of the array.
Space Complexity: O(1)
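The binary search of Approach 2 can be sketched in Java as below. The class and method names are made up for the example, and a non-empty array is assumed.

```java
final class MinimumUniformValueBinarySearch {
    /** Smallest x such that x * a.length is strictly greater than the sum of a (a must be non-empty). */
    static int smallestX(int[] a) {
        long sum = 0;
        int low = a[0];   // becomes the minimum of the array
        int high = a[0];  // becomes the maximum of the array
        for (int v : a) {
            sum += v;
            low = Math.min(low, v);
            high = Math.max(high, v);
        }
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if ((long) mid * a.length <= sum) {
                low = mid + 1;  // mid is too small, search the upper half
            } else {
                high = mid - 1; // mid works, look for a smaller candidate
            }
        }
        return low; // may be max + 1 when all elements are equal
    }
}
```

For 5 5 5 5 5 the loop tries mid = 5, finds 25 <= 25, and ends with low = 6, so the case where the answer lies just above the maximum is still covered.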
Not sure if I completely followed what your attempt was, but there should be a pretty straightforward solution. You know the size of the array and you can easily iterate through the array to get the values of the elements stored in it. All you need to do to find your minimum x is to take sumOfArray / size of array (integer division) and then add one to the result.
In your example 15/5=3. 3+1 = 4 so that's your answer. If the numbers summed to 43, 43/5 = 8 r 3, so your answer is 9 (9*5=45). Etc.
With some other test cases your code produces wrong results. Try:
Input:
5
1 1 1 1 5
Expected Output: 2 Actual Output: 4
and
Input:
5
5 5 5 5 5
Expected Output: 6 Actual Output: 0

Getting median out of frequency table (counting sort)

I can't understand the logic behind the getMedian method. How exactly is the median evaluated, and what is the connection between the count of elements and the sum of elements? I would appreciate it if someone could explain its logic.
public static void main(String[] args) {
Random r = new Random();
int[] ar = r.ints(0, 100).limit(9).toArray();
int k = ar.length;
int[] count = getCounts(ar);
double median = getMedian(count, k);
System.out.println(median);
}
private static int[] getCounts(int[] ar) {
int[] count = new int[100];
for (int i = 0; i < ar.length; i++) {
count[ar[i]]++;
}
return count;
}
private static double getMedian(int[] count, int d) {
int sum = 0;
for (int i = 0; i < count.length; i++) {
sum += count[i];
if (2 * sum < d)
continue;
else if (2 * sum == d)
return (2 * i + 1) / 2.0;
else
return i * 1.0;
}
return -1.0;
}
There is a relation because count is a frequency table. Let me give you an example.
If the array is 1 1 1 3 3 4 4 4 5 5 5 5, then the frequency table is:
value: 1 3 4 5
count: 3 2 3 4
The median lies in the column for 4.
The method adds up the counts one value at a time and asks: at which index does the running total first cover the middle element of the sorted array?
If 2 * sum > d (that is, sum > d/2), we are done: the middle element has value i, so i is the median. If 2 * sum < d, the running total has not reached the middle yet, so we continue with the next value. If 2 * sum == d, the running total ends exactly at the lower of the two middle elements, so the upper one belongs to a later value; the code returns i + 0.5, which is only correct when that later value is i + 1.
Walk through
1 1 1 3 3 4 4 4 5 5 5 5
After counting all the 1's I have covered 3 elements. That is fewer than half of the total number of elements (6), so I continue.
Now add the number of 3's: 5 elements covered. Still fewer than half.
Now add the number of 4's: 8 elements covered, which is more than half. So the median lies here, at value 4.
More detailed explanation:
You are asked to find the median of an array of 9 integers.
[1 2 3 4 5 6 7 8 9]
The median is the element at (0-based) position floor(9/2) = 4, which is 5.
[1 1 2 2 3 3 4 4 5]
Here the median is again the element at position floor(9/2) = 4, which is 3.
Now think of the same array as a frequency table:
value: 1 2 3 4 5
count: 2 2 2 2 1
You look for the element at position floor(9/2) by starting from the beginning and accumulating the frequencies until that position is covered. That is why the sum of the frequencies is needed.
Hope that makes sense.
Correct algorithm
What you need to do is :-
N = number of elements.
F[] = frequency array
if N is odd
    find the element at (0-based) position floor(N/2); the median is that element.
else
    find the elements at positions floor((N-1)/2) and floor(N/2) and return their average.
Finding the element is simple:
Find(F[], p) // find the element at 0-based position p
{
    p = p + 1 // convert the position to a 1-based rank
    cumulative = 0
    for i in [0..|F|-1]
        cumulative += F[i]
        if cumulative >= p
            return i // value i covers rank p
}
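The algorithm above could be written in Java along these lines. This is a sketch with illustrative names, where counts[v] is taken to be the number of occurrences of the value v, as in the getCounts method of the question.

```java
final class FrequencyMedian {
    /** Value at 0-based position p of the sorted data described by counts. */
    static int valueAt(int[] counts, int p) {
        int cumulative = 0;
        for (int value = 0; value < counts.length; value++) {
            cumulative += counts[value];
            if (cumulative > p) { // same test as cumulative >= p + 1
                return value;
            }
        }
        throw new IllegalArgumentException("position out of range: " + p);
    }

    /** Median of n values given as a frequency table. */
    static double median(int[] counts, int n) {
        if (n % 2 == 1) {
            return valueAt(counts, n / 2);
        }
        // even n: average of the two middle elements
        return (valueAt(counts, (n - 1) / 2) + valueAt(counts, n / 2)) / 2.0;
    }
}
```

For the data 1 1 3 3 this returns 2.0, whereas the getMedian method from the question returns 1.5 because it assumes the upper middle element is always i + 1.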
