Given an array of size n and k, how do you find the mode for every contiguous subarray of size k?
For example
arr = 1 2 2 6 6 1 1 7
k = 3
ans = 2 2 6 6 1 1
I was thinking of using a hashmap where the key is the number and the value is its frequency, a treemap where the key is the frequency and the value is the number, and a queue to remove the first element when the size exceeds k. The time complexity of that is O(n log(n)). Can we do this in O(1)?
This can be done in O(n) time
I was intrigued by this problem in part because, as I indicated in the comments, I felt certain that it could be done in O(n) time. I had some time over this past weekend, so I wrote up my solution to this problem.
Approach: Mode Frequencies
The basic concept is this: the mode of a collection of numbers is the number(s) which occur with the highest frequency within that set.
This means that whenever you add a number to the collection, if the number added was not already one of the mode-values then the frequency of the mode would not change. So with the collection (8 9 9) the mode-values are {9} and the mode-frequency is 2. If you add, say, a 5 to this collection (8 9 9 5), neither the mode-frequency nor the mode-values change. If instead you add an 8 to the collection (8 9 9 8), then the mode-values change to {9, 8} but the mode-frequency is still unchanged at 2. Finally, if you instead add a 9 to the collection (8 9 9 9), the mode-frequency goes up by one.
Thus in all cases when you add a single number to the collection, the mode-frequency is either unchanged or goes up by only one. Likewise, when you remove a single number from the collection, the mode-frequency is either unchanged or goes down by at most one. So all incremental changes to the collection result in only two possible new mode-frequencies. This means that if we had all of the distinct numbers of the collection indexed by their frequencies, then we could always find the new Mode in a constant amount of time (i.e., O(1)).
To accomplish this I use a custom data structure ("ModeTracker") that has a multiset ("numFreqs") to store the distinct numbers of the collection along with their current frequency in the collection. This is implemented with a Dictionary<int, int> (I think that this is a Map in Java). Thus given a number, we can use this to find its current frequency within the collection in O(1).
This data structure also has an array of sets ("freqNums") that given a specific frequency will return all of the numbers that have that frequency in the current collection.
I have included the code for this data structure class below. Note that this is implemented in C# as I do not know Java well enough to implement it there, but I believe that a Java programmer should have no trouble translating it.
(pseudo)Code:
using System.Collections.Generic;
using System.Linq;

class ModeTracker
{
    HashSet<int>[] freqNums;        // numbers at each frequency
    Dictionary<int, int> numFreqs;  // frequencies for each number
    int modeFreq_ = 0;              // frequency of the current mode

    public ModeTracker(int maxFrequency)
    {
        // slots 0..maxFrequency+1: the caller adds before it removes,
        // so a count can briefly reach maxFrequency + 1
        freqNums = new HashSet<int>[maxFrequency + 2];
        // populate every slot, so we don't have to check for null later
        for (int i = 0; i < freqNums.Length; i++)
        {
            freqNums[i] = new HashSet<int>();
        }
        numFreqs = new Dictionary<int, int>();
    }

    public int Mode { get { return freqNums[modeFreq_].First(); } }

    public void addNumber(int n)
    {
        adjustNumberCount(n, 1);
        // new mode-frequency is one greater or the same
        if (freqNums[modeFreq_ + 1].Count > 0) modeFreq_++;
    }

    public void removeNumber(int n)
    {
        adjustNumberCount(n, -1);
        // new mode-frequency is the same or one less
        if (freqNums[modeFreq_].Count == 0) modeFreq_--;
    }

    int adjustNumberCount(int num, int adjust)
    {
        // make sure we already have this number
        if (!numFreqs.ContainsKey(num))
        {
            // add entries for it
            numFreqs.Add(num, 0);
            freqNums[0].Add(num);
        }
        // now adjust this number's frequency
        int oldFreq = numFreqs[num];
        int newFreq = oldFreq + adjust;
        numFreqs[num] = newFreq;
        // remove the old freq for this number and add the new one
        freqNums[oldFreq].Remove(num);
        freqNums[newFreq].Add(num);
        return newFreq;
    }
}
Also, below is a small C# function that demonstrates how to use this data structure to solve the problem originally posed in the question.
int[] ModesOfSubarrays(int[] arr, int subLen)
{
    ModeTracker tracker = new ModeTracker(subLen);
    int[] modes = new int[arr.Length - subLen + 1];
    for (int i = 0; i < arr.Length; i++)
    {
        // add every number into the tracker
        tracker.addNumber(arr[i]);
        if (i >= subLen)
        {
            // remove the number that just rotated out of the window
            tracker.removeNumber(arr[i - subLen]);
        }
        if (i >= subLen - 1)
        {
            // add the new Mode to the output
            modes[i - subLen + 1] = tracker.Mode;
        }
    }
    return modes;
}
I have tested this and it does appear to work correctly for all of my tests.
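Since the answer leaves the Java translation to the reader, here is one possible sketch of it. The class name SlidingMode and its method names are my own and not part of the original answer. Unlike the C# version it keeps no bucket for frequency 0, and when several numbers tie for the mode it returns an arbitrary one of them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical Java rendering of the ModeTracker idea (names are assumptions).
class SlidingMode {
    // freqNums.get(f) holds every number that occurs exactly f times in the window
    private final List<Set<Integer>> freqNums = new ArrayList<>();
    private final Map<Integer, Integer> numFreqs = new HashMap<>();
    private int modeFreq = 0;

    SlidingMode(int windowSize) {
        // buckets 0..windowSize+1: a count can briefly reach windowSize + 1
        // because the caller adds the new element before removing the old one
        for (int i = 0; i < windowSize + 2; i++) freqNums.add(new HashSet<>());
    }

    void add(int n) {
        adjust(n, +1);
        if (!freqNums.get(modeFreq + 1).isEmpty()) modeFreq++; // same or one more
    }

    void remove(int n) {
        adjust(n, -1);
        if (modeFreq > 0 && freqNums.get(modeFreq).isEmpty()) modeFreq--; // same or one less
    }

    // Any number with the highest frequency; throws if the window is empty.
    int mode() {
        return freqNums.get(modeFreq).iterator().next();
    }

    private void adjust(int n, int delta) {
        int oldFreq = numFreqs.getOrDefault(n, 0);
        int newFreq = oldFreq + delta;
        if (oldFreq > 0) freqNums.get(oldFreq).remove(n);
        if (newFreq > 0) {
            freqNums.get(newFreq).add(n);
            numFreqs.put(n, newFreq);
        } else {
            numFreqs.remove(n);
        }
    }

    static int[] modesOfSubarrays(int[] arr, int k) {
        SlidingMode tracker = new SlidingMode(k);
        int[] modes = new int[arr.length - k + 1];
        for (int i = 0; i < arr.length; i++) {
            tracker.add(arr[i]);
            if (i >= k) tracker.remove(arr[i - k]);          // element that left the window
            if (i >= k - 1) modes[i - k + 1] = tracker.mode();
        }
        return modes;
    }
}
```

Calling SlidingMode.modesOfSubarrays(new int[]{1, 2, 2, 6, 6, 1, 1, 7}, 3) reproduces the example from the question: 2 2 6 6 1 1.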
Complexity Analysis
Going through the individual steps of the `ModesOfSubarrays()` function:
1. The new ModeTracker object is created in O(n) time or less.
2. The modes[] array is created in O(n) time.
3. The for(..) loop runs n times:
   3a: the addNumber() function takes O(1) time
   3b: the removeNumber() function takes O(1) time
   3c: getting the new Mode takes O(1) time
So the total time is O(n) + O(n) + n*(O(1) + O(1) + O(1)) = O(n)
Please let me know of any questions that you might have about this code.
Given an array of n elements, where every element is in the range 2 to 10^5. Suppose we paint the elements of the array such that among every m (m <= n) consecutive elements no two elements have the same colour. How do I pick m distinct elements (not necessarily consecutive) such that no two of the chosen elements have the same colour and the difference between the largest and the smallest of the chosen elements is as small as possible?
Ex: for n = 4, A = {10 20 15 28}, m = 2, we can paint the elements as R G R G or G R G R. In both cases, if we pick any m consecutive elements, no two of them have the same colour, like R G or G R or R G. There are 4 ways to pick 2 elements: 10 20 or 10 28 or 20 15 or 15 28. But 20 - 15 = 5, and this is the best answer.
** duplicates allowed in array
My approach is to first put all elements of the same colour in separate arrays. In the example above I can do: [[10,15],[20,28]], where 10 and 15 are R, and 20 and 28 are G. Then I use recursion on every element of R and try all combinations with the subsequent colours.
void recurse(List<List<Integer>> bs, int max, int min, int depth) {
    if (depth == bs.size()) {
        int difference = max - min;
        // compare diff with old res here
        return;
    }
    for (int i = 0; i < bs.get(depth).size(); ++i) {
        int newMax = Math.max(max, bs.get(depth).get(i));
        int newMin = Math.min(min, bs.get(depth).get(i));
        recurse(bs, newMax, newMin, depth + 1);
    }
}
This is not wrong and does produce the correct result, but I'm looking for a faster algorithm. The expected time complexity is O(n), or, in better words, I want to pass every test case in 1 second. Note that 2 <= m <= n <= 10^5.
We can solve this in O(n log n) time and O(n) space. First notice that any assigned colour must be a distance of m elements from its neighbours of the same colour or we would invalidate the constraints.
Separate each such list of elements of the same colour (defined only by their distance from each other) into its own list and sort it.
Now merge all the m sorted lists into one sorted list where each value is also paired with a label to the colour of the list it came from (the merged list could be of tuples, for example).
(Alternatively, we could first create the entire labeled list and just sort that.)
Iterate over the sorted, labeled list with a sliding window of size m, allowing only one element of each colour to stay in the window at any one time. (We could use a hash map or simple array to track the window. Remember that the window in this case is of unique labels, not a consecutive subarray of the labeled list.) Update the smallest range existing in the window during the iteration to determine the answer.
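The steps above can be sketched roughly as follows in Java. This is only an illustration under the assumption that the element at index i gets colour i % m (which is what "defined only by their distance from each other" amounts to); the class and method names are made up for the example.

```java
import java.util.Arrays;

// Hypothetical sketch: smallest (max - min) over one pick per colour.
class MinColourRange {
    static int minRange(int[] a, int m) {
        int n = a.length;
        // label every value with its colour, then sort the labelled list by value
        int[][] items = new int[n][];
        for (int i = 0; i < n; i++) items[i] = new int[]{a[i], i % m};
        Arrays.sort(items, (x, y) -> Integer.compare(x[0], y[0]));

        int[] count = new int[m];   // how often each colour occurs in the window
        int covered = 0;            // number of distinct colours in the window
        int best = Integer.MAX_VALUE;
        int tail = 0;
        for (int head = 0; head < n; head++) {
            if (count[items[head][1]]++ == 0) covered++;
            // drop elements at the low end while their colour is still present elsewhere
            while (count[items[tail][1]] > 1) {
                count[items[tail][1]]--;
                tail++;
            }
            if (covered == m) best = Math.min(best, items[head][0] - items[tail][0]);
        }
        return best;
    }
}
```

For the example in the question, MinColourRange.minRange(new int[]{10, 20, 15, 28}, 2) gives 5. The sort dominates the running time, so the whole thing stays within O(n log n).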
I think you could sort the numbers (while keeping track of their colors), and then walk through the result from its start: first grow a candidate until all the colors are present (so the head covers a unique color in the sublist), then shrink it so that repeated colors are dropped from the tail (so the tail points at a unique color too), then check whether it is the best candidate so far, then drop the tail (so that color will be missing), and proceed again with the head:
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class NewClass {
    public static void doThing(int nums[], int m) {
        int n = nums.length;
        ColorNumber l[] = new ColorNumber[n];
        for (int i = 0; i < n; i++)
            l[i] = new ColorNumber(nums[i], i % m);
        System.out.println(Arrays.asList(l));
        Arrays.sort(l);
        List<ColorNumber> printlist = Arrays.asList(l);
        System.out.println(printlist);
        int present[] = new int[m];
        int head = 0, tail = 0;
        int minhead = 0, mintail = 0, mindiff = Integer.MAX_VALUE;
        while (head < n) {
            System.out.println("try growing");
            int i = 0;
            while (i < m && head < n) {
                while (present[i] == 0 && head < n) {
                    present[l[head].color]++;
                    head++;
                }
                //if(present[i]>0)i++; // the bug
                while (i < m && present[i] > 0) i++; // the fix
            }
            if (i == m) {
                System.out.println(printlist.subList(tail, head));
                System.out.println("try shrinking");
                while (present[l[tail].color] > 1) {
                    present[l[tail].color]--;
                    tail++;
                }
                int diff = l[head - 1].number - l[tail].number;
                System.out.println(printlist.subList(tail, head) + " diff: " + diff);
                if (diff < mindiff) { minhead = head; mintail = tail; mindiff = diff; }
                present[l[tail].color]--;
                tail++;
            }
        }
        System.out.println("min: " + mindiff + ", " + printlist.subList(mintail, minhead));
    }

    static class ColorNumber implements Comparable<ColorNumber> {
        final int number;
        final int color;

        public ColorNumber(int number, int color) {
            this.number = number;
            this.color = color;
        }

        @Override
        public int compareTo(ColorNumber o) {
            return Integer.compare(number, o.number);
        }

        @Override
        public String toString() {
            return number + "(" + color + ")";
        }
    }

    public static void main(String args[]) {
        Random r = new Random(0);
        int nums[] = new int[10];
        for (int i = 0; i < nums.length; i++)
            nums[i] = r.nextInt(100);
        doThing(nums, 3);
        System.out.println("---");
        doThing(new int[]{10, 20, 15, 28}, 2);
        System.out.println("---");
        doThing(new int[]{2, 1}, 2); // test case for bug
    }
}
The output (one 3-color constant random sequence - because a seed is provided -, your 2-color example and the test case for the bug you fixed):
[60(0), 48(1), 29(2), 47(0), 15(1), 53(2), 91(0), 61(1), 19(2), 54(0)]
[15(1), 19(2), 29(2), 47(0), 48(1), 53(2), 54(0), 60(0), 61(1), 91(0)]
try growing
[15(1), 19(2), 29(2), 47(0)]
try shrinking
[15(1), 19(2), 29(2), 47(0)] diff: 32
try growing
[19(2), 29(2), 47(0), 48(1)]
try shrinking
[29(2), 47(0), 48(1)] diff: 19
try growing
[47(0), 48(1), 53(2)]
try shrinking
[47(0), 48(1), 53(2)] diff: 6
try growing
[48(1), 53(2), 54(0)]
try shrinking
[48(1), 53(2), 54(0)] diff: 6
try growing
[53(2), 54(0), 60(0), 61(1)]
try shrinking
[53(2), 54(0), 60(0), 61(1)] diff: 8
try growing
min: 6 [47(0), 48(1), 53(2)]
---
[10(0), 20(1), 15(0), 28(1)]
[10(0), 15(0), 20(1), 28(1)]
try growing
[10(0), 15(0), 20(1)]
try shrinking
[15(0), 20(1)] diff: 5
try growing
min: 5 [15(0), 20(1)]
---
[2(0), 1(1)]
[1(1), 2(0)]
try growing
[1(1), 2(0)]
try shrinking
[1(1), 2(0)] diff: 1
min: 1, [1(1), 2(0)]
In the output only the color of the lowest and the highest value is going to be unique, the in-between elements can be picked at will as they do not contribute to the difference (this code outputs them all like in case of the last attempt in the first sequence ([53(2), 54(0), 60(0), 61(1)])). If a specific output is needed, some Set could be used, or a for loop over the colors, printing only one (the first one it encounters) element for each color (and skipping the rest with a simple break).
My algorithm to find the maximum number of unique integers among all possible contiguous subarrays doesn't work for larger amounts of Integers and subarrays.
For instance, I have to read a total amount of 6 Integers from the console and each subarray has a size of 3.
So, for this kind of input 5 3 5 2 3 2
my program should print 3 and this works fine.
The first subarray stores 5 3 5 so the number of unique Integers is 2.
The second subarray stores 3 5 2 so the number of unique Integers is 3.
The third subarray would also print 3 because it stores 5 2 3 and so on...
But it seems that my algorithm can't handle a total of 100000 Integers with a subarray size of 99877.
Can anyone explain to me what I have done wrong?
FYI: I have to use a Deque implementation like LinkedList or ArrayDeque
for (int i = 0; i < totalAmountOfIntegers; i++) {
    int anyIntegerNumber = consoleInput.nextInt();
    arrayDequeToStoreAllIntegers.addLast(anyIntegerNumber);
    hashSetToStoreUniqueIntegers.add(anyIntegerNumber);
    if (arrayDequeToStoreAllIntegers.size() == sizeOfEachArrayDequeAsSubArray) {
        if (hashSetToStoreUniqueIntegers.size() > quantityOfUniqueIntegersInSubarray) {
            quantityOfUniqueIntegersInSubarray = hashSetToStoreUniqueIntegers.size();
        }
        int firstNumberInDeque = arrayDequeToStoreAllIntegers.remove();
        if (hashSetToStoreUniqueIntegers.size() == sizeOfEachArrayDequeAsSubArray) {
            hashSetToStoreUniqueIntegers.remove(firstNumberInDeque);
        }
    }
}
The answer would simply be the number of unique integers in the whole array: since the array is a superset of all subarrays, all numbers are present in it.
Just find how many unique elements exist.
To be honest, I don't understand your algorithm. I don't really get what the variables are referring to (although they seem to be named in a semantic way).
But what about this:
import java.util.HashSet;
import java.util.Set;

public class UniqueIntegers {
    public static void main(String[] args) {
        UniqueIntegers ui = new UniqueIntegers();
        Integer[][] integers = {
            {3,5,3,4,6},
            {1,6,3,2,4},
            {2,3,4},
            {3,3,6,9,2}
        };
        Set<Integer> unique = ui.uniqueIntegers(integers);
        System.out.println("Unique Integers: " + unique.size());
        System.out.println("Integers: " + unique);
    }

    private Set<Integer> uniqueIntegers(Integer[][] ints) {
        Set<Integer> result = new HashSet<Integer>();
        for (Integer[] iSub : ints) {
            for (Integer i : iSub) {
                result.add(i);
            }
        }
        return result;
    }
}
It prints:
Unique Integers: 7
Integers: [1, 2, 3, 4, 5, 6, 9]
After a day of researching, I have found my mistake.
My third if statement is wrong. I am checking whether the size of my HashSet variable is equal to the maximum number of elements each subarray can hold.
Instead, I should check whether my ArrayDeque variable still contains another int with the same value as firstNumberInDeque, the element I have just removed from it. If it does, my HashSet variable remains unchanged.
But if my ArrayDeque variable doesn't contain another int with the same value as firstNumberInDeque, then firstNumberInDeque should be removed from my HashSet variable.
Here is the right code:
int firstNumberInDeque = arrayDequeToStoreAllIntegers.remove();
if (!arrayDequeToStoreAllIntegers.contains(firstNumberInDeque)) {
    hashSetToStoreUniqueIntegers.remove(firstNumberInDeque);
}
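Note that contains() on an ArrayDeque scans the whole deque, so with a window of 99877 elements each step is still linear in the window size. One possible alternative, sketched here with made-up names and not taken from the original post, is to keep a count per value next to the deque and drop a value only when its count reaches zero:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: largest number of distinct values in any window of the given size.
class MaxUniqueInWindow {
    static int maxUnique(int[] values, int windowSize) {
        Deque<Integer> window = new ArrayDeque<>();
        Map<Integer, Integer> counts = new HashMap<>(); // value -> occurrences in the window
        int best = 0;
        for (int value : values) {
            window.addLast(value);
            counts.merge(value, 1, Integer::sum);
            if (window.size() == windowSize) {
                // the number of keys is the number of distinct values in the window
                best = Math.max(best, counts.size());
                int first = window.removeFirst();
                // forget the value only when no other copy is left in the window
                if (counts.merge(first, -1, Integer::sum) == 0) {
                    counts.remove(first);
                }
            }
        }
        return best;
    }
}
```

With the input 5 3 5 2 3 2 and a window size of 3 this returns 3, matching the expected result in the question, and every step is a constant number of hash operations.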
I found this problem online:
You have N tonnes of food and K rooms to store them into. Every room has a capacity of M. In how many ways can you distribute the food in the rooms, so that every room has at least 1 ton of food.
My approach was to recursively find all possible variations that satisfy the conditions of the problem. I start with an array of size K, initialized to 1. Then I keep adding 1 to every element of the array and recursively check whether the new array satisfies the condition. However, the recursion tree gets too large too quickly and the program takes too long for slightly higher values of N, K and M.
What would be a more efficient algorithm to achieve this task? Are there any optimizations to be done to the existing algorithm implementation?
This is my implementation:
import java.util.Arrays;
import java.util.HashSet;
import java.util.Scanner;

public class Main {
    // keeping track of valid variations, disregarding duplicates
    public static HashSet<String> solutions = new HashSet<>();

    // calculating sum of each variation
    public static int sum(int[] array) {
        int sum = 0;
        for (int i : array) {
            sum += i;
        }
        return sum;
    }

    public static void distributionsRecursive(int food, int rooms, int roomCapacity, int[] variation, int sum) {
        // if all food has been allocated
        if (sum == food) {
            // add solution to solutions
            solutions.add(Arrays.toString(variation));
            return;
        }
        // keep adding 1 to every index in current variation
        for (int i = 0; i < rooms; i++) {
            // create new array for every recursive call
            int[] tempVariation = Arrays.copyOf(variation, variation.length);
            // if element is equal to room capacity, can't add any more in it
            if (tempVariation[i] == roomCapacity) {
                continue;
            } else {
                tempVariation[i]++;
                sum = sum(tempVariation);
                // recursively call function on new variation
                distributionsRecursive(food, rooms, roomCapacity, tempVariation, sum);
            }
        }
        return;
    }

    public static int possibleDistributions(int food, int rooms, int roomCapacity) {
        int[] variation = new int[rooms];
        // start from all 1, keep going till all food is allocated
        Arrays.fill(variation, 1);
        distributionsRecursive(food, rooms, roomCapacity, variation, rooms);
        return solutions.size();
    }

    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        int food = in.nextInt();
        int rooms = in.nextInt();
        int roomCapacity = in.nextInt();
        int total = possibleDistributions(food, rooms, roomCapacity);
        System.out.println(total);
        in.close();
    }
}
Yes, your recursion tree will become large if you do this in a naive manner. Let's say you have 10 tonnes and 3 rooms, and M=2. One valid arrangement is [2,3,5]. But you also have [2,5,3], [3,2,5], [3,5,2], [5,2,3], and [5,3,2]. So for every valid grouping of numbers, there are up to K! permutations.
A possibly better way to approach this problem would be to determine how many ways you can make K numbers (minimum M and maximum N) add up to N. Start by making the first number as large as possible, which would be N-(M*(K-1)). In my example, that would be:
10 - 2*(3-1) = 6
Giving the answer [6,2,2].
You can then build an algorithm to adjust the numbers to come up with valid combinations by "moving" values from left to right. In my example, you'd have:
6,2,2
5,3,2
4,4,2
4,3,3
You avoid the seemingly infinite recursion by ensuring that values never increase from left to right. For example, in the above you'd never have [3,4,3].
If you really want all valid arrangements, you can generate the permutations for each of the above combinations. I suspect that's not necessary, though.
I think that should be enough to get you started towards a good solution.
One solution would be to compute the result for k rooms from the result for k - 1 rooms.
I've simplified the problem a bit in allowing to store 0 tonnes in a room. If we have to store at least 1 we can just subtract this in advance and reduce the capacity of rooms by 1.
So we define a function calc: (Int,Int) => List[Int] that computes, for a number of rooms and a capacity, a list of numbers of combinations. The first entry contains the number of combinations we get for storing 0, the next entry the number for storing 1, and so on.
We can easily compute this function for one room. So calc(1,m) gives us a list of ones up to the mth element and then it only contains zeros.
For a larger k we can define this function recursively. We just calculate calc(k - 1, m) and then build the new list by summing up prefixes of the old list. E.g. if we want to store 5 tons, we can store all 5 in the first room and 0 in the following rooms, or 4 in the first and 1 in the following and so on. So we have to sum up the combinations for 0 to 5 for the rest of the rooms.
As we have a maximal capacity we might have to leave out some of the combinations, i.e. if the room only has capacity 3 we must not count the combinations for storing 0 and 1 tons in the rest of the rooms.
I've implemented this approach in Scala. I've used streams (i.e. infinite Lists) but as you know the maximal amount of elements you need this is not necessary.
The time complexity of the approach should be O(k*n^2)
def calc(rooms: Int, capacity: Int): Stream[Long] =
  if (rooms == 1) {
    Stream.from(0).map(x => if (x <= capacity) 1L else 0L)
  } else {
    val rest = calc(rooms - 1, capacity)
    Stream.from(0).map(x => rest.take(x + 1).drop(Math.max(0, x - capacity)).sum)
  }
You can try it here:
http://goo.gl/tVgflI
(I've replaced the Long by BigInt there to make it work for larger numbers)
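For readers who prefer Java, the same idea might look roughly like the sketch below. It is a variation rather than a translation: it uses plain arrays instead of streams, applies the "subtract 1 per room in advance" step inside the method, and replaces the repeated summing by prefix sums, which brings the work down to about O(k*n). The class and method names are assumptions made for the example.

```java
// Hypothetical sketch: ordered distributions of `food` tonnes over `rooms` rooms,
// each room holding between 1 and `capacity` tonnes.
class RoomDistributions {
    static long count(int food, int rooms, int capacity) {
        // place one tonne in every room first; each room can then take 0..capacity-1 more
        int remaining = food - rooms;
        int cap = capacity - 1;
        if (remaining < 0 || cap < 0) return 0;

        long[] ways = new long[remaining + 1];
        ways[0] = 1; // with no rooms at all, only "nothing left to store" works
        for (int r = 0; r < rooms; r++) {
            // prefix[x] = ways[0] + ... + ways[x - 1]
            long[] prefix = new long[remaining + 2];
            for (int x = 0; x <= remaining; x++) prefix[x + 1] = prefix[x] + ways[x];

            long[] next = new long[remaining + 1];
            for (int x = 0; x <= remaining; x++) {
                // the new room takes between 0 and cap tonnes, the rest goes to earlier rooms
                int lo = Math.max(0, x - cap);
                next[x] = prefix[x + 1] - prefix[lo];
            }
            ways = next;
        }
        return ways[remaining];
    }
}
```

For instance, 4 tonnes in 2 rooms of capacity 3 can be stored as (1,3), (2,2) or (3,1), and RoomDistributions.count(4, 2, 3) returns 3. As with the Long version above, large inputs would overflow and need BigInteger.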
First tip, remove distributionsRecursive and don't build up a list of solutions. The list of all solutions is a huge data set. Just produce a count.
That will let you turn possibleDistributions into a recursive function defined in terms of itself. The recursive step will be, possibleDistributions(food, rooms, roomCapacity) = sum from i = 1 to roomCapacity of possibleDistributions(food - i, rooms - 1, roomCapacity).
You will save a lot of memory, but still have your underlying performance problem. However with a pure recursive function you can now fix that with https://en.wikipedia.org/wiki/Memoization.
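A minimal sketch of that memoized recursion, assuming the answer is small enough to fit in a long; the helper method count and the way the cache key is packed are my own choices for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the recursive step with a cache in front of it.
class MemoizedDistributions {
    static long possibleDistributions(int food, int rooms, int roomCapacity) {
        return count(food, rooms, roomCapacity, new HashMap<>());
    }

    private static long count(int food, int rooms, int cap, Map<Long, Long> memo) {
        // no rooms left: valid only if all the food has been placed
        if (rooms == 0) return food == 0 ? 1 : 0;
        // every room needs at least 1 tonne and holds at most cap tonnes
        if (food < rooms || food > (long) rooms * cap) return 0;

        // the capacity never changes, so (food, rooms) identifies a subproblem
        long key = ((long) food << 32) | rooms;
        Long cached = memo.get(key);
        if (cached != null) return cached;

        long total = 0;
        for (int i = 1; i <= cap; i++) {
            // put i tonnes in this room and distribute the rest over the others
            total += count(food - i, rooms - 1, cap, memo);
        }
        memo.put(key, total);
        return total;
    }
}
```

There are at most food * rooms distinct subproblems, each doing roomCapacity units of work, so the exponential blow-up of the plain recursion disappears.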
My question is: given an array, we have to split it into two sub-arrays such that the absolute difference between the sums of the two arrays is minimal, with the condition that the numbers of elements in the two arrays differ by at most one.
Let me give you an example. Suppose:
Example 1: 100 210 100 75 340
Answer :
Array1{100,210,100} and Array2{75,340} --> Difference = |410-415|=5
Example 2: 10 10 10 10 40
Answer : Array1{10,10,10} and Array2{10,40} --> Difference = |30-50|=20
Here we can see that though we could divide the array into {10,10,10,10} and {40}, we do not, because the constraint that the numbers of elements in the two arrays may differ by at most 1 would be violated if we did so.
Can somebody provide a solution for this ?
My approach:
->Calculate sum of the array
->Divide the sum by 2
->Let the size of the knapsack=sum/2
->Consider the weights of the array values as 1.(If you have come across the knapsack problem ,you may know about the weight concept)
->Then consider the array values as the values of the weights.
->Calculate the answer which will be array1 sum.
->Total sum-answer=array2 sum
This approach fails.
Calculating the sums of the two arrays is enough. We are not interested in which elements form the sums.
Thank you!
Source: This is an ICPC problem.
I have an algorithm that works in O(n^3) time, but I have no hard proof that it is optimal. It seems to work for every test input I give it (including some with negative numbers), so I figured it was worth sharing.
You start by splitting the input into two equally sized arrays (call them one[] and two[]?). Start with one[0], and see which element in two[] would give you the best result if swapped. Whichever one gives the best result, swap. If none give a better result, don't swap it. Then move on to the next element in one[] and do it again.
That part is O(n^2) by itself. The problem is, it might not get the best results the first time through. If you just keep doing it until you don't make any more swaps, you end up with an ugly bubble-type construction which makes it O(n^3) total.
Here's some ugly Java code to demonstrate (also at ideone.com if you want to play with it):
import java.util.Arrays;

public class Main {
    static int[] input = {1,2,3,4,5,-6,7,8,9,10,200,-1000,100,250,-720,1080,200,300,400,500,50,74};

    public static void main(String[] args) {
        int[] two = new int[input.length / 2];
        int[] one = new int[input.length - two.length];
        int totalSum = 0;
        for (int i = 0; i < input.length; i++) {
            totalSum += input[i];
            if (i < one.length)
                one[i] = input[i];
            else
                two[i - one.length] = input[i];
        }
        float goal = totalSum / 2f;
        boolean swapped;
        do {
            swapped = false;
            for (int j = 0; j < one.length; j++) {
                int curSum = sum(one);
                float curBestDiff = Math.abs(goal - curSum);
                int curBestIndex = -1;
                // find the element of two[] whose swap with one[j] gets closest to the goal
                for (int i = 0; i < two.length; i++) {
                    int testSum = curSum - one[j] + two[i];
                    float diff = Math.abs(goal - testSum);
                    if (diff < curBestDiff) {
                        curBestDiff = diff;
                        curBestIndex = i;
                    }
                }
                if (curBestIndex >= 0) {
                    swapped = true;
                    System.out.println("swapping " + one[j] + " and " + two[curBestIndex]);
                    int tmp = one[j];
                    one[j] = two[curBestIndex];
                    two[curBestIndex] = tmp;
                }
            }
        } while (swapped);
        System.out.println(Arrays.toString(one));
        System.out.println(Arrays.toString(two));
        System.out.println("diff = " + Math.abs(sum(one) - sum(two)));
    }

    static int sum(int[] list) {
        int sum = 0;
        for (int i = 0; i < list.length; i++)
            sum += list[i];
        return sum;
    }
}
Can you provide more information on the upper limit of the input?
For your algorithm, I think you are trying to pick floor(n/2) items and find their maximum total value as the array1 sum... (If this is not what you had in mind, then please ignore the following lines.)
If this is the case, then the knapsack size should be n/2 instead of sum/2,
but even so, I think it still would not work. The answer is min(|a - b|), and maximizing a is a different problem. E.g., for {2,2,10,10} you will get a = 20, b = 4, while the answer is a = b = 12.
To answer the problem, I think I need more information about the upper limit of the input.
I cannot come up with a brilliant dp state but a 3-dimensional state
dp(i,n,v) := among the first i items, pick n items whose values sum to v
each state is either 0 or 1 (false or true)
dp(i,n,v) = dp(i-1, n, v) | dp(i-1, n-1, v-V[i])
This dp state is so naive that its complexity is very high, which usually cannot pass an ACM / ICPC problem, so if possible please provide more information and I will see if I can come up with a better solution... Hope I can help a bit :)
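To make the state above concrete, here is a rough Java sketch of it. It assumes non-negative values whose total is small enough for a table of size (n/2) x sum, and it drops the i dimension by updating the table in place; the class and method names are invented for the example.

```java
// Hypothetical sketch of dp(i, n, v) with the item index i rolled away.
class BalancedSplit {
    static int minDifference(int[] values) {
        int half = values.length / 2; // one side always has floor(n/2) items
        int total = 0;
        for (int v : values) total += v;

        // reachable[c][s] is true when some c of the items seen so far sum to s
        boolean[][] reachable = new boolean[half + 1][total + 1];
        reachable[0][0] = true;
        for (int v : values) {
            // go through the counts downwards so that each item is used at most once
            for (int c = half; c >= 1; c--) {
                for (int s = total; s >= v; s--) {
                    if (reachable[c - 1][s - v]) reachable[c][s] = true;
                }
            }
        }

        int best = Integer.MAX_VALUE;
        for (int s = 0; s <= total; s++) {
            // one side sums to s, the other to total - s
            if (reachable[half][s]) best = Math.min(best, Math.abs(total - 2 * s));
        }
        return best;
    }
}
```

On the two examples from the question this yields 5 for {100, 210, 100, 75, 340} and 20 for {10, 10, 10, 10, 40}. The running time is O(n * (n/2) * sum), which is exactly why the input limits matter.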
A DP solution will give lg(n) time. Take two arrays: iterate over one from start to end and calculate the running sum, iterate over the other from end to start and do the same thing. Finally, iterate from start to end and take the minimal difference.