I must implement a recursive method merge(long[] arr, int i) which multiplies adjacent elements if they have the same value, starting at index i.
Example:
merge({1, 2, 2, 4}, 0)
should produce an array like this:
{1, 4, 4}
If there are multiple (n) occurrences of a number {1, 2, 2, 2, 2, 5}, all of these must be multiplied together: {1, 16, 5}.
A number which has already been merged can not be merged again {1, 4, 4, 16} -> {1, 16, 16}.
All this must be achieved by using only this one method merge and having exactly one recursive call per element in the original array.
This is a working implementation using recursion and loops:
public static long[] merge(long[] ns, int i) {
final long[] EMPTY_LONG_ARRAY = {};
if (i < 0) {
return merge(ns, 0); // if i is negative, start at 0
} else if (i >= ns.length) {
return EMPTY_LONG_ARRAY; // if out of bounds, return empty array
} else if (i == ns.length - 1) {
return ns; // base case
} else { // recursion in here
if (ns[i] == ns[i + 1]) { // if next long is equal
int occurences = 1; // first occurence
for (int j = i; j < ns.length - 1; j++) {
if (ns[j] == ns[j + 1])
occurences++;
else
break;
} // add next occurences
long[] newArray = new long[ns.length - occurences + 1]; // new array is (occurences-1) shorter
for (int j = 0; j < newArray.length; j++) { // fill new array
if (j < i) {
newArray[j] = ns[j]; // left of i: values stay the same
} else if (j > i) {
newArray[j] = ns[j + occurences - 1]; // pull values right of i (occurences-1) to the left
} else {
int counter = occurences;
long mergedValue = ns[j];
while (counter > 1) {
mergedValue *= ns[j];
counter--;
}
newArray[j] = mergedValue; // at j: value is ns[j]^occurences
}
}
if (i == ns.length - 1)
return merge(newArray, i);
else
return merge(newArray, i + 1); // if bounds permit it, jump to next number
} else {
return merge(ns, i + 1); // nothing to merge, go one step forward
}
}
}
This implementation produces the correct result, however, the recursion depth is wrong (needs to be one recursive call per element in original array ns[]).
I'm sure there is a genius out here who can solve this using linear recursion.
Let's transform your loop into a recursive call. The only reason to do this is that the assignment asks for it: it is not more readable (at least to me), and it is actually slower. People usually want to go in the other direction for efficiency reasons, from recursion to loops.
First, an annotated version of your code:
public static long[] merge(long[] ns, int i) { // i not needed, but useful for recursion
long[] out = new long[ns.length]; // for result; allocate only once
for (int j = i; j < ns.length; j++) { // main loop, condition is "j == length"
int occurences = 0;
for (int k = i; k < ns.length; k++) { // inner loop - can avoid!
if (ns[j] == ns[k]) {
occurences++;
}
}
out[j] = (long) Math.pow(ns[j], occurences); // updating the result
}
// remove additional elements
return out; // this does not remove elements yet!
}
First, let me rewrite that to be more efficient. Since duplicates are only removed if they are next to each other, you do not need the inner loop, and can write this instead:
public static long[] merge(long[] ns) {
long[] out = new long[ns.length];
int oSize = 0; // index of element-after-last in array out
long prev = ns[0]; // previous element in ns; initial value chosen carefully
out[0] = 1; // this makes the 1st iteration work right, without increasing oSize
for (int i=0; i<ns.length; i++) {
long current = ns[i];
if (current == prev) {
out[oSize] *= current; // accumulate into current position
} else {
oSize ++; // generate output
out[oSize] = current; // reset current and prev
prev = current;
}
}
// generate final output, but do not include unused elements
return Arrays.copyOfRange(out, 0, oSize+1);
}
Assuming this works (and beware - I have not tested it), I will now transform it into tail recursion. There will be 2 parts: a driver code (everything not in the loop), and the recursive code (the loopy part).
public static long[] merge(long[] ns) {
long[] out = new long[ns.length];
int oSize = 0;
long prev = ns[0];
out[0] = 1;
int i=0;
oSize = recursiveMerge(ns, i, out, oSize, prev); // recursion! returns the final oSize
return Arrays.copyOfRange(out, 0, oSize+1);
}
public static int recursiveMerge(long[] ns, int i, long[] out, int oSize, long prev) {
if (i == ns.length) return oSize; // check "loop" termination condition
// copy-pasted loop contents
long current = ns[i];
if (current == prev) {
out[oSize] *= current; // accumulate into current position
} else {
oSize ++; // generate output
out[oSize] = current; // reset current and prev
prev = current;
}
// next loop iteration is now a recursive call. Note the i+1
// oSize is passed by value, so the final value has to be returned to the driver
return recursiveMerge(ns, i+1, out, oSize, prev);
}
The general idea is to pass all state as arguments to your recursive function, and check loop termination at the start, put the loop code in the middle, and at the very end, make a recursive call for the next iteration.
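As a minimal, self-contained sketch of that transformation (not necessarily what the assignment expects, since it uses a helper method instead of the single merge(long[] arr, int i) signature), the version below passes all loop state as arguments and returns the final output index. The class name RecursiveMergeSketch, the helper name step, and the handling of an empty input are assumptions made for illustration.

```java
import java.util.Arrays;

final class RecursiveMergeSketch {
    // Driver: sets up the state that the loop version kept in local variables.
    static long[] merge(long[] ns) {
        if (ns.length == 0) {
            return new long[0]; // assumption: an empty input yields an empty result
        }
        long[] out = new long[ns.length];
        out[0] = 1; // neutral element, so the first step can multiply like any other
        int last = step(ns, 0, out, 0, ns[0]);
        return Arrays.copyOfRange(out, 0, last + 1);
    }

    // One call per element of ns; returns the index of the last slot used in out.
    private static int step(long[] ns, int i, long[] out, int oSize, long prev) {
        if (i == ns.length) {
            return oSize; // the "loop" is finished
        }
        long current = ns[i];
        if (current == prev) {
            out[oSize] *= current; // same run: keep multiplying
        } else {
            oSize++; // new run: move to the next output slot
            out[oSize] = current;
            prev = current;
        }
        return step(ns, i + 1, out, oSize, prev); // next "iteration"
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(merge(new long[] {1, 2, 2, 4})));       // [1, 4, 4]
        System.out.println(Arrays.toString(merge(new long[] {1, 2, 2, 2, 2, 5}))); // [1, 16, 5]
    }
}
```

Because the comparison is always against the original value of the current run (prev), an already merged value such as the 16 produced from {4, 4} is not merged again with a following 16.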
I have a function public static int countBaad(int[] hs) that takes in an input array and I'm supposed to find how many numbers are smaller than the ones ahead of it.
For instance,
if hs = [7,3,5,4,1] the answer would be 2 because the pairs that violate the order are 3 and 5 and 3 and 4, since 3 is smaller than them and should've been ahead of them.
if hs = [8,5,6,7,2,1] the answer would be 3 because 5 is smaller than 6 and 7, giving us 2, and since 6 is also smaller than 7, we would get a total of 3 wrong pairs
Here is my current code using the merge sort approach:
public static int countBaad(int[] hs){
return mergeSort(hs, hs.length);
}
public static int mergeSort(int[] a, int n) {
if (n < 2) {
return n;
}
int mid = n / 2;
int[] l = new int[mid];
int[] r = new int[n - mid];
for (int i = 0; i < mid; i++) {
l[i] = a[i];
}
for (int i = mid; i < n; i++) {
r[i - mid] = a[i];
}
mergeSort(l, mid);
mergeSort(r, n - mid);
return merge(a, l, r, mid, n - mid);
}
public static int merge(int[] a, int[] l, int[] r, int left, int right) {
int size = 0;
int i = 0, j = 0, k = 0;
while (i < left && j < right) {
if (l[i] <= r[j]) {
a[k++] = l[i++];
size++;
}
else {
a[k++] = r[j++];
size++;
}
}
while (i < left) {
a[k++] = l[i++];
size++;
}
while (j < right) {
a[k++] = r[j++];
size++;
}
return size;
}
This code gives me the incorrect output after I put in arrays
hs = [7,3,5,4,1] returns 5
hs = [8,5,6,7,2,1] returns 6
What am I doing wrong here, can anyone please correct me?
What your code is currently doing is attempting a sort and then simply returning the size of the sorted array (big surprise, given the aptly named size variable).
Basically you are sorting in descending order and your specification calls for the result to be how many numbers were smaller than those appearing later in the array.
However, in merge you are actually adding to size regardless of their values.
Then, you're only returning the 'size' result of the final merge, not that of the sorting steps required in between.
And finally, perhaps the elephant in the room, is that you're performing a (unnecessary) sort as a side effect, but ignoring it completely.
Long story short, the code is too complicated and error prone for what it is supposed to do.
Here's a simple double for loop that achieves the desired outcome:
public static int countBaad(int[] hs){
int count = 0;
for(int i = 0; i < hs.length; i++) {
for(int j = i+1; j < hs.length; j++) {
//compare the i'th position with all subsequent positions
int current = hs[i];
int other = hs[j];
if(current < other) {
System.out.println("Found bad number pair: ("+current+","+other+")");
count++;
}
}
}
return count;
}
System.out.println(countBaad(new int[]{7,3,5,4,1}));
//prints:
//Found bad number pair: (3,5)
//Found bad number pair: (3,4)
//2
System.out.println(countBaad(new int[]{8,5,6,7,2,1}));
//prints:
//Found bad number pair: (5,6)
//Found bad number pair: (5,7)
//Found bad number pair: (6,7)
//3
This is much more succinct and free from side effects.
Edit:
Fixing the mergeSort code, with extra sysout logging to illustrate the algorithm:
public static int mergeSort(int[] a, int n) {
if(n < 2) { // also covers an empty array, which would otherwise recurse forever
//No sorting required, so the result should be 0.
return 0;
}
int mid = n / 2;
int[] l = new int[mid];
int[] r = new int[n - mid];
//'splitting the array' loops are just arraycopy, so
// should use the native implementation:
System.arraycopy(a, 0, l, 0, mid);
if(n - mid >= 0) System.arraycopy(a, mid, r, 0, n - mid);
//add the results from all merges, not just the last one
int result = 0;
result += mergeSort(l, mid);
result += mergeSort(r, n - mid);
result += merge(a, l, r); //there is no need to pass in the array lengths
return result;
}
public static int merge(int[] a, int[] l, int[] r) {
System.out.println("Merging "+Arrays.toString(l)+" and "+Arrays.toString(r));
int size = 0;
int lIdx = 0, rIdx = 0, aIdx = 0;
while (lIdx < l.length && rIdx < r.length) {
if (l[lIdx] >= r[rIdx]) {
a[aIdx++] = l[lIdx++];
//size++; //no: left was already bigger than right
}
else {
//take from the right.
//This number is bigger than all the numbers remaining on the left.
for(int tempIdx = lIdx;tempIdx<l.length;tempIdx++) {
//this loop is for illustration only
System.out.println(" Found bad pair: (" + l[tempIdx] + "," + r[rIdx] + ")");
}
size+=l.length-lIdx;
a[aIdx++] = r[rIdx++];
}
}
//while (lIdx < left) { //NOTE that you had this condition incorrectly reversed resulting in bad merge
// a[aIdx++] = l[lIdx++];
// size++; //no, no comparisons are taking place here
//}
//while (rIdx < right) { //NOTE that you had this condition incorrectly reversed, resulting in bad merge
// a[aIdx++] = r[rIdx++];
// size++; //no, no comparisons are taking place here
//}
//we can also replace the above two loops with arraycopy
// which will perform better on large arrays
if(lIdx < l.length) {
System.arraycopy(l, lIdx, a, aIdx, l.length-lIdx);
}
if(rIdx < r.length) {
System.arraycopy(r, rIdx, a, aIdx, r.length-rIdx);
}
return size;
}
Since you value performance, you should use System.arraycopy where possible. I have also renamed the loop variables to make the code easier to understand.
System.out.println(countBaad(new int[]{7,3,5,4,1}));
//prints:
//Merging [7] and [3]
//Merging [4] and [1]
//Merging [5] and [4, 1]
//Merging [7, 3] and [5, 4, 1]
// Found bad pair: (3,5)
// Found bad pair: (3,4)
//2
System.out.println(countBaad(new int[]{8,5,6,7,2,1}));
//prints:
//Merging [5] and [6]
// Found bad pair: (5,6)
//Merging [8] and [6, 5]
//Merging [2] and [1]
//Merging [7] and [2, 1]
//Merging [8, 6, 5] and [7, 2, 1]
// Found bad pair: (6,7)
// Found bad pair: (5,7)
//3
Edit #2
To remove the side effects (sort) from this method, the input array can be copied, for example with a simple call to Arrays.copyOf(hs, hs.length); and passing in the result instead of the original.
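A compact sketch of that idea, assuming the same descending merge-and-count approach as above: countBaad copies its input with Arrays.copyOf and sorts only the copy. The class name CountBadPairsSketch and the helper name sortDescendingAndCount are made up for this example.

```java
import java.util.Arrays;

final class CountBadPairsSketch {
    // Counts pairs (i, j) with i < j and hs[i] < hs[j] without modifying hs.
    static int countBaad(int[] hs) {
        int[] copy = Arrays.copyOf(hs, hs.length); // sort the copy, not the caller's array
        return sortDescendingAndCount(copy);
    }

    // Sorts a in descending order and returns the number of bad pairs found on the way.
    private static int sortDescendingAndCount(int[] a) {
        if (a.length < 2) {
            return 0;
        }
        int mid = a.length / 2;
        int[] l = Arrays.copyOfRange(a, 0, mid);
        int[] r = Arrays.copyOfRange(a, mid, a.length);
        int count = sortDescendingAndCount(l) + sortDescendingAndCount(r);
        int li = 0, ri = 0, ai = 0;
        while (li < l.length && ri < r.length) {
            if (l[li] >= r[ri]) {
                a[ai++] = l[li++];
            } else {
                count += l.length - li; // r[ri] is bigger than everything still left in l
                a[ai++] = r[ri++];
            }
        }
        while (li < l.length) a[ai++] = l[li++];
        while (ri < r.length) a[ai++] = r[ri++];
        return count;
    }

    public static void main(String[] args) {
        int[] hs = {7, 3, 5, 4, 1};
        System.out.println(countBaad(hs));       // 2
        System.out.println(Arrays.toString(hs)); // [7, 3, 5, 4, 1], unchanged
    }
}
```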
When I check for subsequences I always duplicate the condition after the loop.
For example, I want to find the maximum subsequence of numbers with a difference of no more than one. Here is my code
public static List<Integer> maxSubsequence(List<Integer> array) {
int ind = 0;
int bestInd = 0;
int cnt = 1;
int maxCnt = 0;
for(int i = 1; i < array.size(); i++) {
if(Math.abs(array.get(ind) - array.get(i)) <= 1) {
cnt++;
continue;
}
if(cnt > maxCnt) {
bestInd = ind;
maxCnt = cnt;
}
if(Math.abs(array.get(ind) - array.get(i)) == 2) {
cnt--;
ind++;
i--;
} else {
cnt = 1;
ind = i;
}
}
// duplicate from loop
if(cnt > maxCnt) {
bestInd = ind;
maxCnt = cnt;
}
return array.subList(bestInd, bestInd + maxCnt);
}
For the sequence 5, 1, 2, 3, 3, 3 the answer is 2, 3, 3, 3.
I am duplicating the condition because, if the input ends with a matching subsequence, that subsequence will not be counted without the additional check. I would like to avoid this code duplication.
My solution requires changing the input. Is there a way to avoid the code duplication without changing the input?
Moving the check into a function does not help, since it does not eliminate the duplication: I would still need to call the function twice.
The general pattern for such a question is to use two "pointers" (indexes into the list, really):
A "start" pointer, which you increment while it points to elements which are not part of a subsequence, until you reach the end of the list, or it points to the first element in the subsequence (in the specific case of the problem in the question, there are no elements not part of a subsequence).
An "end" pointer, initially equal to the start (or one more than the start), which you increment until either you hit the end of the list, or it's pointing to the first element which isn't part of the same subsequence.
Your subsequence is then between start and end, inclusive and exclusive respectively. Process it as necessary.
Repeat the loop with the start equal to the previous end, until you hit the end of the list.
So, something like:
int start = 0;
while (start < list.size()) {
// Increase end as much as you can for this subsequence
int end = start + 1;
while (end < list.size()) {
if (/* condition meaning you don't want to increment end any more */) {
break;
}
end++;
}
// See if this subsequence is "best"
int cnt = end - start;
if (cnt > maxCnt) {
bestInd = start;
maxCnt = cnt;
}
// Prepare for next iteration.
start = end;
}
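To make the template concrete, here is a sketch for one simple membership rule: a run continues while each element differs from its predecessor by at most 1. This rule is an assumption chosen for illustration and is not the exact criterion from the question (for 5, 1, 2, 3, 3, 3 it yields 1, 2, 3, 3, 3 rather than 2, 3, 3, 3). The point is only that the "best so far" check appears once. The names LongestRunSketch and longestRun are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class LongestRunSketch {
    // Longest run in which each element differs from its predecessor by at most 1.
    static List<Integer> longestRun(List<Integer> list) {
        int bestInd = 0;
        int maxCnt = 0;
        int start = 0;
        while (start < list.size()) {
            // Extend end while the next element still belongs to the current run.
            int end = start + 1;
            while (end < list.size()
                    && Math.abs(list.get(end) - list.get(end - 1)) <= 1) {
                end++;
            }
            int cnt = end - start;
            if (cnt > maxCnt) { // the only place the "best" check is needed
                bestInd = start;
                maxCnt = cnt;
            }
            start = end; // the next run begins where this one stopped
        }
        return list.subList(bestInd, bestInd + maxCnt);
    }

    public static void main(String[] args) {
        System.out.println(longestRun(Arrays.asList(5, 1, 2, 3, 3, 3))); // [1, 2, 3, 3, 3]
    }
}
```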
Another way to solve it, using a map and a stream:
public static List<Integer> maxSubsequence(List<Integer> array) {
Map<Integer, List<Integer>> result = new HashMap<>();
List<Integer> firstArray = new ArrayList<>();
firstArray.add(array.get(0));
result.put(1, firstArray);
for (int i = 0; i < array.size() - 1; i++) {
if (Math.abs(array.get(i) - array.get(i + 1)) <= 1) {
result.get(result.size()).add(array.get(i + 1));
} else {
firstArray = new ArrayList<>();
firstArray.add(array.get(i + 1));
result.put(result.size() + 1, firstArray);
}
}
return result.values().stream().max(Comparator.comparingInt(List::size))
.orElse(null); // to skip single-element lists, add .filter(ar -> ar.size() != 1) before max
}
I always understood that defining a local variable within a loop does not slow it down because they are reused between iterations of the same loop.
I was surprised to find that when I move the definition of the local variable outside the loop, then it reduces memory significantly (39.4Mb vs 40 Mb).
Between iterations of the same loop, are local variables reused or reallocated?
I did also see Allocation of space for local variables in loops
Duplicate Zeroes Problem (leetcode): Given a fixed length array arr of integers, duplicate each occurrence
of zero, shifting the remaining elements to the right.
Note that elements beyond the length of the original array are not
written.
Do the above modifications to the input array in place, do not return
anything from your function.
import java.util.Arrays;
/**
* algorithm: the zeroes divide the array into sub-arrays or subsets.
* we move or shift the elements exactly once, to their final resting place R.I.P. ;)
* The last subset will be shifted n0s places, the one before it, n0s -1 places and so on...
* O(n)
* @author likejudo
*
*/
public class DuplicateZeroes {
static void arrayCopy(int[] arr, int begin, int end, int n) {
for (int i = end + 1; i >= begin ; i--) {
int destination = i + n;
if (destination < arr.length) {
arr[destination] = arr[i];
}
}
}
public static void duplicateZeros(int[] arr) {
int n0s = 0; // number of zeroes
int last0At = -1; // last zero at index
int boundary = 0; // rightmost boundary
// find total n0s, last0At
for (int i = 0; i < arr.length; i++) {
if (arr[i] == 0) {
n0s++;
last0At = i;
}
}
// System.out.format("n0s=%d last0At=%d \n", n0s, last0At);
// if no zeroes or all zeroes, we are done
if(n0s == 0 || n0s == arr.length) {
return;
}
boundary = arr.length - n0s;
while (n0s > 0) {
// System.out.format("before arrayCopy(%s, %d, %d, %d) ", Arrays.toString(arr), last0At, boundary, n0s);
// move subset of all elements from last0At till boundary-1, by n0s spaces.
arrayCopy(arr, last0At, boundary, n0s);
// set start of subset to 0
arr[last0At] = 0;
// System.out.format("after arrayCopy : %s assigned arr[last0At=%d]=0\n", Arrays.toString(arr),last0At);
// update boundary
boundary = last0At - 1;
// next subset to the left will have one less zero
n0s--;
last0At--;
// find the next last zer0 At index
while (last0At > 0 && arr[last0At] != 0)
last0At--;
// if no more...
if (last0At <0 || arr[last0At] != 0) {
return;
}
}
}
public static void main(String[] args) {
// input: [1, 0, 2, 3, 0, 4, 5, 0]
// output: [1, 0, 0, 2, 3, 0, 0, 4]
int[] arr = {0,0,0,0,0,0,0};
System.out.println("input: " + Arrays.toString(arr));
duplicateZeros(arr);
System.out.println("output: " + Arrays.toString(arr));
}
}
In the method arrayCopy, when I move the local variable destination outside the loop,
Before
static void arrayCopy(int[] arr, int begin, int end, int n) {
for (int i = end + 1; i >= begin ; i--) {
int destination = i + n; // >>>>>>>>>>>>>>>>>>>>>>>
if (destination < arr.length) {
arr[destination] = arr[i];
}
}
}
After
memory usage improved! (39.4 Mb vs 40 Mb)
static void arrayCopy(int[] arr, int begin, int end, int n) {
int destination = 0; // >>>>>>>>>>>>>>>>>
for (int i = end + 1; i >= begin ; i--) {
destination = i + n;
if (destination < arr.length) {
arr[destination] = arr[i];
}
}
}
About your question
I always understood that defining a local variable within a loop does
not slow it down because they are reused between iterations of the
same loop.
Does declaring a local variable inside a loop slow it down?
You are right: it does not. Declaring local variables inside the loop does not change the time complexity, and if it changes the actual runtime at all, the difference is too insignificant to matter.
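As a small sketch of why the two versions from the question are equivalent: a primitive local such as destination lives in the method's stack frame, whose size is fixed when the method is compiled, so neither variant allocates anything per iteration. The two methods below mirror the question's arrayCopy. The class and method names are made up for this example.

```java
import java.util.Arrays;

final class LoopLocalSketch {
    // Variant 1: the local is declared inside the loop body.
    static void shiftDeclaredInside(int[] arr, int begin, int end, int n) {
        for (int i = end + 1; i >= begin; i--) {
            int destination = i + n;
            if (destination < arr.length) {
                arr[destination] = arr[i];
            }
        }
    }

    // Variant 2: the local is declared once, before the loop.
    static void shiftDeclaredOutside(int[] arr, int begin, int end, int n) {
        int destination;
        for (int i = end + 1; i >= begin; i--) {
            destination = i + n;
            if (destination < arr.length) {
                arr[destination] = arr[i];
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 0, 2, 3, 0, 4, 5, 0};
        int[] b = a.clone();
        shiftDeclaredInside(a, 4, 6, 2);
        shiftDeclaredOutside(b, 4, 6, 2);
        System.out.println(Arrays.equals(a, b)); // true
    }
}
```

Both variants only differ in the scope of the name destination. A 0.6 MB difference between two submissions is therefore much more likely to be measurement noise than an effect of where the variable is declared.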
LeetCode's runtime and memory measurements are highly inaccurate, especially for runtime. Their test cases are usually limited, I guess because running more would be costly, so their benchmarking is not a reliable guide.
For instance, I just resubmitted the following solution and it reported 39.6 MB; a few days ago the exact same solution, without a byte changed, reported 43.3 MB:
public final class Solution {
public static final void duplicateZeros(int[] arr) {
int countZero = 0;
for (int index = 0; index < arr.length; index++)
if (arr[index] == 0) {
countZero++;
}
int length = arr.length + countZero;
for (int indexA = arr.length - 1, indexB = length - 1; indexA < indexB; indexA--, indexB--)
if (arr[indexA] != 0) {
if (indexB < arr.length) {
arr[indexB] = arr[indexA];
}
} else {
if (indexB < arr.length) {
arr[indexB] = arr[indexA];
}
indexB--;
if (indexB < arr.length) {
arr[indexB] = arr[indexA];
}
}
}
}
Overall it'd be best to focus on asymptotically efficient algorithms mostly, because benchmarking has lots of "how-tos" and we'd want to have really good resources (CPU, memory, etc.) with isolated test systems.
References
For additional details, please see the Discussion Board, where you can find plenty of well-explained accepted solutions in a variety of languages, including low-complexity algorithms and asymptotic runtime/memory analysis.
Question: Given a sorted array nums, remove the duplicates in-place such that each value appears at most twice, and return the new length.
Do not allocate extra space for another array, you must do this by modifying the input array in-place with O(1) extra memory.
My solution: this code always comes up one index short, no matter what. Can someone please help me understand why? For example, my example input is supposed to return 6, but it returns 5.
int[] arr2= {1,1,1,2,3,4,4};
int i=findDupsMedium(arr2);
System.out.println(i);
static int findDupsMedium(int[] arr) {
int index=0;
if(arr.length>1) {
for(int i=0;i<2;i++) {
arr[index++]=arr[i];
}
}
//System.out.println("index:" + index);
for(int ii=2;ii<arr.length;ii++ ) {
int diff=ii-2;
if(arr[ii] != arr[diff]) {
arr[index++]=arr[ii];
}
}
return index;
}
Your approach is OK, but it is missing some parts.
Here is a slightly dirty solution; it works for consecutive duplicates.
If the input array has duplicates in different places, you have to implement another for loop.
static int findDupsMedium(int[] arr) {
int count=0;
//used for extracting duplicates from the length of array
int extract=0;
if(arr.length>1) {
// this is for having a comparison without getting outOfBounds;
int lastItem=0;
for(int i=0; i<arr.length; i++) {
//If we had 2 duplicates and new one is the same with previous one, remove
if(count == 2 && lastItem == arr[i]){
//if end of the array has duplicate, make it "-1"
if(i==arr.length-1){
arr[i]=-1;
}
else{
extract++; //we found a duplicate
lastItem = arr[i];
//shift it
for(int j=i;j<arr.length-1;j++){
arr[j]=arr[j+1];
}
}
//printArray(arr);
count = 0;
}
else{
if(i+1 < arr.length && arr[i+1]==arr[i]){ // guard: there is no arr[i+1] for the last element
count++;
lastItem = arr[i];
}
}
}
}
return arr.length - extract;
}
To do this you need to keep track of the length of the array as it changes as well as when to update the main loop's index.
A boolean flag is also used to keep track of when a series of duplicates occur.
public static int findDupsMedium(int[] arr2) {
int size = arr2.length;
boolean foundFirstDuplicate = false;
for (int i = 0; i < arr2.length - 1; i++) {
for (int k = i + 1; k < size;) {
if (arr2[i] == arr2[k]) {
if (foundFirstDuplicate) {
// If we're here, this must be third
// duplicate in a row so copy up the array
// overwriting the third dupe.
for (int g = k; g < arr2.length - 1; g++) {
arr2[g] = arr2[g + 1];
}
i--; // and readjust outer loop to stay in
// position
// and effective size of array is one smaller
// so adjust that
size--;
}
// set first time a duplicate is found and keep this set
// until no more duplicates
foundFirstDuplicate = true;
break;
}
// no third or more duplicate so set to false
foundFirstDuplicate = false;
break;
}
}
return size;
}
To verify that it works, add the following method:
static void display(int[] a, int size) {
int[] t = Arrays.copyOf(a, size);
System.out.println(Arrays.toString(t));
}
And call the methods as follows:
int[] arr2 = { 1, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 5
};
int size = findDupsMedium(arr2);
display(arr2, size);
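For a sorted input there is also a much shorter variant that stays close to the loop in the question. The question's code compares arr[ii] with arr[ii - 2], a position that may already have been overwritten. Comparing against arr[index - 2], the element two places back in the prefix that has already been kept, avoids that. The sketch below assumes a sorted array, as the problem statement guarantees. The class name AtMostTwiceSketch is made up.

```java
import java.util.Arrays;

final class AtMostTwiceSketch {
    // Keeps at most two copies of each value in a sorted array; returns the new length.
    static int findDupsMedium(int[] arr) {
        int index = 0; // next write position; arr[0..index) is the kept prefix
        for (int value : arr) {
            // Keep value unless the kept prefix already ends with two copies of it.
            if (index < 2 || value != arr[index - 2]) {
                arr[index++] = value;
            }
        }
        return index;
    }

    public static void main(String[] args) {
        int[] arr2 = {1, 1, 1, 2, 3, 4, 4};
        int size = findDupsMedium(arr2);
        System.out.println(size);                                        // 6
        System.out.println(Arrays.toString(Arrays.copyOf(arr2, size)));  // [1, 1, 2, 3, 4, 4]
    }
}
```

Writes always land at or before the position currently being read, so no unread element is overwritten and no extra array is needed.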
I'm working on a puzzle that involves analyzing all size k subsets and figuring out which one is optimal. I wrote a solution that works when the number of subsets is small, but it runs out of memory for larger problems. Now I'm trying to translate an iterative function written in python to java so that I can analyze each subset as it's created and get only the value that represents how optimized it is and not the entire set so that I won't run out of memory. Here is what I have so far and it doesn't seem to finish even for very small problems:
public static LinkedList<LinkedList<Integer>> getSets(int k, LinkedList<Integer> set)
{
int N = set.size();
int maxsets = nCr(N, k);
LinkedList<LinkedList<Integer>> toRet = new LinkedList<LinkedList<Integer>>();
int remains, thresh;
LinkedList<Integer> newset;
for (int i=0; i<maxsets; i++)
{
remains = k;
newset = new LinkedList<Integer>();
for (int val=1; val<=N; val++)
{
if (remains==0)
break;
thresh = nCr(N-val, remains-1);
if (i < thresh)
{
newset.add(set.get(val-1));
remains --;
}
else
{
i -= thresh;
}
}
toRet.add(newset);
}
return toRet;
}
Can anybody help me debug this function or suggest another algorithm for iteratively generating size k subsets?
EDIT: I finally got this function working, I had to create a new variable that was the same as i to do the i and thresh comparison because python handles for loop indexes differently.
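A sketch of what that fix could look like: the loop counter i is copied into a working variable (called rank here) before the inner loop, so subtracting thresh no longer changes the counter of the outer for loop. The question does not show nCr, so the version below is a hypothetical stand-in, and ArrayList is used instead of LinkedList for cheaper indexed access.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class UnrankSubsetsSketch {
    // Hypothetical helper: binomial coefficient C(n, r), 0 when r is out of range.
    static int nCr(int n, int r) {
        if (r < 0 || r > n) return 0;
        long c = 1;
        for (int i = 1; i <= r; i++) {
            c = c * (n - r + i) / i; // the division is exact at every step
        }
        return Math.toIntExact(c); // throws instead of silently overflowing
    }

    static List<List<Integer>> getSets(int k, List<Integer> set) {
        int n = set.size();
        int maxsets = nCr(n, k);
        List<List<Integer>> toRet = new ArrayList<>();
        for (int i = 0; i < maxsets; i++) {
            int rank = i; // working copy; the loop counter itself is never modified
            int remains = k;
            List<Integer> newset = new ArrayList<>(k);
            for (int val = 1; val <= n && remains > 0; val++) {
                int thresh = nCr(n - val, remains - 1);
                if (rank < thresh) {
                    newset.add(set.get(val - 1));
                    remains--;
                } else {
                    rank -= thresh;
                }
            }
            toRet.add(newset);
        }
        return toRet;
    }

    public static void main(String[] args) {
        System.out.println(getSets(2, Arrays.asList(10, 20, 30, 40)));
        // [[10, 20], [10, 30], [10, 40], [20, 30], [20, 40], [30, 40]]
    }
}
```

To avoid holding every subset in memory, the body of the outer loop can score newset immediately instead of adding it to toRet.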
First, if you intend to do random access on a list, you should pick a list implementation that supports that efficiently. From the javadoc on LinkedList:
All of the operations perform as could be expected for a doubly-linked
list. Operations that index into the list will traverse the list from
the beginning or the end, whichever is closer to the specified index.
An ArrayList is both more space efficient and much faster for random access. Actually, since you know the length beforehand, you can even use a plain array.
To algorithms: Let's start simple: How would you generate all subsets of size 1? Probably like this:
for (int i = 0; i < set.length; i++) {
int[] subset = {set[i]}; // the element itself, not its index
process(subset);
}
Where process is a method that does something with the set, such as checking whether it is "better" than all subsets processed so far.
Now, how would you extend that to work for subsets of size 2? What is the relationship between subsets of size 2 and subsets of size 1? Well, any subset of size 2 can be turned into a subset of size 1 by removing its largest element. Put differently, each subset of size 2 can be generated by taking a subset of size 1 and adding a new element larger than all other elements in the set. In code:
void processSubset(int[] set) {
int[] subset = new int[2];
for (int i = 0; i < set.length; i++) {
subset[0] = set[i];
processLargerSets(set, subset, i);
}
}
void processLargerSets(int[] set, int[] subset, int i) {
for (int j = i + 1; j < set.length; j++) {
subset[1] = set[j];
process(subset);
}
}
For subsets of arbitrary size k, observe that any subset of size k can be turned into a subset of size k-1 by chopping off the largest element. That is, all subsets of size k can be generated by generating all subsets of size k - 1 and, for each of these and each value larger than the largest in the subset, adding that value to the set. In code:
static void processSubsets(int[] set, int k) {
int[] subset = new int[k];
processLargerSubsets(set, subset, 0, 0);
}
static void processLargerSubsets(int[] set, int[] subset, int subsetSize, int nextIndex) {
if (subsetSize == subset.length) {
process(subset);
} else {
for (int j = nextIndex; j < set.length; j++) {
subset[subsetSize] = set[j];
processLargerSubsets(set, subset, subsetSize + 1, j + 1);
}
}
}
Test code:
static void process(int[] subset) {
System.out.println(Arrays.toString(subset));
}
public static void main(String[] args) throws Exception {
int[] set = {1,2,3,4,5};
processSubsets(set, 3);
}
But before you invoke this on huge sets remember that the number of subsets can grow rather quickly.
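Since the goal in the question is to keep only the value of the best subset, the same recursion can return a score instead of calling process. The following is a sketch under assumptions: the scoring function is supplied by the caller as a ToLongFunction, "best" means highest score, and the names BestSubsetSketch, bestScore and search are made up.

```java
import java.util.Arrays;
import java.util.function.ToLongFunction;

final class BestSubsetSketch {
    // Returns the highest score over all size-k subsets without storing the subsets.
    static long bestScore(int[] set, int k, ToLongFunction<int[]> score) {
        return search(set, new int[k], 0, 0, score);
    }

    private static long search(int[] set, int[] subset, int subsetSize, int nextIndex,
                               ToLongFunction<int[]> score) {
        if (subsetSize == subset.length) {
            // The subset array is reused between calls, so score must not keep a reference.
            return score.applyAsLong(subset);
        }
        long best = Long.MIN_VALUE; // stays MIN_VALUE if no subset can be completed
        for (int j = nextIndex; j < set.length; j++) {
            subset[subsetSize] = set[j];
            best = Math.max(best, search(set, subset, subsetSize + 1, j + 1, score));
        }
        return best;
    }

    public static void main(String[] args) {
        int[] set = {1, 2, 3, 4, 5};
        // Example score: the sum of the elements, so the best size-3 subset is {3, 4, 5}.
        System.out.println(bestScore(set, 3, s -> Arrays.stream(s).sum())); // 12
    }
}
```

Memory use is then proportional to k (one reused array plus the recursion depth), regardless of how many subsets exist. The running time still grows with the number of subsets.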
You can use
org.apache.commons.math3.util.Combinations.
Example:
import java.util.Arrays;
import java.util.Iterator;
import org.apache.commons.math3.util.Combinations;
public class tmp {
public static void main(String[] args) {
for (Iterator<int[]> iter = new Combinations(5, 3).iterator(); iter.hasNext();) {
System.out.println(Arrays.toString(iter.next()));
}
}
}
Output:
[0, 1, 2]
[0, 1, 3]
[0, 2, 3]
[1, 2, 3]
[0, 1, 4]
[0, 2, 4]
[1, 2, 4]
[0, 3, 4]
[1, 3, 4]
[2, 3, 4]
Here is a combination iterator I wrote recently:
package psychicpoker;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import static com.google.common.base.Preconditions.checkArgument;
public class CombinationIterator<T> implements Iterator<Collection<T>> {
private int[] indices;
private List<T> elements;
private boolean hasNext = true;
public CombinationIterator(List<T> elements, int k) throws IllegalArgumentException {
checkArgument(k<=elements.size(), "Impossible to select %d elements from hand of size %d", k, elements.size());
this.indices = new int[k];
for(int i=0; i<k; i++)
indices[i] = k-1-i;
this.elements = elements;
}
public boolean hasNext() {
return hasNext;
}
private int inc(int[] indices, int maxIndex, int depth) throws IllegalStateException {
if(depth == indices.length) {
throw new IllegalStateException("The End");
}
if(indices[depth] < maxIndex) {
indices[depth] = indices[depth]+1;
} else {
indices[depth] = inc(indices, maxIndex-1, depth+1)+1;
}
return indices[depth];
}
private boolean inc() {
try {
inc(indices, elements.size() - 1, 0);
return true;
} catch (IllegalStateException e) {
return false;
}
}
public Collection<T> next() {
Collection<T> result = new ArrayList<T>(indices.length);
for(int i=indices.length-1; i>=0; i--) {
result.add(elements.get(indices[i]));
}
hasNext = inc();
return result;
}
public void remove() {
throw new UnsupportedOperationException();
}
}
I've had the same problem today, of generating all k-sized subsets of a n-sized set.
I had a recursive algorithm, written in Haskell, but the problem required that I wrote a new version in Java.
In Java, I thought I'd probably have to use memoization to optimize recursion. Turns out, I found a way to do it iteratively. I was inspired by this image, from Wikipedia, on the article about Combinations.
Method to calculate all k-sized subsets:
public static int[][] combinations(int k, int[] set) {
// binomial(N, K)
int c = (int) binomial(set.length, k);
// where all sets are stored
int[][] res = new int[c][Math.max(0, k)];
// the k indexes (from set) where the red squares are
// see image above
int[] ind = k < 0 ? null : new int[k];
// initialize red squares
for (int i = 0; i < k; ++i) { ind[i] = i; }
// for every combination
for (int i = 0; i < c; ++i) {
// get its elements (red square indexes)
for (int j = 0; j < k; ++j) {
res[i][j] = set[ind[j]];
}
// update red squares, starting by the last
int x = ind.length - 1;
boolean loop;
do {
loop = false;
// move to next
ind[x] = ind[x] + 1;
// if crossing boundaries, move previous
if (ind[x] > set.length - (k - x)) {
--x;
loop = x >= 0;
} else {
// update every following square
for (int x1 = x + 1; x1 < ind.length; ++x1) {
ind[x1] = ind[x1 - 1] + 1;
}
}
} while (loop);
}
return res;
}
Method for the binomial:
(Adapted from Python example, from Wikipedia)
private static long binomial(int n, int k) {
if (k < 0 || k > n) return 0;
if (k > n - k) { // take advantage of symmetry
k = n - k;
}
long c = 1;
for (int i = 1; i < k+1; ++i) {
c = c * (n - (k - i));
c = c / i;
}
return c;
}
Of course, combinations will always have the problem of space, as they likely explode.
In the context of my own problem, the maximum possible is about 2,000,000 subsets. My machine calculated this in 1032 milliseconds.
Inspired by afsantos's answer :-)... I decided to write a C# .NET implementation to generate all subset combinations of a certain size from a full set. It doesn't need to calc the total number of possible subsets; it detects when it's reached the end. Here it is:
public static List<object[]> generateAllSubsetCombinations(object[] fullSet, ulong subsetSize) {
if (fullSet == null) {
throw new ArgumentException("Value cannot be null.", "fullSet");
}
else if (subsetSize < 1) {
throw new ArgumentException("Subset size must be 1 or greater.", "subsetSize");
}
else if ((ulong)fullSet.LongLength < subsetSize) {
throw new ArgumentException("Subset size cannot be greater than the total number of entries in the full set.", "subsetSize");
}
// All possible subsets will be stored here
List<object[]> allSubsets = new List<object[]>();
// Initialize current pick; will always be the leftmost consecutive x where x is subset size
ulong[] currentPick = new ulong[subsetSize];
for (ulong i = 0; i < subsetSize; i++) {
currentPick[i] = i;
}
while (true) {
// Add this subset's values to list of all subsets based on current pick
object[] subset = new object[subsetSize];
for (ulong i = 0; i < subsetSize; i++) {
subset[i] = fullSet[currentPick[i]];
}
allSubsets.Add(subset);
if (currentPick[0] + subsetSize >= (ulong)fullSet.LongLength) {
// Last pick must have been the final 3; end of subset generation
break;
}
// Update current pick for next subset
ulong shiftAfter = (ulong)currentPick.LongLength - 1;
bool loop;
do {
loop = false;
// Move current picker right
currentPick[shiftAfter]++;
// If we've gotten to the end of the full set, move left one picker
if (currentPick[shiftAfter] > (ulong)fullSet.LongLength - (subsetSize - shiftAfter)) {
if (shiftAfter > 0) {
shiftAfter--;
loop = true;
}
}
else {
// Update pickers to be consecutive
for (ulong i = shiftAfter+1; i < (ulong)currentPick.LongLength; i++) {
currentPick[i] = currentPick[i-1] + 1;
}
}
} while (loop);
}
return allSubsets;
}
This solution worked for me:
private static void findSubsets(int array[])
{
int numOfSubsets = 1 << array.length;
for(int i = 0; i < numOfSubsets; i++)
{
int pos = array.length - 1;
int bitmask = i;
System.out.print("{");
while(bitmask > 0)
{
if((bitmask & 1) == 1)
System.out.print(array[pos]+",");
bitmask >>= 1;
pos--;
}
System.out.print("}");
}
}
Swift implementation:
Below are two variants on the answer provided by afsantos.
The first implementation of the combinations function mirrors the functionality of the original Java implementation.
The second implementation is a general case for finding all combinations of k values from the set [0, setSize). If this is really all you need, this implementation will be a bit more efficient.
In addition, they include a few minor optimizations and a smidgen of logic simplification.
/// Calculate the binomial for a set with a subset size
func binomial(setSize: Int, subsetSize: Int) -> Int
{
if (subsetSize <= 0 || subsetSize > setSize) { return 0 }
// Take advantage of symmetry
var subsetSizeDelta = subsetSize
if (subsetSizeDelta > setSize - subsetSizeDelta)
{
subsetSizeDelta = setSize - subsetSizeDelta
}
// Early-out
if subsetSizeDelta == 0 { return 1 }
var c = 1
for i in 1...subsetSizeDelta
{
c = c * (setSize - (subsetSizeDelta - i))
c = c / i
}
return c
}
/// Calculates all possible combinations of subsets of `subsetSize` values within `set`
func combinations(subsetSize: Int, set: [Int]) -> [[Int]]?
{
// Validate inputs
if subsetSize <= 0 || subsetSize > set.count { return nil }
// Use a binomial to calculate total possible combinations
let comboCount = binomial(setSize: set.count, subsetSize: subsetSize)
if comboCount == 0 { return nil }
// Our set of combinations
var combos = [[Int]]()
combos.reserveCapacity(comboCount)
// Initialize the combination to the first group of set indices
var subsetIndices = [Int](0..<subsetSize)
// For every combination
for _ in 0..<comboCount
{
// Add the new combination
var comboArr = [Int]()
comboArr.reserveCapacity(subsetSize)
for j in subsetIndices { comboArr.append(set[j]) }
combos.append(comboArr)
// Update combination, starting with the last
var x = subsetSize - 1
while true
{
// Move to next
subsetIndices[x] = subsetIndices[x] + 1
// If crossing boundaries, move previous
if (subsetIndices[x] > set.count - (subsetSize - x))
{
x -= 1
if x >= 0 { continue }
}
else
{
for x1 in x+1..<subsetSize
{
subsetIndices[x1] = subsetIndices[x1 - 1] + 1
}
}
break
}
}
return combos
}
/// Calculates all possible combinations of subsets of `subsetSize` values within a set
/// of zero-based values for the set [0, `setSize`)
func combinations(subsetSize: Int, setSize: Int) -> [[Int]]?
{
// Validate inputs
if subsetSize <= 0 || subsetSize > setSize { return nil }
// Use a binomial to calculate total possible combinations
let comboCount = binomial(setSize: setSize, subsetSize: subsetSize)
if comboCount == 0 { return nil }
// Our set of combinations
var combos = [[Int]]()
combos.reserveCapacity(comboCount)
// Initialize the combination to the first group of elements
var subsetValues = [Int](0..<subsetSize)
// For every combination
for _ in 0..<comboCount
{
// Add the new combination
combos.append([Int](subsetValues))
// Update combination, starting with the last
var x = subsetSize - 1
while true
{
// Move to next
subsetValues[x] = subsetValues[x] + 1
// If crossing boundaries, move previous
if (subsetValues[x] > setSize - (subsetSize - x))
{
x -= 1
if x >= 0 { continue }
}
else
{
for x1 in x+1..<subsetSize
{
subsetValues[x1] = subsetValues[x1 - 1] + 1
}
}
break
}
}
return combos
}